Re: I2C, what does "clocked out" mean?

I'm not trained in EE.
I'm programming a master-receiver device which controls a MAX11644/MAX11645. The datasheet explains the read cycle, saying:
A read cycle must be initiated to obtain conversion results. Read cycles begin with the bus master issuing a START condition followed by seven address bits and a read bit (R/W = 1). If the address byte is successfully received, the MAX11644/MAX11645 (slave) issues an acknowledge. The master then reads from the slave. The result is transmitted in 2 bytes; first 4 bits of the first byte are high, then MSB through LSB are consecutively clocked out.
All of this I understand, except the very last part: "MSB through LSB are consecutively clocked out". Most significant bit? Isn't this the first bit? We already know the first bit in the first byte is high. And what does "clocked out" mean?

Most significant bit? Isn't this the first bit?
It may or may not be. There's no unambiguous definition of "first". RS232, for example, outputs the least significant bit first. If you mean the one that happens to be output first, then yes, that's what the next part is saying.
We already know the first bit in the first byte is high.
Right. But the device outputs it anyway.
And what does "clocked out" mean?
It means that they are produced as output on consecutive clock cycles. That is, each time the clock advances, the next bit (in the order defined there) is placed on the output pin.
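Concretely, once the master has clocked in the two bytes, it just masks off the four leading high bits and stitches the 12-bit result together. A minimal Python sketch (the byte values are made up for illustration):

    def assemble_result(byte1, byte2):
        # byte1 = 1111 D11 D10 D9 D8, byte2 = D7 ... D0,
        # per the datasheet text quoted above
        return ((byte1 & 0x0F) << 8) | byte2

    print(hex(assemble_result(0xF5, 0xA3)))  # 0x5a3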

Related

How can I learn the size of the EEPROM on a chip if documentation is unavailable?

If I have an EEPROM integrated circuit but documentation is not available for it, how can I find out how much memory is available to me?
My first thought was to write some distinct bytes to the first several sequential addresses and then loop through the memory reading each byte until I read my distinct bytes and count how many bytes exist between reading the distinct bytes the first time and the second time. But then I realised that my unsigned data type could be too small and wrap from its largest value back to zero before the last address in the EEPROM was actually reached.
Any software or hardware tricks to learn this information about an unidentified EEPROM integrated circuit would be very much appreciated.
My solution ends up being pretty close to the theory stated in my question: I write a recognisable pattern of bytes starting at byte zero of the EEPROM, then loop through the memory from byte zero, keeping track of how many bytes lie between the first time I read the pattern and the second time. To make sure I haven't read byte zero a second time before reading every other byte once (because the counting variable was too small to count up to the size of the memory), I repeat the scan with a larger counting datatype. If the count between the first and second reads of the pattern is the same for both counter sizes, I know I have found the correct size of the EEPROM in bytes.
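For illustration, here is a rough Python sketch of that probe, assuming the device wraps addresses modulo its real size (common for small EEPROMs). eeprom_read and eeprom_write are hypothetical stand-ins for your actual driver calls, and Python's unbounded integers sidestep the counter-size problem entirely:

    MARKER = bytes([0xDE, 0xAD, 0xBE, 0xEF])  # values unlikely to sit in erased memory

    def probe_eeprom_size(eeprom_read, eeprom_write, max_size=1 << 24):
        # Write a distinctive marker at address 0...
        for i, b in enumerate(MARKER):
            eeprom_write(i, b)
        # ...then scan forward until it reappears; the address where it does
        # is where the address counter wrapped, i.e. the size in bytes.
        addr = len(MARKER)
        while addr < max_size:
            if all(eeprom_read(addr + i) == b for i, b in enumerate(MARKER)):
                return addr
            addr += 1
        return None  # no wrap seen below max_size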

How to calculate branch instruction offset?

Calculating offset of branch instruction:
Hey guys,
My professor sent us a study guide with answers. He never actually went over how he got the answers though. I've searched online but I did not have any luck finding an explanation so at this point I'm a bit desperate.
Does anyone know how the author arrived at those answers?
0xfffb is the 16-bit signed two's-complement representation of -5. So on this machine the offset is scaled by the (presumably fixed) instruction length to get a byte address. (Byte-sized instructions aren't possible here, since the offset field alone is 16 bits.) The architecture is such that by the time the branch executes the PC has already been incremented, so an offset of 0 is a NOP, -1 branches to the branch itself, -2 to the instruction before the branch, and so on. Count backwards until you get to loop:. (Either there is some more information attached to the question, or known context, giving the details of the architecture I have assumed in this answer, or it is a fairly badly written question.)
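As a sanity check, the arithmetic in Python (the starting PC is made up, and the 4-byte instruction width is just the fixed length assumed above):

    def to_signed16(x):
        # interpret a 16-bit pattern as two's complement
        return x - 0x10000 if x & 0x8000 else x

    offset = to_signed16(0xFFFB)  # -5
    pc = 0x1020                   # hypothetical PC, already incremented past the branch
    target = pc + offset * 4      # 0x100c: five instructions back
    print(offset, hex(target))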
For the cache question, you mostly just have to know the names used to describe cache architectures (or "cache geometry", "cache shape" etc.). "2-way set associative" means there are two places in the cache any given address can be placed. There are 128 sets, each of which can hold two blocks because it is 2-way associative, and each block is 32-bytes. (I usually call the 32-byte structure a "cache line", though it appears here the word "block" refers specifically to the data that is stored there and "line" also includes the valid bit and tag, etc.) You then break down the address starting from the least significant bit going outwards in the cache geometry.
It looks like this is an instruction cache, so we insist the bottom two bits are 0 and organize the cache in 32-bit items. A 32-byte block needs 5 address bits: 2 are the "byte offset", which should probably just be 0, and the remaining 3 are what gets called the "block offset" (really the offset within the block). (This subdivision of the 5 low bits doesn't really change anything here.) 128 sets give a 7-bit "index". The rest of the address, 20 bits, has to be used to tag the block to make sure it holds the address being looked up (i.e. to determine cache hit or miss). Plus we need one more bit to say whether there's actually data in the block.
Then we just add it all up: 32 bytes or 256 bits for the data, plus 20 bits of tag and 1 valid bit, multiplied by 128 sets and 2 ways.
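Spelling that arithmetic out (the 32-bit address width is the assumption behind the 20-bit tag):

    ways, sets, block_bytes = 2, 128, 32

    block_offset_bits = (block_bytes - 1).bit_length()  # 5
    index_bits = (sets - 1).bit_length()                # 7
    tag_bits = 32 - index_bits - block_offset_bits      # 20

    bits_per_block = block_bytes * 8 + tag_bits + 1     # data + tag + valid = 277
    total_bits = bits_per_block * sets * ways           # 70912
    print(block_offset_bits, index_bits, tag_bits, total_bits)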

Synchronization in Manchester coding

Lately I have been reading about Manchester encoding and I think I'm beginning to understand most of it now, but I still have some whys that need addressing. Mainly three for the moment:
1) Most articles on the Internet introducing Manchester coding start by telling how bad NRZI really was, and one of the disadvantages mentioned is that synchronization becomes a problem when long runs of 1s or 0s get sent. Why is that a problem, since most places where NRZI is used have separate clock and data lines? As long as the clock signal is there, why should that ever be a problem?
2) Also, is Manchester supposed to work at a fixed frequency? Or can it work like I2C, where the clock frequency can be variable?
3) The good thing mentioned about Manchester encoding is that it does not require a separate clock line: the clock is embedded in the data and can be recovered by the receiver. Frequent transitions in Manchester help synchronization, the transitions happen in the middle of each bit, and so the clock can be recovered from them. But my question is: if there are repeated 1s or 0s, a transition can happen at the end of a bit as well as in the middle (see attached waveform pic; look at the transitions when sending 111). So when a receiver sees a transition, how does it figure out whether it is in the middle or at the end?
If I'm talking rubbish I would love to be corrected.
Regarding your third question: I'm also brushing up on Manchester, and it appears that to recover a clock you need a differential signal:
Reference: "Data Communications, Computer Networks and Open Systems" by Fred Halsall, page 104, figure 3.8
For the third question:
Whenever a signal is transmitted, a few redundant bits containing clock information are sent first.
For example 1111: the receiver then knows the real data will arrive next, and from those redundant bits the clock signal is extracted, along with the "notification" that a signal is about to come.
As for question 1, an NRZ scheme can send long runs of 1s and long runs of 0s, but the problem is actually with the long runs of 1s: if you try sending a long run of 1s with some modulation scheme and a dipole antenna, you can observe the power of the carrier signal starting to decay exponentially.
The other reason is the power needed to send that many consecutive 1s, which is not favourable!
For question 2, yes, it is possible to use a variable clock frequency, but the condition is that you send redundant bits before changing the clock frequency, so that the receiver understands the clock is changed from that point onwards.
Hope it’s clear now ;)
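To make the third question concrete, here is a small Python sketch using the IEEE 802.3 convention (a 1 is a low-to-high transition at mid-bit, a 0 is high-to-low). It reproduces the extra boundary transitions the question describes, and shows why a receiver that has locked its phase (from a preamble, say) still decodes unambiguously:

    def manchester_encode(bits):
        # each bit becomes two half-bit levels
        return [half for b in bits for half in ((0, 1) if b else (1, 0))]

    def manchester_decode(halves):
        # a phase-locked receiver just pairs the half-bits back up
        return [1 if (a, b) == (0, 1) else 0
                for a, b in zip(halves[0::2], halves[1::2])]

    signal = manchester_encode([1, 1, 1])
    print(signal)                     # [0, 1, 0, 1, 0, 1]: transitions at middles AND boundaries
    print(manchester_decode(signal))  # [1, 1, 1]

Pair the half-bits one half-bit out of phase and the same waveform decodes as all 0s instead, which is exactly why the initial phase lock from the redundant preamble bits described above matters.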

RISC-V: Immediate Encoding Variants

In the RISC-V Instruction Set Manual, User-Level ISA, I couldn't understand section 2.3 Immediate Encoding Variants (page 11).
There are four types of instruction formats, R, I, S, and U, and then there are variants of the S and U types, SB and UJ, which I suppose mean Branch and Jump, as shown in figure 2.3. Then there are the types of immediate produced by RISC-V instructions, shown in figure 2.4.
So my questions are: why are SB and UJ needed? Why shuffle the immediate bits in that way? What does it mean to say "the immediate produced by RISC-V instructions"? And how are they produced in this manner?
To speed up decoding, the base RISC-V ISA puts the most important fields in the same place in every instruction. As you can see in the instruction formats table,
The major opcode is always in bits 0-6.
The destination register, when present, is always in bits 7-11.
The first source register, when present, is always in bits 15-19.
The second source register, when present, is always in bits 20-24.
The other bits are used for the minor opcode or other data for the instruction (funct3 in bits 12-14 and funct7 in bits 25-31), and for the immediate. How many bits can be used for the immediate depends on how many register numbers are present in the instruction:
Instructions with one destination and two source registers (R-type) have no immediate, for instance adding two registers (ADD);
Instructions with one destination and one source register (I-type) have 12 bits for the immediate, for instance adding one register with an immediate (ADDI);
Instructions with two source registers and no destination register (S-type), for instance the store instructions, also have 12 bits for the immediate, but they have to be in a different place since the register numbers are also in a different place;
Finally, instructions with only a destination register and no minor opcode (U-type), for instance LUI, can use 20 bits for the immediate (the major opcode and the destination register number together need 12 bits).
Now think from the other point of view, of the instructions which will use these immediate values. The simplest users, I-immediate and S-immediate, need only a sign-extended 12-bit value. The U-immediate instructions need the immediate in the upper 20 bits of a 32-bit value. Finally, the branch/jump instructions need the sign-extended immediate in the lower bits of the value, except for the lowest bit which will always be zero, since RISC-V instructions are always aligned to even addresses.
But why are the immediate bits shuffled? Think this time about the physical circuit which decodes the immediate field. Since it's a hardware implementation, the bits will be decoded in parallel; each bit in the output immediate will have a multiplexer to select which input bit it comes from. The bigger the multiplexer, the costlier and slower it is.
The "shuffling" of the immediate bits in the instruction encoding, therefore, is to make each output immediate bit have as little input instruction bit options as possible. For instance, immediate bit 1 can only come from instruction bits 8 (S-immediate or B-immediate), 21 (I-immediate or J-immediate), or constant zero (U-immediate or R-type instruction which has no immediate). Immediate bit 0 can come from instruction bits 7 (S-immediate), 20 (I-immediate), or constant zero. Immediate bit 5 can only come from instruction bit 25 or constant zero. And so on.
Instruction bit 31 is a special case: for RV-64, bits 32-63 of the immediate are always copies of instruction bit 31. This high fan-out adds a delay, which would be even bigger if it also needed a multiplexer, so it only has one option (other than constant zero, which can be treated later in the pipeline by ignoring the whole immediate).
It's also interesting to note that only the major opcode (bits 0-6) is needed to know how to decode the immediate, so immediate decoding can be done in parallel with decoding the rest of the instruction.
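To make the bit positions concrete, here is a Python sketch of the RV32 immediate decoders, following the figure 2.4 layout:

    def bits(inst, hi, lo):
        return (inst >> lo) & ((1 << (hi - lo + 1)) - 1)

    def sign_extend(value, width):
        return value - (1 << width) if value & (1 << (width - 1)) else value

    def imm_i(inst):
        return sign_extend(bits(inst, 31, 20), 12)

    def imm_s(inst):
        return sign_extend((bits(inst, 31, 25) << 5) | bits(inst, 11, 7), 12)

    def imm_b(inst):  # bit 0 is always zero
        v = ((bits(inst, 31, 31) << 12) | (bits(inst, 7, 7) << 11)
             | (bits(inst, 30, 25) << 5) | (bits(inst, 11, 8) << 1))
        return sign_extend(v, 13)

    def imm_u(inst):  # sign bit is instruction bit 31, like all the others
        return sign_extend(bits(inst, 31, 12) << 12, 32)

    def imm_j(inst):  # bit 0 is always zero
        v = ((bits(inst, 31, 31) << 20) | (bits(inst, 19, 12) << 12)
             | (bits(inst, 20, 20) << 11) | (bits(inst, 30, 21) << 1))
        return sign_extend(v, 21)

    print(imm_i(0xFFF00093))  # addi x1, x0, -1  ->  -1

Note how each output bit only ever draws from a fixed handful of instruction bits, which is the multiplexer argument above.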
So, answering the questions:
SB-type doubles the range of branches, since instructions are always aligned to even addresses;
UJ-type has the same overall instruction format as U-type, but the immediate value is in the lower bits instead of the upper bits;
The immediate bits are shuffled to reduce the cost of decoding the immediate value, by reducing the number of choices for each output immediate bit;
The "immediate produced by RISC-V instructions" table shows the different kinds of immediate values which can be decoded from a RISC-V instruction, and from where in the instruction each bit comes from;
They are produced by, for each output immediate bit, using the major opcode (bits 0-6) to chose an input instruction bit.
The encoding is done to try and make the actual hardware implementation as simple as possible, rather than make it easy for the reader to understand at a glance.
In practice the compiler generates the instructions, so it does not matter that the encoding is not easy for a human to read.
When possible, the SB type uses the same bits for the same immediate bit positions as the S type, which minimizes hardware design complexity; so imm[4:1] and imm[10:5] are in the same place for both. The topmost bit of the immediate value is always at position 31, so that bit can be used to decide whether sign extension is needed. Again, this makes the hardware easier, because the same top bit drives the sign-extension decision for multiple types of instruction.
The RISC-V instruction encoding is chosen to simplify the decoder
2.2 Base Instruction Formats
The RISC-V ISA keeps the source (rs1 and rs2) and destination (rd) registers at the same position in all formats to simplify decoding. Except for the 5-bit immediates used in CSR instructions (Chapter 9), immediates are always sign-extended, and are generally packed towards the leftmost available bits in the instruction and have been allocated to reduce hardware complexity. In particular, the sign bit for all immediates is always in bit 31 of the instruction to speed sign-extension circuitry.
2.3 Immediate Encoding Variants
The only difference between the S and B formats is that the 12-bit immediate field is used to encode branch offsets in multiples of 2 in the B format. Instead of shifting all bits in the instruction-encoded immediate left by one in hardware as is conventionally done, the middle bits (imm[10:1]) and sign bit stay in fixed positions, while the lowest bit in S format (inst[7]) encodes a high-order bit in B format.
Similarly, the only difference between the U and J formats is that the 20-bit immediate is shifted left by 12 bits to form U immediates and by 1 bit to form J immediates. The location of instruction bits in the U and J format immediates is chosen to maximize overlap with the other formats and with each other.
https://riscv.org/technical/specifications/
The reason for the shuffling of the immediate in the SB/UJ formats has also been explained in the RISC-V spec:
Although more complex implementations might have separate adders for branch and jump calculations and so would not benefit from keeping the location of immediate bits constant across types of instruction, we wanted to reduce the hardware cost of the simplest implementations. By rotating bits in the instruction encoding of B and J immediates instead of using dynamic hardware muxes to multiply the immediate by 2, we reduce instruction signal fanout and immediate mux costs by around a factor of 2. The scrambled immediate encoding will add negligible time to static or ahead-of-time compilation. For dynamic generation of instructions, there is some small additional overhead, but the most common short forward branches have straightforward immediate encodings.

redundant encoding?

This is more of a computer science / information theory question than a straightforward programming one, so if anyone knows of a better site to post this, please let me know.
Let's say I have an N-bit piece of data that will be sent redundantly in M messages, where at least M-1 of those messages will be received successfully. I am interested in different ways of encoding the N-bit piece of data in fewer bits per message. (this is similar to RAID but at a much smaller level, where N = 8 or 16 or 32)
Example: suppose N = 16 and M = 4. Then I could use the following algorithm:
1st and 3rd message: send "0" + bits 0-7
2nd and 4th message: send "1" + bits 8-15
If I can guarantee that 3 of the 4 messages will get through, then at least one message from each group will get through. Thus I can make this work with 9 bits per message; there's probably a way to do this with fewer total bits, but I'm not sure how.
Are there some simple encoding/decoding algorithms to do this kind of thing? Does this problem have a name? (if I know what it's called, I can google it!)
note: in my particular case, the messages either arrive correctly or do not arrive at all (no messages arrive with errors).
(edit: moved 2nd part to a separate question)
(Incomplete answer follows. I may add more later.)
The term you may be interested in is channel coding: adding redundancy to a source in order to make it robust during transmission over a noisy channel. In information theory, the complementary problem to channel coding is source coding: reducing the redundancy in a source to represent it using fewer bits. (The combination of these two problems is called joint source-channel coding.)
Your first question asks to find a channel code. The simple example you give is similar to a repetition code, i.e., you send the same message more than twice (usually an odd number of times), and then the message which is received most often is accepted as the original message.
This code is inefficient. To use standard notation, let k = number of bits in original message, and n = number of bits in the transmitted message. For your example, k = 16 and n = 36. A measure of coding efficiency is k/n, where higher means more efficient. In your case, k/n = 0.44. This is low.
The repetition code is a simple kind of block code, i.e., redundancy is added to each block of k bits to create a codeword of n bits. So are the Hamming and Reed-Solomon codes as others mentioned. Hamming codes are relatively easy to understand with some basic linear algebra.
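For a taste, here is a minimal Hamming(7,4) sketch in Python: 4 data bits become a 7-bit codeword, and any single flipped bit can be located and corrected. (It handles bit corruption within a message; since your messages are dropped whole rather than corrupted, an erasure code is the closer fit.)

    # Parity bits sit at positions 1, 2, 4 (1-based); the syndrome is the
    # 1-based position of the flipped bit, or 0 if there was no error.
    def hamming74_encode(d):  # d = [d1, d2, d3, d4]
        p1 = d[0] ^ d[1] ^ d[3]
        p2 = d[0] ^ d[2] ^ d[3]
        p3 = d[1] ^ d[2] ^ d[3]
        return [p1, p2, d[0], p3, d[1], d[2], d[3]]

    def hamming74_decode(c):
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
        syndrome = s1 + 2 * s2 + 4 * s3
        if syndrome:
            c = c[:]
            c[syndrome - 1] ^= 1
        return [c[2], c[4], c[5], c[6]]

    cw = hamming74_encode([1, 0, 1, 1])
    cw[4] ^= 1                   # flip one bit in transit
    print(hamming74_decode(cw))  # [1, 0, 1, 1] recovered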
These should be enough terms for you to search on your own. Good luck.
I'm not sure if I understood all the details of your question correctly, but your problem is definitely about designing some kind of error-correcting code. This is a vast area of computer science, and thick tomes have been written about it. Start with Wikipedia and see if you can get any simple schemes (like Hamming or Reed-Solomon codes) to work in your case.
If you want to deal not only with symbol corruption but also with deletion of symbols, you should look at erasure codes; this is definitely a more difficult task, but good methods exist in many cases.
EDIT: This material from hackersdelight.org seems a nice introduction.
See erasure codes.
You're looking for a packet erasure code. There are only two useful packet erasure codes that are not totally encumbered by patents, and there's only one open-source library to implement those. Find it here: http://planete-bcast.inrialpes.fr/rubrique.php3?id_rubrique=5
Here's a trivially simple scheme that's almost twice as efficient as your example.
You chopped the message into blocks of (N/M)*2 bits. Instead, chop it into M-1 blocks of N/(M-1) bits each. (Round up if necessary.) The first block encodes as itself: enc[0]=src[0]. Likewise, the last encoded block is just the last source block: enc[M-1]=src[M-2]. Each encoded block in between is a source block XORed with its left neighbor: enc[i]=src[i-1]^src[i].
Prefix each encoded block with a log(M)-bit sequence number, essentially as you did, so the receiver can tell which was dropped. (If you can be sure that whichever blocks arrive will arrive in order, then a 1-bit sequence number will do. Just alternate 0 and 1.)
To decode, successively XOR from the left and the right until you hit the dropped block. E.g. src[1] == enc[0]^enc[1]. (Dropping one of the endpoint blocks isn't a special case -- e.g. if the first block is dropped, the scan from the right recovers it, and the scan from the left is of length 0.)
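In Python, the whole scheme looks like this (blocks modeled as small integers, sequence numbers omitted since the XOR structure is the point):

    def encode(src):  # M-1 source blocks -> M encoded blocks
        m = len(src) + 1
        enc = [src[0]]
        enc += [src[i - 1] ^ src[i] for i in range(1, m - 1)]
        enc.append(src[-1])
        return enc

    def decode(enc, dropped):  # enc[dropped] was lost
        m = len(enc)
        src = [None] * (m - 1)
        acc = 0
        for i in range(dropped):             # scan in from the left
            acc ^= enc[i]
            src[i] = acc
        acc = 0
        for i in range(m - 1, dropped, -1):  # scan in from the right
            acc ^= enc[i]
            src[i - 1] = acc
        return src

    src = [0b101010, 0b110011, 0b001111]  # e.g. N=18 bits, M=4 messages, 6-bit blocks
    enc = encode(src)
    for d in range(len(enc)):
        assert decode(enc, d) == src      # any single drop is recoverable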