Does DBI (data bus inversion) conserve Entropy? - encoding

I have been reading up on DBI on Wikipedia, which references this research paper: http://www.cs.columbia.edu/~cs4823/handouts/stan-burleson-tvlsi-95.pdf
The paper says:
While the maximum number of transitions is reduced by half the
decrease in the average number of transitions is not as good. For an
8-bit bus for example the average number of transitions per time-slot
by using the Bus-invert coding becomes 3.27 (instead of 4), or 0.41
(instead of 0.5) transitions per bus-line per time-slot.
However this would suggest it reduces the entropy of the 8 bit message, no?
So the entropy of a random 8-bit message is 8 bits (duh). Adding a DBI bit shifts the probability distribution of transitions to the left, but (I thought) it wouldn't reduce the area under the curve: you should still be left with a minimum of 8 bits of entropy, just spread over 9 lines. But they claim the average is now 0.41 instead of 0.5, which suggests the entropy is now -log2(0.59^9) ≈ 6.85. I would have assumed the average would (at best) become 0.46, giving -log2(0.54^9) ≈ 8.
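For reference, the paper's 3.27 figure can be reproduced by averaging the per-time-slot transition cost over the binomial distribution of Hamming distances between successive random words; here is a small C sketch of that check (my own, not code from the paper):

#include <stdio.h>

/* Expected transitions per time slot on an 8-bit bus with bus-invert coding:
   the Hamming distance H between the previous bus value and the new data is
   Binomial(8, 1/2) for random data. If H > 4 the bus is inverted, which costs
   (8 - H) data-line transitions plus 1 transition on the invert line. */
int main(void)
{
    const int n = 8;
    const double binom[9] = {1, 8, 28, 56, 70, 56, 28, 8, 1}; /* C(8,k) */
    double total = 0.0;
    for (int k = 0; k <= n; k++) {
        double cost = (k <= n / 2) ? k : (n - k + 1);
        total += binom[k] * cost;
    }
    printf("average transitions = %.2f\n", total / 256.0); /* ~3.27 */
    return 0;
}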
Am I misunderstanding something?


Why are both Viterbi and Reed-Solomon used in DVB-T?

From my understanding, DVB-T packets go through two FEC systems: Viterbi (convolutional) coding, which can cope with error rates of up to roughly 50%, and Reed-Solomon, which handles up to roughly 10%. These are referred to as the inner and outer codes.
I can't understand the need for the second, Reed-Solomon coding (in which case an additional 16 parity bytes are added to each 188-byte MPEG-TS packet).
More specifically, what happens to packets that are, say, 55% corrupted? Are 50% of the errors fixed by the Viterbi decoder and the remaining 5% by the RS decoder?
Sorry for my dumbness.
The abilities and targets of Viterbi and RS coding differ considerably. Viterbi coding is applied close to the baseband/analog level, where each bit has a high probability of being corrupted. This is combated with a scheme in which not all combinations of e.g. '00000' through '11111' are possible, but where every second bit, or 1/3 or 2/3 of the bits, are correction bits calculated from the history of some N previously transferred bits.
This causes a comparatively high expansion of the data, with the possibility of correcting typically around half of the individual bit errors. Note that bit errors can occur in the correction bits as well...
This kind of bit error correction mitigates errors mostly on AWGN channels and somewhat on Rayleigh fading channels (a simulation model for signal fading due to a moving vehicle with multi-path propagation, i.e. the same signal arriving via multiple paths).
Because the "window" of the Viterbi code is small, a burst error covering the complete window (e.g. 7 bits) cannot be corrected at all. Thus a second code is needed: a Reed-Solomon coder (in DVB or CD) works with 8-bit symbols, i.e. when even a single bit in a symbol is corrupted, the complete symbol has to be corrected.
The idea, then, is that the inner (Viterbi) code reduces sporadic single-bit errors to a manageable level, leaving essentially burst errors (long periods of unreceived signal) to the outer (Reed-Solomon) code.
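For concrete numbers (standard DVB-T parameters given here for reference, not taken from the answer above): the outer code is a shortened Reed-Solomon code, RS(204,188), which appends 16 parity bytes to each 188-byte MPEG-TS packet and can correct up to (204 - 188) / 2 = 8 erroneous bytes per packet, regardless of where in the packet they fall. A trivial sketch of that arithmetic:

#include <stdio.h>

/* DVB-T outer code: shortened Reed-Solomon RS(204,188). */
int main(void)
{
    const int n = 204, k = 188;
    printf("parity bytes per packet:  %d\n", n - k);        /* 16 */
    printf("correctable byte errors:  %d\n", (n - k) / 2);  /* 8  */
    return 0;
}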

Why, in Manchester encoding, is the bit rate half of the baud rate?

I think the baud rate is the rate of symbols, and if each symbol contains n bits, then the bit rate should be n x the baud rate.
In Ethernet (Manchester encoding), if the bit rate is half of the baud rate, does a symbol contain half a bit? As far as I know, the bit rate should at least not be less than the symbol rate (baud rate).
I have no problem with the relationship between baud rate and bit rate in general, yet when it comes to Manchester code it is totally counterintuitive. Could anyone explain this?
Bit rate is related to the speed of transmission of the digital bits, while baud rate is related to the speed of change of symbols, which are distinct states of the analog signal. These can be changes in amplitude, frequency or phase, or more complex modulation methods. In Manchester encoding, one bit is represented by two different voltage levels. Therefore, if you want to transfer 1 Mbit of digital data in one second, you will need to make about 2 million changes in the level of the analog signal. That is why your bit rate will be 1 Mbit/s while your baud rate will be 2 Mbaud.
In NRZ encoding, one bit is represented by one symbol. Therefore the rates will be equal.
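To make that concrete, here is a small C sketch (my own illustration, using one common convention, also described in the answer below: 1 is sent as low-to-high, 0 as high-to-low) showing that each data bit becomes two signal levels, so the symbol (baud) rate is twice the bit rate:

#include <stdio.h>
#include <string.h>

/* Manchester encoding sketch: each bit maps to two half-bit signal levels. */
int main(void)
{
    const char *bits = "10110";
    printf("bits:   %s (%zu bits)\n", bits, strlen(bits));
    printf("levels: ");
    for (const char *p = bits; *p; p++) {
        if (*p == '1')
            printf("LH ");  /* 1: low first half, high second half  */
        else
            printf("HL ");  /* 0: high first half, low second half  */
    }
    printf("(%zu symbols)\n", 2 * strlen(bits));
    return 0;
}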
The Wikipedia article for Baud says that it can be defined as pulses per second. In the case of Manchester Encoding, this results in the baud rate being defined as "clock transitions".
A transition is what occurs when the signaling voltage goes from a low voltage to a high voltage, or vice versa. If you look at a timing diagram of a Manchester-encoded wave alongside its clock:
You will notice that the Manchester wave always makes a transition from either low to high or high to low when the clock transitions from high to low. The bits are encoded in that transition; a transition from low to high indicates a 1, and a transition from high to low indicates a 0. The low-to-high clock transitions are used to get the Manchester wave in a position where it can make the correct transition for the next bit. As you can see, there are never more than two clock transitions between one Manchester transition and the next; the clock is effectively encoded in the Manchester wave itself.
If the bits were encoded in a single clock transition (i.e. high being 1 and low being 0), then the clock (baud) rate and the bit rate would be the same, but then you would have to run a separate line for the clock. Because Manchester guarantees a transition in every bit period, the receiver can recover the clock from the data signal itself, at the cost of the baud rate being twice the bit rate.
You can think of Manchester encoding as transmitting not only the actual data but also the clock (metadata), thanks to its self-clocking characteristic.
http://en.wikipedia.org/wiki/Self-clocking_signal
All you need to understand is that within any one bit period in Manchester encoding (i.e. whether the bit is a 1 or a 0) there is a transition, as depicted in the diagrams above; the sole reason for that transition is to let the receiver synchronize.
That said, if we compare this encoding scheme to others like NRZ, Manchester uses two signaling intervals per bit where NRZ uses one: for a sequence of 10101, Manchester transmits 10 symbols while NRZ transmits 5. This means the baud rate for Manchester would be 10 while for NRZ it would be 5.
In design terms, we would say that if a receiver is capable of syncing to a baud rate of 10, then with Manchester it carries five bits, while with NRZ it would carry 10 bits.

Enhancing 8 bit images to 16 bit

My objective is to enhance 8-bit images to 16-bit ones. In other words, I want to increase the dynamic range of an 8-bit image. To do that, I can sequentially take multiple 8-bit images of a fixed scene with a fixed camera. To simplify the issue, let's assume they are grayscale images.
Intuitively, I think I can achieve the goal by
Multiplying two 8 bit images
resImage = double(img1) .* double(img2)
Averaging specified number of 8 bit images
resImage = mean(images,3)
assuming images(:,:,i) contains the ith 8-bit image.
After that, I can convert the resulting image to a 16-bit one.
resImage = uint16(resImage)
But before testing these methods, I wonder whether there is another way to do this (other than buying a 16-bit camera), or whether there is literature on the subject that would be better.
UPDATE: As the comments below show, I got great information on the drawbacks of the simple averaging above and on image stacks for the enhancement, so it may be a good topic to study after all. Thanks to all for your great comments.
This question appears to relate to increasing the Dynamic Range of an image by integrating information from multiple 8 bit exposures into a 16 bit image. This is related to the practice of capturing and combining "image stacks" in astronomical imaging among other fields. An explanation of this practice and how it can both reduce image noise, and enhance dynamic range is available here:
http://keithwiley.com/astroPhotography/imageStacking.shtml
The idea is that successive captures of the same scene are subject to image noise, and this noise leads to stochastic variation of the captured pixel values. In the simplest case these variations can be leveraged by summing and dividing, i.e. mean-averaging the stack, to improve its dynamic range, but the practicality depends very much on the noise characteristics of the camera.
You want to sum many images together, assuming there is no jitter and the camera is steady. Accumulate a large sum and then divide by some amount.
Note that to get anything approaching a true 16-bit image from an 8-bit source, you'd need to sum hundreds of images. Jitter will distort edge information, and the camera has some inherent noise floor that might mean you are essentially 'grinding metal'. In a practical sense, you might get 2 or 3 more bits of data from image summing, but not 8 more. To get 3 more bits you would need to sum at least 64 images (6 extra bits in the sum) and then divide by 8 (3 bits), as the lower bits are garbage.
The rule of thumb is that each extra bit of resolution requires quadrupling the number of images (the square of 2^bits), so 3 extra bits means 64 images, 4 bits would be 256 images, etc.
Here's a link that talks about sampling:
http://electronicdesign.com/analog/understand-tradeoffs-increasing-resolution-averaging
"In fact, it can be shown that the improvement is proportional to the square root of the number of samples in the average."
Note that SNR is on a log scale (each extra bit of resolution corresponds to about 6 dB), so equating it to bits is reasonable.
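To make the summing arithmetic above concrete, here is a small C sketch (my own toy example with made-up noise figures, not from the linked article) that averages many noisy 8-bit readings of the same intensity to recover sub-LSB detail:

#include <stdio.h>
#include <stdlib.h>

/* Toy illustration: average 64 noisy 8-bit samples of one "true" intensity.
   The noise (assumed here to span a couple of LSBs) dithers the readings,
   so the scaled sum carries roughly 3 extra bits of usable resolution. */
int main(void)
{
    const double true_value = 100.4;  /* sub-LSB detail one 8-bit sample cannot hold */
    const int num_frames = 64;
    long sum = 0;

    srand(42);
    for (int i = 0; i < num_frames; i++) {
        double noise = ((rand() % 401) - 200) / 100.0;  /* roughly +/- 2 LSB */
        int sample = (int)(true_value + noise + 0.5);   /* one 8-bit reading */
        if (sample < 0)   sample = 0;
        if (sample > 255) sample = 255;
        sum += sample;
    }

    /* The sum of 64 8-bit frames fits in 14 bits; dividing by 8 leaves an
       11-bit result, i.e. about 3 extra bits compared with a single frame. */
    printf("mean = %.3f (true %.1f), scaled 11-bit value = %ld\n",
           (double)sum / num_frames, true_value, sum / 8);
    return 0;
}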

Amdahl's law example

Can someone help me with this example please and show me how to work the second part?
the question is :
If one third of a weather prediction algorithm is inherently serial and the remainder
parallelizable, what is the minimum number of cores needed to guarantee a 150% speedup over a
single core implementation?
ii. Your boss revises the figure to 200%. What is your new answer?
Thanks very much in advance !!
Guess: If the algorithm is 1/3 serial and 2/3 parallel...I would think that each core you added would give you a 66% increase in performance...So for 150% increase, you'd need 3 more cores, and for a 200% increase, you'd need 4.
This is a guess. Your textbook might be more helpful :)
If the algorithm runs on a single core and takes 90 minutes, then 30 minutes is spent in the serial part and 60 minutes in the parallel part.
Add a CPU:
30 minutes is spent in the serial part and 30 in the parallel part (the 60 minutes of parallel work is split across the two cores).
90 / 60 = 1.5, i.e. a 150% speedup.
I am a bit late, but here are the answers:
1) 150% speedup -> at least 2 cores required, as dbasnett said;
2) 200% speedup -> at least 4 cores required, based on Amdahl's law:
speedup(N) = 1 / ((1 - P) + P / N)
Here, 90 minutes overall are required to perform the calculation. P is the enhanced (parallelizable) fraction of the algorithm, which is 2/3 (60 of the 90 minutes), and N is the number of cores. With a single core:
speedup(1) = 1 / ((1 - 2/3) + (2/3) / 1) = 1
You get 1, which means 100%: the algorithm runs the standard way, with no parallelization speedup.
Now we must find the N for which the equation equals 2, where 2 means the algorithm completes in half the time (45 minutes instead of 90), i.e. a 200% speedup:
1 / (1/3 + 2 / (3N)) = 2
Since:
1/3 + 2 / (3N) = 1/2, so 2 / (3N) = 1/6
we see that:
N = 4
So with 4 cores computing the 2/3 parallelizable part of the algorithm in parallel, you get a 200% speedup. The same goes for 150%: you get N = 2, as dbasnett already told you.
Pretty simple.
Note that a complex algorithm may imply further division of its parallelizable part (and in theory you could use a different number of processing units for each parallelizable part concurrently).
You can further look at Wikipedia (there's also an example):
http://en.wikipedia.org/wiki/Amdahl%27s_law#Description
Anyway, the principle is the same. Let T be the time an algorithm needs in order to complete, A the execution time of its serial part, B that of its parallelizable part, and N the number of parallel CPUs:
T = A + B
T(N) = A + B / N
You can also divide B into smaller sections, say C, D, ..., G, and perform the calculations for each part in parallel:
T(N) = A + C / N + D / N + ... + G / N
You may, for C, D and G, e.g. adopt M CPUs instead of N (the speedup will of course differ if M != N).
At the end, you will arrive at a point where having more CPUs doesn't matter any more, since:
B / N -> 0 as N -> infinity, so T(N) -> A
and your algorithm's speedup will at most tend to the total execution time (T) divided by the execution time of the serial part alone (A):
maximum speedup = T / A = (A + B) / A
Therefore, parallel computation really comes in handy only when the serial part of your algorithm has a low execution time.
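As a quick sanity check of the numbers above, Amdahl's formula can be evaluated directly; here is a small C sketch (my own, using the question's parallel fraction P = 2/3):

#include <stdio.h>

/* Amdahl's law: speedup(N) = 1 / ((1 - P) + P / N). */
static double speedup(double p, int n)
{
    return 1.0 / ((1.0 - p) + p / n);
}

int main(void)
{
    const double p = 2.0 / 3.0;
    for (int n = 1; n <= 5; n++)
        printf("N = %d -> speedup = %.2fx\n", n, speedup(p, n));
    /* Prints 1.00x, 1.50x, 1.80x, 2.00x, 2.14x: 2 cores reach the 150%
       target and 4 cores reach 200%, matching the answers above. */
    return 0;
}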

Help designing a hash function to detect duplicate records?

Let me explain my program thus far. It is a Rubik's cube solver. I am given a scrambled cube (this is the initial state), which becomes the root node of a graph. I am using iterative deepening depth-first search to "brute force" this scrambled cube into a recognizable state, which I can then solve using pattern recognition.
As you can imagine, this is a very large graph, so I would like to come up with some sort of hashing functionality to detect duplicate nodes in this graph (thus speeding up the traversal).
I am largely unfamiliar with hashing functions, but here is what I am thinking... Each node is essentially a different state of the Rubik's cube. So if I come to a cube state (node) that has already been seen, I want to skip over it. I therefore need a hashing function that takes me from the state variable to a checksum, where the state variable is a 54-character string. The only allowed characters are y, r, g, o, b, w (which correspond to colors).
Any help designing this hash function would be greatly appreciated.
For the fastest duplicate detection and removal - avoid generating many of the repeated positions in the first place. This is easy to do and quicker than generating and then finding the repeats. So for example if you have moves like F and B, if you allow the sub sequence FB don't also allow BF, which gives the same result. If you've just done 3F, don't follow it with F. You can generate a small look-up table for allowed next moves, given the last three moves.
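Here is a minimal sketch of that pruning rule (my own illustration; the face numbering and the pairing of opposite faces are assumptions, not from the answer above):

#include <stdio.h>
#include <stdbool.h>

/* Faces are numbered 0..5 with opposite faces sharing the same value mod 3,
   e.g. (0,3), (1,4), (2,5). Never turn the same face twice in a row, and for
   commuting opposite-face pairs (F then B equals B then F) allow only one of
   the two orderings. */
static bool allowed_next(int prev_face, int next_face)
{
    if (next_face == prev_face)
        return false;  /* e.g. F followed by F: merge into a single move  */
    if (next_face % 3 == prev_face % 3 && next_face < prev_face)
        return false;  /* opposite faces: keep FB, drop the equivalent BF */
    return true;
}

int main(void)
{
    printf("%d %d %d\n",
           allowed_next(0, 0),   /* 0: same face twice is pruned */
           allowed_next(0, 3),   /* 1: F then B is kept          */
           allowed_next(3, 0));  /* 0: B then F is pruned        */
    return 0;
}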
For the remaining duplicates you want a fast hash, because there are a lot of positions. To make your hash go fast, as others have commented, you want the thing it hashes from, the representation of the position, to be small. There are 12 edge cubies and 8 corner cubies. Representing each cubie's position and orientation need take only five bits per cubie, i.e. 100 bits (12.5 bytes) total. For edges it's four bits for position and one for flip; for corners it's three bits for position and two for spin. You can ignore the last edge cubie, since its position and flip are fixed by the others. With this representation you are already down to 12 bytes for the position.
You have about 70 real bits of information in a Rubik's cube position, and 96 bits is close enough to 70 that hashing those bits any further is actually counter-productive. I.e. treat this representation of the board as your hash. That may sound a bit strange, but from your question I'm envisaging you at the same time experimenting with a less compact representation of the cube that is more amenable to your pattern matching. In that case the 12-byte value can be treated as if it were a hash, with the advantage that it's a hash that never has a collision. That makes the duplicate-testing code and new-value insertion shorter, simpler and faster. It's going to be cheaper than the MD5 solutions suggested so far.
There are many other tricks you could use to cut down the work in searching for repeated positions. Have a look at http://cube20.org/ for ideas.
You can always try a cryptographic hash function. Since your problem is not a question of security (there is no attacker purposely trying to find distinct states which hash to the same value), you can use a broken hash function. I recommend trying MD4, which is quite fast. Your 54-character string is quite appropriate for MD4 input (MD4 can process inputs up to 55 bytes as a single block).
A basic 2.4 GHz PC can hash about 12 million such strings per second, using a single core, with a simple unrolled C implementation (e.g. one that looks like the MD4Transform() function in the sample code included in RFC 1320). This may be enough for your needs.
1) Don't Use A Hash
You have 9*6 = 54 separate facelets (stickers) on a Rubik's cube. Even wastefully using 1 byte per facelet this is only 432 bits, so hashing won't save you too much space. A better packing of 3 bits per facelet comes to 162 bits (21 bytes). It sounds to me like you need a compact way to represent the cube state.
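Here is a sketch of that 3-bits-per-facelet packing (my own illustration, using the y/r/g/o/b/w characters from the question):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Map the six colour characters to 0..5 and pack 54 facelets * 3 bits
   = 162 bits into a 21-byte buffer. */
static int color_code(char c)
{
    const char *colors = "yrgobw";
    return (int)(strchr(colors, c) - colors);  /* assumes valid input */
}

static void pack_state(const char state[54], uint8_t out[21])
{
    memset(out, 0, 21);
    for (int i = 0; i < 54; i++) {
        int bit = 3 * i;
        int v = color_code(state[i]);
        out[bit / 8]     |= (uint8_t)(v << (bit % 8));      /* low part      */
        out[bit / 8 + 1] |= (uint8_t)(v >> (8 - bit % 8));  /* spill, if any */
    }
}

int main(void)
{
    uint8_t packed[21];
    char state[55];
    memset(state, 'y', 54);  /* made-up (not necessarily solvable) state */
    state[54] = '\0';
    pack_state(state, packed);
    printf("first packed byte: 0x%02x\n", packed[0]);
    return 0;
}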
OTOH, if you are looking to store a set of many many previously-visited states then I've found that using a bloom filter instead of a true set gets me decent results (but often non-optimal) with much lower space utilization.
2) If you are married to the idea of a hash:
Just use MD5; it's slightly more compact than the proposed Rubik's cube state representations, rather fast, and has good collision properties. It's not like you have a malicious adversary trying to cause Rubik's cube hash collisions ;-).
EDIT: Using cryptographic hash functions, such as MD4/MD5, is usually simple once you have a library or function implementing the algorithm (ex: OpenSSL, GNU TLS, and many stand-alone implementations exist). Usually the function is something like void md5(unsigned char *buf, size_t len, unsigned char *digest) where digest points to a pre-allocated 16 byte buffer and buf is the data to be hashed (your rubik cube structure). Here is some untested C code:
#include <stdio.h>
#include <string.h>
#include <openssl/md5.h>

#define BUFLEN 54  /* length of the 54-character cube state string */

int main(void)
{
    unsigned char digest[MD5_DIGEST_LENGTH];  /* 16 bytes */
    unsigned char buf[BUFLEN];
    memset(buf, 'y', BUFLEN);                 /* stand-in for initializeBuffer(buf) */
    MD5(buf, BUFLEN, digest);                 /* this is the OpenSSL function */
    for (int i = 0; i < MD5_DIGEST_LENGTH; i++)
        printf("%02x", digest[i]);            /* stand-in for printDigest(digest) */
    printf("\n");
    return 0;
}
And be sure to compile/link with -lcrypto (MD5 lives in OpenSSL's libcrypto).
8 corner cubes:
You can assign each of these corners to 8 positions which each require 3 bits to determine which corner cube is at which position for a total of 24 bits.
You can further reduce this to just recording 7-of-8 positions as you can easily use a process of elimination to determine what the 8th corner is (for 21 bits).
However, this can be reduced further as the 8 corners can only be arranged in 8! = 40320 permutations and 40320 can be represented in 16 bits.
Each corner cube can be oriented correctly or be rotated 120° clockwise or anti-clockwise, giving three different orientations (represented as 0, 1 and 2 respectively).
This requires 2 bits per corner to represent.
However, the sum of the orientations (modulo 3) is always 0; so, if you know 7-of-8 orientations then (assuming you have a solvable cube) you can calculate the orientation of the 8th corner (giving a total of 14 bits).
Or for a further reduction, seven ternary (base 3) digits can represent the orientation of the corners and this can be represented in 12 binary digits (bits).
So the corner cubes can be represented in 28 bits, if you want to decode the permutations, or in 33 bits, if you want to directly record the positions of 7-of-8 corners.
12 edge cubes:
Each can be represented in 4 bits (for a total of 48 bits), which can be reduced to 44 bits by only recording the positions of 11-of-12 edges.
However, the 12! = 479001600 permutations of the edges can be stored in 29 bits.
Each edge can either be oriented correctly or flipped:
This requires 1 bit to represent.
However, edges are always flipped in pairs so the parity of the flipped edges will always be zero (again, meaning that you only need to record 11-of-12 orientations for the edges) giving a total of 11 bits required.
So edge cubes can be represented in 40 bits, if you want to decode the permutations, or in 55 bits if you want to record all the positions and flips of 11-of-12 edges.
6 centre cubes:
You do not need to record any information about the centre cubes: they are fixed relative to the core at the centre of the Rubik's cube, so (assuming you are not worried about the orientation of any logos on the cube) they are immobile.
Total:
Using permutations: 68 bits
Using positions: 88 bits
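To illustrate the "using permutations" route above, here is a sketch (my own, not from the answer) that ranks an 8-corner permutation into 0..40319, which fits in 16 bits, and packs seven corner orientations into a base-3 number 0..2186, which fits in 12 bits:

#include <stdio.h>

/* Rank a permutation of 0..7 with a Lehmer code (0 .. 8! - 1 = 40319). */
static unsigned rank_permutation(const int perm[8])
{
    unsigned rank = 0;
    for (int i = 0; i < 8; i++) {
        unsigned smaller = 0;
        for (int j = i + 1; j < 8; j++)
            if (perm[j] < perm[i])
                smaller++;               /* Lehmer digit for position i */
        rank = rank * (8 - i) + smaller; /* mixed-radix accumulation    */
    }
    return rank;
}

/* Pack 7 corner orientations (0..2 each) into 0 .. 3^7 - 1 = 2186; the
   eighth orientation is implied because the sum is 0 modulo 3. */
static unsigned pack_orientations(const int ori[7])
{
    unsigned packed = 0;
    for (int i = 0; i < 7; i++)
        packed = packed * 3 + (unsigned)ori[i];
    return packed;
}

int main(void)
{
    int perm[8] = {7, 6, 5, 4, 3, 2, 1, 0};  /* made-up example values */
    int ori[7]  = {2, 2, 2, 2, 2, 2, 2};
    printf("corner permutation rank: %u (max 40319)\n", rank_permutation(perm));
    printf("corner orientation pack: %u (max 2186)\n", pack_orientations(ori));
    return 0;
}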
Just to establish the theoretical minimum representation - the state space of a valid Rubik's cube is about 4.3*10^19. Log2(4.3*10^19) will then determine how many bits you need to represent that full space, the ceiling of which is 66. So in theory, if you could number every valid state, any given state could be uniquely represented in 66 bits.
While you may want to follow others' advice and find a more compact way of representing the cube, consider representing the state in terms of edge, corner, and face pieces. Due to the swapping laws of legal cube moves, you should be able to concatenate a sequence of 12 4-bit edge locations, 8 3-bit corner locations, and 6 3-bit face locations. This should result in a unique representation using 90 bits.
This representation may not be conducive to the way you are creating your tree, but it is unique, easily comparable, and should be possible to find given a state in your existing representation.