I am learning about encoding and decoding, and my instructor gave me a homework question.
It is: are we still able to decode Huffman-compressed files without adding the Huffman tree or the
frequency table to the files?
You need some sort of description of the Huffman code. However that description does not have to be the frequencies or the tree. Typically a canonical Huffman code is used, and all that is needed to describe the code is the bit lengths of the code for each symbol.
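For example, here is a minimal C++ sketch (the symbols and lengths are made up for illustration) of how a canonical code can be rebuilt from nothing but the per-symbol code lengths:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

int main() {
    // symbol -> code length in bits (made-up example; a real decoder would
    // read these lengths from the compressed file's header)
    std::vector<std::pair<char, int>> lengths = {
        {'a', 2}, {'b', 2}, {'c', 3}, {'d', 3}, {'e', 3}, {'f', 3}};

    // Canonical rule: sort by (length, symbol); codes are consecutive
    // integers, shifted left by one bit whenever the length increases.
    std::sort(lengths.begin(), lengths.end(),
              [](const auto& x, const auto& y) {
                  return x.second != y.second ? x.second < y.second
                                              : x.first < y.first;
              });

    uint32_t code = 0;
    int prevLen = lengths.front().second;
    for (const auto& [sym, len] : lengths) {
        code <<= (len - prevLen);   // lengthen the code when the bit length grows
        prevLen = len;
        std::string bits;
        for (int i = len - 1; i >= 0; --i)
            bits += ((code >> i) & 1) ? '1' : '0';
        std::cout << sym << " -> " << bits << "\n";
        ++code;
    }
    return 0;
}

Both encoder and decoder can run the same procedure, so storing those few length bytes is enough for both sides to agree on the full code.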
I've been trying to implement a JPEG compression algorithm in Matlab.
The only part I'm having trouble implementing is the Huffman encoding. I understand the DCT, the quantization, and the zig-zag scan of the 8x8 matrix. I also understand how Huffman encoding works, in general.
What I do not understand is, after I have an output bitstream and a dictionary that translates consecutive bits to their original form, what do I do with the output? How can I tell a computer to translate that output bitstream using the dictionary I created for it?
In addition, each 8x8 matrix will have its own output and dictionary. How can all these outputs be combined into one? Because at the end of the day, the result is supposed to be an image.
I might have misunderstood some of the steps, in which case my apologies for any confusion caused by this.
Any help would be extremely appreciated!
EDIT: I'm sorry, my question apparently hasn't been clear enough. Say I use Matlab's built-in Huffman functions (huffmanenco and huffmandict): what am I supposed to do with the value huffmanenco returns?
The part about what to do with the output string of bits hasn't been clear to me as far as Huffman encoding goes in other IDEs and programming languages as well.
You have two choices with the Huffman coding:
1. Use a pre-canned Huffman table.
2. Make two passes over the data, where the first pass generates the Huffman tables and the second pass encodes.
You cannot have a different dictionary for each MCU.
You say you have the run-length encoded values. You Huffman encode those and write them to the output stream.
EDIT:
You need to be sure that the Matlab Huffman encoder is JPEG-compatible. There are different ways to do Huffman encoding.
You need to write the bits from the encoder to the JPEG stream. This means you need a bit-level I/O routine. Plus, you need to convert FF values in the compressed data into FF00 values in the JPEG stream (byte stuffing, so the entropy-coded data cannot be mistaken for a marker).
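A rough sketch of what such a bit-level output routine can look like (this is not Matlab's or any particular JPEG library's API, just an illustration of MSB-first bit packing plus the FF/00 byte stuffing):

#include <cstdint>
#include <vector>

class BitWriter {
public:
    void putBits(uint32_t value, int nbits) {        // write nbits, MSB first
        for (int i = nbits - 1; i >= 0; --i) {
            cur_ = static_cast<uint8_t>((cur_ << 1) | ((value >> i) & 1));
            if (++count_ == 8) flushByte();
        }
    }
    void finish() {                                   // pad the last byte with 1 bits
        while (count_ != 0) {
            cur_ = static_cast<uint8_t>((cur_ << 1) | 1);
            if (++count_ == 8) flushByte();
        }
    }
    const std::vector<uint8_t>& bytes() const { return out_; }

private:
    void flushByte() {
        out_.push_back(cur_);
        if (cur_ == 0xFF) out_.push_back(0x00);       // byte stuffing: FF -> FF 00
        cur_ = 0;
        count_ = 0;
    }
    uint8_t cur_ = 0;
    int count_ = 0;
    std::vector<uint8_t> out_;
};

The Huffman code and extra amplitude bits for each coefficient go through putBits, and the resulting byte vector is what gets written between the SOS marker and the EOI marker.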
I suggest getting a copy of
http://www.amazon.com/Compressed-Image-File-Formats-JPEG/dp/0201604434
to show how the encoding is done.
A Huffman table, in the JPEG standard, is generated from a collection of statistics in two steps. One of the steps implements the procedure given by this figure (the figure is given in Annex K of the JPEG standard):
The problem is here. Earlier, the standard (Annex C) says:
Huffman tables are specified in terms of a 16-byte list (BITS) giving the number of codes for each code length from
1 to 16. This is followed by a list of the 8-bit symbol values (HUFFVAL), each of which is assigned a Huffman code.
Obviously BITS is a list of 16 elements. But in the figure above, i is first set to 32 (i = 32) and then we access BITS[i]. I have probably misunderstood something, so could someone please give me an answer?
Here is the JPEG standard's description of the figure:
Figure K.3 gives the procedure for adjusting the BITS list so that no code is longer than 16 bits. Since symbols are paired
for the longest Huffman code, the symbols are removed from this length category two at a time. The prefix for the pair
(which is one bit shorter) is allocated to one of the pair; then (skipping the BITS entry for that prefix length) a code word
from the next shortest non-zero BITS entry is converted into a prefix for two code words one bit longer. After the BITS
list is reduced to a maximum code length of 16 bits, the last step removes the reserved code point from the code length
count.
Here is the code for the figure above:
void adjustBitLengthTo16Bits(vector<char>& BITS) {
    // BITS is indexed 1..32 here (the vector needs 33 entries); BITS[i] is
    // the number of codes of length i produced by the first counting pass.
    int i = 32, j = 0;
    while (1) {
        if (BITS[i] > 0) {
            // Find the next shorter non-empty length, skipping i-1
            // (the prefix length of the pair being removed).
            j = i - 1;
            j--;
            while (BITS[j] <= 0)
                j--;
            BITS[i] = BITS[i] - 2;          // remove a pair of codes of length i
            BITS[i - 1] = BITS[i - 1] + 1;  // their prefix becomes a code of length i-1
            BITS[j + 1] = BITS[j + 1] + 2;  // a code of length j becomes a prefix
            BITS[j] = BITS[j] - 1;          //   for two codes of length j+1
            continue;
        } else {
            i--;
            if (i != 16)
                continue;
            // All lengths above 16 are now empty; drop the reserved code point
            // from the longest remaining length.
            while (BITS[i] == 0)
                i--;
            BITS[i]--;
            return;
        }
    }
}
This code is only for encoders that want to generate their own custom Huffman tables. The majority of JPEG encoders just use fixed tables that are reasonable approximations of the statistics of most images.

In this particular case, the first step in generating a Huffman table for the AC coefficients produces a table of up to 32 entries (code lengths of up to 32 bits). Since there are only 256 unique symbols to encode (skip/length pairs), there should never be more than 32 bits needed to specify all of the Huffman codes. After the first pass has produced a set of codes (up to 32 bits in length), the second pass takes the least frequent (longest) codes and "moves" them into shorter length slots so that the maximum code length is 16 bits.

In an ideal Huffman table, the frequency distributions correspond to the code lengths. In this case, the table is being made to fit by squeezing the longest codes into slots reserved for shorter codes. This can be done because the 14/15/16-bit Huffman codes have "room" for more permutations of bits and can "fit" the longer codes in them.
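As a hypothetical illustration of the indexing, a driver for the function posted above could look like the following. The length counts are made up (but consistent with a full binary tree, which the procedure assumes), and BITS is sized 33 so that BITS[1]..BITS[32] exist, which is why i can start at 32:

#include <iostream>
#include <vector>
using std::vector;

int main() {
    std::vector<char> BITS(33, 0);                      // index 0 is unused
    for (int len = 1; len <= 17; ++len) BITS[len] = 1;  // one code per length 1..17
    BITS[18] = 2;                                       // two codes of length 18

    adjustBitLengthTo16Bits(BITS);

    for (int len = 1; len <= 32; ++len)
        if (BITS[len] > 0)
            std::cout << static_cast<int>(BITS[len])
                      << " codes of length " << len << "\n";
    return 0;
}

After the call, no entry above index 16 is non-zero, which is exactly the point of the adjustment.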
Update:
There is limited benefit to "optimizing" the Huffman tables in JPEG. Most of the compression occurs because of the quantization and DCT transform of the pixels. Switching to arithmetic coding has a measurable benefit (~10% size reduction), but then it limits the audience since most JPEG decoders don't support arithmetic coding due to past patent issues.
I have read and understood Huffman coding to some extent.
I did some googling but was unable to find other coding trees.
I need to make a comparison of the Huffman coding tree with other coding trees.
All I need are the names of coding trees; if you can provide a small description of each, it would be really helpful.
regards,
Aqif
Shannon-Fano was a predecessor of Huffman coding (Fano was Huffman's professor).
Hu-Tucker is another method.
Context tree weighting method (CTW) is a lossless compression and prediction algorithm.
WBTC (wavelet block tree coder) algorithm is used for image coding.
Block-zero tree coding (BZTC) algorithm is also used for image coding.
Some of these are in Wikipedia, and some from plain old Google searching.
I really need help with Huffman coding for lossless compression. I have an exam coming up and need to understand this. Does anyone know of easy tutorials for understanding it, or could someone explain?
The questions in the exam are likely to be:
Suppose the alphabet is [A, B, C], and the known probability distribution is P(A)=0.6,
P(B)=0.2 and P(C)=0.2. For simplicity, let’s also assume that both encoder and decoder know
that the length of the messages is always 3, so there is no need for a terminator.
How many bits are needed to encode the message ACB by Huffman Coding? You need to
provide the Huffman tree and Huffman code for each symbol. (3 marks)
How many bits are needed to encode the message ACB by Arithmetic Coding? You need to
provide details of the encoding process. (3 marks)
Using the above results, discuss the advantage of Arithmetic Coding over Huffman coding.
(1 mark)
Answers:
Huffman Code: A - 1, B - 01, C - 00.
The encoding result is 10001, so 5 bits are needed. (3 marks)
The encoding process of Arithmetic Coding:
Symbol   Low      High     Range
         0.0      1.0      1.0
A        0.0      0.6      0.6
C        0.48     0.6      0.12
B        0.552    0.576    0.024
The final binary codeword is 0.1001, which is 0.5625. Therefore 4 bits are needed. (3 marks)
In Huffman Coding, the length of the codeword for each symbol has to be a whole number of bits. But it
can be fractional in Arithmetic Coding. Therefore Arithmetic Coding is often more efficient
than Huffman Coding, as the results above show. (1 mark)
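To check the interval arithmetic, here is a small C++ sketch (assuming the sub-intervals are ordered A, B, C, as in the table above) that reproduces the Low/High/Range columns for the message ACB:

#include <cstdio>
#include <map>
#include <string>
#include <utility>

int main() {
    // Cumulative probability intervals for the alphabet, in the order A, B, C.
    std::map<char, std::pair<double, double>> cum = {
        {'A', {0.0, 0.6}}, {'B', {0.6, 0.8}}, {'C', {0.8, 1.0}}};

    double low = 0.0, high = 1.0;
    for (char s : std::string("ACB")) {
        double range = high - low;
        high = low + range * cum[s].second;   // narrow the interval to the
        low  = low + range * cum[s].first;    // sub-interval of this symbol
        std::printf("%c: low=%.4f high=%.4f range=%.4f\n", s, low, high, high - low);
    }
    // Any number in the final interval identifies the message; 0.1001 in binary
    // is 0.5625, which falls inside [0.552, 0.576), so 4 fractional bits suffice.
    return 0;
}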
http://en.wikipedia.org/wiki/Huffman_coding
If you look at the tree (top right) you'll see that each parent node is the sum of the two below it. The values at the nodes are the frequencies of the letters. Each bit in the binary sequence is a right/left branch in the tree.
Does that help?
I don't really have a clue about Arithmetic coding, but it looks quite clever.
A Huffman tree is a binary tree in which the most frequently occurring values in the stream being compressed sit near the root and values of decreasing frequency sit further and further from the root, thus allowing more common values to be encoded in shorter bit strings while less common values are encoded in longer strings.
A Huffman tree is constructed as follows:
1. Build a table of entities in the source stream, with their distribution.
2. Pick the two entries in the table that have the lowest distribution.
3. Make a tree node out of these two entries.
4. Remove the entries just used from the table.
5. Add a new entry to the table with the combined distribution of the nodes just removed, as well as the tree node.
6. If there is more than one entry left in the table, go to step 2.
7. The entry left in the table is your root.
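Here is a minimal C++ sketch of those steps using a min-heap ordered by frequency (the frequencies are made up to match the exam question above; tie-breaking may swap which branch gets 0 or 1, but the code lengths come out the same):

#include <iostream>
#include <queue>
#include <string>
#include <vector>

struct Node {
    char symbol;   // meaningful only for leaves
    long freq;
    Node* left;
    Node* right;
};

struct ByFreq {
    bool operator()(const Node* a, const Node* b) const { return a->freq > b->freq; }
};

void printCodes(const Node* n, const std::string& prefix) {
    if (!n->left && !n->right) {                 // leaf: emit its code
        std::cout << n->symbol << ": " << prefix << "\n";
        return;
    }
    printCodes(n->left, prefix + "0");
    printCodes(n->right, prefix + "1");
}

int main() {
    // Step 1: the table of symbols and their distribution.
    std::vector<std::pair<char, long>> table = {{'A', 60}, {'B', 20}, {'C', 20}};

    std::priority_queue<Node*, std::vector<Node*>, ByFreq> heap;  // min-heap by frequency
    for (const auto& e : table) heap.push(new Node{e.first, e.second, nullptr, nullptr});

    // Steps 2-6: repeatedly merge the two least frequent entries.
    while (heap.size() > 1) {
        Node* a = heap.top(); heap.pop();
        Node* b = heap.top(); heap.pop();
        heap.push(new Node{0, a->freq + b->freq, a, b});
    }

    printCodes(heap.top(), "");   // Step 7: the remaining entry is the root
    return 0;                     // (leaks the nodes; fine for a sketch)
}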
A basic Huffman implementation can be quite OK. But if you are building from scratch you may need more than one other data structure in your toolbox to make things easier, such as a min-heap and a bit vector. The basic algorithms for encoding and decoding are pretty simple. No info on comparison with arithmetic coding.
An implementation example
Is there an efficient way to get a fingerprint of an image for duplicate detection?
That is, given an image file, say a JPG or PNG, I'd like to be able to quickly calculate a value that identifies the image content and is fairly resilient to other aspects of the image (e.g. the image metadata) changing. If it deals with resizing, that's even better.
[Update] Regarding the metadata in JPG files, does anyone know if it's stored in a specific part of the file? I'm looking for an easy way to ignore it - e.g. can I skip the first x bytes of the file or take x bytes from the end of the file to ensure I'm not getting metadata?
Stab in the dark, if you are looking to circumvent meta-data and size related things:
Edge Detection and scale-independent comparison
Sampling and statistical analysis of grayscale/RGB values (average lum, averaged color map)
FFT and other transforms (Good article Classification of Fingerprints using FFT)
And numerous others.
Basically:
Convert JPG/PNG/GIF whatever into an RGB byte array which is independent of encoding
Use a fuzzy pattern classification method to generate a 'hash of the pattern' in the image ... not a hash of the RGB array as some suggest
Then you want a distributed method of fast hash comparison based on matching threshold on the encapsulated hash or encoding of the pattern. Erlang would be good for this :)
Advantages are:
Will, if you use any AI/Training, spot duplicates regardless of encoding, size, aspect, hue and lum modification, dynamic range/subsampling differences and in some cases perspective
Disadvantages:
Can be hard to code .. something like OpenCV might help
Probabilistic ... false positives are likely but can be reduced with neural networks and other AI
Slow unless you can encapsulate pattern qualities and distribute the search (MapReduce style)
Checkout image analysis books such as:
Pattern Classification 2ed
Image Processing Fundamentals
Image Processing - Principles and Applications
And others
If you are scaling the image, then things are simpler. If not, then you have to contend with the fact that scaling is lossy in more ways than sample reduction.
Using the byte size of the image for comparison would be suitable for many applications. Another way would be to:
1. Strip out the metadata.
2. Calculate the MD5 (or other suitable hashing algorithm) for the image.
3. Compare that to the MD5 (or whatever) of the potential dupe image (provided you've stripped out the metadata for that one too).
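A small sketch of that idea in C++: hash only the decoded pixel bytes, so metadata edits don't affect the result. Decoding the file to raw RGB is assumed to happen elsewhere (any imaging library will do), and a 64-bit FNV-1a hash stands in for MD5 here just to keep the example dependency-free:

#include <cstdint>
#include <vector>

// Fingerprint of the decoded pixel data only; EXIF/comment edits don't change it.
uint64_t pixelFingerprint(const std::vector<uint8_t>& rgb) {
    uint64_t h = 0xcbf29ce484222325ULL;          // FNV-1a offset basis
    for (uint8_t byte : rgb) {
        h ^= byte;
        h *= 0x100000001b3ULL;                   // FNV-1a prime
    }
    return h;
}

// Two files count as duplicates when their decoded pixels hash equal:
//   bool duplicate = pixelFingerprint(decode(fileA)) == pixelFingerprint(decode(fileB));
// where decode() is whatever routine your image library provides.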
You could use an algorithm like SIFT (Scale Invariant Feature Transform) to determine key points in the pictures and match these.
See http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
It is used e.g. when stitching images in a panorama to detect matching points in different images.
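For instance, with OpenCV 4.4 or later (where SIFT lives in the main module) a rough matching sketch looks like this; the file names are placeholders and 0.75 is the usual Lowe ratio-test threshold:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    cv::Mat a = cv::imread("a.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat b = cv::imread("b.jpg", cv::IMREAD_GRAYSCALE);

    auto sift = cv::SIFT::create();
    std::vector<cv::KeyPoint> kpA, kpB;
    cv::Mat descA, descB;
    sift->detectAndCompute(a, cv::noArray(), kpA, descA);
    sift->detectAndCompute(b, cv::noArray(), kpB, descB);

    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(descA, descB, knn, 2);

    int good = 0;
    for (const auto& m : knn)                         // Lowe ratio test
        if (m.size() == 2 && m[0].distance < 0.75f * m[1].distance) ++good;

    std::cout << good << " keypoints matched\n";      // many matches -> likely duplicates
    return 0;
}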
You want to perform an image hash. Since you didn't specify a particular language I'm guessing you don't have a preference. At the very least there's a Matlab toolbox (beta) that can do it: http://users.ece.utexas.edu/~bevans/projects/hashing/toolbox/index.html. Most of the google results on this are research results rather than actual libraries or tools.
The problem with MD5ing it is that MD5 is very sensitive to small changes in the input, and it sounds like you want to do something a bit "smarter."
Pretty interesting question. The fastest and easiest approach would be to calculate a CRC32 of the content byte array, but that would work only on 100% identical images. For a more intelligent comparison you would probably need some kind of fuzzy logic analysis...
I've implemented at least a trivial version of this. I transform and resize all images to a very small (fixed size) black and white thumbnail. I then compare those. It detects exact, resized, and duplicates transformed to black and white. It gets a lot of duplicates without a lot of cost.
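A sketch of that kind of thumbnail comparison (an "average hash" style fingerprint): it assumes you have already scaled each image down to an 8x8 grayscale thumbnail, which is the library-dependent part:

#include <bitset>
#include <cstdint>
#include <numeric>
#include <vector>

// thumb64 holds 64 grayscale pixels (an 8x8 thumbnail, row-major).
uint64_t thumbHash(const std::vector<uint8_t>& thumb64) {
    unsigned avg = std::accumulate(thumb64.begin(), thumb64.end(), 0u) / thumb64.size();
    uint64_t hash = 0;
    for (size_t i = 0; i < 64; ++i)
        if (thumb64[i] > avg) hash |= (1ULL << i);   // 1 bit per pixel: above/below average
    return hash;
}

// Images are "probably the same" when the hashes differ in only a few bits.
int hammingDistance(uint64_t a, uint64_t b) {
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}

Hashes within a few bits of each other usually mean the same picture; exact equality is not required, which is what makes this resilient to resizing and minor re-encoding.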
The easiest thing to do is to do a hash (like MD5) of the image data, ignoring all other metadata. You can find many open source libraries that can decode common image formats so it's quite easy to strip metadata.
But that doesn't work when the image itself is manipulated in any way, including scaling and rotating.
To do exactly what you want, you would have to use image watermarking, but it's patented and can be expensive.
This is just an idea: possibly the low-frequency components present in the DCT of the JPEG could be used as a size-invariant identifier.
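A sketch of that idea (close to what perceptual-hash tools do): compute the DCT of a small grayscale thumbnail and keep only the low-frequency corner as a 64-bit fingerprint. The 32x32 thumbnail and the plain O(n^4) DCT are assumptions made only to keep the example self-contained:

#include <cmath>
#include <cstdint>
#include <vector>

// thumb holds an n x n grayscale thumbnail in row-major order (n = 32 here).
uint64_t dctHash(const std::vector<double>& thumb, int n = 32) {
    const double PI = 3.14159265358979323846;

    // Naive 2D DCT-II, computed only for the 8x8 low-frequency corner.
    double dct[8][8];
    for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8; ++v) {
            double sum = 0.0;
            for (int x = 0; x < n; ++x)
                for (int y = 0; y < n; ++y)
                    sum += thumb[x * n + y] *
                           std::cos((2 * x + 1) * u * PI / (2.0 * n)) *
                           std::cos((2 * y + 1) * v * PI / (2.0 * n));
            dct[u][v] = sum;
        }

    // Compare each low-frequency coefficient against their average
    // (the DC term is skipped so overall brightness doesn't dominate).
    double avg = 0.0;
    for (int i = 1; i < 64; ++i) avg += dct[i / 8][i % 8];
    avg /= 63.0;

    uint64_t hash = 0;
    for (int i = 0; i < 64; ++i)
        if (dct[i / 8][i % 8] > avg) hash |= (1ULL << i);
    return hash;
}

Because everything is reduced to a fixed-size thumbnail before the transform, the fingerprint is largely independent of the original image size, and two fingerprints can be compared by Hamming distance.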