MATLAB - JPEG Compression: Huffman Encoding

I've been trying to implement a JPEG compression algorithm in MATLAB.
The only part I'm having trouble implementing is the Huffman encoding. I understand the DCT, the quantization, and the zig-zag scan of each 8x8 block. I also understand how Huffman encoding works in general.
What I do not understand is: after I have an output bitstream and a dictionary that translates consecutive bits back to their original form, what do I do with the output? How can I tell a computer to translate that output bitstream using the dictionary I created for it?
In addition, each 8x8 matrix will have its own output and dictionary. How can all these outputs be combined into one? Because at the end of the day, the result is supposed to be an image.
I might have misunderstood some of the steps, in which case my apologies for any confusion caused by this.
Any help would be extremely appreciated!
EDIT: I'm sorry, my question apparently hasn't been clear enough. Say I use MATLAB's built-in Huffman functions (huffmanenco and huffmandict): what am I supposed to do with the value huffmanenco returns?
What to do with the output string of bits hasn't been clear to me in other IDEs and programming languages either, as far as Huffman encoding goes.

You have two choices with the Huffman coding:
Use a pre-canned Huffman table.
Make two passes over the data, where the first pass generates the Huffman tables and the second pass encodes.
You cannot have a different dictionary for each MCU.
You say you have the run-length encoded values. You Huffman encode those and write them to the output stream.
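If you are using MATLAB's huffmandict/huffmanenco (from the Communications Toolbox), the value huffmanenco returns is simply a vector of 0s and 1s; you pack those bits into bytes and write them out yourself, and the dictionary is what a decoder needs to translate them back. A minimal round trip, with symbols and probabilities invented for illustration:

    % Minimal round trip with MATLAB's built-in Huffman functions
    % (Communications Toolbox). Symbols and probabilities are made up
    % here -- in JPEG they would come from the run-length encoded output.
    symbols = [0 1 2 3];
    prob    = [0.5 0.25 0.15 0.1];
    dict    = huffmandict(symbols, prob);   % code table shared with the decoder

    sig  = [0 0 1 3 2 0];                   % data to compress
    bits = huffmanenco(sig, dict);          % plain vector of 0s and 1s

    % The bit vector is what you pack into bytes and write to the file;
    % the dictionary is what lets a decoder translate it back:
    decoded = huffmandeco(bits, dict);
    isequal(sig, decoded)                   % returns true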
EDIT:
You need to be sure that the MATLAB Huffman encoder is JPEG-compatible. There are different ways to Huffman encode.
You need to write the bits from the encoder to the JPEG stream. This means you need a bit-level I/O routine. Plus, you need to convert FF values in the compressed data into FF00 values in the JPEG stream (byte stuffing).
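As a rough sketch of what that bit-level step looks like, assuming bits is the 0/1 vector from the encoder (names and values are illustrative, not JPEG library calls):

    % Pack a 0/1 bit vector into bytes (MSB first), padding the final
    % byte with 1-bits as JPEG requires, then apply JPEG byte stuffing:
    % a 0x00 byte is inserted after every 0xFF in the compressed data.
    bits = [1 0 0 0 1];                      % e.g. output of huffmanenco
    pad  = mod(-numel(bits), 8);             % bits needed to fill last byte
    bits = [bits, ones(1, pad)];             % JPEG pads with 1s

    bytes = uint8(reshape(bits, 8, []).' * (2.^(7:-1:0)).');  % MSB-first pack

    stuffed = [];
    for b = bytes.'                          % byte stuffing: FF -> FF 00
        stuffed(end+1) = b;                  %#ok<AGROW>
        if b == 255
            stuffed(end+1) = 0;              %#ok<AGROW>
        end
    end
    % fwrite(fid, stuffed, 'uint8') would then append this to the scan data.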
I suggest getting a copy of
http://www.amazon.com/Compressed-Image-File-Formats-JPEG/dp/0201604434/ref=pd_sim_14_1?ie=UTF8&dpID=41XJBED6RCL&dpSrc=sims&preST=_AC_UL160_SR127%2C160_&refRID=1DYN5VCQQP0N88E64P5Q
to show how the encoding is done.


What type of entropy encoder does the MATLAB save() function use? I.e. how does that function work?

I am working on a compression project, and I used the default save() function in MATLAB for the purpose of lossless (entropy) encoding. The transform module is all figured out.
I used the save() function to encode a 3D array that includes a bunch of zeros. I am sure MATLAB is applying some kind of lossless compression in save(), since that array ends up taking far less space than an array of the same size containing, say, no zeros at all. I had no success finding out what kind of entropy encoding scheme is behind the function. Because it is a core part of the algorithm, I think I must at least know what is behind it.
Plus, if you know of any other entropy encoder that would do a better job of compressing a 3D array containing zeros, I would really appreciate you sharing it. Or, if you think I could easily write the code for that myself, please let me know.
The v7 format uses deflate.
The v7.3 format uses the HDF5 format, which supports gzip (deflate) and szip compression. It also has an option to not compress.
The MATLAB save function supports compression for some of the formats that are available. Specifically, -v7 (default format) and -v7.3 support compression. The details of the compression are not documented.
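You can see the deflate step at work by saving the same array in the uncompressed -v6 format and in -v7 and comparing file sizes (file names here are arbitrary):

    % The -v6 format stores data uncompressed, while -v7 applies deflate.
    x = zeros(1000, 1000);           % highly compressible array
    save('raw.mat',  'x', '-v6');    % no compression
    save('defl.mat', 'x', '-v7');    % deflate-compressed
    d1 = dir('raw.mat');  d2 = dir('defl.mat');
    fprintf('v6: %d bytes, v7: %d bytes\n', d1.bytes, d2.bytes);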

Is adding Huffman tree or frequency table needed?

I am learning about encoding and decoding, and my instructor gave me a question for homework:
Are we still able to decode Huffman-compressed files without adding the Huffman tree or the frequency table to the files?
You need some sort of description of the Huffman code. However, that description does not have to be the frequencies or the tree. Typically a canonical Huffman code is used, and all that is needed to describe the code is the bit length of the code for each symbol.
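For illustration, here is a small sketch (with made-up code lengths) of how a canonical code is rebuilt from the bit lengths alone: sort symbols by length, assign numerically increasing codes, and left-shift whenever the length grows.

    % Rebuild a canonical Huffman code from bit lengths alone.
    % Shorter lengths get numerically smaller codes; ties are broken
    % by symbol order, so the lengths fully determine the code.
    lens = [1 2 3 3];                        % hypothetical lengths for symbols 1..4
    [L, ord] = sort(lens);                   % shortest codes first (stable sort)
    codes = zeros(size(L));
    for k = 2:numel(L)
        codes(k) = bitshift(codes(k-1) + 1, L(k) - L(k-1));
    end
    for k = 1:numel(L)
        fprintf('symbol %d -> %s\n', ord(k), dec2bin(codes(k), L(k)));
    end
    % Prints the prefix-free code: 0, 10, 110, 111.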

Huffman coding for Lossless Compression

I really need help with Huffman coding for lossless compression. I have an exam coming up and need to understand this. Does anyone know of easy tutorials for it, or could someone explain?
The questions in the exam are likely to be:
Suppose the alphabet is [A, B, C], and the known probability distribution is P(A)=0.6,
P(B)=0.2 and P(C)=0.2. For simplicity, let’s also assume that both encoder and decoder know
that the length of the messages is always 3, so there is no need for a terminator.
How many bits are needed to encode the message ACB by Huffman Coding? You need to provide the Huffman tree and the Huffman code for each symbol. (3 marks)
How many bits are needed to encode the message ACB by Arithmetic Coding? You need to provide details of the encoding process. (3 marks)
Using the above results, discuss the advantage of Arithmetic Coding over Huffman Coding. (1 mark)
Answers:
Huffman Code: A - 1, B - 01, C - 00.
The encoding result is 10001, so 5 bits are needed. (3 marks)
The encoding process of Arithmetic Coding:
Symbol    Low      High     Range
          0.0      1.0      1.0
A         0.0      0.6      0.6
C         0.48     0.6      0.12
B         0.552    0.576    0.024
The final binary codeword is 0.1001, which is 0.5625. Therefore 4 bits are needed. (3 marks)
In Huffman Coding, the length of the codeword for each symbol has to be a whole number of bits, but in Arithmetic Coding it can effectively be fractional. Therefore Arithmetic Coding is often more efficient than Huffman Coding, as the results above show. (1 mark)
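For reference, the interval narrowing in the table can be reproduced with a few lines of MATLAB (sub-interval bounds taken from the answer above):

    % Per-symbol sub-intervals matching the answer:
    % A = [0, 0.6), B = [0.6, 0.8), C = [0.8, 1.0).
    lo_s = [0.0 0.6 0.8];  hi_s = [0.6 0.8 1.0];
    msg  = [1 3 2];                              % the message A, C, B
    low = 0; high = 1;
    for s = msg
        r    = high - low;
        high = low + r * hi_s(s);                % narrow the interval
        low  = low + r * lo_s(s);
        fprintf('low=%.4f high=%.4f range=%.4f\n', low, high, high - low);
    end
    % Any value in [0.552, 0.576) identifies the message;
    % 0.1001 binary = 0.5625 falls inside it, so 4 bits suffice.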
http://en.wikipedia.org/wiki/Huffman_coding
If you look at the tree (top right) you'll see that each parent node is the sum of the two below it. The values at the nodes are the frequencies of the letters. Each bit in the binary sequence is a right/left branch in the tree.
Does that help?
I don't really have a clue about Arithmetic coding, but it looks quite clever.
A Huffman tree is a binary tree in which the values with the highest frequency in the stream being compressed sit near the root, and values with lower and lower frequency sit further and further from it. This lets more common values be encoded as shorter bit strings while less common values get longer ones.
A Huffman tree is constructed as follows:
1. Build a table of entities in the source stream, with their distribution.
2. Pick the two entries in the table that have the lowest distribution.
3. Make a tree node out of these two entries.
4. Remove the entries just used from the table.
5. Add a new entry to the table with the combined distribution of the entries just removed, pointing at the new tree node.
6. If there is more than one entry left in the table, go to step 2.
7. The entry left in the table is your root.
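As a sketch, the steps above transcribe almost directly into MATLAB (symbols and weights here are illustrative; a real implementation would use a min-heap, as the next answer notes):

    % Each table entry is {weight, tree}; a leaf is just the symbol,
    % an internal node is a {left, right} pair of subtrees.
    syms = {'A', 'B', 'C'};  w = [0.6 0.2 0.2];
    tbl = cell(numel(syms), 2);
    for k = 1:numel(syms)
        tbl(k,:) = {w(k), syms{k}};            % step 1: build the table
    end
    while size(tbl, 1) > 1                     % step 6: repeat until one entry
        [~, ord] = sort([tbl{:,1}]);           % step 2: two lowest weights
        a = ord(1); b = ord(2);
        node   = {tbl{a,2}, tbl{b,2}};         % step 3: make a tree node
        weight = tbl{a,1} + tbl{b,1};
        tbl([a b], :) = [];                    % step 4: remove the entries
        tbl(end+1,:) = {weight, node};         % step 5: add combined entry
    end
    root = tbl{1, 2};                          % step 7: the root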
A basic Huffman implementation can be quite OK. But if you are building from scratch, you may want more than one extra data structure in your toolbox to make things easier, such as a min-heap and a bit vector. The basic algorithms for encoding and decoding are pretty simple. I have no information on how it compares with arithmetic coding.
An implementation example

Simulink: Bit extraction from 1-Byte Hex

I'm relatively new to Simulink and I am looking for a possibility to extract 1-3 specific bits from one byte.
As far as I know, the input format (bin, dec, hex) of the constant is irrelevant for what follows. But how can I say that the constant "1234" is hex and not dec?
In my model I use the Constant block as my source (it will be parametrised by a MATLAB variable that comes from an m-file).
Further processing with the Extract Bits block causes an error about incompatible data types.
Can someone help me to deal with this issue?
Greets, poeschlorn
You should probably do the hex-to-dec conversion in your .m initialization file and use the resulting value in Simulink.
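For example (variable names are illustrative), hex2dec does the conversion, and bitshift/bitand can then pull out the bits you need, whether in the .m file or in a MATLAB Function block:

    % Tell MATLAB the constant is hex, then feed the decimal value
    % to the Simulink Constant block.
    byteVal = hex2dec('A7');                 % 0xA7 = 167; '1234' would be hex too
    % Extract, say, 3 bits starting at bit 2 (bit 0 = LSB):
    startBit = 2;  numBits = 3;
    field = bitand(bitshift(byteVal, -startBit), 2^numBits - 1);
    % 0xA7 = 10100111b, so bits 4..2 are 001b and field is 1.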
Maybe this is not the most elegant solution, but I converted my input to decimal and then created a BCD representation of it via OR and AND logic blocks for further use.
If you have the Communications Toolbox/Blockset then you can use the Integer to Bit Converter block to do a conversion to a vector of binary digits then just extract the "bits" that you want. The Bit to Integer Converter block will do the reverse transformation.
If you don't have the Communications Blockset, it wouldn't be hard to do a similar thing using a plain MATLAB Function block.

Efficient way to fingerprint an image (jpg, png, etc)?

Is there an efficient way to get a fingerprint of an image for duplicate detection?
That is, given an image file, say a JPG or PNG, I'd like to be able to quickly calculate a value that identifies the image content and is fairly resilient to other aspects of the image (e.g. the image metadata) changing. If it deals with resizing, that's even better.
[Update] Regarding the metadata in JPG files, does anyone know if it's stored in a specific part of the file? I'm looking for an easy way to ignore it; e.g. can I skip the first x bytes of the file, or take x bytes from the end of the file, to ensure I'm not getting metadata?
A stab in the dark, if you are looking to circumvent metadata- and size-related things:
Edge detection and scale-independent comparison
Sampling and statistical analysis of grayscale/RGB values (average luminance, averaged color map)
FFT and other transforms (Good article Classification of Fingerprints using FFT)
And numerous others.
Basically:
Convert JPG/PNG/GIF whatever into an RGB byte array which is independent of encoding
Use a fuzzy pattern classification method to generate a 'hash of the pattern' in the image ... not a hash of the RGB array as some suggest
Then you want a distributed method of fast hash comparison based on matching threshold on the encapsulated hash or encoding of the pattern. Erlang would be good for this :)
Advantages are:
Will, if you use any AI/Training, spot duplicates regardless of encoding, size, aspect, hue and lum modification, dynamic range/subsampling differences and in some cases perspective
Disadvantages:
Can be hard to code; something like OpenCV might help
Probabilistic ... false positives are likely but can be reduced with neural networks and other AI
Slow unless you can encapsulate pattern qualities and distribute the search (MapReduce style)
Checkout image analysis books such as:
Pattern Classification 2ed
Image Processing Fundamentals
Image Processing - Principles and Applications
And others
If you are scaling the image, then things are simpler. If not, then you have to contend with the fact that scaling is lossy in more ways than sample reduction.
Using the byte size of the image for comparison would be suitable for many applications. Another way would be to:
Strip out the metadata.
Calculate the MD5 (or other suitable hashing algorithm) for the image.
Compare that to the MD5 (or whatever) of the potential dupe image (provided you've stripped out the metadata for that one too).
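One way to do the "strip out the metadata" step in MATLAB is to hash the decoded pixels rather than the raw file bytes, since imread discards EXIF and comment segments. A sketch using MATLAB's Java bridge for MD5 (the file name is illustrative):

    % Hash the decoded pixel data, not the file, so metadata is ignored.
    img = imread('photo.jpg');                        % decoded pixel array
    md  = java.security.MessageDigest.getInstance('MD5');
    md.update(img(:));                                % feed raw pixel bytes
    hash = sprintf('%02x', typecast(md.digest(), 'uint8'));
    % Two files with identical pixels but different metadata give the
    % same hash; any pixel change gives a different one.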
You could use an algorithm like SIFT (Scale Invariant Feature Transform) to determine key points in the pictures and match these.
See http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
It is used e.g. when stitching images in a panorama to detect matching points in different images.
You want to perform an image hash. Since you didn't specify a particular language I'm guessing you don't have a preference. At the very least there's a Matlab toolbox (beta) that can do it: http://users.ece.utexas.edu/~bevans/projects/hashing/toolbox/index.html. Most of the google results on this are research results rather than actual libraries or tools.
The problem with MD5ing it is that MD5 is very sensitive to small changes in the input, and it sounds like you want to do something a bit "smarter."
Pretty interesting question. The fastest and easiest way would be to calculate a CRC32 of the content byte array, but that would only work on 100% identical images. For a more intelligent comparison you would probably need some kind of fuzzy-logic analysis...
I've implemented at least a trivial version of this. I transform and resize all images to a very small (fixed-size) black-and-white thumbnail, then compare those. It detects exact duplicates, resized duplicates, and duplicates converted to black and white. It catches a lot of duplicates without a lot of cost.
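A minimal sketch of that idea in MATLAB (Image Processing Toolbox; the 16x16 size and the 0.05 threshold are arbitrary choices to tune on real data, and the local function needs R2016b+ in a script or its own file):

    % Compare two images by the mean absolute difference of small
    % fixed-size grayscale thumbnails.
    dif = abs(thumbprint('a.jpg') - thumbprint('b.jpg'));
    isDuplicate = mean(dif(:)) < 0.05;      % threshold chosen arbitrarily

    function t = thumbprint(file)
        img = imread(file);
        if ndims(img) == 3
            img = rgb2gray(img);            % collapse color to grayscale
        end
        t = imresize(im2double(img), [16 16]);
    end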
The easiest thing to do is to do a hash (like MD5) of the image data, ignoring all other metadata. You can find many open source libraries that can decode common image formats so it's quite easy to strip metadata.
But that doesn't work when the image itself has been manipulated in any way, including scaling and rotating.
To do exactly what you want, you have to use image watermarking, but it's patented and can be expensive.
This is just an idea: possibly the low-frequency components of the JPEG's DCT could be used as a size-invariant identifier.