What does robustness mean in the encoding context? - encoding

I learn about the methods to encrypt multimedia files. One of them happens before the compression and it is the AES encoding of the pixel blocks. As an observation the article says that the disadvantage of this method is:
the need of robustness against the encoder (alternatively communication with the encoder so that the encoded areas will compress lossless)
I understand what the alternative in the parenthesis is but I can't figure it out what is the first method, the robustness is. Which are the methods / technologies to achieve this robustness?

Related

What type of entropy encoder does the MATLAB save() function use? I.e. how does that function work?

I am working on a compression project, and I used the default save() function in Matlab for the purpose of lossless (entropy) encoding. The transform module is all figured out.
I used the save() function to encode a 3d array that includes a bunch of zeros. I am sure that Matlab is using some kind of lossless compression with the save() function since, when I save that array, it ends up taking far less space than an array, say, containing no zeros at all. I had no success finding out what type of entropy encoding schemes are behind the function. Because it is a core part of the algorithm, I think I must at least know what is behind the function.
Plus, if you know any other type of entropy encoder that would do a better job in compressing a 3d array that contains zeros, I would really appreciate you sharing. Or, if you think I could easily write the code for that myself, then please let me know.
The v7 format uses deflate.
The v7.3 format uses the HDF5 format, which supports gzip (deflate) and szip compression. It also has an option to not compress.
The MATLAB save function supports compression for some of the formats that are available. Specifically, -v7 (default format) and -v7.3 support compression. The details of the compression are not documented.

Matlab - JPEG Compression. Huffman Encoding

I've been trying to implement a JPEG compression algorithm on Matlab.
The only part I'm having trouble implementing is the huffman encoding. I do understand the DCT into quantization and zig-zag'ing that 8x8 matrix. I also understand how does huffman encoding work, in general.
What I do not understand is, after I have an output bitstream and a dictionary that translates consecutive bits to their original form, what do I do with the output? How can I tell a computer to translate that output bitstream using the dictionary I created for it?
In addition, each 8x8 matrix will have its own output and dictionary. How can all these outputs be combined into one? Because at the end of the day, the result is supposed to be an image.
I might have misunderstood some of the steps, in which case my apologies for any confusion caused by this.
Any help would be extremely appriciated!
EDIT: I'm sorry, my question appearntly hasn't been clear enough. Say I use Matlabs built in huffman functions (huffmanenco and huffmandict), what am I supposed to do with the value the huffmanenco returns?
The part of what to do with the output string of bits hasn't been clear to me as far as huffman encoding goes in other IDE's and programming languages aswell.
You have two choices with the huffman coding.
Use a pre-canned huffman table.
Make two passes over the data where the first pass generates the huffman tables and the second pass encode.
You cannot have a different dictionary for each MCU.
You say you have the run length encoded values. You huffman encode those and write to the output stream.
EDIT:
You need to be sure that the matlab huffman endocoder is JPEG-compatible. There are different ways to huffman encode.
You need to write the bits from the encoder to the JPEG stream. This means you need a bit level I/O routine. PLUS you need to convert FF values in the compressed data into FF00 values in the JPEG stream.
I suggest getting a copy of
http://www.amazon.com/Compressed-Image-File-Formats-JPEG/dp/0201604434/ref=pd_sim_14_1?ie=UTF8&dpID=41XJBED6RCL&dpSrc=sims&preST=_AC_UL160_SR127%2C160_&refRID=1DYN5VCQQP0N88E64P5Q
to show how the encoding is done.

PNG: deflate and zlib

I'm trying to understand compression in PNG - but I seem to
find a lot of contradictory information online ...
I would like to understand
- how is searching done in the LZ77-part: hash table with linked lists? is this defined in deflate? or implemented in zlib? is there a choice of the search method?
- can PNG encoders/decoders set some parameters for the compression (strategy, filter, etc.) or is there a default for PNG?
- does the LZ77-part do greedy or lazy evaluation? or is this an option too?
- and finally: the 2 Huffman trees, are they compressed in a third tree, and all three of them are encoded? or are the 2 trees encoded with only their codelengths?
Is the zlib implementation different from other deflate implementations?? Perhaps that is where all my confusion comes from?
Thank you for any help!! I need this for my new job
LuCu
PNG compression is in the zlib format. The zlib format uses deflate. The code used is commonly the zlib library.
The algorithm used for compression is not specified by the format. The zlib library deflate algorithm uses hash chains to search for matching strings in a sliding window. zlib's deflate takes several parameters for compression tuning -- see deflateInit2().
The deflate format specifies the compression of the Huffman codes at the front of dynamic blocks. The literal/length and distance code code lengths are run-length and Huffman coded themselves.
There are other implementations of deflate compressors in the LZMA SDK and Google's zopfli, where both of those use more intensive approaches that take more time for small gains in compression.

How to convert between UTF-16 LE <-> BE?

If I want to convert UTF-16 BE <-> LE, what should I consider? Can I treat them just a plain 2-byte integer array? Or should I follow some special Unicode algorithm to handle some exceptional case?
You just need to byte-reorder the code units, thus taking two bytes, swapping them and writing them back. That is all there is to consider.
But usually there is a simple way of reading a stream in one encoding and writing it back in another encoding. Often with negligible performance drawbacks (especially in the UTF-16 case). So to make your code clearer you should probably opt for such a solution. But the trivial way should work regardless
if you know the input encoding precisely.

Efficient way to fingerprint an image (jpg, png, etc)?

Is there an efficient way to get a fingerprint of an image for duplicate detection?
That is, given an image file, say a jpg or png, I'd like to be able to quickly calculate a value that identifies the image content and is fairly resilient to other aspects of the image (eg. the image metadata) changing. If it deals with resizing that's even better.
[Update] Regarding the meta-data in jpg files, does anyone know if it's stored in a specific part of the file? I'm looking for an easy way to ignore it - eg. can I skip the first x bytes of the file or take x bytes from the end of the file to ensure I'm not getting meta-data?
Stab in the dark, if you are looking to circumvent meta-data and size related things:
Edge Detection and scale-independent comparison
Sampling and statistical analysis of grayscale/RGB values (average lum, averaged color map)
FFT and other transforms (Good article Classification of Fingerprints using FFT)
And numerous others.
Basically:
Convert JPG/PNG/GIF whatever into an RGB byte array which is independent of encoding
Use a fuzzy pattern classification method to generate a 'hash of the pattern' in the image ... not a hash of the RGB array as some suggest
Then you want a distributed method of fast hash comparison based on matching threshold on the encapsulated hash or encoding of the pattern. Erlang would be good for this :)
Advantages are:
Will, if you use any AI/Training, spot duplicates regardless of encoding, size, aspect, hue and lum modification, dynamic range/subsampling differences and in some cases perspective
Disadvantages:
Can be hard to code .. something like OpenCV might help
Probabilistic ... false positives are likely but can be reduced with neural networks and other AI
Slow unless you can encapsulate pattern qualities and distribute the search (MapReduce style)
Checkout image analysis books such as:
Pattern Classification 2ed
Image Processing Fundamentals
Image Processing - Principles and Applications
And others
If you are scaling the image, then things are simpler. If not, then you have to contend with the fact that scaling is lossy in more ways than sample reduction.
Using the byte size of the image for comparison would be suitable for many applications. Another way would be to:
Strip out the metadata.
Calculate the MD5 (or other suitable hashing algorithm) for the
image.
Compare that to the MD5 (or whatever) of the potential dupe
image (provided you've stripped out
the metadata for that one too)
You could use an algorithm like SIFT (Scale Invariant Feature Transform) to determine key points in the pictures and match these.
See http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
It is used e.g. when stitching images in a panorama to detect matching points in different images.
You want to perform an image hash. Since you didn't specify a particular language I'm guessing you don't have a preference. At the very least there's a Matlab toolbox (beta) that can do it: http://users.ece.utexas.edu/~bevans/projects/hashing/toolbox/index.html. Most of the google results on this are research results rather than actual libraries or tools.
The problem with MD5ing it is that MD5 is very sensitive to small changes in the input, and it sounds like you want to do something a bit "smarter."
Pretty interesting question. Fastest and easiest would be to calculate crc32 of content byte array but that would work only on 100% identical images. For more intelligent compare you would probably need some kind of fuzy logic analyzis...
I've implemented at least a trivial version of this. I transform and resize all images to a very small (fixed size) black and white thumbnail. I then compare those. It detects exact, resized, and duplicates transformed to black and white. It gets a lot of duplicates without a lot of cost.
The easiest thing to do is to do a hash (like MD5) of the image data, ignoring all other metadata. You can find many open source libraries that can decode common image formats so it's quite easy to strip metadata.
But that doesn't work when image itself is manipulated in anyway, including scaling, rotating.
To do exactly what you want, you have to use Image Watermarking but it's patented and can be expensive.
This is just an idea: Possibly low frequency components present in the DCT of the jpeg could be used as a size invariant identifier.