Recovering and Reordering Lost Bytes - encoding

When an image is sent through an application (e.g. WhatsApp) over the network, the image is compressed to some extent.
How can I recover these lost bytes and, when I do, how can I restore the order they were originally in?
The use case is steganography. If I encode a message into a PNG, send it over WhatsApp, download it back (it comes back as a JPEG in WhatsApp's case) and convert it back to PNG, I cannot decode the message as I can with a picture that never went over the network.

You're dealing with a noisy channel, which may intentionally or unintentionally alter your data in transit, so you need to ensure your algorithm is robust to that. In this case you want an algorithm robust to lossy recompression, assuming nothing else takes place, e.g., resizing, cropping, etc.
I would start with a literature review to find an algorithm that fits any other, lower-priority criteria you may have. Keep in mind that the algorithm will probably be more complex than simply altering pixel values directly, which can be done in a few lines of code, especially if the algorithm is only applicable to JPEG images. It will also likely implement some kind of error correction, which will decrease your message capacity.

Related

Image based steganography that survives resizing?

I am using a StarTech capture card to capture video from a source machine. I have encoded the video using MATLAB so that every frame contains a marker. I play that video on the source computer (HDMI out), which is connected via HDMI to my computer (HDMI in). Once I capture a frame as a bitmap (1920*1080), I resize it to 1280*720 and send it for processing; the processing code checks every pixel for that marker.
The issue is that my capture card can only capture at 1920*1080, whereas the video is 1280*720. To retain the marker I downscale the captured frame to 1280*720, which I believe alters the entire pixel array, so I am not able to recover the marker I fed into the video.
During capture the image goes through upscaling, which in turn changes the pixel values.
I have gone through a few research papers on steganography, but they haven't helped so far. Is there any technique that survives image resizing, so that I can retain the pixel values?
Any suggestions or pointers will be really appreciated.
My advice is to start with searching for an alternative software that doesn't rescale, compress or otherwise modify any extracted frames before handing them to your control. It may save you many headaches and days worth of time. If you insist on implementing, or are forced to implement a steganography algorithm that survives resizing, keep on reading.
I can't provide a specific solution because there are many ways this can be (possibly) achieved and they are complex. However, I'll describe the ingredients a solution will most likely involve and your limitations with such an approach.
Resizing a cover image is considered an attack, i.e. an attempt to destroy the secret. Other examples include lossy compression, noise, cropping, rotation and smoothing. Robust steganography is the medicine for that, but it isn't all-powerful; it may provide resistance to only specific types of attacks, and/or only to small-scale attacks at that. You need to find or design an algorithm that suits your needs.
For example, let's take a simple pixel lsb substitution algorithm. It modifies the lsb of a pixel to be the same as the bit you want to embed. Now consider an attack where someone randomly applies a pixel change of -1 25% of the time, 0 50% of the time and +1 25% of the time. Effectively, half of the time it will flip your embedded bit, but you don't know which ones are affected. This makes extraction impossible. However, you can alter your embedding algorithm to be resistant against this type of attack. You know the absolute value of the maximum change is 1. If you embed your secret bit, s, in the 3rd lsb, along with setting the last 2 lsbs to 01, you guarantee to survive the attack. More specifically, you get xxxxxs01 in binary for 8 bits.
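The xxxxxs01 scheme above can be checked exhaustively. This is a minimal sketch, assuming 8-bit pixel values; the function names embed_bit and extract_bit are made up for illustration.

```python
def embed_bit(pixel: int, s: int) -> int:
    """Force the 3 LSBs of an 8-bit pixel to the pattern s01 (xxxxxs01)."""
    return (pixel & 0b11111000) | (s << 2) | 0b01


def extract_bit(pixel: int) -> int:
    """Read the secret bit back from the 3rd LSB."""
    return (pixel >> 2) & 1


# Every pixel value, both secret bits, every attack value in {-1, 0, +1}.
for pixel in range(256):
    for s in (0, 1):
        stego = embed_bit(pixel, s)
        for delta in (-1, 0, 1):
            assert extract_bit(stego + delta) == s
```

Because the two lowest bits start at 01, a change of -1 or +1 only moves them to 00 or 10 and never carries into the bit that holds s.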
Let's examine what we have sacrificed in order to survive such an attack. Assuming our embedding bit and the lsbs that can be modified all have uniform probabilities, the probability of changing the original pixel value with the simple algorithm is
change | probability
-------+------------
0 | 1/2
1 | 1/2
and with the more robust algorithm
change | probability
-------+------------
0 | 1/8
1 | 1/4
2 | 3/16
3 | 1/8
4 | 1/8
5 | 1/8
6 | 1/16
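The second table can be reproduced by brute force. The sketch below assumes, as stated above, that the three original lsbs and the secret bit are uniformly distributed; change_distribution is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction


def change_distribution() -> dict:
    """Distribution of |original - embedded| when the 3 LSBs become s01."""
    counts = Counter()
    for original in range(8):            # the 3 original LSBs, uniform
        for s in (0, 1):                 # the secret bit, uniform
            embedded = (s << 2) | 0b01   # the pattern s01
            counts[abs(original - embedded)] += 1
    total = sum(counts.values())
    return {change: Fraction(n, total) for change, n in sorted(counts.items())}


for change, probability in change_distribution().items():
    print(change, probability)
```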
These larger changes are going to affect our PSNR quite a bit if we embed a lot of information. But we can do a bit better if we employ the optimal pixel adjustment method. This algorithm minimises the Euclidean distance between the original value and the modified one; in simpler terms, it minimises the absolute difference. For example, assume you have a pixel with binary value xxxx0111 and you want to embed a 0. This means you have to make the last 3 lsbs 001. With a naive substitution, you get xxxx0001, which has a distance of 6 from the original value. But xxxx1001, reached by also adjusting the 4th lsb, has a distance of only 2 and still ends in 001.
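One way to sketch this adjustment: after the naive substitution, also try the values 8 below and 8 above (they share the same three lsbs) and keep whichever is closest to the original pixel. The names embed_naive and embed_opap are illustrative, and 8-bit pixels are assumed.

```python
def embed_naive(pixel: int, s: int) -> int:
    """Plain substitution: overwrite the 3 LSBs with s01."""
    return (pixel & 0b11111000) | (s << 2) | 0b01


def embed_opap(pixel: int, s: int) -> int:
    """Pick the valid value ending in s01 that is closest to the original."""
    naive = embed_naive(pixel, s)
    # Adding or subtracting 8 leaves the 3 LSBs untouched.
    candidates = [c for c in (naive - 8, naive, naive + 8) if 0 <= c <= 255]
    return min(candidates, key=lambda c: abs(c - pixel))


pixel = 0b10100111                      # xxxx0111 with arbitrary high bits
print(bin(embed_naive(pixel, 0)))       # distance 6 from the original
print(bin(embed_opap(pixel, 0)))        # distance 2 from the original
```

Near 0 and 255 one of the neighbouring candidates falls outside the valid range, so the saving is not always available there.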
Now, let's assume that the attack induces a change of magnitude 0, 1 or 2, each a third of the time, and that a change of magnitude 2 is -2 half the time and +2 the other half. The algorithm we described above can actually survive a +2 modification (s01 becomes s11), but not a -2, which always flips the 3rd lsb. So about 16.7% of the time (1 in 6) our embedded bit will be flipped. But now we introduce error correcting codes. If we apply a code that can correct, on average, 1 error in every 6 bits, we can successfully extract our secret despite the attack altering it.
Error correction generally works by adding some sort of redundancy. So even if part of our bit stream is destroyed, we can refer to that redundancy to retrieve the original information. Naturally, the more redundancy you add, the better the error correction rate, but you may have to double the redundancy just to improve the correction rate by a few percent (just arbitrary numbers here).
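The simplest illustration of redundancy is a repetition code with majority voting. It is not the code the scenario above calls for, and real schemes use far more efficient codes; the function names here are made up for the sketch.

```python
def encode_repetition(bits, n=3):
    """Repeat every message bit n times."""
    return [b for b in bits for _ in range(n)]


def decode_repetition(coded, n=3):
    """Majority vote over each group of n received bits."""
    return [int(sum(coded[i:i + n]) * 2 > n) for i in range(0, len(coded), n)]


message = [1, 0, 1, 1]
received = encode_repetition(message)
received[1] ^= 1    # the channel flips one bit in the first group
received[5] ^= 1    # and one bit in the second group
assert decode_repetition(received) == message
```

With n=3 the payload shrinks to a third of the raw capacity and only one flipped bit per group of three can be corrected, which shows how quickly redundancy eats into capacity.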
Let's appreciate here how much information you can hide in a 1280x720 (grayscale) image. 1 bit per pixel, for 8 bits per letter, for ~5 letters per word and you can hide 20k words. That's a respectable portion of an average novel. It's enough to hide your stellar Masters dissertation, which you even published, in your graduation photo. But with a 4 bit redundancy per 1 bit of actual information, you're only looking at hiding that boring essay you wrote once, which didn't even get the best mark in the class.
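The arithmetic behind those figures, as a small sketch. The helper capacity_words and its parameters are hypothetical; the numbers are the ones used in the paragraph above.

```python
def capacity_words(width, height, redundancy_bits=0,
                   bits_per_pixel=1, bits_per_letter=8, letters_per_word=5):
    """Rough number of words that fit in a grayscale cover image."""
    raw_bits = width * height * bits_per_pixel
    # Each payload bit is accompanied by `redundancy_bits` extra bits.
    payload_bits = raw_bits // (1 + redundancy_bits)
    return payload_bits // (bits_per_letter * letters_per_word)


print(capacity_words(1280, 720))                     # 23040, i.e. roughly 20k words
print(capacity_words(1280, 720, redundancy_bits=4))  # 4608
```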
There are other ways you can embed your information. For example, specific methods in the frequency domain can be more resistant to pixel modifications. The downside of such methods is increased complexity in coding the algorithm and reduced hiding capacity. That's because some frequency coefficients are resistant to changes but make embedding modifications easily detectable, others are fragile to changes but hard to detect, and some lie in the middle. So you compromise and use only a fraction of the available coefficients. Popular frequency transforms used in steganography are the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT).
In summary, if you want a robust algorithm, the consistent themes that emerge are sacrificing capacity and applying stronger distortions to your cover medium. There have been quite a few studies on robust steganography for watermarks. That's because you want your watermark to survive any attacks so you can prove ownership of the content, and watermarks tend to be very small, e.g. a 64x64 binary image icon (that's only 4096 bits). Even then, some algorithms are only robust enough to recover the watermark almost intact, say 70-90%, so that it's still comparable to the original watermark. In some cases, this is considered good enough. You'd require an even more robust algorithm (bigger sacrifices) if you want lossless retrieval of your secret data 100% of the time.
If you want such an algorithm, you need to comb the literature for one and test any possible candidates to see if they meet your needs. But don't expect anything that takes only 15 lines to code and 10 minutes of reading to understand. Here is a paper that looks like a good start: Mali et al. (2012). Robust and secured image-adaptive data hiding. Digital Signal Processing, 22(2), 314-323. Unfortunately, the paper is not open access and you will need either a subscription or academic access in order to read it. But then again, that's true for most of the papers out there. You said you've read some papers already, and in previous questions you've stated you're working on a college project, so you likely have access.
For this specific paper, table 4 shows the results of resisting a resizing attack and section 4.4 discusses the results. They don't explicitly state 100% recovery, but only a faithful reproduction. Also notice that the attacks have been of the scale 5-20% resizing and that only allows for a few thousand embedding bits. Finally, the resizing method (nearest neighbour, cubic, etc) matters a lot in surviving the attack.
I have designed and implemented ChromaShift: https://www.facebook.com/ChromaShift/
If done right, steganography can resiliently (i.e. robustly) encode identifying information (e.g. downloader user id) in the image medium while keeping it essentially perceptually unmodified. Compared to watermarks, steganography is a subtler yet more powerful way of encoding information in images.
The information is dynamically multiplexed into the Cb Cr fabric of the JPEG by chroma-shifting pixels by a configurable small bump value. As the human eye is more sensitive to luminance changes than to chrominance changes, chroma-shifting is virtually imperceptible while providing a way to encode arbitrary information in the image. The ChromaShift engine does both watermarking and pure steganography. Both DRM subsystems are configurable via a rich set of options.
The solution is developed in C, for the Linux platform, and uses SWIG to compile into a PHP loadable module. It can therefore be accessed by PHP scripts while providing the speed of a natively compiled program.

Bit operators issues and Steganography in image processing

The Steganography link shows a demonstration of steganography. My question is: when the number of bits to be replaced is n=1, the method is irreversible, i.e. the cover is not equal to the stego image (in the ideal case the cover should be identical to the steganography result). It only works perfectly when the number of bits to be replaced is n=4, 5 or 6. When n=7, the stego image becomes noisy and differs from the cover, so the result is no longer inconspicuous and it is evident that steganography has taken place. Can somebody please explain why that is so, and what needs to be done to make the process reversible and lossless?
So let's see what the code does. From the hidden image you extract the n most significant bits (MSB) and hide them in the n least significant bits (LSB) in the cover image. There are two points to notice about this, which answer your questions.
The more bits you change in your cover image, the more distorted your stego image will look.
The more information you use from the hidden image, the closer the reconstructed image will look to the original one. The following link (reference) shows you the amount of information of an image from the most to the least significant bit.
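The embedding described above (the n MSBs of the hidden image placed into the n LSBs of the cover) can be sketched per pixel as follows. This is not the code from the linked demonstration; hide and recover are illustrative names and 8-bit pixel values are assumed.

```python
def hide(cover: int, secret: int, n: int) -> int:
    """Put the n MSBs of `secret` into the n LSBs of `cover`."""
    keep_mask = (0xFF << n) & 0xFF          # keeps the top 8-n cover bits
    return (cover & keep_mask) | (secret >> (8 - n))


def recover(stego: int, n: int) -> int:
    """Rebuild an approximation of the secret from the n LSBs of `stego`."""
    return (stego & ((1 << n) - 1)) << (8 - n)


cover, secret = 0b11010110, 0b10111001
for n in (1, 4, 7):
    stego = hide(cover, secret, n)
    print(n, abs(stego - cover), abs(recover(stego, n) - secret))
```

For each n the loop prints how far the stego pixel moved from the cover and how far the recovered pixel is from the secret, which makes the trade-off between the two points above visible: a larger n distorts the cover more but reconstructs the secret better.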
If you want to quantify the difference between the cover and stego images, you can use the Peak Signal-to-Noise Ratio (PSNR). It is often said that the human eye can't distinguish differences for PSNR > 30 dB. Personally, I wouldn't go for anything less than 40, but it depends on what your aim is. Be aware that this is not an end-all, be-all type of measurement. The quality of your algorithm depends on many factors.
No, cover and stego images are not supposed to be identical. The idea is to minimise the differences so as to resist detection, and there are many compromises involved in achieving that, such as the size of the message you are willing to hide.
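For reference, PSNR for 8-bit images is 10*log10(255^2 / MSE), where MSE is the mean squared error between the two images. A minimal sketch, assuming both images are given as flat sequences of pixel values of equal length (the function name psnr is just for illustration):

```python
import math


def psnr(cover, stego, max_value=255):
    """Peak Signal-to-Noise Ratio in dB between two equally sized images."""
    if len(cover) != len(stego) or not cover:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf                     # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Changing every pixel by 1 gives an MSE of 1, i.e. about 48.13 dB.
print(round(psnr([100, 120, 140, 160], [101, 121, 141, 161]), 2))
```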
Perfect retrieval of a secret image requires hiding all the bits of all the pixels, which means you can only hide a secret 1/8th of the cover image size. Note though that this is worst case scenario, which doesn't consider encryption, compression or other techniques. That's the idea but I won't provide a code snippet based on the above because it is very inflexible.
Now, there are cases where you want the retrieval to be lossless, either because the data are encrypted or of sensitive nature. In other cases an approximate retrieval will do the job. For example, if you were to encode only the 4 MSB of an image, someone extracting the secret would still get a good idea of what it initially looked like. If you still want a lossless method but not the one just suggested, you need to use a different algorithm. The choice of the algorithm depends on various characteristics you want it to have, including but not restricted to:
robustness (how resistant the hidden information is to image editing)
imperceptibility (how hard it is for a stranger to know the existence of a secret, but not necessarily the secret itself, e.g. chi-square attack)
type of cover medium (e.g., specific image file type)
type of secret message (e.g., image, text)
size of secret

In an A/V stream, is the amount of data streamed constant or fluctuating?

The amount of activity in an A/V stream can vary. For instance, if the data being streamed is from an empty, silent room, there is much less going on than if the data is something like a loud and explosive video game.
What I am wondering is whether the actual amount of data going up and down differs depending on this subjective interpretation of "activity". In other words, am I downloading less data when watching a stream of the empty room versus the active video game? My hunch has always been a resounding "no"; after all, how would the program know the difference between the two?
I'm asking now, though, because I've noticed a difference when streaming video in the past. The video always seems to be fine during periods of subjectively "low" activity, and it begins to lag or skip during periods of "high" activity. Is this just coincidence, or is there actually some kind of algorithm or service in place which dilutes data in periods of low activity or something like that?
Well, the thing is that audio and video streams are compressed. They can be compressed with any one of a whole range of formats. Some formats will aim for a % reduction in size, some will set a quality value, others will perform the same steps whether the data is simple or complex.
Take for example the jpg and png formats. Open up your favourite editor and create a 640x480px image, filled with pure white. Now save that file and look at its size. Now apply noise to the image and save it as a new file. Compare the two and see the huge difference in size.
I got 1.37kb for the white image, 331kb for the noisy one. (a single 8x8 or 16x16 tile may be repeated for the entire white image, unique 8x8 or 16x16 blocks must be used for the noisy one)
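The same effect can be reproduced without an image editor. PNG compresses pixel data with DEFLATE, which Python's zlib module implements, so compressing a raw buffer gives a rough approximation. The buffer layout (one byte per pixel, no PNG filtering or headers) is an assumption made here, so the sizes will not match the figures above.

```python
import os
import zlib

PIXELS = 640 * 480

white = bytes([255]) * PIXELS       # a pure white, single-channel image
noise = os.urandom(PIXELS)          # every pixel is random

white_size = len(zlib.compress(white))
noise_size = len(zlib.compress(noise))

print("white:", white_size, "bytes")
print("noise:", noise_size, "bytes")
```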
VBR (variable bit rate) and CBR (constant bit rate) are two terms frequently used in video transcoding (changing from one format to another).
Anyway - the answer is 'it depends on the format' - some formats do work like that, some don't.
The video card is always sending the same quantity of data to the screen each frame, even if there is very little information in it - it's uncompressed. Transmitted audio and video on the other hand are (almost) always compressed, so when there's less information, it takes less data to convey it.

array data compression that is holding 13268 bits(1.66kBytes)

i.e. the array holds 100*125 bits of data for each aircraft, plus 8 ASCII messages of 12 characters each.
What compression technique should I apply to such data?
Depends mostly on what those 12500 bits look like, since that's the biggest part of your data. If there aren't any real patterns in it, or if they aren't byte-sized or word-sized patterns, "compressing" it may actually make it bigger, since almost every compression algorithm will add a small amount of extra data just to make decompression possible.
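One way to guard against that, sketched here with zlib as a stand-in for whatever compressor is chosen: try compressing, and fall back to storing the raw bytes behind a one-byte flag when compression doesn't help. The pack/unpack names and the flag format are made up for illustration.

```python
import os
import zlib


def pack(payload: bytes) -> bytes:
    """Compress only when it actually saves space; flag byte says which."""
    compressed = zlib.compress(payload, 9)
    if len(compressed) < len(payload):
        return b"\x01" + compressed
    return b"\x00" + payload


def unpack(blob: bytes) -> bytes:
    """Reverse pack(): decompress if the flag byte says so."""
    return zlib.decompress(blob[1:]) if blob[:1] == b"\x01" else blob[1:]


patterned = bytes(1659)             # 13268 bits rounded up, all zero
random_like = os.urandom(1659)      # no patterns to exploit

print(len(pack(patterned)), len(pack(random_like)))
```

The worst case then costs one extra byte instead of the compressor's full overhead.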

Efficient way to fingerprint an image (jpg, png, etc)?

Is there an efficient way to get a fingerprint of an image for duplicate detection?
That is, given an image file, say a jpg or png, I'd like to be able to quickly calculate a value that identifies the image content and is fairly resilient to other aspects of the image (eg. the image metadata) changing. If it deals with resizing that's even better.
[Update] Regarding the meta-data in jpg files, does anyone know if it's stored in a specific part of the file? I'm looking for an easy way to ignore it - eg. can I skip the first x bytes of the file or take x bytes from the end of the file to ensure I'm not getting meta-data?
Stab in the dark, if you are looking to circumvent meta-data and size related things:
Edge Detection and scale-independent comparison
Sampling and statistical analysis of grayscale/RGB values (average lum, averaged color map)
FFT and other transforms (Good article Classification of Fingerprints using FFT)
And numerous others.
Basically:
Convert JPG/PNG/GIF whatever into an RGB byte array which is independent of encoding
Use a fuzzy pattern classification method to generate a 'hash of the pattern' in the image ... not a hash of the RGB array as some suggest
Then you want a distributed method of fast hash comparison based on matching threshold on the encapsulated hash or encoding of the pattern. Erlang would be good for this :)
Advantages are:
Will, if you use any AI/Training, spot duplicates regardless of encoding, size, aspect, hue and lum modification, dynamic range/subsampling differences and in some cases perspective
Disadvantages:
Can be hard to code .. something like OpenCV might help
Probabilistic ... false positives are likely but can be reduced with neural networks and other AI
Slow unless you can encapsulate pattern qualities and distribute the search (MapReduce style)
Checkout image analysis books such as:
Pattern Classification 2ed
Image Processing Fundamentals
Image Processing - Principles and Applications
And others
If you are scaling the image, then things are simpler. If not, then you have to contend with the fact that scaling is lossy in more ways than sample reduction.
Using the byte size of the image for comparison would be suitable for many applications. Another way would be to:
Strip out the metadata.
Calculate the MD5 (or other suitable hashing algorithm) for the image.
Compare that to the MD5 (or whatever) of the potential dupe image (provided you've stripped out the metadata for that one too)
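The steps above can be sketched as follows, assuming the image has already been decoded into raw pixel bytes by some image library, which is the simplest way to leave metadata behind. The dictionaries standing in for decoded files and the name content_hash are hypothetical.

```python
import hashlib


def content_hash(width: int, height: int, pixels: bytes) -> str:
    """MD5 over the dimensions and raw pixel bytes only, never the metadata."""
    digest = hashlib.md5()
    digest.update(f"{width}x{height}:".encode("ascii"))
    digest.update(pixels)
    return digest.hexdigest()


# Two hypothetical decoded files: identical pixels, different metadata.
photo_a = {"metadata": {"camera": "A"}, "size": (2, 2), "pixels": bytes([10, 20, 30, 40])}
photo_b = {"metadata": {"camera": "B"}, "size": (2, 2), "pixels": bytes([10, 20, 30, 40])}

hash_a = content_hash(*photo_a["size"], photo_a["pixels"])
hash_b = content_hash(*photo_b["size"], photo_b["pixels"])
print(hash_a == hash_b)   # True: the metadata never entered the hash
```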
You could use an algorithm like SIFT (Scale Invariant Feature Transform) to determine key points in the pictures and match these.
See http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
It is used e.g. when stitching images in a panorama to detect matching points in different images.
You want to perform an image hash. Since you didn't specify a particular language I'm guessing you don't have a preference. At the very least there's a Matlab toolbox (beta) that can do it: http://users.ece.utexas.edu/~bevans/projects/hashing/toolbox/index.html. Most of the google results on this are research results rather than actual libraries or tools.
The problem with MD5ing it is that MD5 is very sensitive to small changes in the input, and it sounds like you want to do something a bit "smarter."
Pretty interesting question. The fastest and easiest approach would be to calculate the crc32 of the content byte array, but that would only work on 100% identical images. For a more intelligent comparison you would probably need some kind of fuzzy logic analysis...
I've implemented at least a trivial version of this. I transform and resize all images to a very small (fixed size) black and white thumbnail. I then compare those. It detects exact, resized, and duplicates transformed to black and white. It gets a lot of duplicates without a lot of cost.
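A rough sketch of that idea, not the implementation described above: average the image down to a fixed 8x8 grid, threshold each cell against the mean to get a black-and-white thumbnail, and compare thumbnails by counting differing cells. The image is assumed to be a list of rows of grayscale values at least 8 pixels on each side; all names are illustrative.

```python
def thumbnail_bits(image, size=8):
    """Reduce a grayscale image to size*size black/white cells."""
    height, width = len(image), len(image[0])
    cells = []
    for ty in range(size):
        y0, y1 = ty * height // size, (ty + 1) * height // size
        for tx in range(size):
            x0, x1 = tx * width // size, (tx + 1) * width // size
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))   # average brightness
    mean = sum(cells) / len(cells)
    return [int(c > mean) for c in cells]           # 1 = brighter than average


def hamming(a, b):
    """Number of thumbnail cells that differ."""
    return sum(x != y for x, y in zip(a, b))


small = [[x * 8 for x in range(32)] for _ in range(32)]           # gradient
big = [[small[y // 2][x // 2] for x in range(64)] for y in range(64)]
inverted = [[255 - v for v in row] for row in small]

print(hamming(thumbnail_bits(small), thumbnail_bits(big)))        # resized copy
print(hamming(thumbnail_bits(small), thumbnail_bits(inverted)))   # different image
```

A small distance suggests a duplicate; where to set the threshold is a tuning decision that depends on the collection.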
The easiest thing to do is to do a hash (like MD5) of the image data, ignoring all other metadata. You can find many open source libraries that can decode common image formats so it's quite easy to strip metadata.
But that doesn't work when the image itself is manipulated in any way, including scaling and rotating.
To do exactly what you want, you have to use Image Watermarking but it's patented and can be expensive.
This is just an idea: Possibly low frequency components present in the DCT of the jpeg could be used as a size invariant identifier.