Is it safe to compute a hash on an image compressed in a lossless format such as PNG, GIF, etc.? - hash

I was wondering if any lossless image compression format such as PNG comes with some kind of uniqueness guarantee, i.e. that two different compressed binaries always decode to different images.
I want to compute the hash of images that are stored in a lossless compression format and am wondering if computing the hash of the compressed version would be sufficient.
(There are some good reasons to compute the hash on the uncompressed image, but those are outside the scope of my question here.)

No, that's not true for PNG. The compression procedure has many parameters (the filter type used for each row, the zlib compression level and settings), so a single raw image can result in many different PNG files. Even worse, PNG allows ancillary data (chunks) with miscellaneous info (for example, textual comments) to be included.
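So if you want the hash to identify the image rather than one particular encoding of it, decode first and hash the pixel data. A minimal sketch in Python, assuming the Pillow library is available (not part of the original question):

import hashlib
from PIL import Image  # assumes the Pillow library is installed

def file_hash(path):
    """Hash of the compressed bytes: changes with filter choices, zlib settings,
    or ancillary chunks, even when the pixels are identical."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def pixel_hash(path):
    """Hash of the decoded image: stable across different PNG encoders."""
    with Image.open(path) as im:
        im = im.convert("RGBA")  # normalize the pixel representation
        data = f"{im.size}".encode() + im.tobytes()
    return hashlib.sha256(data).hexdigest()

Two PNGs produced by different encoders from the same source will then agree on pixel_hash but may well disagree on file_hash.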

Related

Are file formats independent of the endianness of the system?

There are many common file formats, for example jpeg images.
Suppose a jpeg image exists on two systems, one using big endian and the other little endian.
Will the saved jpeg files look different?
In other words, if we have the images saved in a contiguous area of memory say starting from byte 0, will the images be the same?
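For what it's worth, formats like JPEG fix the byte order in their specification (marker segments are big-endian), so a conforming writer serializes with that explicit order rather than the host's native order, and the files come out byte-identical on both machines. A small, hypothetical Python illustration of the difference between explicit and native byte order (not from the original question):

import struct

value = 0xFFD8  # the JPEG SOI marker, used here just as an example 16-bit value

explicit = struct.pack(">H", value)  # big-endian as the format specifies: same bytes everywhere
native   = struct.pack("=H", value)  # host byte order: differs between big- and little-endian machines

print(explicit.hex())  # 'ffd8' on every host
print(native.hex())    # 'ffd8' on big-endian hosts, 'd8ff' on little-endian hosts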

Best practice to compress bitmap with LZ4

I'm packing some image resources for my game, and since this is a typical "compress once, decompress many times" scenario, LZ4 High Compression fits me well (LZ4HC takes longer to compress, but decompresses very fast).
I compressed a bitmap from 7.7MB to 3.0MB, which looks good to me, until I found that the PNG version is only 1.9MB.
I know that LZ4HC does not achieve the ratio that deflate (which is used by PNG) does, but 2.55 vs 4.05 doesn't look right.
I searched and found that before compressing, the PNG format performs a filtering operation; I don't know the details, but it looks like the filtering step manipulates the data so that it fits the compression algorithm better.
So my question is:
Do I need to perform a filtering step before compressing with LZ4?
If yes, where can I get a library (or code snippet) to perform filtering?
If not, is there any way to make PNG (or another lossless image format) compress slowly but decompress fast?
The simplest filtering in PNG is just taking the difference of subsequent pixels. The first pixel is sent as is, the next pixel is sent as the difference of that pixel and the previous pixel, and so on. That would be quite fast, and provide a good bit of the compression gain of filtering.
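If you want to experiment with that, here is a rough sketch of such a Sub-style filter applied before LZ4, assuming NumPy, the python-lz4 package, and an 8-bit grayscale bitmap (the gradient image below is just a stand-in for real data):

import numpy as np
import lz4.frame  # assumes the python-lz4 package

def sub_filter(rows):
    """PNG-style 'Sub' filter: each byte minus the byte to its left, per row,
    with uint8 wrap-around (modulo 256) as in PNG."""
    filtered = rows.copy()
    filtered[:, 1:] -= rows[:, :-1]
    return filtered

# stand-in bitmap: a smooth horizontal gradient, 1080 rows x 1920 columns
raw = np.tile((np.arange(1920) % 256).astype(np.uint8), (1080, 1))

plain    = lz4.frame.compress(raw.tobytes())
filtered = lz4.frame.compress(sub_filter(raw).tobytes())
print(len(plain), len(filtered))  # the filtered buffer compresses far better here

On the decode side you undo the filter with a running sum per row (again modulo 256) before using the pixels.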

Apple Compression Library: identify the algorithm used to compress data

Apple has four compression algorithms in its Compression library.
How can I tell which algorithm was used to compress a given NSData instance? The compressed data should have a header of some kind, no?
edit: by extension, how do you know the size of compressed/decompressed data and thus the required buffer size?

Storing lots of images on server compression

We have a project which will generate lots (hundreds of thousands) of .PNG images that are around 1 MB each. Rapid serving is not a priority, as we use the images internally, not on the front end.
We know to use the filesystem, not a DB, for storage.
We'd like to know how best to compress these images on the server to minimise long-term storage costs.
Linux server.
They already are compressed, so you would need to recode the images into another lossless format while preserving all of the information present in the PNG files. I don't know of a format that will do that, but you can roll your own by recoding the image data using a better lossless compressor (you can see benchmarks here) and keeping a separate metadata file that retains the other information from the original .png files, so that you can reconstruct the original.
The best you could get losslessly, based on the benchmarks, would be about 2/3 of their current size. You would need to test the compressors on your actual data. Your mileage may vary.
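As a rough illustration of the roll-your-own route described above, assuming Pillow for decoding and LZMA from the Python standard library as the stronger compressor (the file layout and names are made up for the example):

import json, lzma, pathlib
from PIL import Image  # assumes the Pillow library is installed

def recode_png(png_path, out_dir):
    """Store the pixel data recompressed with LZMA plus a metadata sidecar,
    so the image (though not the bit-identical original PNG file) can be rebuilt."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with Image.open(png_path) as im:
        meta = {"size": im.size, "mode": im.mode,
                "info": {k: str(v) for k, v in im.info.items()}}
        pixels = im.tobytes()
    stem = pathlib.Path(png_path).stem
    (out / (stem + ".pix.xz")).write_bytes(lzma.compress(pixels, preset=9))
    (out / (stem + ".meta.json")).write_text(json.dumps(meta))

Whether the saving justifies the extra moving parts is something you would have to measure on your own images.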

image and video compression

What are similar compressors to the RAR algorithm?
I'm interested in compressing videos (for example, avi) and images (for example, jpg)
Winrar reduced an avi video (1 frame/sec) to 0.88% of its original size (i.e. it was 49.8MB, and it went down to 442KB).
It finished the compression in less than 4 seconds.
So, I'm looking for a similar (open) algorithm. I don't care about decompression time.
Compressing "already compressed" formats are meaningless. Because, you can't get anything further. Even some archivers refuse to compress such files and stores as it is. If you really need to compress image and video files you need to "recompress" them. It's not meant to simply convert file format. I mean decode image or video file to some extent (not require to fully decoding), and apply your specific models instead of formats' model with a stronger entropy coder. There are several good attempts for such usages. Here is a few list:
PackJPG: Open source and fast performer JPEG recompressor.
Dell's Experimental MPEG1 and MPEG2 Compressor: Closed source and proprietry. But, you can at least test that experimental compressor strength.
Precomp: Closed source free software (but, it'll be open in near future). It recompress GIF, BZIP2, JPEG (with PackJPG) and Deflate (only generated with ZLIB library) streams.
Note that recompression is usually very time consuming process. Because, you have to ensure bit-identical restoration. Some programs even check every possible parameter to ensure stability (like Precomp). Also, their models have to be more and more complex to gain something negligible.
Compressed formats like JPEG can't really be compressed much further, since they are already close to their entropy limit; however, uncompressed formats like BMP, WAV, and (uncompressed) AVI can.
Take a look at LZMA
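For example, LZMA is available in Python's standard library through the lzma module; a minimal sketch with a made-up file name:

import lzma
import shutil

# Compress a file with LZMA at the highest preset; compression is slow,
# decompression is comparatively fast.
with open("video.avi", "rb") as src, lzma.open("video.avi.xz", "wb", preset=9) as dst:
    shutil.copyfileobj(src, dst)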