Multifile favicon.ico much bigger than sum of sizes of component files - favicon

I want to create multifile favicon.ico according to great favicon cheat sheet.
I created 3 .png files, optimized them with OptiPNG and received files with 1, 2 and 3kb size.
Now I'm trying to create favicon.ico from them using Imagemagick but final file is around 15kb big (sum of component files is around 6kB).
What causes this effect and how i can avoid it ?

A workaround is to rely on the HTTP gzip compression.
For example, I packed 3 PNG pictures (3580 bytes in total) in a 15086 bytes favicon.ico file. When I download it with gzip compression enabled on the server side, I get 2551 bytes of data. This is even more efficient than downloading the PNG pictures one by one, as gzip is usually not enabled for this kind of file.
However, gzip is often not configured for .ico files (this is more for text files). On Apache, this can be fixed by adding:
<IfModule mod_deflate.c>
(... other rules here...)
AddOutputFilterByType DEFLATE image/x-icon
</IfModule>
in an Apache configuration file, such as /etc/apache2/mods-available/deflate.conf.
This is not the answer you expected and I hope someone will come with a magic command line to produce the lightweight favicon.ico we deserve!

Another workaround is using other another tool to create favicon file.
I found (on stack ofc ;)) png2ico tool. It gave me 8kb file (sum of components is around 6) which is size I can live with.
Please read comments to this answer - philippe_b observed that sometimes png2ico gives him poor results.

Only the 256x256 32bit icon is PNG compressed, others are stored as classic icons in .ICO file. So, there's a logical increase in filesize since it will decompress the other smaller size icons.

Related

Generate PNG file with pre-known CRC

Is it possible to create a PNG file with a predefined CRC? (kind of a programming challenge..)
I have a python script to generate hex codes with the target CRC, but I'm not sure how to make a valid PNG out of it.
BTW - it may be that I'm talking nonsense, but it sounds possible on theory (right?)
You can use spoof.c to do that, either at the level of a PNG chunk or at the level of the entire file. (Note that a PNG file does not contain a CRC of the whole thing, only CRCs of the chunks.)

Get Maximum Compression from 7zip compression algorithm

I am trying to compress some of my large document files. But most of files are getting compresses by only 10% maximum. I am using 7zip Terminal Commands.
7z a filename.7z -m0=LZMA -mx=9 -mmt=on -aoa -mfb=64 filename.pptx
Any suggestion on changing parameters. I need at least 30% compression ratio.
.pptx files or .docx files are internally .zip archives. You can not expect a lot of compression on an already compressed file.
Documentation states lzma2 handles better data that can not be compressed, so you can try with
7z a -m0=lzma2 -mx filename.7z filename.pptx
But the required 30% is almost unreachable.
If you really need that compression, you could use the fact that a pptx is just a fancy zip file:
Unzip the pptx, then compress it with 7zip. To recover an equivalent (but not identical) pptx decompress with 7zip and recompress with zip.
There are probably some complications, for example with epub there is a certain file that must be stored uncompressed as first file in the archive at a certain offset from the start. I'm not familiar with pptx, but it might have similar requirements.
I think it's unlikely that the small reduction in file size is worth the trouble, but it's the only approach I can think of.
Depending on what's responsible for the size of the pptx you could also try to compress the contained files. For example by recompressing png files with a better compressor, stripping unnecessary data (e.g. meta-data or change histories) or applying lossy compression with lower quality settings for jpeg files.
Well just an idea to max compressing is
'recompress' these .zip archives(the .docx, .pptx, jar...) using -m0 (storing = NoCompression) and then
apply lzma2 on them
lzma2 is petty good - however if the file contains many jpg's consider to give the opensource packer peazip or more specify paq8o a try. Paq8 has a build in Jpeg compressor and supports range compression. So it will also come along with jpg's the are inside some other file. Winzip's zipx in contrast to this will require pure jpg files and is useless in this case.
But again to make PAQ effectively working/compressing your target file you'll need to 'null' the zip/deflate compression, turn it into an uncompressed zip.
Well PAQ is probably a little exotic, however it's in my eye's more honest and clear than zipx. PAQ is unsupport so it's as always a good idea to just google for what don't have/know and you will find something.
Zipx in contrast may appears a little intrigious since it looks like a normal zip and files are listed properly in Winrar or 7zip but when you like to extract jpg's it will fail so if the user is not experienced it may seem like the zip corrupted. It'll be much harder to find out that is a zipx that so far only winzip or The Unarchiver(unar.exe) can handle properly.
PPTX, XLSX, and DOCX files can indeed be compressed effectively if there are many of them. By unzipping each of them into their directories, an archiver can find commonalities between them, deduplicating the boilerplate XML as well as any common text between them.
If you must use the ZIP format, first create a zero-compression "store" archive containing all of them, then ZIP that. This is necessary because each file in a ZIP archive is compressed from scratch without taking advantage of redundancies across different files.
By taking advantage of boilerplate deduplication, 30% should be a piece of cake.

Method to decompress a PDF (non-Adobe) while retaining form fields?

I found a similar question that involves Acrobat, but in this case the PDF was made with a combination of MS Word and CenoPDF v3, with which I'm unfamiliar. Additionally the PDF is version 1.3. I'd like to decompress it, to see its low-level workings and make some changes. It's easy with GhostScript's -dCompressPages=false parameter, but that simultaneously strips all the fill-in form functionality. Is there a method for decompressing the file while leaving everything else intact? A quick search of the docs for tcpdf and fpdi (cited in the link) didn't reveal a compression option.
Ghostscript and pdfwrite isn't a good combination. The PDF file you get out is NOT the same as the one you put in. This is because of the way that Ghostscript and pdfwrite work; the input is fully interpreted to a sequence of graphics primitives, which is sent to the Ghostscript graphics library. These are then sent to the requested device, most devices then render the result to a bitmap, but the pdfwrite family reassemble those graphics primitives int a new PDF file.
Note that the contents of the new PDF file have no relationship to the original, other than the appearance when rendered. Ghostscript and pdfwrite do maintain much of the non-marking content of PDF files such as hyperlinks and so on (which obviously don't get turned into graphics primitives), by interpreting them into pdfmark operations (an extension to the PostScript language defined by Adobe). However, even if Ghostscript and pdfwrite maintained all this content, the resulting PDF file wouldn't be the same as the original one decompressed....
There are tools which will decompress PDF files, and I would recommend one of our other products, MuPDF. A part of this is mutool, and "mutool clean -d in.pdf out.pdf" will decompress pretty much everything in a PDF file
QPDF can decompress PDF documents (among other things). I used this tool in the past and it preserved forms and data.
The tool has some issues with large PDFs (can take too much time and memory for decompression). The tool can produce incomplete output (with warnings in console) for some partially broken / nonstandard PDFs.

is partial gz decompression possible?

For working with images that are stored as .gz files (my image processing software can read .gz files for shorter/smaller disk time/space) I need to check the header of each file.
The header is just a small struct of a fixed size at the start of each image, and for images that are not compressed, checking it is very fast. For reading the compressed images, I have no choice but to decompress the whole file and then check this header, which of course slows down my program.
Would it be possible to read the first segment of a .gz file (say a couple of K), decompress this segment and read the original contents? My understanding of gz is that after some bookkeeping at the start, the compressed data is stored sequentially -- is that correct?
so instead of
1. open big file F
2. decompress big file F
3. read 500-byte header
4. re-compress big file F
do
1. open big file F
2. read first 5 K from F as stream A
3. decompress A as stream B
4. read 500-byte header from B
I am using libz.so but solutions in other languages are appreciated!
You can use gzip -cd file.gz | dd ibs=1024 count=10 to uncompress just the first 10 KiB, for example.
gzip -cd decompresses to the standard output.
Pipe | this into the dd utility.
The dd utility copies the standard input to the standard output.
Sodd ibs=1024 sets the input block size to 1024 bytes instead of the default 512.
And count=10 Copies only 10 input blocks, thus halting the gzip decompression.
You'll want to do gzip -cd file.gz | dd count=1 using the standard 512 block size and just ignore the extra 12 bytes.
A comment highlights that you can use gzip -cd file.gz | head -c $((1024*10)) or in this specific case gzip -cd file.gz | head -c $(512). The comment that the original dd relies on gzip decompressing in 1024 doesn't seem to true. For example dd ibs=2 count=10 decompresses the first 20 bytes.
Yes, it is possible.
But don't reinvent the wheel, the HDF5 database supports different compression algorithms (gz among them) and you can address different pieces. It is compatible with Linux and Windows and there are wrappers to many languages. It also supports reading and decompressing in parallel, that is very useful if you use high compression rates.
Here is a comparison of read speed using different compression algorithms from Python through PyTables:
A Deflate stream can have multiple blocks back to back. But you can always decompress just the number of bytes you want, even if it's part of a larger block. The zlib function gzread takes a length arg, and there are various other ways to decompress a specific amount of plaintext bytes, regardless of how long the full stream is. See the zlib manual for a list of functions and how to use them.
It's not clear if you want to modify just the headers. (You mention recompressing the whole file, but option B doesn't recompress anything). If so, write headers in a separate Deflate block so you can replace that block without recompressing the rest of the image. Use Z_FULL_FLUSH when you call the zlib deflate function to write the headers. You probably don't need to record the compressed length of the headers anywhere; I think it can be computed when reading them to figure out which bytes to replace.
If you aren't modifying anything, recompressing the whole file makes no sense. You can seek and restart decompression from the start after finding headers you like...

what is the difference between tar and gZ?

when i compress the file "file.tar.gZ" in iphone SDK, it gives file.tar , but both tar and tar.gZ gives same size?any help please?
*.tar means that multiple files are combined to one. (Tape Archive)
*.gz means that the files are compressed as well. (GZip compression)
Edit: that the size is the same doesn't say a lot. Sometimes files can't be compressed.
As Rhapsody said, tar is an archive containing multiple files, and gz is a file that is compressed using gzip. The reason why two formats are used is because gzip only supports compressing one file - perhaps due to the UNIX philosophy that a program should do one thing, and do it well.
In any case, if you have the option you may want to use bzip2 compression, which is more efficient (IE, compresses files to a smaller size) than gzip.