What is the difference between tar and gz? - iPhone

When I compress the file "file.tar.gz" in the iPhone SDK, it gives file.tar, but both the .tar and the .tar.gz are the same size. Any help, please?

*.tar means that multiple files are combined into one (Tape Archive).
*.gz means that the file is compressed as well (gzip compression).
Edit: the fact that the size is the same doesn't say much. Some files simply can't be compressed any further.

As Rhapsody said, tar is an archive containing multiple files, and gz is a file that has been compressed using gzip. The reason two formats are used is that gzip only supports compressing a single file - perhaps due to the UNIX philosophy that a program should do one thing, and do it well.
In any case, if you have the option you may want to use bzip2 compression, which is more efficient (i.e., it compresses files to a smaller size) than gzip.
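To see the difference concretely, here is a small Python sketch (the docs/ directory is just a placeholder for your own files) that builds a plain .tar and a gzip-compressed .tar.gz from the same input and prints both sizes:

import os
import tarfile

SRC = "docs"  # placeholder input directory; point this at your own files

# A plain .tar just concatenates the files plus their headers, with no compression.
with tarfile.open("files.tar", "w") as tar:
    tar.add(SRC)

# "w:gz" writes the same archive through gzip compression.
with tarfile.open("files.tar.gz", "w:gz") as tar:
    tar.add(SRC)

print("tar:   ", os.path.getsize("files.tar"), "bytes")
print("tar.gz:", os.path.getsize("files.tar.gz"), "bytes")
# If the inputs are already compressed (JPEGs, zips, ...) the two sizes can
# come out almost identical, which matches what the question describes.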

Related

Spark Scala read a specific file within a compressed archive

How do I read a specific file inside a compressed archive, without having the OS uncompress the whole archive first?
e.g. I have thousands of compressed archives with:
file1.tsv <--- read by default
file2.tsv <--- I want this one in each archive
file3.tsv
I want to read file2.tsv in every compressed archive without uncompressing each first.
When I tried spark.read("/compressed-archive.tar.gz") it only seems to read file1.tsv.
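Spark's built-in readers handle the gzip layer but do not unpack tar containers, which would explain why only file1.tsv seems to come through. As a plain illustration outside Spark (a Python sketch using the placeholder names from the question), the wanted member can be streamed out of each archive without unpacking it to disk:

import tarfile

ARCHIVE = "compressed-archive.tar.gz"   # placeholder path from the question
WANTED = "file2.tsv"                    # adjust if the member name has a "./" prefix

# "r:gz" reads through the gzip layer; extractfile() streams one member
# without unpacking the whole archive to disk first.
with tarfile.open(ARCHIVE, "r:gz") as tar:
    with tar.extractfile(tar.getmember(WANTED)) as fh:
        for line in fh:
            print(line.decode("utf-8").rstrip("\n"))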

copy (multiple) files only if filesize is smaller

I'm trying to make my image reference library take up less space. I know how to make Photoshop batch-save directories of images with a particular amount of compression. BUT some of my images were originally saved with more compression than I would have used.
So I wind up with two directories of images: some of the newer files have a larger filesize, some smaller, and some the same. I want to copy the new images over into the old directory, excluding any files that have a larger filesize (or the same, though these probably aren't numerous enough for me to care about the extra time to process them).
I obviously don't want to sit there and parse through each file, but other than that I'm not picky about how it gets tackled.
running Windows 10, btw.
We have similar situations. Instead of Photoshop, I use FFmpeg (using its qscale option) to batch re-encode multiple images into a subfolder, then use XXCOPY to overwrite only the larger original source images. In fact I ended up creating a batch file which lets FFmpeg do the batch re-encoding (using its "best" qscale setting), then lets ExifTool batch-copy the metadata to the newly encoded images, then lets XXCOPY copy only the smaller newly created images. All automated, with the "new" folder and its leftover newly created but larger-sized images deleted too. Thus I save considerable disk space, as I have many images categorized/grouped in many different folders. But you should make a test run first or back up your images. I hope this works for you.
Here is my XXCOPY command line:
xxcopy "C:\SOURCE" "C:\DESTINATION" /s /bzs /y
The original post/forum where I learned this from is:
overwrite only files wich are smaller
https://groups.google.com/forum/#!topic/alt.msdos.batch.nt/Agooyf23kFw
Just to add, XXCOPY can also do the reverse if the larger file size is wanted instead, which I think is /BZL. I think that's also mentioned in that original post/forum.
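If XXCOPY isn't an option, the same "replace only when the new file is smaller" rule is easy to script yourself. A minimal Python sketch, using the same placeholder paths as the command above:

import os
import shutil

SRC = r"C:\SOURCE"        # placeholder: folder with the re-encoded images
DST = r"C:\DESTINATION"   # placeholder: the original reference library

for name in os.listdir(SRC):
    src_path = os.path.join(SRC, name)
    dst_path = os.path.join(DST, name)
    if not os.path.isfile(src_path):
        continue
    # Overwrite only when the counterpart exists and the new file is strictly smaller.
    if os.path.isfile(dst_path) and os.path.getsize(src_path) < os.path.getsize(dst_path):
        shutil.copy2(src_path, dst_path)
        print("replaced", name)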

Get Maximum Compression from 7zip compression algorithm

I am trying to compress some of my large document files, but most of the files get compressed by only about 10% at most. I am using 7-Zip terminal commands.
7z a filename.7z -m0=LZMA -mx=9 -mmt=on -aoa -mfb=64 filename.pptx
Any suggestions on changing the parameters? I need at least a 30% compression ratio.
.pptx and .docx files are internally .zip archives. You cannot expect a lot of compression on an already compressed file.
The documentation states that LZMA2 handles incompressible data better, so you can try
7z a -m0=lzma2 -mx filename.7z filename.pptx
But the required 30% is almost unreachable.
If you really need that compression, you could use the fact that a pptx is just a fancy zip file:
Unzip the pptx, then compress it with 7zip. To recover an equivalent (but not identical) pptx, decompress with 7zip and recompress with zip.
There are probably some complications; for example, with epub there is a certain file that must be stored uncompressed as the first file in the archive, at a certain offset from the start. I'm not familiar with pptx, but it might have similar requirements.
I think it's unlikely that the small reduction in file size is worth the trouble, but it's the only approach I can think of.
Depending on what's responsible for the size of the pptx you could also try to compress the contained files. For example by recompressing png files with a better compressor, stripping unnecessary data (e.g. meta-data or change histories) or applying lossy compression with lower quality settings for jpeg files.
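For what it's worth, here is a rough sketch of that round trip in Python, shelling out to the same 7z binary used above (slides.pptx and the folder names are placeholders, and as noted above you should check that PowerPoint still opens the re-zipped result):

import subprocess
import zipfile
from pathlib import Path

SRC = "slides.pptx"                  # placeholder input file
WORKDIR = Path("slides_unzipped")    # placeholder working folder

# 1) Unpack the pptx - it is an ordinary zip container.
with zipfile.ZipFile(SRC) as zf:
    zf.extractall(WORKDIR)

# 2) Compress the unpacked tree with 7-Zip (solid LZMA2 archive).
subprocess.run(["7z", "a", "-m0=lzma2", "-mx=9", "slides.7z", str(WORKDIR)], check=True)

# 3) To get a pptx back later: extract slides.7z, then re-zip the tree.
def rezip(tree: Path, out_pptx: str) -> None:
    with zipfile.ZipFile(out_pptx, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(tree.rglob("*")):
            if path.is_file():
                zf.write(path, str(path.relative_to(tree)))

# rezip(WORKDIR, "slides_restored.pptx")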
Well, just an idea for getting maximum compression:
'recompress' these .zip archives (the .docx, .pptx, .jar, ...) using -m0 (store = no compression), and then
apply LZMA2 to them.
LZMA2 is pretty good - however, if the file contains many JPEGs, consider giving the open-source packer PeaZip, or more specifically paq8o, a try. PAQ8 has a built-in JPEG compressor and supports range compression, so it will also cope with JPEGs that sit inside some other file. WinZip's zipx, in contrast, requires pure JPEG files and is useless in this case.
But again, to make PAQ compress your target file effectively you'll need to 'null' the zip/deflate compression, i.e. turn it into an uncompressed zip.
Well, PAQ is probably a little exotic; however, in my eyes it's more honest and clear than zipx. PAQ is unsupported, so as always it's a good idea to just google for whatever you don't have/know and you will find something.
Zipx, in contrast, may appear a little misleading, since it looks like a normal zip and the files are listed properly in WinRAR or 7-Zip, but when you try to extract the JPEGs it will fail, so an inexperienced user may think the zip is corrupted. It's much harder to find out that it is a zipx, which so far only WinZip or The Unarchiver (unar.exe) can handle properly.
PPTX, XLSX, and DOCX files can indeed be compressed effectively if there are many of them. By unzipping each of them into their directories, an archiver can find commonalities between them, deduplicating the boilerplate XML as well as any common text between them.
If you must use the ZIP format, first create a zero-compression "store" archive containing all of them, then ZIP that. This is necessary because each file in a ZIP archive is compressed from scratch without taking advantage of redundancies across different files.
By taking advantage of boilerplate deduplication, 30% should be a piece of cake.
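A sketch of that pipeline in Python (the documents/ folder name is a placeholder): each Office file is rewritten as a zero-compression zip so the outer compressor sees raw XML, and then everything is compressed together in one solid pass, with tarfile's xz/LZMA2 mode standing in for the outer 7-Zip step:

import tarfile
import zipfile
from pathlib import Path

DOCS = Path("documents")   # placeholder folder of .pptx/.docx files
STORED = Path("stored")
STORED.mkdir(exist_ok=True)

# 1) Rewrite each Office file as a zero-compression ("stored") zip so the
#    outer compressor sees raw XML instead of already-deflated streams.
for doc in DOCS.glob("*.pptx"):
    with zipfile.ZipFile(doc) as src, \
         zipfile.ZipFile(STORED / doc.name, "w", zipfile.ZIP_STORED) as dst:
        for name in src.namelist():
            dst.writestr(name, src.read(name))

# 2) Compress all the stored copies in one solid pass; "w:xz" uses LZMA2,
#    standing in for "7z a -m0=lzma2 -mx=9 ..." on the whole set.
with tarfile.open("documents.tar.xz", "w:xz") as tar:
    tar.add(STORED, arcname="stored")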

Unzip NSData without temporary file

I've found a couple of libs (LiteZip and ZipArchive) that allow unzipping files on the iPhone, but both of them require the input to be a file. Is there a library that can directly unzip NSData containing zip-archived data, without writing it to a temporary file first?
I've tried to adapt the libs mentioned above for that, but with no success so far.
In this answer to this question, I point out the CocoaDev wiki category on NSData which adds zip / unzip support to that class. This would let you do this entirely in memory.
From what I understand, the zip format stores files separately, and each stored file is compressed individually using a compression algorithm (generally the DEFLATE algorithm).
If you're only interested in uncompressing data that was compressed with the DEFLATE algorithm, you could use this zlib addition to NSData from Google Toolbox For Mac.
It doesn't need temporary files.
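Just to illustrate the in-memory idea itself (this is a Python sketch of the concept, not iPhone code; on iOS you would use one of the NSData additions mentioned above): a zip blob can be created and read back entirely through memory buffers, with no temporary file involved:

import io
import zipfile

# Build a small zip entirely in memory; zip_bytes stands in for the NSData blob.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", "hello from inside the archive\n")
zip_bytes = buf.getvalue()

# Unzipping needs no temporary file either: wrap the bytes in a memory buffer.
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
    print(zf.read("hello.txt").decode("utf-8"))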

How can I tar files larger than physical memory using Perl's Archive::Tar?

I'm using Perl's Archive::Tar module. The problem with it is that it pulls everything into memory, does the archiving, and then writes it out to the file system, so there is a limit on the maximum file size that can be archived; most of the time it says "out of memory". GNU tar, by contrast, takes a chunk of a file, archives it, and writes it out, so it can handle files of any size. How can I do that using Perl's Archive::Tar module?
It looks like Archive::Tar::Wrapper is your best bet. I've not tried it myself, but it uses your system's tar executable and doesn't keep files in memory.
Contrary to Chas. Owen's answer, Archive::Tar::Streamed does keep files in memory and does not use your system's tar. It actually uses Archive::Tar internally, but it processes one file at a time (taking advantage of the fact that tar archives can be concatenated). This means that Archive::Tar::Streamed can handle archives bigger than memory, as long as each individual file in the archive will fit in memory. But that's not what you asked for.
It looks like there is a different module that doesn't use an in-memory structure: Archive::Tar::Streamed. The downside is that it requires tar to be available on the system it is run on. Still, it is better than puppet-stringing tar yourself.
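For comparison only, the streaming pattern these answers describe (add one file at a time and never hold the whole archive in memory) looks like this with Python's tarfile; it is an illustration of the idea, not a replacement for the Perl modules above:

import tarfile
from pathlib import Path

SRC = Path("big_data")   # placeholder directory of large files

# "w|gz" is stream mode: each member is written out as it is added, so the
# whole archive is never held in memory, even if it is larger than RAM.
with tarfile.open("backup.tar.gz", "w|gz") as tar:
    for path in sorted(SRC.rglob("*")):
        if path.is_file():
            tar.add(path, arcname=str(path.relative_to(SRC)))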