How to combine zip files with Archive::Zip in Perl - perl

I have two zip files, A.zip and B.zip. I want to add whatever files are in A.zip to B.zip.
How can I do this with Archive::Zip in Perl? I thought I could do something like this:
my $zipA = Archive::Zip->new();
my $zipB = Archive::Zip->new();
die 'read error' unless ($zipA->read( 'A.zip' ) == AZ_OK );
my @members = $zipA->memberNames();
for my $m (@members) {
my $file = $zipA->removeMember($m);
$zipB->addMember($file);
}
but unless I call writeToFileNamed() then no files get created, and if I do call it, B.zip gets overwritten with the contents of A.zip.
I could read in the contents of B.zip, and write them along with the contents of A.zip back to B.zip but this seems really inefficient. (My problem actually involves millions of text files compressed into thousands of zip files.)
Is there a better way to do this?

Using Archive::Zip:
my $zipA = Archive::Zip->new('A.zip');
my $zipB = Archive::Zip->new('B.zip');
foreach ($zipA->members) {
$zipA->removeMember($_);
$zipB->addMember($_);
}
$zipB->overwrite;
The problems with your code are:
You need to add the actual members, not just the memberNames.
You need to read B.zip before you can add to it.
(I'll leave it to you to do error handling etc)
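Putting those two points together with basic error handling, a minimal sketch might look like the following. The sub name merge_zip and the policy of skipping names that already exist in B.zip are assumptions made for illustration; they are not part of the answer above.

```perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);

# Append every member of the archive at $src_path to the archive at $dst_path.
# Returns the number of members in the destination afterwards.
sub merge_zip {
    my ($src_path, $dst_path) = @_;

    my $src = Archive::Zip->new();
    my $dst = Archive::Zip->new();
    $src->read($src_path) == AZ_OK or die "Cannot read $src_path\n";
    $dst->read($dst_path) == AZ_OK or die "Cannot read $dst_path\n";

    # members() returns a list copy, so removing members while looping is safe.
    for my $member ($src->members) {
        # Assumed policy: keep the destination's copy when names collide.
        next if $dst->memberNamed($member->fileName);
        $src->removeMember($member);
        $dst->addMember($member);
    }

    # overwrite() writes back to the file the archive was read from.
    $dst->overwrite() == AZ_OK or die "Cannot rewrite $dst_path\n";
    return scalar $dst->numberOfMembers;
}
```

Calling merge_zip('A.zip', 'B.zip') leaves A.zip untouched on disk, because $src is never written back. Note that overwrite() still rewrites all of B.zip, so this does not avoid the cost the question is worried about.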

You may try chilkat::CkZip instead of Archive::Zip. Its QuickAppend() method seems to be helpful.

Related

Copying files with different extensions

I am new to Perl and I am trying to create a script that copies several files with different extensions from one directory to another. I am trying to use an array, but I'm not sure whether this is possible. I am open to other approaches if they are easier.
My code looks something like this;
my $locationone = "filepath";
my $locationtwo = "filepath";
my @files = ("test.txt", "test.xml", "test.html");
if (-e @files){
rcopy($locationone, $locationtwo)
}
The code might be a little rough because I'm going off the top of my head and I'm still new to Perl.
I'd really appreciate the help.
Regards
Your original idea is right, but it is missing a few things.
...
use File::Copy; # you will use this for the copy!
...
my $dest_folder = "/path/to/dest/folder";
my @sources_filenames = ("test.txt", "test.xml", "test.html");
my $source_folder = "/path/to/source/folder";
We set some useful variables: folder names and an array of file names.
foreach my $filename (@sources_filenames) {
We loop over the file names.
my $source_fullpath = "$source_folder/$filename"; # you could use
my $dest_fullpath = "$dest_folder/$filename"; # File::Spec "catfile" too.
Then, for each file, we build the full source path and the full destination path.
copy($source_fullpath, $dest_fullpath) if -e $source_fullpath;
Finally, we copy the file only if it exists.
}
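Assembled into one piece, the steps above might look like the sketch below. The sub name copy_existing and the choice to die on a failed copy are assumptions for illustration; File::Spec's catfile is used in place of string interpolation, as the comments above suggest.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;

# Copy each named file from $source_folder to $dest_folder if it exists.
# Returns the names of the files that were actually copied.
sub copy_existing {
    my ($source_folder, $dest_folder, @filenames) = @_;
    my @copied;
    for my $filename (@filenames) {
        my $source = File::Spec->catfile($source_folder, $filename);
        my $dest   = File::Spec->catfile($dest_folder,   $filename);
        next unless -e $source;    # silently skip missing files
        copy($source, $dest) or die "Copy of $source failed: $!\n";
        push @copied, $filename;
    }
    return @copied;
}
```

For example, copy_existing($source_folder, $dest_folder, @sources_filenames) copies whichever of the three files are present and returns their names.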
You can do something like this:
foreach my $file (@files)
{
next unless (-e "$locationone/$file");
# Note: mv moves the file; use cp here if the original must stay in place.
`mv $locationone/$file $locationtwo`;
}

How to recursively in perl readdir contents starting from root and then according to a user specified level retrieve files that end in .txt

I posted a previous question asking for help with this. I don't want to use any modules unless they are built in; I would prefer to write my own code. I know how to write the recursive part that lists all files from multiple directories, but I don't understand where or how to specify the desired search depth. For example, if I pass root and 3 as parameters, it should descend through up to 3 levels of directories and retrieve all files found at a depth less than or equal to 3. Any help is greatly appreciated.
Do you just want it to list all files, or to return them in an array? If merely printing them is enough, you could do something like:
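sub print_txt_recurse {
    my ($dir, $level) = @_;
    opendir(my $dh, $dir) or return;
    # Walk every entry except the "." and ".." links
    for my $entry (grep { !/^\.\.?$/ } readdir $dh) {
        my $file = "$dir/$entry";
        if (-f $file && $file =~ /\.txt$/) {
            print "$file\n";
        }
        elsif (-d $file && $level > 1) {
            print_txt_recurse($file, $level - 1);
        }
    }
    closedir $dh;
    return;
}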
You could use File::Find, a core Perl module, which means it should be available wherever Perl is installed.
See Core modules (F)
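As a rough sketch of how File::Find could honor a depth limit: the wanted callback can compute each entry's depth relative to the starting directory and set $File::Find::prune to stop descending. The sub name find_txt_up_to_depth and the convention that files directly inside the root count as depth 1 are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;

# Return all .txt files found at most $max_depth levels below $root.
# Files directly inside $root count as depth 1 (an assumed convention).
sub find_txt_up_to_depth {
    my ($root, $max_depth) = @_;
    my @found;
    find({
        no_chdir => 1,    # keep $_ equal to the full path
        wanted   => sub {
            my $rel   = File::Spec->abs2rel($File::Find::name, $root);
            my @parts = $rel eq '.' ? () : File::Spec->splitdir($rel);
            my $depth = @parts;
            if (-d $_) {
                # Do not descend into directories that sit at the depth limit.
                $File::Find::prune = 1 if $depth >= $max_depth;
                return;
            }
            push @found, $File::Find::name if -f $_ && /\.txt\z/;
        },
    }, $root);
    my @sorted = sort @found;
    return @sorted;
}
```

With this convention, find_txt_up_to_depth($root, 1) returns only the .txt files directly inside $root, matching the behavior of the hand-written recursion above when called with a level of 1.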

how to copy a file(input from user) with the name edited to the same directory?

I would like to take the path of a .txt file from the user and create an exact copy of it in the same location, with an edited name.
E.g. the user provides /file/works/done/abc.txt
The duplicated file should be /file/works/done/abc_edited.txt
I am able to duplicate the file. However, I can't get the suffix into the position I want in the name.
Assumption: $file is the argument from the user, e.g. $file is /file/works/done/abc.txt
Code as below:
my $a = '_edited';
my $duplicatedfile = $file.$a;
copy($file,$duplicatedfile) or die "Failed to copy $file: $!\n";
After execution, the duplicated file is /file/works/done/abc.txt_edited
However the one that I wish to have is /file/works/done/abc_edited.txt
Show us some code and a problem you're having with it, but please don't ask us to write the whole thing for you. You might want to look at the File::Copy module for an easy-to-use "copy file" method.
Oh well, after reading your comment it looks like all you need is something like
my $new_file_name = $file;
$new_file_name =~ s/\.([^\.]+)$/_edited.$1/;
use File::Basename;
my $full_path = '/file/works/done/abc.txt';
my ($name, $path, $ext) = fileparse($full_path, qr/\.[^.]*/);
my $new_full_path = $path.$name.'_edited'.$ext;
print $new_full_path;
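Combining the fileparse approach with the copy itself, a small sketch could look like this. The sub names edited_name and copy_as_edited are hypothetical, and the default suffix '_edited' is taken from the question.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Copy qw(copy);

# Insert $suffix between the base name and the extension of $path.
sub edited_name {
    my ($path, $suffix) = @_;
    $suffix = '_edited' unless defined $suffix;
    # fileparse returns (base name, directory with trailing slash, extension)
    my ($name, $dir, $ext) = fileparse($path, qr/\.[^.]*/);
    return $dir . $name . $suffix . $ext;
}

# Copy $path next to itself under the edited name and return the new path.
sub copy_as_edited {
    my ($path) = @_;
    my $new_path = edited_name($path);
    copy($path, $new_path) or die "Failed to copy $path: $!\n";
    return $new_path;
}
```

A file with no extension simply gets the suffix appended, since fileparse returns an empty extension in that case.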

How can my Perl script determine whether an Excel file is in XLS or XLSX format?

I have a Perl script that reads data from an Excel (xls) binary file. But the client that sends us these files has started sending us XLSX format files at times. I've updated the script to be able to read those as well. However, the client sometimes likes to name the XLSX files with an .xls extension, which currently confuses the heck outta my script since it uses the file name to determine which file type it is.
An XLSX file is a zip file that contains XML stuff. Is there a simple way for my script to look at the file and tell whether it's a zip file or not? If so, I can make my script go by that instead of just the file name.
Yes, it is possible by checking the file's magic number.
There are quite a few Perl modules for checking the magic number of a file.
An example using File::LibMagic:
use strict;
use warnings;
use File::LibMagic;
my $lm = File::LibMagic->new();
if ( $lm->checktype_filename($filename) eq 'application/zip; charset=binary' ) {
# XLSX format
}
elsif ( $lm->checktype_filename($filename) eq 'application/vnd.ms-office; charset=binary' ) {
# XLS format
}
Another example, using File::Type:
use strict;
use warnings;
use File::Type;
my $ft = File::Type->new();
if ( $ft->mime_type($file) eq 'application/zip' ) {
# XLSX format
}
else {
# probably XLS format
}
.xlsx files have the first 2 bytes as 'PK', so a simple open and examination of the first 2 characters will do.
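A minimal sketch of that check, reading the header by hand, is shown below. The sub name guess_excel_format is hypothetical. It compares against the 4-byte local file header signature "PK\x03\x04" of a zip file and the 8-byte signature of the OLE2 compound file container used by legacy .xls files. Both signatures only identify the container, so any zip file will be reported as 'xlsx' and any OLE2 file (for example an old .doc) as 'xls'.

```perl
use strict;
use warnings;

my $ZIP_MAGIC  = "PK\x03\x04";                          # zip local file header
my $OLE2_MAGIC = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";    # OLE2 compound file

# Guess the workbook format from the first bytes of the file,
# ignoring the file name entirely.
sub guess_excel_format {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!\n";
    my $header = '';
    read($fh, $header, 8);    # a short or empty file leaves $header short
    close $fh;
    return 'xlsx' if substr($header, 0, 4) eq $ZIP_MAGIC;
    return 'xls'  if $header eq $OLE2_MAGIC;
    return 'unknown';
}
```

The script could then dispatch on the returned string instead of on the extension, so an XLSX file named report.xls would still be routed to the XLSX reader.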
Edit: Archive::Zip is a better solution:
use Archive::Zip qw(:ERROR_CODES);    # exports AZ_OK
# Read a Zip file
my $somezip = Archive::Zip->new();
unless ( $somezip->read( 'someZip.zip' ) == AZ_OK ) {
    die 'read error';
}
Use File::Type:
my $file = "foo.zip";
my $filetype = File::Type->new( );
if( $filetype->mime_type( $file ) eq 'application/zip' ) {
# File is a zip archive.
...
}
I just tested it with a .xlsx file, and the mime_type() returned application/zip. Similarly, for a .xls file the mime_type() is application/octet-stream.
You can detect the xls file by checking the first bytes of the file for Excel headers.
A list of valid older Excel headers can be gotten from here (unless you know exact version of their Excel, check for all applicable possibilities):
http://toorcon.techpathways.com/uploads/headersig.txt
Zip headers are described here: http://en.wikipedia.org/wiki/ZIP_(file_format)#File_headers
but I'm not sure if .xlsx files have the same headers.
File::Type's logic seems to be "PK\003\004" as the file header to decide on zip files... but I'm not certain if that logic would work as far as .xlsx, not having a file to test.
The-Evil-MacBook:~ ivucica$ file --mime-type --brief file.zip
application/zip
Hence, probably comparing
`file --mime-type --brief $filename`
with application/zip would do the trick of detecting zips. Of course, you need to have file installed, which is quite usual on UNIX systems. I'm afraid I cannot provide a Perl example, since all knowledge of Perl has evaporated from my memory and I have no examples at hand.
I can't say about Perl, but with the framework I use, .Net, there are a number of libraries available that will manipulate zip files you could use.
Another thing that I've seen people use is the command-line version of WinZip. It gives a return value of 0 when a file is unzipped successfully and a non-zero value when there is an error.
This may not be the best way to do this, but it's a start.

Check length per file instead of entire request in CGI Upload

I am attempting to modify the Uber-Uploader Perl script so that, when an upload is checked against the upload requirements, the check is applied per file instead of just to the entire request.
I'm not too experienced with Perl, and don't know how to do this. Currently the script simply does this:
elsif($ENV{'CONTENT_LENGTH'} > $config{'max_upload_size'}){
my $max_size = &format_bytes($config{'max_upload_size'}, 99);
die("Maximum upload size of $max_size exceeded");
}
All that does is check the content length of the request (which contains multiple files), and fail when the total is greater than the max allowed size.
Is there a way to check it per file?
Note: Changing upload scripts is not an option. Don't try.
I'm not sure what you mean by "Changing upload scripts is not an option.", but have you tried something like this?
my $q = CGI->new();
my @files = $q->upload();
foreach my $file (@files){
if ((-s $file) > $config{'max_upload_size'}){
die("Maximum upload size exceeded by $file");
}
}
(NOTE: this is untested code!!!!!)
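Independently of CGI, the per-file test itself can be sketched as below. This assumes each upload is already available as a path on disk or an open filehandle, which is an assumption about how the script stores uploads and not something Uber-Uploader is confirmed to do. The sub name oversized_files is hypothetical.

```perl
use strict;
use warnings;

# Return the entries of @files whose size exceeds $max_bytes.
# Each entry may be a path or an open filehandle; -s accepts both.
sub oversized_files {
    my ($max_bytes, @files) = @_;
    my @too_big;
    for my $file (@files) {
        # -s yields undef for a missing file and a false value for an
        # empty one, so fall back to 0 in both cases.
        my $size = (-s $file) || 0;
        push @too_big, $file if $size > $max_bytes;
    }
    return @too_big;
}
```

The script could then die only when oversized_files($config{'max_upload_size'}, @files) returns a non-empty list, and name the offending files in the message.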