How can I use Archive::Extract safely against zip bombs and the like in Perl?

Problem outline:
I need to allow uploading ZIP files (and tgz and other compressed directory trees) via a web form.
The ZIP files should be extracted so their contents can be handled.
I'm planning to use Archive::Extract for the extraction.
But there are things like ZIP bombs and the like...
From the manual:
Archive::Extract can use either pure perl modules or command line
programs under the hood. Some of the pure perl modules (like
Archive::Tar and Compress::unLZMA) take the entire contents of the
archive into memory, which may not be feasible on your system.
Consider setting the global variable $Archive::Extract::PREFER_BIN to
1, which will prefer the use of command line programs and won't
consume so much memory.
The questions are:
When I set $Archive::Extract::PREFER_BIN = 1, am I sufficiently protected against zip-bomb-like attacks?
$Archive::Extract::PREFER_BIN protects me against excessive memory usage, but are the standard unzip, tar -z, and unrar binaries themselves safe against zip-bomb-like attacks?
If not, how do I safely handle an uploaded compressed directory tree (so there is not only one file inside the e.g. ZIP archive)?

$Archive::Extract::PREFER_BIN = 1 doesn't protect you against zip bombs; you are just passing the problem to your system's unzip binary.
This SO question may help you. I like the idea of running the extraction in a second process under ulimit.
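A minimal sketch of that idea in Perl, using fork plus the CPAN module BSD::Resource to cap the child's resources before exec'ing unzip (the limit values, file names, and unzip flags are illustrative assumptions):
use strict;
use warnings;
use BSD::Resource qw(setrlimit RLIMIT_CPU RLIMIT_FSIZE RLIMIT_AS);

my $zip = 'upload.zip';   # hypothetical uploaded archive
my $dir = 'extracted';    # hypothetical extraction directory

my $pid = fork;
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # child: cap CPU time, output file size, and address space
    # before handing control to unzip (limits are illustrative)
    setrlimit(RLIMIT_CPU,   30,        30)        or die "setrlimit: $!";
    setrlimit(RLIMIT_FSIZE, 100 << 20, 100 << 20) or die "setrlimit: $!";
    setrlimit(RLIMIT_AS,    512 << 20, 512 << 20) or die "setrlimit: $!";
    exec 'unzip', '-q', $zip, '-d', $dir or die "exec failed: $!";
}
waitpid $pid, 0;
die "extraction failed or was killed by the limits\n" if $? != 0;
If a limit is hit, the kernel stops the child, and the parent sees a non-zero exit status instead of a runaway extraction.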

Related

Virtual filesystem in Perl

I'm looking for a virtual filesystem layer in Perl. Something that would provide a general abstraction for basic filesystem routines like ls, mkdir and so on, regardless how the actual filesystem is implemented.
I'd like an interface like this:
# create a directory "/some/path/tmp" in my current filesystem
my $plainfs = Module->new(type => 'local', root => '/some/path');
$plainfs->mkdir("/tmp");
# create "tmp" dir on a remote filesystem
my $sshfs = Module->new(type => 'ssh', root => 'user:password@example.com:~/pub');
$sshfs->mkdir("/tmp");
I found the VFS package on MetaCPAN; unfortunately it contains only empty, unimplemented modules.
Is something already implemented? Right now I'm looking only for “local” filesystems and ftp or ssh; I don't need a database “filesystem” or any other exotic “filesystem” like CVS or so. Searching 20k MetaCPAN modules is painful without any tagging system or the like…
Perhaps File::System is what you're looking for. It provides the basic functionality found in common operating systems for managing a virtual file system (not necessarily comprised only of files and directories).
Most of the functionality is presented as methods of the File::System::Object package.
What about a FUSE implementation (Filesystem in Userspace)? I would guess there is at least one pseudo-filesystem implemented in Perl based on it. After all, it should be quite easy to implement: basically it's no more than a set of operations like mount, ls, df, stat and so on. I once went through the autofs sources in C and they looked pretty straightforward. You might want to look at http://code.google.com/p/mogilefs/ as well.
Don't get too hung up on the module approach. All you need is some utility that mounts an SSH/FTP filesystem as a local filesystem; then you can simply use standard commands like cd, mkdir and so on. The reason you don't see any modules for this is that this approach is generally preferred.
Look at http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FileSystems
You can simply use FUSE to mount any of those file systems and that is it. Here are some links to look at, though most of these are also available as packages in most distributions.
http://sourceforge.net/projects/lufs/
http://lftpfs.sourceforge.net
Here is a module for mounting FUSE file systems from within Perl:
http://search.cpan.org/~dpavlin/Fuse/Fuse.pm
There are a LOT of File::* modules which handle different parts of cross-platform filesystem management.
For example:
use File::Spec::Functions qw(catfile);
Will let you write my $filename = catfile $root, $path, "$filename.$ext"; or my $new_directory = catfile $path, "new_sub_directory"; and be sure to use the correct separators, e.g. / or \, et cetera.
Another thing you seem to want can be had with:
use File::Path qw(make_path);
which is pretty handy, and can be called like make_path($new_directory, { mode => 0755 });
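Putting those two together, a minimal sketch (the paths are hypothetical):
use strict;
use warnings;
use File::Spec::Functions qw(catfile);
use File::Path qw(make_path);

my $root = '/some/path';                      # hypothetical root
my $new_directory = catfile($root, 'tmp');    # portable path joining
make_path($new_directory, { mode => 0755 });  # like mkdir -p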
I'm not really sure if File::System actually handles remote systems the way you want.
A couple of different ways to handle that occur to me, but I think Net::SSH::Expect is what I've used in the past, and it isn't too bad. You'd probably have an easier time, though, if you could somehow mount the remote filesystem locally, do what you have to do, then unmount it.

How to check for activity or lack thereof on a Unix directory using Perl or Unix commands

Scenario:
I have a process where many files are being copied (scp'd) to a DestinationServer by, for example, Host1, Host2, Host3, and Host4, all going to the same common directory: DestinationServer:/home/target. All the files are unique, so no files will be overwritten. Host1 through Host4 each have a cron job that launches their scp script to the DestinationServer. The caveat is that the hosts are in different time zones and locations, so they will finish at different times.
Need:
Since the files are being scp'd to Destination:/home/target, what is the best way to programmatically check when those scp's from the other hosts are done?
Options:
My options are to do this programmatically, in either Perl or shell if possible.
What do I look for? What Unix commands or Perl modules could help determine when the processes have finished? Any ideas or examples would be great. Thanks.
Use a Maildir kind of approach: copy all files to a temporary directory, then after the transfer is complete have the originating host perform a rename into the target directory via ssh. That way when a file appears in the target directory, you know that it is complete.
I suggest this because if you just scp files into the target directory and monitor the directory in whatever way, you cannot distinguish a complete transfer from an interrupted scp command or a network failure.
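A minimal sketch of the sending side of that approach (the host, directories, and file name are hypothetical):
use strict;
use warnings;

my $file  = 'data_host1_001.dat';   # hypothetical file
my $host  = 'DestinationServer';
my $tmp   = '/home/target/tmp';
my $final = '/home/target';

# copy into a temporary directory on the destination first
system('scp', $file, "$host:$tmp/$file") == 0
    or die "scp failed: $?";

# mv within the same filesystem is atomic, so the file only ever
# appears in the target directory once it is complete
system('ssh', $host, 'mv', "$tmp/$file", "$final/$file") == 0
    or die "remote mv failed: $?";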
SGI::FAM or Sys::Gamin (interfaces to the File Alteration Monitor) can watch the directory for changes.
A similar but alternative approach to Jouni's is to use semaphore files: before scp-ing its files, the originating host puts up a semaphore file and removes it when finished. When no semaphore files remain, you know it's time.
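A minimal sketch of the watcher side, assuming a hypothetical ".sending" naming convention for the semaphore files:
use strict;
use warnings;

# transfers are finished once no host is holding a semaphore file
my @semaphores = glob '/home/target/*.sending';
if (@semaphores) {
    print "still transferring: @semaphores\n";
} else {
    print "all hosts are done\n";
}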

How can I tar multiple files in Perl?

How can I tar multiple directories, append files matching a pattern like '.txt', and exclude some directories and some patterns like '.exe', all into a single tar file? The main point is that the number of directories is unknown (dynamic), so I need to loop through them, I guess?
I'd use Archive::Tar and populate @filelist with Path::Class (specifically Path::Class::Dir's recurse method).
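A minimal sketch of that approach, reusing the '.txt' include and '.exe' exclude patterns from the question (the output file name is hypothetical):
use strict;
use warnings;
use Archive::Tar;
use Path::Class;

my @dirs = @ARGV;   # the dynamic list of directories
my @filelist;

for my $d (@dirs) {
    dir($d)->recurse(callback => sub {
        my $entry = shift;
        return if $entry->is_dir;
        my $name = $entry->stringify;
        return if $name =~ /\.exe$/i;                  # exclude pattern
        push @filelist, $name if $name =~ /\.txt$/i;   # include pattern
    });
}

my $tar = Archive::Tar->new;
$tar->add_files(@filelist);
$tar->write('combined.tar');   # hypothetical output name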
Assuming you have worked out what files you want using File::Find then something like
my @dir = qw/a b/;
system "tar -cvf mytar @dir";
might work. But you might find that the command line is too long.
In which case, maybe write the list of files to a file and use the option
--files-from=NAME
(and please don't tell me you are not allowed to write to files)
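A sketch of that --files-from variant (GNU tar is assumed; the file names are illustrative):
use strict;
use warnings;

my @filelist = @ARGV;   # stand-in for however you built the list

# write the (possibly very long) list of files to a file and let
# tar read it from there, sidestepping command-line length limits
open my $fh, '>', 'filelist.txt' or die "open: $!";
print {$fh} "$_\n" for @filelist;
close $fh;

system('tar', '-cvf', 'mytar.tar', '--files-from=filelist.txt') == 0
    or die "tar failed: $?";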
If for some reason you cannot, or are not permitted to, install additional modules beyond the base system, you could use File::Find instead of Path::Class.
It sounds like you already know how to call out to the system tar command so I'll leave it at that.

How can I resume downloads in Perl?

I have a project that depends on some binaries being downloaded from the web at install time. For this, what I do is:
if (file is present in src/)
    # skip that file
else
    # use wget to download the file
The problem with this approach is that when I interrupt a download in the middle and invoke the script the next time, the partially downloaded file is also skipped (which is not desired). I also want wget to resume the download of the partially downloaded file.
How should I go about it?
Possible solutions I could think of:
Let the file be downloaded to a temporary name, say download_tmp, and move it to the original file name if successful.
Handle $SIG{INT} to run proper cleanup code.
But neither of these helps resume a partially downloaded file.
Any insights?
First, I don't understand what this has to do with Perl, since you're using wget to do the downloading... You could use libwww-perl (perldoc LWP) and have more control over the download process.
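For example, here is a minimal resume sketch with libwww-perl; the URL and file names are hypothetical:
use strict;
use warnings;
use LWP::UserAgent;

my $url = 'http://example.com/file.bin';   # hypothetical URL
my $tmp = 'file.bin.download_tmp';

my $ua     = LWP::UserAgent->new;
my $offset = -f $tmp ? -s $tmp : 0;

open my $fh, '>>', $tmp or die "open $tmp: $!";
binmode $fh;

# ask the server for the remaining bytes only; servers that honour
# Range reply with 206 Partial Content (a robust version would also
# verify the code is 206 before appending to an existing partial file)
my $res = $ua->get(
    $url,
    ($offset ? (Range => "bytes=$offset-") : ()),
    ':content_cb' => sub { print {$fh} $_[0] },
);
close $fh;

if ($res->is_success) {
    rename $tmp, 'file.bin' or die "rename: $!";
}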
Then I second your idea of downloading to a "tmp" filename and move the file on success.
However, I think you need to go further and verify the integrity of the files. Computing an MD5 or SHA hash is very easy; match the hash of the downloaded file against what you're expecting. You can keep a short file on the server containing the checksum (filename.md5). Declare success only when you have a match.
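A minimal verification sketch using the core Digest::MD5 module (the file names and the digest-first .md5 line format are assumptions):
use strict;
use warnings;
use Digest::MD5;

my $file = 'file.bin';   # hypothetical downloaded file

open my $fh, '<', $file or die "open $file: $!";
binmode $fh;
my $got = Digest::MD5->new->addfile($fh)->hexdigest;
close $fh;

# file.bin.md5 is assumed to hold the expected hex digest
open my $mh, '<', "$file.md5" or die "open $file.md5: $!";
chomp(my $expected = <$mh>);
close $mh;

my ($want) = split ' ', $expected;
die "checksum mismatch\n" unless lc $got eq lc $want;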
Note that catching all the signals and generally trying to make the process unkillable, and then expecting it to have worked is bound to fail at one point or another. There could be a network timeout, a crash, power failure, configuration problem on the server ... you should instead assume downloads can fail, because they will, and code so that your process can recover.
Finally, you're not telling us what kind of binaries you're downloading and what you're doing with them. Since you use wget, I'm going to assume you're on Unix; you should consider using RPM+Yum or the like, as they handle all this for you. RPMs are easy to write, really.
Use your first approach:
download to "FileName".tmp
move "FileName".tmp to "FileName" (move! not copy)
once per day, clean out all .tmp files (paranoia rules)
You could just use wget's -N and -c options and remove the entire "if file exists" logic.

How can I copy a directory but ignore some files in Perl?

In my Perl code, I need to copy a directory from one location to another on the same host excluding some files/patterns (e.g. *.log, ./myDir/abc.cl).
What would be the optimum way of doing this in Perl across all the platforms?
On Windows, xcopy is one such solution. On Unix platforms, is there a way to do this in Perl?
I think you're looking for rsync. It's not Perl, but it's going to work a lot better than anything you make in Perl:
% rsync --exclude='*.log' --exclude='./myDir/abc.cl' SOURCE DEST
If you have a bunch of patterns, you can put those all in a file:
*.log
./myDir/abc.cl
Now ignore all the patterns in a file:
% rsync --exclude-from=do_not_sync.txt SOURCE DEST
I'd use File::Find and step over each file, but instead of calling File::Copy's copy() on every file unconditionally, first test whether it matches one of the exclude patterns, and skip it (next) if it does.
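A minimal sketch of that File::Find approach (the source and destination paths and the exclude pattern are hypothetical):
use strict;
use warnings;
use File::Find;
use File::Copy qw(copy);
use File::Path qw(make_path);
use File::Spec;

my ($src, $dst) = ('/path/to/src', '/path/to/dst');

find(sub {
    my $rel = File::Spec->abs2rel($File::Find::name, $src);
    return if $rel =~ /\.log$/;   # skip excluded patterns
    my $target = File::Spec->catfile($dst, $rel);
    if (-d $File::Find::name) {
        make_path($target);       # recreate the directory tree
    } else {
        copy($File::Find::name, $target)
            or warn "copy $rel failed: $!";
    }
}, $src);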
On *nix, you can use the native tar command with its --exclude option. After creating the tar file, you can bring it over to your destination and untar it.