Save a file downloaded by LWP in Perl

I'm using LWP to download an executable file, and with the response in memory I am able to hash it. However, how can I save this file on my system? I think I'm on the wrong track with what I'm trying below. The download is successful, as I am able to generate the hash correctly (I've double-checked it by downloading the actual file and comparing the hashes).
use strict;
use warnings;
use LWP::UserAgent;
use Digest::MD5 qw( md5_hex );
use Digest::MD5::File qw( file_md5_hex );
use File::Fetch;
my $url = 'http://www.karenware.com/progs/pthasher-setup.exe';
my $filename = $url;
$filename =~ m/.*\/(.*)$/;
$filename = $1;
my $dir ='/download/two';
print "$filename\n";
my $ua = LWP::UserAgent->new();
my $response = $ua->get($url);
die $response->status_line if !$response->is_success;
my $file = $response->decoded_content( charset => 'none' );
my $md5_hex = md5_hex($file);
print "$md5_hex\n";
my $save = "Downloaded/$filename";
unless(open SAVE, '>>'.$save) {
die "\nCannot create save file '$save'\n";
}
print SAVE $file;
close SAVE;
If you are wondering why I don't instead download everything first and then walk the folder hashing each file, it's because I'm downloading all these files in a loop. During each iteration I upload the relevant source URL (where the file was found), along with the file name and hash, into a database in one go.

Try getstore() from LWP::Simple
use strict;
use warnings;
use LWP::Simple qw(getstore);
use LWP::UserAgent;
use Digest::MD5 qw( md5_hex );
use Digest::MD5::File qw( file_md5_hex );
use File::Fetch;
my $url = 'http://www.karenware.com/progs/pthasher-setup.exe';
my $filename = $url;
$filename =~ m/.*\/(.*)$/;
$filename = $1;
my $dir ='/download/two';
print "$filename\n";
my $ua = LWP::UserAgent->new();
my $response = $ua->get($url);
die $response->status_line if !$response->is_success;
my $file = $response->decoded_content( charset => 'none' );
my $md5_hex = md5_hex($file);
print "$md5_hex\n";
my $save = "Downloaded/$filename";
getstore($url,$save);
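As a side note, getstore returns the HTTP status code rather than dying on failure, so it is worth checking the result (a small sketch; is_success is also exported by LWP::Simple):

use LWP::Simple qw( getstore is_success );

my $rc = getstore($url, $save);   # returns the HTTP status code
die "Download failed with status $rc\n" unless is_success($rc);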

getstore is an excellent solution; however, for anyone else reading this answer with a slightly different setup, it may not solve the issue.
First of all, you could quite possibly just be suffering from a binary/text issue.
I'd change
my $save = "Downloaded/$filename";
unless(open SAVE, '>>'.$save) {
die "\nCannot create save file '$save'\n";
}
print SAVE $file;
close SAVE;
to
my $save = "Downloaded/$filename";
open my $fh, '>>', $save or die "\nCannot create save file '$save' because $!\n";
# on platforms where this matters
# (like Windows) this is needed for
# 'binary' files:
binmode $fh;
print $fh $file;
close $fh;
The reason I like this better is that any settings you have set or acquired on your browser object ($ua) are ignored by LWP::Simple's getstore, which uses its own internal browser.
Also, this version uses the three-parameter form of open, which should be safer.
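As an aside, if you do want a single fetch-and-store call that still goes through your configured browser object, LWP::UserAgent's own mirror method can do it (a sketch using the question's $url and $save; note that mirror sends If-Modified-Since on repeat runs, so an unchanged file comes back as 304 Not Modified rather than being downloaded again):

my $response = $ua->mirror($url, $save);   # fetch with $ua's settings, store to $save
die $response->status_line
    unless $response->is_success or $response->code == 304;

The hash can then be computed from the saved file, as shown further down.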
Another solution would be to use the callback method and store the file while you are downloading it, if for example you are dealing with a large file. The hashing would then have to be done incrementally (see the sketch after the sample), so it is probably not directly relevant here, but here's the sample:
my $req = HTTP::Request->new(GET => $uri);
open(my $fh, '>', $filename) or die "Could not write to '$filename': $!";
binmode $fh;
my $res = $ua->request($req, sub {
my ($data, $response, $protocol) = @_;
print $fh $data;
});
close $fh;
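Since the question also needs an MD5, the callback variant can hash incrementally as the chunks arrive, using Digest::MD5's object interface (a sketch building on the sample above; Digest::MD5 is already loaded in the question's script):

use Digest::MD5;

my $md5 = Digest::MD5->new;
open my $fh, '>', $filename or die "Could not write to '$filename': $!";
binmode $fh;

my $res = $ua->request(HTTP::Request->new(GET => $uri), sub {
    my ($data, $response, $protocol) = @_;
    $md5->add($data);    # feed each chunk to the running digest
    print $fh $data;     # and write it to disk as it arrives
});
close $fh;

print $md5->hexdigest, "\n";   # same value md5_hex() gives for the whole content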
And if the size is unimportant (and the hashing is done some other way) you could just ask your browser to store it directly:
my $req = HTTP::Request->new(GET => $uri);
my $res = $ua->request($req, $filename);
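In that case the hash can be computed from the stored file afterwards. The question already loads Digest::MD5::File, whose file_md5_hex does exactly that (a sketch):

use Digest::MD5::File qw( file_md5_hex );

my $md5_hex = file_md5_hex($filename);   # hashes the file on disk
print "$md5_hex\n";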

Related

How to load image from form directly into ImageMagick with Perl

Currently I am using the following code to allow a user to upload an image via an HTML form. It creates a copy of the image, which I then read into ImageMagick. But of course it would be way better to simply read the data from the form straight into the Image::Magick object. I have not been able to achieve that.
use Image::Magick;
use MIME::Base64;
$arg = new CGI;
$fetch_photo = param('fileuploadphoto');
($data, $base64) = split /,/, $fetch_photo;
($type) = $data =~ m!data:image/(\w+);base64!;
$decoded = MIME::Base64::decode_base64($base64);
$filename = 'test.jpg';
open(my $file, '>', "$filename") or die "Error: cannot open file '$filename': $!";
binmode $file;
print $file $decoded;
close($file);
$image = Image::Magick->new;
$image->Read($filename);
Thanks to Håkon Hægland for the answer. This works perfectly.
use Image::Magick;
use MIME::Base64;
$fetch_photo = param('fileuploadphoto');
($data, $base64) = split /,/, $fetch_photo;
$decoded = MIME::Base64::decode_base64($base64);
$image = Image::Magick->new;
$image->Read(blob => $decoded);
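One small addition worth considering: PerlMagick methods return an error string on failure, so the Read call can be checked (a sketch using the error-checking idiom from the Image::Magick documentation):

my $err = $image->Read(blob => $decoded);
die "Image::Magick Read failed: $err" if "$err";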

Where does Perl LWP::Simple getstore save the image?

I am trying to use Perl's getstore to fetch a list of images from a URL after reading a text file containing the file names. I created the code and can run it successfully, but I do not know where the files are saved. I checked the disk usage, and it shows that every time I run the code the free space decreases, so I assume files are being saved, but I can't find them. So where does Perl's getstore save the file, and what is the correct way to save an image from a link?
use strict;
use warnings;
use LWP::UserAgent;
use LWP::Simple;
my $url = "https://labs.jamesooi.com/images/";
my $ua = LWP::UserAgent->new;
$ua->agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36(KHTML, like Gecko) Chrome/59.0.3071.104 Safari/537.36");
my $file = 'image-list.txt';
open (DATA, $file) or die "Could not open $file: $!";
while(<DATA>){
my $link = "$url" . "$_";
my $filename = "$_";
print $link;
print $filename;
my $req = HTTP::Request->new(GET => $link);
my $res = $ua->request($req);
if($res->is_success){
my $rc = getstore($link, $filename);
if(is_success($rc)){
print "Success\n";
}else{
print "Error\n";
}
} else {
print $res->status_line, "\n";
}
}
According to the documentation, getstore($url, $file) takes the URL as the first argument, and the second argument is the file name where the result is stored. If the file name is a relative path (it doesn't begin with a slash /), it is relative to the current working directory.
But you read the name from a file and then treat the full line, including the newline character, as the file name. That's probably not what you want, so you should use chomp to remove the newline.
Apart from that:
You are first doing a GET request using LWP::UserAgent to retrieve the file, but then you ignore the response and instead call getstore to retrieve and store the same resource again. It would be simpler to either save the result of the first GET directly or skip it and use only getstore.
You are using DATA as a file handle. While this is not wrong, DATA is already an implicit file handle that points to the program file after the __DATA__ marker, so I recommend using a different file handle.
When using a simplified version of the code the file gets successfully stored:
use strict;
use warnings;
use LWP::Simple;
my $url = "https://labs.jamesooi.com/images/";
my $file = 'image-list.txt';
open (my $fh, '<', $file) or die "Could not open $file: $!";
while ( <$fh> ) {
chomp; # remove the newline from the end of the line
my $link = $url . $_;
my $filename = $_;
my $rc = getstore($link, $filename);
if (is_success($rc)) {
print "Success\n";
}
else {
print "Error\n";
}
}
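If it is still unclear where relative file names end up, printing the current working directory shows the directory getstore resolves them against (a sketch using the core Cwd module):

use Cwd qw( getcwd );

print "Files are saved relative to: ", getcwd(), "\n";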

Downloading text files: Perl

The files are not downloading, please help.
#!/usr/bin/perl -w
require HTTP::Response;
require LWP::UserAgent;
open (INPUT, "ndb_id_file.txt") or die "can't open ndb_id_file.txt";
@input = <INPUT>;
foreach $line (@input) {
$ua = LWP::UserAgent->new;
$ua->env_proxy('http');
$ua->proxy(['http', 'ftp'],'http://144020019:*******@netmon.****.ac.in:80');
response =
$ua->get('www.ndbserver.rutgers.edu/files/ftp/NDB/coordinates/na-biol/$line');
if ($response->is_success) {
$content = $response->content();
open(OUT,">$line.pdb") or die "Output file $line cannot be produced... Error...";
print (OUT "$content");
}
}
There are a number of problems with your program. The major ones being in this line
response = $ua->get('www.ndbserver.rutgers.edu/files/ftp/NDB/coordinates/na-biol/$line');
You are trying to assign to response, which is not a variable name
the value of $line is not being inserted into the URL because you are using single quotes
The contents of $line end with a linefeed, which should be removed using chomp
The URL has no scheme — it should start with http://
Apart from those points, you should fix these issues
You must always use strict and use warnings at the top of every Perl program you write. Adding -w on the shebang line is far inferior
You should use LWP::UserAgent rather than require it. And there is no need to also load HTTP::Response, as it is loaded as part of LWP
You should always use the three-parameter form of open with lexical file handles. And if the open fails you should print a die string that includes the value of $! which gives the reason for the failure
You should use while to read data from a file one line at a time, unless you have a good reason to need all of it in memory at once
There is no need to create a new user agent $ua every time around the loop. Just make one and use it to fetch every URL
You should use decoded_content instead of content to fetch the content of an HTTP::Response message in case it is compressed
Here is a program that includes all of those fixes. I haven't been able to test it but it does compile
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
my $in_file = 'ndb_id_file.txt';
open my $fh, '<', $in_file or die qq{Unable to open "$in_file" for input: $!};
my $ua = LWP::UserAgent->new;
$ua->env_proxy('http');
$ua->proxy(['http', 'ftp'], 'http://144020019:*******@netmon.****.ac.in:80');
while ( my $line = <$fh> ) {
chomp $line;
my $url = "http://www.ndbserver.rutgers.edu/files/ftp/NDB/coordinates/na-biol/$line";
my $response = $ua->get($url);
unless ( $response->is_success ) {
warn $response->status_line;
next;
}
my $content = $response->decoded_content;
my $out_file = "$line.pdb";
open my $out_fh, '>', $out_file or die qq{Unable to open "$out_file" for output: $!};
print $out_fh $content;
close $out_fh or die qq{Unable to close "$out_file": $!};
}

Decode a base64-encoded, gzipped text file and read it

I retrieve from a system a text file which has been (in order):
gzipped
encoded with base64
So, in Perl, I would like to decode it, unzip it, and read it without going through intermediate files.
I tried the following:
use Compress::Zlib;
use MIME::Base64;
my $workingDir = "./log/";
my $inputFile = $workingDir . "log_result_base64.txt";
my $readtmp ='';
open (INPFIC, $inputFile) or die "ERROR: Impossible to open file ($inputFile)\n";
while (my $buf = <INPFIC> ) {
$readtmp .= decode_base64($buf);
}
close(INPFIC);
my $output = uncompress($readtmp);
print $output;
But it does not work, the $output variable is still undef.
[Edit]
I gave up on doing it purely with variables. I changed my script to create a new file at each stage:
#!/usr/bin/perl
use strict ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;
use MIME::Base64;
my $workingDir = "./log/";   # restated from the first script so this compiles under strict
my $inputFile = $workingDir . "log_inbase64.txt";
my $inputFilegz = $workingDir . "log.txt.gz";
my $inputFileuncomp = $workingDir . "log.txt";
my @out;
my @readtmp;
my $readtmp;
# Reading the file encoded in base64
open (INPFIC, $inputFile) or die "ERROR: Impossible to open file ($inputFile)\n";
my @readtmp = <INPFIC>;
close(INPFIC);
$readtmp = join('', @readtmp);
# Decode in base64 to retreive a Gzip file
my $out = decode_base64($readtmp);
open my $fh, '>', $inputFilegz or die $!;
binmode $fh;
print $fh $out;
close $fh;
# Decompress the early created gzip file
gunzip $inputFilegz => $inputFileuncomp
or die "gunzip failed: $GunzipError\n";
# Reading the Text file
open (INPFIC, $inputFileuncomp) or die "ERROR: Impossible to open file ($inputFileuncomp )\n";
my @out = <INPFIC>;
close(INPFIC);
The uncompress method does not work for gzipped data.
IO::Uncompress::Gunzip can use scalar references instead of file names if you want to keep everything in memory.
Sample code:
use IO::Uncompress::Gunzip qw( gunzip $GunzipError );
use MIME::Base64 qw( decode_base64 );
my $tmp = decode_base64 do {
local $/;
<DATA>
};
gunzip \$tmp => \my $data or die "Could not gunzip: $GunzipError";
print $data;
__DATA__
H4sIAHWHLlUAAwvJyCxWAKLi/NxUhZLU4hKFlMSSRC4AsSDaaxcAAAA=
Should produce:
This is some test data
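To apply the same approach to the log file from the question rather than the __DATA__ section, one could slurp and decode the file first (a sketch; the path is the question's, the rest is standard MIME::Base64 and IO::Uncompress::Gunzip usage):

use IO::Uncompress::Gunzip qw( gunzip $GunzipError );
use MIME::Base64 qw( decode_base64 );

my $inputFile = './log/log_result_base64.txt';
open my $in, '<', $inputFile or die "Cannot open '$inputFile': $!";
my $encoded = do { local $/; <$in> };   # slurp the whole file
close $in;

my $gz = decode_base64($encoded);       # base64 text -> gzip bytes
gunzip \$gz => \my $text                # gzip bytes -> plain text, in memory
    or die "gunzip failed: $GunzipError\n";
print $text;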
I'd put the whole file in a string before decoding:
local $/ = undef;
my $str = <INPFIC>;
my $dec = decode_base64 $str;
my $uncom = uncompress($dec);
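Note that uncompress expects a raw zlib stream and fails on gzip data, which carries its own header; that is why $output stayed undef in the question. For gzip data held in memory, Compress::Zlib's memGunzip is the matching call (a sketch, assuming $dec holds the decoded bytes as above):

use Compress::Zlib qw( memGunzip );

my $uncom = memGunzip($dec)   # understands the gzip container, unlike uncompress
    or die "memGunzip failed: input is not valid gzip data\n";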
According to the Compress::Zlib documentation, try opening and reading at the same time:
my $workingDir = "./log/";
my $inputFile = $workingDir . "log_result_base64.txt";
my $buffer;
my $output;
my $gz = gzopen($inputFile,"rb")
or die "Cannot open $inputFile: $gzerrno\n" ;
while ( $gz->gzread($buffer) > 0 ){
$output .= decode_base64 $buffer;
}
die "Error reading from $inputFile: $gzerrno" . ($gzerrno+0) . "\n"
if $gzerrno != Z_STREAM_END ;
$gz->gzclose();
print $output;

Why is my image download CGI script written in Perl not working?

#!/usr/bin/perl
use CGI ':standard';
use CGI::Carp qw(fatalsToBrowser);
my $files_location;
my $ID;
my @fileholder;
$files_location = "C:\Users\user\Documents\hello\icon.png";
open(DLFILE, "<$files_location") ;
@fileholder = <DLFILE>;
close (DLFILE) ;
print "Content-Type:application/x-download\n";
print "Content-Disposition:attachment;filename=$ID\n\n";
print @fileholder;
When I run this script, instead of returning the icon.png file, it returns download.pl (the name of the script above) with no content inside it. What is the issue?
The script I am currently using:
#!C:\Perl64\bin\perl.exe -w
use CGI qw(:standard);
use File::Copy qw( copy );
use File::Spec::Functions qw( catfile );
use constant IMG_DIR => catfile(qw( D:\ ));
serve_logo(IMG_DIR);
sub serve_logo {
my ($dir) = @_;
my $cgi = CGI->new;
my $file = "icon.png";
#print $file;
defined ($file) or die "Invalid image name in CGI request\n";
send_file($cgi, $dir, $file);
return;
}
sub send_file
{
my ($cgi, $dir, $file) = @_;
my $path = catfile($dir, $file);
open my $fh, '<:raw', $path or die "Cannot open '$path': $!";
print $cgi->header( -type => 'application/octet-stream', -attachment => $file, );
binmode STDOUT, ':raw';
copy $fh => \*STDOUT, 8_192;
close $fh or die "Cannot close '$path': $!";
return;
}
There are quite a few issues. The first one is the fact that you are using @fileholder = <DLFILE>; to slurp a binary file. On Windows, automatic conversion of line endings will wreak havoc on the contents of that file.
Other issues are:
You are not checking the return value of open. We don't even know if open succeeded.
You never assign a value to $ID, meaning you're sending "filename=\n\n" in your response.
You are slurping a binary file, making the memory footprint of your program proportional to the size of the binary file. Robust programs don't do that.
You're loading CGI.pm, but you are neither actually using it nor have you read the docs.
You're using a bareword (i.e. package global) filehandle.
The fundamental reason, however, is that open fails. Why does open fail? Simple:
C:\temp> cat uu.pl
#!/usr/bin/env perl
use strict; use warnings;
my $files_location = "C:\Users\user\Documents\hello\icon.png";
print "$files_location\n";
Let's try running that, shall we?
C:\temp> uu
Unrecognized escape \D passed through at C:\temp\uu.pl line 5.
Unrecognized escape \h passed through at C:\temp\uu.pl line 5.
Unrecognized escape \i passed through at C:\temp\uu.pl line 5.
C:SERSSERDOCUMENTSHELLOICON.PNG
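The immediate fix is to stop letting double quotes interpolate the backslashes: use single quotes, or forward slashes, which Perl on Windows accepts in file paths (a sketch; both spellings name the same file):

# Single quotes leave the backslashes alone:
my $files_location = 'C:\Users\user\Documents\hello\icon.png';

# ...or, equivalently, use forward slashes:
# my $files_location = "C:/Users/user/Documents/hello/icon.png";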
Here is a short script illustrating a better way:
use CGI qw(:standard);
use File::Copy qw( copy );
use File::Spec::Functions qw( catfile );
use constant IMG_DIR => catfile(qw(
E:\ srv localhost images
));
serve_logo(IMG_DIR);
sub serve_logo {
my ($dir) = @_;
my %mapping = (
'big' => 'logo-1600x1200px.png',
'medium' => 'logo-800x600.png',
'small' => 'logo-400x300.png',
'thumb' => 'logo-200x150.jpg',
'icon' => 'logo-32x32.gif',
);
my $cgi = CGI->new;
my $file = $mapping{ $cgi->param('which') };
defined ($file)
or die "Invalid image name in CGI request\n";
send_file($cgi, $dir, $file);
return;
}
sub send_file {
my ($cgi, $dir, $file) = @_;
my $path = catfile($dir, $file);
open my $fh, '<:raw', $path
or die "Cannot open '$path': $!";
print $cgi->header(
-type => 'application/octet-stream',
-attachment => $file,
);
binmode STDOUT, ':raw';
copy $fh => \*STDOUT, 8_192;
close $fh
or die "Cannot close '$path': $!";
return;
}
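With this version, a request such as download.pl?which=icon serves logo-32x32.gif from IMG_DIR as an attachment, and an unrecognized which parameter dies instead of exposing an arbitrary file.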
I also posted a detailed explanation on my blog.
It took me a while to figure out what was wrong, so for those who end up here (as I did) with random issues serving large files, here's my advice:
Avoid File::Copy, as it's bugged for this purpose.
When serving data through CGI, syswrite can return undef ($! being 'Resource temporarily unavailable') for some time.
File::Copy stops in that case (returns 0, sets $!), failing to transfer the entire file (or stream).
There are many different options to work around that: retrying the syswrite, or using blocking sockets; not sure which one is best, though! A sketch of the retry approach follows.
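A minimal sketch, assuming $fh is open on the file to serve as in the script above; the 8 KiB chunk size and 10 ms retry pause are arbitrary choices, not anything mandated by CGI:

use POSIX qw( EAGAIN );

# Stream $fh to STDOUT in 8 KiB chunks, retrying the write when
# syswrite reports "Resource temporarily unavailable" (EAGAIN)
# instead of giving up the way File::Copy does.
while (1) {
    my $read = sysread $fh, my $buf, 8_192;
    die "read failed: $!" unless defined $read;
    last if $read == 0;              # end of file
    my $off = 0;
    while ($off < $read) {
        my $wrote = syswrite STDOUT, $buf, $read - $off, $off;
        if (defined $wrote) {
            $off += $wrote;
        }
        elsif ($! == EAGAIN) {
            select undef, undef, undef, 0.01;   # brief pause, then retry
        }
        else {
            die "write failed: $!";
        }
    }
}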