I created a script which access a URL with basic authentication. Once I've passed the credentials, it will download the file in my local folder. The problem is I got an incorrect filename. Here's my sample code:
#!/usr/bin/env perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;
my $url = "http://sampleurl.com";
my $dir = 'C:\\pl';
my $mech = WWW::Mechanize->new();
$mech->cookie_jar(HTTP::Cookies->new());
$mech ->credentials("sampleurl.com:80", "sampleurl.com", "username", "password");
$mech->get($url);
my $res = $mech->res();
if($res->is_success){
my $filename = $res->filename();
print $filename;
$mech->save_content( $dir.'\\'.$filename, binmode => ':raw', decoded_by_headers => 1 );
print $mech->status;
}else{
print "Error";
}
exit 0;
Instead of downloading sample_url.DOC, it only downloaded sample with no file extension. can you help with my problem? I want to download the whole file.
There's no guarantee that $res->filename(); will produce a file extension or anything at all for that matter. The page you're currently reading doesn't have a filename extension for example.
You will have to guess a filename extension from the media type.
use MIME::Types qw(by_mediatype);
...
my $filename = $r->filename();
if(!$filename) { $filename = 'untitled'; }
if($filename !~ /\.[a-zA-Z0-9]{1,4}$/) {
my $type = $res->header('Content-Type');
my $ext = 'txt';
if($type) {
my #types = by_mediatype($type);
if($#types > -1) {
$ext = $types[0][0];
}
}
$filename .= '.' . $ext;
}
print $filename;
Related
I am trying to use perl getstore to get a list of image from URL after reading a text file containing the file names, I created the code and able to run it successfully but I do not know where is the file saved, i checked the disk size it and shows that every time i run the code the hard disk free space decrease, so i assume there are file saved but I can't find it. So where does perl getstore save file and what is the correct way to save image from a link ?
use strict;
use warnings;
use LWP::UserAgent;
use LWP::Simple;
my $url = "https://labs.jamesooi.com/images/";
my $ua = LWP::UserAgent->new;
$ua->agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36(KHTML, like Gecko) Chrome/59.0.3071.104 Safari/537.36");
my $file = 'image-list.txt';
open (DATA, $file) or die "Could not open $file: $!";
while(<DATA>){
my $link = "$url" . "$_";
my $filename = "$_";
print $link;
print $filename;
my $req = HTTP::Request->new(GET => $link);
my $res = $ua->request($req);
if($res->is_success){
my $rc = getstore($link, $filename);
if(is_success($rc)){
print "Success\n";
}else{
print "Error\n";
}
} else {
print $res->status_line, "\n";
}
}
According to
the documentation,
getstore(url, file) takes the URL as the first argument and the second argument is the file name where the result is stored. If the file name is a relative path (it doesn't begin with a slash /) it will be relative to the current working directory.
But you read the name from a file and then treat the full line, including the newline character, as the file name. That's probably not what you want, so you should use chomp to remove the newline.
Apart from that:
You are doing first a GET request using LWP::UserAgent to retrieve the file but ignore the response and instead call getstore to retrieve and store the same resource if the first GET was successful. It would be simpler to either just save the result from the first GET or just skip it and use only getstore.
You are using DATA as a file handle. While this is not wrong, DATA is already an implicit file handle which points to the program file after the __DATA__ marker, so I recommend to use a different file handle.
When using a simplified version of the code the file gets successfully stored:
use strict;
use warnings;
use LWP::Simple;
my $url = "https://labs.jamesooi.com/images/";
my $file = 'image-list.txt';
open (my $fh, '<', $file) or die "Could not open $file: $!";
while ( <$fh> ) {
chomp; # remove the newline from the end of the line
my $link = $url . $_;
my $filename = $_;
my $rc = getstore($link, $filename);
if (is_success($rc)) {
print "Success\n";
}
else {
print "Error\n";
}
}
I have a perl script that is supposed to use HTTP::Response to grab an XML file and then parse it. The script works great, except when it can't reach the server its trying to get the info from.
I know I have to have an error checking and if an error does exist, use a return to continue on with the loop, I have done with with NET::SNMP and SSH, but I can't seem to get it working for this scenario. I'm about to be pull my hair in frustration. Any help is greatly appreciated.
#! /usr/bin/perl
use LWP::UserAgent;
use HTTP::Request::Common;
use Net::SNMP;
use XML::Simple;
use HTTP::Status;
$ua = LWP::UserAgent->new;
if ( open ( FH, "DeviceList.txt" ) )
{
while ( defined ( my $line = <FH> ) )
{
$line =~ s/\s+$//;
$device = $line;
&checkcon;
}
close FH;
}
else
{
print "DeviceList.txt file not found\n";
}
#exit;
sub checkcon
{
my ($req, $error) = HTTP::Request->new(GET => 'https://'.$device.'/getxml?location=/HelloWorld');
$ua->ssl_opts(SSL_verify_mode => SSL_VERIFY_NONE);
$ua->timeout(10);
$req->authorization_basic('test', 'test');
$test = $ua->request($req)->content;
print $req;
if ($test =~ "parser error") {
print "No Workie\n";
return;
}
#if (!is_success ($req))
#{
#print "No Workie!";
#return;
#}
# create object
my $xml = new XML::Simple;
# read XML file
my $data = $xml->XMLin("$test");
print "Looping";
# access XML data
`echo "$device,$data->{Hello}{World}{content}" >> Test.txt`;
}
exit 0;
I'm really new to Perl Script and I'm developing a script to access a URL with NTLM authentication and save the response content in a folder. This content is a .xls, .doc, .pdf, .ppt, etc file. Actually,I was able to develop the NTLM authentication code. But my other requirement is to save the response content to a folder in the server. Can you help me with this?
#!/usr/bin/perl
use LWP::UserAgent;
use HTTP::Request::Common;
my $url = "http://myurl.com/AdsSDAF34141J";
my $ua = new LWP::UserAgent(keep_alive => 1);
my $username = 'ap\<username>';
my $password = '<password>';
$ua->credentials('myurl.com:80', '', $username, $password);
my $req = GET $url;
print "--Peforming request now...---------\n";
my $res = $ua->request($req);
print "--Done with request ...---------\n";
if ($res->is_success) {
print $res->content;
} else {
print "Error: " . $res->status_line . "\n";
}
exit 0;
I want to save the $res->content into a folder. Like I said this $res->content is a file of type .xls, .doc, .ppt, etc. Thanks in advance
Like this way:
if ($res->is_success) {
my $filename = 'file.xls';
open(my $fh,'>',$filename) or die $!;
binmode($fh);
print $fh $res->content;
close($fh);
} else {
or
### this way it automatically stored in that file
my $filename = 'file.xls';
$ua->request( $req, $filename );
I am trying to automate one of my task where i have to download a last 5 releases of some softwares let say Google talk from http://www.filehippo.com/download_google_talk/.
I have never done such type of programming i mean, to interact with Web through perl .I have just read and came to know that through CGI module we can implement this thing so i tried with this module.
If some body can give me better advice then please you are welcome :)
My code :
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use CGI::Carp qw/fatalsToBrowser/;
my $path_to_files = 'http://www.filehippo.com/download_google_talk/download/298ba15362f425c3ac48ffbda96a6156';
my $q = CGI->new;
my $file = $q->param('file') or error('Error: No file selected.');
print "$file\n";
if ($file =~ /^(\w+[\w.-]+\.\w+)$/) {
$file = $1;
}
else {
error('Error: Unexpected characters in filename.');
}
if ($file) {
download($file) or error('Error: an unknown error has occured. Try again.');
}
sub download
{
open(DLFILE, '<', "$path_to_files/$file") or return(0);
print $q->header(-type => 'application/x-download',
-attachment => $file,
'Content-length' => -s "$path_to_files/$file",
);
binmode DLFILE;
print while <DLFILE>;
close (DLFILE);
return(1);
}
sub error {
print $q->header(),
$q->start_html(-title=>'Error'),
$q->h1($_[0]),
$q->end_html;
exit(0);
}
In above code i am trying to print the file name which i wan to download but it is displaying error message.I am not able to figure it out why this error "Error: No file selected." is comming.
Sorry, but you are in the wrong track. Your best bet is this module: http://metacpan.org/pod/WWW::Mechanize
This page contain a lot of example to start with: http://metacpan.org/pod/WWW::Mechanize::Examples
It could be more elegant but I think this code easier to understand.
use strict;
use warnings;
my $path_to_files = 'http://www.filehippo.com/download_google_talk/download/298ba15362f425c3ac48ffbda96a6156';
my $mech = WWW::Mechanize->new();
$mech->get( $path_to_files );
$mech->save_content( "download_google_talk.html" );#save the base to see how it looks like
foreach my $link ( $mech->links() ){ #walk all links
print "link: $link\n";
if ($link =~ m!what_you_want!i){ #if it match
my $fname = $link;
$fname =~ s!\A.*/!! if $link =~ m!/!;
$fname .= ".zip"; #add extension
print "Download $link to $fname\n";
$mech->get($link,":content_file" => "$fname" );#download the file and stoore it in a fname.
}
}
Is there any perl module like File::Remote, that works over http (read only)? Something like
$magic_module->open( SCRAPE, "http://somesite.com/");
while(<SCRAPE>)
{
#do something
}
Yes, of course. You can use LWP::Simple:
use LWP::Simple;
my $content = get $url;
Don't forget to check if the content is not empty:
die "Can't download $url" unless defined $content;
$content will be undef it some error occurred during downloading.
Also you can use File::Fetch module:
File::Fetch
->new(uri => 'http://google.com/robots.txt')
->fetch(to => \(my $file));
say($file);
With HTTP::Tiny:
use HTTP::Tiny qw();
my $response = HTTP::Tiny->new->get('http://example.com/');
if ($response->{success}) {
print $response->{content};
}
If you want unified interface to handle both local, remote (HTTP/FTP) and whatever else files, use IO::All module.
use IO::All;
# reading local
my $handle = io("file.txt");
while(defined(my $line = $handle->getline)){
print $line
}
# reading remote
$handle = io("http://google.com");
while(defined(my $line = $handle->getline)){
print $line
}