Downloading text files: Perl

The files are not downloading, please help.
#!/usr/bin/perl -w
require HTTP::Response;
require LWP::UserAgent;
open (INPUT, "ndb_id_file.txt") or die "can't open ndb_id_file.txt";
@input = <INPUT>;
foreach $line (@input) {
$ua = LWP::UserAgent->new;
$ua->env_proxy('http');
$ua->proxy(['http', 'ftp'],'http://144020019:*******@netmon.****.ac.in:80');
response =
$ua->get('www.ndbserver.rutgers.edu/files/ftp/NDB/coordinates/na-biol/$line');
if ($response->is_success) {
$content = $response->content();
open(OUT,">$line.pdb") or die "Output file $line cannot be produced... Error...";
print (OUT "$content");
}
}

There are a number of problems with your program. The major ones are in this line:
response = $ua->get('www.ndbserver.rutgers.edu/files/ftp/NDB/coordinates/na-biol/$line');
You are trying to assign to response, which is not a variable name.
The value of $line is not being interpolated into the URL because you are using single quotes.
The contents of $line end with a linefeed, which should be removed using chomp.
The URL has no scheme; it should start with http://.
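Putting just those fixes together, that statement would become something like this (the surrounding loop is unchanged):
chomp $line;
my $response = $ua->get("http://www.ndbserver.rutgers.edu/files/ftp/NDB/coordinates/na-biol/$line");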
Apart from those points, you should also fix these issues:
You must always have use strict and use warnings at the top of every Perl program you write. Adding -w to the shebang line is far inferior.
You should use LWP::UserAgent rather than require it, and there is no need to also load HTTP::Response as it comes in as part of LWP.
You should always use the three-parameter form of open with lexical file handles, and if the open fails you should die with a string that includes the value of $!, which gives the reason for the failure.
You should use while to read data from a file one line at a time, unless you have a good reason to need all of it in memory at once.
There is no need to create a new user agent $ua every time around the loop. Just make one and use it to fetch every URL.
You should use decoded_content instead of content to fetch the content of an HTTP::Response message, in case it is compressed.
Here is a program that includes all of those fixes. I haven't been able to test it, but it does compile.
#!/usr/bin/perl
use strict;
use warnings;

use LWP::UserAgent;

my $in_file = 'ndb_id_file.txt';
open my $fh, '<', $in_file or die qq{Unable to open "$in_file" for input: $!};

my $ua = LWP::UserAgent->new;
$ua->env_proxy('http');
$ua->proxy(['http', 'ftp'], 'http://144020019:*******@netmon.****.ac.in:80');

while ( my $line = <$fh> ) {
    chomp $line;
    my $url = "http://www.ndbserver.rutgers.edu/files/ftp/NDB/coordinates/na-biol/$line";
    my $response = $ua->get($url);
    unless ( $response->is_success ) {
        warn $response->status_line;
        next;
    }
    my $content = $response->decoded_content;
    my $out_file = "$line.pdb";
    open my $out_fh, '>', $out_file or die qq{Unable to open "$out_file" for output: $!};
    print $out_fh $content;
    close $out_fh or die qq{Unable to close "$out_file": $!};
}

Related

Can't use an undefined value as a symbol reference on line 12

So I am trying to teach myself perl as a new language. I find the best way to learn a new language is to set myself a project. This project is a text game. I have just started this evening. I need to take input from the user and then write it to a file. As I am going to be doing this over and over again, I thought it would be best to put the code in subroutines, as you can see below.
The only problem is that I keep getting the following error:
Can't use an undefined value as a symbol reference at book1.pl line 12, <> line 2.
Any help would be greatly appreciated :)
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
my $filename = 'save.txt';
sub open_save{
open(my $fh, '>', $filename) or die "Could not open file '$filename' $!";
}
sub close_save{
close my $fh;
}
print "Welcome to the 40K universe\nWhat is your first name?";
my $first_name = <>;
print"What is your surname?";
my $surname = <>;
my $name = $first_name . $surname;
open_save();
print "$name";
close_save();
my creates and returns a new variable. You pass this new variable to close, which quite legitimately complains that it's not a file handle.
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
sub open_save {
    my ($filename) = @_;
    open(my $fh, '>', $filename)
        or die "Can't open file '$filename': $!\n";
    return $fh;
}

sub close_save {
    my ($fh) = @_;
    close $fh;
}

{
    my $filename = 'save.txt';
    ...
    my $fh = open_save($filename);
    print $fh "$name\n";
    close_save($fh);
}

How to replace ^M character in perl code in same file

I am looking to delete the control-M characters from a file in Perl code (not a Perl one-liner).
I tried this, but it writes to a new file. What's the way to do it in the same file?
#!/usr/bin/perl
open (IN, '<', "FILE.dat") or die "$!";
open (OUT, '>', "FILE.dat.cpy") or die "$!";
while(<IN>)
{
$line = $_;
$line=~ tr/\015//d;
print OUT "$line";
}
close (IN);
close (OUT);
Store the file's contents in a string first, then write it back to the same file.
#!/usr/bin/perl
my $content = '';
open (IN, '<', "FILE.dat") or die "$!";
while(<IN>)
{
    $line = $_;
    $line =~ tr/\015//d;
    $content .= $line;
}
close (IN);
open (OUT, '>', "FILE.dat") or die "$!";
print OUT $content;
close (OUT);
It could be done using $^I variable.
use autodie;

local $^I = "";
local @ARGV = "FILE.dat";

while (my $line = <>) {
    $line =~ tr/\r//d;
    print $line;
}
From perlvar
$^I - The current value of the inplace-edit extension. Use undef to disable inplace editing.
So it is undef by default, an empty string is used to edit in place, and a non-empty string will be added as a suffix to the backup file name.
You needn't be afraid of copying the edited content to a new file and then renaming it. This is the standard way to edit content unless you're dealing with a file with fixed-width records.
Check out How do I change, delete, or insert a line in a file, or append to the beginning of a file? to see most of the ways to edit content; the majority of them ultimately copy to a new file.
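For completeness, here is a minimal, untested sketch of that copy-and-rename approach; the temporary file name FILE.dat.new is just an assumption for illustration:
use strict;
use warnings;
use autodie;    # open, close and rename will die with a message on failure

my $file = 'FILE.dat';
my $temp = "$file.new";    # hypothetical temporary name

open my $in,  '<', $file;
open my $out, '>', $temp;

while (my $line = <$in>) {
    $line =~ tr/\r//d;     # delete the ^M (carriage return) characters
    print {$out} $line;
}

close $in;
close $out;

rename $temp, $file;       # replace the original with the cleaned copy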
My preferred advice, though, is to use the in-place edit as demonstrated by mpapec. The only addition is that Windows forces you to specify a backup, so you just need to add the unlink line after the processing.
use strict;
use warnings;
use autodie;

my $file = 'FILE.dat';

local @ARGV = $file;
local $^I = '.bak';

while (<>) {
    tr/\r//d;
    print;
}

unlink "$file$^I";
If you insist on being able to open the file only once, then perhaps you can take a look at I still don't get locking. I just want to increment the number in the file. How can I do this?
use strict;
use warnings;
use autodie;

use Fcntl qw(:seek);

my $file = 'afile.dat';

open my $fh, '+<:raw', $file;
my $data = do { local $/; <$fh> };

seek $fh, 0, SEEK_SET;
truncate $fh, 0;

$data =~ tr/\r//d;
print $fh $data;

close $fh;

Writing results in a text file with perl

I have a problem where the script prints the whole line of the text file into the result text file:
use strict;
use warnings;
use autodie;
my $out = "result2.txt";
open my $outFile, ">$out" or die $!;
my %permitted = do {
open my $fh, '<', 'f1.txt';
map { /(.+?)\s+\(/, 1 } <$fh>;
};
open my $fh, '<', 'f2.txt';
while (<$fh>) {
my ($phrase) = /(.+?)\s+->/;
if ($permitted{$phrase}) {
print $outFile $fh;
}
}
close $outFile;
The problem is in this line
print $outFile $fh;
Any idea please?
Thank you
print $outFile $fh is printing the value of the file handle $fh to the file handle $outFile. Instead you want to print the entire current line, which is in $_.
There are a couple of other improvements that can be made
You should always use the three-parameter form of open, so the open mode appears on its own as the second parameter.
There is no need to test the success of an open if autodie is in place.
If you have a variable that contains the name of the output file, then you really should have ones for the names of the two input files as well.
This is how your program should look. I hope it helps.
use strict;
use warnings;
use autodie;
my ($in1, $in2, $out) = qw/ f1.txt f2.txt result2.txt /;
my %permitted = do {
    open my $fh, '<', $in1;
    map { /(.+?)\s+\(/, 1 } <$fh>;
};

open my $fh, '<', $in2;
open my $outfh, '>', $out;

while (<$fh>) {
    my ($phrase) = /(.+?)\s+->/;
    if ($permitted{$phrase}) {
        print $outfh $_;
    }
}

close $outfh;
I think you want print $outFile $phrase here, don't you? The line you currently have is trying to print a file handle reference ($fh) to a file ($outFile).
Also, just as part of perl best practices, you'll want to use the three argument open for your first open line:
open my $outFile, ">", $out or die $!;
(FWIW, you're already using 3-arg open for your other two calls to open.)
Although Borodin has provided an excellent solution to your question, here's another option where you pass your 'in' files' names to the script on the command line, and let Perl handle the opening and closing of those files:
use strict;
use warnings;

my $file2 = pop;

my %permitted = map { /(.+?)\s+\(/, 1 } <>;

push @ARGV, $file2;

while (<>) {
    my ($phrase) = /(.+?)\s+->/;
    print if $permitted{$phrase};
}
Usage: perl script.pl inFile1 inFile2 [>outFile]
The last, optional parameter directs output to a file.
The pop command implicitly removes inFile2's name off of @ARGV, and stores it in $file2. Then, inFile1 is read using the <> directive. The file name of inFile2 is then pushed onto @ARGV, and that file is read and a line is printed if $permitted{$phrase} is true.
Running the script without the last, optional parameter will print results (if any) to the screen. Using the last parameter saves output to a file.
Hope this helps!

perl save a file downloaded by lwp

I'm using LWP to download an executable file type, and with the response in memory I am able to hash the file. However, how can I save this file on my system? I think I'm on the wrong track with what I'm trying below. The download is successful, as I am able to generate the hash correctly (I've double-checked it by downloading the actual file and comparing the hashes).
use strict;
use warnings;
use LWP::UserAgent;
use Digest::MD5 qw( md5_hex );
use Digest::MD5::File qw( file_md5_hex );
use File::Fetch;
my $url = 'http://www.karenware.com/progs/pthasher-setup.exe';
my $filename = $url;
$filename =~ m/.*\/(.*)$/;
$filename = $1;
my $dir ='/download/two';
print "$filename\n";
my $ua = LWP::UserAgent->new();
my $response = $ua->get($url);
die $response->status_line if !$response->is_success;
my $file = $response->decoded_content( charset => 'none' );
my $md5_hex = md5_hex($file);
print "$md5_hex\n";
my $save = "Downloaded/$filename";
unless(open SAVE, '>>'.$save) {
die "\nCannot create save file '$save'\n";
}
print SAVE $file;
close SAVE;
If you are wondering why I do not instead download everything and then parse the folder for each file and hash it, it's because I'm downloading all these files in a loop. During each loop, I upload the relevant source URL (where the file was found), along with the file name and hash, into a database in one go.
Try getstore() from LWP::Simple
use strict;
use warnings;
use LWP::Simple qw(getstore);
use LWP::UserAgent;
use Digest::MD5 qw( md5_hex );
use Digest::MD5::File qw( file_md5_hex );
use File::Fetch;
my $url = 'http://www.karenware.com/progs/pthasher-setup.exe';
my $filename = $url;
$filename =~ m/.*\/(.*)$/;
$filename = $1;
my $dir ='/download/two';
print "$filename\n";
my $ua = LWP::UserAgent->new();
my $response = $ua->get($url);
die $response->status_line if !$response->is_success;
my $file = $response->decoded_content( charset => 'none' );
my $md5_hex = md5_hex($file);
print "$md5_hex\n";
my $save = "Downloaded/$filename";
getstore($url,$save);
getstore is an excellent solution; however, for anyone else reading this answer in a slightly different setup, it may not solve the issue.
First of all, you could quite possibly just be suffering from a binary/text issue.
I'd change
my $save = "Downloaded/$filename";
unless(open SAVE, '>>'.$save) {
die "\nCannot create save file '$save'\n";
}
print SAVE $file;
close SAVE;
to
my $save = "Downloaded/$filename";
open my $fh, '>>', $save or die "\nCannot create save file '$save' because $!\n";
# on platforms where this matters
# (like Windows) this is needed for
# 'binary' files:
binmode $fh;
print $fh $file;
close $fh;
The reason I like this better is that if you have set or acquired some settings on your browser object ($ua), they are ignored in LWP::Simple's getstore, as it uses its own browser.
Also, it uses the three parameter version of open which should be safer.
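If you would rather keep your configured browser object $ua and still let LWP write the file to disk for you, LWP::UserAgent also provides a mirror method. A minimal sketch, reusing the $url and $save variables from above (note that mirror skips the download when the local copy is already up to date):
# mirror() saves the response body straight to disk and returns an
# HTTP::Response, so the usual error handling still applies.
my $response = $ua->mirror($url, $save);
die $response->status_line if $response->is_error;

# The hash can then be taken from the saved file, using the already-imported
# file_md5_hex from Digest::MD5::File.
my $md5_hex = file_md5_hex($save);
print "$md5_hex\n";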
Another solution would be to use the callback method and store the file while you are downloading it, if for example you are dealing with a large file. The hashing algorithm would have to be changed, so it is probably not relevant here, but here's a sample:
my $req = HTTP::Request->new(GET => $uri);

open(my $fh, '>', $filename) or die "Could not write to '$filename': $!";
binmode $fh;

$res = $ua->request($req, sub {
    my ($data, $response, $protocol) = @_;
    print $fh $data;
});
close $fh;
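Since the original goal was also to hash the download, note that Digest::MD5 supports incremental hashing, so the callback approach can hash while it streams. A rough sketch under the same assumptions ($ua, $uri and $filename already set up as above):
use Digest::MD5;

my $md5 = Digest::MD5->new;

my $req = HTTP::Request->new(GET => $uri);
open my $fh, '>', $filename or die "Could not write to '$filename': $!";
binmode $fh;

my $res = $ua->request($req, sub {
    my ($data, $response, $protocol) = @_;
    $md5->add($data);     # feed each chunk to the digest as it arrives
    print $fh $data;      # and write it to disk at the same time
});

close $fh;
die $res->status_line if !$res->is_success;

print $md5->hexdigest, "\n";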
And if the size is unimportant (and the hashing is done some other way) you could just ask your browser to store it directly:
my $req = HTTP::Request->new(GET => $uri);
$res = $ua->request($req, $filename);

Why is my image download CGI script written in Perl not working?

#!/usr/bin/perl
use CGI ':standard';
use CGI::Carp qw(fatalsToBrowser);
my $files_location;
my $ID;
my @fileholder;
$files_location = "C:\Users\user\Documents\hello\icon.png";
open(DLFILE, "<$files_location") ;
@fileholder = <DLFILE>;
close (DLFILE) ;
print "Content-Type:application/x-download\n";
print "Content-Disposition:attachment;filename=$ID\n\n";
print @fileholder;
When I run this script, instead of returning the icon.png file it returns the download.pl (the name of the script given above) with no content inside it. What is the issue?
The script I am using currently:
#!C:\Perl64\bin\perl.exe -w
use CGI qw(:standard);
use File::Copy qw( copy );
use File::Spec::Functions qw( catfile );
use constant IMG_DIR => catfile(qw( D:\ ));
serve_logo(IMG_DIR);
sub serve_logo {
my ($dir) = @_;
my $cgi = CGI->new;
my $file = "icon.png";
#print $file;
defined ($file) or die "Invalid image name in CGI request\n";
send_file($cgi, $dir, $file);
return;
}
sub send_file
{
my ($cgi, $dir, $file) = @_;
my $path = catfile($dir, $file);
open my $fh, '<:raw', $path or die "Cannot open '$path': $!";
print $cgi->header( -type => 'application/octet-stream', -attachment => $file, );
binmode STDOUT, ':raw';
copy $fh => \*STDOUT, 8_192;
close $fh or die "Cannot close '$path': $!";
return;
}
There are quite a few issues. The first one is the fact that you are using @fileholder = <DLFILE>; to slurp a binary file. On Windows, automatic conversion of line endings will wreak havoc on the contents of that file.
Other issues are:
You are not checking the return value of open. We don't even know if open succeeded.
You never assign a value to $ID, meaning you're sending "filename=\n\n" in your response.
You are slurping a binary file, making the memory footprint of your program proportional to the size of the binary file. Robust programs don't do that.
You're loading CGI.pm with use, but you are neither actually using it nor have you read its docs.
You're using a bareword (i.e. package global) filehandle.
The fundamental reason, however, is that open fails. Why does open fail? Simple:
C:\temp> cat uu.pl
#!/usr/bin/env perl
use strict; use warnings;
my $files_location = "C:\Users\user\Documents\hello\icon.png";
print "$files_location\n";
Let's try running that, shall we?
C:\temp> uu
Unrecognized escape \D passed through at C:\temp\uu.pl line 5.
Unrecognized escape \h passed through at C:\temp\uu.pl line 5.
Unrecognized escape \i passed through at C:\temp\uu.pl line 5.
C:SERSSERDOCUMENTSHELLOICON.PNG
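The immediate fix for that line (my suggestion, not spelled out in the original answer) is to stop the backslashes in the double-quoted string from being treated as escapes, for example:
# Either single quotes with forward slashes (Windows accepts them):
my $files_location = 'C:/Users/user/Documents/hello/icon.png';
# ... or a double-quoted string with escaped backslashes:
# my $files_location = "C:\\Users\\user\\Documents\\hello\\icon.png";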
Here is a short script illustrating a better way:
use CGI qw(:standard);
use File::Copy qw( copy );
use File::Spec::Functions qw( catfile );

use constant IMG_DIR => catfile(qw(
    E:\ srv localhost images
));

serve_logo(IMG_DIR);

sub serve_logo {
    my ($dir) = @_;

    my %mapping = (
        'big'    => 'logo-1600x1200px.png',
        'medium' => 'logo-800x600.png',
        'small'  => 'logo-400x300.png',
        'thumb'  => 'logo-200x150.jpg',
        'icon'   => 'logo-32x32.gif',
    );

    my $cgi = CGI->new;

    my $file = $mapping{ $cgi->param('which') };
    defined ($file)
        or die "Invalid image name in CGI request\n";

    send_file($cgi, $dir, $file);
    return;
}

sub send_file {
    my ($cgi, $dir, $file) = @_;

    my $path = catfile($dir, $file);

    open my $fh, '<:raw', $path
        or die "Cannot open '$path': $!";

    print $cgi->header(
        -type       => 'application/octet-stream',
        -attachment => $file,
    );

    binmode STDOUT, ':raw';
    copy $fh => \*STDOUT, 8_192;

    close $fh
        or die "Cannot close '$path': $!";

    return;
}
I also posted a detailed explanation on my blog.
It took me a while to figure out what was wrong, so for those that end up here (as I did) with random issues serving large files, here's my advice:
Avoid File::Copy, as it's bugged for this purpose.
When serving data through CGI, syswrite can return undef ($! being 'Resource temporarily unavailable') for some time.
File::Copy stops in that case (returns 0, sets $!), failing to transfer the entire file (or stream).
There are many different options to work around that: retrying the syswrite, or using blocking sockets; I'm not sure which one is the best, though! A sketch of the retry approach is shown below.
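To illustrate the retry idea, here is a rough, untested sketch of a manual copy loop that could stand in for the copy $fh => \*STDOUT, 8_192; call; the chunk size and the copy_to_stdout name are my own choices, not anything from the original answer:
use strict;
use warnings;
use Errno qw( EAGAIN EWOULDBLOCK );

# Copy a file handle to STDOUT in chunks, retrying writes that would block.
# syswrite bypasses PerlIO buffering, so nothing is lost if a write is short.
sub copy_to_stdout {
    my ($fh) = @_;

    while (1) {
        my $buf;
        my $read = sysread $fh, $buf, 8_192;
        die "Read failed: $!" unless defined $read;
        last if $read == 0;                          # end of file

        my $offset = 0;
        while ( $offset < $read ) {
            my $written = syswrite STDOUT, $buf, $read - $offset, $offset;
            if ( !defined $written ) {
                if ( $! == EAGAIN || $! == EWOULDBLOCK ) {
                    select undef, undef, undef, 0.01; # brief pause, then retry
                    next;
                }
                die "Write to client failed: $!";
            }
            $offset += $written;
        }
    }

    return;
}
In the send_file subroutine above, the copy $fh => \*STDOUT, 8_192; line would then become copy_to_stdout($fh);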