Perl Catch Variable in Error

I have a Perl script and I am trying to make it print out the value for $article when it errors. The script looks like:
eval {
    for my $article ($output =~ m/<value lang_id="">(.*?)<\/value>/g)
    {
        $article =~ s/ /+/g;
        $agent->get("someurl");
        $agent->follow_link(url_regex => qr/(?i:pdf)/ );
        my $pdf_data = $agent->content;
        open my $ofh, '>:raw', "$article.pdf"
            or die "Could not write: $!";
        print {$ofh} $pdf_data;
        close $ofh;
        sleep 10;
    }
};
if ($@) {
    print "error: ...: $@\n";
}
So if there is no .pdf file, the code throws an error, which is what I want. But what I need to know is: is it somehow possible to get the name of the $article that caused the error? I was trying to use some kind of global variable, with no luck.

Why don't you put the eval inside the for loop? Something like this:
for my $article ($output =~ m/<value lang_id="">(.*?)<\/value>/g)
{
    $article =~ s/ /+/g;
    eval {
        # ...
    };
    if ($@) {
        print STDERR "Error handling article: ", $article, " ", $@, "\n";
    }
}

If that's your only problem, just declare my $article; before the eval, and remove the my from the for loop. But from your reply to Cornel Ghiban, I suspect it isn't.
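For illustration, a hedged sketch along those lines (here the current value is copied into a variable declared before the eval at the top of each iteration, which also sidesteps foreach's implicit localization of the loop variable; the names follow the question and the download steps are elided):
my $current_article;
eval {
    for my $article ($output =~ m/<value lang_id="">(.*?)<\/value>/g) {
        $current_article = $article;   # remember which one we are working on
        $article =~ s/ /+/g;
        # ... fetch and save the PDF as before ...
    }
};
if ($@) {
    print "error while handling '$current_article': $@\n";
}
Because $current_article lives outside the eval, it still holds the value that was being processed when the die happened.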

Include the file name in the die string:
open my $ofh, '>:raw', "$article.pdf" or die "Could not write '$article': $!";
I assume that you want to write and not read. Unless you have a permission problem or a full file system, an open for writing is likely to succeed, so you will rarely see that error.

Your script does not need to die; you can just set a flag, log a message, or store the error for later handling.
my @errors = ();
................
open my $ofh, '>:raw', "$article.pdf" or do { push @errors, "$article: $!" };
if (defined $ofh) {
    # work with the file
}
................
if (@errors) {
    # do something
}
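For illustration, a fuller (hedged) sketch of that pattern applied to the question's loop; the PDF-fetching steps are elided, and $pdf_data is assumed to hold the downloaded content as in the question:
my @errors;
for my $article ($output =~ m/<value lang_id="">(.*?)<\/value>/g) {
    $article =~ s/ /+/g;
    # ... fetch the PDF into $pdf_data as before ...
    if (open my $ofh, '>:raw', "$article.pdf") {
        print {$ofh} $pdf_data;
        close $ofh;
    }
    else {
        push @errors, "$article: $!";   # remember the failure and keep going
    }
}
if (@errors) {
    print "failed: $_\n" for @errors;
}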

Related

Writing to a file inside if statement not working in Perl

I've looked around here a bit and found similar questions but not exactly. If there is one, I apologize and please point me to it.
I have the following code. I'm trying to create a csv file of simply an ID pulled from a filename and the filename itself. This is the ENTIRE script.
use strict;
use warnings;
use File::Find;
find( \&findAllFiles, '.');
exit;
sub findAllFiles {
    my @fp1;
    my @fp2;
    my $patId;
    my $filename;
    my $testvar = "hello again";
    $filename = $File::Find::name;
    if ($filename =~ /\.pdf$/) {
        open (my $fh, '>', 'filenames.csv') or die "Failed to open - $!\n";
        print $fh "starting...$testvar\n" or die "Failed to print to file - $!\n";
        @fp1 = split('/', $filename);
        @fp2 = split('_', $fp1[-1]);
        $patId = $fp2[-1];
        $patId =~ s/\.pdf$//;
        print "Adding $patId, file = $filename\n";
        print $fh "$patId,$filename\n" or die "File print error: $!";
        close $fh or warn "close failed! - $!";
    }
    return;
}
The line that prints to the screen, prints perfectly.
If I take the file open/close and the first print statement out of the if block, it prints that line into the file, but not the data inside the block.
I've tried every combo I can think of and it doesn't work. I've alternated between '>' and '>>', since it clearly needs to append given that it's looping over filenames, but neither works inside the if block.
Even this code above doesn't throw the die errors! It just ignores those lines! I'm figuring there's something obvious I'm missing.
Quoting File::Find::find's documentation:
Additionally, for each directory found, it will chdir() into that directory
It means that when you open inside findAllFiles, you are potentially opening a file filenames.csv inside a subdirectory of your initial directory. You can run something like find . -name filenames.csv from your terminal, and you'll see plenty of filenames.csv. You can change this behavior by passing the no_chdir option to find:
find( { wanted => \&findAllFiles, no_chdir => 1}, '.');
(and additionally changing '>' to '>>' in your open)
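For instance, a minimal sketch of the wanted sub with those two changes applied (no_chdir plus append mode); the ID extraction is elided and unchanged from the question:
find( { wanted => \&findAllFiles, no_chdir => 1 }, '.');
sub findAllFiles {
    my $filename = $File::Find::name;   # full path, since find() no longer chdirs
    return unless $filename =~ /\.pdf$/;
    open my $fh, '>>', 'filenames.csv' or die "Failed to open - $!\n";
    # ... split out $patId and print "$patId,$filename\n" as before ...
    close $fh or warn "close failed! - $!";
}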
However, personally, I'd avoid repeatedly opening and closing filenames.csv when you could open it just once before calling find. If you don't want to have your filehandle globally defined, you can always pass it as an argument to findAllFiles:
{
    open my $fh, '>', 'filenames.csv' or die "Failed to open 'filenames.csv': $!";
    find(sub { findAllFiles($fh) }, '.');
}
sub findAllFiles {
    my ($fh) = @_;
    ...
filenames.csv will be created in the directory where the pdf is found, since find() changes directories as it searches. If that's not what you want, use an absolute path to open it (or open it before calling find, which seems like a better idea).
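If you do keep the open inside the callback, one hedged alternative is to build an absolute path once, before find() starts changing directories (Cwd is a core module; the variable names here are illustrative):
use Cwd qw(cwd);
my $csv_path = cwd() . '/filenames.csv';   # resolved before find() chdirs anywhere
sub findAllFiles {
    return unless $File::Find::name =~ /\.pdf$/;
    open my $fh, '>>', $csv_path or die "Failed to open '$csv_path' - $!\n";
    # ... print the ID and filename as before ...
    close $fh;
}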

Picking a specific line with a specific string

I am trying, in Perl, to pick one complete line that contains "CURRENT_RUN_ID" out of a whole document. I have been using the code below to accomplish this, but I am unable to enter the while loop.
my $sSuccessString = "CURRENT_RUN_ID";
open(LOG, "$slogfile") or die("Can't open $slogfile\n");
my $sLines;
{
    local $/ = undef;
    $sLines = <LOG>;
}
my $spool = 0;
my @matchingLines;
while (<LOG>)
{
    print OUTLOG "in while loop\n";
    if (m/$sSuccessString/i) {
        print OUTLOG "in if loop\n";
        $spool = 1;
        print map { "$_ \n" } @matchingLines;
        @matchingLines = ();
    }
    if ($spool) {
        push (@matchingLines, $_);
    }
}
You are already done reading from the filehandle LOG after you have slurped it into $sLines. <LOG> in the head of the while will return undef because it has reached eof. You either have to use that variable $sLines in your while loop or get rid of it. You're not using it anyway.
If you only want to print the line that matches, all you need to do is this:
use strict;
use warnings;
open my $fh_in, '<', 'input_file' or die $!;
open my $fh_out, '>', 'output_file' or die $!;
while (my $line = <$fh_in>) {
    print $fh_out $line if $line =~ m/CURRENT_RUN_ID/;
}
close $fh_in;
close $fh_out;
When you execute this code:
$sLines=<LOG>;
it reads all of the data from LOG into $sLines and it leaves the file pointer for LOG at the end of the file. So when you next try to read from that file handle with:
while (<LOG>)
nothing is returned as there is no more data to read.
If you want to read the file twice, then you will need to use the seek() function to reset the file pointer before your second read.
seek LOG, 0, 0;
But, given that you never do anything with $sLines I suspect that you can probably just remove that whole section of the code.
The whole thing with $spool and #matchingLines seems strange too. What were you trying to achieve there?
I think your code can be simplified to just:
my $sSuccessString = "CURRENT_RUN_ID";
open(LOG, $slogfile) or die("Can't open $slogfile\n");
while (<LOG>) {
    print OUTLOG if /$sSuccessString/i;
}
Personally, I'd make it even simpler, by reading from STDIN and writing to STDOUT.
my $sSuccessString = 'CURRENT_RUN_ID';
while (<>) {
    print if /$sSuccessString/i;
}
And then using Unix I/O redirection to connect up the correct files.
$ ./this_filter.pl < your_input.log > your_output.log

How to write a correct filehandle name using a combination of a variable and a string?

I want to make a tool that classifies each line of an input file into one of several files, but there seems to be a problem with how I name the filehandle, so I can't go ahead. How do I solve this?
Here is my program. $ARGV[0] is the input file and $ARGV[1] is the number of classes:
#!/usr/bin/perl
use POSIX;
use warnings;
# open input file
open(Raw,"<","./$ARGV[0]") or die "Can't open $ARGV[0] \n";
# create a directory Class to store class files
system("mkdir","Class");
# create files to store class information
for($i=1;$i<=$ARGV[1];$i++)
{
    # it seems something is wrong in here
    open("Class$i",">","./Class/$i.class") or die "Can't create $i.class \n";
}
# read each line and randomly decide which class to store it in
while( eof(Raw) != 1)
{
    $Line = readline(*Raw);
    $Random_num = ceil(rand $ARGV[1]);
    for($k=1;$k<=$ARGV[1];$k++)
    {
        if($Random_num == $k)
        {
            # Store to the file
            print "Class$k" $Line;
            last;
        }
    }
}
for($h=1;$h<=$ARGV[1];$h++)
{
    close "Class$h";
}
close Raw;
thanks
Later I used the advice provided by Bill Ruppert and put the filehandle names into an array, but now a syntax bug appears that I can't correct.
I have labelled the syntax bug with ######## A syntax error but it looks quite OK ########.
Here is my code:
#!/usr/bin/perl
use POSIX;
use warnings;
use Data::Dumper;
# open input file
open(Raw,"<","./$ARGV[0]") or die "Can't open $ARGV[0] \n";
# create a directory Class to store class files
system("mkdir","Class");
# put the filehandle names into an array
for($i=0;$i<$ARGV[1];$i++)
{
    push(@Name,("Class".$i));
}
# create the class files
for($i=0;$i<=$#Name;$i++)
{
    $I = ($i+1);
    open($Name[$i],">","./Class/$I.class") or die "Can't create $I.class \n";
}
# read each line and randomly decide which class to store it in
while( eof(Raw) != 1)
{
    $Line = readline(*Raw);
    $Random_num = ceil(rand $ARGV[1]);
    for($k=0;$k<=$#Name;$k++)
    {
        if($Random_num == ($k+1))
        {
            print $Name[$k] $Line; ######## A syntax error but it looks quite OK ########
            last;
        }
    }
}
for($h=0;$h<=$#Name;$h++)
{
    close $Name[$h];
}
close Raw;
thanks
To quote the Perl documentation on the print function:
If you're storing handles in an array or hash, or in general whenever you're using any expression more complex than a bareword handle or a plain, unsubscripted scalar variable to retrieve it, you will have to use a block returning the filehandle value instead, in which case the LIST may not be omitted:
print { $files[$i] } "stuff\n";
print { $OK ? STDOUT : STDERR } "stuff\n";
Thus, print $Name[$k] $Line; needs to be changed to print { $Name[$k] } $Line;.
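As an illustration of that rule (a hedged sketch, not the poster's exact script), the same idea with lexical filehandles stored in an array; $classes and $line stand in for $ARGV[1] and the current input line:
my @handles;
for my $i (1 .. $classes) {
    open $handles[$i - 1], '>', "./Class/$i.class"
        or die "Can't create $i.class: $!\n";
}
my $k = int rand @handles;       # pick a class at random
print { $handles[$k] } $line;    # block form, as perldoc -f print requires here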
How about this one:
#! /usr/bin/perl -w
use strict;
use POSIX;
my $input_file = shift;
my $file_count = shift;
my %hash;
open(INPUT, "<$input_file") || die "Can't open file $input_file";
while(my $line = <INPUT>) {
    my $num = ceil(rand($file_count));
    $hash{$num} .= $line;
}
foreach my $i (1..$file_count) {
    open(OUTPUT, ">$i.txt") || die "Can't open file $i.txt";
    print OUTPUT $hash{$i};
    close OUTPUT;
}
close INPUT;

Open filehandle or assign stdout

I'm working on a program where the user can pass a -o file option, and output should then be directed to that file. Otherwise, it should go to stdout.
To retrieve the option I'm using the Getopt::Long module, and that's not the problem. The problem is that I want to create a file handle for that file, or assign stdout to it if the option was not set.
if ($opt) {
    open OUTPUT, ">", $file;
} else {
    open OUTPUT, # ???
}
That's because this way, later in my code I can just:
print OUTPUT "...";
Without worrying if OUTPUT is stdout or a file the user specified. Is this possible? If I'm doing a bad design here, please let me know.
This would be a good example of how to use select.
use strict;
use warnings;
use autodie;
my $fh;
if ($opt) {
    open $fh, '>', $file;
    select $fh;
}
print "This goes to the file if $opt is defined, otherwise to STDOUT.";
Look at the open documentation. The easiest approach is to reopen STDOUT itself and not use a separate filehandle in your code.
if ($opt) {
    open(STDOUT, ">", $file);
}
...
print "this goes to $file or STDOUT\n";
(Add some error checking of course.)
A bareword filehandle such as OUTPUT cannot be assigned to; using a variable such as $output works better. For example:
my ($output, $display_filename);
if ($opt)
{
    if ($opt eq '-')
    {
        $display_filename = 'stdout';
        $output = *STDOUT;
    }
    else
    {
        $display_filename = $opt;
        open($output, '>', $opt) or
            die("Cannot open $opt for writing: $!\n");
    }
}
That way the program can print to standard output and/or to an output file:
print $output "This might go to a file\n";
print "Data written to $display_filename\n" if ($verbose);

How do I determine whether a Perl file handle is a read or write handle?

You are given either an IO::File object or a typeglob (\*STDOUT or Symbol::symbol_to_ref("main::FH")); how would you go about determining if it is a read or write handle? The interface cannot be extended to pass this information (I am overriding close to add calls to flush and sync before the actual close).
Currently I am attempting to flush and sync the filehandle and ignoring the error "Invalid argument" (which is what I get when I attempt to flush or sync a read filehandle):
eval { $fh->flush; 1 } or do {
    # this seems to exclude flushes on read handles
    unless ($! =~ /Invalid argument/) {
        croak "could not flush $fh: $!";
    }
};
eval { $fh->sync; 1 } or do {
    # this seems to exclude syncs on read handles
    unless ($! =~ /Invalid argument/) {
        croak "could not sync $fh: $!";
    }
};
Have a look at the fcntl options. Maybe F_GETFL with O_ACCMODE.
Edit: I did a little googling and playing over lunch and here is some probably non-portable code but it works for my Linux box, and probably any Posix system (perhaps even Cygwin, who knows?).
use strict;
use Fcntl;
use IO::File;

my $file;
my %modes = ( 0 => 'Read only', 1 => 'Write only', 2 => 'Read / Write' );

sub open_type {
    my $fh = shift;
    my $mode = fcntl($fh, F_GETFL, 0);
    print "File is: " . $modes{$mode & 3} . "\n";
}

print "out\n";
$file = new IO::File();
$file->open('> /tmp/out');
open_type($file);
print "\n";

print "in\n";
$file = new IO::File();
$file->open('< /etc/passwd');
open_type($file);
print "\n";

print "both\n";
$file = new IO::File();
$file->open('+< /tmp/out');
open_type($file);
Example output:
$ perl test.pl
out
File is: Write only

in
File is: Read only

both
File is: Read / Write
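For reference, the same check can be written with the O_ACCMODE constant instead of the literal mask of 3 (a small, hedged variation on the code above; still POSIX-specific):
use strict;
use warnings;
use Fcntl qw(F_GETFL O_ACCMODE O_RDONLY O_WRONLY O_RDWR);

sub open_mode {
    my $fh    = shift;
    my $flags = fcntl($fh, F_GETFL, 0) or die "fcntl failed: $!";
    my $mode  = $flags & O_ACCMODE;               # keep only the access-mode bits
    return 'Read only'    if $mode == O_RDONLY;
    return 'Write only'   if $mode == O_WRONLY;
    return 'Read / Write' if $mode == O_RDWR;
    return 'Unknown';
}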