Multilevel 'do' in Perl? - perl

This question may look simple, but I have been thinking about it for the past few days and couldn't find the answer.
I have a multilevel scripting architecture (the code is shown below).
CallingScript.pl (Include toplevel library and check for compiler-error)
do "IncludesConsumer.pm";
print "\n callingScript error : $@" if($@ || $!);
do "IncludesConsumer.pm";
print "\n callingScript error : $@" if($@);
do "IncludesConsumer.pm";
print "\n callingScript error : $@" if($@);
IncludesConsumer.pm (adds the library INCLUDES.pm and has its own functions)
do "INCLUDES.pm";
print "\nin IncludesConsumer";
INCLUDES.pm ( multiple modules in one place, acts as a library )
use Module;
print "\n in includes";
Module.pm (with syntax-error)
use strict;
sub MakeSyntaxError
{
print "\nerror in Module
}
1;
In concept, one of the core modules (e.g. Module.pm) may contain syntax errors, and I need to capture them in CallingScript.pl. That is, I would like to capture the syntax error in the low-level Module.pm from the CallingScript.pl file.
OUTPUT:
D:\Do_analysis>CallingScript.pl
in IncludesConsumer
in IncludesConsumer
in IncludesConsumer
Why is the compiler error not caught in CallingScript.pl?
Please share your thoughts.
Thanks!

Five errors, starting with the one causing the problem you are asking about:
You didn't handle errors from some of the do calls.
You only check $! some of the time.
$! can be true even if there was no error. Only check $! if both do and $@ are false.
All but one of your included files didn't signal a lack of error (1;). You need to return true so do returns true, so you know whether an error occurred or not.
You used a file that doesn't have a package.
CallingScript.pl:
do "IncludesConsumer.pm" or die "CallingScript: ".($@ || $!);
do "IncludesConsumer.pm" or die "CallingScript: ".($@ || $!);
do "IncludesConsumer.pm" or die "CallingScript: ".($@ || $!);
IncludesConsumer.pm (which isn't actually a "pm"):
do "INCLUDES.pm" or die "IncludesConsumer: ".($@ || $!);
print "\nin IncludesConsumer";
1;
INCLUDES.pm (which isn't actually a "pm"):
use Module;
print "\n in includes";
1;
Module.pm (with syntax-error)
package Module;
sub MakeSyntaxError {
1;
There's really no reason to use do in this fashion. It's a very poor programming practice. Please avoid it in favor of modules.

Well, you have the following hierarchy:
CallingScript
do IncludesConsumer
do INCLUDES
use Module
The use Module is processed at compile time of INCLUDES.pm, which then also fails. After do "INCLUDES.pm", the $@ variable is set.
However, $@ refers to the last eval (of which do FILE is a variant). In CallingScript.pl, this is the do "IncludesConsumer.pm", which ran fine.
The do FILE syntax has been unnecessary since modules were added to Perl. In nearly all cases you want to use your files instead, or require them if runtime effects are needed.
If you want to assert that modules can be loaded fine, I refer you to Test::More with the use_ok function.
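The way $@ gets reset by the outer do can be reproduced in a small script. This is only a sketch: the file names broken.pl, silent.pl and strict.pl and the write_file helper are made up for illustration, and the files are created in a temporary directory. The middle file that ignores the inner result hides the compile error; the one that re-throws with or die passes it up to the caller.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $dir = tempdir(CLEANUP => 1);

# Hypothetical helper: write $content to $name in the temp directory.
sub write_file {
    my ($name, $content) = @_;
    my $path = File::Spec->catfile($dir, $name);
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Innermost file: the unclosed brace is a deliberate syntax error.
my $broken = write_file('broken.pl', "sub oops {\n1;\n");

# Middle file, variant 1: ignores the result of the inner do.
my $silent = write_file('silent.pl', "do '$broken';\n1;\n");

# Middle file, variant 2: re-throws the inner error.
my $strict = write_file('strict.pl', "do '$broken' or die \"inner: \$@\";\n1;\n");

my $silent_ok  = do $silent;
my $silent_err = $@;    # empty: the outer do itself succeeded

my $strict_ok  = do $strict;
my $strict_err = $@;    # carries the inner compile error

print "silent: ", ($silent_err eq '' ? "no error seen" : $silent_err), "\n";
print "strict: $strict_err\n";
```

Every level has to check its own do and re-throw; otherwise the information is lost as soon as the enclosing do completes successfully.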

Related

How to use gdbm in Perl

I'm new to gdbm and I would like to use it in Perl. I know that Perl ships by default with a module for that (GDBM_File). Now, when I try the simplest example possible, namely:
#!/usr/bin/perl
use strict;
use warnings;
use GDBM_File;
my $dbfile = '/tmp/test.gdbm';
my $ok = tie(my %db, 'GDBM_File', $dbfile, &GDBM_WRCREAT, 0664);
die "can't tie to $dbfile for WRCREAT access: $!" unless $ok;
$db{test} = 1;
untie %db;
and execute it I get the following warning:
untie attempted while 1 inner references still exist at ./gdbm-test line 13.
I read the perl documentation (see the "untie gotcha" in the provided link) but that explanation does not seem to apply here since it is clear that %db has no references anywhere in the code pointing to it.
Nonetheless the code seems to work since when I inspect the database file I get the correct result:
bash$ echo list | gdbmtool /tmp/test.gdbm
test 1
Why does this warning appear and how can I get rid of it?
I think that this is, in fact, a manifestation of the gotcha that you point to. The documentation for tie() says this:
The object returned by the constructor is also returned by the tie function
So your $ok contains a reference to the object, and you should undefine that before calling untie().
undef $ok;
untie %db;
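The same effect can be reproduced without GDBM. The sketch below uses a made-up tie class, Demo::Hash, as a stand-in for GDBM_File (an assumption for illustration only), and collects warnings in an array instead of printing them. It compares untie while the object returned by tie is still held with untie after that reference has been dropped.

```perl
use strict;
use warnings;

# Hypothetical minimal tie class with no UNTIE method, standing in for GDBM_File.
package Demo::Hash;
sub TIEHASH { my ($class) = @_; return bless {}, $class }
sub STORE   { my ($self, $key, $value) = @_; $self->{$key} = $value }
sub FETCH   { my ($self, $key) = @_; return $self->{$key} }

package main;

my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

# Case 1: the object returned by tie is still held in $obj at untie time.
my $obj = tie my %kept, 'Demo::Hash';
$kept{test} = 1;
untie %kept;
my $warned_with_ref = scalar @warnings;

# Case 2: release the object first, then untie.
@warnings = ();
$obj = tie my %released, 'Demo::Hash';
$released{test} = 1;
undef $obj;
untie %released;
my $warned_without_ref = scalar @warnings;

print "holding the object:  $warned_with_ref warning(s)\n";
print "after undef of \$obj: $warned_without_ref warning(s)\n";
```

If the return value is only needed as a success flag, another option is to test tie(...) directly in the die ... unless condition rather than storing it.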

Cannot load `Cwd` (and other, non-core, modules) at runtime

Imagine I want to load a module at runtime. I expected this to work
use warnings;
use strict;
eval {
require Cwd;
Cwd->import;
};
if ($@) { die "Can't load Cwd: $@" }
say "Dir: ", getcwd;
but it doesn't, per Bareword "getcwd" not allowed ....
The Cwd exports getcwd by default. I tried giving the function name(s) to import and I tried with its other functions.
It works with the full name, say Cwd::getcwd, so I'd think that it isn't importing.
This works as attempted for a few other core modules that I tried, for example
use warnings;
use strict;
eval {
require List::Util;
List::Util->import('max');
};
if ($@) { die "Can't load List::Util: $@" }
my $max = max (1, 14, 3, 26, 2);
print "Max is $max\n";
NOTE added: Apparently, function calls with parentheses give a clue to the compiler. However, in my opinion the question remains; please see the EDIT at the end. In addition, a function like first BLOCK LIST from the module above does not work.
However, it does not work for a few (well established) non-core modules that I tried. Worse and more confusingly, it does not work even with the fully qualified names.
I can imagine that the symbol (function) used is not known at compile time if require is used at runtime, but it works for (other) core modules. I thought that this was a standard way to load at runtime.
If I need to use full names when loading dynamically then fine, but what is it with the inconsistency? And how do I load (and use) non-core modules at runtime?
I also tried with Module::Load::Conditional and it did not work.
What am I missing, and how does one load modules at runtime? (Tried with 5.16 and 5.10.1.)
EDIT
As noted by Matt Jacob, a call with parentheses works: getcwd(). However, given perlsub
NAME LIST; # Parentheses optional if predeclared/imported.
this implies that the import didn't work and the question of why remains.
Besides, having to use varied syntax based on how the module is loaded is not good. Also, I cannot get non-core modules to work this way, especially ones with syntax like List::MoreUtils has.
First, this has nothing to do with core vs. non-core modules. It happens when the parser has to guess whether a particular token is a function call.
eval {
require Cwd;
Cwd->import;
};
if ($@) { die "Can't load Cwd: $@" }
say "Dir: ", getcwd;
At compile time, there is no getcwd in the main:: symbol table. Without any hints to indicate that it's a function (getcwd() or &getcwd), the parser has no way to know, and strict complains.
eval {
require List::Util;
List::Util->import('max');
};
if ($@) { die "Can't load List::Util: $@" }
my $max = max (1, 14, 3, 26, 2);
At compile time, there is no max in the main:: symbol table. However, since you call max with parentheses, the parser can guess that it's a function that will be defined later, so strict doesn't complain.
In both cases, the strict check happens before import is ever called.
List::MoreUtils is special because the functions use prototypes. Prototypes are ignored if the function definition is not visible at compile time. So, not only do you have to give the parser a hint that you're calling a function, you also have to call it differently since the prototype will be ignored:
use strict;
use warnings 'all';
use 5.010;
eval {
require List::MoreUtils;
List::MoreUtils->import('any')
};
die "Can't load List::MoreUtils: $@" if $@;
say 'found' if any( sub { $_ > 5 }, 1..9 );
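Two ways to live with this can be sketched side by side. The first moves the guarded load into a BEGIN block, so the import has already happened when the rest of the file is parsed and prototype-based syntax such as first BLOCK LIST compiles. The trade-off is that the load is no longer deferred to run time. The second keeps the load at run time and always calls the function with parentheses. List::Util and Cwd are used here only because they ship with Perl.

```perl
use strict;
use warnings;

# Option 1: load inside BEGIN, so the imported name (and its prototype)
# is known before the rest of the file is parsed.
BEGIN {
    eval { require List::Util; List::Util->import('first'); 1 }
        or die "Can't load List::Util: $@";
}
my $first_big = first { $_ > 5 } 1 .. 9;    # block syntax compiles

# Option 2: load at run time and always call with parentheses.
my $loaded = eval { require Cwd; Cwd->import('getcwd'); 1 };
die "Can't load Cwd: $@" unless $loaded;
my $dir = getcwd();

print "first value above 5: $first_big\n";
print "current directory: $dir\n";
```

With option 1, code that uses the imported syntax cannot compile at all if the module is missing, so it suits modules that are required rather than optional.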

Perl : Name "main::IN" used only once, but it is actually used

I am writing a short Perl script that reads in a file. See tmp.txt:
1 gene_id "XLOC_000001"; gene_name "DDX11L1"; oId
1 gene_id "XLOC_000001"; gene_name "DDX11L1"; oId
1 gene_id "XLOC_000001"; gene_name "DDX11L1"; oId
1 gene_id "XLOC_000001"; gene_name "DDX11L1"; oId
My perl program, convert.pl is :
use warnings;
use strict;
use autodie; # die if io problem with file
my $line;
my ($xloc, $gene, $ens);
open (IN, "tmp.txt")
or die ("open 'tmp.txt' failed, $!\n");
while ($line = <IN>) {
($xloc, $gene) = ($line =~ /gene_id "([^"]+)".*gene_name "([^"]+)"/);
print("$xloc $gene\n");
}
close (IN)
or warn $! ? "ERROR 1" : "ERROR 2";
It outputs:
Name "main::IN" used only once: possible typo at ./convert.pl line 8.
XLOC_000001 DDX11L1
XLOC_000001 DDX11L1
XLOC_000001 DDX11L1
XLOC_000001 DDX11L1
I used IN, so I don't understand the Name "main::IN" used... warning. Why is it complaining?
This is mentioned under the BUGS section of autodie:
"Used only once" warnings can be generated when autodie or Fatal is used with package filehandles (eg, FILE). Scalar filehandles are strongly recommended instead.
use diagnostics; says:
Name "main::IN" used only once: possible typo at test.pl line 9 (#1)
(W once) Typographical errors often show up as unique variable names.
If you had a good reason for having a unique name, then just mention
it again somehow to suppress the message. The our declaration is also
provided for this purpose.
NOTE: This warning detects package symbols that have been used only
once. This means lexical variables will never trigger this warning.
It also means that all of the package variables $c, @c, %c, as well as
*c, &c, sub c{}, c(), and c (the filehandle or format) are considered the same; if a program uses $c only once but also uses any of the
others it will not trigger this warning. Symbols beginning with an
underscore and symbols using special identifiers (q.v. perldata) are
exempt from this warning.
So if you use a lexical filehandle, it will not warn.
use warnings;
use strict;
use autodie; # die if io problem with file
use diagnostics;
my $line;
my ($xloc, $gene, $ens);
open (my $in, "<", "tmp.txt")
or die ("open 'tmp.txt' failed, $!\n");
while ($line = <$in>) {
($xloc, $gene) = ($line =~ /gene_id "([^"]+)".*gene_name "([^"]+)"/);
print("$xloc $gene\n");
}
close ($in)
or warn $! ? "ERROR 1" : "ERROR 2";
I'm pretty sure this is because of autodie.
I don't know exactly why, but if you remove it, it goes away.
If you read perldoc autodie you'll see:
BUGS
"Used only once" warnings can be generated when autodie or Fatal is used with package filehandles (eg, FILE). Scalar filehandles are strongly recommended instead.
I'd suggest that's because of how the or die is being handled, compared to autodie trying to handle it.
However I'd also suggest it would be much better style to use a 3 argument open:
open ( my $input, '<', 'tmp.txt');
And use either autodie or an explicit or die. I must confess, I'm not really sure which of the two would take effect if your process did fail the open.
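Putting both suggestions together, a version that relies on autodie alone with a lexical filehandle and three-argument open could look like the sketch below. The temporary input file is created by the script itself purely so that the example is self-contained, and the @pairs array is an illustrative addition.

```perl
use strict;
use warnings;
use autodie;                       # open and close throw on failure
use File::Temp qw(tempfile);

# Create a small sample input so the sketch can run on its own.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} qq{1 gene_id "XLOC_000001"; gene_name "DDX11L1"; oId\n} x 2;
close $tmp_fh;

my @pairs;
open my $in, '<', $tmp_name;       # no "or die" needed under autodie
while (my $line = <$in>) {
    # Only record lines where both fields were actually found.
    if (my ($xloc, $gene) = $line =~ /gene_id "([^"]+)".*gene_name "([^"]+)"/) {
        push @pairs, "$xloc $gene";
    }
}
close $in;

print "$_\n" for @pairs;
```

Because $in is a lexical, no package symbol is involved and the "used only once" check has nothing to complain about.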

File locking with Fcntl: Baffling bug involving 'use' and 'require'

The following Perl script outputs "SUCCESS" as you'd expect:
use Fcntl qw(:DEFAULT :flock);
sysopen(LF, "test.txt", O_RDONLY | O_CREAT) or die "SYSOPEN FAIL: $!";
if(flock(LF, LOCK_EX)) { print "SUCCESS.\n"; }
else { print "FAIL: $!\n"; }
But now, replace that first line with
require "testlib.pl";
where testlib.pl contains
use Fcntl qw(:DEFAULT :flock);
1;
Now, strangely enough, the script fails, like so:
FAIL: Bad file descriptor
The question: Why?
ADDED:
And now that I know why -- thanks! -- I'm wondering what is the best way to deal with this:
1. Just do the use Fcntl twice, once in the main script and once in the required library (both the main script and the library need it).
2. Replace O_RDONLY with &O_RDONLY, etc.
3. Replace O_RDONLY with O_RDONLY(), etc.
4. Something else?
By foregoing use, you deprive the Perl parser of the knowledge that O_RDONLY et al. are parameterless subroutines. You have to be a bit more verbose in that situation:
sysopen(LF, "test.txt", O_RDONLY() | O_CREAT()) or die "SYSOPEN FAIL: $!";
if(flock(LF, LOCK_EX())) { print "SUCCESS.\n"; }
EDIT: To elaborate a bit further, without the parentheses, the O_RDONLY and O_CREAT were being interpreted as barewords (strings), which don't behave as you'd expect when binary-or'ed together:
$ perl -le 'print O_RDONLY | O_CREAT'
O_SVOO\Y
(The individual characters are being bitwise or'ed together.)
In this case, the string "O_SVOO\Y" (or whatever it is on your system) was being interpreted as the number 0 by sysopen, which would therefore still work as long as O_RDONLY is 0 (as is typical) and the file already existed (so the O_CREAT was superfluous). But flock is apparently not as forgiving with non-numeric arguments:
$ perl -e 'flock STDOUT, "LOCK_EX" or die "Failed: $!"'
Failed: Bad file descriptor at -e line 1.
Similarly:
$ perl -e 'flock STDOUT, LOCK_EX or die "Failed: $!"'
Failed: Bad file descriptor at -e line 1.
However:
$ perl -e 'use Fcntl qw(:flock); flock STDOUT, LOCK_EX or die "Failed: $!"'
(no output)
Finally, note that use strict provides many helpful clues.
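The difference between the two interpretations can be seen in one short script. This is a sketch only: the quoted strings stand in for what the barewords turn into when neither the import nor strict is in effect, and the lock is taken on a temporary file created just for the demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempfile);

# Without the import (and without strict), the barewords degrade to strings,
# and | on two strings ORs them character by character.
my $as_strings = "O_RDONLY" | "O_CREAT";

# With the import visible at compile time, they are numeric constants.
my $as_numbers = O_RDONLY | O_CREAT;

# flock gets real numeric flags here, so the lock request is well-formed.
my ($fh, $path) = tempfile(UNLINK => 1);
my $locked = flock($fh, LOCK_EX | LOCK_NB);

print "strings: $as_strings\n";
print "numbers: $as_numbers\n";
print "lock: ", ($locked ? "acquired" : "failed: $!"), "\n";
```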
The line use Fcntl qw(:DEFAULT :flock); is not just loading the Fcntl library for you, but also exporting some symbols into your script's namespace. If you move that to a different file, then the constants O_RDONLY, O_CREAT, and LOCK_EX are no longer known when your script is compiled, and your code won't do the same thing [however you could still reach them, if you know what namespace they ended up in -- since it was a script that did the export, you could call &main::NAME or simply &NAME, but then you have to be aware of what another file is doing with its code, which is not very clean].
This is described in the documentation under EXPORTED SYMBOLS:
By default your system's F_* and O_* constants (eg, F_DUPFD and O_CREAT) and the FD_CLOEXEC constant are exported into your namespace.
You can request that the flock() constants (LOCK_SH, LOCK_EX, LOCK_NB and LOCK_UN) be provided by using the tag ":flock". See Exporter.
If you add the lines
use strict;
use warnings;
to the top of your script, you will get more informative error messages such as "Name "main::O_RDONLY" used only once: possible typo at line ...", which would give you a clue that these constant definitions are no longer visible.
Edit: in response to your question, the best practice would be #1, to include
the use statement in every file that needs it. See perldoc -f use -- the Fcntl library is only included once, but the import() call is made every time it is needed, which is what you want.
use is equivalent to:
BEGIN { require Module; Module->import( LIST ); }
guaranteeing that the imported functions are available before the code starts executing. When you replace use with require, the file is simply read in at run time, at the point in the program where the require appears.

How can I get around a 'die' call in a Perl library I can't modify?

Yes, the problem is with a library I'm using, and no, I cannot modify it. I need a workaround.
Basically, I'm dealing with a badly written Perl library, that exits with 'die' when a certain error condition is encountered reading a file. I call this routine from a program which is looping through thousands of files, a handful of which are bad. Bad files happen; I just want my routine to log an error and move on.
IF I COULD modify the library, I would simply change the
die "error";
to a
print "error";return;
, but I cannot. Is there any way I can couch the routine so that the bad files won't crash the entire process?
FOLLOWUP QUESTION: Using an "eval" to couch the crash-prone call works nicely, but how do I set up handling for catch-able errors within that framework? To describe:
I have a subroutine that calls the library-which-crashes-sometimes many times. Rather than couch each call within this subroutine with an eval{}, I just allow it to die, and use an eval{} on the level that calls my subroutine:
my $status=eval{function($param);};
unless($status){print $@; next;}; # print error and go to next file if function() fails
However, there are error conditions that I can and do catch in function(). What is the most proper/elegant way to design the error-catching in the subroutine and the calling routine so that I get the correct behavior for both caught and uncaught errors?
You could wrap it in an eval. See:
perldoc -f eval
For instance, you could write:
# warn if routine calls die
eval { routine_might_die() }; warn $@ if $@;
This will turn the fatal error into a warning, which is more or less what you suggested. If die is called, $@ contains the string passed to it.
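For the loop over thousands of files, a slightly more defensive pattern is to make the eval block end in a true value and test that, instead of testing $@ afterwards. The sketch below uses a made-up parse_file routine as a stand-in for the library call and made-up file names. Errors are collected and the loop moves on.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the library routine that dies on bad input.
sub parse_file {
    my ($name) = @_;
    die "cannot parse $name\n" if $name =~ /bad/;
    return "parsed $name";
}

my (@results, @errors);
for my $file (qw(a.txt bad.txt c.txt)) {
    my $result;
    # The trailing 1 makes eval return true only if nothing died,
    # regardless of what parse_file itself returns.
    my $ok = eval { $result = parse_file($file); 1 };
    if (!$ok) {
        my $err = $@ || 'unknown error';
        chomp $err;
        push @errors, "$file: $err";
        next;                      # log and move on to the next file
    }
    push @results, $result;
}

print "ok:  $_\n" for @results;
print "err: $_\n" for @errors;
```

This also fits the follow-up question: errors that function() can detect itself can simply be raised with die as well, so the caller has one place that handles both kinds.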
Does it trap $SIG{__DIE__}? If it does, then it's more local than you are. But there are a couple strategies:
You can switch into its package and override die:
package Library::Dumb::Dyer;
use subs 'die';
sub die {
my ( $package, $file, $line ) = caller();
unless ( $decider->decide( $file, $package, $line ) eq 'DUMB' ) {
say "It's a good death.";
CORE::die @_; # call the builtin; a plain die here would re-enter this override
}
}
If not, you can trap it yourself with a $SIG{__DIE__} handler (see the %SIG entry in perlvar).
my $old_die_handler = $SIG{__DIE__};
sub _death_handler {
my ( $package, $file, $line ) = caller();
unless ( $decider->decide( $file, $package, $line ) eq 'DUMB DIE' ) {
say "It's a good death.";
goto &$old_die_handler;
}
}
$SIG{__DIE__} = \&_death_handler;
You might have to scan the library, find a sub that it always calls, and use that to load your $SIG handler by overriding that.
my $dumb_package_do_something_dumb = \&Dumb::do_something_dumb;
*Dumb::do_something_dumb = sub {
$SIG{__DIE__} = ...
goto &$dumb_package_do_something_dumb;
};
Or override a builtin that it always calls...
package Dumb;
use subs 'chdir';
sub chdir {
$SIG{__DIE__} = ...
CORE::chdir $_[0];
};
If all else fails, you can whip the horse's eyes with this:
package CORE::GLOBAL;
use subs 'die';
sub die {
...
CORE::die @_;
}
This will override die globally. The only way to get the original die back is to address it as CORE::die.
Some combination of this will work.
Although changing a die to not die has a specific solution as shown in the other answers, in general you can always override subroutines in other packages. You don't change the original source at all.
First, load the original package so you get all of the original definitions. Once the original is in place, you can redefine the troublesome subroutine:
BEGIN {
use Original::Lib;
no warnings 'redefine';
sub Original::Lib::some_sub { ... }
}
You can even cut and paste the original definition and tweak what you need. It's not a great solution, but if you can't change the original source (or want to try something before you change the original), it can work.
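As a rough illustration of redefining a subroutine from outside, the sketch below wraps the troublesome routine instead of copying its body. Original::Lib, read_record and process are made-up stand-ins defined inline so the example runs on its own. In real code the package would be loaded with use and only the block that replaces the glob entry would be yours.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the library that cannot be edited.
package Original::Lib;
sub read_record {
    my ($name) = @_;
    die "bad record in $name\n" if $name =~ /bad/;
    return "record from $name";
}
sub process { my ($name) = @_; return uc read_record($name) }

package main;

my @log;
{
    no warnings 'redefine';
    my $original = \&Original::Lib::read_record;
    # Replace the routine with a wrapper that traps the die and logs it.
    *Original::Lib::read_record = sub {
        my @args   = @_;
        my $record = eval { $original->(@args) };
        return $record if defined $record;
        chomp(my $err = $@);
        push @log, $err;
        return '';                 # assumption: callers tolerate an empty record
    };
}

my @out = map { Original::Lib::process($_) } qw(a.txt bad.txt c.txt);
print "out: [$_]\n" for @out;
print "log: $_\n"   for @log;
```

Callers inside the library pick up the wrapper too, because the call to read_record is resolved through the package's symbol table when it runs.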
Besides that, you can copy the original source file into a separate directory for your application. Since you control that directory, you can edit the files in it. You modify that copy and load it by adding that directory to Perl's module search path:
use lib qw(/that/new/directory);
use Original::Lib; # should find the one in /that/new/directory
Your copy sticks around even if someone updates the original module (although you might have to merge changes).
I talk about this quite a bit in Mastering Perl, where I show some other techniques to do that sort of thing. The trick is to not break things even more. How you avoid breaking things depends on what you are doing.