File locking with Fcntl: Baffling bug involving 'use' and 'require'

The following Perl script outputs "SUCCESS" as you'd expect:
use Fcntl qw(:DEFAULT :flock);
sysopen(LF, "test.txt", O_RDONLY | O_CREAT) or die "SYSOPEN FAIL: $!";
if(flock(LF, LOCK_EX)) { print "SUCCESS.\n"; }
else { print "FAIL: $!\n"; }
But now, replace that first line with
require "testlib.pl";
where testlib.pl contains
use Fcntl qw(:DEFAULT :flock);
1;
Now, strangely enough, the script fails, like so:
FAIL: Bad file descriptor
The question: Why?
ADDED:
And now that I know why -- thanks! -- I'm wondering what is the best way to deal with this:
1. Just do the use Fcntl twice, once in the main script and once in the required library (both the main script and the library need it).
2. Replace O_RDONLY with &O_RDONLY, etc.
3. Replace O_RDONLY with O_RDONLY(), etc.
4. Something else?

By forgoing use, you deprive the Perl parser of the knowledge that O_RDONLY et al. are parameterless subroutines. You have to be a bit more verbose in that situation:
sysopen(LF, "test.txt", O_RDONLY() | O_CREAT()) or die "SYSOPEN FAIL: $!";
if(flock(LF, LOCK_EX())) { print "SUCCESS.\n"; }
EDIT: To elaborate a bit further, without the parentheses, the O_RDONLY and O_CREAT were being interpreted as barewords (strings), which don't behave as you'd expect when binary-or'ed together:
$ perl -le 'print O_RDONLY | O_CREAT'
O_SVOO\Y
(The individual characters are being bitwise or'ed together.)
In this case, the string "O_SVOO\Y" (or whatever it is on your system) was being interpreted by sysopen as the number 0, so the call would still work as long as O_RDONLY is 0 (as is typical) and the file already existed (so the O_CREAT was superfluous). But flock is apparently not as forgiving with non-numeric arguments:
$ perl -e 'flock STDOUT, "LOCK_EX" or die "Failed: $!"'
Failed: Bad file descriptor at -e line 1.
Similarly:
$ perl -e 'flock STDOUT, LOCK_EX or die "Failed: $!"'
Failed: Bad file descriptor at -e line 1.
However:
$ perl -e 'use Fcntl qw(:flock); flock STDOUT, LOCK_EX or die "Failed: $!"'
(no output)
Finally, note that use strict provides many helpful clues.
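To see the difference concretely, here is a small sketch that compares the bitwise OR of the two bareword strings with the OR of the real constants. The variable names are made up for illustration, and the numeric values of the constants vary by platform, so none are shown.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# With the import done at compile time, these are calls to constant subs.
my $numeric_flags = O_RDONLY | O_CREAT;

# What the parser effectively saw without the import: two strings,
# OR'ed together character by character.
my $string_flags = "O_RDONLY" | "O_CREAT";

printf "numeric: %d\n", $numeric_flags;
print  "string:  $string_flags\n";

# In numeric context the resulting string counts as 0.
{
    no warnings 'numeric';
    printf "string as number: %d\n", $string_flags + 0;
}
```

With use strict in effect and no import, the bare O_RDONLY would be rejected at compile time instead of silently becoming a string.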

The line use Fcntl qw(:DEFAULT :flock); is not just loading the Fcntl library for you, but also exporting some symbols into your script's namespace. If you move that line into a different file, the constants O_RDONLY, O_CREAT, and LOCK_EX are no longer known when your script is compiled, and your code won't do the same thing. (You could still reach them if you know what namespace they ended up in: since it was a script that did the import, you could call &main::NAME or simply &NAME. But then you have to be aware of what another file is doing with its code, which is not very clean.)
This is described in the documentation under EXPORTED SYMBOLS:
By default your system's F_* and O_* constants (eg, F_DUPFD and O_CREAT) and the FD_CLOEXEC constant are exported into your namespace.
You can request that the flock() constants (LOCK_SH, LOCK_EX, LOCK_NB and LOCK_UN) be provided by using the tag ":flock". See Exporter.
If you add the lines
use strict;
use warnings;
to the top of your script, you will get more informative error messages such as "Name "main::O_RDONLY" used only once: possible typo at line ...", which would give you a clue that these constant definitions are no longer visible.
Edit: in response to your question, the best practice would be #1, to include the use statement in every file that needs it. See perldoc -f use -- the Fcntl library is only included once, but the import() call is made every time it is needed, which is what you want.
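A minimal sketch of option #1, with the library and the script collapsed into one file so it can be run as-is. The package name TestLib and the sub try_lock are hypothetical; in practice the first package would live in its own file. The sketch opens the file with O_RDWR rather than O_RDONLY, on the assumption that some platforms refuse an exclusive lock on a read-only descriptor.

```perl
use strict;
use warnings;

package TestLib;                  # hypothetical library
use Fcntl qw(:DEFAULT :flock);    # the library imports what it uses

# Try to take an exclusive lock without blocking; true on success.
sub try_lock {
    my ($fh) = @_;
    return flock($fh, LOCK_EX | LOCK_NB);
}

package main;
use Fcntl qw(:DEFAULT :flock);    # imported again; Fcntl itself is loaded only once

my $path = "test.txt";
sysopen(my $lf, $path, O_RDWR | O_CREAT) or die "SYSOPEN FAIL: $!";

my $locked = TestLib::try_lock($lf);
print $locked ? "SUCCESS.\n" : "FAIL: $!\n";
close $lf;
```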

use is equivalent to:
BEGIN { require Module; Module->import( LIST ); }
guaranteeing that the imported functions are available before the rest of the code is compiled. When you replace use with require, the file is loaded only at run time, when execution reaches that statement, which is after the rest of the script has already been compiled.
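One consequence, sketched below: if the import is forced to happen at compile time, the bare constant names parse correctly again. This only illustrates the equivalence, spelled out with Fcntl; writing use Fcntl directly is the usual form.

```perl
use strict;
use warnings;

# Spelled-out equivalent of: use Fcntl qw(:flock);
BEGIN {
    require Fcntl;
    Fcntl->import(qw(:flock));
}

# The BEGIN block has already run by the time this line is parsed,
# so LOCK_EX and LOCK_NB are known subs and strict does not complain.
my $op = LOCK_EX | LOCK_NB;
print "lock operation: $op\n";
```

By the same reasoning, wrapping the require "testlib.pl" from the question in a BEGIN block should also let the main script see the constants, though repeating use Fcntl in each file is clearer.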

Related

How to use gdbm in Perl

I'm new to gdbm and I would like to use it in Perl. I know that Perl ships by default with a module for that (GDBM_File). Now, when I try the simplest example possible, namely:
#!/usr/bin/perl
use strict;
use warnings;
use GDBM_File;
my $dbfile = '/tmp/test.gdbm';
my $ok = tie(my %db, 'GDBM_File', $dbfile, &GDBM_WRCREAT, 0664);
die "can't tie to $dbfile for WRCREAT access: $!" unless $ok;
$db{test} = 1;
untie %db;
and execute it I get the following warning:
untie attempted while 1 inner references still exist at ./gdbm-test line 13.
I read the perl documentation (see the "untie gotcha" in the provided link) but that explanation does not seem to apply here since it is clear that %db has no references anywhere in the code pointing to it.
Nonetheless the code seems to work since when I inspect the database file I get the correct result:
bash$ echo list | gdbmtool /tmp/test.gdbm
test 1
Why does this warning appear and how can I get rid of it?
I think that this is, in fact, a manifestation of the gotcha that you point to. The documentation for tie() says this:
The object returned by the constructor is also returned by the tie function
So your $ok contains a reference to the object, and you should undefine that before calling untie().
undef $ok;
untie %db;
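The effect can be reproduced without GDBM. The sketch below uses the core Tie::StdHash class as a stand-in (an assumption made so the example runs anywhere) and collects warnings in an array instead of printing them straight away.

```perl
use strict;
use warnings;
use Tie::Hash;    # core module; also defines Tie::StdHash

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# Holding on to the object returned by tie() triggers the warning.
{
    my $obj = tie my %h, 'Tie::StdHash';
    untie %h;
}

# Testing the return value without storing it does not.
{
    tie(my %h, 'Tie::StdHash') or die "can't tie: $!";
    $h{test} = 1;
    untie %h;
}

print scalar(@warnings), " warning(s)\n";
print $warnings[0] if @warnings;
```

So an alternative to undef $ok is to write tie(...) or die ... and never keep the returned object at all.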

Perl wrongly complaining about Name "main::FILE" used only once

I simplified my program to the following trivial snippet and I'm still getting the message
Name "main::FILE" used only once: possible typo...
#!/usr/bin/perl -w
use strict;
use autodie qw(open close);
foreach my $f (@ARGV) {
local $/;
open FILE, "<", $f;
local $_ = <FILE>; # <--- HERE
close FILE;
print $_;
}
which obviously isn't true as it gets used three times. For whatever reason, only the marked occurrence counts.
I am aware of nicer ways to open a file (using a $filehandle), but it doesn't pay for a short script, does it? So how can I get rid of the wrong warning?
According to the documentation for autodie:
BUGS
"Used only once" warnings can be generated when autodie or Fatal is used with package filehandles (eg, FILE ). Scalar filehandles are strongly recommended instead.
I get the warning on Perl 5.10.1, but not 5.16.3, so there may be something else going on as well.
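For comparison, here is a sketch of the same script with a scalar filehandle, as the autodie documentation recommends. The helper name slurp is made up for this example; the rest follows the question's code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie qw(open close);

# Read a whole file into a string using a lexical (scalar) filehandle.
sub slurp {
    my ($path) = @_;
    local $/;                   # undef record separator: read everything at once
    open my $fh, '<', $path;    # autodie throws if this fails
    my $content = <$fh>;
    close $fh;
    return $content;
}

print slurp($_) for @ARGV;
```

It is only a couple of lines longer than the bareword version, and there is no package filehandle left for the warning to refer to.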

Perl Porter Stemmer

I was checking out this Porter stemmer. The comments below say I should change the first line, but to what exactly? I tried everything, but the stemmer isn't working. What would a good example be?
#!/usr/local/bin/perl -w
#
# Perl implementation of the porter stemming algorithm
# described in the paper: "An algorithm for suffix stripping, M F Porter"
# http://www.muscat.com/~martin/stem.html
#
# Daniel van Balen (vdaniel@ldc.usb.ve)
#
# October-1999
#
# To Use:
#
# Put the line "use porter;" in your code. This will import the subroutine
# porter into your current name space (by default this is Main:: ). Make
# sure this file, "porter.pm" is in your @INC path (it includes the current
# directory).
# Afterwards use by calling "porter(<word>)" where <word> is the word to strip.
# The stripped word will be the returned value.
#
# REMEMBER TO CHANGE THE FIRST LINE TO POINT TO THE PATH TO YOUR PERL
# BINARY
#
The code I am writing is as follows:
use Lingua::StopWords qw(getStopWords);
use Main::porter;
my $stopwords = getStopWords('en');
@stopwords = grep { $stopwords->{$_} } (keys %$stopwords);
chdir("c:/perl/input");
@files = <*>;
foreach $file (@files)
{
open (input, $file);
while (<input>)
{
open (output,">>c:/perl/normalized/".$file);
chomp;
porter<$_>;
for my $stop (@stopwords)
{
s/\b\Q$stop\E\b//ig;
}
$_ =~s/<[^>]*>//g;
$_ =~ s/[[:punct:]]//g;
print output "$_\n";
}
}
close (input);
close (output);
The code gives no errors except it is not stemming anything!!!
That comment block is full of incorrect advice.
A #! line in a .pm file has no effect. It's a common mistake. The #! line tells Unix which interpreter to run the program with if and only if you run the file as a command line program.
./somefile # uses #! to determine what to run somefile with
/usr/bin/perl somefile # runs somefile with /usr/bin/perl regardless of #!
The #! line does nothing in a module, a .pm file which you use. Perl is already running at that point. The line is nothing but a comment.
The second problem is that your default namespace is main not Main. Casing matters.
Moving on to your code, use Main::porter; should not work. It should be use porter. You should get an error message like Can't locate Main/porter.pm in @INC (@INC contains: ...). If that code runs, perhaps you moved porter.pm into a Main/ directory? Move it out; it will confuse the importing of the porter function.
porter<$_>; says "try to read a line from the filehandle $_ and pass that into porter". $_ isn't a filehandle, it's a line from the file you just opened. You want porter($_) to pass the line into the porter function. If you turn on warnings (add use warnings to the top of your script) Perl will warn you about mistakes like that.
You'll also presumably want to do something with the return value from porter, otherwise it will truly do nothing: my @whatever_porter_returns = porter($_).
Likely one or more of your chdir or open calls has silently failed, so your program may have no input. Unfortunately, Perl does not let you know when this happens; you have to check. Normally you add an or die $! after the function to check for the error. This is busy work and easy to forget. Instead, you can use autodie, which will automatically produce an error if any system calls like chdir or open fail.
With that stuff fixed your code should work, or at least produce useful error messages.
Finally, there are many stemming modules on CPAN which are likely to be higher quality than the one you've found with documentation and tests and updates and all that. Lingua::Stem and Text::English specifically use the porter algorithm. You might want to give those a shot.
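The call-and-capture fix can be sketched as follows. The porter sub here is a hypothetical stand-in so the example runs on its own; it is not the Porter algorithm, and in the real script it would be the sub imported by use porter. Since the module's header says porter takes a single word, the sketch splits each line into words first, and the helper name stem_line is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in so the sketch runs on its own; in the real script
# this would be the porter() sub imported with "use porter;".
sub porter {
    my ($word) = @_;
    $word =~ s/(?:ing|ed|s)$//;    # NOT the Porter algorithm, just a placeholder
    return $word;
}

# Stem every word of a line and return the rebuilt line.
sub stem_line {
    my ($line) = @_;
    return join ' ', map { porter($_) } split ' ', $line;
}

print stem_line("walking dogs barked"), "\n";
```

The key points are the parentheses in porter($_) and the fact that the return value is used rather than discarded.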

Writing a macro in Perl

open $FP, '>', $outfile or die $outfile." Cannot open file for writing\n";
I have this statement a lot of times in my code.
I want to keep the format same for all of those statements, so that when something is changed, it is only changed at one place.
In Perl, how should I go about resolving this situation?
Should I use macros or functions?
I have seen this SO thread How can I use macros in Perl?, but it doesn't say much about how to write a general macro like
#define fw(FP, outfile) open $FP, '>', \
$outfile or die $outfile." Cannot open file for writing\n";
First, you should write that as:
open my $FP, '>', $outfile or die "Could not open '$outfile' for writing:$!";
including the reason why open failed.
If you want to encapsulate that, you can write:
use Carp;
sub openex {
my ($mode, $filename) = @_;
open my $h, $mode, $filename
or croak "Could not open '$filename': $!";
return $h;
}
# later
my $FP = openex('>', $outfile);
Starting with Perl 5.10.1, autodie is in the core and I will second Chas. Owens' recommendation to use it.
Perl 5 really doesn't have macros (there are source filters, but they are dangerous and ugly, so ugly even I won't link you to the documentation). A function may be the right choice, but you will find that it makes it harder for new people to read your code. A better option may be to use the autodie pragma (it is core as of Perl 5.10.1) and just cut out the or die part.
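A short sketch of the autodie approach. The file names are placeholders chosen for this example, and the second open is expected to fail only because that path is assumed not to exist.

```perl
use strict;
use warnings;
use autodie;    # open, close and friends now throw on failure

my $outfile = "autodie-demo.txt";    # hypothetical file name

open my $FP, '>', $outfile;          # no "or die" needed
print {$FP} "hello\n";
close $FP;

# A failing open raises an exception that describes what went wrong.
my $ok = eval { open my $fh, '<', "/no/such/dir/file.txt"; 1 };
print "open failed: $@" unless $ok;
```

The error text then comes from autodie rather than from your own die string, so the format stays the same everywhere without a macro or a wrapper sub.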
Another option, if you use Vim, is to use snipMate. You just type fw<tab>FP<tab>outfile<tab> and it produces
open my $FP, '>', $outfile
or die "Couldn't open $outfile for writing: $!\n";
The snipMate text is
snippet fw
open my $${1:filehandle}, ">", $${2:filename variable}
or die "Couldn't open $$2 for writing: $!\n";
${3}
I believe other editors have similar capabilities, but I am a Vim user.
There are several ways to handle something similar to a C macro in Perl: a source filter, a subroutine, Template::Toolkit, or use features in your text editor.
Source Filters
If you gotta have a C / CPP style preprocessor macro, it is possible to write one in Perl (or, actually, any language) using a precompile source filter. You can write fairly simple to complex Perl classes that operate on the text of your source code and perform transformations on it before the code goes to the Perl compiler. You can even run your Perl code directly through a CPP preprocessor to get the exact type of macro expansions you get in C / CPP using Filter::CPP.
Damian Conway's Filter::Simple is part of the Perl core distribution. With Filter::Simple, you could easily write a simple module to perform the macro you are describing. An example:
package myopinion;
# save in your Perl's @INC path as "myopinion.pm"...
use Filter::Simple;
FILTER {
s/Hogs/Pigs/g;
s/Hawgs/Hogs/g;
}
1;
Then a Perl file:
use myopinion;
print join(' ',"Hogs", 'Hogs', qq/Hawgs/, q/Hogs/, "\n");
print "In my opinion, Hogs are Hogs\n\n";
Output:
Pigs Pigs Hogs Pigs
In my opinion, Pigs are Pigs
If you rewrote the FILTER to make the substitution for your desired macro, Filter::Simple should work fine. Filter::Simple can be restricted to make substitutions only in parts of your code, such as the executable part but not the POD part, only in strings, or only in code.
Source filters are not widely used in my experience. I have mostly seen them with lame attempts to encrypt Perl source code or humorous Perl obfuscators. In other words, I know it can be done this way but I personally don't know enough about them to recommend them or say not to use them.
Subroutines
Sinan Ünür's openex subroutine is a good way to accomplish this. I will only add that a common older idiom that you will see involves passing a reference to a typeglob like this:
sub opensesame {
my $fn=shift;
local *FH;
return open(FH,$fn) ? *FH : undef;
}
$fh=opensesame('> /tmp/file');
Read perldata for why it is this way...
Template Toolkit
Template::Toolkit can be used to process Perl source code. For example, you could write a template along the lines of:
[% fw(fp, outfile) %]
running that through Template::Toolkit can result in expansion and substitution to:
open my $FP, '>', $outfile or die "$outfile could not be opened for writing:$!";
Template::Toolkit is most often used to separate the messy HTML and other presentation code from the application code in web apps. Template::Toolkit is very actively developed and well documented. If your only use is a macro of the type you are suggesting, it may be overkill.
Text Editors
Chas. Owens has a method using Vim. I use BBEdit and could easily write a Text Factory to replace the skeleton of an open with the precise and evolving open that I want to use. Alternately, you can place a completion template in your "Resources" directory in the "Perl" folder. These completion skeletons are used when you press the series of keys you define. Almost any serious editor will have similar functionality.
With BBEdit, you can even use Perl code in your text replacement logic. I use Perl::Critic this way. You could use Template::Toolkit inside BBEdit to process the macros with some intelligence. It can be set up so the source code is not changed by the template until you output a version to test or compile; the editor is essentially acting as a preprocessor.
Two potential issues with using a text editor. First, it is a one-way, one-time transform. If you want to change what your "macro" does, you can't, since the previous text of your "macro" has already been expanded; you have to change each occurrence manually. Second, if you use a template form, you can't send the macro version of the source code to someone else, because the preprocessing is being done inside the editor.
Don't Do This!
If you type perl -h to get valid command switches, one option you may see is:
-P run program through C preprocessor before compilation
Tempting! Yes, you can run your Perl code through the C preprocessor and expand C style macros and have #defines. Put down that gun; walk away; don't do it. There are many platform incompatibilities and language incompatibilities.
You get issues like this:
#!/usr/bin/perl -P
#define BIG small
print "BIG\n";
print qq(BIG\n);
Prints:
BIG
small
In Perl 5.12 the -P switch has been removed...
Conclusion
The most flexible solution here is just to write a subroutine. All your code is visible in the subroutine, it is easily changed, and the call is shorter. No real downside other than, potentially, the readability of your code.
Template::Toolkit is widely used. You can write complex replacements that act like macros or even more complex than C macros. If your need for macros is worth the learning curve, use Template::Toolkit.
For very simple cases, use the one way transforms in an editor.
If you really want C style macros, you can use Filter::CPP. This may have the same incompatibilities as the perl -P switch. I cannot recommend this; just learn the Perl way.
If you want to run Perl one liners and Perl regexs against your code before it compiles, use Filter::Simple.
And don't use the -P switch. You can't on newer versions of Perl anyway.
For something like open I think it's useful to include close in your factored-out routine. Here's an approach that looks a bit weird but encapsulates a typical open/close idiom.
our $FP; # package variable that the block reads the handle from
sub with_file_do(&$$) {
my ($code, $mode, $file) = @_;
open my $fp, $mode, $file or die "Could not open '$file': $!"; # honor the requested mode
local $FP = $fp;
$code->(); # perhaps wrap in an eval
close $fp or die "Could not close '$file': $!";
}
# usage
with_file_do {
print $FP "whatever\n";
# other output things with $FP
} '>', $outfile;
Having the open params specified at the end is a bit weird, but it allows you to avoid having to specify the sub keyword.

How can I use Smart::Comments in a module I load without changing its source?

How can I specify that Smart::Comments be loaded for my original script, as well as for any of the modules it directly loads? Since it is a source filter, it would probably wreak havoc if applied to every module loaded by every other loaded module.
For example, my script includes
use Neu::Image;
I would like to load Smart::Comments for Neu::Image as well, but specifying
$ perl -MSmart::Comments script.pl
does not load Smart::Comments for Neu::Image.
This behavior is described in the Smart::Comments documentation:
If you're debugging an application you
can also invoke it with the module
from the command-line:
perl -MSmart::Comments $application.pl
Of course, this only enables smart
comments in the application file
itself, not in any modules that the
application loads.
A few other things that I've looked at already:
Perl Command-Line Options
perldoc perlrun (I searched it
for the word "module")
WORKAROUND
As gbacon mentions, Smart::Comments provides an environment variable option that would allow turning it on or off. However, I would like to be able to turn it on without modifying the original source, if possible.
You almost certainly want to add use Smart::Comments to modules that contain such and then flip the switch in your environment by setting $Smart_Comments appropriately.
Stash-munging, import-hijacking monkey-patching is madness.
But maybe you're into that sort of thing. Say you have Foo.pm:
package Foo;
use Exporter 'import';
our @EXPORT = qw/ foo /;
#use Smart::Comments;
sub foo {
my @result;
for (my $i = 0; $i < 5; $i++) {
### $i
push @result => $i if $i % 2 == 0;
}
wantarray ? @result : \@result;
}
1;
Ordinary usage:
$ perl -MFoo -e 'print foo, "\n"'
024
Ordinary is dull and boring, of course. With run-foo, we take bold, dashing steps!
#! /usr/bin/perl
use warnings;
use strict;
BEGIN {
unshift @INC => \&inject_smart_comments;
my %direct;
open my $fh, "<", $0 or die "$0: open: $!";
while (<$fh>) {
++$direct{$1} if /^\s*use\s+([A-Z][:\w]*)/;
}
close $fh;
sub inject_smart_comments {
my(undef,$path) = @_;
s/[\/\\]/::/g, s/\.pm$// for my $mod = $path;
if ($direct{$mod}) {
open my $fh, "<", $path or die "$0: open $path: $!";
return sub {
return 0 unless defined($_ = <$fh>);
s{^(\s*package\s+[A-Z][:\w]*\s*;\s*)$}
{$1 use Smart::Comments;\n};
return 1;
};
}
}
}
use Foo;
print foo, "\n";
(Please pardon the compactness: I shrunk it so it would all fit in an unscrolled block.)
Output:
$ ./run-foo
### $i: 0
### $i: 1
### $i: 2
### $i: 3
### $i: 4
024
¡Viva!
With @INC hooks we can substitute our own or modified sources. The code watches for attempts to require modules directly used by the program. On a hit, inject_smart_comments returns an iterator that yields one line at a time. When this crafty, artful iterator sees the package declaration, it appends an innocent-looking use Smart::Comments to the chunk, making it appear as though it were in the module's source all along.
Because it parses Perl code with regular expressions, the code will break if the package declaration isn't on a line by itself, for example. Season to taste.
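Stripped of the Smart::Comments injection, the hook mechanism itself looks roughly like this. The module name Hypothetical::Greeter and its source text are invented for the sketch; the hook serves that source from an in-memory filehandle and declines every other request.

```perl
use strict;
use warnings;

my $greeting;

BEGIN {
    # A code reference in @INC is called as ($self, $filename) for each require.
    unshift @INC, sub {
        my (undef, $file) = @_;
        return unless $file eq 'Hypothetical/Greeter.pm';    # decline: try the next @INC entry

        my $src = "package Hypothetical::Greeter;\n"
                . "sub greet { return 'hello' }\n"
                . "1;\n";
        open my $fh, '<', \$src or die "in-memory open failed: $!";
        return $fh;    # Perl reads the module source from this handle
    };
}

use Hypothetical::Greeter;    # satisfied by the hook, not by a file on disk

$greeting = Hypothetical::Greeter::greet();
print "$greeting\n";
```

Returning a filehandle is the simplest form; the line-by-line generator used in run-foo is what makes it possible to rewrite the source on the way in.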
It doesn't seem like this idea makes any sense. If you are utilizing Smart::Comments in a module, why would you not want to use Smart::Comments in that module's source? Even if you could get Smart::Comments to apply to all modules loaded in a script via -M, it probably wouldn't be a good idea because:
You're obfuscating the fact that your modules are using smart comments by not including the use line in their source.
You could potentially introduce bizarre behavior from modules you use in your script which happen to have what look like smart comments, but aren't really. If a module doesn't contain smart comments, you should not force them down its throat.
As gbacon said, the right way to do this is to use the module in each of your modules that make use of it, and then suppress them with an environment variable when you don't want the output.
Also as he said, it's still probably possible to do this with some "Stash-munging, import-hijacking monkey-patching" madness, but that's a lot of work. I don't think anyone is going to put the effort into giving you a solution along those lines when it is not a good idea in the first place.