What should be the first line of a Perl test (.t) script? - perl

I was looking through my CPAN distributions and realized that I had various inconsistent things at the top of my .t scripts, based on where I'd cargo-culted them from. This of course offends me.
So, what's the "best" first line of a Perl test (.t) script? A non-scientific survey of my .cpanm sources showed me:
3429 use strict;
3211 #!/usr/bin/perl
1344 #!/usr/bin/env perl
937 #!perl
909 #!/usr/bin/perl -w
801 #!perl -w
596
539 #!perl -T
Related to What should I use for a Perl script's shebang line?, but here I'm wondering if the shebang is necessary/useful at all, if tests are always expected to be called from prove.

use strict; should be in EVERY Perl program (as well as use warnings; and probably a few others (depending who you talk to). It doesn't have to be the first line.
The rest of the stuff you present are merely multiple versions of the shebang. On Unix and Unix based systems, the shebang is used to tell the interpreter to use. If the shebang isn't there, the current shell will be used. (That is, if your shell supports the shebang1.).
If you are on a Windows system, that shebang line is useless. Windows uses the suffix of the file to figure out how to execute the file.
The question is whether you really even need a shebang line in scripts that you run through a test harness. I don't normally execute these test scripts individually, so there may not really be a need for the shebang at all. In the rare event you do want to execute one of these scripts, you could always run them as an argument to the perl command itself. The same is true with Perl modules.
So, why the shebang? Mainly habit. I put one too on my test scripts and on my modules. If nothing else, it makes it easy to run modules and test scripts independently for testing purposes. Plus, the shebang lets people know this is a Perl script and not a shell script or a Python script.
There's a little issue with the shebang though, and that has to do with portability. If I put #! /usr/bin/perl on my first line, and the Perl interpreter is in #! /usr/local/bin/perl, I'll get an error. Thus, many of the shebangs reflect the location of the Perl interpreter for that developer.
There's also another issue: I use Perlbrew which allows me to install multiple versions of Perl. The version of Perl I currently have in my shell is at /Users/David/perl5/perlbrew/perls/perl-5.18.0/bin/perl. That's not good if I pass that program to another user on another system. Or, if I decide I want to test my Perl script under 5.10.
The #! perl variant is an attempt to solve this. On SunOS (I believe), this would execute the Perl that's in your $PATH (since it could be /usr/local/bin/perl or /usr/share/bin/perl or /usr/bin/perl depending upon the system). However, this does not work on most Unix boxes. For example, on my Mac, I would get a bad interpreter error.
I use #! /usr/bin/env perl. The env program takes the argument perl, sees where perl is on my $PATH, and then executes the full path of wherever Perl happens to be on my $PATH. This is the most universal way to do the shebang and the way I recommend. It allows those of use to use Perlbrew without problem.
Note, I said almost universal, and not universal. On almost all systems, env is located in /usr/bin, but there apparently some systems out there where env is either on /bin, located elsewhere, or not on the system at all.
The -w turns on warnings when executing Perl. Before Perl 5.6, you used this to turn on warnings. After Perl 5.6, you can use the more flexible use warnings; and the -w is now considered deprecated2. Programs may use it because they were originally written pre 5.6, and never modified or the developer simply did it out of habit.
The -T has to do with taint mode. Taint mode is normally used for CGI scripting. Basically, in Taint mode, data from outside a program is considered tainted and cannot be used until untainted. Taint mode also affects #INC and PERLLIB.
The real answer is that there is no set answer. I'd put #! /usr/bin/env perl as my first line out of habit -- even if I never expect to execute that file from a Unix command line. There's no real need for it for these types of scripts that are almost always executed as part of the install and not directly from the command line. What you see is really the result of 30 years of Perl habits and cruft.
1. Your shell supports the shebang unless you're using a 30 year old version of the Bourne shell or a very old version of the C shell.
2. I actually don't know if -w has ever been officially deprecated, but there's no reason to use it.

.t files are still regular Perl script. Use whatever you need for this particular script on first line.

Related

from cron running a subroutine from a perl module

I have a Perl Module that i created and i want to run one of the subroutine in it on a schedule. I know I can just make a small perl script that calls the subroutine and call it from the crontab but if there is a way to call the subroutine right from the crontab that would be cool!
Is this possible?
You can use Perl's -e switch for executing code from the command line, e.g.
perl -e 'use your_module; your_function()'
Make that even shorter with the -M switch for loading a module:
perl -Myour_module -e 'your_function()'
The perlrun man page is your friend.
You can run the subroutine from the command line using something like
perl -MYour::Module=some,functions,to,import,such,as,foo -e 'foo();'
So you will be able to do the same from the crontab. Note that the cron usually runs with a restricted set of environment variables, so you may need to add a -I/path/to/your/modules option.
If you want a more elegant solution, your module can be configured to detect that it is being run as a script and behave differently in that situation. See this discussion: In Perl, how can I find out if my file is being used as a module or run as a script?

How to specify in the script to only use a specific version of perl?

I have a perl script written for version 5.6.1 and which has dependencies on Oracle packages, DBI packages and more stuff. Everything is correctly installed and works.
Recently perl version 5.8.4 got installed and because those dependencies are not correctly set so the script fails.
'perl' command points to /program/perl_v5.8.4/bin/perl -- latest version
So, when I have to run my perl script I have to manually specify in command prompt
/program/perl_v5.6.1/bin/perl scriptName.pl
I tried adding following lines in script:
/program/perl_v5.6.1/bin/perl
use v5.6.1;
But, this means script has to take Perl version > 5.6.1
I found couple of related question which suggested:
To export path. But, I want the script for all the users without have to export path
Specify version greater than. But, I want to use just one specific version for this script.
use/require commands. Same issue as above.
My question: How to specify in the script to only use a specific version of perl?
The problem is that the interpreter specified in the shebang line (#!/some/path/to/perl) is not used if perl script is called like this:
perl some_script.pl
... as the 'default' (to simplify) perl is chosen then. One should use the raw power of shebang instead, by executing the file itself:
./some_script.pl
Of course, that means this file should be made executable (with chmod a+x, for example).
I have in my code:
our $LEVEL = '5.10.1';
our $BRACKETLEVEL = sprintf "%d.%03d%03d", split/\./, $LEVEL;
if ($] != $currentperl::BRACKETLEVEL)
{
die sprintf "Must use perl %s, this is %vd!\n", $LEVEL, $^V;
}
These are actually two different modules, but that's the basic idea. I simply "use correctlevel" at the top of my script instead of use 5.10.1; and I get this die if a developer tries using the wrong level of perl for that product. It does not, however, do anything else that use 5.10.1; would do (enable strict, enable features like say, switch, etc.).

What is the significance of -T or -w in #!/usr/bin/perl?

I googled about #!/usr/bin/perl, but I could not find any satisfactory answer. I know it’s a pretty basic thing, but still, could explain me what is the significance of #!/usr/bin/perl in Perl? Moreover, what does -w or -T signify in #!/usr/bin/perl? I am a newbie to Perl, so please be patient.
The #! is commonly called a "shebang" and it tells the computer how to run a script. You'll also see lots of shell-scripts with #!/bin/sh or #!/bin/bash.
So, /usr/bin/perl is your Perl interpreter and it is run and given the file to execute.
The rest of the line are options for Perl. The "-T" is tainting (it means input is marked as "not trusted" until you check it's format). The "-w" turns warnings on.
You can find out more by running perldoc perlrun (perldoc is Perl's documentation reader, might be installed, might be in its own package).
For scripts you write I would recommend starting them with:
#!/usr/bin/perl
use warnings;
use strict;
This turns on lots of warnings and extra checks - especially useful while you are learning (I'm still learning and I've been using Perl for more than 10 years now).
Both -w and -T are sort of "foolproof" flags.
-w is the same as use warning statement in your code, and it's an equivalent of warning option in many compilers. A simplest example would be a warning about using uninitialized variable:
#!/usr/bin/perl -w
print "$A\n";
print "Hello, world!\n";
Will print:
Name "main::A" used only once: possible typo at ./perl-warnings line 3.
Use of uninitialized value $A in concatenation (.) or string at
./perl-warnings line 3.
Hello, world!
The -T flag means that any value that came from the outside world (as opposite to being calculated inside the program) is considered potential threat, and disallows usage of such values in system-related operations, like writing files, executing system command, etc. (That's why Perl would activate the "taint" mode when the script is running under setuid/setgid.)
The "tainted" mode is "enforcing" you to double-check the value inside the script.
E.g., the code:
#!/usr/bin/perl -T
$A = shift;
open FILE, ">$A";
print "$A\n";
close FILE;
Will produce a fatal error (terminating the program):
$ ./perl-tainted jkjk
Insecure dependency in open while running with -T switch at
./perl-tainted line 3.
And that's only because the argument value came from "outside" and was not "double-checked". The "taint" mode is drawing your attention to that fact. Of course, it's easy to fool it, e.g.:
#!/usr/bin/perl -T
$A = shift;
$A = $1 if $A =~ /(^.*$)/;
open FILE, ">$A";
print "$A\n";
close FILE;
In this case everything worked fine. You "fooled" the "taint mode". Well, the assumption is that programer's intentions are to make the program safer, so the programmer wouldn't just work around the error, but would rather take some security measures. One of Perl's nicknames is "the glue and the duct tape of system administrators". It's not unlikely that system administrator would create Perl script for his own needs and would run it with root permissions. Think of this script doing something normal users are not allowed to do... you probably want to double-check things which are not part of the program itself, and you want Perl to remind you about them.
Hope it helps.
about Taint Mode(-T):
require and use statements change when taint mode is turned on.
The path to load libraries/modules no longer contains . (the current directory) from its path.
So if you load any libraries or modules relative to the current working directory without explicitly specifying the path, your script will break under taint mode.
For ex: Consider perl_taint_ex.pl
#!/usr/bin/perl -T
require "abc.pl";
print "Done";
would fail like this
D:\perlex>perl perl_taint_ex.pl
"-T" is on the #! line, it must also be used on the command line
at perl_taint_ex.pl line 1.
D:\perlex>perl -T perl_taint_ex.pl
Can't locate abc.pl in #INC (#INC contains: C:/Perl/site/lib C:/Perl/lib)
at perl_taint_ex.pl line 3.
So when taint mode is on, you must tell the require statement explicitly where to load the library since . is removed during taint mode from the #INC array.
#INC contains a list of valid paths to read library files and modules from.
If taint mode is on, you would simply do the following:
D:\perlex>perl -ID:\perlex -T perl_taint_ex.pl
Done
-ID:\perlex will include directory D:\perlex in #INC.
You can try other ways for adding path to #INC,this is just one example.
It's called a shebang. On Unix based systems (OSX, Linux, etc...) that line indicates the path to the language interpreter when the script is run from the command line. In the case of perl /usr/bin/perl is the path to the perl interpreter. If the hashbang is left out the *nix systems won't know how to parse the script when invoked as an executable. It will instead try to interpret the script in whatever shell the user happens to be running (probably bash) and break the script.
http://en.wikipedia.org/wiki/Hashbang
The -W and -T are arguments that controll the way the perl interpreter operates. They are the same arguments that you could invoke when calling perl interpreter directly from the command line.
-W shows warnings (aka debuging information).
-T turns on taint / security checking.

invoking perl scripts

I have perl scripts starting with #!/usr/bin/perl or #!/usr/bin/env perl
First, what does the second version mean?
Second, I use Ubuntu. All the scripts are set as executables. When I try to run a script by simply invoking it's name (e.g. ./script.pl) I get : No such file or directory. when I invoke by perl ./script.pl it runs fine.
Why?
The #!/usr/bin/env perl uses the standard POSIX tool env to work around the "problem" that UNIX doesn't support relative paths in shebang lines (AFAIK). The env tool can be used to start a program (in this case perl) after modifying environment variables. In this case, no variables are modified and env then searches the PATH for Perl and runs it. Thus a script with that particular shebang line will work even when Perl is not installed in /usr/bin but in some other path (which must be in the PATH variable).
Then, you problem with ./script.pl not working: you said it has the executable bit(s) set, like with chmod +x script.pl ? But does it also start with a shebang (#!) line ? That is, the very first two bytes must be #! and it must be followed by a file path (to perl). That is necessary to tell the kernel with which program to run this script. If you have done so, is the path correct ? You want to try the #!/usr/bin/env perl variant ;-)
Using #!/usr/bin/env perl gets around the problem of perl not necessarily being in /usr/bin on every system; it's just there to make the script more portable
On a related note, for your second problem, is there a /usr/bin/perl and/or /usr/bin/env? If not, that would explain why running the scripts directly doesn't work; the shebang isn't handled if you run the script as an argument to perl

How can I make Perl's a2p support gawk?

I have some awk scripts that use gawk from cygwin. Now I need to pass these scripts to colleagues that don't have cygwin installed, but do have Perl. I was hoping I can just use a2p that is included in cygwin, but it fails with errors like the following:
Undefined subroutine &main::gensub called at ./t.pl line 18, <> line 1.
I am hoping there are existing Perl libraries/modules that implement these methods. Any pointers?
The gensub() function isn't supported by a2p. If you modify your code to use gsub() instead then it should compile.
Alternatively, you could add a gensub() subroutine to the end of the translated Perl program to simulate the gensub() functionality.
However, the Perl code produced by a2p isn't very maintainable so I'd only use it as a last resort.
If your gawk program doesn't make calls to other cygwin/unix utilities then it would probably be better to just distribute a Windows gawk executable to your colleagues along with the program.