What is the significance of -T or -w in #!/usr/bin/perl? - perl

I googled about #!/usr/bin/perl, but I could not find any satisfactory answer. I know it’s a pretty basic thing, but still, could explain me what is the significance of #!/usr/bin/perl in Perl? Moreover, what does -w or -T signify in #!/usr/bin/perl? I am a newbie to Perl, so please be patient.

The #! is commonly called a "shebang" and it tells the computer how to run a script. You'll also see lots of shell-scripts with #!/bin/sh or #!/bin/bash.
So, /usr/bin/perl is your Perl interpreter and it is run and given the file to execute.
The rest of the line are options for Perl. The "-T" is tainting (it means input is marked as "not trusted" until you check it's format). The "-w" turns warnings on.
You can find out more by running perldoc perlrun (perldoc is Perl's documentation reader, might be installed, might be in its own package).
For scripts you write I would recommend starting them with:
#!/usr/bin/perl
use warnings;
use strict;
This turns on lots of warnings and extra checks - especially useful while you are learning (I'm still learning and I've been using Perl for more than 10 years now).

Both -w and -T are sort of "foolproof" flags.
-w is the same as use warning statement in your code, and it's an equivalent of warning option in many compilers. A simplest example would be a warning about using uninitialized variable:
#!/usr/bin/perl -w
print "$A\n";
print "Hello, world!\n";
Will print:
Name "main::A" used only once: possible typo at ./perl-warnings line 3.
Use of uninitialized value $A in concatenation (.) or string at
./perl-warnings line 3.
Hello, world!
The -T flag means that any value that came from the outside world (as opposite to being calculated inside the program) is considered potential threat, and disallows usage of such values in system-related operations, like writing files, executing system command, etc. (That's why Perl would activate the "taint" mode when the script is running under setuid/setgid.)
The "tainted" mode is "enforcing" you to double-check the value inside the script.
E.g., the code:
#!/usr/bin/perl -T
$A = shift;
open FILE, ">$A";
print "$A\n";
close FILE;
Will produce a fatal error (terminating the program):
$ ./perl-tainted jkjk
Insecure dependency in open while running with -T switch at
./perl-tainted line 3.
And that's only because the argument value came from "outside" and was not "double-checked". The "taint" mode is drawing your attention to that fact. Of course, it's easy to fool it, e.g.:
#!/usr/bin/perl -T
$A = shift;
$A = $1 if $A =~ /(^.*$)/;
open FILE, ">$A";
print "$A\n";
close FILE;
In this case everything worked fine. You "fooled" the "taint mode". Well, the assumption is that programer's intentions are to make the program safer, so the programmer wouldn't just work around the error, but would rather take some security measures. One of Perl's nicknames is "the glue and the duct tape of system administrators". It's not unlikely that system administrator would create Perl script for his own needs and would run it with root permissions. Think of this script doing something normal users are not allowed to do... you probably want to double-check things which are not part of the program itself, and you want Perl to remind you about them.
Hope it helps.

about Taint Mode(-T):
require and use statements change when taint mode is turned on.
The path to load libraries/modules no longer contains . (the current directory) from its path.
So if you load any libraries or modules relative to the current working directory without explicitly specifying the path, your script will break under taint mode.
For ex: Consider perl_taint_ex.pl
#!/usr/bin/perl -T
require "abc.pl";
print "Done";
would fail like this
D:\perlex>perl perl_taint_ex.pl
"-T" is on the #! line, it must also be used on the command line
at perl_taint_ex.pl line 1.
D:\perlex>perl -T perl_taint_ex.pl
Can't locate abc.pl in #INC (#INC contains: C:/Perl/site/lib C:/Perl/lib)
at perl_taint_ex.pl line 3.
So when taint mode is on, you must tell the require statement explicitly where to load the library since . is removed during taint mode from the #INC array.
#INC contains a list of valid paths to read library files and modules from.
If taint mode is on, you would simply do the following:
D:\perlex>perl -ID:\perlex -T perl_taint_ex.pl
Done
-ID:\perlex will include directory D:\perlex in #INC.
You can try other ways for adding path to #INC,this is just one example.

It's called a shebang. On Unix based systems (OSX, Linux, etc...) that line indicates the path to the language interpreter when the script is run from the command line. In the case of perl /usr/bin/perl is the path to the perl interpreter. If the hashbang is left out the *nix systems won't know how to parse the script when invoked as an executable. It will instead try to interpret the script in whatever shell the user happens to be running (probably bash) and break the script.
http://en.wikipedia.org/wiki/Hashbang
The -W and -T are arguments that controll the way the perl interpreter operates. They are the same arguments that you could invoke when calling perl interpreter directly from the command line.
-W shows warnings (aka debuging information).
-T turns on taint / security checking.

Related

What should be the first line of a Perl test (.t) script?

I was looking through my CPAN distributions and realized that I had various inconsistent things at the top of my .t scripts, based on where I'd cargo-culted them from. This of course offends me.
So, what's the "best" first line of a Perl test (.t) script? A non-scientific survey of my .cpanm sources showed me:
3429 use strict;
3211 #!/usr/bin/perl
1344 #!/usr/bin/env perl
937 #!perl
909 #!/usr/bin/perl -w
801 #!perl -w
596
539 #!perl -T
Related to What should I use for a Perl script's shebang line?, but here I'm wondering if the shebang is necessary/useful at all, if tests are always expected to be called from prove.
use strict; should be in EVERY Perl program (as well as use warnings; and probably a few others (depending who you talk to). It doesn't have to be the first line.
The rest of the stuff you present are merely multiple versions of the shebang. On Unix and Unix based systems, the shebang is used to tell the interpreter to use. If the shebang isn't there, the current shell will be used. (That is, if your shell supports the shebang1.).
If you are on a Windows system, that shebang line is useless. Windows uses the suffix of the file to figure out how to execute the file.
The question is whether you really even need a shebang line in scripts that you run through a test harness. I don't normally execute these test scripts individually, so there may not really be a need for the shebang at all. In the rare event you do want to execute one of these scripts, you could always run them as an argument to the perl command itself. The same is true with Perl modules.
So, why the shebang? Mainly habit. I put one too on my test scripts and on my modules. If nothing else, it makes it easy to run modules and test scripts independently for testing purposes. Plus, the shebang lets people know this is a Perl script and not a shell script or a Python script.
There's a little issue with the shebang though, and that has to do with portability. If I put #! /usr/bin/perl on my first line, and the Perl interpreter is in #! /usr/local/bin/perl, I'll get an error. Thus, many of the shebangs reflect the location of the Perl interpreter for that developer.
There's also another issue: I use Perlbrew which allows me to install multiple versions of Perl. The version of Perl I currently have in my shell is at /Users/David/perl5/perlbrew/perls/perl-5.18.0/bin/perl. That's not good if I pass that program to another user on another system. Or, if I decide I want to test my Perl script under 5.10.
The #! perl variant is an attempt to solve this. On SunOS (I believe), this would execute the Perl that's in your $PATH (since it could be /usr/local/bin/perl or /usr/share/bin/perl or /usr/bin/perl depending upon the system). However, this does not work on most Unix boxes. For example, on my Mac, I would get a bad interpreter error.
I use #! /usr/bin/env perl. The env program takes the argument perl, sees where perl is on my $PATH, and then executes the full path of wherever Perl happens to be on my $PATH. This is the most universal way to do the shebang and the way I recommend. It allows those of use to use Perlbrew without problem.
Note, I said almost universal, and not universal. On almost all systems, env is located in /usr/bin, but there apparently some systems out there where env is either on /bin, located elsewhere, or not on the system at all.
The -w turns on warnings when executing Perl. Before Perl 5.6, you used this to turn on warnings. After Perl 5.6, you can use the more flexible use warnings; and the -w is now considered deprecated2. Programs may use it because they were originally written pre 5.6, and never modified or the developer simply did it out of habit.
The -T has to do with taint mode. Taint mode is normally used for CGI scripting. Basically, in Taint mode, data from outside a program is considered tainted and cannot be used until untainted. Taint mode also affects #INC and PERLLIB.
The real answer is that there is no set answer. I'd put #! /usr/bin/env perl as my first line out of habit -- even if I never expect to execute that file from a Unix command line. There's no real need for it for these types of scripts that are almost always executed as part of the install and not directly from the command line. What you see is really the result of 30 years of Perl habits and cruft.
1. Your shell supports the shebang unless you're using a 30 year old version of the Bourne shell or a very old version of the C shell.
2. I actually don't know if -w has ever been officially deprecated, but there's no reason to use it.
.t files are still regular Perl script. Use whatever you need for this particular script on first line.

How to find path of find2perl script on Unix using bash or perl

We (the company I work for) need to run the find2perl script on over a thousand different Unix servers of different flavors (Linux, Solaris, HP-UX, AIX) and different versions.
The one thing that all the servers have in common, is that they all have at least one implementation of perl installed. However, not all systems have it configured the same way.
Finding the location of perl is easy enough using the which command. However, on 70% of the servers, the actual directory containing find2perl (the bin folder of perl) is not present in the $PATH variable and can't be located that way.
On some servers, perl is actually a symbolic link pointing another location, in which case I can use ls -l and sed to extract the target of the link to find where perl is actually installed.
On other servers however, it's more complicated, as it seems perl was compiled to a custom location and the binary of perl present in /bin or /usr/bin (or wherever perl is found) is not a symbolic link, but rather a full blown executable. In this case, I thought about using the #INC variable of perl to try to find find2perl but it seems rather excessive.
What would be the better/best/fullproof method (one-liner if possible) to always get the location of find2perl on a Unix system?
Ways to locate find2perl
Two ways, both of which rely on asking the perl install how it was configured:
Config.pm
Its probably scriptdirexp from Config.pm.
$ perl -MConfig -E 'say $Config{scriptdirexp}'
/usr/bin
And indeed, that's where find2perl is on my system. You can use Config; in your perl scripts, which is its major advantage over the next method.
perl -V:varname
As per Yanick Girouard's comment, you can also use perl -V:scriptdirexp to get this, in a format suitable to passing to eval in a shell script. There are actually several formats available (so, you don't need to use e.g., cut to parse it):
OPTION OUTPUT (\n = actual newline) NOTES
-V:scriptdirexp scriptdirexp='/usr/bin';\n full shell syntax, even if multiple -V options
-V:scriptdirexp: scriptdirexp='/usr/bin' trailing colon omits semicolon and newline
-V::scriptdirexp '/usr/bin'; \n extra leading colon omits var= part
-V::scriptdirexp: '/usr/bin' you can combine them.
Full documentation is in the perlrun manpage.
Ways to embed find2perl
If you decide to copy over find2perl, as per evil otto's comment, you can actually do that by embedding it in your shell script. There are many ways. If neither of the two below work, then you can certainly use shar (which has an extremely long history, and is likely compatible with everything).
Quoted here-document
The easiest way is if your shell supports quoted here-documents. They all should, as its a POSIX requirement:
#!/bin/sh
perl - -name 'foo' -mtime 2 -print <<'FIND2PERL'
#!/usr/bin/perl
eval 'exec /usr/bin/perl -S $0 ${1+"$#"}'
if $running_under_some_shell;
⋮
FIND2PERL
Hex dump in a non-quoted here-document
If some of your shells don't implement quoted here-documents (POSIX‽ what's that!), then you have to protect find2perl from shell expansion. An easy way is to hex dump it, as 0–9 and a–f are all safe from shell expansion. The dump is easily done with xxd -p /usr/bin/find2perl, which only requires xxd on one machine. To read back the dump, you can use plain perl:
#!/bin/sh
perl -n -e 'chomp; print pack("H*", $_)' <<HEX | perl - -name 'foo'
23212f7573722f62696e2f7065726c0a202020206576616c202765786563
202f7573722f62696e2f7065726c202d5320243020247b312b222440227d
⋮
HEX
Using find2perl several times
Naturally, with either approach, you could also write find2perl to a temporary file (if you need to invoke it multiple times, for example). You could also embed it in a shell function.
perl -lwe '$_ = $^X; s/perl$/find2perl/; -f or die qq($_ not -f); print'
Copy the interpreter executable path into dollar default argument. Patch the value, assuming that find2perl is in the same directory as perl itself. (This is specified as UNIX only, so you don't have to cater for perl.exe, which would be easy enough to deal with.) Then test the file exists, and die if it doesn't. (You might invent some better error handling.) Then print the path if we're still alive. That's it.
Okay, here's a version that works for Windows, too:
perl -lwe "$_ = $^X; s/perl(\.exe)?$/find2perl/;
-f or -f qq($_.bat) or die qq($_ not -f); print"
Note the double quotes, de rigueur on Windows for cmd.exe. And it has to go on one line, I just wrapped it for readability.

Could someone tell me what this means in Perl

I'm new to Perl and was hoping someone could tell me what this means exactly
eval 'exec ${PERLHOME}/bin/perl -S $0 ${1+"$#"}' # -*- perl -*-
if 0;
This is explained in perldoc perlrun:
-S
makes Perl use the PATH environment variable to search for the program
unless the name of the program contains path separators.
...
Typically this is used to emulate #! startup on platforms that don't
support #! . It's also convenient when debugging a script that uses
#! , and is thus normally found by the shell's $PATH search
mechanism.
This example works on many platforms that have a shell compatible with
Bourne shell:
#!/usr/bin/perl
eval 'exec /usr/bin/perl -wS $0 ${1+"$#"}'
if $running_under_some_shell;
The system ignores the first line and feeds the program to /bin/sh,
which proceeds to try to execute the Perl program as a shell script.
The shell executes the second line as a normal shell command, and thus
starts up the Perl interpreter. On some systems $0 doesn't always
contain the full pathname, so the -S tells Perl to search for the
program if necessary. After Perl locates the program, it parses the
lines and ignores them because the variable
$running_under_some_shell is never true. If the program will be
interpreted by csh, you will need to replace ${1+"$#"} with $* ,
even though that doesn't understand embedded spaces (and such) in the
argument list. To start up sh rather than csh, some systems may
have to replace the #! line with a line containing just a colon,
which will be politely ignored by Perl.
In short, it mimics shebang behavior for platforms that have shells compatible with Bash.
It's valid both as shell script and as a Perl program. It is used to run the Perl interpreter after all on systems where the shebang doesn't work, for some reason. It's rarely seen these days but used to be common in the early 1990s.
The comment is just a comment, but it has special meaning in Emacs, which will open the file in perl mode.
I just read #Zaid's response, which is better and more correct than mine as long as this code is on the first line of the script being executed, and no shebang exists. I've never seen this kind of substitute. Quite interesting, really.
The second line, if 0; is a part of the first line. You can tell since the first line lacks a ;. It would be more obvious if this was one long single line with the comment being after the semicolon.
So it's equivalent to:
if(0) {
eval 'exec ${PERLHOME}/bin/perl -S $0 ${1+"$#"}
}
In perl, 0 will be evaluated to false, and so the eval-clause will never execute. Presumably this condition(the if) was a quick way to disable the line. Perhaps the evaluation was once something real instead of an always-false.
See perl --help, perldoc -f eval and perldoc -f exec for information on the evaluation block itself.
The remaining trickyness (${1+"$#"}) I have no idea about. This isn't perl anyway; it's interpreted by whichever shell exec is launching (Correct me if I'm wrong on this!). If it's bash, I don't think it does anything at all and can be substituted with $#, which is the environment variable holding all commandline arguments (ie #ARGV in perl).

perl: force the use of command line flags?

I often write one-liners on the command line like so:
perl -Magic -wlnaF'\t' -i.orig -e 'abracadabra($_) for (#F)'
In order to scriptify this, I could pass the same flags to the shebang line:
#!/usr/bin/perl -Magic -wlnaF'\t' -i.orig
abracadabra($_) for (#F);
However, there's two problems with this. First, if someone invokes the script by passing it to perl directly (as 'perl script.pl', as opposed to './script.pl'), the flags are ignored. Also, I can't use "/usr/bin/env perl" for this because apparently I can't pass arguments to perl when calling it with env, so I can't use a different perl installation.
Is there anyway to tell a script "Hey, always run as though you were invoked with -wlnaF'\t' -i.orig"?
You're incorrect about the perl script.pl version; Perl specifically looks for and parses options out of a #! line, even on non-Unix and if run as a script instead of directly.
The #! line is always examined for switches as the line is being
parsed. Thus, if you're on a machine that allows only one argument
with the #! line, or worse, doesn't even recognize the #! line, you
still can get consistent switch behavior regardless of how Perl was
invoked, even if -x was used to find the beginning of the program.
(...)
Parsing of the #! switches starts wherever perl is mentioned in the
line. The sequences "-*" and "- " are specifically ignored so that you
could, if you were so inclined, say
#!/bin/sh
#! -*-perl-*-
eval 'exec perl -x -wS $0 ${1+"$#"}'
if 0;
to let Perl see the -p switch.
Now, the above quote expects perl -x, but it works just as well if you start the script with
#! /usr/bin/env perl -*-perl -p-*-
(with enough characters to get past the 32-character limit on systems with that limit; see perldoc perlrun for details on that and the rest of what I quoted above).
I had the same problem with #!env perl -..., and env ended up being helpful:
$ env 'perl -w'
env: ‘perl -w’: No such file or directory
env: use -[v]S to pass options in shebang lines
So, just modify the shebang to #!/usr/bin/env -S perl -...

Why does TextMate always complain 'Can't find string terminator '"'' when it runs a Perl script?

I have a long-ish Perl script that runs just fine, but always gives this warning:
Can't find string terminator '"' anywhere before EOF at -e line 1
I've read elsewhere online that this is because of a misuse of single or double quotes and the error generally stops the script from running, but mine works. I'm pretty sure I've used my quotes correctly.
Is there anything else that could cause this warning?
EDIT:
I'm running the script via TextMate, which may be spawning a new Perl process to run my script.
I actually get the error when I run simple scripts as well, like this one:
#!/usr/bin/perl -w
use strict;
use warnings;
print "Hello world.";
Yes, you are right, your script does that in TextMate when I try it too.
Simple solution: don't run it using TextMate; just use the command line:
cd Projectdirectory
chmod +x myscript.pl
./myscript.pl
Hello world
More complex solution: tell TextMate that their application is broken and wait for them to fix it. The error is coming from some other Perl script that TextMate is invoking. Even a completely blank file run as Perl in TextMate fails with this error.
-Alex
The "at -e line 1" bit means it's coming from a one-liner. I suspect your long script is somewhere starting a separate perl process (possibly indirectly), and that perl is what is giving the error (and not doing whatever it is supposed to do.)
Start the debugger by doing
perl -d ./youscript.pl
Then keep pressing n[ENTER] (or just ENTER after you press n once) until you see the warning - the line that was just executed is your culprit. n stands for the next debugger directive btw.