What is the meaning of the following code?
foreach (@items)
{
if (-l $_) ## this is what I don't understand: the meaning of -l
{
...
}
}
Thanks for any help.
Let's look at each thing:
foreach (@items) {
...
}
This for loop (foreach and for are the same keyword in Perl) takes each item from the @items list in turn and aliases it to $_. $_ is a special variable in Perl that serves as a sort of default variable. The idea is that you can do things like this:
foreach (@items) {
s/foo/bar/;
$_ = uc;
print;
}
And each of those commands operates on that $_ variable! If you simply say print with nothing else, it prints whatever is in $_. If you say uc without mentioning a variable, it uppercases whatever is in $_ (uc returns the result rather than changing $_ in place, hence the assignment).
This is now discouraged for several reasons. First, $_ is global, so there might be side effects that are not intended. For example, imagine you call a subroutine that mucked with the value of $_. You would suddenly be surprised that your program doesn't work.
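To make that side effect concrete, here is a minimal sketch. The helper log_message is hypothetical, invented only for illustration; it uses the global $_ as scratch space, and because the loop aliases $_ to each element, the caller's list gets overwritten:

```perl
use strict;
use warnings;

# Hypothetical helper that carelessly uses the global $_ as scratch space.
sub log_message {
    $_ = "LOG: " . shift;    # overwrites whatever the caller had in $_
    return length $_;
}

my @items = ('alpha', 'beta');
for (@items) {
    log_message("processing");    # clobbers $_, which aliases the current element
}
print join(', ', @items), "\n";   # the original values are gone
```

Using a lexical loop variable (for my $item (@items)) avoids this, because the helper no longer shares a variable with the loop.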
The other part, -l, is a file test operator. This operator checks whether the given file is a symbolic link. I've linked to the Perldoc that explains all of the test operators.
If you're not knowledgeable in Unix or BASH/Korn/Bourne shell scripting, having a command that starts with a dash just looks weird. However, much of Perl's syntax was stolen... I mean borrowed from Unix shell and awk commands. In Unix, there's a command called test which you can use like this:
if test -L $FILE
then
....
fi
In Unix, that -L is a parameter to the test command, and in Unix, most parameters to commands start with dashes. Perl simply borrowed the same syntax, dash and all.
Interestingly, if you read the Perldoc for these test commands, you will notice that like the foreach loop, the various test commands will use the $_ variable if you don't give it a variable or file name. Whoever wrote that script could have written their loop like this:
foreach (@items)
{
if (-l) ## Notice no mention of the `$_` variable
{
...
}
}
Yeah, that's soooo much clearer!
Just for your information, the modern way as recommended by many Perl experts (cough Damian Conway cough) is to avoid the $_ variable whenever possible since it doesn't really add clarity and can cause problems. He also recommends just saying for and forgetting foreach, and using curly braces on the same line:
for my $file (@items) {
if ( -l $file ) {
...
}
}
That might not help with the -l command, but at least you can see you're dealing with files, so you might suspect that -l has something to do with files.
Unfortunately, the Perldoc puts all of these file tests under the -X section and alphabetized under X, so if you're searching the Perldoc for a -l command, or any command that starts with a dash, you won't find it unless you know. However, at least you know now for the future where to look when you see something like this: -s $file.
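As a rough illustration of a few of those -X operators side by side, here is a small sketch. It assumes a Unix-like system where symlink works, and the file names are made up for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a throwaway directory holding one regular file and one symlink to it.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/target.txt";
my $link = "$dir/alias";

open my $fh, '>', $file or die "open $file: $!";
print {$fh} "hello\n";
close $fh or die "close $file: $!";
symlink $file, $link or die "symlink $link: $!";

my %report;
for my $path ($file, $link, $dir) {
    my @facts;
    push @facts, 'exists'             if -e $path;
    push @facts, 'symlink'            if -l $path;   # the only test here that does not follow links
    push @facts, 'directory'          if -d $path;
    push @facts, 'size ' . (-s $path) if -f $path;   # -s returns the size in bytes
    $report{$path} = join ', ', @facts;
    print "$path: $report{$path}\n";
}
```

Note that -f and -s follow the symlink and describe the file it points to, while -l looks at the link itself.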
It's an operator that checks if a file is a symbolic link.
The -l filetest operator checks whether a file is a symbolic link.
The way -l works under the hood resembles the code below.
#! /usr/bin/env perl
use strict;
use warnings;
use Fcntl ':mode';
sub is_symlink {
my($path) = @_;
my $mode = (lstat $path)[2];
die "$0: lstat $path: $!" unless defined $mode;
return S_ISLNK $mode;
}
my @items = @ARGV;
foreach (@items) {
if (is_symlink $_) {
print "$0: link: $_\n";
}
}
Sample output:
$ ln -s foo/bar/baz quux
$ ./flag-links flag-links quux
./flag-links: link: quux
Note the call to lstat and not stat because the latter would attempt to follow symlinks but never identify them!
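A quick way to see the difference for yourself is sketched below. It assumes a Unix-like system with symlink support, and the temporary file names are only placeholders:

```perl
use strict;
use warnings;
use Fcntl ':mode';
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my $target = "$dir/real";
my $link   = "$dir/link";

open my $fh, '>', $target or die "open $target: $!";
close $fh or die "close $target: $!";
symlink $target, $link or die "symlink $link: $!";

# stat follows the link and reports on the target; lstat reports on the link itself.
my $via_stat  = S_ISLNK((stat  $link)[2]) ? 1 : 0;
my $via_lstat = S_ISLNK((lstat $link)[2]) ? 1 : 0;

print "stat sees a symlink:  $via_stat\n";
print "lstat sees a symlink: $via_lstat\n";
```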
To understand how Unix mode bits work, see the accepted answer to “understanding and decoding the file mode value from stat function output.”
From perldoc:
-l File is a symbolic link.
Related
I'm trying to loop over some files in my directory in Perl. Let's say my current directory contains: song0.txt, song1.txt, song2.txt, song3.txt, song4.txt.
I supply "song?.txt" as an argument to my program.
When I do:
foreach $file (glob "$ARGV[0]") {
printf "$file\n";
}
It stops after printing "song0.txt".
However, if I replace "$ARGV[0]" with "song?.txt", it prints out all 5 of them as expected. Why doesn't Perl glob work with variables and what can I do to fix this?
When you call your program with song?.txt the shell expands that ? so
prog.pl song?.txt --> prog.pl song0.txt song1.txt ...
Thus "$ARGV[0]" in the program is song0.txt and there is nothing for Perl's glob to do with it.
So you'd either do
foreach my $file (@ARGV) { }
and call the program with prog.pl song?.txt, or do the globbing in Perl
foreach my $file (glob "song?.txt") { ... }
where now Perl's glob will construct the list of files using ? in the pattern.
Which of the two is "better" depends on the context. But I'd rather submit to a program a straight-up list of files, if that is an equal option, than get entangled in glob-ing patterns in the program.
Also note that Perl's glob is an ancient "demon", with "interesting" behaviors in some cases.
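To see both cases without involving the shell at all, here is a small sketch that creates the five files in a scratch directory and globs there. The directory handling is just scaffolding for the example:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $orig = getcwd();
chdir $dir or die "chdir $dir: $!";

for my $n (0 .. 4) {
    open my $fh, '>', "song$n.txt" or die "open song$n.txt: $!";
    close $fh or die "close song$n.txt: $!";
}

# The pattern reaches Perl intact, so glob expands the ? itself.
my @from_pattern = glob 'song?.txt';

# This is what the program sees in $ARGV[0] after the shell has already expanded it.
my @from_literal = glob 'song0.txt';

print "pattern: @from_pattern\n";
print "literal: @from_literal\n";

chdir $orig or die "chdir $orig: $!";
```

Quoting the argument on the command line (prog.pl 'song?.txt') is what gets the pattern through to Perl unexpanded.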
I am executing my script this way:
./script.pl -f files*
I looked at some other threads (like How can I open a file in Perl using a wildcard in the directory name?)
If I hard-code the file name as written in that thread, I get my desired result. If I take it from the command line, I do not.
My options subroutine should save all the files I get this way in an array.
my @file;
sub Options{
my $i=0;
foreach my $opt (@ARGV){
switch ($opt){
case "-f" {
$i++;
### This part does not work:
@file= glob $ARGV[$i];
print Dumper("$ARGV[$i]"); #$VAR1 = 'files';
print Dumper(@file); #$VAR1 = 'files';
}
}
$i++;
}
}
It seems the execution is interpreted in advance and the wildcard (*) is dropped in the process.
Desired result: All files beginning with files are saved in an array, after execution from the command line.
I hope you get my problem. If not feel free to ask.
Thank you.
Well, first I'd suggest using a module to do args on command line:
Getopt::Long for example.
But otherwise your problem is simpler - your shell is expanding the 'file*' before perl gets it. (shell glob is getting there first).
If you do this with:
-f 'file*'
then it'll work properly. You should be able to see this - for example - if you just:
use Data::Dumper;
print Dumper \@ARGV;
I expect you'll see a much longer list than you thought.
However, I'd also point out - perl has a really nice feature you may be able to use (depending what you're doing with your files).
You can use <>, which automatically opens and reads all files specified on command line (in order).
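For the Getopt::Long suggestion, a minimal sketch might look like the following. The @args list is a stand-in for what @ARGV would hold after the shell expands files*, and the {1,} repeat specifier is something the Getopt::Long documentation describes as experimental, so treat this as one possible approach rather than the definitive one:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Stand-in for @ARGV after the shell has expanded: ./script.pl -f files*
my @args = ('-f', 'files1', 'files2', 'files3');

my @files;
GetOptionsFromArray(\@args, 'f=s{1,}' => \@files)   # -f takes one or more values
    or die "usage: $0 -f FILE...\n";

print "$_\n" for @files;
```

In a real script you would call GetOptions with the same specification and let it read @ARGV directly.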
Since your shell is already expanding the glob files* into a list of filenames, that's what the Perl program gets.
$ perl -E 'say @ARGV' files*
files1files2files3
There's no need to do that in Perl, if your shell can do it for you. If all you want is the filenames in an array, you already have @ARGV which contains those.
I think the title of my question basically covers it. Here's a contrived example which tries to filter for input lines that exactly equal a parameterized string, basically a Perlish fgrep -x:
perl -ne 'chomp; print if $_ eq $ARGV[0];' bb <<<$'aa\nbb\ncc';
## Can't open bb: No such file or directory.
The problem of course is that the -n option creates an implicit while (<>) { ... } loop around the code, and the diamond operator gobbles up all command-line arguments for file names. So, although technically the bb argument did get to @ARGV, the whole program fails because the argument was also picked up by the diamond operator. The end result is, it is impossible to pass command-line arguments to the Perl program when using -n.
I suppose what I really want is an option that would create an implicit while (<STDIN>) { ... } loop around the code, so command-line arguments wouldn't be taken for file names, but such a thing does not exist.
I can think of three possible workarounds:
1: BEGIN { ... } block to copy and clear @ARGV.
perl -ne 'BEGIN { our @x = shift(@ARGV); } chomp; print if $_ eq $x[0];' bb <<<$'aa\nbb\ncc';
## bb
2: Manually code the while-loop in the one-liner.
perl -e 'while (<STDIN>) { chomp; print if $_ eq $ARGV[0]; }' bb <<<$'aa\nbb\ncc';
## bb
3: Find another way to pass the arguments, such as environment variables.
PAT=bb perl -ne 'chomp; print if $_ eq $ENV{PAT};' <<<$'aa\nbb\ncc';
## bb
The BEGIN { ... } block solution is undesirable since it constitutes a bit of a jarring context switch in the one-liner, is somewhat verbose, and requires messing with the special variable @ARGV.
I consider the manual while-loop solution to be more of a non-solution, since it forsakes the -n option entirely, and the point is I want to be able to use the -n option with command-line arguments.
The same can be said for the environment variable solution; the point is I want to be able to use command-line arguments with the -n option.
Is there a better way?
You've basically identified them all. The only one you missed, that I know of at least, is the option of passing switch arguments (instead of positional arguments):
$ perl -sne'chomp; print if $_ eq $kwarg' -- -kwarg=bb <<<$'aa\nbb\ncc';
bb
You could also use one of the many getopt modules instead of -s. This is essentially doing the same thing as manipulating @ARGV in a BEGIN {} block before the main program loop, but doing it for you and making it a little cleaner for a one-liner.
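Spelled out as a script, all of these approaches come down to the same thing: take the argument off the list before the read loop starts. A rough sketch follows, with @args and the in-memory input standing in for the real command line and STDIN (both are assumptions made so the example runs on its own):

```perl
use strict;
use warnings;

# Stand-ins: in the one-liner these come from the command line and STDIN.
my @args  = ('bb');
my $input = "aa\nbb\ncc\n";

# What the BEGIN block (or -s, or a getopt module) does: remove the argument first.
my $wanted = shift @args;

open my $in, '<', \$input or die "open in-memory handle: $!";
my @matches;
while (my $line = <$in>) {    # -n wraps your code in a loop like this, reading <>
    chomp $line;
    push @matches, $line if $line eq $wanted;
}
close $in;

print "$_\n" for @matches;
```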
I made the following script, which searches for certain processes, runs pflags on each one, and stops when it finds one with the word "pause":
!cat find_pause
#!/usr/bin/perl -W
use warnings;
use strict;
if (open(WCF,
"ps -ef | grep '/transfile' | cut -c10-15 | xargs -n1 pflags 2>&1 |"
)) {
while (<WCF>) {
next if ($_ =~ /cannot/);
print $_;
last if ($_ =~ /pause/);
}
close(WCF);
}
It works, but I wonder if there is a better way to do this.
Update
pause is a low-level system call. Like read, nanosleep, waitid, etc.
With this script I want to find processes that are stuck in the pause call. We are trying to find a bug in our system, and we think it might be related to this.
I don't know what you'd consider a "better way" in this case, but I can offer some technique guidance for the approach you already have:
grep '/[t]ransfile'
A grep against ps output often runs the risk of matching the grep process itself, which is almost never desired. An easy protection against this is simply to introduce a character class of one member in the grep pattern argument.
awk '/\/[t]ransfile/{ print $2 }'
grep + cut, that is, field extraction following a pattern match, is an easy task for a single awk command.
Don't refer to $_
Tighter, more idiomatic perl would omit explicit use of $_. Try next if /cannot/ and the like.
open(my $wcf, ...)
Please use lexical filehandles, otherwise you'll be chided by those old enough to remember when we couldn't use them. :)
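Putting the last two points together, the read loop might be sketched like this. The child command below is only a stand-in that prints canned lines (the real ps/awk/xargs pflags pipeline would go in its place), and the sample lines are invented for the example:

```perl
use strict;
use warnings;

# Stand-in for the real pipeline: a child perl that prints a few canned lines.
my @cmd = (
    $^X, '-e',
    'print "pflags: cannot examine 1\n", "123: sleeping\n", "456: asleep in pause()\n", "789: never reached\n"',
);

open(my $wcf, '-|', @cmd) or die "cannot start pipeline: $!";

my @shown;
while (<$wcf>) {
    next if /cannot/;     # matches against $_ implicitly
    push @shown, $_;
    print;                # prints $_ implicitly
    last if /pause/;
}
close $wcf;               # may report failure if the child was cut short; ignored here
```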
There are two possible improvements to this, depending on:
Do you actually require to print exact output of pflags command or some info from it (e.g. list of PIDs and flags?)
What does "pause" in pflags output mean? It's nowhere in "proc" or "pflags" man-pages and all the actual flags are upper case. Depending on its meaning, it might be found in native Perl implementation of "/proc" - Proc::ProcessTable::Process.
For example, that Process object contains all the flags (in a bit vector) and process status (my suspicion is that "pause" might be a process status).
If the answers to those questions are "Proc::ProcessTable::Process contains enough info for my needs", then a better solution is to use that:
#!/usr/bin/perl -W
use warnings;
use strict;
use Proc::ProcessTable;
my $t = new Proc::ProcessTable;
foreach my $p ( @{$t->table} ) {
my $flags = $p->flags; # This is an integer containing bit vector.
# Somehow process $flags or $p->status to find "if the process is paused"
print "$flags\n";
last if paused($p); # No clue how to do that without more info from you
# May be : last if $p->status =~ /paused/;
}
However, if the native Perl process table does not have enough info for you (unlikely but possible), OR if you actually want to print the exact pflags output as-is for some reason, the best optimization is to construct the list of PIDs for pflags natively. It's not as big a win, but you still lose a bunch of extra forked-off processes. Something like this:
#!/usr/bin/perl -W
use warnings;
use strict;
use Proc::ProcessTable;
my $t = new Proc::ProcessTable;
my $pids = join " ", map { $_->pid } @{$t->table};
if (open(WCF, "pflags 2>&1 $pids|")) {
while (<WCF>) {
next if ($_ =~ /cannot/);
print $_;
last if ($_ =~ /pause/);
}
close(WCF);
}
The following Perl code has an obvious inefficiency:
while (<>)
{
if ($ARGV =~ /\d+\.\d+\.\d+/) {next;}
... or do something useful
}
The code will step through every line of the file we don't want.
On the size of files this particular script runs on, this is unlikely to make a noticeable difference, but for the sake of learning: how can I junk the whole file <> is working on and move to the next one?
The purpose of this is that the server this script runs on stores old versions of apps with the version number in the file name; I'm only interested in the current version.
Paul Roub's solution is best if you can filter @ARGV before you start reading any files.
If you have to skip a file after you've begun iterating it,
while (<>) {
if (/# Skip the rest of this file/) {
close ARGV;
next;
}
print "$ARGV: $_";
}
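Applied to the version-number case from the question, that could look roughly like this. The two log files are made up for the example and written to a scratch directory so the sketch runs on its own:

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# Scratch files: one "old version" (version number in its name) and one current.
my $dir     = tempdir(CLEANUP => 1);
my %content = (
    'app-1.2.3.log' => "old line 1\nold line 2\n",
    'app.log'       => "current line 1\ncurrent line 2\n",
);
for my $name (keys %content) {
    open my $fh, '>', "$dir/$name" or die "open $name: $!";
    print {$fh} $content{$name};
    close $fh or die "close $name: $!";
}

@ARGV = map { "$dir/$_" } sort keys %content;

my @kept;
while (my $line = <>) {
    if (basename($ARGV) =~ /\d+\.\d+\.\d+/) {
        close ARGV;    # abandon this file; the next <> opens the following one
        next;
    }
    push @kept, $line;
}
print @kept;
```

Only the first line of the versioned file is ever read; closing ARGV makes the next read start on the following file.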
grep @ARGV first.
@ARGV = grep { $_ !~ /\d+\.\d+\.\d+/ } @ARGV;
while (<>)
{
# do something with the other files
}
Paul Roub's answer works. For more information, see the I/O Operators section of the perlop man page. The pattern of using grep is mentioned there, as well as a few other things related to <>.
Take note of the mention of ARGV::readonly regarding things like:
perl dangerous.pl 'rm -rfv *|'