How do I take multiple filenames from the Raku command line?

This Raku program works as I expect:
sub MAIN($name) { say "Got $name" }
I can pass a single name on the command line:
$ raku m1.raku foo
Got foo
The obvious extension, however,
sub MAIN(@names) { say "Got $_" for @names }
doesn't work:
$ raku mm.raku foo
Usage:
mm.raku <names>
$ raku mm.raku foo bar
Usage:
mm.raku <names>
What am I doing wrong?

What @cjm said.
However, you can go a little further than that by checking whether the names you specified are actually files, and producing an error message if they are not. The trick is to use multi-dispatch:
subset File of Str where *.IO.f;
multi sub MAIN(*@files where @files.all ~~ File) {
    say "These are all files: @files.join(",")";
}
multi sub MAIN(*@files) {
    say "These are *NOT* files: @files.grep(* !~~ File).join(",")";
}
The first candidate will be run if all the names specified on the command line are in fact files. The second candidate will be run if the first didn't fire, implying that not all names specified are in fact files.

You must use the slurpy array signature for this:
sub MAIN(*@names) { say "Got $_" for @names }
Works as desired:
$ raku mm.raku
$ raku mm.raku foo
Got foo
$ raku mm.raku foo bar
Got foo
Got bar

There is a special $*ARGFILES variable which provides a way to iterate over files passed in to the program on the command line.
#!/usr/bin/raku
for $*ARGFILES.handles -> $fh {
say $fh;
}
$ ./enum_files.raku links.txt words.txt
IO::Handle<"links.txt".IO>(opened)
IO::Handle<"words.txt".IO>(opened)
We can use its lines method to read lines of the given files.
#!/usr/bin/raku
.say for $*ARGFILES.lines
$ ./read_lines.raku words.txt words2.txt
sky
cloud
cup
rock
war
tea
coffee
falcon
...

@cjm's answer is the proper one to your question. But just as Jan said, there's another way:
say "Got: ", @*ARGS;
Where @*ARGS is a dynamic variable holding the arguments from the command line.
$*ARGFILES has some extra nuance to it. Outside of MAIN, it gets its values from @*ARGS, or from $*IN if there are no args. Inside MAIN, it just uses $*IN.

Related

print doesn't recognize barewords as parameter?

In non-strict mode of Perl, barewords can be recognized as strings, like below:
$x = hello;
print $x;
But it seems barewords cannot be passed to print directly; the one below doesn't output the string. Why are they different?
print hello;
In cases like this, the B::Deparse module can be very helpful:
$ perl -MO=Deparse -e 'print hello;'
print hello $_;
-e syntax OK
As you see, it interprets the identifier as a filehandle and takes the value to be printed from $_.
Barewords should be avoided like the plague they are. Personally, I'll continue to avoid them all the time. They're a relic left over from the wild west days of Perl (circa 1990) and would have been eliminated from the language except for the need to maintain backwards compatibility.
In any case, in that context, it prints $_ to the file handle hello, doesn't it? That's the sort of reason why barewords are worth avoiding. (Compare: print STDERR "Hello\n"; and print STDERR;, and print hello;).
For example:
open hello, ">junk.out";
while (<>)
{
print STDERR "Hello\n";
print STDERR;
print hello;
}
Sample run:
$ perl hello.pl
abc
Hello
abc
def
Hello
def
$ cat junk.out
abc
def
$
In case it isn't clear, I typed one line of abc, which was followed by Hello and abc on standard error; then I typed def, which was followed by Hello and def on standard error. I typed Control-D to indicate EOF, and showed the contents of the file junk.out (which didn't exist before I ran the script). It contained the two lines that I'd typed.
So, don't use barewords — they're confusing. And do use use strict; and use warnings; so that you have less opportunity to be confused.
An identifier is only a bareword if it has no other meaning.
A word that has no other interpretation in the grammar will be treated as if it were a quoted string. These are known as "barewords".
So, for example, there are no barewords in the following program:
sub f { print STDOUT "f()\n"; }
X: f;
In other circumstances, all of sub, f, print, STDOUT and X could be barewords, but they all have other meanings here. I could add use strict;, and it'll still work fine.
In your code, you used print hello as I used print STDOUT. If an identifier follows print, you are using the print FILEHANDLE LIST syntax of print, where the identifier is the name of a file handle.
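To make the filehandle slot of print FILEHANDLE LIST explicit, a lexical filehandle can be wrapped in a block. The following is only a minimal sketch: the sub name write_greeting is made up for illustration, and an in-memory file is assumed so that nothing is written to disk.

```perl
use strict;
use warnings;

# Hypothetical helper: prints a greeting to whatever handle it is given.
sub write_greeting {
    my ($fh, $name) = @_;
    # The block form {$fh} marks the first slot as the filehandle,
    # so everything after it is unambiguously the LIST to print.
    print {$fh} "Hello, $name\n";
    return;
}

# Open a handle onto a scalar (an in-memory file) to capture the output.
my $buffer = '';
open my $out, '>', \$buffer or die "Cannot open in-memory file: $!";
write_greeting($out, 'world');
close $out;

print $buffer;   # the captured text is sent to STDOUT
```

With use strict in effect, a stray bareword elsewhere in the program is reported instead of being silently treated as a string.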

What does -l $_ do in Perl, and how does it work?

What is the meaning of the following code?
foreach (@items)
{
    if (-l $_) ## this is what I don't understand: the meaning of -l
    {
        ...
    }
}
Thanks for any help.
Let's look at each thing:
foreach (@items) {
    ...
}
This for loop (foreach and for are the same command in Perl) is taking each item from the #items list, and setting it to $_. The $_ is a special variable in Perl that is used as sort of a default variable. The idea is that you could do things like this:
foreach (@items) {
    s/foo/bar/;
    uc;
    print;
}
And each of those commands would operate on that $_ variable! If you simply say print with nothing else, it prints whatever is in $_. If you say uc without mentioning a variable, it uppercases whatever is in $_ (returning the result rather than changing $_ itself).
This is now discouraged for several reasons. First, $_ is global, so there might be side effects that are not intended. For example, imagine you call a subroutine that mucked with the value of $_. You would suddenly be surprised that your program doesn't work.
The other piece, -l, is a file test operator. It checks whether the given file is a symbolic link. All of the test operators are explained in the Perldoc under -X.
If you're not knowledgeable in Unix or BASH/Korn/Bourne shell scripting, having a command that starts with a dash just looks weird. However, much of Perl's syntax was stolen... I mean borrowed from Unix shell and awk commands. In Unix, there's a command called test which you can use like this:
if test -L $FILE
then
....
fi
In Unix, that -L is a parameter to the test command, and in Unix, most parameters to commands start with dashes. Perl simply borrowed the same syntax, dash and all.
Interestingly, if you read the Perldoc for these test commands, you will notice that like the foreach loop, the various test commands will use the $_ variable if you don't give it a variable or file name. Whoever wrote that script could have written their loop like this:
foreach (@items)
{
    if (-l) ## Notice no mention of the `$_` variable
    {
        ...
    }
}
Yeah, that's soooo much clearer!
Just for your information, the modern way, as recommended by many Perl experts (cough Damian Conway cough), is to avoid the $_ variable whenever possible, since it doesn't really add clarity and can cause problems. He also recommends just saying for and forgetting foreach, and putting the opening curly brace on the same line:
for my $file (@items) {
    if ( -l $file ) {
        ...
    }
}
That might not help with the -l command, but at least you can see you're dealing with files, so you might suspect that -l has something to do with files.
Unfortunately, the Perldoc puts all of these file tests under the -X section and alphabetized under X, so if you're searching the Perldoc for a -l command, or any command that starts with a dash, you won't find it unless you know. However, at least you know now for the future where to look when you see something like this: -s $file.
It's an operator that checks if a file is a symbolic link.
The -l filetest operator checks whether a file is a symbolic link.
The way -l works under the hood resembles the code below.
#! /usr/bin/env perl
use strict;
use warnings;
use Fcntl ':mode';
sub is_symlink {
    my($path) = @_;
    my $mode = (lstat $path)[2];
    die "$0: lstat $path: $!" unless defined $mode;
    return S_ISLNK $mode;
}
my @items = @ARGV;
foreach (@items) {
    if (is_symlink $_) {
        print "$0: link: $_\n";
    }
}
Sample output:
$ ln -s foo/bar/baz quux
$ ./flag-links flag-links quux
./flag-links: link: quux
Note the call to lstat and not stat because the latter would attempt to follow symlinks but never identify them!
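The difference between the two calls can be seen by comparing the mode bits each one returns for the same path. This is a rough sketch only: the helper link_flags and the temporary file names are invented for illustration, and it assumes a platform with symlink support (for example Linux or macOS).

```perl
use strict;
use warnings;
use Fcntl ':mode';
use File::Temp qw(tempdir);

# Hypothetical helper: reports whether stat and lstat each see a symlink.
sub link_flags {
    my ($path) = @_;
    my $stat_mode  = (stat  $path)[2];   # follows the link to its target
    my $lstat_mode = (lstat $path)[2];   # describes the link itself
    die "cannot stat $path: $!"
        unless defined($stat_mode) && defined($lstat_mode);
    return (S_ISLNK($stat_mode) ? 1 : 0, S_ISLNK($lstat_mode) ? 1 : 0);
}

# Build a regular file and a symlink to it in a throwaway directory.
my $dir    = tempdir(CLEANUP => 1);
my $target = "$dir/target.txt";
my $link   = "$dir/link.txt";
open my $fh, '>', $target or die "Cannot create $target: $!";
close $fh;
symlink $target, $link or die "Cannot create symlink: $!";

my ($via_stat, $via_lstat) = link_flags($link);
print "stat sees a link: $via_stat, lstat sees a link: $via_lstat\n";
```

Because stat reports on the target file, only the lstat result has the symlink bit set.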
To understand how Unix mode bits work, see the accepted answer to “understanding and decoding the file mode value from stat function output.”
From perldoc :
-l File is a symbolic link.

What's the use of <> in Perl?

What's the use of <> in Perl? How do I use it?
If we simply write
<>;
and
while(<>)
what is the program doing in each case?
The answers above are all correct, but it might come across more plainly if you understand general UNIX command line usage. It is very common to want a command to work on multiple files. E.g.
ls -l *.c
The command line shell (bash et al) turns this into:
ls -l a.c b.c c.c ...
In other words, ls never sees '*.c' unless the pattern doesn't match. Try this at a command prompt (not perl):
echo *
you'll notice that you do not get an *.
So, if the shell is handing you a bunch of file names, and you'd like to go through each one's data in turn, perl's <> operator gives you a nice way of doing that...it puts the next line of the next file (or stdin if no files are named) into $_ (the default scalar).
Here is a poor man's grep:
while(<>) {
print if m/pattern/;
}
Running this script:
./t.pl *
would print out all of the lines of all of the files that match the given pattern.
cat /etc/passwd | ./t.pl
would use cat to generate some lines of text that would then be checked for the pattern by the loop in perl.
So you see, while(<>) gets you a very standard UNIX command line behavior...process all of the files I give you, or process the thing I piped to you.
<>;
is a short way of writing
readline();
or if you add in the default argument,
readline(*ARGV);
readline is an operator that reads a line from the specified file handle. Reading from the special file handle ARGV will read from STDIN if @ARGV is empty or from the concatenation of the files named by @ARGV if it's not.
As for
while (<>)
It's a syntax error. If you had
while (<>) { ... }
it gets rewritten to
while (defined($_ = <>)) { ... }
And as previously explained, that means the same as
while (defined($_ = readline(*ARGV))) { ... }
That means it will read lines from (previously explained) ARGV until there are no more lines to read.
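Since <> takes its file names from @ARGV, the same loop can be pointed at a chosen list of files by localizing @ARGV. The following is a sketch under assumptions: the helper count_matches and the temporary files are hypothetical, created only so that the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: counts lines matching $pattern across the named
# files by temporarily setting @ARGV and letting <> walk through them.
sub count_matches {
    my ($pattern, @files) = @_;
    local @ARGV = @files;               # <> reads the files named in @ARGV
    my $count = 0;
    while (defined(my $line = <>)) {    # explicit form of while (<>)
        $count++ if $line =~ $pattern;
    }
    return $count;
}

# Two small input files, removed automatically when the program exits.
my ($fh1, $file1) = tempfile(UNLINK => 1);
print {$fh1} "apple\nbanana\n";
close $fh1;
my ($fh2, $file2) = tempfile(UNLINK => 1);
print {$fh2} "cherry\napricot\n";
close $fh2;

print count_matches(qr/^a/, $file1, $file2), "\n";
```

Note that if the file list passed in is empty, <> falls back to reading STDIN, exactly as described above.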
It is called the diamond operator. It reads from stdin if @ARGV is empty, or otherwise returns each line from the files named in @ARGV. This webpage http://docstore.mik.ua/orelly/perl/learn/ch06_02.htm explains it very well.
In many cases of programming with syntactical sugar like this, Deparse of O is helpful to find out what's happening:
$ perl -MO=Deparse -e 'while(<>){print 42}'
while (defined($_ = <ARGV>)) {
print 42;
}
-e syntax OK
Quoting perldoc perlop:
The null filehandle <> is special: it can be used to emulate the
behavior of sed and awk, and any other Unix filter program that takes
a list of filenames, doing the same to each line of input from all of
them. Input from <> comes either from standard input, or from each
file listed on the command line.
With no file arguments, it reads from STDIN (standard input):
> cat temp.pl
#!/usr/bin/perl
use strict;
use warnings;
my $count=<>;
print "$count"."\n";
>
below is the execution:
> temp.pl
3
3
>
So as soon as you execute the script, it waits until the user provides some input.
After 3 is entered, it stores that value in $count and prints it in the next statement.

Usage of defined with Filehandle and while Loop

While reading a book on advanced Perl programming(1), I came across
this code:
while (defined($s = <>)) {
...
Is there any special reason for using defined here? The documentation for
perlop says:
In these loop constructs, the assigned value (whether assignment is
automatic or explicit) is then tested to see whether it is defined. The
defined test avoids problems where line has a string value that would be
treated as false by Perl, for example a "" or a "0" with no trailing
newline. If you really mean for such values to terminate the loop, they
should be tested for explicitly: [...]
So, would there be a corner case or that's simply because the book is too old
and the automatic defined test was added in a recent Perl version?
(1) Advanced Perl Programming, First Edition, Sriram Srinivasan. O'Reilly
(1997)
Perl has a lot of implicit behaviors, many more than most other languages. Perl's motto is There's More Than One Way To Do It, and because there is so much implicit behavior, there is often More Than One Way To express the exact same thing.
/foo/ instead of $_ =~ m/foo/
$x = shift instead of $x = shift @_
while(<>) instead of while (defined($_=<ARGV>))
etc.
Which expressions to use are largely a matter of your local coding standards and personal preference. The more explicit expressions remind the reader what is really going on under the hood. This may or may not improve the readability of the code -- that depends on how knowledgeable the audience is and whether you are using well-known idioms.
In this case, the implicit behavior is a little more complicated than it seems. Sometimes perl will implicitly perform a defined(...) test on the result of the readline operator:
$ perl -MO=Deparse -e 'while($s=<>) { print $s }'
while (defined($s = <ARGV>)) {
print $s;
}
-e syntax OK
but sometimes it won't:
$ perl -MO=Deparse -e 'if($s=<>) { print $s }'
if ($s = <ARGV>) {
print $s;
}
-e syntax OK
$ perl -MO=Deparse -e 'while(some_condition() && ($s=<>)) { print $s }'
while (some_condition() and $s = <ARGV>) {
print $s;
}
-e syntax OK
Suppose that you are concerned about the corner cases that this implicit behavior is supposed to handle. Have you committed perlop to memory so that you understand when Perl uses this implicit behavior and when it doesn't? Do you understand the differences in this behavior between Perl v5.14 and Perl v5.6? Will the people reading your code understand?
Again, there's no right or wrong answer about when to use the more explicit expressions, but the case for using an explicit expression is stronger when the implicit behavior is more esoteric.
Say you have the following file
4<LF>
3<LF>
2<LF>
1<LF>
0
(<LF> represents a line feed. Note the lack of newline on the last line.)
Say you use the code
while ($s = <>) {
    chomp($s);
    say $s;
}
If Perl didn't do anything magical, the output would be
4
3
2
1
Note the lack of 0, since the string 0 is false. defined is needed in the unlikely case that both of the following are true:
You have a non-standard text file (missing trailing newline).
The last line of the file consists of a single ASCII zero (0x30).
BUT WAIT A MINUTE! If you actually ran the above code with the above data, you would see 0 printed! What many don't know is that Perl automagically translates
while ($s = <>) {
to
while (defined($s = <>)) {
as seen here:
$ perl -MO=Deparse -e'while($s=<DATA>) {}'
while (defined($s = <DATA>)) {
();
}
__DATA__
-e syntax OK
So you technically don't even need to specify defined in this very specific circumstance.
That said, I can't blame someone for being explicit instead of relying on Perl automagically modifying their code. After all, Perl is (necessarily) quite specific as to which code sequences it will change. Note the lack of defined in the following even though it's supposedly equivalent code:
$ perl -MO=Deparse -e'while((), $s=<DATA>) {}'
while ((), $s = <DATA>) {
();
}
__DATA__
-e syntax OK
while($line=<DATA>){
    chomp($line);
    if(defined $line){
        print "SEE:$line\n";
    }
}
__DATA__
1
0
3
Try the code with defined removed and you will see the different result.
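The difference between a truth test and a defined test on a final "0" line can be reproduced without any files. This is only a sketch: the helper read_lines and its flag argument are made up for illustration, and the loop is written out by hand so that Perl's automatic defined() rewrite of while ($line = <$fh>) does not apply.

```perl
use strict;
use warnings;

# Hypothetical helper: reads lines from a string through an in-memory
# filehandle, stopping on either an undefined or a false value.
sub read_lines {
    my ($text, $test_defined) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my @lines;
    while (1) {
        my $line = <$fh>;
        # Choose the loop condition explicitly instead of relying on
        # the implicit defined() that "while" adds in the simple case.
        last unless $test_defined ? defined($line) : $line;
        chomp $line;
        push @lines, $line;
    }
    close $fh;
    return @lines;
}

my $data = "4\n3\n0";   # last line is "0" with no trailing newline

print "defined test: ", join(",", read_lines($data, 1)), "\n";
print "truth test:   ", join(",", read_lines($data, 0)), "\n";
```

The defined test returns all three lines, while the plain truth test stops before the final 0.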

How can I print source line number in Perl?

Is it possible to get the current source line number in Perl?
The equivalent in C++ is __LINE__.
The __LINE__ literal is documented in the Special Literals section of the perldata man page.
print "File: ", __FILE__, " Line: ", __LINE__, "\n";
or
warn("foo");
Note there's a gotcha with warn. Without a trailing newline, the line number is printed:
$ perl -e'warn("foo")'
foo at -e line 1.
If the message ends with a newline, it won't print the line number:
$ perl -e'warn("foo\n")'
foo
This is documented in perldoc -f die, but is perhaps easy to miss in the perldoc -f warn section's reference to die.
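The newline behaviour can be checked inside a program by intercepting the warning text. A minimal sketch, assuming a made-up helper named capture_warning that installs a temporary $SIG{__WARN__} handler:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text that warn would have emitted.
sub capture_warning {
    my ($message) = @_;
    my $captured;
    # The handler receives the final message, after Perl has decided
    # whether to append " at FILE line N.".
    local $SIG{__WARN__} = sub { $captured = shift };
    warn $message;
    return $captured;
}

print capture_warning("foo");     # file and line number are appended
print capture_warning("foo\n");   # printed as-is, no line number
```

The same rule applies to die, which is where the behaviour is documented.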
This prints out the line where you are, and also the "stack" (list of lines from the calling programs (scripts/modules/etc) that lead to the place you are now)
my $frame = 0;
while(my @where=caller($frame++)) { print "$frame:" . join(",",@where) . "\n"; }
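As an illustration of what caller reports from inside nested subroutines, here is a small sketch; the sub names stack_trace, inner and outer are invented for the example, and only the first four values returned by caller are used.

```perl
use strict;
use warnings;

# Hypothetical helper: collects one description per call frame.
sub stack_trace {
    my @frames;
    my $depth = 0;
    # caller($depth) returns an empty list once there are no more
    # frames, which ends the loop.
    while (my ($package, $file, $line, $sub) = caller($depth++)) {
        push @frames, "$sub called at $file line $line";
    }
    return @frames;
}

sub inner { return stack_trace() }
sub outer { return inner() }

print "$_\n" for outer();
```

Each output line names a subroutine together with the file and line it was called from, innermost call first.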
You can also "use Carp" and play with its various routines to get a stack; I'm not sure if this way is better or worse than the "caller" method suggested by cnd. I have used the __LINE__ and __FILE__ literals (and probably other similar ones) in C and Perl to show where I got to in the code, along with other information, when debugging, but have seen little value in them outside a debug environment.