Difference between a while loop and a single use of the diamond operator in Perl

I am confused by the following:
<>; print;
vs.
while(<>){print;}
The first one does not print anything, but second one does. Doesn't <> always store the input read into $_?
Thank you.

The diamond file input iterator is only magical when it is in the conditional of a while loop:
$ perl -MO=Deparse -e '<>; print;'
<ARGV>;
print $_;
-e syntax OK
$ perl -MO=Deparse -e 'while (<>) {print;}'
while (defined($_ = <ARGV>)) {
print $_;
}
-e syntax OK
This is all documented in perlop.

It does not, except when <> is the sole condition of a while statement.
$ perl -MO=Deparse -e 'while(<>) { print }'
while (defined($_ = <ARGV>)) {
print $_;
}
-e syntax OK
$ perl -MO=Deparse -e '<>; print'
<ARGV>;
print $_;
-e syntax OK
perlop documents that the auto-assignment to $_ only happens in this context:
Ordinarily you must assign the returned value to a variable, but there
is one situation where an automatic assignment happens. If and only if
the input symbol is the only thing inside the conditional of a "while"
statement (even if disguised as a "for(;;)" loop), the value is
automatically assigned to the global variable $_ , destroying whatever
was there previously. (This may seem like an odd thing to you, but
you'll use the construct in almost every Perl script you write.) The
$_ variable is not implicitly localized. You'll have to put a "local
$_ ;" before the loop if you want that to happen.

From http://perldoc.perl.org/perlvar.html (Talking about $_) :
"The default place to put an input record when a <FH> operation's result is tested by itself as the sole criterion of a while test. Outside a while test, this will not happen."

Related

sed "unterminated `s' command" error when running from a script

I have a temp file with contents:
a
b
c
d
e
When I run sed 's#b#batman\nRobin#' temp from command line, I get:
a
batman
Robin
c
d
e
However, when I run the command from a Perl script:
#!/usr/bin/perl
use strict;
use warnings;
`sed 's#b#batman\nRobin#' temp`
It produces error:
sed: -e expression #1, char 10: unterminated `s' command
What am I doing wrong?
Why run another tool like sed once you are inside a Perl program? If anything, now you have far more tools and power so just do it with Perl.
One simple way to do your sed thing
use warnings;
use strict;
die "Usage: $0 file(s)\n" if not @ARGV;
while (<>) {
s/b/batman\nRobin/;
print;
}
Run this program by supplying the file (temp) to it on the command line. The die line is there merely to support/enforce such usage; it is inessential for the script's operation.
This program then is a simple filter:
<> operator reads line by line all files submitted on the command line
A line is assigned by it to $_ variable, a default for many things in Perl
The s/// operator by default binds to $_, which gets changed (if pattern matches)
print by default prints the $_ variable
Use nearly anything you want for delimiters in regex, see m// and s/// operators
This can also be done as
while (<>) {
print s/b/batman\nRobin/r
}
With /r modifier s/// returns the changed string (or the original if pattern didn't match)
Finally that's also just
print s/b/batman\nRobin/r while <>;
but I'd expect that with a script you really want to do more, and then this probably isn't it.
On the other side of things you could write it more properly
use warnings;
use strict;
use feature qw(say);
die "Usage: $0 file(s)\n" if not @ARGV;
while (my $line = <>) {
chomp $line;
$line =~ s/b/batman\nRobin/;
say $line;
}
With a line in a lexical variable nicely chomp-ed this is ready for more work.

perl: print to console all the matched pattern

I have multiple lines
QQQQl123
hsdhjhksd
QQQQl234
ajkdkjsdh
QQQQl564
I want to print all lines matching QQQQl[0-9]+
like
QQQQl123
QQQQl234
QQQQl564
How can I do this using Perl?
I tried:
$ perl -0777pe '/QQQQl[0-9]+/' filename
it shows nothing
perl -we 'while(<>){ next unless $_=~/QQQQl[0-9]+/; print $_; }' < filename
perl -ne 'print if /QQQQl[0-9]+/' filename
Or, if, for some reason, you insist on using -0777, you could do
perl -0777nE 'say for /QQQQl[0-9]+/g' filename
(or print "$_\n" instead of say)
Your code doesn't work because /QQQQl[0-9]+/ returns true because $_ indeed contains that pattern, but you never asked Perl to do anything based on that return value.
-n is preferable to -p in that case, since you don't want to print every line but only some (-p automatically prints every line, and there is very little you can do about it).

Awk's output in Perl doesn't seem to be working properly

I'm writing a simple Perl script which is meant to output the second column of an external text file (columns one and two are separated by a comma).
I'm using AWK because I'm familiar with it.
This is my script:
use v5.10;
use File::Copy;
use POSIX;
$s = `awk -F ',' '\$1==500 {print \$2}' STD`;
say $s;
The contents of the local file "STD" is:
CIR,BS
60,90
70,100
80,120
90,130
100,175
150,120
200,260
300,500
400,600
500,850
600,900
My output is very strange and it prints out the desired "850" but it also prints a trailer of the line and a new line too!
ka@man01:$ ./test.pl
850
ka@man01:$
The problem isn't just printing. I need to use the variable generated by awk (i.e. the $s variable), but the variable is also being stored with a long string and a new line!
Could you guys help?
Thank you.
I'd suggest that you're going down a dirty road by trying to inline awk into perl in the first place. Why not instead:
open ( my $input, '<', 'STD' ) or die $!;
while ( <$input> ) {
s/\s+\z//;
my @fields = split /,/;
print $fields[1], "\n" if $fields[0] == 500;
}
But the likely problem is that you're not handling linefeeds, and say is adding an extra one. Try using print instead, or chomp on the resultant string.
perl can do many of the things that awk can do. Here's something similar that replaces your entire Perl program:
$ perl -naF, -le 'chomp; print $F[1] if $F[0]==500' STD
850
The -n creates a while loop around your argument to -e.
The -a splits up each line into @F and -F lets you specify the separator. Since you want to separate the fields on a comma you use -F,.
The -l adds a newline each time you call print.
The -e argument is the program to run (with the added while from -n). The chomp removes the newline from the output. You get a newline in your output because you happen to use the last field in the line. The -l adds a newline when you print; that's important when you want to extract a field in the middle of the line.
The reason you get 2 newlines:
the backtick operator does not remove the trailing newline from the awk output. $s contains "850\n"
the say function appends a newline to the string. You have say "850\n" which is the same as print "850\n\n"

How can I slurp STDIN in Perl?

I am piping the output of several scripts. One of these scripts outputs an entire HTML page that gets processed by my perl script. I want to be able to pull the whole 58K of text into the perl script (which will contain newlines, of course).
I thought this might work:
open(my $TTY, '<', '/dev/tty');
my $html_string= do { local( @ARGV, $/ ) = $TTY ; <> } ;
But it just isn't doing what I need. Any suggestions?
my #lines = <STDIN>;
or
my $str = do { local $/; <STDIN> };
I can't let this opportunity to say how much I love IO::All pass without saying:
♥ ♥ __ "I really like IO::All ... a lot" __ ♥ ♥
Variation on the POD SYNOPSIS:
use IO::All;
my $contents < io('-') ;
print "\n printing your IO: \n $contents \n with IO::All goodness ..." ;
Warning: IO::All may begin replacing everything else you know about IO in perl with its own insidious goodness.
tl;dr: see at the bottom of the post. Explanation first.
practical example
I’ve just wondered about the same, but I wanted something suitable for a shell one-liner. Turns out this is (Korn shell, whole example, dissected below):
print -nr -- "$x" | perl -C7 -0777 -Mutf8 -MEncode -e "print encode('MIME-Q', 'Subject: ' . <>);"; print
Dissecting:
print -nr -- "$x" echoes the whole of $x without any trailing newline (-n) or backslash escape (-r), POSIX equivalent: printf '%s' "$x"
-C7 sets stdin, stdout, and stderr into UTF-8 mode (you may or may not need it)
-0777 sets $/ so that Perl will slurp the entire file; reference: man perlrun(1)
-Mutf8 -MEncode loads two modules
the remainder is the Perl command itself: print encode('MIME-Q', 'Subject: ' . <>);, let’s look at it from inner to outer, right to left:
<> takes the entire stdin content
which is concatenated with the string "Subject: "
and passed to Encode::encode asking it to convert that to MIME Quoted-Printable
the result of which is printed on stdout (without any trailing newline)
this is followed by ; print, again in Korn shell, which is the same as ; echo in POSIX shell – just echoïng a newline.
tl;dr
Call perl with the -0777 option. Then, inside the script, <> will contain the entire stdin.
complete self-contained example
#!/usr/bin/perl -0777
my $x = <>;
print "Look ma, I got this: '$x'\n";
To get it into a single string you want:
#!/usr/bin/perl -w
use strict;
my $html_string;
while(<>){
$html_string .= $_;
}
print $html_string;
I've always used a bare block.
my $x;
{
local $/; # Set slurp mode; local restores $/ when the block ends
$x = <>; # Read in everything up to EOF
}
# $x should now contain all of STDIN

Perl substitute with regex

When I run this command over a Perl one liner, it picks up the regular expression -
so that can't be bad.
more tagcommands | perl -nle 'print /(\d{8}_\d{9})/' | sort
12012011_000005769
12012011_000005772
12162011_000005792
12162011_000005792
But when I run this script over the command invocation below, it does not pick up the
regex.
#!/usr/bin/perl
use strict;
my $switch="12012011_000005777";
open (FILE, "more /home/shortcasper/work/tagcommands|");
my @array_old = (<FILE>) ;
my @array_new = @array_old ;
foreach my $line(@array_new) {
$line =~ s/\d{8}_\d{9}/$switch/g;
print $line;
sleep 1;
}
This is the data that I am feeding into the script
/CASPERBOT/START URL=simplefile:///data/tag/squirrels/squirrels /12012011_000005777N.dart.gz CASPER=SeqRashMessage
/CASPERBOT/ADDSERVER simplefile:///data/tag/squirrels/12012011_0000057770.dart.trans.gz
/CASPERRIP/newApp multistitch CASPER_BIN
/CASPER_BIN/START URLS=simplefile:///data/tag/squirrels /12012011_000005777R.rash.gz?exitOnEOF=false;binaryfile:///data/tag/squirrels/12162011_000005792D.binaryBlob.gz?exitOnEOF=false;simplefile:///data/tag/squirrels/12012011_000005777E.bean.trans.gz?exitOnEOF=false EXTRACTORS=rash;island;rash BINARY=T
You should study your one-liner to see how it works. First check perl -h to learn about the switches used:
-l[octal] enable line ending processing, specifies line terminator
-n assume "while (<>) { ... }" loop around program
The first one is not exactly self-explanatory, but what -l actually does is chomp each line, and then change $\ and $/ to newline. So, your one-liner:
perl -nle 'print /(\d{8}_\d{9})/'
Actually does this:
$\ = "\n";
while (<>) {
chomp;
print /(\d{8}_\d{9})/;
}
A very easy way to see this is to use the Deparse command:
$ perl -MO=Deparse -nle 'print /(\d{8}_\d{9})/'
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {
chomp $_;
print /(\d{8}_\d{9})/;
}
-e syntax OK
So, that's how you transform that into a working script.
I have no idea how you went from that to this:
use strict;
my $switch="12012011_000005777";
open (FILE, "more /home/shortcasper/work/tagcommands|");
my @array_old = (<FILE>) ;
my @array_new = @array_old ;
foreach my $line(@array_new) {
$line =~ s/\d{8}_\d{9}/$switch/g;
print $line;
sleep 1;
}
First of all, why are you opening a pipe from the more command to read a text file? That is like calling a tow truck to fetch you a cab. Just open the file. Or better yet, don't. Just use the diamond operator, like you did the first time.
You don't need to first copy the lines of a file to an array, and then use the array. while(<FILE>) is a simple way to do it.
In your one-liner, you print the regex. Well, you print the return value of the regex. In this script, you print $line. I'm not sure how you thought that would do the same thing.
Your substitution here replaces every sequence of numbers matching the pattern with the value of $switch from your script. Nothing else.
You should also be aware that sleep 1 may not do what you think when the output is buffered. Try this one-liner, for example, with its output going through a pipe:
perl -we 'for (1 .. 10) { print "line $_\n"; sleep 1; }' | cat
As you will notice, it simply waits 10 seconds and then prints everything at once. That's because perl buffers standard output in blocks when it is not connected to a terminal, and that buffer is not written out until it is full or flushed (at the latest when the perl execution ends). So, it's a perception problem. Everything works like it should, you just don't see it. (When printing straight to a terminal, output is line-buffered, so each line ending in a newline appears right away.)
If you absolutely want to have a sleep statement in your script, you'll probably want to autoflush, e.g. STDOUT->autoflush(1);
However, why are you doing that? Is it so you will have time to read the numbers? If so, put that more statement at the end of your one-liner instead:
perl ...... | more
That will pipe the output into the more command, so you can read it at your own pace. Now, for your one-liner:
Always also use -w, unless you specifically want to avoid getting warnings (which basically you never should).
Your one-liner will only print the first match on each line. If you want to print all the matches, each on its own line:
perl -wnle 'print for /(\d{8}_\d{9})/g'
If you want to print all the matches, but keep the ones from the same line on the same line:
perl -wnle 'print "@a" if @a = /(\d{8}_\d{9})/g'
Well, that should cover it.
Your open call may be failing (you should always check the result of an open to make sure it succeeded if the rest of the program depends on it) but I believe your problem is in complicating things by opening a pipe from a more command instead of simply opening the file itself. Change the open to simply
open FILE, "/home/shortcasper/work/tagcommands" or die $!;
and things should improve.