perl line-mode one-liner with ARGV [duplicate]

This question already has answers here:
How can I process options using Perl in -n or -p mode?
I often need to run some Perl one-liners for fast data manipulations, like
some_command | perl -lne 'print if /abc/'
Reading from a pipe, I don't need the loop over the filenames given as command-line arguments. How can I achieve the following?
some_command | perl -lne 'print if /$ARGV[0]/' abc
This gives the error:
Can't open abc: No such file or directory.
I understand that -n wraps
while (<>) { ... }
around my program, and that <> treats the remaining arguments as filenames, but doing the following every time is a bit impractical:
#!/bin/sh
while read line
do
    some_command | perl -lne 'BEGIN{$val=shift @ARGV} print if /$val/' "$line"
done
Is there some better way to get at the command-line arguments "inside" a Perl one-liner without having them interpreted as filenames?

Some solutions:
perl -e'while (<STDIN>) { print if /$ARGV[0]/ }' pat
perl -e'$p = shift; while (<>) { print if /$p/ }' pat
perl -e'$p = shift; print grep /$p/, <>' pat
perl -ne'BEGIN { $p = shift } print if /$p/' pat
perl -sne'print if /$p/' -- -p=pat
PAT=pat perl -ne'print if /$ENV{PAT}/'
Of course, it might make more sense to create a pattern that's an ORing of all the patterns rather than executing the same command for each pattern.
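For example, a minimal sketch of that approach (hypothetical patterns), combining them with | and reusing the environment-variable trick from above:
some_command | PAT='abc|def|ghi' perl -lne 'print if /$ENV{PAT}/'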

Also reasonably short:
... | expr=abc perl -lne 'print if /$ENV{expr}/'
Works in bash shell but maybe not other shells.

It depends on what you think will be in the lines you read, but you could play with:
#!/bin/sh
while read line
do
    some_command | perl -lne "print if /$line/"
done
Clearly, if $line might contain slashes, this is not going to fly. Then, AFAIK, you're stuck with the BEGIN block formulation.
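If $line needs to be matched literally, a hedged variant of that BEGIN-block form (not from the original answer) is to pass it as an argument and quote it with \Q...\E, so slashes and other regex metacharacters in the value can't break the pattern:
some_command | perl -lne 'BEGIN{$val = shift @ARGV} print if /\Q$val\E/' "$line"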

Related

Perl script in bash's HereDoc

Is it possible to write a Perl script inside a bash script as a heredoc?
This does not work (example only):
#!/bin/bash
perl <<EOF
while(<>) {
    chomp;
    print "xxx: $_\n";
}
EOF
Is there some nice way to embed a Perl script in a bash script? I want to run a Perl script from a bash script without putting it into an external file.
The problem here is that the script is being passed to perl on stdin, so trying to process stdin from the script doesn't work.
1. String literal
perl -e '
while(<>) {
    chomp;
    print "xxx: $_\n";
}
'
Using a string literal is the most direct way to write this, though it's not ideal if the Perl script contains single quotes itself.
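If the script does need a single quote, one common bash workaround (a sketch, not part of the answer) is to break out of the quoting with '\'':
perl -e '
while(<>) {
    chomp;
    print "it'\''s: $_\n";
}
'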
2. Use perl -e
#!/bin/bash
script=$(cat <<'EOF'
while(<>) {
    chomp;
    print "xxx: $_\n";
}
EOF
)
perl -e "$script"
If you pass the script to perl using perl -e then you won't have the stdin problem and you can use any characters you like in the script. It's a bit roundabout to do this, though. Heredocs yield input on stdin and we need strings. What to do? Oh, I know! This calls for $(cat <<HEREDOC).
Make sure to use <<'EOF' rather than just <<EOF to keep bash from doing variable interpolation inside the heredoc.
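A quick sketch of what goes wrong without the quotes: with a bare <<EOF, bash expands $_ (and anything else starting with $) before perl ever sees the script.
script=$(cat <<EOF
print "xxx: $_\n";
EOF
)
printf '%s\n' "$script"   # $_ has already been replaced by bash's own value, so this is no longer the Perl you wrote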
You could also write this without the $script variable, although it's getting awfully hairy now!
perl -e "$(cat <<'EOF'
while(<>) {
chomp;
print "xxx: $_\n";
}
EOF
)"
3. Process substitution
perl <(cat <<'EOF'
while(<>) {
    chomp;
    print "xxx: $_\n";
}
EOF
)
Along the lines of #2, you can use a bash feature called process substitution which lets you write <(cmd) in place of a file name. If you use this you don't need the -e since you're now passing perl a file name rather than a string.
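A usage sketch (hypothetical input) showing that stdin stays free for data, since the script now arrives as a /dev/fd file name:
printf 'a\nb\n' | perl <(cat <<'EOF'
while(<>) {
    chomp;
    print "xxx: $_\n";
}
EOF
)
# prints "xxx: a" and "xxx: b"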
You know I never thought of this.
The answer is "YES!" It does work. As others have mentioned, <STDIN> can't be used, but this worked fine:
$ perl <<'EOF'
print "This is a test\n";
for $i ( (1..3) ) {
print "The count is $i\n";
}
print "End of my program\n";
EOF
This is a test
The count is 1
The count is 2
The count is 3
End of my program
In Kornshell and in BASH, if you surround your end of here document string with single quotes, the here document isn't interpolated by the shell.
Only a small correction to @John Kugelman's answer: you can eliminate the useless cat and use:
read -r -d '' perlscript <<'EOF'
while(<>) {
    chomp;
    print "xxx: $_\n";
}
EOF
perl -e "$perlscript"
Here's another way to use a Perl heredoc script within bash, and take full advantage of it.
#!/bin/sh
# If you are not passing in bash variables, single-quote the HEREDOC tag.
perl -le "$(cat <<'MYPL'
# Best to build your output vars rather than writing directly
# to the pipe until the end.
my ($STDERRdata, $STDOUTdata) = ("", "");
while ($i=<STDIN>){ chomp $i;
    $STDOUTdata .= "To stdout\n";
    $STDERRdata .= "Write from within the heredoc\n";
}
print $STDOUTdata;  # Doing the pipe write at the end
warn $STDERRdata;   # will save you a lot of frustration.
MYPL
)" [optional args] <myInputFile 1>prints.txt 2>warns.txt
or
#!/bin/sh
WRITEWHAT="bash vars"
# If you want to include your bash variables, escape the $'s that are not
# bash vars and leave the HEREDOC tag unquoted so the shell interpolates it.
perl -le "$(cat <<MYPL
my (\$STDERRdata, \$STDOUTdata) = ("", "");
while (\$i=<STDIN>){ chomp \$i;
    \$STDOUTdata .= "To stdout\n";
    \$STDERRdata .= "Write $WRITEWHAT from within the heredoc\n";
}
print \$STDOUTdata;  # Doing the pipe write at the end
warn \$STDERRdata;   # will save you a lot of frustration.
MYPL
)" [optional args] <myInputFile 1>prints.txt 2>warns.txt

Line number of a file in Perl when multiple files are passed as arguments to perl cli

In awk, if I give more than one file as an argument, there are two special variables:
NR=line number corresponding to all the lines in all the files.
FNR=line number of the current file.
I know that in Perl, $. corresponds to NR (current line among lines in all of the files).
Is there anything comparable to FNR of AWK in Perl too?
Let's say I have some command line:
perl -pe 'print filename,<something special which holds the current file's line number>' *.txt
This should give me output like:
file1.txt 1
file1.txt 2
file2.txt 1
Actually, the eof documentation shows a way to do this:
# reset line numbering on each input file
while (<>) {
    next if /^\s*#/;    # skip comments
    print "$.\t$_";
} continue {
    close ARGV if eof;  # Not eof()!
}
An example one-liner that prints the first line of all files:
$ perl -ne 'print "$ARGV : $_" if $. == 1; } continue { close ARGV if eof;' *txt
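And a sketch (hypothetical file names) of the FNR-style output the question asks for, using the same close ARGV trick:
perl -ne 'print "$ARGV $.\n"; close ARGV if eof' file1.txt file2.txt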
There is no such variable in Perl. But you should study eof to be able to write something like
perl -ne 'print join ":", $. + $sum, $., "\n"; $sum += $., $.=0 if eof;' *txt
I find this situation to be a very large drawback to using Perl. While this answer is suboptimal, performance-wise, and only fits situations involving xargs, I've typically found it the workaround I use 95% of the time. So the problem scenario:
git ls-files -z | xargs -0 perl -ne 'print "$ARGV:$.\t$_" if /#define oom/'
file.h:85806 #define oom() exit(-1)
That line number is obviously not correct, and you'll obviously get the same behavior with find. Of course, with this regex, or other simpler Perl regexes, I'd just use awk or git grep -P. However, if you're dealing with a fairly complicated regular expression, or need other Perl features, that won't work, and the correct answers to this question further complicate what is already likely to be a complicated Perl one-liner.
So I just use the following:
git ls-files -z | xargs -0 -n1 perl -ne 'print "$ARGV:$.\t$_" if /#define oom/'
file.h:43 #define oom() exit(-1)
The -n1 xargs argument causes xargs to kick off a Perl process for each file, which results in the correct line numbers. You're looking at very significant performance impact, but I've found it acceptable for very large projects with millions of lines of code vs. solving it in Perl, which I find almost always requires an actual script vs. a one-liner. It is not acceptable for system-wide searches in the vast majority of cases.
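A hedged alternative that keeps xargs batching (no -n1) by borrowing the close ARGV reset from the answers above, so each file's numbering starts at 1 again:
git ls-files -z | xargs -0 perl -ne 'print "$ARGV:$.\t$_" if /#define oom/; close ARGV if eof'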

Perl command line search and replace with multiple expressions

I am using Perl to search and replace multiple regular expressions:
When I execute the following command, I get an error:
prompt> find "*.cpp" | xargs perl -i -pe 's/##(\W)/\1/g' -pe 's/(\W)##/\1/g'
syntax error at -e line 2, near "s/(\W)##/\1/g"
Execution of -e aborted due to compilation errors.
xargs: perl: exited with status 255; aborting
Having multiple -e options is valid in Perl, so why is this not working? Is there a solution to this?
Several -e's are allowed.
You are missing the ';'
find "*.cpp" | xargs perl -i -pe 's/##(\W)/\1/g;' -pe 's/(\W)##/\1/g;'
Perl statements have to end with ;.
Final statement in a block doesn't need a terminating semicolon.
So a single -e without ; will work, but you will have to add ; when you have multiple -e statements.
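A minimal sketch of the difference (hypothetical strings):
$ perl -e 'print "a"' -e 'print "b\n"'     # fails to compile: without ; the two pieces run together into one statement
$ perl -e 'print "a";' -e 'print "b\n";'   # prints: ab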
Having multiple -e values is valid, but is it useful? The values from the multiple -e are merely combined into one program, and it's up to you to ensure that together they make a syntactically correct program. The B::Deparse module can show you what perl thinks the program is:
$ perl -MO=Deparse -e 'print' -e 'q(Hello' -e ')'
print "Hello\n";
-e syntax OK
A curious thing to note is that a newline snuck in there. Think about how it got there to see what else perl is doing to combine multiple -e values.
In your program, you are substituting on the current line, then taking the modified line and substituting again. That's better written as:
prompt> find "*.cpp" | xargs perl -i -pe 's/##(\W)/\1/g; s/(\W)##/\1/g'
Now, if you are building up this command line by adding more and more -e through some automated process and you don't know ahead of time what you get, maybe those -e make sense. However, you might consider that you can do the same thing to build up the string you give to -e. I don't know what might be better because you didn't explain why you are doing it that way.
But, I suspect that in some cases, people are actually thinking about having only one substitution work. They want to try one and if its pattern doesn't work, try a different one until one succeeds. In that case you don't want to separate the substitutions by semicolons. Use the short-circuiting || instead. The s/// returns the number of substitutions it made and || will stop (short circuit) when it finds a true value:
prompt> find "*.cpp" | xargs perl -i -pe 's/##(\W)/\1/g || s/(\W)##/\1/g'
And note, you only need one -p. It only does its job once. Here's the program with multiple -p deparsed:
$ perl -MO=Deparse -i -pe 's/##(\W)/\1/g;' -pe 's/(\W)##/\1/g;'
BEGIN { $^I = ""; }
LINE: while (defined($_ = readline ARGV)) {
    s/##(\W)/$1/g;
    s/(\W)##/$1/g;
}
continue {
    die "-p destination: $!\n" unless print $_;
}
-e syntax OK
It's the same thing as having only one -p:
$ perl -MO=Deparse -pi -e 's/##(\W)/\1/g;' -e 's/(\W)##/\1/g;'
BEGIN { $^I = ""; }
LINE: while (defined($_ = readline ARGV)) {
    s/##(\W)/$1/g;
    s/(\W)##/$1/g;
}
continue {
    die "-p destination: $!\n" unless print $_;
}
-e syntax OK
Thanks so much! You helped me reduce my ascii / decimal / 8-bit binary table printer enough to fit in a tweet:
for i in {32..126};do printf "'\x$(printf %x $i)'(%3i) = " $i; printf '%03o\n' $i | perl \
-pe 's#0#000#g;' -pe 's#1#001#g;' -pe 's#2#010#g;' -pe 's#3#011#g;' \
-pe 's#4#100#g;' -pe 's#5#101#g;' -pe 's#6#110#g;' -pe 's#7#111#g' ; done | \
perl -pe 's#= 0#= #'
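For comparison, a sketch (not from the comment) that skips the octal round-trip by letting printf emit the character, its decimal value, and the 8-bit binary column directly:
perl -e 'printf "%s (%3d) = %08b\n", chr, $_, $_ for 32..126'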

Only print matching lines in perl from the command line

I'm trying to extract all ip addresses from a file. So far, I'm just using
cat foo.txt | perl -pe 's/.*?((\d{1,3}\.){3}\d{1,3}).*/\1/'
but this also prints lines that don't contain a match. I can fix this by piping through grep, but this seems like it ought to be unnecessary, and could lead to errors if the regexes don't match up perfectly.
Is there a simpler way to accomplish this?
Try this:
cat foo.txt | perl -ne 'print if s/.*?((\d{1,3}\.){3}\d{1,3}).*/\1/'
or:
<foo.txt perl -ne 'print if s/.*?((\d{1,3}\.){3}\d{1,3}).*/\1/'
It's the shortest alternative I can think of while still using Perl.
However this way might be more correct:
<foo.txt perl -ne 'if (/((\d{1,3}\.){3}\d{1,3})/) { print $1 . "\n" }'
If you've got grep, then just call grep directly:
grep -Po "(\d{1,3}\.){3}\d{1,3}" foo.txt
You've already got a suitable answer of using grep to extract the IP addresses, but just to explain why you were seeing non-matches being printed:
perldoc perlrun will tell you about all the options you can pass Perl on the command line.
Quoting from it:
-p causes Perl to assume the following loop around your program, which makes it
iterate over filename arguments somewhat like sed:
LINE:
while (<>) {
    ...             # your program goes here
} continue {
    print or die "-p destination: $!\n";
}
You could have used the -n switch instead, which does similar, but does not automatically print, for example:
cat foo.txt | perl -ne '/((?:\d{1,3}\.){3}\d{1,3})/ and print $1'
Also, there's no need to use cat; Perl will open and read the filenames you give it, so you could say e.g.:
perl -ne '/((?:\d{1,3}\.){3}\d{1,3})/ and print $1' foo.txt
ruby -0777 -ne 'puts $_.scan(/((?:\d{1,3}\.){3}\d{1,3})/)' file

What am I doing wrong in this Perl one-liner?

I have a file that contains a lot of these
"/watch?v=VhsnHIUMQGM"
and I would like to output the letter code using a perl one-liner. So I try
perl -nle 'm/\"\/watch\?v=(.*?)\"/g' filename.txt
but it doesn't print anything.
What am I doing wrong?
The -n option processes each line but doesn't print anything out. So you need to add an explicit print if you successfully match.
perl -ne 'while ( m/\"\/watch\?v=(.+?)\"/g ) { print "$1\n" }' filename.txt
Another approach, if you're sure every line will match, is to use the -p option which prints out the value of $_ after processing, e.g.:
perl -pe 's/\"\/watch\?v=(.+?)\"/$1//' filename.txt
Your regex is fine. You're getting no output because the -n option won't print anything. It simply wraps a while (<>) { ... } loop around your program (run perl --help for brief explanations of the Perl options).
The following uses your regex, but add some printing. In list context, regexes with the /g option return all captures. Effectively, we print each capture.
perl -nle 'print for m/\"\/watch\?v=(.*?)\"/g' data.dat
You can split the string on "=" instead of matching:
perl -paF= -e '$_ = $F[1]' filename.txt