How to pipe Bash Shell command's output line by line to Perl for Regex processing?

I have some output data from some Bash shell commands. The output is delimited line by line with "\n" or "\0". Is there a way to pipe the output into Perl and process the data line by line within Perl (just like piping the output to awk, but in a Perl context)? I suppose the command may look something like this:
Bash Shell command | perl -e 'some perl commands' | another Bash Shell command
Suppose I want to replace every ":" character with a "#" character on a line-by-line basis (not a global substitution; I may use a condition, e.g. odd or even line number, to decide whether the current line gets the substitution). How can I achieve this?

See perlrun.
perl -lpe's/:/#/g' # assumes \n as input record separator
perl -0 -lpe's/:/#/g' # assumes \0 as input record separator
perl -lne'if (0 == $. % 2) { s/:/#/g; print; }' # modify and print even lines
Yes, Perl may appear at any place in a pipeline, just like awk.

The command line switch -p (if you want automatic printing) or -n (if you don't want it) will do what you want. The line contents are in $_ so:
perl -pe's/:/#/g'
would be a solution. Generally, you want to read up on the '<>' (diamond) operator which is the way to go for non-oneliners.

Related

way to fetch the argument inside perl script

I am having trouble getting the argument passed in to the following script:
echo "abc"|perl <<'EOF'
#how to get "abc". it seems not $ARGV[0] nor in <STDIN>
EOF
Thank you.
The precise command line you have there may be your problem, if that is what you're actually executing. What you are saying there is "put 'abc' on the standard input of the next thing in the pipeline. Now run a Perl script consisting of a single comment."
This will do nothing, because there's nothing executable in that Perl script. Try this:
echo "abc" | perl -e 'print <STDIN>'
If you have a short Perl script the -e option is the way to go.
Your example is not using argument, it's using standard input. You can read standard input with the I/O operators. If you actually mean that you want an argument like myscript.pl --arg then I would recommend using Getopt::Long.
You have not passed any argument to the Perl script.
You redirected the Perl script itself so it comes from standard input; that means that the piped output goes nowhere and cannot be seen by Perl.
Reconsider how you're invoking your script. Maybe:
perl script.pl "abc"
where script.pl is a file that contains the Perl script you used as a here-document. Or simply make that script executable (perhaps without the .pl suffix).
Your problem is that both the pipe and the here-document redirect STDIN, and the here-document wins. The perl process never sees the pipe; it gets the script on STDIN (and has read it to EOF before running the script, so the script sees STDIN at EOF).
Observe:
$ echo "abc" | perl <<'EOF'
print "[What have we here?]\n";
seek(STDIN, 0, 0);
print <STDIN>;
print "[Well, what do you know ...]\n";
EOF
[What have we here?]
print "[What have we here?]\n";
seek(STDIN, 0, 0);
print <STDIN>;
print "[Well, what do you know ...]\n";
[Well, what do you know ...]
$
Moral: Don't try to mix pipes and here-documents in the shell. :)

Inserting headers into multiple files

I found some command line with Perl that inserts headers into my files without going through the tedious process of inserting them one by one. Can someone walk me through the Perl aspect of this command line? I'm new to this and can't seem to find the right explanations for what I wrote.
cat header.txt | perl -0 -i -pe 'BEGIN{$h = <STDIN>}; print $h' 1*
-e
rather than provide a script in a xxxx.pl file, provide it on the command line
-p
makes it iterate over filename arguments somewhat like sed but also prints the contents of $_ at the end of the script.
the two above are combined in -pe
-i
indicate you want to edit the file in place and write the output to the same file. In practice, Perl renames the input file and reads from this renamed version while writing to a new file with the original name
-0
redefines the end of record character (\n by default) to the null character, so that a text file containing no null bytes is read in its entirety as a single record
1*
is the command line argument to your script, so I guess you are modifying any file with a name that starts with 1 (you could have used *.c, or whatever depending on the type of files you are trying to modify)
print $h
prints the variable $h that is the "main" of your script. if it was initialized with the content of the header file (the intent of this one-liner) then it will print the header file
BEGIN{ some code here }
this is stuff you execute before the script starts. this is where I'm stumped. this doesn't seem like valid perl code
so basically:
this will supposedly slurp the entire header file (because of -0) in the BEGIN block and store it in the variable $h
iterate over all the files specified by the wildcards at the end of the command line
for each file: print the header (print $h) then print the file itself (because of -pe)
so it's equivalent to spelling the script out:
BEGIN { $h = <STDIN> } # runs once, before the loop: with -0 this reads the entire header file
while (<>) { # loop implied by -p, iterates over all the 1* files
# the main contents of the "-e" script are inserted below as part of executing -pe
print $h; # print the header we saved
print $_; # implied by -p, and since we are using -0, this prints the entire content in one shot
# end of the "-e" script. again it was a single print $h statement, the second print is implied by -p
}
It's a bit hard to explain, take a look at the perlrun documentation for details (run man perlrun).
This is not a 100% complete explanation because I don't think the BEGIN block is right. I tried it on my Ubuntu machine and it complained about its syntax too.
Here's something similar, with an explanation. The program in the question doesn't run on my mac.
I needed to add the #nullable disable directive to the top of all my csharp files as part of migrating to nullable reference types.
perl -w -i -p -0777 -e 's/^/#nullable disable\n\n/' $(find . -iname '*.cs')
-w enable warnings
-i edit files in place
-p read each file block by block, printing each block after applying a perl expression. the default block size is one line
-0777 changes the default block size to the entire file
-e the perl expression to execute
The final argument uses shell command substitution to create a list of files. It passes that list of file paths to the perl command. The find command searches for files that end in .cs.
The perl program is a single substitution command. It matches the very beginning of the block and replaces (prepends, really) with "#nullable disable" and a couple new-lines.

Perl oneliner match repeating itself

I'm trying to read a specific section of a line out of a file with Perl.
The file in question is of the following syntax.
# Sets $USER1$
$USER1$=/usr/....
# Sets $USER2$
#$USER2$=/usr/...
My oneliner is simple,
perl -ne 'm/^\$USER1\$\s*=\s*(\S*?)\s*$/m; print "$1";' /my/file
For some reason I'm getting the extraction for $1 repeated several times over, apparently once for every line in the file after my match occurs. What am I missing here?
You are executing print for every line of the file because print gets called for every line, whether the regex matches or not. Replace the first ; with an &&.
From perlre:
NOTE: Failed matches in Perl do not reset the match variables, which makes it easier to write code that tests for a series of more specific cases and remembers the best match.
Try this instead:
perl -ne 'print "$1" if m/^\$USER1\$\s*=\s*(\S*?)\s*$/m;' /my/file
$ cat test.txt
# Sets $USER1$
$USER1$=/usr/....
# Sets $USER2$
#$USER2$=/usr/...
$ perl -nle 'print if /^\$USER1/;' test.txt
$USER1$=/usr/....
Try this
perl -ne '/^.*1?=([\w\W].*)$/;print "$1";' file

replacing a variable in shell script using perl

I have a variable in a shell script,
var=1234_number
I want to remove everything other than the integer part of $var. How can I do it with a Perl one-liner?
You might be looking for something to edit the shell script, in which case, this might be sufficient:
perl -i.bak -pe 's/\b(var=\d+).*/$1/' shellscript.sh
The '-i' overwrites the original file, saving a copy in shellscript.sh.bak; the substitute command finds assignments to 'var' (and not any longer name ending 'var') followed by an equals sign, some digits, and any non-digits, and leaves behind just the assignment of digits.
In the example, it gives:
var=1234
Note that the Perl regex is not foolproof - it will mangle this (dropping the closing brace).
: ${var=1234_number}
Dealing with all such possible variants is extremely tricky:
echo $var=$other
OTOH, you might be looking to eliminate the non-digits from a variable within a shell script, in which case:
var=$(echo $var | perl -pe 's/\D//g')
You could also use 'sed' for the job:
var=$(echo $var | sed 's/[^0-9]//g')
No need to use anything but the shell for this
var=1234_abcd
var=${var%_*}
echo $var # => 1234
See 'Parameter Expansion' in the bash manual.

How do I use Perl on the command line to search the output of other programs?

As I understand (Perl is new to me) Perl can be used to script against a Unix command line. What I want to do is run (hardcoded) command line calls, and search the output of these calls for RegEx matches. Is there a way to do this simply in Perl? How?
EDIT: Sequence here is:
-Call another program.
-Run a regex against its output.
my $command = "ls -l /";
my @output = `$command`;
for (@output) {
print if /^d/;
}
The qx// quasi-quoting operator (for which backticks are a shortcut) is stolen from shell syntax: run the string as a command in a new shell, and return its output (as a string or a list, depending on context). See perlop for details.
You can also open a pipe:
open my $pipe, "$command |" or die "Cannot run $command: $!";
while (<$pipe>) {
# do stuff
}
close $pipe;
This allows you to (a) avoid gathering the entire command's output into memory at once, and (b) gives you finer control over running the command. For example, you can avoid having the command be parsed by the shell:
open my $pipe, '-|', @command, '< single argument not mangled by shell >';
See perlipc for more details on that.
You might be able to get away without Perl, as others have mentioned. However, if there is some Perl feature you need, such as extended regex features or additional text manipulation, you can pipe your output to perl and then do what you need. Perl's -e switch lets you specify the Perl program on the command line:
command | perl -ne 'print if /.../'
There are several other switches you can pass to perl to make it very powerful on the command line. These are documented in perlrun. Also check out some of the articles in Randal Schwartz's Unix Review column, especially his first article for them. You can also google for Perl one liners to find lots of examples.
Do you need Perl at all? How about
command -I use | grep "myregexp" && dosomething
right in the shell?
#!/usr/bin/perl
sub my_action() {
print "Implement some action here\n";
}
open PROG, "/path/to/your/command|" or die $!;
while (<PROG>) {
/your_regexp_here/ and my_action();
print $_;
}
close PROG;
This will scan output from your command, match regexps and do some action (which now is printing the line)
In Perl you can use backticks to execute commands on the shell. Here is a document on using backticks. I'm not sure about how to capture the output, but I'm sure there's more than a way to do it.
You can indeed use a one-liner in a case like this. I recently coded up one that I use, among other ways, to produce output which lists the directory structure present in a .zip archive (one dir entry per line). So using that output as an example of command output that we'd like to filter, we could put a pipe in and then use perl with the -n -e flags to filter the incoming data (and/or do other things with it):
[command_producing_text_output] | perl -MFile::Path -n -e ^
"BEGIN{@PTM=()} if (m{^perl/(bin|lib(?!/site))}) {chomp;push @PTM,$_}" ^
-e "END{@WDD=mkpath (\@PTM,1);" ^
-e "printf qq/Created %u dirs to reflect part of structure present in the .ZIP file\n/, scalar(@WDD);}"
The shell syntax used, including the quoting of Perl code and the escaping of newlines, reflects CMD.exe usage in Windows NT-like consoles. If you need to, mentally replace "^" with "\" and " with ' in the appropriate places.
The one-liner above adds only the directory names that start with "perl/bin" or "perl/lib" (not followed by "/site"); it then creates those directories. You wind up with an (empty) tree that you can use for whatever evil purposes you desire.
The main point is to illustrate that there are flags available (-n, -p) to allow perl to loop over each input record (line), and that what you can do is unlimited in terms of complexity.