I have a Perl script which runs fine under Perl 5.005 on HPUX 11i v3 and that causes a small problem under Perl 5.10 on my Ubuntu 11.04 box.
It boils down to a line like:
open (CMD, "do something | grep -E 'sometext$' |");
(it's actually capturing ps -ef output to see if a process is running but I don't think that's important (a)).
Now this runs fine on the HPUX environment but, when I try it out under Ubuntu, I get:
sh: Syntax error: Unterminated quoted string
By inserting copious debug statements, I tracked it down to that offending line then started removing characters one-by-one until it stopped complaining. Luckily, $ was the first one I tried and it stopped giving me the error, so I then changed the line to:
open (CMD, "do something | grep -E 'sometext\$' |");
and it worked fine (under Linux anyway - I haven't tested that on HPUX since I don't have access to that machine today - if it does work, I'll just use that approach but I'd still like to know why it's a problem).
So it seems obvious that the $ is "swallowing" the closing single quote under my Linux environment but not apparently on the HPUX one.
My question is simply, why? Surely there weren't any massive changes between 5.005 and 5.10. Or is there some sort of configuration item I'm missing?
(a) But, if you know a better way to do this without external CPAN modules (ie, with just the baseline Perl 5.005 installation), I'd be happy to know about it.
$' is a special variable (see perldoc perlvar). 5.005 was many versions ago, so it's possible that something has changed in the regexp engine to make this variable behave differently (although it appears to be in 5.005 also).
As for the better way, you could at least only run the 'ps -ef' in a pipeline and do the 'grep' in perl.
Use the following!!!
use strict;
use warnings;
You would have gotten
Use of uninitialized value $' in concatenation (.) or string
A sigil followed by any punctuation symbol (on a standard keyboard) is a variable in Perl, regardless of whether it is defined or not. So in a double-quoted string, [$@][symbol] will always be read as one token and interpolated unless the sigil is escaped.
I have a feeling that the difference you are seeing has to do with different system shells rather than different versions of perl.
Consider your line:
open (CMD, "do something | grep -E 'sometext$' |");
When perl sees that, it will interpolate the empty $' variable into the double quoted string, so the string becomes:
open (CMD, "do something | grep -E 'sometext |");
At that point, your shell gets to process a line that looks like:
do something | grep -E 'sometext
Whether it succeeds or fails will depend on the shell's rules regarding unterminated strings (some shells will complain loudly, others will automatically terminate the string at eof).
If you were using the warnings pragma in your script, you probably would have gotten a warning about interpolating an undefined variable.
A shorter and cleaner way to read in the output of ps would be:
my @lines = grep /sometext$/, `ps -ef`;
Or using an explicit open:
my @lines = grep /sometext$/, do {
    open my $fh, '-|', 'ps -ef' or die $!;
    <$fh>
};
Because $' is a special variable in recent versions of Perl.
From the official documentation (perlvar):
$':
The string following whatever was matched by the last successful
pattern match (not counting any matches hidden within a BLOCK or
eval() enclosed by the current BLOCK).
If there were no successful pattern matches, $' is empty and your statement essentially interpolates to
open (CMD, "do something | grep -E 'sometext |");
Escaping the dollar sign (the solution that works for you on Linux) should work on HPUX too.
I'm not sure when this variable was added, but I can confirm that it exists in Perl 5.8.5. What's New for Perl 5.005 mentions $' (not as a new feature), so I think it was already there before then.
You should probably use single-quotes rather than double quotes around the string since there is nothing in the string that should be interpolated:
open (CMD, q{do something | grep -E 'sometext$' |});
Better would be to use the 3-argument form of open with a lexical file handle:
open my $cmd, '-|', q{do something | grep -E 'sometext$'} or die 'a horrible death';
I don't have a good explanation for why $' is being recognized as the special variable in 5.10 and yet it was not in 5.005. That is unexpected.
Is there a really good reason you can't upgrade to something like 5.14.1? Even if you don't change the system-provided Perl, there's no obvious reason you can't install a recent version in some other location and use that for all your scripting work.
$' is a special variable.
If you want to avoid variable expansion, just use q():
open (CMD, q(do something | grep -E 'sometext$' |));
I am trying to test the code snippet below for a bigger script that I am writing. However, I can't get the search working with parentheses and variables.
Appreciate any help someone can give me.
Code snippet:
#!/usr/bin/perl
$file="test4.html";
$Search="Help (Test)";
$Replace="Testing";
print "/usr/bin/sed -i cb 's/$Search/$Replace/g' $file\n";
`/usr/bin/sed -i cb 's/$Search/$Replace/g' $file`;
Thanks,
Ash
The syntax to run a command in a child process and wait for its termination in perl is system "cmd", "arg1", "arg2",...:
#!/usr/bin/perl
$file="test4.html";
$Search="Help (Test)";
$Replace="Testing";
print "/usr/bin/sed -icb -e 's/$Search/$Replace/g' -- $file\n";
system "/usr/bin/sed", "-icb", "-e", "s/$Search/$Replace/g", "--", $file;
(error checking left as an exercise, see perldoc -f system for details)
Note that -i is not a standard sed option. The few implementations that support it (yours must be the FreeBSD one as you've separated the cb backup extension from -i) have actually copied it from perl! It does feel a bit silly to be calling sed from perl here.
Looking at your approach:
The `...` operator itself is reminiscent of the equivalent `...` shell operator. In perl, what's inside is evaluated as if inside double quotes, in that $var, @var... perl variables are expanded, and a shell is started with -c and the resulting string as arguments and with its stdout redirected to a pipe.
The shell interprets that argument as code in the shell syntax. Perl reads the output of that inline shell script from the other end of the pipe and that makes up the expansion of `...`. Same as in shell command substitution except that there is no stripping of zero bytes or of trailing newlines.
sed -i produces no output, so it's pointless to try and capture its output with `...` here.
Now in your case, the code that sh is asked to interpret is:
/usr/bin/sed -i cb 's/Help (Test)/Testing/g' test4.html
That should work fine on FreeBSD or macOS at least. If $file had been test$(reboot).html, that would have been worse though.
Here, because you have the contents of variables that end up interpreted as code in an interpreter (here sh), you have a potential arbitrary command injection vulnerability.
In the system approach, we remove sh, so that particular vulnerability is removed. However sed is also an interpreter of some language. That language is not as omnipotent as that of sh, but for instance sed can write to arbitrary files with its w command. The GNU implementation (which you don't seem to be using) can run arbitrary commands as well.
So you still potentially have a code injection vulnerability in the case of $Search or $Replace coming from an external source.
If that's the case, you'd need to make sure you properly sanitise those values before running sed. See for instance: How to ensure that string interpolated into `sed` substitution escapes all metachars
I'm having an issue with some code and I'm wondering if anyone can assist.
Basically I'm trying to execute an isql query against a database and assign it to a scalar variable. The isql command makes use of the column separator which is defined as the pipe symbol.
So I have it set-up as such:
my $command = "isql -S -U -s| -i";
my $isql_output = `$command`;
The isql command works in isolation but when issued as a backtick it stops at the pipe. I've tried concatenating the $command string using sub-strings, using single quotes and backslash escaping items such as -s\"\|\" to no avail. I've also tried using qx instead of backticks.
Unfortunately I'm currently using an older version of perl (v5.6.1) with limited scope for upgrade so I'm not sure if I can resolve this.
You have to quote the | in a way that the shell does not recognize it as a special character. Two ways:
Put the -s| into single quotes: '-s|'. Perl will leave single quotes inside double quoted strings alone and pass them to the shell unmodified.
Escape the | with two backslashes: -s\\|. Why two? The first one is seen by Perl telling it to pass the next character through unmodified. Therefore the shell sees -s\| and does something very similar: it sees the single \ and knows not to treat the next char, |, special.
The problem is that the command is being executed through a shell.
You can avoid this by passing the command and arguments in a list rather than a single string.
The backtick construct does not support that, so you would need to use the open() function instead.
I haven't tested the following code but it gives the idea:
my @command = (qw(isql -Sserver -Uuser -Ppassword -s| -w4096), '-i' . $file);
print join(' ', @command), "\n";
open(my $fh, '-|', @command)
    or die "failed to run isql command: $!\n";
my @isql_output = <$fh>;
close($fh);
my $isql_output = $isql_output[0];
chomp($isql_output);
If you're working with a 15 year old version of Perl (which Oracle users tend to do) I'm not sure this will all be supported. For instance, you may need to write chop instead of chomp.
UPDATE: the problem is not the Perl version, but this construct not being supported on Windows, according to the documentation. This must be qualified: I use Perl on Cygwin and it works fine there, but I don't know whether you can use Cygwin.
Single quotes should work. Try to run test perl script:
my $cmd = "./test.sh -S -U -s '|' -i";
print `$cmd`;
With test.sh:
#!/bin/sh
echo "$@"
Output should be -S -U -s | -i
I know this is incorrect. I just want to know how perl parses this.
So, I'm playing around with perl. What I wanted was perl -ne; what I typed was perl -ie. The behavior was kind of interesting, and I'd like to know what happened.
$ echo 1 | perl -ie'next unless /g/i'
So perl Aborted (core dumped) on that. Reading perl --help I see -i takes an extension for backups.
-i[extension] edit <> files in place (makes backup if extension supplied)
For those that don't know -e is just eval. So I'm thinking one of three things could have happened either it was parsed as
perl -i -e'next unless /g/i' i gets undef, the rest goes as argument to e
perl -ie 'next unless /g/i' i gets the argument e, the rest is hanging like a file name
perl -i"-e'next unless /g/i'" whole thing as an argument to i
When I run
$ echo 1 | perl -i -e'next unless /g/i'
The program doesn't abort. This leads me to believe that 'next unless /g/i' is not being parsed as a literal argument to -e. Unambiguously the above would be parsed that way and it has a different result.
So what is it? Well playing around with a little more, I got
$ echo 1 | perl -ie'foo bar'
Unrecognized switch: -bar (-h will show valid options).
$ echo 1 | perl -ie'foo w w w'
... works fine; I guess it reads it as `perl -ie'foo' -w -w -w`
Playing around with the above, I try this...
$ echo 1 | perl -ie'foo e eval q[warn "bar"]'
bar at (eval 1) line 1.
Now I'm really confused. So how is Perl parsing this? Lastly, it seems you can actually get a Perl eval command from within just -i. Does this have security implications?
$ perl -i'foo e eval "warn q[bar]" '
Quick answer
Shell quote-processing is collapsing and concatenating what it thinks is all one argument. Your invocation is equivalent to
$ perl '-ienext unless /g/i'
It aborts immediately because perl parses this argument as containing -u, which triggers a core dump where execution of your code would begin. This is an old feature that was once used for creating pseudo-executables, but it is vestigial in nature these days.
What appears to be a call to eval is the misparse of -e 'ss /g/i'.
First clue
B::Deparse can be your friend, provided you happen to be running on a system without dump support.
$ echo 1 | perl -MO=Deparse,-p -ie'next unless /g/i'
dump is not supported.
BEGIN { $^I = "enext"; }
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined(($_ = <ARGV>))) {
    chomp($_);
    (('ss' / 'g') / 'i');
}
So why does unle disappear? If you’re running Linux, you may not have even gotten as far as I did. The output above is from Perl on Cygwin, and the error about dump being unsupported is a clue.
Next clue
Of note from the perlrun documentation:
-u
This switch causes Perl to dump core after compiling your program. You can then in theory take this core dump and turn it into an executable file by using the undump program (not supplied). This speeds startup at the expense of some disk space (which you can minimize by stripping the executable). (Still, a "hello world" executable comes out to about 200K on my machine.) If you want to execute a portion of your program before dumping, use the dump operator instead. Note: availability of undump is platform specific and may not be available for a specific port of Perl.
Working hypothesis and confirmation
Perl’s argument processing sees the entire chunk as a single cluster of options because it begins with a dash. The -i option consumes the next word (enext), as we can see in the implementation for -i processing.
case 'i':
    Safefree(PL_inplace);
    [Cygwin-specific code elided -geb]
    {
        const char * const start = ++s;
        while (*s && !isSPACE(*s))
            ++s;
        PL_inplace = savepvn(start, s - start);
    }
    if (*s) {
        ++s;
        if (*s == '-')          /* Additional switches on #! line. */
            s++;
    }
    return s;
For the backup file’s extension, the code above from perl.c consumes up to the first whitespace character or end-of-string, whichever comes first. If characters remain, the first must be whitespace; skip it, and if the next character is a dash, skip it too. In Perl, you might write this logic as
if ($$s =~ s/\Ai(\S*)(?:\s-?)?//) {
    my $extension = $1;
    return $extension;
}
Then, all of -u, -n, -l, and -e are valid Perl options, so argument processing eats them and leaves the nonsensical
ss /g/i
as the argument to -e, which perl parses as a series of divisions. But before execution can even begin, the archaic -u causes perl to dump core.
Unintended behavior
An even stranger bit is if you put two spaces between next and unless
$ perl -ie'next  unless /g/i'
the program attempts to run. Back in the main option-processing loop we see
case '*':
case ' ':
    while( *s == ' ' )
        ++s;
    if (s[0] == '-')            /* Additional switches on #! line. */
        return s+1;
    break;
The extra space terminates option parsing for that argument. Witness:
$ perl -ie'next  nonsense -garbage --foo' -e die
Died at -e line 1.
but without the extra space we see
$ perl -ie'next nonsense -garbage --foo' -e die
Unrecognized switch: -onsense -garbage --foo (-h will show valid options).
With an extra space and dash, however,
$ perl -ie'next -unless /g/i'
dump is not supported.
Design motivation
As the comments indicate, the logic is there for the sake of harsh shebang (#!) line constraints, which perl does its best to work around.
Interpreter scripts
An interpreter script is a text file that has execute permission enabled and whose first line is of the form:
#! interpreter [optional-arg]
The interpreter must be a valid pathname for an executable which is not itself a script. If the filename argument of execve specifies an interpreter script, then interpreter will be invoked with the following arguments:
interpreter [optional-arg] filename arg...
where arg... is the series of words pointed to by the argv argument of execve.
For portable use, optional-arg should either be absent, or be specified as a single word (i.e., it should not contain white space) …
Three things to know:
1. '-x y' means -xy to Perl (for some arbitrary options "x" and "y").
2. -xy, as common for unix tools, is a "bundle" representing -x -y.
3. -i, like -e, absorbs the rest of the argument. Unlike -e, it considers a space to be the end of the argument (as per #1 above).
That means
-ie'next unless /g/i'
which is just a fancy way of writing
'-ienext unless /g/i'
unbundles to
-ienext -u -n -l '-ess /g/i'
  ^^^^^             ^^^^^^^
  -----             -------
val for -i         val for -e
perlrun documents -u as:
This switch causes Perl to dump core after compiling your program. You can then in theory take this core dump and turn it into an executable file by using the undump program (not supplied). This speeds startup at the expense of some disk space (which you can minimize by stripping the executable). (Still, a "hello world" executable comes out to about 200K on my machine.) If you want to execute a portion of your program before dumping, use the dump() operator instead. Note: availability of undump is platform specific and may not be available for a specific port of Perl.
I have a variable in a shell script,
var=1234_number
I want to replace all other than integer of $var .. how can I do it using a perl onliner?
You might be looking for something to edit the shell script, in which case, this might be sufficient:
perl -i.bak -pe 's/\b(var=\d+).*/$1/' shellscript.sh
The '-i' overwrites the original file, saving a copy in shellscript.sh.bak; the substitute command finds assignments to 'var' (and not any longer name ending 'var') followed by an equals sign, some digits, and any non-digits, and leaves behind just the assignment of digits.
In the example, it gives:
var=1234
Note that the Perl regex is not foolproof - it will mangle this (dropping the closing brace).
: ${var=1234_number}
Dealing with all such possible variants is extremely tricky:
echo $var=$other
OTOH, you might be looking to eliminate digits from a variable within a shell script, in which case:
var=$(echo $var | perl -pe 's/\D//g')
You could also use 'sed' for the job:
var=$(echo $var | sed 's/[^0-9]//g')
No need to use anything but the shell for this
var=1234_abcd
var=${var%_*}
echo $var # => 1234
See 'Parameter Expansion' in the bash manual.
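If digits and junk can interleave (say 12a34_number), trimming a suffix is not enough; bash's pattern substitution can delete every non-digit instead (bash-specific, not plain sh):

```shell
var=12a34_number
echo "${var//[!0-9]/}"
# prints: 1234
```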
As I understand (Perl is new to me) Perl can be used to script against a Unix command line. What I want to do is run (hardcoded) command line calls, and search the output of these calls for RegEx matches. Is there a way to do this simply in Perl? How?
EDIT: Sequence here is:
-Call another program.
-Run a regex against its output.
my $command = "ls -l /";
my @output = `$command`;
for (@output) {
    print if /^d/;
}
The qx// quasi-quoting operator (for which backticks are a shortcut) is stolen from shell syntax: run the string as a command in a new shell, and return its output (as a string or a list, depending on context). See perlop for details.
You can also open a pipe:
open my $pipe, "$command |";
while (<$pipe>) {
# do stuff
}
close $pipe;
This allows you to (a) avoid gathering the entire command's output into memory at once, and (b) gives you finer control over running the command. For example, you can avoid having the command be parsed by the shell:
open my $pipe, '-|', @command, '< single argument not mangled by shell >';
See perlipc for more details on that.
You might be able to get away without Perl, as others have mentioned. However, if there is some Perl feature you need, such as extended regex features or additional text manipulation, you can pipe your output to perl and then do what you need. Perl's -e switch lets you specify the Perl program on the command line:
command | perl -ne 'print if /.../'
There are several other switches you can pass to perl to make it very powerful on the command line. These are documented in perlrun. Also check out some of the articles in Randal Schwartz's Unix Review column, especially his first article for them. You can also google for Perl one liners to find lots of examples.
Do you need Perl at all? How about
command -I use | grep "myregexp" && dosomething
right in the shell?
#!/usr/bin/perl
sub my_action {
print "Implement some action here\n";
}
open PROG, "/path/to/your/command|" or die $!;
while (<PROG>) {
/your_regexp_here/ and my_action();
print $_;
}
close PROG;
This will scan the output from your command, match regexps and do some action (which here is printing the line)
In Perl you can use backticks to execute commands on the shell. Here is a document on using backticks. I'm not sure about how to capture the output, but I'm sure there's more than one way to do it.
You indeed use a one-liner in a case like this. I recently coded up one that I use, among other ways, to produce output which lists the directory structure present in a .zip archive (one dir entry per line). So using that output as an example of command output that we'd like to filter, we could put a pipe in and then use perl with the -n -e flags to filter the incoming data (and/or do other things with it):
[command_producing_text_output] | perl -MFile::Path -n -e ^
"BEGIN{@PTM=()} if (m{^perl/(bin|lib(?!/site))}) {chomp;push @PTM,$_}" ^
-e "END{@WDD=mkpath (\@PTM,1);" ^
-e "printf qq/Created %u dirs to reflect part of structure present in the .ZIP file\n/, scalar(@WDD);}"
The shell syntax used, including the quoting of perl code and the escaping of newlines, reflects CMD.exe usage in Windows NT-like consoles. If you need to, mentally replace "^" with "\" and " with ' in the appropriate places.
The one-liner above adds only the directory names that start with "perl/bin" or "perl/lib" (not followed by "/site"); it then creates those directories. You wind up with an (empty) tree that you can use for whatever evil purposes you desire.
The main point is to illustrate that there are flags available (-n, -p) to
allow perl to loop over each input record (line), and that what you can do is unlimited in terms of complexity.