How to check for a substring in Bourne shell (sh)?

I want to test whether a string contains "substring". Most answers online are based on Bash. I tried
if [ $string == "*substring*" ]
which did not work. Currently
if echo ${string} | grep -q "substring"
works. Is there a better way?

Use POSIX-compliant parameter expansion with the classic test command:
#!/bin/sh
substring=ab
string=abc
if [ "$string" != "${string%"$substring"*}" ]; then
echo "$substring present in $string"
fi
Or, written explicitly with the test command:
if test "$string" != "${string%"$substring"*}"; then

In a POSIX-features only shell you won't be able to do general pattern or regex matching inside a conditional without the help of an external utility.
That said:
Kenster's helpful answer shows how to use the branches of a case ... esac statement for pattern matching.
Inian's helpful answer shows how to match indirectly inside a conditional, using patterns as part of parameter expansions.
Your own grep approach is certainly an option, though you should double-quote ${string}:
if echo "${string}" | grep -q "substring"; ...
A slightly more efficient way is to use the expr utility, but note that per POSIX it only supports BREs (basic regular expressions), which are comparatively limited:
string='This text contains the word "substring".'
if expr "$string" : ".*substring" >/dev/null; then echo "matched"; fi
Note that the regex - the 3rd operand - is implicitly anchored at the start of the input, hence the need for .*.
>/dev/null suppresses expr's default output, which is the length of the matched string in this case. (If nothing matches, the output is 0, and the exit code is set to 1).
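As a hedged sketch of the expr approach, the helper below wraps the test in a function. The name expr_contains is made up for illustration. It prefixes both operands with X, a common defensive idiom so that a string such as "=" is not mistaken for an expr operator. The second argument is still interpreted as a BRE, so regex metacharacters in it are not matched literally.

```shell
#!/bin/sh
# expr_contains STRING REGEX (hypothetical helper name)
# Succeeds if the BRE in $2 matches somewhere in $1.
expr_contains() {
    # The leading X keeps expr from treating "$1" as an operator;
    # .* compensates for the implicit anchoring at the start.
    expr "X$1" : "X.*$2" >/dev/null
}

string='This text contains the word "substring".'
if expr_contains "$string" 'substring'; then
    echo "matched"
fi
```

Because the exit status of expr is the function's status, it can be used directly in if, && and || constructs.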

If you're just testing for substrings (or anything that can be matched using filename wildcards) you can use case:
#!/bin/sh
while read line; do
case "$line" in
*foo*) echo "$line" contains foo ;;
*bar*) echo "$line" contains bar ;;
*) echo "$line" isnt special ;;
esac
done
$ ./testit.sh
food
food contains foo
ironbar
ironbar contains bar
bazic
bazic isnt special
foobar
foobar contains foo
This is basic Bourne shell functionality. It doesn't require any external programs, it's not bash-specific, and it predates POSIX. So it should be pretty portable.
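The case technique can also be wrapped in a small reusable function. This is only a sketch and the name contains is an assumption, not something from the answers here. Quoting "$2" inside the pattern makes the shell treat wildcard characters in the substring literally.

```shell
#!/bin/sh
# contains STRING SUBSTRING (hypothetical helper name)
# Succeeds if SUBSTRING occurs literally anywhere in STRING.
contains() {
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

if contains "ironbar" "bar"; then
    echo "ironbar contains bar"
fi
```

No external program is started, so this stays cheap even inside a loop.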

Short answer is no, not if you are trying to use vanilla sh, without Bash extensions. On many modern systems, /bin/sh is actually a link to /bin/bash, which provides a superset of sh's functionality (for the most part). Your original attempt would have worked with Bash's builtin [[ extended test command: http://mywiki.wooledge.org/BashFAQ/031

Related

Run a sed search and replace inside perl

I am trying to test the code snippet below for a bigger script that I am writing. However, I can't get the search working with parentheses and variables.
Appreciate any help someone can give me.
Code snippet:
#!/usr/bin/perl
$file="test4.html";
$Search="Help (Test)";
$Replace="Testing";
print "/usr/bin/sed -i cb 's/$Search/$Replace/g' $file\n";
`/usr/bin/sed -i cb 's/$Search/$Replace/g' $file`;
Thanks,
Ash
The syntax to run a command in a child process and wait for its termination in perl is system "cmd", "arg1", "arg2",...:
#!/usr/bin/perl
$file="test4.html";
$Search="Help (Test)";
$Replace="Testing";
print "/usr/bin/sed -icb -e 's/$Search/$Replace/g' -- $file\n";
system "/usr/bin/sed", "-icb", "-e", "s/$Search/$Replace/g", "--", $file;
(error checking left as an exercise, see perldoc -f system for details)
Note that -i is not a standard sed option. The few implementations that support it (yours must be the FreeBSD one as you've separated the cb backup extension from -i) have actually copied it from perl! It does feel a bit silly to be calling sed from perl here.
Looking at your approach:
The `...` operator itself is reminiscent of the equivalent `...` shell operator. In perl, what's inside is evaluated as if inside double quotes, in that $var, @var... perl variables are expanded, and a shell is started with -c and the resulting string as arguments and with its stdout redirected to a pipe.
The shell interprets that argument as code in the shell syntax. Perl reads the output of that inline shell script from the other end of the pipe and that makes up the expansion of `...`. Same as in shell command substitution except that there is no stripping of zero bytes or of trailing newlines.
sed -i produces no output, so it's pointless to try and capture its output with `...` here.
Now in your case, the code that sh is asked to interpret is:
/usr/bin/sed -i cb 's/Help (Test)/Testing/g' test4.html
That should work fine on FreeBSD or macOS at least. If $file had been test$(reboot).html, that would have been worse though.
Here, because you have the contents of variables that end up interpreted as code in an interpreter (here sh), you have a potential arbitrary command injection vulnerability.
In the system approach, we remove sh, so that particular vulnerability is removed. However sed is also an interpreter of some language. That language is not as omnipotent as that of sh, but for instance sed can write to arbitrary files with its w command. The GNU implementation (which you don't seem to be using) can run arbitrary commands as well.
So you still potentially have a code injection vulnerability in the case of $Search or $Replace coming from an external source.
If that's the case, you'd need to make sure you properly sanitise those values before running sed. See for instance: How to ensure that string interpolated into `sed` substitution escapes all metachars
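To make the sanitising step concrete, here is a minimal shell sketch of escaping both sides of a sed substitution. The function names are hypothetical, and the sketch assumes that "/" is the s/// delimiter, that the pattern is a BRE, and that the values contain no newlines. It is an illustration of the idea, not a vetted sanitiser.

```shell
#!/bin/sh
# Hypothetical helpers; assume "/" is the s/// delimiter, BRE syntax,
# and single-line values.

# Escape characters that are special in a BRE pattern (and the delimiter).
escape_sed_pattern() {
    printf '%s\n' "$1" | sed -e 's|[][\/.*^$]|\\&|g'
}

# Escape characters that are special in the replacement part.
escape_sed_replacement() {
    printf '%s\n' "$1" | sed -e 's|[\/&]|\\&|g'
}

search='Help (Test)'
replace='Testing & more'
printf '%s\n' 'See Help (Test) here' |
    sed -e "s/$(escape_sed_pattern "$search")/$(escape_sed_replacement "$replace")/g"
```

With the values above the pipeline should print "See Testing & more here". The same escaping could be done in Perl before the strings are handed to system.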

Linux shell: change Perl code to linux shell, grep line by line

The following code is a Perl script. It greps the lines containing 'Stage' from hostlog, then matches each line against a regex and increments the count by 1 on every match:
$command = 'grep \'Stage \' '. $hostlog;
@stage_info = qx($command);
foreach (@stage_info) {
if ( /Stage\s(\d+)\s(.*)/ ) {
$stage_number = $stage_number+1;
}
}
So how can I do this in a Linux shell? Based on my tests, we cannot loop line by line, since the lines contain spaces.
That is a horrible piece of Perl code you've got there. Here's why:
1. It looks like you are not using use strict; use warnings;. Leaving them out is a huge mistake: it does not prevent errors, it just hides them.
2. Using qx() to grep lines from a file is a completely redundant thing to do, as this is what Perl does best itself. "Shelling out" a process like that most often slows your program down.
3. Use some whitespace to make your code readable. This is hard to read, and looks more complicated than it is.
4. You capture strings by using parentheses in your regex, but you never use these strings.
5. Re: $stage_number=$stage_number+1, see point 3. And also, this can be written $stage_number++. Using the ++ operator will make your code clearer, will prevent the uninitialized warnings, and save you some typing.
Here is what your code should look like:
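use strict;
use warnings;
my $hostlog = shift @ARGV;  # assumption: the log file path is passed as the first argument
my $stage_number = 0;
open my $fh, "<", $hostlog or die "Cannot open $hostlog for reading: $!";
while (<$fh>) {
    if (/Stage\s\d+/) {
        $stage_number++;
    }
}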
You're not doing anything with the internal captures, so why bother? You could do everything with a grep:
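$ stage_number=$(grep -cE 'Stage[[:space:]][0-9]+[[:space:]]' "$hostlog")
This is using extended regular expressions. Note that \d is Perl syntax rather than part of POSIX EREs, hence [0-9] and [[:space:]] here. I believe the GNU version also accepts an equivalent pattern without the -E parameter (written with \+), and in Solaris, even the egrep command might not quite allow for this regular expression.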
If there's something more you have to do, you've got to explain it in your question.
If I understand the issue correctly, you should be able to do this just fine in the shell:
while read -r; do
if echo "${REPLY}" | grep -q -P 'Stage '; then
# Do what you need to do
fi
done < test.log
Note that if your grep command supports the -P option you may be able to use the Perl regular expression as-is for the second test.
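Spaces inside a line are not a problem as long as the loop reads whole lines with read rather than word-splitting command output. The sketch below counts matching lines without starting grep for every line. The function name count_stage_lines is hypothetical, and the glob pattern is only an approximation of the Perl regex /Stage\s(\d+)\s(.*)/, not an exact equivalent.

```shell
#!/bin/sh
# count_stage_lines FILE (hypothetical helper name)
# Prints the number of lines that look like "Stage <digits> <text>".
count_stage_lines() {
    count=0
    # IFS= and -r keep each line intact, including spaces and backslashes.
    # The pattern requires "Stage ", one digit, and a later space.
    while IFS= read -r line; do
        case $line in
            *"Stage "[0-9]*" "*) count=$((count + 1)) ;;
        esac
    done < "$1"
    echo "$count"
}

# Example call, assuming $hostlog holds the log file path:
#   stage_number=$(count_stage_lines "$hostlog")
```

Because the loop is fed by a redirection rather than a pipe, the counter is still set when the loop ends.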
This is almost it. Bash's pattern substitution has no expression for "one or more digits", so only a single digit is checked.
#!/bin/bash
command=( grep 'Stage ' "$hostlog" )
while read line
do
[ "$line" != "${line/Stage [0-9]/}" ] && (( ++stage_number ))
done < <( "${command[@]}" )
On the other hand taking the function of the perl script into account rather than the operations it performs the whole thing could be rewritten as
(( stage_number += ` grep -c 'Stage [0-9][0-9]*[[:space:]]' "$hostlog" ` ))
or this
stage_number=` grep -c 'Stage [0-9][0-9]*[[:space:]]' "$hostlog" `
if, in the original perl, stage_number is uninitialised, or is initialised to 0.

Perl ambiguous command line options, and security implications of eval with -i?

I know this is incorrect. I just want to know how perl parses this.
So, I was playing around with perl. What I wanted was perl -ne, but what I typed was perl -ie. The behavior was kind of interesting, and I'd like to know what happened.
$ echo 1 | perl -ie'next unless /g/i'
So perl Aborted (core dumped) on that. Reading perl --help I see -i takes an extension for backups.
-i[extension] edit <> files in place (makes backup if extension supplied)
For those that don't know, -e is just eval. So I'm thinking one of three things could have happened. Either it was parsed as
perl -i -e'next unless /g/i' i gets undef, the rest goes as argument to e
perl -ie 'next unless /g/i' i gets the argument e, the rest is hanging like a file name
perl -i"-e'next unless /g/i'" whole thing as an argument to i
When I run
$ echo 1 | perl -i -e'next unless /g/i'
The program doesn't abort. This leads me to believe that 'next unless /g/i' is not being parsed as a literal argument to -e. Unambiguously the above would be parsed that way and it has a different result.
So what is it? Well, playing around a little more, I got
$ echo 1 | perl -ie'foo bar'
Unrecognized switch: -bar (-h will show valid options).
$ echo 1 | perl -ie'foo w w w'
... works fine guess it reads it as `perl -ie'foo' -w -w -w`
Playing around with the above, I try this...
$ echo 1 | perl -ie'foo e eval q[warn "bar"]'
bar at (eval 1) line 1.
Now I'm really confused.. So how is Perl parsing this? Lastly, it seems you can actually get a Perl eval command from within just -i. Does this have security implications?
$ perl -i'foo e eval "warn q[bar]" '
Quick answer
Shell quote-processing is collapsing and concatenating what it thinks is all one argument. Your invocation is equivalent to
$ perl '-ienext unless /g/i'
It aborts immediately because perl parses this argument as containing -u, which triggers a core dump where execution of your code would begin. This is an old feature that was once used for creating pseudo-executables, but it is vestigial in nature these days.
What appears to be a call to eval is the misparse of -e 'ss /g/i'.
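The shell's part in this can be checked without involving perl at all. The sketch below uses a made-up helper, show_args, that prints each argument it receives in angle brackets. It shows that the quoted text is glued onto -ie before any program sees it.

```shell
#!/bin/sh
# show_args (hypothetical helper): print each argument on its own line.
show_args() {
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

# One argument: the quotes are removed and the pieces are concatenated.
show_args -ie'next unless /g/i'
# prints: <-ienext unless /g/i>

# Two arguments: -i stands alone, so -e gets the code as its value.
show_args -i -e'next unless /g/i'
# prints: <-i>
#         <-enext unless /g/i>
```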
First clue
B::Deparse can be your friend, provided you happen to be running on a system without dump support.
$ echo 1 | perl -MO=Deparse,-p -ie'next unless /g/i'
dump is not supported.
BEGIN { $^I = "enext"; }
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined(($_ = <ARGV>))) {
chomp($_);
(('ss' / 'g') / 'i');
}
So why does unle disappear? If you’re running Linux, you may not have even gotten as far as I did. The output above is from Perl on Cygwin, and the error about dump being unsupported is a clue.
Next clue
Of note from the perlrun documentation:
-u
This switch causes Perl to dump core after compiling your program. You can then in theory take this core dump and turn it into an executable file by using the undump program (not supplied). This speeds startup at the expense of some disk space (which you can minimize by stripping the executable). (Still, a "hello world" executable comes out to about 200K on my machine.) If you want to execute a portion of your program before dumping, use the dump operator instead. Note: availability of undump is platform specific and may not be available for a specific port of Perl.
Working hypothesis and confirmation
Perl’s argument processing sees the entire chunk as a single cluster of options because it begins with a dash. The -i option consumes the next word (enext), as we can see in the implementation for -i processing.
case 'i':
Safefree(PL_inplace);
[Cygwin-specific code elided -geb]
{
const char * const start = ++s;
while (*s && !isSPACE(*s))
++s;
PL_inplace = savepvn(start, s - start);
}
if (*s) {
++s;
if (*s == '-') /* Additional switches on #! line. */
s++;
}
return s;
For the backup file’s extension, the code above from perl.c consumes up to the first whitespace character or end-of-string, whichever is first. If characters remain, the first must be whitespace, then skip it, and if the next is a dash then skip it also. In Perl, you might write this logic as
if ($$s =~ s/i(\S+)(?:\s-)//) {
my $extension = $1;
return $extension;
}
Then, all of -u, -n, -l, and -e are valid Perl options, so argument processing eats them and leaves the nonsensical
ss /g/i
as the argument to -e, which perl parses as a series of divisions. But before execution can even begin, the archaic -u causes perl to dump core.
Unintended behavior
An even stranger bit is if you put two spaces between next and unless
$ perl -ie'next  unless /g/i'
the program attempts to run. Back in the main option-processing loop we see
case '*':
case ' ':
while( *s == ' ' )
++s;
if (s[0] == '-') /* Additional switches on #! line. */
return s+1;
break;
The extra space terminates option parsing for that argument. Witness:
$ perl -ie'next  nonsense -garbage --foo' -e die
Died at -e line 1.
but without the extra space we see
$ perl -ie'next nonsense -garbage --foo' -e die
Unrecognized switch: -onsense -garbage --foo (-h will show valid options).
With an extra space and dash, however,
$ perl -ie'next  -unless /g/i'
dump is not supported.
Design motivation
As the comments indicate, the logic is there for the sake of harsh shebang (#!) line constraints, which perl does its best to work around.
Interpreter scripts
An interpreter script is a text file that has execute permission enabled and whose first line is of the form:
#! interpreter [optional-arg]
The interpreter must be a valid pathname for an executable which is not itself a script. If the filename argument of execve specifies an interpreter script, then interpreter will be invoked with the following arguments:
interpreter [optional-arg] filename arg...
where arg... is the series of words pointed to by the argv argument of execve.
For portable use, optional-arg should either be absent, or be specified as a single word (i.e., it should not contain white space) …
Three things to know:
1. '-x y' means -xy to Perl (for some arbitrary options "x" and "y").
2. -xy, as is common for unix tools, is a "bundle" representing -x -y.
3. -i, like -e, absorbs the rest of the argument. Unlike -e, it considers a space to be the end of the argument (as per #1 above).
That means
-ie'next unless /g/i'
which is just a fancy way of writing
'-ienext unless /g/i'
unbundles to
-ienext -u -n -l '-ess /g/i'
  ^^^^^             ^^^^^^^
  val for -i        val for -e
perlrun documents -u as:
This switch causes Perl to dump core after compiling your program. You can then in theory take this core dump and turn it into an executable file by using the undump program (not supplied). This speeds startup at the expense of some disk space (which you can minimize by stripping the executable). (Still, a "hello world" executable comes out to about 200K on my machine.) If you want to execute a portion of your program before dumping, use the dump() operator instead. Note: availability of undump is platform specific and may not be available for a specific port of Perl.

replacing a variable in shell script using perl

I have a variable in a shell script,
var=1234_number
I want to remove everything other than the integer part of $var. How can I do it using a Perl one-liner?
You might be looking for something to edit the shell script, in which case, this might be sufficient:
perl -i.bak -pe 's/\b(var=\d+).*/$1/' shellscript.sh
The '-i' overwrites the original file, saving a copy in shellscript.sh.bak; the substitute command finds assignments to 'var' (and not any longer name ending 'var') followed by an equals sign, some digits, and any non-digits, and leaves behind just the assignment of digits.
In the example, it gives:
var=1234
Note that the Perl regex is not foolproof - it will mangle this (dropping the closing brace).
: ${var=1234_number}
Dealing with all such possible variants is extremely tricky:
echo $var=$other
OTOH, you might be looking to eliminate the non-digits from a variable within a shell script, in which case:
var=$(echo $var | perl -pe 's/\D//g')
You could also use 'sed' for the job:
var=$(echo $var | sed 's/[^0-9]//g')
No need to use anything but the shell for this
var=1234_abcd
var=${var%_*}
echo $var # => 1234
See 'Parameter Expansion' in the bash manual.
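The two readings of the question (keep the leading integer, or drop every non-digit) can be sketched side by side in plain sh. The function names below are made up for illustration. The first uses only parameter expansion, the second relies on the external tr utility.

```shell
#!/bin/sh
# leading_digits STRING (hypothetical helper name)
# Removes the longest suffix that starts at the first non-digit.
leading_digits() {
    printf '%s\n' "${1%%[!0-9]*}"
}

# only_digits STRING (hypothetical helper name)
# Deletes every character that is not 0-9, wherever it occurs.
only_digits() {
    printf '%s' "$1" | tr -cd '0-9'
    echo
}

var=1234_number
var=$(leading_digits "$var")
echo "$var"   # prints 1234
```

The difference shows up for input such as 12ab34, where the first helper keeps 12 and the second keeps 1234.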

How do I use Perl on the command line to search the output of other programs?

As I understand (Perl is new to me) Perl can be used to script against a Unix command line. What I want to do is run (hardcoded) command line calls, and search the output of these calls for RegEx matches. Is there a way to do this simply in Perl? How?
EDIT: Sequence here is:
-Call another program.
-Run a regex against its output.
my $command = "ls -l /";
my @output = `$command`;
for (@output) {
print if /^d/;
}
The qx// quasi-quoting operator (for which backticks are a shortcut) is stolen from shell syntax: run the string as a command in a new shell, and return its output (as a string or a list, depending on context). See perlop for details.
You can also open a pipe:
open my $pipe, "$command |";
while (<$pipe>) {
# do stuff
}
close $pipe;
This lets you (a) avoid gathering the command's entire output into memory at once, and (b) control more finely how the command is run. For example, you can avoid having the command parsed by the shell:
open my $pipe, '-|', @command, '< single argument not mangled by shell >';
See perlipc for more details on that.
You might be able to get away without Perl, as others have mentioned. However, if there is some Perl feature you need, such as extended regex features or additional text manipulation, you can pipe your output to perl and then do what you need. Perl's -e switch lets you specify the Perl program on the command line:
command | perl -ne 'print if /.../'
There are several other switches you can pass to perl to make it very powerful on the command line. These are documented in perlrun. Also check out some of the articles in Randal Schwartz's Unix Review column, especially his first article for them. You can also google for Perl one liners to find lots of examples.
Do you need Perl at all? How about
command -I use | grep "myregexp" && dosomething
right in the shell?
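A minimal sketch of the shell-only route follows. The helper name match_output is an assumption, not an existing tool. It runs a command, filters its output with grep, and passes grep's exit status back, so it can drive an if or a && chain.

```shell
#!/bin/sh
# match_output PATTERN COMMAND [ARG...] (hypothetical helper name)
# Runs COMMAND and prints the output lines that match the BRE PATTERN.
# The exit status is grep's: 0 if at least one line matched.
match_output() {
    pattern=$1
    shift
    "$@" | grep -e "$pattern"
}

# Example: act only if "ls -l /" lists at least one directory.
if match_output '^d' ls -l / >/dev/null; then
    echo "found at least one directory entry"
fi
```

The command is run from its argument list, so its arguments are not re-parsed by a second shell.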
#!/usr/bin/perl
sub my_action() {
print "Implement some action here\n";
}
open PROG, "/path/to/your/command|" or die $!;
while (<PROG>) {
/your_regexp_here/ and my_action();
print $_;
}
close PROG;
This will scan the output from your command, match it against the regexp and perform some action on each match (here, printing a message). Every line is also printed.
In Perl you can use backticks to execute commands on the shell. Here is a document on using backticks. I'm not sure about how to capture the output, but I'm sure there's more than a way to do it.
You can indeed use a one-liner in a case like this. I recently coded up one that I use, among other ways, to produce output which lists the directory structure present in a .zip archive (one dir entry per line). So using that output as an example of command output that we'd like to filter, we could put a pipe in and then use perl with the -n -e flags to filter the incoming data (and/or do other things with it):
[command_producing_text_output] | perl -MFile::Path -n -e ^
"BEGIN{@PTM=()} if (m{^perl/(bin|lib(?!/site))}) {chomp;push @PTM,$_}" ^
-e "END{@WDD=mkpath (\@PTM,1);" ^
-e "printf qq/Created %u dirs to reflect part of structure present in the .ZIP file\n/, scalar(@WDD);}"
The shell syntax used, including the quoting of the Perl code and the escaping of newlines, reflects CMD.exe usage in Windows NT-like consoles. If you need to, mentally replace "^" with "\" and " with ' in the appropriate places.
The one-liner above collects only the directory names that start with "perl/bin" or "perl/lib" (not followed by "/site"); it then creates those directories. You wind up with an (empty) tree that you can use for whatever evil purposes you desire.
The main point is to illustrate that there are flags available (-n, -p) to make perl loop over each input record (line), and that what you can do is unlimited in terms of complexity.