I need to concatenate the result of map with a string.
perl -le 'print (map { (q(a)..q(z))[rand(26)] } 1..3) . "123"'
I expected 3 random symbols followed by 123, but there is no 123, just the 3 random symbols.
In general, I need to add a variable there.
With warnings:
print (...) interpreted as function at -e line 1.
Useless use of concatenation (.) or string in void context at -e line 1.
This is because your code is of the following form:
print(...) . "123"
Solutions:
perl -le'print( map( { (q(a)..q(z))[rand(26)] } 1..3 ) . "123" )' # Fully parenthesized
perl -le'print map( { (q(a)..q(z))[rand(26)] } 1..3 ) . "123"' # Opt parens dropped
perl -le'print( ( map { (q(a)..q(z))[rand(26)] } 1..3 ) . "123" )'
perl -le'print +( map { (q(a)..q(z))[rand(26)] } 1..3 ) . "123"' # "Disambiguated"
Except those aren't right either. While they fix the problem you asked about, they reveal a second problem. They invariably print 3123 because map in scalar context returns the number of scalars it would otherwise return in list context.
Solutions:
perl -le'print( map( { (q(a)..q(z))[rand(26)] } 1..3 ), "123" )' # . => ,
perl -le'print map( { (q(a)..q(z))[rand(26)] } 1..3 ), "123"' # . => ,
perl -le'print( ( map { (q(a)..q(z))[rand(26)] } 1..3 ), "123" )' # . => ,
perl -le'print +( map { (q(a)..q(z))[rand(26)] } 1..3 ), "123"' # . => ,
perl -le'print join "", ( map { (q(a)..q(z))[rand(26)] } 1..3 ), "123"' # join
There are a couple of interesting things going on here. First let's ask Perl to help us track down any problems by turning on warnings.
$ perl -Mwarnings -le 'print (map { (q(a)..q(z))[rand(26)] } 1..3) . "123"'
print (...) interpreted as function at -e line 1.
Useless use of concatenation (.) or string in void context at -e line 1.
lfy
Two warnings there. Let's look at both of them.
print (...) interpreted as function
If the first non-whitespace character following print (or any other list operator) is an opening parenthesis, then Perl assumes that you want to call print as a function and it will look for the balancing closing parenthesis to end the list of arguments to print.
Useless use of concatenation (.) or string in void context
Because the print call is assumed to end with the closing parenthesis, the . "123" isn't doing anything useful: it is concatenated onto the return value of print, and the result is thrown away.
The standard way to tell Perl that an opening parenthesis isn't marking a function call is to use a +.
$ perl -Mwarnings -le 'print +(map { (q(a)..q(z))[rand(26)] } 1..3) . "123"'
3123
Well, we lost the warnings. But we got '3' where we were hoping to see three symbols. What we have here now is basically this:
print +(map ...) . "123";
Because of the concatenation, map is being called in scalar context. And in scalar context, map no longer returns a list of values, but the size of that list (an integer - 3 in this case).
The fix for that is to replace the . with a comma, so map is called in list context.
$ perl -Mwarnings -le 'print +(map { (q(a)..q(z))[rand(26)] } 1..3), "123"'
ntg123
So you were being burnt by a) the parentheses not doing what you wanted them to do and b) map being called in scalar context.
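To see the context issue in isolation, here is a minimal sketch. It swaps the random letters for the first three letters of the alphabet so that the output is reproducible; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Concatenation puts the parenthesized map in scalar context, so it yields
# the number of elements it produced rather than the elements themselves.
my $scalar_ctx = ( map { ('a' .. 'z')[$_] } 0 .. 2 ) . "123";

# join evaluates map in list context and returns a single string, which
# can then be concatenated safely.
my $list_ctx = join( "", map { ('a' .. 'z')[$_] } 0 .. 2 ) . "123";

print "$scalar_ctx\n";   # 3123
print "$list_ctx\n";     # abc123
```

The join form is probably the most convenient one if the result has to end up in a variable rather than being printed directly.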
perl -le 'print join("",map { (q(a)..q(z))[rand(26)] } 1..3) . "123"'
Enclose the whole list to be printed in parentheses and use a comma as the separator:
perl -le 'print ( (map { (q(a)..q(z))[rand(26)] } 1..3) , "123")'
Related
Is there a way in Perl to lowercase all the text in an input line except the parts within single quotes (there could be more than one), using a regex? I have achieved this with the code below, but I would like to see whether it can be done with a regex and map.
while (<>) {
my $m=0;
for (split(//)) {
if (/'/ and ! $m) {
$m=1;
print;
}
elsif (/'/ and $m) {
$m=0;
print;
}
elsif ($m) {
print;
}
else {
print lc;
}
}
}
**Sample input:**
and (t.TARGET_TYPE='RAC_DATABASE' or (t.TARGET_TYPE='ORACLE_DATABASE' and t.TYPE_QUALIFIER3 != 'racinst'))
**Sample output:**
and (t.target_type='RAC_DATABASE' or (t.target_type='ORACLE_DATABASE' and t.type_qualifier3 != 'racinst'))
You can give this a shot. All one regexp.
$str =~ s/(?:^|'[^']*')\K[^']*/lc($&)/ge;
Or, cleaner and better documented (this is semantically equivalent to the above):
$str =~ s/
(?:
^ | # Match either the start of the string, or
'[^']*' # some text in quotes.
)\K # Then ignore that part,
# because we want to leave it be.
[^']* # Take the text after it, and
# lowercase it.
/lc($&)/gex;
The g flag tells the regexp to run as many times as necessary. e tells it that the substitution portion (lc($&), in our case) is Perl code, not just text. x lets us put those comments in there so that the regexp isn't total gibberish.
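For anyone who wants to try it, here is a small self-contained sketch of the same substitution wrapped in a sub. The name lc_outside_quotes is made up for this example, and it assumes the quotes in the input are balanced.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution shown above.
sub lc_outside_quotes {
    my ($str) = @_;
    # Skip past the start of the string or a quoted chunk (\K drops it from
    # the match), then lowercase the unquoted text that follows.
    $str =~ s/(?:^|'[^']*')\K[^']*/lc($&)/ge;
    return $str;
}

my $in = "and (t.TARGET_TYPE='RAC_DATABASE' or t.TYPE_QUALIFIER3 != 'racinst')";
print lc_outside_quotes($in), "\n";
# and (t.target_type='RAC_DATABASE' or t.type_qualifier3 != 'racinst')
```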
Isn't a regexp a bit heavy for such a simple job?
Why not let plain old split handle it?
#!/usr/bin/perl
while (<>)
{
    @F = split "'";
    # Even indexes hold text outside the quotes, odd indexes hold quoted text.
    @F = map { $_ % 2 ? $F[$_] : lc $F[$_] } (0..$#F);
    print join "'", @F;
}
The above is for understanding. The last two lines are usually combined into:
print join "'", map { $_ % 2 ? $F[$_] : lc $F[$_] } (0..$#F);
Or go one step further and make it a one-liner (in a bash shell). In concept, it looks like:
perl -paF/'/ -e '$_ = join "'", map { $_ % 2 ? $F[$_] : lc $F[$_] } (0..$#F);' YOUR_FILE
In reality, however, the single quotes have to be escaped for the shell:
perl -paF/\'/ -e '$_ = join "'"'"'", map { $_ % 2 ? $F[$_] : lc $F[$_] } (0..$#F);' YOUR_FILE
(Inside a single-quoted shell string, each single quote has to become 5 characters: '"'"'. The result is assigned to $_ because -p prints $_ after each line.)
If it doesn't help your job, it helps sleep.
One more variant with Perl one-liner. I'm using hex \x27 for single quotes
$ cat sql_str.txt
and (t.TARGET_TYPE='RAC_DATABASE' or (t.TARGET_TYPE='ORACLE_DATABASE' and t.TYPE_QUALIFIER3 != 'racinst'))
$ perl -ne ' { @F=split(/\x27/); for my $val (0..$#F) { $F[$val]=lc($F[$val]) if $val%2==0 } ; print join("\x27",@F) } ' sql_str.txt
and (t.target_type='RAC_DATABASE' or (t.target_type='ORACLE_DATABASE' and t.type_qualifier3 != 'racinst'))
$
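One thing to watch with the split approach: split drops trailing empty fields by default, so a string that ends in a closing quote (with no newline after it) would lose that quote when rejoined. A rough sketch that guards against this with a limit of -1 follows; the sub name lc_even_fields is just for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper. Splitting on the quote puts unquoted text at even
# indexes and quoted text at odd indexes.
sub lc_even_fields {
    my ($line) = @_;
    # The -1 limit keeps trailing empty fields, so a string that ends in a
    # quote gets its closing quote back when the fields are rejoined.
    my @f = split /'/, $line, -1;
    return join "'", map { $_ % 2 ? $f[$_] : lc $f[$_] } 0 .. $#f;
}

print lc_even_fields("t.TYPE='RAC_DATABASE' AND x='Y'"), "\n";
# t.type='RAC_DATABASE' and x='Y'
```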
When I run the below script I get
$VAR1 = 'ssh -o Compression=yes -o ConnectTimeout=333 remoteIp \'mbuffer -q -s 128k -m mbufferSize -4 -I mbufferPort|zfs recv recvOpt dstDataSet\'';
which leads me to think that all $shellQuote does is convert an array to a string, add a ' at the beginning and end, and put a | between two arrays. But I can't figure out the purpose of the map function.
The script is a super simplified version of this in order to figure out what exactly $shellQuote does.
Question
$shellQuote looks very complicated. Does it do anything else I am missing?
#!/usr/bin/perl
use Data::Dumper;
use warnings;
use strict;
my $shellQuote = sub {
    my @return;
    for my $group (@_){
        my @args = @$group;
        for (@args){
            s/'/'"'"'/g;
        }
        push @return, join ' ', map {/^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'}} @args;
    }
    return join '|', @return;
};
sub buildRemoteRefArray {
    my $remote = shift;
    my @sshCmdArray = (qw(ssh -o Compression=yes -o), 'ConnectTimeout=' . '333');
    if ($remote){
        return [@sshCmdArray, $remote, $shellQuote->(@_)];
    }
    return @_;
};
my @recvCmd = buildRemoteRefArray('remoteIp', ['mbuffer', (qw(-q -s 128k -m)), 'mbufferSize', '-4', '-I', 'mbufferPort'], ['zfs', 'recv', 'recvOpt', 'dstDataSet']);
my $cmd = $shellQuote->(@recvCmd);
print Dumper $cmd;
The map function, by which I assume you mean
map {/^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'}} @args
checks each argument to see if it is a legal shell token or not. Legal shell tokens are passed through; anything with a suspicious character gets enclosed in '' quotes.
Bear in mind that your example has two calls to $shellQuote, not just one; you're printing:
print Dumper($shellQuote->(
[
qw(ssh -o Compression=yes -o),
'ConnectTimeout=' . '333',
'remoteIp',
$shellQuote->(
[
'mbuffer',
(qw(-q -s 128k -m)),
'mbufferSize',
'-4',
'-I',
'mbufferPort',
],
[
'zfs',
'recv',
'recvOpt',
'dstDataSet',
],
),
]
));
Here I've indented the arguments to each shell command one step further than the command itself, to make the structure of the list clearer. So your '' quotes are coming from the outer $shellQuote, which recognizes that the inner $shellQuote has put spaces into its result; the | is coming from the inner $shellQuote, which uses it to combine the two array refs passed to it.
Breaking the map function down, map { expr } @args means 'evaluate expr for each element of @args and make a list of the results'.
/^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'} is a ternary expression (Googleable term). $_ is the current element of @args, and /re/i is true if and only if $_ matches the given regular expression (Googleable term) (case insensitive). The whole expression means 'if the current element of @args contains only the listed characters (ASCII letters, ASCII digits, and the characters -, /, @, =, and _), return it as-is; otherwise return it wrapped in single quotes'.
The for loop, before that, replaces each ' in each element of @args with '"'"', which is a particular way of embedding a single quote into a single-quoted string in sh.
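To make those two steps concrete, here is a rough sketch that applies them to a single argument. The helper name quote_one and the sample arguments are made up for this example, and the character class (written here with an escaped @) is modeled on the one in the question rather than copied from any library.

```perl
use strict;
use warnings;

# Hypothetical helper: escape embedded single quotes, then wrap the argument
# in single quotes unless it consists only of "safe" characters.
sub quote_one {
    my ($arg) = @_;
    $arg =~ s/'/'"'"'/g;
    return $arg =~ /^[-\/\@=_0-9a-z]+$/i ? $arg : qq{'$arg'};
}

print quote_one($_), "\n" for 'recv', 'two words', "it's";
# recv
# 'two words'
# 'it'"'"'s'
```

The last line shows why the replacement works in sh: the string is closed, a double-quoted ' is appended, and a new single-quoted string is opened.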
Ignore your code for a second and look at this one as it's a bit clearer.
# shell_quote_arg("foo bar") => 'foo bar'
sub shell_quote_arg {
    my ($s) = @_;
    return $s if $s !~ /[^-\/@=_0-9a-z]/i;
    $s =~ s/'/'"'"'/g; # '
    return qq{'$s'};
}
# shell_quote("echo", "foo bar") => echo 'foo bar'
sub shell_quote {
    return join ' ', map { shell_quote_arg($_) } @_;
}
my $remote_shell_cmd1 = shell_quote('mbuffer', 'arg1a', 'arg1b');
my $remote_shell_cmd2 = shell_quote('zfs', 'arg2a', 'arg2b');
my $remote_shell_cmd = join(' | ', $remote_shell_cmd1, $remote_shell_cmd2);
my $local_shell_cmd = shell_quote('ssh', $host, $remote_shell_cmd);
My shell_quote is used to build a shell command from a program name and its arguments. For example,
shell_quote('zfs', 'recv', 'recvOpt', 'dstDataSet')
returns
zfs recv recvOpt dstDataSet
So why not just use join(' ', 'zfs', 'recv', 'recvOpt', 'dstDataSet')? Because characters such as spaces, $ and ' have special meaning to the shell. shell_quote needs to do extra work if these are present. For example,
shell_quote('echo', q{He's got $100})
returns
echo 'He'"'"'s got $100' # When run, uses echo to output: He's got $100
The shellQuote you showed does the same thing as my shell_quote, but it also does the join('|', ...) you see in my code.
By the way, notice that shellQuote is called twice. The first time, it's used to build the command to execute on the remote machine, as the following does:
my $remote_shell_cmd1 = shell_quote('mbuffer', 'arg1a', 'arg1b');
my $remote_shell_cmd2 = shell_quote('zfs', 'arg2a', 'arg2b');
my $remote_shell_cmd = join(' | ', $remote_shell_cmd1, $remote_shell_cmd2);
The second time, it's used to build the command to execute on the local machine, as the following does:
my $local_shell_cmd = shell_quote('ssh', $host, $remote_shell_cmd);
Kindly shed some light on how these two ways of grepping in Perl differ from each other:
eval {grep /pattern/, ....};
and the normal one,
grep {/pattern/} ....
First of all, there are 2 independent differences between your alternatives, and they have different purposes. Wrapping the grep in eval allows you to catch errors that are normally fatal (like a syntax error in the regular expression). Putting a block after the grep keyword lets you use a matching rule that is more complex than a single expression.
Here are the 4 combinations that can be made out of your 2 examples:
@y = grep /pattern/, @x; # grep EXPR, no eval
@y = grep { /pattern/ } @x; # grep BLOCK, no eval
eval { @y = grep /pattern/, @x }; # grep EXPR inside eval BLOCK
eval { @y = grep { /pattern/ } @x }; # grep BLOCK inside eval BLOCK
Now we can look in more detail at 2 separate questions: what do you gain from the eval, and what do you gain from using the grep BLOCK syntax? In the simple cases shown above, you gain nothing from either one.
When you want to do a grep where the matching condition is more complicated than a simple regexp, grep BLOCK gives you more flexibility in how you express the condition. You can put multiple statements in the block and use temporary variables. For example this grep within a grep:
# Note: not the most efficient method for finding an intersection of arrays.
my @a = qw/A E I O U/;
my @b = qw/A B D O P Q R/;
my @intersection = grep { my $x = $_; grep { $_ eq $x } @b } @a;
print "@intersection\n";
In the above example, we needed a temporary $x to hold the value being tested by the outer grep so it could be compared to $_ in the inner grep. The inner grep could have been written without a BLOCK as grep $_ eq $x, @b but I think using the same syntax for both looks better.
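Since the comment in that example says the nested grep is not the most efficient way to intersect two arrays, here is a sketch of one common alternative that uses a hash lookup. The name %in_b is chosen for this example only.

```perl
use strict;
use warnings;

my @a = qw/A E I O U/;
my @b = qw/A B D O P Q R/;

# Build a lookup table of @b once, then test each element of @a against it
# instead of scanning @b again for every element.
my %in_b = map { $_ => 1 } @b;
my @intersection = grep { $in_b{$_} } @a;

print "@intersection\n";   # A O
```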
The eval block would be useful if you were looking for matches of a regexp that is determined at runtime, and you don't want your program to abort when the regexp is invalid. For example:
@x = qw/foo bar baz quux xyzzy/;
do {
    print STDERR 'Enter pattern: ';
    $pat = <STDIN>;
    chomp $pat;
    eval {
        @y = grep /$pat/, @x;
    };
    print STDERR "Invalid pattern\n" if $@;
} while($@);
print "result: @y\n";
We ask the user for a pattern and print the list of matches from @x. If the pattern is not a valid regexp, the eval catches the error and puts it into $@, and the program keeps running (the "Invalid pattern" message is printed and the loop continues so the user can try again). When a valid regexp is entered, there is no error, so $@ is false and the "result" line is printed. Sample run:
Enter pattern: z$
result: baz
Enter pattern: ^(?!....)
result: foo bar baz
Enter pattern: ([^z])\1
result: foo quux
Enter pattern: [xyz
Invalid pattern
Enter pattern: [xyz]
result: baz quux xyzzy
Enter pattern: ^C
Note that eval doesn't catch syntax errors in a fixed regexp. Those are compiled when the script is compiled, so if you have a simple script like
perl -ne 'print if eval { /[xyz/ } or eval { /^ba/ }'
it fails immediately. The evals don't help. Compare to
perl -ne '$x = "[xyz"; $y = "^ba"; print if eval { /$x/ } or eval { /$y/ }'
which is the same thing but with regexps built from variables - this one runs and prints matches for /^ba/. The first eval always returns false (and sets $@, which doesn't matter if you don't look at it).
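If the pattern comes from outside the program, another option is to compile it once with qr// inside the eval and reuse the compiled regexp afterwards. This is only a sketch; compile_pattern is a made-up helper name, and the list of patterns stands in for user input.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a runtime pattern inside eval and return the
# compiled regexp, or undef if the pattern is invalid.
sub compile_pattern {
    my ($pat) = @_;
    my $re = eval { qr/$pat/ };
    return $re;    # undef when qr// died; the reason is left in $@
}

my @x = qw/foo bar baz quux xyzzy/;
for my $pat ('z$', '[xyz') {
    if ( defined( my $re = compile_pattern($pat) ) ) {
        my @y = grep { /$re/ } @x;
        print "result: @y\n";      # result: baz
    }
    else {
        print "Invalid pattern\n"; # printed for '[xyz'
    }
}
```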
Here is the thing I don't understand.
This script works correctly (notice the concatenation in the map function):
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %aa = map { 'a' . '' => 1 } (1..3);
print Dumper \%aa;
__END__
output:
$VAR1 = {
'a' => 1
};
But without concatenation the map does not work. Here is the script I expect to work, but it does not:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %aa = map { 'a' => 1 } (1..3);
print Dumper \%aa;
__END__
output:
Not enough arguments for map at e.pl line 7, near "} ("
syntax error at e.pl line 7, near "} ("
Global symbol "%aa" requires explicit package name at e.pl line 9.
Execution of e.pl aborted due to compilation errors.
Can you please explain such behaviour?
Perl uses heuristics to decide whether you're using:
map { STATEMENTS } LIST; # or
map EXPR, LIST;
Because although "{" is often the start of a block, it might also be the start of a hashref.
These heuristics don't look ahead very far in the token stream (IIRC two tokens).
You can force "{" to be interpreted as a block using:
map {; STATEMENTS } LIST; # the semicolon acts as a disambiguator
You can force "{" to be interpreted as a hash using:
map +{ LIST }, LIST; # the plus sign acts as a disambiguator
grep suffers similarly. (Technically so does do, in that a hashref can be given as an argument, which will then be stringified and treated as if it were a filename. That's just weird though.)
Per the Documentation for map:
Because Perl doesn't look ahead for the closing } it has to take a guess at which it's dealing with based on what it finds just after the {. Usually it gets it right, but if it doesn't it won't realize something is wrong until it gets to the }
Giving the examples:
%hash = map { "\L$_" => 1 } @array # perl guesses EXPR. wrong
%hash = map { +"\L$_" => 1 } @array # perl guesses BLOCK. right
So adding + will give you the same as the first example you've given
my %aa = map { +'a'=> 1 } (1..3);
Perl's manpage entry for map() explains this:
"{" starts both hash references and blocks, so "map { ..."
could be either the start of map BLOCK LIST or map EXPR, LIST.
Because Perl doesn't look ahead for the closing "}" it has to
take a guess at which it's dealing with based on what it finds
just after the "{". Usually it gets it right, but if it doesn't
it won't realize something is wrong until it gets to the "}"
and encounters the missing (or unexpected) comma. The syntax
error will be reported close to the "}", but you'll need to
change something near the "{" such as using a unary "+" to give
Perl some help:
%hash = map { "\L$_" => 1 } @array # perl guesses EXPR. wrong
%hash = map { +"\L$_" => 1 } @array # perl guesses BLOCK. right
%hash = map { ("\L$_" => 1) } @array # this also works
%hash = map { lc($_) => 1 } @array # as does this.
%hash = map +( lc($_) => 1 ), @array # this is EXPR and works!
%hash = map ( lc($_), 1 ), @array # evaluates to (1, @array)
or to force an anon hash constructor use "+{":
@hashes = map +{ lc($_) => 1 }, @array # EXPR, so needs comma at end
to get a list of anonymous hashes each with only one entry
apiece.
Based on this, to get rid of the concatenation kludge, you'd need to adjust your syntax to one of these instead:
my %aa = map { +'a' => 1 } (1..3);
my %aa = map { ('a' => 1) } (1..3);
my %aa = map +( 'a' => 1 ), (1..3);
The braces are a little ambiguous in the context of map. They can be surrounding a block as you are intending, or they can be an anonymous hash constructor. There is some fuzzy logic in the perl parser which tries to guess which one you mean.
Your second case looks more like an anonymous hash to perl.
See the perldoc for map which explains this and gives some workarounds.
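As a quick way to check the two disambiguation tricks mentioned in this thread, here is a small sketch that runs both of them side by side. The sample data is made up.

```perl
use strict;
use warnings;

my @array = qw/Foo BAR/;

# A leading ";" forces the braces to be parsed as a block; the block returns
# a key/value pair for each element.
my %hash = map {; "\L$_" => 1 } @array;

# A leading "+" forces an anonymous hash constructor, so map takes the EXPR
# form and needs the comma before the list.
my @hashes = map +{ lc($_) => 1 }, @array;

print join( ",", sort keys %hash ), "\n";   # bar,foo
print scalar(@hashes), "\n";                # 2
```

The first form builds one hash with two keys; the second builds a list of two single-entry hash references.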
I'm trying to scan a file for lines containing a specific string, and print the lines to another file.
However, IF the line containing the string ends in "," (ignoring whitespace), I need to keep printing the following lines until a ")" character.
Currently I'm using
for func in $fnnames
do
sed -n "/$func/p" <"$file" >>"$CODEBASEDIR/function_signature"
done
where $func contains the string I look for, but of course it doesn't work for the restriction.
Is there a way to do this? Currently using bash, but perl is fine also.
Thanks.
Your question is tricky because your restrictions are not precise. You say - I think - that a block should look like this:
foo,
bar,
baz)
Where foo is the string that starts the block, and closing parenthesis ends it. However, you could also be saying:
foo bar baz) xxxxxxxxxxx,
And you only want to print until the ), which is to say foo bar baz), IF the line ends with comma.
You could also be saying that only lines that end with a comma should be continued:
foo, # print + is continued
bar # print + is not continued
xxxxx # ignored line
foo # print + is not continued
foo,
bar,
baz) # closing parens also end block
Since I can only guess that you mean the first alternative, I give you two options:
use strict;
use warnings;
sub flip {
while (<DATA>) {
print if /^foo/ .. /\)\s*$/;
}
}
sub ifchain {
my ($foo, $print);
while (<DATA>) {
if (/^foo/) {
$foo = 1; # start block
print;
} elsif ($foo) {
if (/,\s*$/) {
print;
} elsif (/\)\s*$/) {
$foo = 0; # end block
print;
}
# for catching input errors:
else { chomp; warn "Mismatched line '$_'" }
}
}
}
__DATA__
foo1,
bar,
baz)
sadsdasdasdasd,
asda
adaffssd
foo2,
two,
three)
yada
The first one will print any lines found between a line starting with foo and a line ending with ). It will ignore the "lines end with comma" restriction. On the positive side, it can be simplified to a one-liner:
perl -ne 'print if /^foo/ .. /\)\s*$/' file.txt
The second one is just a simplistic if-structure that will consider both restrictions, and warn (print to STDERR) if it finds a line inside a block that does not match both.
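If the lines are already in memory, the range (flip-flop) operator from the first option can also be used inside grep. The sketch below assumes the first interpretation of the question; extract_blocks is a made-up name, and the start string is passed in rather than hard-coded.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines from each line starting with $start
# up to and including the next line that ends in ")".
sub extract_blocks {
    my ($start, @lines) = @_;
    # grep evaluates its block in scalar context, so ".." acts as a
    # flip-flop here. Its state belongs to the operator, so an unterminated
    # block would carry over into the next call of this sub.
    return grep { /^\Q$start\E/ .. /\)\s*$/ } @lines;
}

my @lines = ("foo1,\n", "bar,\n", "baz)\n", "noise,\n", "foo2,\n", "three)\n", "yada\n");
print extract_blocks('foo', @lines);
# foo1,
# bar,
# baz)
# foo2,
# three)
```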
perl -ne 'print if 1 .. /\).*,\s*$/'