Is there a Perl idiom which is the functional equivalent of calling a subroutine from within the substitution operator? - perl

Perl allows ...
$a = "fee";
$result = 1 + f($a) ; # invokes f with the argument $a
but disallows, or rather doesn't do what I want ...
s/((fee)|(fie)|(foe)|(foo))/f($1)/ ; # does not invoke f with the argument $1
The desired end result is a substitution whose replacement is computed from what the regex matched.
Do I have to write
sub lala {
my $haha = shift;
return $haha . $haha;
}
my $a = "the giant says foe" ;
$a =~ m/((fee)|(fie)|(foe)|(foo))/;
my $result = lala($1);
$a =~ s/$1/$result/;
print "$a\n";

See perldoc perlop. You need to specify the e modifier so that the replacement part is evaluated.
#!/usr/bin/perl
use strict; use warnings;
my $x = "the giant says foe" ;
$x =~ s/(f(?:ee|ie|o[eo]))/lala($1)/e;
print "$x\n";
sub lala {
my ($haha) = @_;
return "$haha$haha";
}
Output:
C:\Temp> r
the giant says foefoe
Incidentally, avoid using $a and $b outside of sort blocks: they are special package-scoped variables that are exempt from strict.
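As a further sketch of the same e modifier (my own example, not part of the question): combining it with g replaces every match, and r (Perl 5.14 or later) returns the result instead of modifying the original string. The sub name double_up and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: repeats the matched word twice.
sub double_up {
    my ($word) = @_;
    return $word x 2;
}

my $text = "fee fie foe";

# /e evaluates the replacement as Perl code, /g applies it to every match,
# /r (Perl 5.14+) returns the new string and leaves $text untouched.
my $out = $text =~ s/\b(f(?:ee|ie|o[eo]))\b/double_up($1)/ger;

print "$text\n";   # fee fie foe
print "$out\n";    # feefee fiefie foefoe
```

Without the r modifier the substitution would change $text in place and return the number of replacements made.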

Related

How can I extract the number from the output of a shell command?

The output for the command is ent3, and from that output I want 3 to be stored in a variable
Perl code
sub {
if ( $exit == 1 )
{
$cmdStr = "lsdev | grep en | grep VLAN | awk '{ print \$1 }'\r";
$result =_run_cmd($cmdStr);
my @PdAt_val = split("\r?\n", $result);
my $num = $result =~ /([0-9]+)/;
print "The char is $num\n";
$exit = 0;
exp_continue;
Tidied code
sub {
if ( $exit == 1 ) {
$cmdStr = "lsdev | grep en | grep VLAN | awk '{ print \$1 }'\r";
$result = _run_cmd($cmdStr);
my @PdAt_val = split("\r?\n", $result);
my $num = $result =~ /([0-9]+)/;
print "The char is $num\n";
$exit = 0;
exp_continue;
Your code that is doing the work here is:
my $num = $result =~ /([0-9]+)/;
Let's put that into a simple program so we can see what's going on.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my $result = 'ext3';
my $num = $result =~ /([0-9]+)/;
say $num;
And that prints 1. Which isn't what we want. What's going on?
Well, if you read the documentation for the match operator (in the section Regexp Quote-Like Operators in "perlop"), you'll see what the operator returns under different circumstances. It says:
Searches a string for a pattern match, and in scalar context returns true if it succeeds, false if it fails.
So that explains the behaviour we're seeing. That "1" is just a true value saying that the match succeeded. But how do we get the value that we have captured in our parentheses? There are a couple of ways. Firstly, it's written into the $1 variable.
my $num;
if ($result =~ /([0-9]+)/) {
$num = $1;
}
say $num;
But I think the other approach is what you were looking for. If you read on, you'll see what the operator returns in list context:
m// in list context returns a list consisting of the subexpressions matched by the parentheses in the pattern, that is, ($1, $2, $3 ...)
So if we put the match operator in list context, then we'll get the contents of $1 returned. How do we put a match into list context? By making the expression a list assignment - which we can do by putting parentheses around the left-hand side of the assignment.
my ($num) = $result =~ /([0-9]+)/;
say $num;
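To see both contexts side by side, here is a small sketch (the variable names and the second pattern with two capture groups are my own, chosen for illustration):

```perl
use strict;
use warnings;

my $result = 'ent3';

# Scalar context: the match returns a true value (1), not the capture.
my $flag = $result =~ /([0-9]+)/;

# List context: the match returns the captured groups.
my ($num) = $result =~ /([0-9]+)/;

# Several capture groups can be assigned in one go.
my ($name, $idx) = $result =~ /^([a-z]+)([0-9]+)$/;

# A failed match returns an empty list, so the variable stays undef.
my ($none) = 'ent' =~ /([0-9]+)/;

print "flag=$flag num=$num name=$name idx=$idx\n";   # flag=1 num=3 name=ent idx=3
print "no match\n" unless defined $none;             # no match
```

Because a failed match leaves the variable undefined, it is worth checking with defined before using the value.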
Using regex, something like this should work:
if($result =~ /([0-9]+)/) {
$num = $1;
}
print $num;

Perl unexpected result

Imagine I have this Perl script
my $name = " foo ";
my $sn = " foosu";
trim($name, \$sn);
print "name: [$name]\n";
print "sn: [$sn]\n";
exit 0;
sub trim{
my $fref_trim = sub{
my ($ref_input) = @_;
${$ref_input} =~ s/^\s+// ;
${$ref_input} =~ s/\s+$// ;
};
foreach my $input (@_){
if (ref($input) eq "SCALAR"){
$fref_trim->($input);
} else {
$fref_trim->(\$input);
}
}
}
Result:
name: [foo]
sn: [foosu]
I would expect $name to be "[ foo ]" when printing the value after calling trim, but the sub is setting $name as I would want it. Why is this working, when it really shouldn't?
I'm not passing $name by reference and the trim sub is not returning anything. I'd expect the trim sub to create a copy of the $name value, process the copy, but then the original $name would still have the leading and trailing white spaces when printed in the main code.
I assume it is because of the alias with @_, but shouldn't the foreach my $input (@_) force the sub to copy the value and only treat the value not the alias?
I know I can simplify this sub and I used it only as an example.
Elements of @_ are aliases to the original variables. What you are observing is the difference between:
sub ltrim {
$_[0] =~ s/^\s+//;
return $_[0];
}
and
sub ltrim {
my ($s) = @_;
$s =~ s/^\s+//;
return $s;
}
Compare your code to:
#!/usr/bin/env perl
my $name = " foo ";
my $sn = " foosu";
trim($name, \$sn);
print "name: [$name]\n";
print "sn: [$sn]\n";
sub trim {
my @args = @_;
my $fref_trim = sub{
my ($ref_input) = @_;
${$ref_input} =~ s/^\s+//;
${$ref_input} =~ s/\s+\z//;
};
for my $input (@args) {
if (ref($input) eq "SCALAR") {
$fref_trim->($input);
}
else {
$fref_trim->(\$input);
}
}
}
Output:
$ ./zz.pl
name: [ foo ]
sn: [foosu]
Note also that the loop variable in for my $input ( @array ) does not create a new copy for each element of the array. See perldoc perlsyn:
The foreach loop iterates over a normal list value and sets the scalar variable VAR to be each element of the list in turn. ...
...
the foreach loop index variable is an implicit alias for each item in the list that you're looping over.
In your case, this means that at each iteration $input is an alias to the corresponding element of @_, which is itself an alias to the variable that was passed in as an argument to the subroutine.
Making a copy of @_ thus prevents the variables in the calling context from being modified. Of course, you could do something like:
sub trim {
my $fref_trim = sub{
my ($ref_input) = @_;
${$ref_input} =~ s/^\s+//;
${$ref_input} =~ s/\s+\z//;
};
for my $input (@_) {
my $input_copy = $input;
if (ref($input_copy) eq "SCALAR") {
$fref_trim->($input_copy);
}
else {
$fref_trim->(\$input_copy);
}
}
}
but I find making a wholesale copy of @_ once to be clearer and more efficient, assuming you do not want to be selective.
I assume it is because of the alias with @_, but shouldn't the foreach my $input (@_) force the sub to copy the value and only treat the value not the alias?
You're right that @_ contains aliases. The part that's missing is that foreach also aliases the loop variable to the current list element. Quoting perldoc perlsyn:
If any element of LIST is an lvalue, you can modify it by modifying VAR inside the loop. Conversely, if any element of LIST is NOT an lvalue, any attempt to modify that element will fail. In other words, the foreach loop index variable is an implicit alias for each item in the list that you're looping over.
So ultimately $input is an alias for $_[0], which is an alias for $name, which is why you see the changes appearing in $name.
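The two levels of aliasing can be reproduced in a few lines. This is a minimal sketch with made-up sub names (bump_in_place and bump_copies are not from the question):

```perl
use strict;
use warnings;

# Loops directly over @_: $_ aliases each element of @_, which in turn
# aliases the caller's variable, so the caller's values change.
sub bump_in_place {
    $_ += 1 for @_;
    return;
}

# Copies @_ first: the loop aliases elements of the copy only.
sub bump_copies {
    my @copy = @_;
    $_ += 1 for @copy;
    return @copy;
}

my ($x, $y) = (1, 2);

bump_in_place($x, $y);
print "x=$x y=$y\n";              # x=2 y=3

my @new = bump_copies($x, $y);
print "x=$x y=$y new=@new\n";     # x=2 y=3 new=3 4
```

The only difference between the two subs is the copy of @_, which is exactly what separates modifying the caller's variables from leaving them alone.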

what is the usage of \& and $expr->()

sub reduce(&@) {
my $expr = \&{shift @ARG};
my $result = shift @ARG;
while (scalar @ARG > 0) {
our $a = $result;
our $b = shift @ARG;
$result = $expr->();
}
return $result;
}
I can't really understand some of the syntax in this code. Can anyone explain it to me, for example \& and $result = $expr->()?
\&name returns a reference to the subroutine named name.
$code_ref->() calls the subroutine referenced by $code_ref.
$ perl -e'
sub f { CORE::say "Hi" }
my $code_ref = \&f;
$code_ref->();
'
Hi
In your case, shift @ARG returns a subroutine reference (@ARG is the English module's name for @_). \&{ $code_ref } simply returns the code ref. As such,
my $expr = \&{shift @ARG};
could have been written as
my $expr = shift @ARG;
Note that reduce's prototype allows it to be called as
reduce { ... } ...
but what is actually executed is
reduce( sub { ... }, ... )
Note that this version of reduce is buggy. You should use the one provided by List::Util.
local $a and local $b should have been used to avoid clobbering the values its caller might have in $a and $b.
This version of reduce expects its callback to have been compiled in the same package as reduce itself. Otherwise, the callback sub won't be able to simply use $a and $b.
Declaring the variables using our is actually completely useless in this case, since $a and $b are exempt from use strict; checks, and the undeclared use of $a and $b would access the very same package variables.
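As a rough sketch of how the first point could be addressed, here is a variant that uses local. It still assumes the callback is compiled in the same package as the sub, so the second problem remains; the name my_reduce is made up, and List::Util's reduce is what you should use in real code.

```perl
use strict;
use warnings;

# Hypothetical name; defined before use so the (&@) prototype is in effect.
sub my_reduce(&@) {
    my ($code, @list) = @_;
    return unless @list;              # empty list: undef in scalar context
    my $result = shift @list;
    for my $next (@list) {
        # local restores the caller's $a and $b when each iteration ends.
        local ($a, $b) = ($result, $next);
        $result = $code->();
    }
    return $result;
}

# Values the caller already has in $a and $b.
($a, $b) = ('keep', 'me');

my $max = my_reduce { $a > $b ? $a : $b } 3, 9, 4;
my $sum = my_reduce { $a + $b } 1 .. 10;

print "max=$max sum=$sum a=$a b=$b\n";   # max=9 sum=55 a=keep b=me
```

The caller's $a and $b survive the calls, which is the behaviour the version in the question lacks.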
Having a look at some List::Util::reduce() examples will probably help.
Let's take the first one:
$foo = reduce { $a > $b ? $a : $b } 1..10;
So reduce takes a BLOCK followed by a LIST, which the function signature declares: sub reduce(&@) {. The block in our case is the statement $a > $b ? $a : $b, while the list is 1..10. From the docs:
Reduces @list by calling "BLOCK" in a scalar context multiple times,
setting $a and $b each time. The first call will be with $a and $b set to
the first two elements of the list, subsequent calls will be done by
setting $a to the result of the previous call and $b to the next element
in the list.
Returns the result of the last call to the "BLOCK". If @list is empty then
"undef" is returned. If @list only contains one element then that element
is returned and "BLOCK" is not executed.
And now to an annotated version of the code:
$foo = reduce { $a > $b ? $a : $b } 1..10; # $foo will be set to 10
sub reduce(&@) {
# reduce() takes a BLOCK followed by a LIST
my $expr = \&{shift @ARG};
# $expr is now a subroutine reference, i.e.
# $expr = sub { $a > $b ? $a : $b };
# Start by setting $result to the first item in the list, 1
my $result = shift @ARG;
# While there are more items in the list...
while (scalar @ARG > 0) {
# Set $a to the current result
our $a = $result;
# Set $b to the next item in the list
our $b = shift @ARG;
# Set $result to the result of $a > $b ? $a : $b
$result = $expr->();
}
# List has now been reduced by the operation $a > $b ? $a : $b
return $result;
}

Is there a better way to count occurrence of char in a string?

I feel there must be a better way to count occurrences than writing a sub in Perl or a shell script on Linux.
#!/usr/bin/perl -w
use strict;
return 1 unless $0 eq __FILE__;
main() if $0 eq __FILE__;
sub main{
my $str = "ru8xysyyyyyyysss6s5s";
my $char = "y";
my $count = count_occurrence($str, $char);
print "count<$count> of <$char> in <$str>\n";
}
sub count_occurrence{
my ($str, $char) = @_;
my $len = length($str);
$str =~ s/$char//g;
my $len_new = length($str);
my $count = $len - $len_new;
return $count;
}
If the character is constant, the following is best:
my $count = $str =~ tr/y//;
If the character is variable, I'd use the following:
my $count = length( $str =~ s/[^\Q$char\E]//rg );
I'd only use the following if I wanted compatibility with versions of Perl older than 5.14 (as it is slower and uses more memory):
my $count = () = $str =~ /\Q$char/g;
The following uses no memory, but might be a bit slow:
my $count = 0;
++$count while $str =~ /\Q$char/g;
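To check that the four approaches agree, here is a small sketch that runs them on the string from the question (the variable names are mine; the s///r form needs Perl 5.14 or later):

```perl
use strict;
use warnings;

my $str  = "ru8xysyyyyyyysss6s5s";
my $char = "y";

# tr/// with an empty replacement counts without modifying $str;
# the character is fixed at compile time, hence the literal y.
my $by_tr = $str =~ tr/y//;

# Delete everything that is not $char from a copy, then measure the rest.
my $by_subst = length( $str =~ s/[^\Q$char\E]//gr );

# List-context match assigned to an empty list, counted in scalar context.
my $by_list = () = $str =~ /\Q$char/g;

# Scalar-context //g iterates over the matches one at a time.
my $by_loop = 0;
++$by_loop while $str =~ /\Q$char/g;

print "tr=$by_tr subst=$by_subst list=$by_list loop=$by_loop\n";
```

All four numbers should be identical, and $str is left unchanged by every one of them.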
Counting the occurrences of a character in a string can be performed with one line in Perl (as compared to your 4 lines). There is no need for a sub (although there is nothing wrong with encapsulating functionality in a sub). From perlfaq4 "How can I count the number of occurrences of a substring within a string?"
use warnings;
use strict;
my $str = "ru8xysyyyyyyysss6s5s";
my $char = "y";
my $count = () = $str =~ /\Q$char/g;
print "count<$count> of <$char> in <$str>\n";
In a beautiful* Bash/Coreutils/Grep one-liner:
$ str=ru8xysyyyyyyysss6s5s
$ char=y
$ fold -w 1 <<< "$str" | grep -c "$char"
8
Or maybe
$ grep -o "$char" <<< "$str" | wc -l
8
The first one works only if the substring is just one character long; the second one works only if the substrings are non-overlapping.
* Not really.
toolic has given a correct answer, but you might consider not hardcoding your values to make the program reusable.
use strict;
use warnings;
die "Usage: $0 <text> <characters>" if @ARGV < 1;
my $search = shift; # the string you are looking for
my $str; # the input string
if (@ARGV && -e $ARGV[0] || !@ARGV) { # if str is file, or there is no str
local $/; # slurp input
$str = <>; # use diamond operator
} else { # else just use the string
$str = shift;
}
my $count = () = $str =~ /\Q$search\E/gms;
print "Found $count of '$search' in '$str'\n";
This will allow you to use the program to count for the occurrence of a character, or a string, inside a string, a file, or standard input. For example:
count.pl needles haystack.txt
some_process | count.pl foo
count.pl x xyzzy

What does dot-equals mean in Perl?

What does ".=" mean in Perl (dot-equals)? Example code below (in the while clause):
if( my $file = shift @ARGV ) {
$parser->parse( Source => {SystemId => $file} );
} else {
my $input = "";
while( <STDIN> ) { $input .= $_; }
$parser->parse( Source => {String => $input} );
}
exit;
Thanks for any insight.
The period . is the concatenation operator. The equal sign to the right means that this is an assignment operator, like in C.
For example:
$input .= $_;
Does the same as
$input = $input . $_;
However, there's also some perl magic in this, for example this removes the need to initialize a variable to avoid "uninitialized" warnings. Try the difference:
perl -we 'my $x; $x = $x + 1' # Use of uninitialized value in addition ...
perl -we 'my $x; $x += 1' # no warning
This means that the line in your code:
my $input = "";
is largely redundant, although some people might find it comforting.
For pretty much any binary operator X, $a X= $b is equivalent to $a = $a X $b. The dot . is a string concatenation operator; thus, $a .= $b means "stick $b at the end of $a".
In your code, you start with an empty $input, then repeatedly read a line and append it to $input until there are no lines left. You should end up with the entire file as the contents of $input, built up one line at a time.
It should be equivalent to the loopless
local $/;
$input = <STDIN>;
(define line separator as a non-defined character, then read until the "end of line" that never comes).
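A quick way to convince yourself that the two forms give the same string is to read the same data both ways. This sketch uses an in-memory filehandle in place of STDIN so that it is self-contained; the sample data and variable names are made up.

```perl
use strict;
use warnings;

my $data = "first\nsecond\nthird\n";

# Line-by-line: append each line to the accumulator with .=
open my $fh, '<', \$data or die "open: $!";
my $looped = "";
while ( my $line = <$fh> ) {
    $looped .= $line;
}
close $fh;

# Slurp: with $/ undefined, a single read returns the whole input.
open $fh, '<', \$data or die "open: $!";
my $slurped = do { local $/; <$fh> };
close $fh;

print $looped eq $slurped ? "same\n" : "different\n";   # same
```

Wrapping local $/ in a do block keeps the change to the record separator confined to that one read.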
You have found the string concatenation operator.
Let's try it :
my $string = "foo";
$string .= "bar";
print $string;
foobar
This appends to the $input variable: each line that comes in via STDIN is added to the end of $input.