Perl && do { last; }; - perl

In my book it uses something like this:
for($ARGV[0])
{
Expression && do { print "..."; last; };
...
}
Isn't the for-loop incomplete? Also, what's the point of the do, couldn't it just be { ... }, or does the do have some importance here?

There are two forms of for statement in Perl. The one you're seeing here is often written as foreach, but for and foreach are synonyms. It normally iterates over a list, setting $_ to each element. In this case, the "list" is a single value, so it has the effect of setting $_ to $ARGV[0] for the body of the loop.
The do is needed to make the block { ... } into an expression, so it can be an operand of the && operator. (See what happens if you omit the word do.)
(And you were missing a semicolon; I've edited the question to fix that.)

Related

Perl sub returns a subroutine

I haven't used Perl for around 20 years, and this is confusing me. I've g******d for it, but I obviously haven't used a suitable search string because I haven't found anything relating to this...
Why would I want to do the following? I understand what it's doing, but the "why" escapes me. Why not just return 0 or 1 to begin with?
I'm working on some code where a sub uses "return sub"; here's a very truncated example e.g.
sub test1 {
$a = shift #_;
if ($a eq "cat") {
return sub {
print("cat test OK\n");
return 0;
}
}
# default if "cat" wasn't the argument
return sub {
print("test for cat did not work\n");
return 1;
}
}
$c = test1("cat");
print ("received: $c\n");
print ("value is: ",&$c,"\n");
$c = test1("bat");
print ("received: $c\n");
print ("value is: ",&$c,"\n");
In your code there is no reason to return a sub. However, with a little tweak
sub test1 {
my $animal = shift #_;
if ($animal eq "cat" || $animal eq "dog") {
return sub {
print("$animal test OK\n");
return 0;
};
}
# default if "cat" or "dog" wasn't the argument
return sub {
print("test for cat or dog did not work\n");
return 1;
};
}
We now have a closure around $animal this saves memory as the test for cat and dog share the same code. Note that this only works with my variables. Also note that $a and $b are slightly special to Perl, they are used in the block of code that you can pass to the sort function and bypass some of the checks on visibility so it's best to avoid them for anything except sort.
You probably want to search "perl closures".
There are many reasons that you'd want to return a code reference, but it's not something I can shortly answer in a StackOverflow question. Mark Jason Dominus's Higher Order Perl is a good way to expand your mind, and we cover a little of that in Intermediate Perl.
I wrote File::Find::Closures as a way to demonstrate this is class. Each subroutine in that module returns two code references—one for the callback to File::Find and the other as a way to access the results. The two share a common variable which nothing else can access.
Notice in your case, you aren't merely calling a subroutine to "get a zero". It's doing other things. Even in your simple example there's some output. Some behavior is then deferred until you actually use the result for something.
Having said that, we have no chance of understanding why the programmer who wrote your particular code did it. One plausible guess was that the system was set up for more complex situations and you're looking at a trivial example that fits into that. Another plausible guess was that the programmer just learned that feature and fell in love with it then used it everywhere for a year. There's always a reason, but that doesn't mean there's always a good reason.

how to find multiple regex patterns in a single way using Perl

Question Updated
I have list of (few more) regex patterns like: (Note: Sequence is very Important)
([a-z]+)(\d+)
\}([a-z]+)
([a-z]+)(\+|\-)
([0-9])\](\+|\-)
...
...
my input file like :
\ce{CO2}
\ce{2CO}
\ce{H2O}
\ce{Sb2O3}
...
...
In my code I am finding the each and every regex patterns like
if($string=~m/([a-z]+)(\d+)/g) { my statements ... }
if($string=~m/\}([a-z]+)/g) { my statements ... }
if($string=~m/([a-z]+)(\+|\-)/g) { my statements ... }
if($string=~m/([0-9])\](\+|\-)/g) { my statements ... }
Instead of doing the above code Is there any other way to simplify the code?
Could you someone please share your thoughts for my improvement for better coding.
Disclaimer: Your question is very hard to read, so this is pretty much guesswork. I am not sure I understand what you want to do.
When you are processing data in a dynamic way, a typical approach is to use a dispatch table. We can do something similar here. Often a hash or hash reference is used for that, but since we want a specific order, I will be using an array instead.
my #dispatch = (
{
pattern => qr/f(o)(o)/,
callback => sub {
my ($one, $two) = #_;
print "Found $one and $two\n";
},
},
{
pattern => qr/(bar)/,
callback => sub {
my $capture = shift;
print "Saw $capture";
},
},
);
This basically is a list of search patterns and associated instructions. Each pattern has a callback, which is a code reference. I decided it would make sense to pass in the capture variables, because your patterns have capture groups.
Now in order to call them, we iterate over the dispatch array, match the pattern and then call the associated callback, passing in all the captures.
my $text = "Foo bar foo bar baz.";
foreach my $search (#dispatch) {
if ($text =~ $search->{pattern}) {
$search->{callback}->(#{^CAPTURE}); # this requires Perl 5.26
}
}
Please note that I am using #{^CAPTURE}, which was added to Perl in version 5.25.7, so you would require at least the stable Perl 5.26 release to use it. (On an older Perl, my #capture = $t =~ $search->{pattern} and $search->{callback}->(#capture) will behave similarly).
This is way more elegant than having a list of if () {} statement because it's very easy to extend. The dispatch table could be created on the fly, based on some input, or entirely read from disk.
When we run this code, it creates the following output
Found o and o
Saw bar
This is not very spectacular, but you should be able to adapt it to your patterns. On the other hand I don't know what you are actually trying to do. If you wanted to modify the string instead of only matching, you might need additional arguments for your callbacks.
If you want to learn more about dispatch tables, I suggest you read the second chapter of Mark Jason Dominus' excellent book Higher Order Perl, which is available for free as a PDF on his website.
Your question is hard to read, mainly because you have the /g at the end of your regex searches (which returns a list), however, you only check if it matches once.
I'm making the following assumptions
All matches are required
the code can be a single or double match
both groups captured in one line
i think you want
while ( $string =~ /(([a-z]+)(\d+)|\}([a-z]+)|([a-z]+)(\+|\-)|([0-9])\](\+|\-))/g )
{
#$1 has the whole match
#$2 has the first group if defined
#$3 has the second group if defined
}
However, I prefer the method below. this will capture in one line
while ($string =~ /([a-z]+\d+|\}[a-z]+|[a-z]+\+|\-|[0-9]\]\+|\-)/g )
{
# in here split the match if required
}
I recommend you use regex comments to make this clearer.
if you just want a single match, use
if(
$string=~m/([a-z]+)(\d+)/ ||
$string=~m/\}([a-z]+)/ ||
$string=~m/([a-z]+)(\+|\-)/ ||
$string=~m/([0-9])\](\+|\-)/
)
{
#some code
}

Is there something like `last` for `map`?

In Perl, is it possible to arbitrarily end a map execution, e.g. something equivalent to last in a loop?
It would be a bit like this:
map {
if (test == true) { last; } dosomething
} #myarray
Nope. You can't last, next, etc. out of a map, grep or sort blocks because they are not loops.
Having said that, the code snippet in the question is poor practice because map should never be called in void context. They are supposed to be used to transform one list into another.
Here's a better way to write it if you really must inline it (modified after ysth's comment):
$_ == 10 && last, print for 1..15; # prints '123456789'
No. Use an ordinal foreach loop and the last statement.
Since 5.8.1, map is context aware - in void context, no lists are constructed.
Anyway, map is generally used to get a list from another list, evaluating expr for each element of the original list.
You could use a do-block with a for statement modifier:
do {
last if test;
dosomething;
} for (#myarray);
Though using a foreach block would probably be clearer, and future maintainers of your code will thank you.
foreach (#myarray) {
last if test;
dosomething;
}
You can use a long jump (eval/die pair) to bail out of any nested construct that doesn't directly support it:
eval { map{ die if test; dosomething } #myarray };
But as Zaid said, using a for/foreach loop in this case is better because you are not using the return value of map.
You want a for loop:
foreach ( #myarray ) {
last if test;
...
}
It does the same thing. map is for transforming one list into other lists.
There are map-like constructs that do exactly what you want to do. Take a look at List::Util and List::MoreUtils (conveniently also packaged together as List::AllUtils):
use List::MoreUtils 'first';
# get first element with a {foo} key
my $match = map { $_->{foo} eq 'some string' } #elements;
If you don't want to extract an element(s) from the list, then use foreach, as per the previous answers.
Try goto LABEL. However I do not know how safe is that.

What does "Useless use of a variable in void context" mean in this Perl script?

The following script gives me what I want but Perl also throws me a warning saying "Useless use of a variable in void context". What does it mean?
use strict;
use warnings;
my $example = 'http\u003a//main\u002egslb\u002eku6\u002ecom/c0/q7LmJPfV4DfXeTYf/1260269522170/93456c39545857a15244971e35fba83a/1279582254980/v632/6/28/a14UAJ0CeSyi3UTEvBUyMuBxg\u002ef4v\u002chttp\u003a//main\u002egslb\u002eku6\u002ecom/c1/q7LmJPfV4DfXeTYf/1260269522170/3cb143612a0050335c0d44077a869fc0/1279582254980/v642/10/20/7xo2MJ4tTtiiTOUjEpCJaByg\u002ef4v\u002chttp\u003a//main\u002egslb\u002eku6\u002ecom/c2/q7LmJPfV4DfXeTYf/1260269522170/799955b45c8c32c955564ff9bc3259ea/1279582254980/v652/32/4/6pzkCf4iqTSUVElUA5A3PpMAoA\u002ef4v\u002chttp\u003a//main\u002egslb\u002eku6\u002ecom/c3/q7LmJPfV4DfXeTYf/1260269522170/cebbb619dc61b3eabcdb839d4c2a4402/1279582254980/v567/36/19/MBcbnWwkSJu46UoYCabpvArA\u002ef4v\u002chttp\u003a//main\u002egslb\u002eku6\u002ecom/c4/q7LmJPfV4DfXeTYf/1260269522170/1365c39355424974dbbe4ae8950f0e73/1279582254980/v575/17/15/EDczAa0GTjuhppapCLFjtaQ\u002ef4v';
my #raw_url = $example =~ m{(http\\u003a.+?f4v)}g;
my #processed_url = map {
s{\\u003a}{:}g,$_;
s{\\u002e}{.}g,$_;
s{\\u002d}{#}g,$_;
} #raw_url;
print join("\n",#processed_url);
And why this map thing doesn't work if I omit those dollar underscores like so?
my #processed_url = map {
s{\\u003a}{:}g;
s{\\u002e}{.}g;
s{\\u002d}{#}g;
} #raw_url;
When I omit those dollar underscores, I get nothing except for a possibly success flag "1". What am I missing? Any ideas? Thanks like always :)
What you want is...
my #processed_url = map {
s{\\u003a}{:}g;
s{\\u002e}{.}g;
s{\\u002d}{#}g;
$_;
} #raw_url;
A map block returns the value composed of the last statement evaluated as its result. Thats why we pass the $_ as the last statement. The substitution operator s{}{} returns the number of substitutions made.
In your prior setup, you had by itself the following statement. Which is pretty much meaningless and that is what Perl is warning about.
s{\\u003a}{:}g, $_;
You already have the answer you were looking for, but I wanted to point out a subtlety about using the substitution operator inside a map block: your original array is also being modified. If you want to preserve the original array, one way to do it is to make a copy of the array, then modify only the copy:
my #processed_url = #raw_url;
for (#processed_url) {
s{\\u003a}{:}g;
s{\\u002e}{.}g;
s{\\u002d}{#}g;
}
Or, if you only need one array, and you want the original to be modified:
for (#raw_url) {
s{\\u003a}{:}g;
s{\\u002e}{.}g;
s{\\u002d}{#}g;
}

Is there a simple way to validate a hash of hash element exists and is defined?

I need to validate a Perl hash of hash element such as $Table{$key1}{$key2} to exist and be defined. Here is what I do. (I have no idea $key1 even exists)
if
((defined $Table{$key1}) &&
(exists $Table{$key1}) &&
(defined $Table{$key1}{$key2}) &&
(exists $Table{$key1}{$key2}))
{
#do whatever
}
Is there an easier and cleaner way to do it?
You don't need to check each level of the heirarchy: you can just go for the value you care about. exists doesn't check for definedness, only if the slot in the hash exists (it could exist with an undefined value), so if you care that the value is defined, you would need to call defined rather than exists. If a value is not defined, it evaluates in boolean context to false, so we can type a little less and reduce your example to:
if ($Table{$key1}{$key2})
{
# do whatever
}
However, if the value in that key is defined but is "false" (numerically evaluates to zero, or is the empty string), this can cause a false negative, so we should explicitly check for definedness if this is a possibility:
if (defined $Table{$key1}{$key2})
{
# do whatever
}
If you don't want to autovivify $Table{$key1}, you can check for its existence first, which brings us to the "best" way for the general case:
if (exists $Table{$key1} and defined $Table{$key1}{$key2})
{
# do whatever
}
If you're going to do this a lot for various fields in a hash, you may want to add some OO-style accessor methods which would do this work for you:
sub has_field
{
my ($this, $fieldName) = #_;
return exists $this->{data} && defined $this->{data}{$fieldName});
}
I'm sure you've read it already, but it can't hurt to read the relevant documentation again:
perldoc -f exists
perldoc perldata
perldoc perldsc
Given an expression that specifies a hash element or array element, exists returns true if the specified element in the hash or array has ever been initialized, even if the corresponding value is undefined. The element is not autovivified if it doesn't exist.
...
A hash or array element can be true only if it's defined, and defined if it exists, but the reverse doesn't necessarily hold true.
The following is shorter and will protect from autovivifcation:
if (exists $table{$key1} and defined $table{$key1}{$key2}) {...}
The other checks in your code are not needed.
Check existence first, then defined-ness. (A value can exist without being defined but not be defined without existing.) You should test the intermediate levels with exists to prevent unintended autovivification. For the last level you only need to call defined. When there aren't too many layers it's easy to code directly:
if (exists $hash{a} && defined $hash{a}{b}) {...}
This gets awkward if there are many layers:
if (exists $hash{a} && exists $hash{a}{b} && exists $hash{a}{b}{c} ...) {...}
In that case, you can write a version of defined that doesn't autovivify intermediate values:
sub safe_defined {
my $h = shift;
foreach my $k (#_) {
if (ref $h eq ref {}) {
return unless exists $h->{$k};
$h = $h->{$k};
}
else {
return;
}
}
return defined $h;
}
You use it this way:
if (safe_defined(\%hash, qw(a b c))) {
say $hash{a}{b}{c};
}
Note: This version of the function is limited.
It only handles nested hashes. Perl lets you construct arbitrary data
structures, like a hash of arrays of scalar references...
It doesn't support blessed references (i.e. objects).
A truly generic version is left as an exercise for the reader. ;)
You could check out Data::Diver. It dives into data structures without autovivifying. The syntax would be:
if ( defined Dive(\%Table, $key1, $key2) ) { ... }
or even:
if ( defined(my $value = Dive(\%Table, $key1, $key2) ) ) {
...do something with $value...
}
Great! Thanks you all for the reply.
Since the autovivifying is an issue for me, currently i am using the "awkward" approach, i.e.
if (exists $Table{$key1} && defined $Table{$key1}{$key2}) {
Do whatever
}
It works for me, however as you guys said, i have 3-4 level deep of nested hash, the code is bit of messy.
I will check out Data:Diver. That one looks nicer.
Thanks, again,