Perl syntactical details - list context with or without parens? - perl

This works:
my $r = someSubroutine( map { ( 0 => $_ ) } @hosts)
This does not work, giving a syntax error:
my $r = someSubroutine( map { 0 => $_ } @hosts)
What I think I understand is that the { } after the map amounts to a closure or anonymous subroutine.
But if I put a "value, value" at the end of a normal subroutine, it will return a list of those values. If I use this brevity with the map, it is a syntax error.

First of all, this is a very strange statement. The list that map produces will look like
0, $hosts[0], 0, $hosts[1], 0, $hosts[2], ...
so it's useless for assignment to a hash as it would be the same as
my %hash = (0 => $hosts[-1])
map will accept either a BLOCK (which is what you're using) or a simple EXPRESSION for its first parameter. The problem here is that { 0 => $_ } looks very much like an anonymous hash with a single element, which is an EXPRESSION, and that is what the parser guesses it is. An EXPRESSION requires a comma after it, before the second parameter, but when perl gets to the closing brace in map { 0 => $_ } @hosts it doesn't find one. It has to throw a syntax error, because by then it is too late to backtrack to the opening brace and assume a block instead.
The documentation puts it like this
{ starts both hash references and blocks, so map { ... could be either the start of map BLOCK LIST or map EXPR, LIST. Because Perl doesn't look ahead for the closing } it has to take a guess at which it's dealing with based on what it finds just after the {. Usually it gets it right, but if it doesn't it won't realize something is wrong until it gets to the } and encounters the missing (or unexpected) comma. The syntax error will be reported close to the }, but you'll need to change something near the { such as using a unary + or semicolon to give Perl some help
The solution is to disambiguate it as you discovered. Any of these will work
map +( 0 => $_ ), @hosts
map(( 0 => $_ ), @hosts)
map { +0 => $_ } @hosts
map { ( 0 => $_ ) } @hosts
map { ; 0 => $_ } @hosts
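As a quick check, the five spellings above can be run side by side. This is only a sketch: the contents of @hosts are made-up sample data, and @results is a name introduced here for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the question's @hosts.
my @hosts = ('alpha', 'beta');

# Each entry uses one of the disambiguated spellings listed above.
my @results = (
    [ map +( 0 => $_ ), @hosts ],    # EXPR form, unary plus
    [ map(( 0 => $_ ), @hosts) ],    # EXPR form, explicit parens
    [ map { +0 => $_ } @hosts ],     # BLOCK form, unary plus
    [ map { ( 0 => $_ ) } @hosts ],  # BLOCK form, inner parens
    [ map { ; 0 => $_ } @hosts ],    # BLOCK form, leading semicolon
);

# Every line printed is the same flat list: 0,alpha,0,beta
print join(',', @$_), "\n" for @results;
```

All five produce the same list, so the choice between them is a matter of taste.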

map has two syntaxes:
map BLOCK LIST e.g. map { f() } g()
map EXPR, LIST e.g. map f(), g()
When Perl encounters map, it needs to determine which syntax was used. Let's say the first token after map is {. That's the start of a BLOCK, right? Hold on! Expressions can start with { too!
my $hash_ref = { key => 'val' };
The grammar is ambiguous. Perl has to "guess" which syntax you are using. Perl looks ahead at the next token to help guess, but sometimes it guesses incorrectly nonetheless. This is one of those cases.
The following are the standard workarounds for this:
map {; ... } LIST # Force Perl to recognize the curly as the start of a BLOCK
map +{ ... }, LIST # Force Perl to recognize the curly as the start of a hash constructor
; can't be part of a hash constructor, so the { can only start a BLOCK.
+ necessarily starts an EXPR (and not a BLOCK). It's an operator that does nothing but help in situations like this.
For example,
map {; +{ $row->{id} => $row->{val} } } @rows
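A small runnable sketch of both workarounds follows. The rows are hypothetical sample data, and $_ is used inside map in place of $row so that the snippet stands alone.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for @rows.
my @rows = ( { id => 'a', val => 1 }, { id => 'b', val => 2 } );

# BLOCK form: the leading ; marks the outer curly as a block,
# and the + marks the inner curly as a hash constructor.
my @from_block = map {; +{ $_->{id} => $_->{val} } } @rows;

# EXPR form: the + marks a hash constructor, so a comma must follow it.
my @from_expr = map +{ $_->{id} => $_->{val} }, @rows;

# Prints: 2 hash refs, first maps a to 1
print scalar(@from_block), " hash refs, first maps a to $from_block[0]{a}\n";
```

Both forms return one hash reference per row; only the way the parser is steered differs.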

This is described in perldoc on map: http://perldoc.perl.org/functions/map.html
In short, you should add a small hint, such as parentheses or a unary +, so that perl can parse the {...} construct correctly:
my $r = someSubroutine( map { + 0 => $_ } @hosts)

Related

Odd use of False constant in if-then statement

Python is my main language, but have to maintain a rather large legacy Perl codebase.
I have an odd logic statement that I can't make heads or tails over.
At top, a constant is defined as:
use constant FALSE => 0;
sub thisFunc {
FALSE if ($self->{_thisVar} ne "tif");
...
...
return statement,etc..
}
So I'm reading that as a kinda' fancy, non-standard if-then statement,
that if $thisVar string is not equal to "tif", then FALSE. Huh?
Not something like $that = FALSE, just FALSE.
The form of this statement appears in the file several times.
This codebase is in use, and vetted over the years by very good team,
so I think it is valid and has meaning. "use strict;" is set at top.
Could someone be so kind as to explain what is meant by logic.
I've Google'd it but no joy.
Thanks in advance,
"if" logic in Perl can be constructed in a couple of ways:
the obvious one:
if ($flag) { do_something() }
less obvious one:
do_something() if ($flag);
This example shows exactly how that odd "FALSE if" statement behaves. It only has an effect when it is the LAST statement in a subroutine:
use strict;
use constant FALSE => 0;
sub thisFunc {
my $arg = shift;
FALSE if ($arg ne "tif");
}
print "return val: ".thisFunc("ble")."\n";
print "return val: ".thisFunc("tif")."\n";
The output from running the above is:
return val: 0
return val:
It is pointless. I suspect it's supposed to be
return FALSE if $self->{_thisVar} ne "tif";
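To make the difference concrete, here is a minimal sketch comparing the two spellings. The sub names and the "processed" string are invented for illustration; only the FALSE constant comes from the question.

```perl
use strict;
use warnings;
use constant FALSE => 0;

# Hypothetical sub: the bare constant is evaluated and thrown away,
# so execution simply carries on to the next statement.
sub without_return {
    my ($ext) = @_;
    FALSE if $ext ne "tif";
    return "processed $ext";
}

# Hypothetical sub: with an explicit return, the guard leaves the sub.
sub with_return {
    my ($ext) = @_;
    return FALSE if $ext ne "tif";
    return "processed $ext";
}

print without_return("jpg"), "\n";   # prints: processed jpg
print with_return("jpg"), "\n";      # prints: 0
```

If the surrounding code relies on the early exit, the version without return is a latent bug rather than a style choice.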
There is a similar construct that isn't pointless. If the loop condition has side-effects, the following isn't pointless:
1 while some_sub();
aka
while (some_sub()) { }
aka
while (1) {
some_sub()
or last;
}
Practical example:
$ perl -E'$_ = "xaabbx"; 1 while s/ab//; say'
xx

Folding Array of hashes to HoH

I have $maps as an AoH (array of hashes) that I want to turn into $new_map, a HoH (hash of hashes) keyed on a member of each enclosed hash.
I currently have:
map { $new_map->{$_->{TYPE}} = $_; delete $_->{TYPE} } @$maps;
This does the job..
I wonder if there's a better/simpler/cleaner way to get the intent. Perhaps, by getting the return value from map?
$new_map = map { ... } @$maps;
Thanks
Your original solution is a misuse of map, as it doesn't use the list that the operator returns. for is the correct tool here, and I think it reads much better that way too, especially if you use the fact that delete returns the value of the element it has removed:
$new_map->{ delete $_->{TYPE} } = $_ for @$maps;
Or you could translate the array using map properly, as here
my %new_map = map { delete $_->{TYPE} => $_ } @$maps;
The choice is your own
Using map in void context obfuscates the intent, and altering the original @$maps may not be a good idea (map with side effects?), so I would copy each hash first:
my $new_map = {
map { my %h = %$_; delete $h{TYPE} => \%h } @$maps
};
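For illustration, the copying variant can be exercised like this. The contents of $maps, including the host key, are assumed sample data and not from the question.

```perl
use strict;
use warnings;

# Hypothetical array of hashes standing in for $maps.
my $maps = [
    { TYPE => 'dns', host => 'ns1' },
    { TYPE => 'web', host => 'www' },
];

# Copy each hash, remove TYPE from the copy, and key the copy on it.
my $new_map = {
    map { my %h = %$_; delete $h{TYPE} => \%h } @$maps
};

# Prints "dns => ns1" then "web => www".
print "$_ => $new_map->{$_}{host}\n" for sort keys %$new_map;

# The originals are untouched; prints: original still has TYPE=dns
print "original still has TYPE=$maps->[0]{TYPE}\n";
```

The copy is shallow, so nested references inside each hash would still be shared with the original.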

Context and the Comma Operator

One of my colleagues used a comma after a statement instead of a semi-colon, what resulted was similar to the below code:
my $SPECIAL_FIELD = 'd';
my %FIELD_MAP = (
1 => 'a',
2 => 'b',
3 => 'c',
);
sub _translate_to_engine {
my ($href) = @_;
my %mapped_hash
= map { $FIELD_MAP{$_} => $href->{$_} } keys %$href;
$mapped_hash{$SPECIAL_FIELD} = FakeObject->new(
params => $mapped_hash{$SPECIAL_FIELD}
), # << comma here
return \%mapped_hash;
}
At first I was surprised that this passed perl -c, then I remembered the comma operator and thought I understood what was going on, but the results of the two print statements below made me doubt again.
my $scalar_return = _translate_to_engine(
{ 1 => 'uno', 2 => 'dos', 3 => 'tres' }
);
print Dumper $scalar_return;
# {'c' => 'tres','a' => 'uno','b' => 'dos','d' => bless( {}, 'FakeObject' )}
This call was made in scalar context and the result that I get is the expected result. The comma operator evaluated the LHS of the comma, discarded its value, then evaluated the RHS. I don't believe that it can return the value of the RHS here, because evaluating the return statement leaves the subroutine.
my @list_return = _translate_to_engine(
{ 1 => 'uno', 2 => 'dos', 3 => 'tres' }
);
print Dumper \@list_return;
# [{'c' => 'tres','a' => 'uno','b' => 'dos','d' => bless( {}, 'FakeObject' )}]
This call was made in list context but the result I get is effectively the same as the call in scalar context. What I think is happening here: Both arguments are evaluated since the sub was called in list context, when the RHS is evaluated the return statement is executed so the LHS is effectively discarded.
Any clarification on the specific semantics that happen in either case would be appreciated.
Your explanation is accurate.
The context in which _translate_to_engine is called determines the context in which all final expressions of the function are evaluated, including the argument of every return. There are two expressions affected in this case: the comma you mentioned, and \%mapped_hash.
In the first test, the returned value is \%mapped_hash evaluated in scalar context. And in the second, the returned value is \%mapped_hash evaluated in list context. \%mapped_hash evaluates to a reference to the hash, regardless of context. As such, the result of the sub is the same regardless of context.
The LHS of the expression is $mapped_hash{$SPECIAL_FIELD} = FakeObject->new(...) and the RHS is return \%mapped_hash. As you said, the comma operator evaluates the left hand side (which assigns a FakeObject instance to the hash key d), and then evaluates the right hand side, causing the sub to return the hashref. Makes sense to me.
Calling the sub in list or scalar context doesn't matter. It isn't going to change the context of the comma operator, which is the same in both cases.
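A stripped-down sketch of the same situation is shown here. The sub name and the @log array are invented for illustration; the stray comma plays the role of the one in the question.

```perl
use strict;
use warnings;

my @log;

# Hypothetical sub with a comma where a semicolon was intended.
sub comma_then_return {
    push(@log, 'left side ran'),   # << comma here
    return 'right side';
}

my $scalar = comma_then_return();  # called in scalar context
my @list   = comma_then_return();  # called in list context

print "scalar: $scalar\n";                        # prints: scalar: right side
print "list: @list\n";                            # prints: list: right side
print "left side ran ", scalar(@log), " times\n"; # prints: left side ran 2 times
```

In both calls the left-hand side runs for its side effect and the return on the right-hand side supplies the only value the caller sees.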

Perl nesting hash of hashes

I'm having some trouble figuring out how to create nested hashes in perl based on the text input.
I need something like this:
my %hash = (
key1 => \%inner_hash,
key2 => \%inner_hash2,
);
However, my problem is that I don't know a priori how many inner hashes there will be. To that end I wrote the following snippet to test whether a string variable can be created in a loop, its reference stored in an array, and later dereferenced.
{
if($line =~ m/^Limit\s+$mc_lim\s+$date_time_lim\s+$float_val\s+$mc\s+$middle_junk\s+$limit \s+$value/) {
my $str = $1 . ' ' . $2 . ' ' . $7;
push (@test_array_reference, \$str);
}
}
foreach (@test_array_reference) {
say $$_;
}
Perl dies with a "Not a SCALAR reference" run-time error. I'm a bit lost here. Any help will be appreciated.
To answer your first (main?) question, you don't need to know how many hashes to create if you walk through the text and create them as you go. This example uses words of a string, delimited by spaces, as keys but you can use whatever input text for your purposes.
my $text = 'these are just a bunch of words';
my %hash;
my $hashRef = \%hash; # create reference to initial hash
foreach (split('\s', $text)){
$hashRef->{$_} = {}; # create anonymous hash for current word
$hashRef = $hashRef->{$_}; # walk through hash of hashes
}
You can also refer to any arbitrary inner hash and set the value by,
$hash{these}{are}{just}{a}{bunch}{of}{words} = 88;
$hash{these}{are}{just}{a}{bunch}{of}{things} = 42;
$hash{these}{things} = 33;
To visualize this, Data::Dumper may help,
print Dumper %hash;
Which generates,
$VAR1 = 'these';
$VAR2 = {
'things' => 33,
'are' => {
'just' => {
'a' => {
'bunch' => {
'of' => {
'things' => 42,
'words' => 88
}
}
}
}
}
};
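Applied to line-oriented input, the same idea might look like the sketch below. The three-field line format and the field names are assumptions made for illustration, not the asker's real data.

```perl
use strict;
use warnings;

# Hypothetical input lines of the form "outer_key inner_key value".
my @lines = ('mc1 limit 10', 'mc1 value 7', 'mc2 limit 20');

my %hash;
for my $line (@lines) {
    my ($outer, $inner, $value) = split ' ', $line;
    # The inner hash springs into existence on first use (autovivification),
    # so the number of inner hashes never has to be known in advance.
    $hash{$outer}{$inner} = $value;
}

# Prints: mc1.limit = 10, mc1.value = 7, mc2.limit = 20 (one per line)
for my $outer (sort keys %hash) {
    for my $inner (sort keys %{ $hash{$outer} }) {
        print "$outer.$inner = $hash{$outer}{$inner}\n";
    }
}
```

No explicit references or named inner hashes are needed; each distinct outer key gets its own anonymous hash.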
my $hashref = { hash1 => { key => val,... },
hash2 => { key => val,..} };
Also, you may want to use the m//x modifier with your regex; it's barely readable as it is.
Creating a hash of hashes is pretty simple:
my %outer_hash = ();
Not entirely necessary, but this basically means that each element of your hash is a reference to another hash.
Imagine an employee hash keyed by employee number:
$employee{$emp_num}{first} = "Bob";
$employee{$emp_num}{last} = "Smith";
$employee{$emp_num}{phones}{cell} = "212-555-1234";
$employee{$emp_num}{phones}{desk} = "3433";
The problem with this notation is that it gets rather hard to read after a while. Enter the arrow notation:
$employee{$emp_num}->{first} = "Bob";
$employee{$emp_num}->{last} = "Smith";
$employee{$emp_num}->{phones}->{cell} = "212-555-1234";
$employee{$emp_num}->{phones}->{desk} = "3433";
The big problem with complex structures like this is that you lose the use strict ability to find errors:
$employee{$emp_num}->{Phones}->{cell} = "212-555-1234";
Whoops! I used Phones instead of phones. When you start using this type of complex structure, you should use object oriented syntax. Fortunately, the perlobj tutorial is pretty easy to understand.
By the way, complex data structure handling and the ability to use object oriented Perl puts you into the big leagues. It's the first step into writing more powerful and complex Perl.

How can I cleanly handle error checking in Perl?

I have a Perl routine that manages error checking. There are about 10 different checks and some are nested, based on prior success. These are typically not exceptional cases where I would need to croak/die. Also, once an error occurs, there's no point in running through the rest of the checks.
However, I can't seem to think of a neat way to solve this issue except by using something analogous to the following horrid hack:
sub lots_of_checks
{
if(failcond)
{
goto failstate;
}
elsif(failcond2)
{
goto failstate;
}
#This continues on and on until...
return 1; #O happy day!
failstate:
return 0; #Dead...
}
What I would prefer to be able to do would be something like so:
do
{
if(failcond)
{
last;
}
#...
};
An empty return statement is a better way of returning false from a Perl sub than returning 0, because in list context return 0 yields the one-element list (0), which is true when assigned to an array and tested. With bare returns the sub becomes:
sub lots_of_checks {
return if fail_condition_1;
return if fail_condition_2;
# ...
return 1;
}
Perhaps you want to have a look at the following articles about exception handling in perl5:
perl.com: Object Oriented Exception Handling in Perl
perlfoundation.com: Exception Handling in Perl
You absolutely can do what you prefer.
Check: {
last Check
if failcond1;
last Check
if failcond2;
success();
}
Why would you not use exceptions? Any case where the normal flow of the code should not be followed is an exception. Using "return" or "goto" is really the same thing, just more "not what you want".
(What you really want are continuations, which "return", "goto", "last", and "throw" are all special cases of. While Perl does not have full continuations, we do have escape continuations; see http://metacpan.org/pod/Continuation::Escape)
In your code example, you write:
do
{
if(failcond)
{
last;
}
#...
};
This is probably the same as:
eval {
if(failcond){
die 'failcond';
}
};
If you want to be tricky and ignore other exceptions:
my $magic = [];
eval {
if(failcond){
die $magic;
}
};
if ($@ && !(ref($@) && $@ == $magic)) {
die; # rethrow
}
Or, you can use the Continuation::Escape module mentioned above. But
there is no reason to ignore exceptions; it is perfectly acceptable
to use them this way.
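Put together as a complete sub, the sentinel approach could be sketched like this. The name run_checks and its $fail flag are placeholders standing in for the real checks.

```perl
use strict;
use warnings;

# Unique reference used as a private "check failed" signal.
my $magic = [];

# Hypothetical sub: $fail stands in for the real failure conditions.
sub run_checks {
    my ($fail) = @_;
    my $ok = eval {
        die $magic if $fail;   # bail out of the block early
        # ... further checks would go here ...
        1;                     # reached only when nothing died
    };
    if (!$ok) {
        # Anything other than our own sentinel is a real error: rethrow it.
        die $@ unless ref($@) && $@ == $magic;
        return;
    }
    return 1;
}

print run_checks(0) ? "passed\n" : "failed\n";   # prints: passed
print run_checks(1) ? "passed\n" : "failed\n";   # prints: failed
```

Testing the return value of eval, with a trailing 1 inside the block, avoids relying on $@ alone to decide whether the block completed.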
Given your example, I'd write it this way:
sub lots_of_checks {
local $_ = shift; # 'my $_' worked from 5.10 but was removed in 5.24
return if /condition1/;
return if /condition2/;
# etc.
return 1;
}
Note the bare return instead of return 0. This is usually better because it respects context; the value will be undef in scalar context and () (the empty list) in list context.
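The context behaviour is easy to see in a short sketch. The two sub names below are invented for the comparison.

```perl
use strict;
use warnings;

# Hypothetical subs used only to compare the two styles.
sub returns_zero { return 0 }
sub returns_bare { return }

my @zero = returns_zero();   # one-element list: (0)
my @bare = returns_bare();   # empty list: ()

print scalar(@zero), "\n";   # prints: 1
print scalar(@bare), "\n";   # prints: 0

# An array holding (0) is true, so the failure goes unnoticed here.
print "return 0 looks like success\n" if @zero;
print "bare return looks like failure\n" unless @bare;

# In scalar context the bare return gives undef; prints: undef
my $value = returns_bare();
print defined($value) ? "defined\n" : "undef\n";
```

Callers that write my @result = lots_of_checks() and then test @result get the right answer only with the bare return.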
If you want to hold to a single-exit point (which is slightly un-Perlish), you can do it without resorting to goto. As the documentation for last states:
... a block by itself is semantically identical to a loop that executes once.
Thus "last" can be used to effect an early exit out of such a block.
sub lots_of_checks {
local $_ = shift;
my $all_clear;
{
last if /condition1/;
last if /condition2/;
# ...
$all_clear = 1; # only set if all checks pass
}
return unless $all_clear;
return 1;
}
If you want to keep your single in/single out structure, you can modify the other suggestions slightly to get:
sub lots_of_checks
{
goto failstate if failcond1;
goto failstate if failcond2;
# This continues on and on until...
return 1; # O happy day!
failstate:
# Any clean up code here.
return; # Dead...
}
IMO, Perl's use of the statement modifier form "return if EXPR" makes guard clauses more readable than they are in C. When you first see the line, you know that you have a guard clause. This feature is often denigrated, but in this case I am quite fond of it.
Using the goto with the statement modifier retains the clarity, and reduces clutter, while it preserves your single exit code style. I've used this form when I had complex clean up to do after failing validation for a routine.