Why does this map block contain an apparently useless +? - perl

While browsing the source code I saw the following lines:
my #files_to_keep = qw (file1 file2);
my %keep = map { + $_ => 1 } #files_to_keep;
What does the + do in this code snippet? I used Data::Dumper to see whether taking out the plus sign does anything, but the results were the same:
$ perl cleanme.pl
$VAR1 = {
'file1' => 1,
'file2' => 1
};

This is used to prevent a parsing problem. The plus symbol forces the interpreter to behave like a normal block and not an expression.
The fear is that perhaps you are trying to create a hashreference using the other (expression) formulation of map like so.
#array_of_hashrefs = map { "\L$_" => 1 }, #array
Notice the comma. Then if the parser guesses that you are doing this given the statement in the OP there will a syntax error for missing the comma! To see the difference try quoting "$_". For whatever reason, the parser takes this as enough to trigger the expression behavior.
Yes its an oddity. Therefore many extra-paranoid Perl programmers toss in the extra plus sign more often than needed (me included).
Here are the examples from the map documentation.
%hash = map { "\L$_" => 1 } #array # perl guesses EXPR. wrong
%hash = map { +"\L$_" => 1 } #array # perl guesses BLOCK. right
%hash = map { ("\L$_" => 1) } #array # this also works
%hash = map { lc($_) => 1 } #array # as does this.
%hash = map +( lc($_) => 1 ), #array # this is EXPR and works!
%hash = map ( lc($_), 1 ), #array # evaluates to (1, #array)
For a fun read (stylistically) and a case where the parser gets it wrong read this: http://blogs.perl.org/users/tom_wyant/2012/01/the-case-of-the-overloaded-curlys.html

The unary-plus operator simply returns its operand unchanged. Adding one doesn't even change the context.
In the example you gave, it is completely useless. But there are situations where it is useful to make the next token something that's undeniably an operator.
For example, map has two syntaxes.
map EXPR, LIST
and
map BLOCK LIST
A block starts with {, but so can an expression. For example, { } can be a block or a hash constructor.
So how can map tell the difference? It guesses. Which means it's sometimes wrong.
One occasion where is guesses wrong is the following:
map { $_ => 1 }, #list
You can prod it in to guessing correctly using + or ;.
map {; ... # BLOCK
map +{ ... # EXPR
So in this case, you could use
map +{ foo => $_ }, #list
Note that you could also use the following:
map({ foo => $_ }, #list)
Another example is when you omit the parens around arguments, and the first argument expression starts with a paren.
print ($x+$y)*2; # Same as: 2 * print($x+$y)
It can be fixed using
print +($x+$y)*2;
But why pile on a hack just to avoid parens? I prefer
print(($x+$y)*2);

Related

Perl "reverse comma operator" (Example from the book Programming Perl, 4th Edition)

I'm reading "Programming Perl" and ran into a strange example that does not seem to make sense. The book describes how the comma operator in Perl will return only the last result when used in a scalar context.
Example:
# After this statement, $stuff = "three"
$stuff = ("one", "two", "three");
The book then gives this example of a "reverse comma operator" a few pages later (page 82)
# A "reverse comma operator".
return (pop(#foo), pop(#foo))[0];
However to me this doesn't seem to be reverse at all..?
Example:
# After this statement, $stuff = "three"
$stuff = reverse_comma("one", "two", "three");
# Implementation of the "reverse comma operator"
sub reverse_comma {
return (pop(#_), pop(#_))[0];
}
How is this in any way reverse of the normal comma operator? The results are the same, not reversed!
Here is a link to the exact page. The example is near the bottom.
It's a bad example, and should be forgotten.
What it's demonstrating is simple:
Normally, if you have a sequence of expressions separated by commas in scalar context, that can be interpreted an instance of the comma operator, which evaluates to the last thing in the sequence.
However, if you put that sequence in parentheses and stick [0] at the end, it turns that sequence into a list and takes its first element, e.g.
my $x = (1, 2, 3)[0];
For some reason, the book calls this the "reverse comma operator". This is a misnomer; it's just a list that's having its first element taken.
The book is confusing matters further by using the pop function twice in the arguments. These are evaluated from left to right, so the first pop evaluates to "three" and the second one to "two".
In any case: Don't ever use either the comma or "reverse comma" operators in real code. Both are likely to prove confusing to future readers.
It's a cute and clever example, but thinking too hard about it distracts from its purpose. That section of the book is showing off list slices. Anything beyond slicing a list, no matter what is in the list, is not germane to the purpose of the examples.
You're only on page 82 of a very big book (we literally couldn't fit any more pages in because we were at the limit of the binding method), so there's not much we could throw at you. Among the other list slices examples, there this clever one that I wouldn't use in real code. That's the curse of contrived examples though.
But let's say there were a reverse comma operator. It would have to evaluate both side of the comma. Many answers go right for "just return the first thing". That's not the feature though. You have to visit every expression even though you keep one of them.
Consider this much much advanced version with a series of anonymous subroutines that I immediately dereference, each of which prints something then returns a result:
use v5.10;
my $scalar = (
sub { say "First"; 35 } -> (),
sub { say "wantarray is ", 0+wantarray } -> (),
sub { say "Second"; 27 } -> (),
sub { say "Third"; 137 } -> ()
);
The parens are there only for precedence since the assignment operator binds more tightly than the comma operator. There's no list here, even though it looks like there is one.
The output shows that Perl evaluated each even though it kept on the last one:
First
wantarray is 0
Second
Third
Scalar is [137]
The poorly-named wantarray built-in returns false, noting that the subroutine thinks it is in scalar context.
Now, suppose that you wanted to flip that around so it still evaluates every expression but keeps the first one. You can use a literal list access:
my $scalar = (
sub { say "First"; 35 } -> (),
sub { say "wantarray is ", 0+wantarray } -> (),
sub { say "Second"; 27 } -> (),
sub { say "Third"; 137 } -> ()
)[0];
With the addition on the subscription, the righthand side is now a list and I pull out the first item. Notice that the second subroutine thinks it is in list context now. I get the result of the first subroutine:
First
wantarray is 1
Second
Third
Scalar is [35]
But, let's put this in a subroutine. I still need to call each subroutine even though I won't use the results of the other ones:
my $scalar = reverse_comma(
sub { say "First"; 35 },
sub { say "wantarray is ", 0+wantarray },
sub { say "Second"; 27 },
sub { say "Third"; 137 }
);
say "Scalar is [$scalar]";
sub reverse_comma { ( map { $_->() } #_ )[0] }
Or, would I use the results of the other ones? What if I did something slightly different. I'll add a side effect of setting $last to the evaluated expression:
use v5.10;
my $last;
my $scalar = reverse_comma(
sub { say "First"; 35 },
sub { say "wantarray is ", 0+wantarray },
sub { say "Second"; 27 },
sub { say "Third"; 137 }
);
say "Scalar is [$scalar]";
say "Last is [$last]";
sub reverse_comma { ( map { $last = $_->() } #_ )[0] }
Now I see the feature that makes the scalar comma interesting. It evaluates all the expressions, and some of them might have side effects:
First
wantarray is 0
Second
Third
Scalar is [35]
Last is [137]
It's not a huge secret that tchrist is handy with shell scripts. The Perl to csh converter was basically his inbox (or comp.lang.perl). You'll see some shell like idioms popping up in his examples. Something like that trick to swap two numerical values with one statement:
use v5.10;
my $x = 17;
my $y = 137;
say "x => $x, y => $y";
$x = (
$x = $x + $y,
$y = $x - $y,
$x - $y,
);
say "x => $x, y => $y";
The side effects are important there.
So, back to the Camel, we have an example of where the thing that has the side effect is the pop array operator:
use v5.10;
my #foo = qw( g h j k );
say "list: #foo";
my $scalar = sub { return ( pop(#foo), pop(#foo) )[0] }->();
say "list: #foo";
This shows off that each expression on either side of all the commas are evaluated. I threw in the subroutine wrapper since we didn't show a complete example:
list: g h j k
list: g h
But, none of this was the point of that section, which was indexing into a list literal. The point of the example was not to return a different result than the other examples or Perl features. It was to return the same thing assuming the comma operator acted differently. The section is about list slices and is showing off list slices, so the stuff in the list wasn't the important part.
Let's make the examples more similar.
The comma operator:
$scalar = ("one", "two", "three");
# $scalar now contains "three"
The reverse comma operator:
$scalar = ("one", "two", "three")[0];
# $scalar now contains "one"
It's a "reverse comma" because $scalar gets the result of the first expression, where the normal comma operator gives the last expression. (If you know anything about Lisp, it's like the difference between progn and prog1.)
An implementation of the "reverse comma operator" as a subroutine would look something like this:
sub reverse_comma {
return shift #_;
}
An ordinary comma will evaluate its operands and then return the value of the right-hand operand
my $v = $a, $b
sets $v to the value of $b
For the purpose of demonstrating list slices, the Camel is proposing some code that behaves like the comma operator but instead evaluates its operands and then return the value of the left-hand operand
Something like that can be done with a list slice, like this
my $v = ($a, $b)[0]
which sets $v to the value of $a
That's all there is to it really. The book isn't trying to suggest that there should be a reverse comma subroutine, it is simply considering the problem of evaluating two expressions in order and returning the first. The order of evaluation is relevant only when the two expressions have side effects, which is why the example in the book uses pop, which changes the array as well as returning a value
The imaginary problem is this
Suppose I want a subroutine to remove the last two elements of an array, and then return the value of what used to be the last element
Ordinarily that would require a temporary variable, like this
my $last = pop #foo;
pop #foo;
return $last;
But as an example of a list slice the code suggests that this would also work
# A "reverse comma operator".
return (pop(#foo), pop(#foo))[0];
Please understand that this isn't a recommendation. There are a few ways to do this. Another single-statement way is
return scalar splice #foo, -2;
but that doesn't use a list slice, which is the topic of that section of the book. In reality I doubt if the book's authors would propose anything other than the simple solution with the temporary variable. It is purely an example of what a list slice can do
I hope that helps

Use of uninitialized value $_ in hash element

This is regarding a warning message I received when running a Perl script.
I understand why I'm receiving this warning: probably because $element is undefined when being called but I don't see it.
for ( my $element->{$_}; #previous_company_names; ) {
map { $element => $previous_company_names->{$_} }
0 .. $previous_company_names;
The result is this message
Use of uninitialized value $_ in hash element
First and foremost - for a new programmer, absolutely the most important thing you must do, is use strict; and use warnings;. You've got my in there, which suggests you might be, but it pays to re-iterate it.
$_ is a special variable, called the implicit variable. It doesn't really make sense to use it in the way you're doing like that, in a for loop. Take a look at perlvar for some more detail.
Indeed, I'd suggest steering clear of map entirely until you really grok it, because it's a good way to confuse yourself.
With a for (or foreach) loop you can either:
for my $thing ( #list_of_things ) {
print $thing;
}
Or you can do:
for ( #list_of_things ) {
print $_;
}
$_ is set implicitly by each iteration of the second loop, which can be quite useful because lots of things default to using it.
E.g.
for ( #list_of_things ) {
chomp;
s/ /_/g;
print;
}
When it comes to map - map is a clever little function, that lets you evaluate a code block for each element in a list. Personally - I still get confused by it, and tend to stick with for or foreach loops instead, most of the time.
But what you're doing with it, isn't really going to work - map makes a hash.
So something like:
use Data::Dumper;
my %things = map { $_ => 1 } 1..5;
print Dumper \%things;
This creates the hash 'things':
$VAR1 = {
'1' => 1,
'3' => 1,
'5' => 1,
'4' => 1,
'2' => 1
};
Again, $_ is used inside, because it's the magic variable - it's set to 'whatever was in the second bit' (e.g 1,2,3,4,5) each loop, and then the block is evaluated.
So your map expression doesn't really make a lot of sense, because you don't have $element defined... and even if you did, you'd repeatedly overwrite it.
I would also note - $previous_company_names would need to be numeric, and is in NO way related to #previous_company_names. You might be meaning to use $#previous_company_names which is the last element index.

Description of Perl hash syntax in use statements

In Perl the following is allowed
use constant MY_CONSTANT => 1
however this does not match the documentation of "use" which states that it can take a list. The above is however not a list in the normal way as shown by the following command.
perl -e 'use strict; my #l = "test" => 1; print "#l\n"
This will print "test" and not "test 1".
So is this some special list syntax that can be used together with the use statement or is it also allowed in other cases?
MY_CONSTANT => 1 isn't "a hash".
The => is essentially just a comma, with the additional property that a “bareword” on the left side will be autoquoted: foo => 42 is exactly the same as 'foo', 42. Therefore we can do silly stuff like foo => bar => baz => 42. The “fat comma” should be used to indicate a relation between the left and the right value, e.g. between a hash key and value.
LIST in use Module LIST doesn't mean you need to use the list operator
LIST simply refers to an arbitrary expression that will be evaluated in list context, so not only does list operator MY_CONSTANT => 1 match the specified syntax, but so would the following:
sub f { MY_CONSTANT => 1 }
use constant f();
Be wary of precedence
The next problem you're running into is that the = operator has higher precedence than ,:
my #array = 1, 2, 3;
parses as
(my #array = 1), 2, 3;
As => is the same as ,, the line my #array = test => 1; will parse as
(my #array = "test"), 1;
Use parens to indicate the correct precedence:
my #array = (test => 1);
which will produce your expected output.

Why does Perl function "map" give the error "Not enough arguments for map"

Here is the thing I don't understand.
This script works correctly (notice the concatenation in the map functin):
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %aa = map { 'a' . '' => 1 } (1..3);
print Dumper \%aa;
__END__
output:
$VAR1 = {
'a' => 1
};
But without concatenation the map does not work. Here is the script I expect to work, but it does not:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %aa = map { 'a' => 1 } (1..3);
print Dumper \%aa;
__END__
output:
Not enough arguments for map at e.pl line 7, near "} ("
syntax error at e.pl line 7, near "} ("
Global symbol "%aa" requires explicit package name at e.pl line 9.
Execution of e.pl aborted due to compilation errors.
Can you please explain such behaviour?
Perl uses heuristics to decide whether you're using:
map { STATEMENTS } LIST; # or
map EXPR, LIST;
Because although "{" is often the start of a block, it might also be the start of a hashref.
These heuristics don't look ahead very far in the token stream (IIRC two tokens).
You can force "{" to be interpreted as a block using:
map {; STATEMENTS } LIST; # the semicolon acts as a disambigator
You can force "{" to be interpreted as a hash using:
map +{ LIST }, LIST; # the plus sign acts as a disambigator
grep suffers similarly. (Technically so does do, in that a hashref can be given as an argument, which will then be stringified and treated as if it were a filename. That's just weird though.)
Per the Documentation for map:
Because Perl doesn't look ahead for the closing } it has to take a guess at which it's dealing with based on what it finds just after the {. Usually it gets it right, but if it doesn't it won't realize something is wrong until it gets to the }
Giving the examples:
%hash = map { "\L$_" => 1 } #array # perl guesses EXPR. wrong
%hash = map { +"\L$_" => 1 } #array # perl guesses BLOCK. right
So adding + will give you the same as the first example you've given
my %aa = map { +'a'=> 1 } (1..3);
Perl's manpage entry for map() explains this:
"{" starts both hash references and blocks, so "map { ..."
could be either the start of map BLOCK LIST or map EXPR, LIST.
Because Perl doesn't look ahead for the closing "}" it has to
take a guess at which it's dealing with based on what it finds
just after the "{". Usually it gets it right, but if it doesn't
it won't realize something is wrong until it gets to the "}"
and encounters the missing (or unexpected) comma. The syntax
error will be reported close to the "}", but you'll need to
change something near the "{" such as using a unary "+" to give
Perl some help:
%hash = map { "\L$_" => 1 } #array # perl guesses EXPR. wrong
%hash = map { +"\L$_" => 1 } #array # perl guesses BLOCK. right
%hash = map { ("\L$_" => 1) } #array # this also works
%hash = map { lc($_) => 1 } #array # as does this.
%hash = map +( lc($_) => 1 ), #array # this is EXPR and works!
%hash = map ( lc($_), 1 ), #array # evaluates to (1, #array)
or to force an anon hash constructor use "+{":
#hashes = map +{ lc($_) => 1 }, #array # EXPR, so needs comma at end
to get a list of anonymous hashes each with only one entry
apiece.
Based on this, to get rid of the concatenation kludge, you'd need to adjust your syntax to one of these instead:
my %aa = map { +'a' => 1 } (1..3);
my %aa = map { ('a' => 1) } (1..3);
my %aa = map +( 'a' => 1 ), (1..3);
The braces are a little ambiguous in the context of map. They can be surrounding a block as you are intending, or they can be an anonymous hash constructor. There is some fuzzy logic in the perl parser which tries to guess which one you mean.
Your second case looks more like an anonymous hash to perl.
See the perldoc for map which explains this and gives some workarounds.

How can I find out if a hash has an odd number of elements in assignment?

How could I find out if this hash has an odd number of elements?
my %hash = ( 1, 2, 3, 4, 5 );
Ok, I should have written more information.
sub routine {
my ( $first, $hash_ref ) = #_;
if ( $hash_ref refers to a hash with odd numbers of elements ) {
"Second argument refers to a hash with odd numbers of elements.\nFalling back to default values";
$hash_ref = { option1 => 'office', option2 => 34, option3 => 'fast' };
}
...
...
}
routine( [ 'one', 'two', 'three' ], { option1 =>, option2 => undef, option3 => 'fast' );
Well, I suppose there is some terminological confusion in the question that should be clarified.
A hash in Perl always has the same number of keys and values - because it's fundamentally an engine to store some values by their keys. I mean, key-value pair should be considered as a single element here. )
But I guess that's not what was asked really. ) I suppose the OP tried to build a hash from a list (not an array - the difference is subtle, but it's still there), and got the warning.
So the point is to check the number of elements in the list which will be assigned to a hash. It can be done as simple as ...
my #list = ( ... there goes a list ... );
print #list % 2; # 1 if the list had an odd number of elements, 0 otherwise
Notice that % operator imposes the scalar context on the list variable: it's simple and elegant. )
UPDATE as I see, the problem is slightly different. Ok, let's talk about the example given, simplifying it a bit.
my $anhash = {
option1 =>,
option2 => undef,
option3 => 'fast'
};
See, => is just a syntax sugar; this assignment could be easily rewritten as...
my $anhash = {
'option1', , 'option2', undef, 'option3', 'fast'
};
The point is that missing value after the first comma and undef are not the same, as lists (any lists) are flattened automatically in Perl. undef can be a normal element of any list, but empty space will be just ignored.
Take note the warning you care about (if use warnings is set) will be raised before your procedure is called, if it's called with an invalid hash wrapped in reference. So whoever caused this should deal with it by himself, looking at his own code: fail early, they say. )
You want to use named arguments, but set some default values for missing ones? Use this technique:
sub test_sub {
my ($args_ref) = #_;
my $default_args_ref = {
option1 => 'xxx',
option2 => 'yyy',
};
$args_ref = { %$default_args_ref, %$args_ref, };
}
Then your test_sub might be called like this...
test_sub { option1 => 'zzz' };
... or even ...
test_sub {};
The simple answer is: You get a warning about it:
Odd number of elements in hash assignment at...
Assuming you have not been foolish and turned warnings off.
The hard answer is, once assignment to the hash has been done (and warning issued), it is not odd anymore. So you can't.
my %hash = (1,2,3,4,5);
use Data::Dumper;
print Dumper \%hash;
$VAR1 = {
'1' => 2,
'3' => 4,
'5' => undef
};
As you can see, undef has been inserted in the empty spot. Now, you can check for undefined values and pretend that any existing undefined values constitutes an odd number of elements in the hash. However, should an undefined value be a valid value in your hash, you're in trouble.
perl -lwe '
sub isodd { my $count = #_ = grep defined, #_; return ($count % 2) };
%a=(a=>1,2);
print isodd(%a);'
Odd number of elements in hash assignment at -e line 1.
1
In this one-liner, the function isodd counts the defined arguments and returns whether the amount of arguments is odd or not. But as you can see, it still gives the warning.
You can use the __WARN__ signal to "trap" for when a hash assignment is incorrect.
use strict ;
use warnings ;
my $odd_hash_length = 0 ;
{
local $SIG{__WARN__} = sub {
my $msg = shift ;
if ($msg =~ m{Odd number of elements in hash assignment at}) {
$odd_hash_length = 1 ;
}
} ;
my %hash = (1, 2, 3, 4, 5) ;
}
# Now do what you want based on $odd_hash_length
if ($odd_hash_length) {
die "the hash had an odd hash length assignment...aborting\n" ;
} else {
print "the hash was initialized correctly\n";
}
See also How to capture and save warnings in Perl.