Perl backticks in hash ref giving different results - perl

I have a file called a.gz which is a gzipped file which contains the following lines when unzipped:
a
b
Below are two blocks of perl code which I think "should" give the same results but they don't.
Code #1:
use Data::Dumper;
my $s = {
status => 'ok',
msg => `zcat a.gz`
};
print Dumper($s),"\n";
Code #2:
use Data::Dumper;
my $content = `zcat a.gz`;
my $s = {
status => 'ok',
msg => $content
};
print Dumper($s), "\n";
Code #1 gives the following result:
Odd number of elements in anonymous hash at ./x.pl line 8.
$VAR1 = {
'msg' => 'a
',
'b
' => undef,
'status' => 'ok'
};
Code #2 returns the following result:
$VAR1 = {
'msg' => 'a
b
',
'status' => 'ok'
};
I'm using perl 5.10.1 running in Linux

perldoc perlop:
In scalar context, it comes back as a single (potentially multi-line) string, or undef if the command failed. In list context, returns a list of lines (however you've defined lines with $/ or $INPUT_RECORD_SEPARATOR), or an empty list if the command failed.
Assigning to a scalar puts `` in scalar context; using it in { ... } puts it in list context.
{ LIST } takes a list and interprets its contents alternating between keys and values, i.e. key1, value1, key2, value2, key3, value3, .... If the number of elements is odd, you get a warning (and the missing value is taken to be undef).
LIST , LIST (the comma operator in list context) concatenates two lists.
=> works just like , but automatically quotes the identifier to its left (if there is one).

Related

Perl: How does hash assignment with 'map' work?

I have trouble understanding how to assign to a hash using the map function.
Why does
my %a = map {$_=>1 if $_>=2} (1..4);
give me an Odd number of elements in hash assignment error while
my %a = map {$_=>1 if $_>2} (1..4);
gives me
$VAR1 = {
'' => '',
'4' => 1,
'3' => 1
};
and why is there only one empty string in the hash? If I assign to an array
my #a = map {$_ if $_>2} (1..4);
$VAR1 = [
'',
'',
3,
4
];
I get two empty strings, which makes more sense to me.
Is there a possibility to return no empty string if the condition is not met?
Although map is not the best way to do this job (grep as pointed out would be better), it is still possible just using map with the ? comparison:
#!/usr/bin/perl
use strict ;
use warnings ;
use Data::Dumper ;
my %a = map { $_>2 ? ( $_ => 1 ) : () } (1..4) ;
print Dumper( \%a ) ;
Returning the empty list makes map behave like grep when condition is not met.
>perl test.pl
$VAR1 = {
'4' => 1,
'3' => 1
};
map transforms a list into another list. In the first case, your input list is 1, 2, 3, 4. For each member, you return a tuple if the member is >= 2, but otherwise, you return just a single value. The single value is returned for 1 only and causes the "odd number of elements".
In the second case, the transformation works as follows:
input | output
------+-------
1 | ''
2 | ''
3 | 3 => 1
4 | 4 => 1
If you make a hash from it, you take the first empty string as the key, the second empty string as the value, which creates "one empty string in the hash" - there are in fact two.

Description of Perl hash syntax in use statements

In Perl the following is allowed
use constant MY_CONSTANT => 1
however this does not match the documentation of "use" which states that it can take a list. The above is however not a list in the normal way as shown by the following command.
perl -e 'use strict; my #l = "test" => 1; print "#l\n"
This will print "test" and not "test 1".
So is this some special list syntax that can be used together with the use statement or is it also allowed in other cases?
MY_CONSTANT => 1 isn't "a hash".
The => is essentially just a comma, with the additional property that a “bareword” on the left side will be autoquoted: foo => 42 is exactly the same as 'foo', 42. Therefore we can do silly stuff like foo => bar => baz => 42. The “fat comma” should be used to indicate a relation between the left and the right value, e.g. between a hash key and value.
LIST in use Module LIST doesn't mean you need to use the list operator
LIST simply refers to an arbitrary expression that will be evaluated in list context, so not only does list operator MY_CONSTANT => 1 match the specified syntax, but so would the following:
sub f { MY_CONSTANT => 1 }
use constant f();
Be wary of precedence
The next problem you're running into is that the = operator has higher precedence than ,:
my #array = 1, 2, 3;
parses as
(my #array = 1), 2, 3;
As => is the same as ,, the line my #array = test => 1; will parse as
(my #array = "test"), 1;
Use parens to indicate the correct precedence:
my #array = (test => 1);
which will produce your expected output.

Why does Perl function "map" give the error "Not enough arguments for map"

Here is the thing I don't understand.
This script works correctly (notice the concatenation in the map functin):
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %aa = map { 'a' . '' => 1 } (1..3);
print Dumper \%aa;
__END__
output:
$VAR1 = {
'a' => 1
};
But without concatenation the map does not work. Here is the script I expect to work, but it does not:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %aa = map { 'a' => 1 } (1..3);
print Dumper \%aa;
__END__
output:
Not enough arguments for map at e.pl line 7, near "} ("
syntax error at e.pl line 7, near "} ("
Global symbol "%aa" requires explicit package name at e.pl line 9.
Execution of e.pl aborted due to compilation errors.
Can you please explain such behaviour?
Perl uses heuristics to decide whether you're using:
map { STATEMENTS } LIST; # or
map EXPR, LIST;
Because although "{" is often the start of a block, it might also be the start of a hashref.
These heuristics don't look ahead very far in the token stream (IIRC two tokens).
You can force "{" to be interpreted as a block using:
map {; STATEMENTS } LIST; # the semicolon acts as a disambigator
You can force "{" to be interpreted as a hash using:
map +{ LIST }, LIST; # the plus sign acts as a disambigator
grep suffers similarly. (Technically so does do, in that a hashref can be given as an argument, which will then be stringified and treated as if it were a filename. That's just weird though.)
Per the Documentation for map:
Because Perl doesn't look ahead for the closing } it has to take a guess at which it's dealing with based on what it finds just after the {. Usually it gets it right, but if it doesn't it won't realize something is wrong until it gets to the }
Giving the examples:
%hash = map { "\L$_" => 1 } #array # perl guesses EXPR. wrong
%hash = map { +"\L$_" => 1 } #array # perl guesses BLOCK. right
So adding + will give you the same as the first example you've given
my %aa = map { +'a'=> 1 } (1..3);
Perl's manpage entry for map() explains this:
"{" starts both hash references and blocks, so "map { ..."
could be either the start of map BLOCK LIST or map EXPR, LIST.
Because Perl doesn't look ahead for the closing "}" it has to
take a guess at which it's dealing with based on what it finds
just after the "{". Usually it gets it right, but if it doesn't
it won't realize something is wrong until it gets to the "}"
and encounters the missing (or unexpected) comma. The syntax
error will be reported close to the "}", but you'll need to
change something near the "{" such as using a unary "+" to give
Perl some help:
%hash = map { "\L$_" => 1 } #array # perl guesses EXPR. wrong
%hash = map { +"\L$_" => 1 } #array # perl guesses BLOCK. right
%hash = map { ("\L$_" => 1) } #array # this also works
%hash = map { lc($_) => 1 } #array # as does this.
%hash = map +( lc($_) => 1 ), #array # this is EXPR and works!
%hash = map ( lc($_), 1 ), #array # evaluates to (1, #array)
or to force an anon hash constructor use "+{":
#hashes = map +{ lc($_) => 1 }, #array # EXPR, so needs comma at end
to get a list of anonymous hashes each with only one entry
apiece.
Based on this, to get rid of the concatenation kludge, you'd need to adjust your syntax to one of these instead:
my %aa = map { +'a' => 1 } (1..3);
my %aa = map { ('a' => 1) } (1..3);
my %aa = map +( 'a' => 1 ), (1..3);
The braces are a little ambiguous in the context of map. They can be surrounding a block as you are intending, or they can be an anonymous hash constructor. There is some fuzzy logic in the perl parser which tries to guess which one you mean.
Your second case looks more like an anonymous hash to perl.
See the perldoc for map which explains this and gives some workarounds.

Why does this map block contain an apparently useless +?

While browsing the source code I saw the following lines:
my #files_to_keep = qw (file1 file2);
my %keep = map { + $_ => 1 } #files_to_keep;
What does the + do in this code snippet? I used Data::Dumper to see whether taking out the plus sign does anything, but the results were the same:
$ perl cleanme.pl
$VAR1 = {
'file1' => 1,
'file2' => 1
};
This is used to prevent a parsing problem. The plus symbol forces the interpreter to behave like a normal block and not an expression.
The fear is that perhaps you are trying to create a hashreference using the other (expression) formulation of map like so.
#array_of_hashrefs = map { "\L$_" => 1 }, #array
Notice the comma. Then if the parser guesses that you are doing this given the statement in the OP there will a syntax error for missing the comma! To see the difference try quoting "$_". For whatever reason, the parser takes this as enough to trigger the expression behavior.
Yes its an oddity. Therefore many extra-paranoid Perl programmers toss in the extra plus sign more often than needed (me included).
Here are the examples from the map documentation.
%hash = map { "\L$_" => 1 } #array # perl guesses EXPR. wrong
%hash = map { +"\L$_" => 1 } #array # perl guesses BLOCK. right
%hash = map { ("\L$_" => 1) } #array # this also works
%hash = map { lc($_) => 1 } #array # as does this.
%hash = map +( lc($_) => 1 ), #array # this is EXPR and works!
%hash = map ( lc($_), 1 ), #array # evaluates to (1, #array)
For a fun read (stylistically) and a case where the parser gets it wrong read this: http://blogs.perl.org/users/tom_wyant/2012/01/the-case-of-the-overloaded-curlys.html
The unary-plus operator simply returns its operand unchanged. Adding one doesn't even change the context.
In the example you gave, it is completely useless. But there are situations where it is useful to make the next token something that's undeniably an operator.
For example, map has two syntaxes.
map EXPR, LIST
and
map BLOCK LIST
A block starts with {, but so can an expression. For example, { } can be a block or a hash constructor.
So how can map tell the difference? It guesses. Which means it's sometimes wrong.
One occasion where is guesses wrong is the following:
map { $_ => 1 }, #list
You can prod it in to guessing correctly using + or ;.
map {; ... # BLOCK
map +{ ... # EXPR
So in this case, you could use
map +{ foo => $_ }, #list
Note that you could also use the following:
map({ foo => $_ }, #list)
Another example is when you omit the parens around arguments, and the first argument expression starts with a paren.
print ($x+$y)*2; # Same as: 2 * print($x+$y)
It can be fixed using
print +($x+$y)*2;
But why pile on a hack just to avoid parens? I prefer
print(($x+$y)*2);

How can I find out if a hash has an odd number of elements in assignment?

How could I find out if this hash has an odd number of elements?
my %hash = ( 1, 2, 3, 4, 5 );
Ok, I should have written more information.
sub routine {
my ( $first, $hash_ref ) = #_;
if ( $hash_ref refers to a hash with odd numbers of elements ) {
"Second argument refers to a hash with odd numbers of elements.\nFalling back to default values";
$hash_ref = { option1 => 'office', option2 => 34, option3 => 'fast' };
}
...
...
}
routine( [ 'one', 'two', 'three' ], { option1 =>, option2 => undef, option3 => 'fast' );
Well, I suppose there is some terminological confusion in the question that should be clarified.
A hash in Perl always has the same number of keys and values - because it's fundamentally an engine to store some values by their keys. I mean, key-value pair should be considered as a single element here. )
But I guess that's not what was asked really. ) I suppose the OP tried to build a hash from a list (not an array - the difference is subtle, but it's still there), and got the warning.
So the point is to check the number of elements in the list which will be assigned to a hash. It can be done as simple as ...
my #list = ( ... there goes a list ... );
print #list % 2; # 1 if the list had an odd number of elements, 0 otherwise
Notice that % operator imposes the scalar context on the list variable: it's simple and elegant. )
UPDATE as I see, the problem is slightly different. Ok, let's talk about the example given, simplifying it a bit.
my $anhash = {
option1 =>,
option2 => undef,
option3 => 'fast'
};
See, => is just a syntax sugar; this assignment could be easily rewritten as...
my $anhash = {
'option1', , 'option2', undef, 'option3', 'fast'
};
The point is that missing value after the first comma and undef are not the same, as lists (any lists) are flattened automatically in Perl. undef can be a normal element of any list, but empty space will be just ignored.
Take note the warning you care about (if use warnings is set) will be raised before your procedure is called, if it's called with an invalid hash wrapped in reference. So whoever caused this should deal with it by himself, looking at his own code: fail early, they say. )
You want to use named arguments, but set some default values for missing ones? Use this technique:
sub test_sub {
my ($args_ref) = #_;
my $default_args_ref = {
option1 => 'xxx',
option2 => 'yyy',
};
$args_ref = { %$default_args_ref, %$args_ref, };
}
Then your test_sub might be called like this...
test_sub { option1 => 'zzz' };
... or even ...
test_sub {};
The simple answer is: You get a warning about it:
Odd number of elements in hash assignment at...
Assuming you have not been foolish and turned warnings off.
The hard answer is, once assignment to the hash has been done (and warning issued), it is not odd anymore. So you can't.
my %hash = (1,2,3,4,5);
use Data::Dumper;
print Dumper \%hash;
$VAR1 = {
'1' => 2,
'3' => 4,
'5' => undef
};
As you can see, undef has been inserted in the empty spot. Now, you can check for undefined values and pretend that any existing undefined values constitutes an odd number of elements in the hash. However, should an undefined value be a valid value in your hash, you're in trouble.
perl -lwe '
sub isodd { my $count = #_ = grep defined, #_; return ($count % 2) };
%a=(a=>1,2);
print isodd(%a);'
Odd number of elements in hash assignment at -e line 1.
1
In this one-liner, the function isodd counts the defined arguments and returns whether the amount of arguments is odd or not. But as you can see, it still gives the warning.
You can use the __WARN__ signal to "trap" for when a hash assignment is incorrect.
use strict ;
use warnings ;
my $odd_hash_length = 0 ;
{
local $SIG{__WARN__} = sub {
my $msg = shift ;
if ($msg =~ m{Odd number of elements in hash assignment at}) {
$odd_hash_length = 1 ;
}
} ;
my %hash = (1, 2, 3, 4, 5) ;
}
# Now do what you want based on $odd_hash_length
if ($odd_hash_length) {
die "the hash had an odd hash length assignment...aborting\n" ;
} else {
print "the hash was initialized correctly\n";
}
See also How to capture and save warnings in Perl.