$ and % operator being used in conjunction - perl

I'm busy learning Perl at the moment and I've been given some code to look at and "solve".
foreach $field (keys %$exam)
The code above is the part I'm having difficulty understanding. I thought $ was a scalar and % was a hash, so I'm unsure what %$ is.
Any help appreciated!
Thanks guys.

%$exam means that the hash is being reached through a reference rather than by name, i.e. somewhere before this statement $exam was assigned a reference to a hash (for example $exam = \%somehash, or $exam = { a => 1 } for an anonymous hash ref). To get at the referenced hash you have to dereference it with this syntax. Written unambiguously, it is %{$exam}.

$exam = {a=>1, b=>2}; # anonymous hash; $exam is a reference to it
To use this reference as a hash, put the dereferencing sigil % in front of it:
foreach $field (keys %$exam)
The same works for an array reference:
$a = [1,2,3,4]; # anonymous array; $a is a reference to it
Here you put the sigil @ in front of the reference $a to dereference it:
foreach $element (@$a) {print $element;}
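A minimal runnable sketch of the two dereferences above, assuming strict mode and lexical variables. The name $numbers and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# $exam holds a reference to an anonymous hash (sample data is made up).
my $exam = { maths => 71, physics => 64 };

# %$exam and %{$exam} both dereference the hash ref.
my @fields = sort keys %$exam;
my @same   = sort keys %{$exam};
print "fields: @fields\n";

# The array equivalent puts @ in front of the reference.
my $numbers = [1, 2, 3, 4];
my $total   = 0;
$total += $_ for @$numbers;
print "total: $total\n";
```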

This is the syntax for dereferencing the $exam reference variable.
See
http://perldoc.perl.org/perlreftut.html
http://perldoc.perl.org/perlref.html

Related

Understanding the 'foreach' syntax for the keys of a hash

I have created a simple Perl hash
Sample.pl
$skosName = 'foo';
$skosId = 'abc123';
$skosFile{'type'}{$skosId} = $skosName;
Later on I try to print the hash values using foreach.
This variant works
foreach $skosfile1type ( keys %{skosFile} ){
print ...
}
While this one doesn't
foreach $skosfile1type ( keys %{$skosFile} ) {
print ...
}
What is the difference between the two foreach statements?
In particular, what is the significance of the dollar sign $ in the statement that doesn't work?
Is it something to do with scope, or perhaps my omission of the my or our keywords?
%{skosFile} is the same as %skosFile. It refers to a hash variable with that name. That form isn't usually used for a simple variable name, but it is allowed.
%{$skosFile} means to look at the scalar variable $skosFile (remember, in Perl, $foo, %foo, and @foo are distinct variables) and, expecting $skosFile to be a hash ref, return the hash that the reference points to. It is equivalent to %$skosFile, but in fact any expression that returns a hash ref can appear inside %{...}.
The syntax %{ $scalar } is used to tell Perl that $scalar holds a hash ref and that you want to undo the reference. That is why you need the dollar sign $: $skosFile is the variable you are trying to dereference.
In the same fashion, @{ $scalar } serves to dereference an array.
Although it does not work for complex constructions, in simple cases you may also abbreviate %{$scalar} to %$scalar and @{$scalar} to @$scalar.
In the case of the expression keys %{$skosFile}, keys needs a hash, which you obtain by dereferencing $skosFile, a hash ref. In fact, the typical foreach loop for a hash looks like:
foreach my $key ( keys %hash ) {
# do something with $key
}
When you iterate a hash ref:
foreach my $key ( keys %{ $hashref } ) {
# do something with $key
}
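The difference between the two forms can be sketched as follows. This is only an illustration under strict mode: the scalar $skosFile and its contents are hypothetical and exist purely to contrast with the named hash %skosFile from the question.

```perl
use strict;
use warnings;

# A named hash and a scalar of the same name are unrelated variables.
my %skosFile = ( type => { abc123 => 'foo' } );
my $skosFile = { other => 1 };   # hypothetical hash ref, for contrast only

my @from_hash = sort keys %{skosFile};    # same as keys %skosFile
my @from_ref  = sort keys %{$skosFile};   # keys of the hash $skosFile points to

# Any expression yielding a hash ref can go inside %{ ... },
# here the inner hash stored under the 'type' key.
my @ids = sort keys %{ $skosFile{type} };

print "hash: @from_hash\nref: @from_ref\nids: @ids\n";
```

With use strict in effect and no $skosFile declared, the second form would be rejected at compile time, which is one way to catch this kind of mix-up early.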

Anonymous Hash Slices - syntax?

I love hash slices and use them frequently:
my %h;
@h{@keys}=@vals;
Works brilliantly! But 2 things have always vexed me.
First, is it possible to combine the 2 lines above into a single line of code? It would be nice to declare the hash and populate it all at once.
Second, is it possible to slice an existing anonymous hash... something like:
my $slice=$anonh->{@fields}
First question:
my %h = map { $keys[$_] => $vals[$_] } 0..$#keys;
or
use List::MoreUtils qw( mesh );
my %h = mesh @keys, @vals;
Second question:
If it's ...NAME... for a hash, it's ...{ $href }... for a hash ref, so
my @slice = @hash{@fields};
is
my @slice = @{ $anonh }{@fields};
The curlies are optional if the reference expression is a variable.
my @slice = @$anonh{@fields};
Mini-Tutorial: Dereferencing Syntax
References quick reference
perlref
perlreftut
perldsc
perllol
For your first question, to do it in a single line of code:
@$_{@keys}=@vals for \my %h;
or
map @$_{@keys}=@vals, \my %h;
but I wouldn't do that; it's a confusing way to write it.
Either version declares the variable, immediately takes a reference to it, and aliases $_ to that reference so that the hash reference can be used in a slice. This lets you declare the variable in the existing scope; @{ \my %h }{@keys} = @vals; also "works", but has the unfortunate drawback of scoping %h to that tiny block in the hash slice.
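A small sketch comparing the usual two-step form with the one-statement form. The contents of @keys and @vals and the name %two_step are made up for the example:

```perl
use strict;
use warnings;

my @keys = qw(a b c);
my @vals = (1, 2, 3);

# Usual two-step form: declare, then assign through a hash slice.
my %two_step;
@two_step{@keys} = @vals;

# One-statement form: $_ is aliased to a reference to the new hash,
# and %h stays visible in the enclosing scope afterwards.
@$_{@keys} = @vals for \my %h;

print join(',', map { "$_=$h{$_}" } sort keys %h), "\n";
```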
For your second question, as shown above, slices can be used on hash references; see http://perlmonks.org/?node=References+quick+reference for some easy to remember rules.
my @slice = @$anonh{@fields};
or maybe you meant:
my $slice = [ @$anonh{@fields} ];
but @slice/$slice there is a copy of the values. To get an array of aliases to the hash values, you can do:
my $slice = sub { \@_ }->( @$anonh{@fields} );
Hash slice syntax is
@ <hash-name-or-hash-ref> { LIST }
When you are slicing a hash reference, enclose it in curly braces so it doesn't get dereferenced as an array. This gives you:
my @values = @{$anonh}{@fields}
for a hash reference $anonh.
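The slice forms discussed in these answers can be tried side by side. This is a sketch with made-up sample data in $anonh and @fields:

```perl
use strict;
use warnings;

my $anonh  = { name => 'Ann', age => 30, city => 'Oslo' };   # sample data
my @fields = qw(name city);

my @braced = @{$anonh}{@fields};     # explicit block form
my @plain  = @$anonh{@fields};       # braces omitted for a simple variable
my $copy   = [ @$anonh{@fields} ];   # array ref holding copies of the values

$copy->[0] = 'Bob';                  # changes the copy only, not the hash
print "@braced | @plain | $anonh->{name}\n";
```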

How to manipulate a hash-ref with Perl?

Take a look at this code. After hours of trial and error, I finally got a solution, but I have no idea why it works, and to be quite honest, Perl is throwing me for a loop here.
use Data::Diff 'Diff';
use Data::Dumper;
my $out = Diff(\@comparr,\@grabarr);
my @uniq_a;
@temp = ();
my $x = @$out{uniq_a};
foreach my $y (@$x) {
@temp = ();
foreach my $z (@$y) {
push(@temp, $z);
}
push(@uniq_a, [$temp[0], $temp[1], $temp[2], $temp[3]]);
}
Why is it that the only way I can access the elements of the $out array is to pass a hash key into a scalar which has been cast as an array using a for loop? my $x = @$out{uniq_a}; I'm totally confused. I'd really appreciate anyone who can explain what's going on here so I'll know for the future. Thanks in advance.
$out is a hash reference, and you use the dereferencing operator ->{...} to access members of the hash that it refers to, like
$out->{uniq_a}
What you have stumbled on is Perl's hash slice notation, where you use the @ sigil in front of the name of a hash to conveniently extract a list of values from that hash. For example:
%foo = ( a => 123, b => 456, c => 789 );
$foo = { a => 123, b => 456, c => 789 };
print @foo{"b","c"}; # 456,789
print @$foo{"c","a"}; # 789,123
Using hash slice notation with a single element inside the braces, as you do, is not the typical usage and gives you the results you want by accident.
The Diff function returns a hash reference. You are accessing the element of this hash that has key uniq_a by extracting a one-element slice of the hash, instead of the correct $out->{uniq_a}. Your code should look like this
my $out = Diff(\@comparr, \@grabarr);
my @uniq_a;
my $uniq_a = $out->{uniq_a};
for my $list (@$uniq_a) {
my @temp = @$list;
push @uniq_a, [ @temp[0..3] ];
}
In the documentation for Data::Diff it states:
The value returned is always a hash reference and the hash will have
one or more of the following hash keys: type, same, diff, diff_a,
diff_b, uniq_a and uniq_b
So $out is a reference and you have to access the values through the mentioned keys.
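To try the corrected access pattern without installing Data::Diff, here is a sketch that uses a hand-built hash ref as a hypothetical stand-in for the return value of Diff. Only the uniq_a key is modelled and the numbers are invented:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the structure Diff() returns; only the
# uniq_a key is modelled here.
my $out = { uniq_a => [ [1, 2, 3, 4, 5], [6, 7, 8, 9, 10] ] };

# Direct element access through the arrow operator.
my $uniq_a = $out->{uniq_a};

# A one-element hash slice yields a one-element list holding the same value.
my ($via_slice) = @{$out}{'uniq_a'};

my @uniq_a;
for my $list (@$uniq_a) {
    push @uniq_a, [ @{$list}[0 .. 3] ];   # first four items of each inner array
}

print scalar(@uniq_a), " rows, first row: @{ $uniq_a[0] }\n";
```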

consecutive operators and brackets

I'm just trying to learn a bit of Perl and have come across this:
foreach $element (@{$records})
{
do something;
}
To my newbie eyes, this reads:
"for each element in an array named @{$records}, do something"
but, since that seems an unlikely name for an array (with "@{$" altogether), I imagine it isn't that simple?
I've also come across "%$" used together.
I know % signifies a hash and $ signifies a scalar but don't know what they mean together.
Can anyone shed any light on these?
In Perl you can have a reference (a pointer) to a data structure:
# an array
my @array;
# a reference to an array
my $ref = \@array;
When you have a reference, you need to dereference it to be able to use the array:
@{ $ref }
If you need to access an element as in
$array[0]
you can do the same with a reference
${$ref}[0]
The curly brackets {} are optional and you can also use
$$ref[0]
@$ref
but I personally find them less readable.
The same applies to every other type (as %$ for a hash reference).
See man perlref for the details and man perlreftut for a tutorial.
Edit
The arrow operator -> can also be used to access an element through an array or a hash reference
$array_ref->[0]
or
$hash_ref->{key}
See man perlop for details
If you have a reference to an array or a hash, you would use a scalar to hold the reference:
my $href = \%hash;
my $aref = \@array;
When you want to de-reference these references, you would use the symbol appropriate for the reference type:
for my $element (@$aref) {
}
for my $key (keys %$href) {
}
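The spellings mentioned in these answers can be compared in one short sketch. The array contents and variable names are made up for illustration:

```perl
use strict;
use warnings;

my @array = (10, 20, 30);
my $aref  = \@array;             # reference to a named array

# Whole-array dereference, with and without braces.
my $count_braced = scalar @{ $aref };
my $count_plain  = scalar @$aref;

# Single-element access: three spellings of the same thing.
my $first_a = ${$aref}[0];
my $first_b = $$aref[0];
my $first_c = $aref->[0];

# The reference points at the original array, not at a copy.
$aref->[0] = 99;
print "count=$count_plain first=$first_c now=$array[0]\n";
```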

What does `$hash{$key} |= {}` do in Perl?

I was wrestling with some Perl that uses hash references.
In the end it turned out that my problem was the line:
$myhash{$key} |= {};
That is, "assign $myhash{$key} a reference to an empty hash, unless it already has a value".
Dereferencing this and trying to use it as a hash reference, however, resulted in interpreter errors about using a string as a hash reference.
Changing it to:
if( ! exists $myhash{$key}) {
$myhash{$key} = {};
}
... made things work.
So I don't have a problem. But I'm curious about what was going on.
Can anyone explain?
You're seeing an error about using a string as a hash reference because you're using the wrong operator. |= means "bitwise-or-assign." In other words,
$foo |= $bar;
is the same as
$foo = $foo | $bar
What's happening in your example is that your new anonymous hash reference is getting stringified, then bitwise-ORed with the value of $myhash{$key}. To confuse matters further, if $myhash{$key} is undefined at the time, the value is the simple stringification of the hash reference, which looks like HASH(0x80fc284). So if you do a cursory inspection of the structure, it may look like a hash reference, but it's not. Here's some useful output via Data::Dumper:
perl -MData::Dumper -le '$hash{foo} |= { }; print Dumper \%hash'
$VAR1 = {
'foo' => 'HASH(0x80fc284)'
};
And here's what you get when you use the correct operator:
perl -MData::Dumper -le '$hash{foo} ||= { }; print Dumper \%hash'
$VAR1 = {
'foo' => {}
};
Perl has shorthand assignment operators. The ||= operator is often used to set default values for variables due to Perl's feature of having logical operators return the last value evaluated. The problem is that you used |= which is a bitwise or instead of ||= which is a logical or.
As of Perl 5.10 it's better to use //= instead. // is the logical defined-or operator and doesn't fail in the corner case where the current value is defined but false.
I think your problem was using "|=" (bitwise-or assignment) instead of "||=" (assign if false).
Note that your new code is not exactly equivalent. The difference is that "$myhash{$key} ||= {}" will replace existing-but-false values with a hash reference, but the new one won't. In practice, this is probably not relevant.
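The differences between these operators can be checked with a short sketch. The hash names and keys are hypothetical, and the bitwise case assumes the default string/number behaviour of |, that is, without the bitwise feature enabled:

```perl
use strict;
use warnings;

my %h;
$h{bit_or} |= {};    # bitwise or: the new hash ref gets stringified
$h{log_or} ||= {};   # logical or: the hash ref itself is stored

# Three keys that exist and are defined, but hold a false value.
my %defaults = ( a => 0, b => 0, c => 0 );
$defaults{a} ||= {};                           # false value is replaced
$defaults{b} //= {};                           # defined value is kept
$defaults{c} = {} if !exists $defaults{c};     # existing key is kept

printf "bit_or holds %s\n", ref($h{bit_or}) ? 'a reference' : 'a plain string';
printf "log_or holds %s\n", ref($h{log_or}) ? 'a reference' : 'a plain string';
```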
Try this:
my %myhash;
$myhash{$key} ||= {};
You can't declare a hash element in a my clause, as far as I know. You declare the hash first, then add the element in.
Edit: I see you've taken out the my. How about trying ||= instead of |=? The former is idiomatic for "lazy" initialisation.