Storing a hash in a hash - perl

I'm having troubles with a Perl script. I try to store a hash in a hash. The script is trivial:
use Data::Dumper;
my %h1=();
$h1{name}="parent";
my %h2=();
$h2{name}="child";
$h1{nested}=%h2; # store hash h2 in hash h1
print "h2:\n";
print Dumper(%h2); # works
print "h1{nested}:\n";
print Dumper($h1{nested}); # fails
The results:
h2:
$VAR1 = 'name';
$VAR2 = 'child';
h1{nested}:
$VAR1 = '1/8';
Why is the $h1{nested} not dumped as a hash, but as some kind of weird scalar (1/8)?
PS: even if this question sounds trivial - I searched SO but did not find that it was asked before.
PPS: my Perl is v5.10.1 (*) built for x86_64-linux-gnu-thread-multi
(with 53 registered patches, see perl -V for more detail)

You can only store a hashref in a hash:
$h1{nested}=\%h2;
and then you would access %h2's name by doing
$h1{nested}->{name}
In your code, %h2 is forced to scalar context, which shows you that "1/8" value, and stores that.

In perl the values stored in a list (hash or array) are always scalars. Given this, the only way to store a hash inside another hash is to store a reference to it.
$h1{'nested'} = \%h2;
or also
$h1{'nested'} = { 'name'=>'child' };
(the braces in the right hand side is a reference to an anonymous hash).
BTW, to not quote the literals in the keys is usually considered bad practice, see here

Why is the $h1{nested} not dumped as a hash, but as some kind of weird scalar (1/8)?
Because you're storing it in a scalar context!
When you do this:
$h1{nested} = %h2;
You're storing a scalar. Since %h2 is a hash, you're given the ol' fraction string. According to the Perldoc website
If you evaluate a hash in scalar context, it returns false if the hash is empty. If there are any key/value pairs, it returns true; more precisely, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash.
That explains the 1/8 you're getting.
What you need to do is store the hash as a reference in the other hash. As others pointed out, it should be:
$h1{nested} = \%h2;
The backslash before the hash's name gives you the memory location where the hash is stored. You can use the curly braces, but I prefer the backslash notation.
Take a look at perldoc prelreftut on your computer (or on the webpage I've linked to). This will tell you how to make such things as a list of lists, hashes or hashes, lists of hashes, and hashes of lists. Just a word o` warning: If you get too complex, it'll be hard to maintain, so once you've had your fun, take a look at perldoc's Perl Object Orientation Programming Tutorial.
The perldoc command contains lots of Perl documentation including for all Perl function, Perl modules installed on your system, and even basic information about the Perl language.

Related

Perl assign array elements to hash user defined key

Below is code in which I need help.
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
my #arrayElements = ('Array Functions');
print join(", ", #arrayElements);
### Output => Array Functions
my %hashElements = ();
I want to assign the content of #arrayElements to $hashElements{Item}
Missing some core concepts or trying wrong and been a while struggling with this.
You seem to be missing some core concepts of Perl (or programming in general). If you are learning Perl through a book or online tutorial, I suggest you re-read the chapters on arrays and hashes.
Let's look at the things involved here. You have:
#arrayElements, which is an array. It contains a list with one elements, the string 'Array Functions'.
%hashElements, which is a hash. It's empty.
$hashElements{Item}, which is a scalar value. You want to set this.
You say you want $hashElements{Item} to have the value 'Array Functions', which you have as the first element in your array #arrayElements.
$hashElements{Item} = $arrayElements[0];
And that's it. Both $hashElements{Item} and $arrayElements[0] are scalar values. That's why their sigils (the sign at the front) changes from an # (for array) or % (for hash) to a $. You can distinguish whether the value came from a hash or an array by the brackets used to access the elements. [] is for arrays, and {} is for hashes.
You cannot do the following though.
$hashElements{Item} = #arrayElements;
Because $hashElements{Item} is a scalar, the thing on the right hand side of the assignment will be treated in scalar context. An array in scalar context gets converted to the number of elements in the array, so this would assign 1. That's not what you want.
You should really read up more about this, and also pick better names for your variables. Your example is very confusing. In general, we don't do $CamelCase for variable names in Perl, but instead use $snake_case, which is easier to read and type.
Take a look at the following resources to learn more about the concepts I've mentioned above.
Perl Maven, perldata, perldsc

Perl - Two questions regarding proper syntax for dereferencing

as a newbie I am trying to explore perl data structures using this material from atlanta perl mongers, avaliable here Perl Data Structures
Here is the sample code that I've writen, 01.pl is the same as 02.pl but 01.pl contains additional two pragmas: use strict; use warnings;.
#!/usr/bin/perl
my %name = (name=>"Linus", forename=>"Torvalds");
my #system = qw(Linux FreeBSD Solaris NetBSD);
sub passStructure{
my ($arg1,$arg2)=#_;
if (ref($arg1) eq "HASH"){
&printHash($arg1);
}
elsif (ref($arg1) eq "ARRAY"){
&printArray($arg1);
}
if (ref($arg2) eq "HASH"){
&printHash($arg2);
}
elsif (ref($arg2) eq "ARRAY"){
&printArray($arg2);
}
}
sub printArray{
my $aref = $_[0];
print "#{$aref}\n";
print "#{$aref}->[0]\n";
print "$$aref[0]\n";
print "$aref->[0]\n";
}
sub printHash{
my $href = $_[0];
print "%{$href}\n";
print "%{$href}->{'name'}\n";
print "$$href{'name'}\n";
print "$href->{'name'}\n";
}
&passStructure(\#system,\%name);
There are several points mentioned in above document that I misunderstood:
1st
Page 44 mentions that those two syntax constructions: "$$href{'name'}" and "$$aref[0]" shouldn't never ever been used for accessing values. Why ? Seems in my code they are working fine (see bellow), moreover perl is complaining about using #{$aref}->[0] as deprecated, so which one is correct ?
2nd
Page 45 mentions that without "use strict" and using "$href{'SomeKey'}" when "$href->{'SomeKey'}" should be used, the %href is created implictly. So if I understand it well, both following scripts should print "Exists"
[pista#HP-PC temp]$ perl -ale 'my %ref=(SomeKey=>'SomeVal'); print $ref{'SomeKey'}; print "Exists\n" if exists $ref{'SomeKey'};'
SomeVal
Exists
[pista#HP-PC temp]$ perl -ale ' print $ref{'SomeKey'}; print "Exists\n" if exists $ref{'SomeKey'};'
but second wont, why ?
Output of two beginning mentioned scripts:
[pista#HP-PC temp]$ perl 01.pl
Using an array as a reference is deprecated at 01.pl line 32.
Linux FreeBSD Solaris NetBSD
Linux
Linux
Linux
%{HASH(0x1c33ec0)}
%{HASH(0x1c33ec0)}->{'name'}
Linus
Linus
[pista#HP-PC temp]$ perl 02.pl
Using an array as a reference is deprecated at 02.pl line 32.
Linux FreeBSD Solaris NetBSD
Linux
Linux
Linux
%{HASH(0x774e60)}
%{HASH(0x774e60)}->{'name'}
Linus
Linus
Many people think $$aref[0] is ugly and $aref->[0] not ugly. Others disagree; there is nothing wrong with the former form.
#{$aref}->[0], on the other hand, is a mistake that happens to work but is deprecated and may not continue to.
You may want to read http://perlmonks.org/?node=References+quick+reference
A package variable %href is created simply by mentioning such a hash without use strict "vars" in effect, for instance by leaving the -> out of $href->{'SomeKey'}. That doesn't mean that particular key is created.
Update: looking at the Perl Best Practices reference (a book that inspired much more slavish adoption and less actual thought than the author intended), it is recommending the -> form specifically to avoid the possibility of leaving off a sigil, leading to the problem mentioned on p45.
Perl has normal datatypes, and references to data types. It is important that you are aware of the differerence between them, both in their meaning, and in their syntax.
Type |Normal Access | Reference Access | Debatable Reference Access
=======+==============+==================+===========================
Scalar | $scalar | $$scalar_ref |
Array | $array[0] | $arrayref->[0] | $$arrayref[0]
Hash | $hash{key} | $hashref->{key} | $$hashref{key}
Code | code() | $coderef->() | &$coderef()
The reason why accessing hashrefs or arrayrefs with the $$foo[0] syntax can be considered bad is that (1) the double sigil looks confusingly like a scalar ref access, and (2) this syntax hides the fact that references are used. The dereferencing arrow -> is clear in its intent. I covered the reason why using the & sigil is bad in this answer.
The #{$aref}->[0] is extremely wrong, because you are dereferencing a reference to an array (which cannot, by definition, be a reference itself), and then dereferencing the first element of that array with the arrow. See the above table for the right syntax.
Interpolating hashes into strings seldom makes sense. The stringification of a hash denotes the number of filled and available buckets, and so can tell you about the load. This isn't useful in most cases. Also, not treating the % character as special in strings allows you to use printf…
Another interesting thing about Perl data structures is to know when a new entry in a hash or array is created. In general, accessing a value does not create a slot in that hash or array, except when you are using the value as reference.
my %foo;
$foo{bar}; # access, nothing happens
say "created at access" if exists $foo{bar};
$foo{bar}[0]; # usage as arrayref
say "created at ref usage" if exists $foo{bar};
Output: created at ref usage.
Actually, the arrayref spings into place, because you can use undef values as references in certain cases. This arrayref then populates the slot in the hash.
Without use strict 'refs', the variable (but not a slot in that variable) springs into place, because global variables are just entries in a hash that represents the namespace. $foo{bar} is the same as $main::foo{bar} is the same as $main::{foo}{bar}.
The major advantage of the $arg->[0] form over the $$arg[0] form, is that it's much clearer with the first type as to what is going on... $arg is an ARRAYREF and you're accessing the 0th element of the array it refers to.
At first reading, the second form could be interpreted as ${$arg}[0] (dereferencing an ARRAYREF) or ${$arg[0]} (dereferencing whatever the first element of #arg is.
Naturally, only one interpretation is correct, but we all have those days (or nights) where we're looking at code and we can't quite remember what order operators and other syntactic devices work in. Also, the confusion would compound if there were additional levels of dereferencing.
Defensive programmers will tend to err towards efforts to make their intentions explicit and I would argue that $arg->[0] is a much more explicit representation of the intention of that code.
As to the automatic creation of hashes... it's only the hash that would be created (so that the Perl interpreter and check to see if the key exists). The key itself is not created (naturally... you wouldn't want to create a key that you're checking for... but you may need to create the bucket that would hold that key, if the bucket doesn't exist. The process is called autovivification and you can read more about it here.
I believe you should be accessing the array as: #{$aref}[0] or $aref->[0].
a print statement does not instantiate an object. What is meant by the implicit creation is you don't need to predefine the variable before assigning to it. Since print does not assign the variable is not created.

Error using intermediate variable to access Spreadsheet::Read sheets

I'm no expert at Perl, wondering why the first way of obtaining numSheets is okay, while the following way isn't:
use Spreadsheet::Read;
my $spreadsheet = ReadData("blah.xls");
my $n1 = $spreadsheet->[1]{sheets}; # okay
my %sh = %spreadsheet->[1]; # bad
my $n2 = $sh{label};
The next to last line gives the error
Global symbol "%spreadsheet" requires explicit package name at newexcel_display.pl line xxx
I'm pretty sure I have the right sigils; if I experiment I can only get different errors. I know spreadsheet is a reference to an array not directly an array. I don't know about the hash for the metadata or individual sheets, but experimenting with different assumptions leads nowhere (at least with my modest perl skill.)
My reference on Spreadsheet::Read workings is http://search.cpan.org/perldoc?Spreadsheet::Read If there are good examples somewhere online that show how to properly use Spreadsheet, I'd like to know where they are.
It's not okay because it's not valid Perl syntax. The why is because that's not how Larry defined his language.
The sigils in front of variables tell you what you are trying to do, not what sort of variable it is. A $ means single item, as in $scalar but also single element accesses to aggregates such as $array[0] and $hash{$key}. Don't use the sigils to coerce types. Perl 5 doesn't do that.
In your case, $spreadsheet is an array reference. The %spreadsheet variable, which is a named hash, is a completely separate variable unrelated to all other variables with the same identifier. $foo, #foo, and %foo come from different namespaces. Since you haven't declared a %spreadsheet, strict throws the error that you see.
It looks like you want to get a hash reference from $spreadsheet->[1]. All references are scalars, so you want to assign to a scalar:
my $hash_ref = $spreadsheet->[1];
Once you have the hash reference in the scalar, you dereference it to get its values:
my $n2 = $hash_ref->{sheets};
This is the stuff we cover in the first part of Intermediate Perl.

What is the difference between `$this`, `#that`, and `%those` in Perl?

What is the difference between $this, #that, and %those in Perl?
A useful mnemonic for Perl sigils are:
$calar
#rray
%ash
Matt Trout wrote a great comment on blog.fogus.me about Perl sigils which I think is useful so have pasted below:
Actually, perl sigils don’t denote variable type – they denote conjugation – $ is ‘the’, # is
‘these’, % is ‘map of’ or so – variable type is denoted via [] or {}. You can see this with:
my $foo = 'foo';
my #foo = ('zero', 'one', 'two');
my $second_foo = $foo[1];
my #first_and_third_foos = #foo[0,2];
my %foo = (key1 => 'value1', key2 => 'value2', key3 => 'value3');
my $key2_foo = $foo{key2};
my ($key1_foo, $key3_foo) = #foo{'key1','key3'};
so looking at the sigil when skimming perl code tells you what you’re going to -get- rather
than what you’re operating on, pretty much.
This is, admittedly, really confusing until you get used to it, but once you -are- used to it
it can be an extremely useful tool for absorbing information while skimming code.
You’re still perfectly entitled to hate it, of course, but it’s an interesting concept and I
figure you might prefer to hate what’s -actually- going on rather than what you thought was
going on :)
$this is a scalar value, it holds 1 item like apple
#that is an array of values, it holds several like ("apple", "orange", "pear")
%those is a hash of values, it holds key value pairs like ("apple" => "red", "orange" => "orange", "pear" => "yellow")
See perlintro for more on Perl variable types.
Perl's inventor was a linguist, and he sought to make Perl like a "natural language".
From this post:
Disambiguation by number, case and word order
Part of the reason a language can get away with certain local ambiguities is that other ambiguities are suppressed by various mechanisms. English uses number and word order, with vestiges of a case system in the pronouns: "The man looked at the men, and they looked back at him." It's perfectly clear in that sentence who is doing what to whom. Similarly, Perl has number markers on its nouns; that is, $dog is one pooch, and #dog is (potentially) many. So $ and # are a little like "this" and "these" in English. [emphasis added]
People often try to tie sigils to variable types, but they are only loosely related. It's a topic we hit very hard in Learning Perl and Effective Perl Programming because it's much easier to understand Perl when you understand sigils.
Many people forget that variables and data are actually separate things. Variables can store data, but you don't need variables to use data.
The $ denotes a single scalar value (not necessarily a scalar variable):
$scalar_var
$array[1]
$hash{key}
The # denotes multiple values. That could be the array as a whole, a slice, or a dereference:
#array;
#array[1,2]
#hash{qw(key1 key2)}
#{ func_returning_array_ref };
The % denotes pairs (keys and values), which might be a hash variable or a dereference:
%hash
%$hash_ref
Under Perl v5.20, the % can now denote a key/value slice or either a hash or array:
%array[ #indices ]; # returns pairs of indices and elements
%hash{ #keys }; # returns pairs of key-values for those keys
You might want to look at the perlintro and perlsyn documents in order to really get started with understanding Perl (i.e., Read The Flipping Manual). :-)
That said:
$this is a scalar, which can store a number (int or float), a string, or a reference (see below);
#that is an array, which can store an ordered list of scalars (see above). You can add a scalar to an array with the push or unshift functions (see perlfunc), and you can use a parentheses-bounded comma-separated list of scalar literals or variables to create an array literal (i.e., my #array = ($a, $b, 6, "seven");)
%those is a hash, which is an associative array. Hashes have key-value pairs of entries, such that you can access the value of a hash by supplying its key. Hash literals can also be specified much like lists, except that every odd entry is a key and every even one is a value. You can also use a => character instead of a comma to separate a key and a value. (i.e., my %ordinals = ("one" => "first", "two" => "second");)
Normally, when you pass arrays or hashes to subroutine calls, the individual lists are flattened into one long list. This is sometimes desirable, sometimes not. In the latter case, you can use references to pass a reference to an entire list as a single scalar argument. The syntax and semantics of references are tricky, though, and fall beyond the scope of this answer. If you want to check it out, though, see perlref.

Why do you need $ when accessing array and hash elements in Perl?

Since arrays and hashes can only contain scalars in Perl, why do you have to use the $ to tell the interpreter that the value is a scalar when accessing array or hash elements? In other words, assuming you have an array #myarray and a hash %myhash, why do you need to do:
$x = $myarray[1];
$y = $myhash{'foo'};
instead of just doing :
$x = myarray[1];
$y = myhash{'foo'};
Why are the above ambiguous?
Wouldn't it be illegal Perl code if it was anything but a $ in that place? For example, aren't all of the following illegal in Perl?
#var[0];
#var{'key'};
%var[0];
%var{'key'};
I've just used
my $x = myarray[1];
in a program and, to my surprise, here's what happened when I ran it:
$ perl foo.pl
Flying Butt Monkeys!
That's because the whole program looks like this:
$ cat foo.pl
#!/usr/bin/env perl
use strict;
use warnings;
sub myarray {
print "Flying Butt Monkeys!\n";
}
my $x = myarray[1];
So myarray calls a subroutine passing it a reference to an anonymous array containing a single element, 1.
That's another reason you need the sigil on an array access.
Slices aren't illegal:
#slice = #myarray[1, 2, 5];
#slice = #myhash{qw/foo bar baz/};
And I suspect that's part of the reason why you need to specify if you want to get a single value out of the hash/array or not.
The sigil give you the return type of the container. So if something starts with #, you know that it returns a list. If it starts with $, it returns a scalar.
Now if there is only an identifier after the sigil (like $foo or #foo, then it's a simple variable access. If it's followed by a [, it is an access on an array, if it's followed by a {, it's an access on a hash.
# variables
$foo
#foo
# accesses
$stuff{blubb} # accesses %stuff, returns a scalar
#stuff{#list} # accesses %stuff, returns an array
$stuff[blubb] # accesses #stuff, returns a scalar
# (and calls the blubb() function)
#stuff[blubb] # accesses #stuff, returns an array
Some human languages have very similar concepts.
However many programmers found that confusing, so Perl 6 uses an invariant sigil.
In general the Perl 5 compiler wants to know at compile time if something is in list or in scalar context, so without the leading sigil some terms would become ambiguous.
This is valid Perl: #var[0]. It is an array slice of length one. #var[0,1] would be an array slice of length two.
#var['key'] is not valid Perl because arrays can only be indexed by numbers, and
the other two (%var[0] and %var['key']) are not valid Perl because hash slices use the {} to index the hash.
#var{'key'} and #var{0} are both valid hash slices, though. Obviously it isn't normal to take slices of length one, but it is certainly valid.
See the slice section of perldata perldocfor more information about slicing in Perl.
People have already pointed out that you can have slices and contexts, but sigils are there to separate the things that are variables from everything else. You don't have to know all of the keywords or subroutine names to choose a sensible variable name. It's one of the big things I miss about Perl in other languages.
I can think of one way that
$x = myarray[1];
is ambiguous - what if you wanted a array called m?
$x = m[1];
How can you tell that apart from a regex match?
In other words, the syntax is there to help the Perl interpreter, well, interpret!
In Perl 5 (to be changed in Perl 6) a sigil indicates the context of your expression.
You want a particular scalar out of a hash so it's $hash{key}.
You want the value of a particular slot out of an array, so it's $array[0].
However, as pointed out by zigdon, slices are legal. They interpret the those expressions in a list context.
You want a lists of 1 value in a hash #hash{key} works
But also larger lists work as well, like #hash{qw<key1 key2 ... key_n>}.
You want a couple of slots out of an array #array[0,3,5..7,$n..$n+5] works
#array[0] is a list of size 1.
There is no "hash context", so neither %hash{#keys} nor %hash{key} has meaning.
So you have "#" + "array[0]" <=> < sigil = context > + < indexing expression > as the complete expression.
The sigil provides the context for the access:
$ means scalar context (a scalar
variable or a single element of a hash or an array)
# means list context (a whole array or a slice of
a hash or an array)
% is an entire hash
In Perl 5 you need the sigils ($ and #) because the default interpretation of bareword identifier is that of a subroutine call (thus eliminating the need to use & in most cases ).