Perl - Two questions regarding proper syntax for dereferencing

As a newbie I am trying to explore Perl data structures using this material from Atlanta Perl Mongers, available here: Perl Data Structures
Here is the sample code that I've written. 01.pl is the same as 02.pl, but 01.pl contains two additional pragmas: use strict; use warnings;.
#!/usr/bin/perl
my %name = (name=>"Linus", forename=>"Torvalds");
my @system = qw(Linux FreeBSD Solaris NetBSD);
sub passStructure{
my ($arg1,$arg2)=@_;
if (ref($arg1) eq "HASH"){
&printHash($arg1);
}
elsif (ref($arg1) eq "ARRAY"){
&printArray($arg1);
}
if (ref($arg2) eq "HASH"){
&printHash($arg2);
}
elsif (ref($arg2) eq "ARRAY"){
&printArray($arg2);
}
}
sub printArray{
my $aref = $_[0];
print "@{$aref}\n";
print "@{$aref}->[0]\n";
print "$$aref[0]\n";
print "$aref->[0]\n";
}
sub printHash{
my $href = $_[0];
print "%{$href}\n";
print "%{$href}->{'name'}\n";
print "$$href{'name'}\n";
print "$href->{'name'}\n";
}
&passStructure(\@system,\%name);
There are several points mentioned in above document that I misunderstood:
1st
Page 44 mentions that these two syntax constructions, "$$href{'name'}" and "$$aref[0]", should never be used for accessing values. Why? In my code they seem to work fine (see below); moreover, perl complains that @{$aref}->[0] is deprecated, so which one is correct?
2nd
Page 45 mentions that without "use strict", when "$href{'SomeKey'}" is used where "$href->{'SomeKey'}" should be, %href is created implicitly. So if I understand it correctly, both of the following scripts should print "Exists"
[pista@HP-PC temp]$ perl -ale 'my %ref=(SomeKey=>'SomeVal'); print $ref{'SomeKey'}; print "Exists\n" if exists $ref{'SomeKey'};'
SomeVal
Exists
[pista@HP-PC temp]$ perl -ale ' print $ref{'SomeKey'}; print "Exists\n" if exists $ref{'SomeKey'};'
but the second one won't. Why?
Output of two beginning mentioned scripts:
[pista@HP-PC temp]$ perl 01.pl
Using an array as a reference is deprecated at 01.pl line 32.
Linux FreeBSD Solaris NetBSD
Linux
Linux
Linux
%{HASH(0x1c33ec0)}
%{HASH(0x1c33ec0)}->{'name'}
Linus
Linus
[pista@HP-PC temp]$ perl 02.pl
Using an array as a reference is deprecated at 02.pl line 32.
Linux FreeBSD Solaris NetBSD
Linux
Linux
Linux
%{HASH(0x774e60)}
%{HASH(0x774e60)}->{'name'}
Linus
Linus

Many people think $$aref[0] is ugly and $aref->[0] not ugly. Others disagree; there is nothing wrong with the former form.
@{$aref}->[0], on the other hand, is a mistake that happens to work but is deprecated and may not continue to.
You may want to read http://perlmonks.org/?node=References+quick+reference
A package variable %href is created simply by mentioning such a hash without use strict "vars" in effect, for instance by leaving the -> out of $href->{'SomeKey'}. That doesn't mean that particular key is created.
Update: looking at the Perl Best Practices reference (a book that inspired much more slavish adoption and less actual thought than the author intended), it is recommending the -> form specifically to avoid the possibility of leaving off a sigil, leading to the problem mentioned on p45.
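As a minimal sketch of the forms discussed above, the following reuses the @system array from the question and reads the same element three ways. The variable names $arrow, $braces, $short and $count are only for illustration.

```perl
use strict;
use warnings;

my @system = qw(Linux FreeBSD Solaris NetBSD);
my $aref   = \@system;

# Three equivalent ways to read element 0 through the reference.
my $arrow  = $aref->[0];     # arrow form
my $braces = ${$aref}[0];    # explicit block form
my $short  = $$aref[0];      # block form with the braces dropped

print "$arrow $braces $short\n";

# Dereferencing the whole array uses the @ sigil.
my $count = scalar @{$aref};
print "$count elements\n";
```

All three element accesses yield the same value; the choice between them is a matter of readability, not correctness.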

Perl has normal data types, and references to data types. It is important that you are aware of the difference between them, both in their meaning and in their syntax.
Type |Normal Access | Reference Access | Debatable Reference Access
=======+==============+==================+===========================
Scalar | $scalar | $$scalar_ref |
Array | $array[0] | $arrayref->[0] | $$arrayref[0]
Hash | $hash{key} | $hashref->{key} | $$hashref{key}
Code | code() | $coderef->() | &$coderef()
The reason why accessing hashrefs or arrayrefs with the $$foo[0] syntax can be considered bad is that (1) the double sigil looks confusingly like a scalar ref access, and (2) this syntax hides the fact that references are used. The dereferencing arrow -> is clear in its intent. I covered the reason why using the & sigil is bad in this answer.
The @{$aref}->[0] is extremely wrong, because you are dereferencing a reference to an array (which cannot, by definition, be a reference itself), and then dereferencing the first element of that array with the arrow. See the above table for the right syntax.
Interpolating hashes into strings seldom makes sense. The stringification of a hash denotes the number of filled and available buckets, and so can tell you about the load. This isn't useful in most cases. Also, not treating the % character as special in strings allows you to use printf…
Another interesting thing about Perl data structures is to know when a new entry in a hash or array is created. In general, accessing a value does not create a slot in that hash or array, except when you are using the value as reference.
my %foo;
$foo{bar}; # access, nothing happens
say "created at access" if exists $foo{bar};
$foo{bar}[0]; # usage as arrayref
say "created at ref usage" if exists $foo{bar};
Output: created at ref usage.
Actually, the arrayref springs into place, because you can use undef values as references in certain cases. This arrayref then populates the slot in the hash.
Without use strict 'vars', the variable (but not a slot in that variable) springs into place, because global variables are just entries in a hash that represents the namespace. $foo{bar} is the same as $main::foo{bar} is the same as $main::{foo}{bar}.

The major advantage of the $arg->[0] form over the $$arg[0] form, is that it's much clearer with the first type as to what is going on... $arg is an ARRAYREF and you're accessing the 0th element of the array it refers to.
At first reading, the second form could be interpreted as ${$arg}[0] (dereferencing an ARRAYREF) or ${$arg[0]} (dereferencing whatever the first element of @arg is).
Naturally, only one interpretation is correct, but we all have those days (or nights) where we're looking at code and we can't quite remember what order operators and other syntactic devices work in. Also, the confusion would compound if there were additional levels of dereferencing.
Defensive programmers will tend to err towards efforts to make their intentions explicit and I would argue that $arg->[0] is a much more explicit representation of the intention of that code.
As to the automatic creation of hashes... it's only the hash that would be created (so that the Perl interpreter can check whether the key exists). The key itself is not created (naturally, you wouldn't want to create a key that you're checking for, but you may need to create the bucket that would hold that key if the bucket doesn't exist). The process is called autovivification and you can read more about it here.
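A small sketch of autovivification with a lexical hash may help here. The %config hash and its keys are hypothetical; the point is only that a plain read leaves the hash untouched, while looking through a missing entry as if it held a hash reference creates that intermediate entry.

```perl
use strict;
use warnings;

my %config;

# Reading a top-level key does not create it.
my $plain      = $config{server};
my $after_read = exists $config{server} ? 1 : 0;

# Looking through a missing key as if it held a hash reference
# autovivifies the intermediate entry, even inside exists().
my $nested       = exists $config{server}{port} ? 1 : 0;
my $after_nested = exists $config{server} ? 1 : 0;

print "after plain read: $after_read\n";              # 0
print "nested key exists: $nested\n";                 # 0
print "intermediate entry now exists: $after_nested\n"; # 1
```

Note that the innermost key (port) is still not created; only the container needed to look it up is.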

I believe you should be accessing the array as: ${$aref}[0] or $aref->[0].
A print statement does not instantiate an object. What is meant by the implicit creation is that you don't need to predefine the variable before assigning to it. Since print does not assign, the variable is not created.

Related

Change ref of hash in Perl

I ran into this and couldn't find the answer. I am trying to see if it is possible to "change" the reference of a hash. In other words, I have a hash, and a function that returns a hashref, and I want to make my hash point to the location in memory specified by this ref, instead of copying the contents of the hash it points to. The code looks something like this:
%hash = $h->hashref;
My obvious guess was that it should look like this:
\%hash = $h->hashref;
but that gives the error:
Can't modify reference constructor in scalar assignment
I tried a few other things, but nothing worked. Is what I am attempting actually possible?
An experimental feature which would seemingly allow you to do exactly what you're describing has been added to Perl 5.21.5, which is a development release (see "Aliasing via reference").
It sounds like you want:
use Data::Alias;
alias %hash = $h->hashref;
Or if %hash is a package variable, you can instead just do:
*hash = $h->hashref;
But either way, this should almost always be avoided; simply use the hash reference.
This question is really old, but Perl now allows this sort of thing as an experimental feature:
use v5.22;
use experimental qw(refaliasing);
my $first = {
foo => 'bar',
baz => 'quux',
};
\my %hash = $first;
Create named variable aliases with ref aliasing
Mix assignment and reference aliasing with declared_refs
Yes, but…
References in Perl are scalars. You are trying to alias the return value. This actually is possible, but you should not do this, since it involves messing with the symbol table. Furthermore, this only works for globals (declared with our): If you assign a hashref to the glob *hash it will assign to the symbol table entry %hash:
#!/usr/bin/env perl
use warnings;
use strict;
sub a_hashref{{a => "one", b => "two"}}
our %hash;
*hash = a_hashref;
printf "%3s -> %s\n", $_, $hash{$_} foreach keys %hash;
This is bad style! It isn't in PBP (directly, but consider section 5.1: “non-lexicals should be avoided”) and won't be reported by perlcritic, but you shouldn't pollute the package namespace for a little syntactic fanciness. Furthermore it doesn't work with lexical variables (which is what you might want to use most of the time, because they are lexically scoped, not package wide).
Another problem is that if the $h->hashref method changes its return type, you'll suddenly assign to another symbol table entry! (So if $h->hashref changes its return type to an arrayref, you assign to @hash; good luck detecting that.) You could circumvent that by checking that $h->hashref really returns a hashref with 'HASH' eq ref $h->hashref, but that would defeat the purpose.
What is the problem with just keeping the reference? If you get a reference, just store it in a scalar:
$hash = $h->hashref
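To illustrate why keeping the reference is usually enough, here is a minimal sketch. The hashref() sub is a hypothetical stand-in for the $h->hashref method from the question, and %shared represents whatever data that method exposes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $h->hashref: returns a reference to shared data.
my %shared = (a => 'one', b => 'two');
sub hashref { return \%shared }

my $hash = hashref();          # keep the reference, no copy is made
$hash->{c} = 'three';          # writes go to the original hash

my %copy = %{ hashref() };     # by contrast, this copies the contents
$copy{d} = 'four';             # the original is unaffected

my @keys = sort keys %shared;
print "@keys\n";               # a b c
```

The scalar $hash behaves as a handle on the original data, which is what the question was trying to achieve with aliasing.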
To read more about the global symbol table, take a look at perlmod and consider perlref for the *FOO{THING} syntax, which sadly isn't for lvalues.
To achieve what you want, you could check out the several aliasing modules on CPAN. Data::Alias or Lexical::Alias seem to fit your purpose. Also, if you are interested in tie semantics and/or don't want to use XS modules, Tie::Alias might be worth a shot.

The reason why typeglobs can be used as a reference in Perl

Sorry if the question is unclear, but I'm asking because I don't like to read something without understanding what I'm reading about.
Here is the snippet from the "Programming Perl":
Since the way in which you dereference something always indicates what sort of
referent you’re looking for, a typeglob can be used the same way a reference can,
despite the fact that a typeglob contains multiple referents of various types. So
${*main::foo} and ${\$main::foo} both access the same scalar variable, although
the latter is more efficient.
For me this seems wrong, and that it would be right if it were this way:
you can use a typeglob instead of the scalar variable because reference is always a scalar and compiler knows what you need.
From the book's text, the reader can assume that a reference can be something other than a scalar variable (i.e. a scalar entry in the symbol table).
Once I saw a warning: use of array as a reference is deprecated, so it appears to me that long ago this paragraph in the "Programming Perl" was meaningful, because references could be not just scalars, but in the new 4th edition it simply was not changed to comply with modern Perl.
I checked the errata page for this book but found nothing.
Is my assumption correct? If not, would somebody be so kind as to explain where I'm wrong?
Thank you in advance.
No. What it's saying is that unlike a normal reference, a typeglob contains multiple types of things at the same time. But the way in which you dereference it indicates which type of thing you want:
use strict;
use warnings;
use 5.010;
our $foo = 'scalar';
our @foo = qw(array of strings);
our %foo = (key => 'value');
say ${ *foo }; # prints "scalar"
say ${ *foo }[0]; # prints "array"
say ${ *foo }{key}; # prints "value"
You don't need a special "typeglob dereferencing syntax" because the normal
dereferencing syntax already indicates which slot of the typeglob you want to dereference.
Note that this doesn't work with my variables, because lexical variables aren't associated with typeglobs.
Sidenote: The "array as a reference" warning is not related to this. It refers to this syntax: @array->[0] (meaning the same as $array[0]). That was never intended to be valid syntax; it slipped into the Perl 5 parser by accident and was deprecated once Larry noticed.
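If you want to pick a slot out of a typeglob explicitly, perlref documents the *foo{THING} syntax. The sketch below reuses the package variables from the example above; the names $sref, $aref, $href and $same are only for illustration.

```perl
use strict;
use warnings;

our $foo = 'scalar';
our @foo = qw(array of strings);
our %foo = (key => 'value');

# *foo{THING} returns a reference to one slot of the glob.
my $sref = *foo{SCALAR};
my $aref = *foo{ARRAY};
my $href = *foo{HASH};

# Each is the same reference the backslash operator would give.
my $same = ($sref == \$foo && $aref == \@foo && $href == \%foo) ? 1 : 0;

print "$$sref $aref->[0] $href->{key} same=$same\n";
```

This makes the "multiple referents of various types" in the book's wording concrete: one glob, several slots, each reachable as an ordinary reference.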
Although this doesn't exactly answer your question, I can try to tell you what I have experienced with typeglobs.
They are more dynamic than scalars and references, because using a typeglob is sort of a way of telling the compiler "here is a hint, guess for yourself what you have to do with it".
A reference always has a strict type and target. A typeglob may just contain a string, indicating that it's supposed to point to some variable (name) or filehandle (like STDOUT) or some other value that's accessible through this string.
There are Perlish hacks to accomplish some strange things that are only possible with typeglobs, so I think they are important even in modern Perl.
PerlGuts illustrated is an accessible way to learn about Perl internals.
http://www.cpan.org/authors/id/GAAS/illguts-0.09.pdf
The answers given so far are illustrative of what's going on, but learning how perl stores variables will let you see the answer is actually very simple and analogies are actually less clear than the implementation.
PS: You are commendable in wanting to understand - it will stand you in good stead for the future :)

Error using intermediate variable to access Spreadsheet::Read sheets

I'm no expert at Perl, wondering why the first way of obtaining numSheets is okay, while the following way isn't:
use Spreadsheet::Read;
my $spreadsheet = ReadData("blah.xls");
my $n1 = $spreadsheet->[1]{sheets}; # okay
my %sh = %spreadsheet->[1]; # bad
my $n2 = $sh{label};
The next to last line gives the error
Global symbol "%spreadsheet" requires explicit package name at newexcel_display.pl line xxx
I'm pretty sure I have the right sigils; if I experiment I can only get different errors. I know spreadsheet is a reference to an array not directly an array. I don't know about the hash for the metadata or individual sheets, but experimenting with different assumptions leads nowhere (at least with my modest perl skill.)
My reference on Spreadsheet::Read workings is http://search.cpan.org/perldoc?Spreadsheet::Read If there are good examples somewhere online that show how to properly use Spreadsheet, I'd like to know where they are.
It's not okay because it's not valid Perl syntax. The why is because that's not how Larry defined his language.
The sigils in front of variables tell you what you are trying to do, not what sort of variable it is. A $ means single item, as in $scalar but also single element accesses to aggregates such as $array[0] and $hash{$key}. Don't use the sigils to coerce types. Perl 5 doesn't do that.
In your case, $spreadsheet is an array reference. The %spreadsheet variable, which is a named hash, is a completely separate variable unrelated to all other variables with the same identifier. $foo, @foo, and %foo come from different namespaces. Since you haven't declared a %spreadsheet, strict throws the error that you see.
It looks like you want to get a hash reference from $spreadsheet->[1]. All references are scalars, so you want to assign to a scalar:
my $hash_ref = $spreadsheet->[1];
Once you have the hash reference in the scalar, you dereference it to get its values:
my $n2 = $hash_ref->{sheets};
This is the stuff we cover in the first part of Intermediate Perl.
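A rough sketch of the two options, using mock data rather than Spreadsheet::Read itself: the structure below only imitates "an array reference whose element 1 is a hash reference", and the keys label and maxrow and their values are invented for the example.

```perl
use strict;
use warnings;

# Mock data shaped like the structure in the question: an array reference
# whose element 1 is a hash reference. The keys and values are invented.
my $spreadsheet = [
    { sheets => 1 },
    { label => 'Sheet1', maxrow => 10 },
];

my $sheet_ref = $spreadsheet->[1];        # still a reference (a scalar)
my $label     = $sheet_ref->{label};

my %sh     = %{ $spreadsheet->[1] };      # dereference to copy into a named hash
my $maxrow = $sh{maxrow};

print "$label $maxrow\n";                 # Sheet1 10
```

If a named hash is really wanted, %{ ... } around the reference is the piece the original attempt was missing; it copies the top-level keys and values into %sh.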

Storing a hash in a hash

I'm having troubles with a Perl script. I try to store a hash in a hash. The script is trivial:
use Data::Dumper;
my %h1=();
$h1{name}="parent";
my %h2=();
$h2{name}="child";
$h1{nested}=%h2; # store hash h2 in hash h1
print "h2:\n";
print Dumper(%h2); # works
print "h1{nested}:\n";
print Dumper($h1{nested}); # fails
The results:
h2:
$VAR1 = 'name';
$VAR2 = 'child';
h1{nested}:
$VAR1 = '1/8';
Why is the $h1{nested} not dumped as a hash, but as some kind of weird scalar (1/8)?
PS: even if this question sounds trivial - I searched SO but did not find that it was asked before.
PPS: my Perl is v5.10.1 (*) built for x86_64-linux-gnu-thread-multi
(with 53 registered patches, see perl -V for more detail)
You can only store a hashref in a hash:
$h1{nested}=\%h2;
and then you would access %h2's name by doing
$h1{nested}->{name}
In your code, %h2 is forced to scalar context, which shows you that "1/8" value, and stores that.
In perl the values stored in a list (hash or array) are always scalars. Given this, the only way to store a hash inside another hash is to store a reference to it.
$h1{'nested'} = \%h2;
or also
$h1{'nested'} = { 'name'=>'child' };
(the braces on the right-hand side create a reference to an anonymous hash).
BTW, to not quote the literals in the keys is usually considered bad practice, see here
Why is the $h1{nested} not dumped as a hash, but as some kind of weird scalar (1/8)?
Because you're storing it in a scalar context!
When you do this:
$h1{nested} = %h2;
You're storing a scalar. Since %h2 is a hash, you're given the ol' fraction string. According to the Perldoc website
If you evaluate a hash in scalar context, it returns false if the hash is empty. If there are any key/value pairs, it returns true; more precisely, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash.
That explains the 1/8 you're getting.
What you need to do is store the hash as a reference in the other hash. As others pointed out, it should be:
$h1{nested} = \%h2;
The backslash before the hash's name gives you the memory location where the hash is stored. You can use the curly braces, but I prefer the backslash notation.
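The two notations are not quite interchangeable, which the following sketch tries to show. It assumes the %h1 and %h2 hashes from the question; the copied key is added only for the comparison.

```perl
use strict;
use warnings;

my %h2 = (name => 'child');
my %h1 = (name => 'parent');

$h1{nested} = \%h2;       # reference to the existing hash
$h1{copied} = { %h2 };    # reference to a new anonymous hash (shallow copy)

$h2{name} = 'renamed';    # change the original after storing it

print "nested: $h1{nested}{name}\n";   # nested: renamed
print "copied: $h1{copied}{name}\n";   # copied: child
```

With the backslash, later changes to %h2 are visible through $h1{nested}; with the braces, $h1{copied} keeps a snapshot of the contents at the time of assignment.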
Take a look at perldoc perlreftut on your computer (or on the webpage I've linked to). This will tell you how to make such things as a list of lists, hashes of hashes, lists of hashes, and hashes of lists. Just a word o' warning: If you get too complex, it'll be hard to maintain, so once you've had your fun, take a look at perldoc's Perl Object Orientation Programming Tutorial.
The perldoc command contains lots of Perl documentation, including documentation for all Perl functions, the Perl modules installed on your system, and even basic information about the Perl language.

What must I do to prevent Perl from complaining that "using a hash as a reference is deprecated"?

The code below is from an old Perl script.
print "%{@{$noss}[$i]}->{$sector} \n\n";
How should I rewrite the code above so that Perl does not complain that "using a hash as a reference is deprecated"? I have tried all sorts of ways but I still can't quite get the hang of what the Perl compiler wants me to do.
print "%{@{$noss}[$i]}->{$sector} \n\n";
should be nothing more than
print "$noss->[$i]{$sector} \n\n";
or even
print "$$noss[$i]{$sector} \n\n";
without all that rigamarole.
Guessing that $noss is a reference to an array of hash references, you can build a correct expression by following the simple rule of replacing
what would normally be an array or hash name (not including the $/@/%) with an expression giving a reference in curly braces.
So your array element, normally $foo[$i], becomes ${$noss}[$i]. That expression is itself a hashref, so to get an element from that hash, instead of $foo{$sector}, you use ${ ${$noss}[$i] }{$sector}.
This can also appear in various other forms, such as $noss->[$i]{$sector}; see http://perlmonks.org?node=References+quick+reference for simple to understand rules.
I agree with ysth and tchrist, and want to reiterate that $noss->[$i]{$sector} really is the best option for you. This syntax is more readable since it shows clearly that $noss is a reference and that you are taking the $ith element of it and further the $sector key from that element.
In terms of teaching to fish rather than giving out fish: you should read perldoc perlreftut and specifically the "use rules". Understanding these two "use rules" along with the extra "arrow rule" (yep only 3 rules) will give you a much better grasp on how to get going with references.
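As a hedged illustration of those rules, the sketch below assumes (as the answers above do) that $noss is a reference to an array of hash references. The sample data and the names $rule1, $rule2 and $arrow are invented for the example.

```perl
use strict;
use warnings;

# Assumed shape: a reference to an array of hash references (sample data invented).
my $noss   = [ { north => 10 }, { north => 20, south => 30 } ];
my $i      = 1;
my $sector = 'south';

my $rule1 = ${ ${$noss}[$i] }{$sector};   # use rule 1, applied twice
my $rule2 = $noss->[$i]->{$sector};       # use rule 2, arrow form
my $arrow = $noss->[$i]{$sector};         # arrow rule: arrow between subscripts is optional

print "$rule1 $rule2 $arrow\n";           # 30 30 30
```

All three expressions reach the same element; the last one is the form recommended above for the deprecated print statement.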