Why does an undef value become a valid array reference in Perl? - perl

In perl 5.8.5, if I do the following, I don't get an error:
use strict;
my $a = undef;
foreach my $el (@$a) {
...whatever
}
What's going on here? Printing out the output of ref($a) shows that $a changes to become a valid array reference at some point. But I never explicitly set $a to anything.
Seems kind of odd that the contents of a variable could change without me doing anything.
Thoughts, anyone?
EDIT: Yes, I know all about auto-vivification. I always thought that there had to be an assignment somewhere along the way to trigger it, not just a reference.

Auto-vivification is the word. From the link:
Autovivification is a distinguishing feature of the Perl programming language involving the dynamic creation of data structures. Autovivification is the automatic creation of a variable reference when an undefined value is dereferenced. In other words, Perl autovivification allows a programmer to refer to a structured variable, and arbitrary sub-elements of that structured variable, without expressly declaring the existence of the variable and its complete structure beforehand.
In contrast, other programming languages either: 1) require a programmer to expressly declare an entire variable structure before using or referring to any part of it; or 2) require a programmer to declare a part of a variable structure before referring to any part of it; or 3) create an assignment to a part of a variable before referring, assigning to or composing an expression that refers to any part of it.
Perl autovivification can be contrasted against languages such as Python, PHP, Ruby, JavaScript and all the C style languages.
Auto-vivification can be disabled with no autovivification; (a pragma provided by the autovivification module on CPAN, not by core Perl).

Read Uri Guttman's article on autovivification.
There is nothing odd about it once you know about it, and it saves a lot of awkwardness.
Perl first evaluates a dereference expression and sees that the current reference value is undefined. It notes the type of dereference (scalar, array or hash) and allocates an anonymous reference of that type. Perl then stores that new reference value where the undefined value was stored. Then the dereference operation in progress is continued. If you do a nested dereference expression, then each level from top to bottom can cause its own autovivification.
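The steps above can be sketched in a few lines. This is only an illustration; the variable names ($aref, %tree, $undef) are made up for the example. It also shows the other side of the coin: under strict refs, a plain read of an undefined reference is an error rather than a trigger for autovivification.

```perl
use strict;
use warnings;

# An undefined scalar dereferenced as an array in a modifiable context.
# foreach aliases the elements, so Perl vivifies an empty array here.
my $aref;
foreach my $el (@$aref) { }
print ref($aref), "\n";                  # ARRAY

# A nested dereference vivifies every missing level on the way down.
my %tree;
$tree{fruit}{apple}[2] = 'red';
print ref($tree{fruit}), "\n";           # HASH
print ref($tree{fruit}{apple}), "\n";    # ARRAY

# A plain read of an undefined reference does not vivify under strict refs.
my $undef;
eval { my @copy = @$undef; 1 }
    or print "read-only dereference failed: $@";
```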

Related

Curly bracket in between the string "name" means in Perl?

I am just wondering what the curly brackets around the string "name" mean in Perl, as in the example below. This is my first question, so please be gentle; I am pretty new to Perl.
my $pool_name = $result->get->pool_attr("name")->{"name"};
To answer the question specifically (what are the curly braces?): here they are the syntax for looking up a key through a hash reference.
There's not much to explain on such a small snippet, but think of this:
%hash = (
'name' => "Harsha",
'designation' => "Manager"
);
$hash_ref = \%hash;
When we need to access a particular element through the reference, we can use the -> operator.
my $name = $hash_ref->{name};
This is a slightly modified example taken from - http://www.thegeekstuff.com/2010/06/perl-hash-reference/
Perl is a little terse this way. There are two concepts to understand here:
Perl Object Access
Perl Reference Access
Just like Java, Perl has references. If you come from a C background, think of them as pointers. To access anything through a reference, we use the "->" operator. There are more concepts involved, such as blessing, but we won't go into them here. One important point is that Perl objects are usually blessed hash references, so access to an object's members looks just like access through a hash reference (a reference to a hash, not the hash itself).
So, we have an object $result.
$result->get calls the get method on the object. This method returns another object. Let's call it temp.
On this object we then call the method pool_attr with the argument "name". This method finally returns a hash reference.
Because Perl objects are typically hash references too, we access the "name" key with the same arrow notation.
You can print the output of Data::Dumper::Dumper to learn more about the data structure. Note, however, that Perl objects are something of a hack, so the Data::Dumper output may contain a lot of clutter.
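The walk-through above can be tried out with a small sketch. The classes My::Result and My::Pool, and the data they hold, are hypothetical stand-ins, since the question does not show the real ones; only the shape of the call chain is taken from the question.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical classes; the real ones are not shown in the question.
{
    package My::Pool;
    sub new { my ($class) = @_; return bless { attrs => { name => 'pool-a' } }, $class }
    # Returns a hash reference keyed by the requested attribute name.
    sub pool_attr { my ($self, $key) = @_; return { $key => $self->{attrs}{$key} } }
}
{
    package My::Result;
    sub new { my ($class) = @_; return bless { pool => My::Pool->new }, $class }
    sub get { my ($self) = @_; return $self->{pool} }
}

my $result = My::Result->new;

# The chain from the question, one step at a time.
my $temp      = $result->get;                 # method call, returns an object
my $attr      = $temp->pool_attr("name");     # method call, returns a hash reference
my $pool_name = $attr->{"name"};              # hash lookup through the reference

print "$pool_name\n";                         # pool-a
print Dumper($attr);                          # shows the structure behind the reference
```

The one-line form $result->get->pool_attr("name")->{"name"} does exactly the same thing without the intermediate variables.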
$pool_name is where the result will be stored.
$result is the variable holding the object.
->get is an action for the object $result.
->pool_attr("name") calls the pool_attr method, with the argument "name", on whatever ->get returned.
->{"name"} looks up the key "name" in the hash reference returned by ->pool_attr("name").

variable "requires explicit package" name issue

my %order;
while ( my $rec = $data->fetchrow_hashref ) {
    push @{ $result{ $rec->{"ID"} } }, $rec->{"item"};
    push @order, $rec->{ID};
}
I get Global symbol "@order" requires explicit package name at the line push @order, $rec->{ID};
Perl's sigils identify distinct data types. And two identical identifiers are different variables entirely if they have different sigils.
my $var; # This is a scalar.
my @var; # This is an array.
my %var; # This is a hash.
Each of those three is a completely different variable.
You get this error because line one of the code you posted declares a hash named %order, while line four pushes onto an array named @order. That array has never been declared. Without an explicit declaration indicating otherwise, Perl assumes that a variable it sees for the first time is intended to be a package global. And because you're using strict 'vars', or plain strict (which includes vars), Perl doesn't let you autovivify a package global, or any other type of variable, without first declaring it, unless you fully qualify its name.
This behavior is explained in perldoc strict, where it states:
This generates a compile-time error if you access a variable that was
neither explicitly declared (using any of my, our, state, or use vars
) nor fully qualified.
Since the clear intent in your code is to push values onto an array, the simplest fix is probably to change your first line from my %order; to my @order;, so that you're declaring an array rather than a hash.
Without seeing more code, it's unclear what to do with the line where you push onto an array by reference, though. Presumably you already know that part of the code to be correct.
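Here is a minimal sketch of the loop with both containers declared. The @rows list is a made-up stand-in for the rows that $data->fetchrow_hashref would return, and declaring %result as a lexical hash is an assumption, since the posted snippet does not show its declaration.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for $data->fetchrow_hashref results.
my @rows = (
    { ID => 7, item => 'pen'   },
    { ID => 7, item => 'ink'   },
    { ID => 9, item => 'paper' },
);

my %result;   # ID => array reference of items (assumed declaration)
my @order;    # IDs in the order they were fetched

foreach my $rec (@rows) {
    push @{ $result{ $rec->{ID} } }, $rec->{item};   # inner array is autovivified
    push @order, $rec->{ID};
}

print "order: @order\n";                  # order: 7 7 9
print "items for 7: @{ $result{7} }\n";   # items for 7: pen ink
```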
You declare the hash %order, but try to use the array @order.

Perl - Two questions regarding proper syntax for dereferencing

As a newbie I am trying to explore Perl data structures using this material from Atlanta Perl Mongers, available here: Perl Data Structures
Here is the sample code that I've written. 01.pl is the same as 02.pl, but 01.pl contains two additional pragmas: use strict; use warnings;.
#!/usr/bin/perl
my %name = (name=>"Linus", forename=>"Torvalds");
my @system = qw(Linux FreeBSD Solaris NetBSD);
sub passStructure{
    my ($arg1,$arg2)=@_;
    if (ref($arg1) eq "HASH"){
        &printHash($arg1);
    }
    elsif (ref($arg1) eq "ARRAY"){
        &printArray($arg1);
    }
    if (ref($arg2) eq "HASH"){
        &printHash($arg2);
    }
    elsif (ref($arg2) eq "ARRAY"){
        &printArray($arg2);
    }
}
sub printArray{
    my $aref = $_[0];
    print "@{$aref}\n";
    print "@{$aref}->[0]\n";
    print "$$aref[0]\n";
    print "$aref->[0]\n";
}
sub printHash{
    my $href = $_[0];
    print "%{$href}\n";
    print "%{$href}->{'name'}\n";
    print "$$href{'name'}\n";
    print "$href->{'name'}\n";
}
&passStructure(\@system,\%name);
There are several points mentioned in the above document that I misunderstood:
1st
Page 44 mentions that these two syntax constructions, "$$href{'name'}" and "$$aref[0]", should never be used for accessing values. Why? In my code they seem to work fine (see below); moreover, perl complains that using @{$aref}->[0] is deprecated, so which one is correct?
2nd
Page 45 mentions that without "use strict", using "$href{'SomeKey'}" where "$href->{'SomeKey'}" should be used creates %href implicitly. So if I understand it correctly, both of the following scripts should print "Exists"
[pista@HP-PC temp]$ perl -ale 'my %ref=(SomeKey=>'SomeVal'); print $ref{'SomeKey'}; print "Exists\n" if exists $ref{'SomeKey'};'
SomeVal
Exists
[pista@HP-PC temp]$ perl -ale ' print $ref{'SomeKey'}; print "Exists\n" if exists $ref{'SomeKey'};'
but the second won't. Why?
Output of two beginning mentioned scripts:
[pista@HP-PC temp]$ perl 01.pl
Using an array as a reference is deprecated at 01.pl line 32.
Linux FreeBSD Solaris NetBSD
Linux
Linux
Linux
%{HASH(0x1c33ec0)}
%{HASH(0x1c33ec0)}->{'name'}
Linus
Linus
[pista@HP-PC temp]$ perl 02.pl
Using an array as a reference is deprecated at 02.pl line 32.
Linux FreeBSD Solaris NetBSD
Linux
Linux
Linux
%{HASH(0x774e60)}
%{HASH(0x774e60)}->{'name'}
Linus
Linus
Many people think $$aref[0] is ugly and $aref->[0] not ugly. Others disagree; there is nothing wrong with the former form.
@{$aref}->[0], on the other hand, is a mistake that happens to work but is deprecated and may not continue to work.
You may want to read http://perlmonks.org/?node=References+quick+reference
A package variable %href is created simply by mentioning such a hash without use strict "vars" in effect, for instance by leaving the -> out of $href->{'SomeKey'}. That doesn't mean that particular key is created.
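A small sketch of that last point, assuming the code runs in package main (the default for a script). Leaving out the arrow makes Perl look at the package hash %href, which then exists in the symbol table, while the key that was only read does not.

```perl
use strict;
use warnings;

{
    no strict 'vars';              # permit the undeclared package variable
    my $value = $href{SomeKey};    # "->" left out: this reads %main::href

    # The mention alone put an entry for "href" into the main:: symbol table.
    print "package hash exists\n" if exists $main::{href};

    # Reading a key does not create it.
    print "key was not created\n" unless exists $href{SomeKey};
}
```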
Update: looking at the Perl Best Practices reference (a book that inspired much more slavish adoption and less actual thought than the author intended), it is recommending the -> form specifically to avoid the possibility of leaving off a sigil, leading to the problem mentioned on p45.
Perl has normal data types, and references to data types. It is important that you are aware of the difference between them, both in their meaning and in their syntax.
Type   | Normal Access | Reference Access | Debatable Reference Access
=======+===============+==================+===========================
Scalar | $scalar       | $$scalar_ref     |
Array  | $array[0]     | $arrayref->[0]   | $$arrayref[0]
Hash   | $hash{key}    | $hashref->{key}  | $$hashref{key}
Code   | code()        | $coderef->()     | &$coderef()
The reason why accessing hashrefs or arrayrefs with the $$foo[0] syntax can be considered bad is that (1) the double sigil looks confusingly like a scalar ref access, and (2) this syntax hides the fact that references are used. The dereferencing arrow -> is clear in its intent. I covered the reason why using the & sigil is bad in this answer.
The @{$aref}->[0] is extremely wrong, because @{$aref} dereferences the reference and yields the array itself (which cannot, by definition, be a reference), and then the arrow tries to dereference that array to reach its first element. See the above table for the right syntax.
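For reference, the valid forms from the table can be run side by side. The variable names and values below are made up for the illustration.

```perl
use strict;
use warnings;

my $scalar_ref = \ "text";
my $array_ref  = [ 'Linux', 'FreeBSD' ];
my $hash_ref   = { name => 'Linus' };
my $code_ref   = sub { return "called with @_" };

my @results = (
    $$scalar_ref,          # scalar reference
    $array_ref->[0],       # arrow form
    $$array_ref[0],        # double-sigil form, same element
    ${$array_ref}[0],      # the same again with explicit braces
    $hash_ref->{name},     # arrow form
    $$hash_ref{name},      # double-sigil form, same value
    $code_ref->('x'),      # arrow form
    &$code_ref('x'),       # ampersand form
);
print "$_\n" for @results;

# The whole array or hash is reached with @{...} and %{...}.
print "all systems: @{$array_ref}\n";
print "keys: ", join(',', keys %{$hash_ref}), "\n";
```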
Interpolating hashes into strings seldom makes sense. The stringification of a hash denotes the number of filled and available buckets, and so can tell you about the load. This isn't useful in most cases. Also, not treating the % character as special in strings allows you to use printf…
Another interesting thing about Perl data structures is knowing when a new entry in a hash or array is created. In general, accessing a value does not create a slot in that hash or array, except when you are using the value as a reference.
use feature 'say'; # say is not enabled by default
my %foo;
$foo{bar}; # access, nothing happens
say "created at access" if exists $foo{bar};
$foo{bar}[0]; # usage as arrayref
say "created at ref usage" if exists $foo{bar};
Output: created at ref usage.
Actually, the arrayref springs into place, because you can use undef values as references in certain cases. This arrayref then populates the slot in the hash.
Without use strict 'vars', the variable (but not a slot in that variable) springs into place, because global variables are just entries in a hash that represents the namespace. $foo{bar} is the same as $main::foo{bar} is the same as $main::{foo}{bar}.
The major advantage of the $arg->[0] form over the $$arg[0] form, is that it's much clearer with the first type as to what is going on... $arg is an ARRAYREF and you're accessing the 0th element of the array it refers to.
At first reading, the second form could be interpreted as ${$arg}[0] (dereferencing an ARRAYREF) or ${$arg[0]} (dereferencing whatever the first element of @arg is).
Naturally, only one interpretation is correct, but we all have those days (or nights) where we're looking at code and we can't quite remember what order operators and other syntactic devices work in. Also, the confusion would compound if there were additional levels of dereferencing.
Defensive programmers will tend to err towards efforts to make their intentions explicit and I would argue that $arg->[0] is a much more explicit representation of the intention of that code.
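To see which interpretation Perl actually picks, here is a short sketch with both a scalar $arg and an unrelated array @arg in scope. The names and strings are invented for the example.

```perl
use strict;
use warnings;

my $arg = [ 'element of the referenced array' ];   # scalar holding an array reference
my @arg = ( \ 'target of the scalar ref' );        # unrelated array; element 0 is a scalar ref

my $short    = $$arg[0];       # parsed as ${$arg}[0]
my $explicit = ${$arg}[0];     # the same thing, spelled out
my $arrow    = $arg->[0];      # the same thing, arrow form
my $other    = ${ $arg[0] };   # different: dereferences element 0 of @arg

print "$short\n";      # element of the referenced array
print "$explicit\n";   # element of the referenced array
print "$arrow\n";      # element of the referenced array
print "$other\n";      # target of the scalar ref
```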
As to the automatic creation of hashes: it's only the hash that would be created (so that the Perl interpreter can check whether the key exists). The key itself is not created. Naturally, you wouldn't want to create a key that you're only checking for, but you may need to create the container that would hold that key if it doesn't exist yet. The process is called autovivification and you can read more about it here.
I believe you should be accessing the array as: ${$aref}[0] or $aref->[0].
A print statement does not instantiate an object. What is meant by implicit creation is that you don't need to predefine the variable before assigning to it. Since print does not assign, the variable is not created.

The reason why typeglobs can be used as a reference in Perl

Sorry if the question is unclear, but I'm asking because I don't like to read something without understanding it.
Here is the snippet from the "Programming Perl":
Since the way in which you dereference something always indicates what sort of
referent you’re looking for, a typeglob can be used the same way a reference can,
despite the fact that a typeglob contains multiple referents of various types. So
${*main::foo} and ${\$main::foo} both access the same scalar variable, although
the latter is more efficient.
To me this seems wrong; it would be right if it read this way:
you can use a typeglob instead of the scalar variable because a reference is always a scalar and the compiler knows what you need.
From the book's text, the reader can assume that a reference can be something other than a scalar variable (i.e. a scalar entry in the symbol table).
I once saw the warning "use of array as a reference is deprecated", so it appears to me that long ago this paragraph in "Programming Perl" was meaningful, because references could be more than just scalars, and that in the new 4th edition it simply was not updated to match modern Perl.
I checked the errata page for this book but found nothing.
Is my assumption correct? If not, would somebody be so kind as to explain where I'm wrong?
Thank you in advance.
No. What it's saying is that unlike a normal reference, a typeglob contains multiple types of things at the same time. But the way in which you dereference it indicates which type of thing you want:
use strict;
use warnings;
use 5.010;
our $foo = 'scalar';
our @foo = qw(array of strings);
our %foo = (key => 'value');
say ${ *foo }; # prints "scalar"
say ${ *foo }[0]; # prints "array"
say ${ *foo }{key}; # prints "value"
You don't need a special "typeglob dereferencing syntax" because the normal
dereferencing syntax already indicates which slot of the typeglob you want to dereference.
Note that this doesn't work with my variables, because lexical variables aren't associated with typeglobs.
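As a complementary sketch, the *foo{THING} syntax documented in perlref returns a reference to a single slot of the typeglob, which makes it easy to confirm that each slot is the ordinary package variable of that type. The lexical $bar is a made-up name used to show that my variables leave no trace in the symbol table; the example assumes it runs in package main.

```perl
use strict;
use warnings;

our $foo = 'scalar';
our @foo = qw(array of strings);
our %foo = (key => 'value');

# Each slot of the typeglob is the same variable you reach by name.
print "same scalar\n" if *foo{SCALAR} == \$foo;
print "same array\n"  if *foo{ARRAY}  == \@foo;
print "same hash\n"   if *foo{HASH}   == \%foo;

# A lexical variable has no typeglob, so the symbol table has no entry for it.
my $bar = 'lexical';
print "no symbol table entry for bar\n" unless exists $main::{bar};
```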
Sidenote: The "array as a reference" warning is not related to this. It refers to this syntax: @array->[0] (meaning the same as $array[0]). That was never intended to be valid syntax; it slipped into the Perl 5 parser by accident and was deprecated once Larry noticed.
Although this doesn't exactly answer your question, I can try to tell you what I have experienced with typeglobs.
They are more dynamic than scalars and references, because using a typeglob is sort of a way of telling the compiler "here is a hint, guess yourself what you have to do with it".
A reference always has a strict type and target. A typeglob may just contain a string, indicating that it's supposed to point to some variable (name) or filehandle (like STDOUT) or some other value that's accessible through this string.
There are perlish hacks to accomplish some strange things that are only possible with typeglobs, so I think even in modern Perl they are important.
PerlGuts illustrated is an accessible way to learn about Perl internals.
http://www.cpan.org/authors/id/GAAS/illguts-0.09.pdf
The answers given so far are illustrative of what's going on, but learning how perl stores variables will let you see the answer is actually very simple and analogies are actually less clear than the implementation.
PS: You are commendable in wanting to understand - it will stand you in good stead for the future :)

Error using intermediate variable to access Spreadsheet::Read sheets

I'm no expert at Perl, wondering why the first way of obtaining numSheets is okay, while the following way isn't:
use Spreadsheet::Read;
my $spreadsheet = ReadData("blah.xls");
my $n1 = $spreadsheet->[1]{sheets}; # okay
my %sh = %spreadsheet->[1]; # bad
my $n2 = $sh{label};
The next to last line gives the error
Global symbol "%spreadsheet" requires explicit package name at newexcel_display.pl line xxx
I'm pretty sure I have the right sigils; if I experiment I can only get different errors. I know spreadsheet is a reference to an array not directly an array. I don't know about the hash for the metadata or individual sheets, but experimenting with different assumptions leads nowhere (at least with my modest perl skill.)
My reference on Spreadsheet::Read workings is http://search.cpan.org/perldoc?Spreadsheet::Read If there are good examples somewhere online that show how to properly use Spreadsheet, I'd like to know where they are.
It's not okay because it's not valid Perl syntax. The why is because that's not how Larry defined his language.
The sigils in front of variables tell you what you are trying to do, not what sort of variable it is. A $ means single item, as in $scalar but also single element accesses to aggregates such as $array[0] and $hash{$key}. Don't use the sigils to coerce types. Perl 5 doesn't do that.
In your case, $spreadsheet is an array reference. The %spreadsheet variable, which is a named hash, is a completely separate variable unrelated to all other variables with the same identifier. $foo, @foo, and %foo come from different namespaces. Since you haven't declared a %spreadsheet, strict throws the error that you see.
It looks like you want to get a hash reference from $spreadsheet->[1]. All references are scalars, so you want to assign to a scalar:
my $hash_ref = $spreadsheet->[1];
Once you have the hash reference in the scalar, you dereference it to get its values:
my $n2 = $hash_ref->{sheets};
This is the stuff we cover in the first part of Intermediate Perl.
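A self-contained sketch of both options follows. The nested structure is a hand-built stand-in for what ReadData() returns (the real one holds many more fields), so treat the keys and values as assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the array reference returned by ReadData().
my $spreadsheet = [
    { sheets => 2 },                        # element 0: metadata
    { label  => 'Sheet1', maxrow => 10 },   # element 1: first sheet
];

# Option 1: keep the hash reference in a scalar.
my $hash_ref = $spreadsheet->[1];
print "$hash_ref->{label}\n";               # Sheet1

# Option 2: copy the referenced hash into a named hash.
my %sh = %{ $spreadsheet->[1] };
print "$sh{label}\n";                       # Sheet1

# The copy is independent; the reference is not.
$sh{label}         = 'changed in copy';
$hash_ref->{label} = 'changed via ref';
print "$spreadsheet->[1]{label}\n";         # changed via ref
```

Copying with %{ ... } is the closest working equivalent of the intermediate %sh variable from the question, at the cost of duplicating the data.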