How do references in Perl work?

Can anyone explain the push statement in the following Perl code to me? I know how push in Perl works, but I can't understand what the first argument in the following push command represents. I am trying to interpret someone's script. I tried print "@a\n"; but it only printed ARRAY(0x9aa370), which makes me think that the push is not doing anything. Any help is appreciated. Thanks!
my @a = ();
my $b = 10;
my $c = 'a';
push(@{$a[$b]}, $c);

Let's break it down.
The @{...} syntax is explained under "Using References" in perlref:
Anywhere you'd put an identifier (or chain of identifiers) as part of a variable or subroutine name, you can replace the identifier with a BLOCK returning a reference of the correct type.
So what is inside the { ... } block had better work out to an array reference. You have $a[$b] there, an element of @a at index $b, so that element must be an arrayref.
Then @{...} dereferences it and pushes a new element $c to it. Altogether, $c is copied into the (sole) element of an anonymous array whose reference is at index $b of the array @a.
And a crucial part: as there is in fact no arrayref † there, autovivification kicks in and it is created. Since there are no elements at indices preceding $b, they are created as well, with value undef.
Now please work through the tutorial perlreftut and the data-structures cookbook perldsc, while using perlref (linked at the beginning) as a full reference.
With complex data structures it is useful to be able to see them, and there are tools for that. The most often used one is the core Data::Dumper; here is an example with Data::Dump
perl -MData::Dump=dd -wE'@ary = (1); push @{$ary[3]}, "ah"; dd \@ary'
with output
[1, undef, undef, ["ah"]]
where the inner [] indicates an arrayref, with its sole element being the string ah.
† More precisely, an undef scalar is dereferenced, and since this happens in an lvalue context autovivification kicks in. Thanks to ikegami for a comment. See for instance this post with its links.
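A minimal sketch of the autovivification described above, assuming a fresh lexical array (the name @cells and the index 3 are chosen only for illustration):

```perl
use strict;
use warnings;

my @cells;                     # hypothetical name; starts empty

# $cells[3] does not exist yet. Dereferencing it in lvalue context
# (as the target of push) makes Perl create an array reference there.
push @{ $cells[3] }, 'ah';

print scalar(@cells), "\n";                            # 4
print defined($cells[0]) ? "defined" : "undef", "\n";  # undef
print ref($cells[3]), "\n";                            # ARRAY
print $cells[3][0], "\n";                              # ah
```

The elements at indices 0 through 2 exist only as padding and hold undef; only index 3 received a reference.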

Let's start with the following two assertions:
@a starts out as an empty array with no elements.
$b is assigned the value of 10.
Now look at this construct:
@{$a[$b]}
To understand it we can start in the middle: $a[$b] indexes element 10 of the array @a.
Now we can work outward from there: @{...} treats its contents as a reference to an array. So @{$a[$b]} treats the content of element 10 of the array @a as a reference to an anonymous array. That is to say, the scalar value contained in $a[10] is an array reference.
Now layer in the push:
push @{$a[$b]}, $c;
Into the anonymous array referenced in element 10 of @a you are pushing the value of $c, which is the character "a". You could access that element like this:
my $value = $a[10]->[0]; # character 'a'
Or shorthand,
my $value = $a[10][0];
If you pushed another value into @{$a[10]} then you would access it at:
my $other_value = $a[10][1];
But what about $a[0] through $a[9]? You're only pushing a value into $a[$b], which is $a[10]. Perl automatically extends the array to accommodate that 11th element ($a[10]), but leaves the value in $a[0] through $a[9] as undef. You mentioned that you tried this:
print "@a\n";
Interpolating an array into a string causes its elements to be printed with a space between each one. So you didn't see this:
ARRAY(0xa6f328)
You saw this:
          ARRAY(0xa6f328)
...because there were ten spaces before the 11th element which contains an array reference.
If you were running your script with use warnings at the top, you would have seen this instead:
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
Use of uninitialized value in join or string at scripts/mytest.pl line 12.
          ARRAY(0xa6f328)
...or something quite similar.
Your structure currently looks like this:
@a = (undef,undef,undef,undef,undef,undef,undef,undef,undef,undef,['a'])
If you ever want to really get a look at what a data structure looks like, rather than using a simple print, do something like this:
use Data::Dumper;
print Dumper \@a;
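For a quick look without loading a module, one option is to map each element to a printable string first. This is only a sketch; the helper name show_element is made up for illustration and handles just undef, array references, and plain scalars:

```perl
use strict;
use warnings;

my @a;
push @{ $a[10] }, 'a';

# Hypothetical helper: turn one element into something printable.
sub show_element {
    my ($el) = @_;
    return 'undef' unless defined $el;
    return '[' . join(',', @$el) . ']' if ref($el) eq 'ARRAY';
    return $el;
}

my $line = join ' ', map { show_element($_) } @a;
print "$line\n";
```

Because every undef is replaced before the join, this prints ten "undef" words followed by [a] and emits no "uninitialized value" warnings.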

I had a discussion about this yesterday here.
What it means is that @a is an array and $a[$b] is a cell in the array.
The @{} syntax tells Perl to treat the value in that cell as an array reference, so you can perform push/pop operations on the array it points to.
If you do
use Data::Dumper;
print Dumper \@a;
you should see something like:
$VAR1 = [
          undef,
          undef,
          undef,
          undef,
          undef,
          undef,
          undef,
          undef,
          undef,
          undef,
          [
            'a'
          ]
        ];
As you can see, the 11th cell is an array reference whose array contains the letter 'a' as its only value.
The push operation on an empty cell could also have been written as:
$a[$b] = [$c];
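The two forms only coincide while the cell is still empty: push appends to whatever the cell already refers to, while the assignment replaces it. A small sketch with illustrative values:

```perl
use strict;
use warnings;

my @a;
push @{ $a[1] }, 'x';    # cell was empty: autovivifies ['x']
push @{ $a[1] }, 'y';    # cell already holds a reference: appends
my $after_push = join ',', @{ $a[1] };

$a[1] = ['z'];           # assignment discards the old anonymous array
my $after_assign = join ',', @{ $a[1] };

print "$after_push\n";   # x,y
print "$after_assign\n"; # z
```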

Related

Why doesn't this deref of a reference work as a one-liner?

Given:
my @list1 = ('a');
my @list2 = ('b');
my @list0 = ( \@list1, \@list2 );
then
my @listRef = $list0[1];
my @list = @$listRef; # works
but
my @list = @$($list0[1]); # gives an error message
I can't figure out why. What am I missing?
There is one simple de-referencing rule that covers this. Loosely put:
What follows the sigil needs to be the correct reference for it, or a block that evaluates to that.
A specific case from perlreftut
You can always use an array reference, in curly braces, in place of the name of an array.
In your case then, it should be
my @list = @{ $list0[1] };
(not index [2], since your @list0 has two elements). Spaces are there only for readability.
The attempted @$($list0[2]) is a syntax error, first because the ( (following the $) isn't allowed in an identifier (variable name), which is what presumably follows that $.
A block {} though would be allowed after the $ and would be evaluated, and must yield a scalar reference in this case, to be dereferenced by that $ in front of it; but then the first @ would be in error. That can then be fixed as well, but this gets messy if pushed, and wasn't meant to go that far. While the exact rules are (still) a little murky, see Identifier Parsing in perldata.
The @$listRef earlier is correct syntax in general. But it refers to a scalar variable $listRef (which must be an array reference since it's getting dereferenced into an array by the first @), and there is no such thing in the example -- you have an array variable @listRef.
So with use strict; in effect this, too, would fail to compile.
Dereferencing an arrayref to assign a new array is expensive as it has to copy all elements (and to construct the new array variable), while it's rarely needed (unless you actually want a copy). With the array reference on hand ($ar) all that one may need is readily available
@$ar; # list of elements
$ar->[$index]; # specific element
@$ar[@indices]; # slice -- list of some elements, like @$ar[0,2..5,-1]
$ar->@[0,-1]; # slice, with new "postfix dereferencing" (stable at v5.24)
$#$ar; # last index in the anonymous array referred by $ar
See Slices in perldata and Postfix reference slicing in perlref
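To make the copy-versus-reference distinction concrete, here is a short sketch using only the classic (pre-postfix) forms listed above; the values are illustrative:

```perl
use strict;
use warnings;

my $ar = [ 10, 20, 30, 40 ];

my @copy = @$ar;          # copies every element into a new array
$copy[0] = 99;            # changes the copy only

$ar->[1] = 21;            # changes the original through the reference
my @ends = @$ar[ 0, -1 ]; # slice through the reference, no full copy

print "$ar->[0] $ar->[1]\n";  # 10 21
print "@ends\n";              # 10 40
print $#$ar, "\n";            # 3
```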
You need
@{ $list0[1] }
Whenever you can use the name of a variable, you can use a block that evaluates to a reference. That means the syntax for getting the elements of an array is
@NAME # If you have the name
@BLOCK # If you have a reference
That means that
my @array1 = 4..5;
my @array2 = @array1;
and
my $array1 = [ 4..5 ];
my @array2 = @{ $array1 };
are equivalent.
When the only thing in the block is a simple scalar ($NAME or $BLOCK), you can omit the curlies. That means that
@{ $array1 }
is equivalent to
@$array1
That's why @$listRef works, and it's why @{ $list0[1] } can't be simplified.
See Perl Dereferencing Syntax.
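The following sketch illustrates why the curlies cannot be dropped around $list0[1]. The scalar $list0 is introduced here purely for illustration and is unrelated to the array @list0:

```perl
use strict;
use warnings;

my @list1 = ('a');
my @list2 = ('b', 'c');
my @list0 = ( \@list1, \@list2 );

# The block form: dereference the element $list0[1].
my @second = @{ $list0[1] };

# Without the curlies, @$list0[...] means @{ $list0 }[...]: a slice of the
# array referenced by a *scalar* $list0, declared here only to make the point.
my $list0 = [ 'x', 'y', 'z' ];
my @slice = @$list0[ 1, 2 ];

print "@second\n";   # b c
print "@slice\n";    # y z
```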
You have a lot going on there and multiple levels of inadvertent references, so let's go through it:
First, you start by making a list of two items, each of which is an array reference. You store that in an array:
my @list0 = ( \@list2, \@list2 );
Then you ask for the item with index 2, which is a single item, and store that in an array:
my @listRef = $list0[2];
However, there is no item with index 2 because Perl indexes from zero. The value in @listRef is undefined. Not only that, but you've asked for a single item and stored it in an array instead of a scalar. That's probably not what you meant.
You say this following line works, but I don't think you know that because it won't give you the value you were expecting even if you didn't get an error. Something else is happening. You haven't declared or used a variable $listRef, so Perl creates it for you and gives it the value undef. When you try to dereference it, Perl uses "autovivification" to create the reference. This is the process where Perl helpfully creates a reference structure for you if you start with undef:
my @list = @$listRef; # works
There is nothing in that array so @list should be empty.
Fix that to get the last item, which has index of 1, and fix it so you are assigning the single value (the reference) to a scalar variable:
my $listRef = $list0[1];
Data::Dumper is handy here:
use Data::Dumper;
my @list2 = qw(a b c);
my @list0 = ( \@list2, \@list2 );
my $listRef = $list0[1];
print Dumper($listRef);
You get the output:
$VAR1 = [
          'a',
          'b',
          'c'
        ];
Perl has some features that can catch these sorts of variable naming mistakes and will help you track down problems. Add these to the top of your program:
use strict;
use warnings;
For the rest, you might want to check out my book Intermediate Perl which explains all this reference stuff.
And, recent Perls have a new feature called postfix dereferencing that allows you to write dereferences from left to right:
my @items = ( \@list2, \@list2 );
my @items_of_last_ref = $items[1]->@*;
my @list = @$listRef; # works
I doubt that works. That may not throw a syntax error but it sure as hell does not do what you think it does. For one,
my @list0 = ( \@list2, \@list2 );
defines an array with 2 elements and you access
my @listRef = $list0[2];
the third element. So @listRef is an array that contains one element which is undef. The following code doesn't make sense either.
Unless the question is purely academic (answered by zdim already), I assume you want the second element of @list0 in a separate array, so I would write
my @list = @{ $list0[1] };
The question is not complete and not clear on the desired outcome.
OP tries to access an element $list0[2] of array @list0 which does not exist -- the array has elements with indexes 0 and 1.
Perhaps @listRef should be $listRef instead in the post.
Below is my take on the described problem
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my @list1 = qw/word1 word2 word3 word4/;
my @list2 = 1000..1004;
my @list0 = (\@list1, \@list2);
my $ref_array = $list0[0];
say for @{$ref_array};
$ref_array = $list0[1];
say for @{$ref_array};
say "Element: " . $ref_array->[2];
output
word1
word2
word3
word4
1000
1001
1002
1003
1004
Element: 1002

How can $# be used on a dereferenced array without first using @?

An array reference in Perl is dereferenced like so,
my @array = @{$array_reference};
When trying to assign to an array from a dereference without the '@', like,
my @array = {$array_reference};
Perl throws the error, 'Odd number of elements in anonymous hash at ./sand.pl line 22.' We can't assign it to an array variable because Perl is confused about the type.
So how can we perform...
my $lastindex = $#{$array_reference};
if Perl struggles to understand that '{$array_reference}' is an array type? It would make more sense to me if this looked like,
my $lastindex = $#@{$array_reference};
(despite looking much uglier).
tl;dr: It's $#{$array_reference} to match the syntax of $#array.
{} is overloaded with many meanings and that's just how Perl is.
Sometimes {} creates an anonymous hash. That's what {$array_reference} is doing, trying to make a hash where the key is the stringification of $array_reference, something like "ARRAY(0x7fb21e803280)" and there is no value. Because you're trying to create a hash with a key and no value you get an "odd number of elements" warning.
Sometimes {...} is a block like sub { ... } or if(...) { ... }, or do {...} and so on.
Sometimes it's a bare block like { local $/; ... }.
Sometimes it's indicating the key of a hash like $hash{key} or $hash->{key}.
Preceded by certain sigils, {} makes dereferencing explicit. While you can write $#$array_reference or @$array_reference, sometimes you want to dereference something that isn't a simple scalar. For example, if you had a function that returned an array reference you could get its last index in one line with $#{ get_array_reference() }. It's $#{$array_reference} to match the syntax of $#array.
$#{...} dereferences an array and gets its last index. @{...} dereferences an array. %{...} dereferences a hash. ${...} dereferences a scalar. *{...} dereferences a glob.
You might find the section on Variable Names and Sigils in Modern Perl helpful to see the pattern better.
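As a sketch of the sigil-plus-block pattern, the following uses a stand-in function (the body of get_array_reference is invented for illustration) and a hypothetical hash reference $href:

```perl
use strict;
use warnings;

# Hypothetical function standing in for anything that returns an array reference.
sub get_array_reference { return [ 'a', 'b', 'c' ] }

my $last_index = $#{ get_array_reference() };         # block evaluated, then $# applied
my $count      = scalar( @{ get_array_reference() } ); # same block form with @

my $href = { x => 1, y => 2 };                         # illustrative hash reference
my @keys = sort keys %{ $href };                       # % with a block

print "$last_index $count @keys\n";   # 2 3 x y
```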
It would make more sense to me if this looked like...
There's a lot of things like that. Perl has been around since 1987. A lot of these design decisions were made decades ago. The code for deciding what {} means is particularly complex. That there is a distinction between an array and an array reference at all is a bit odd.
$array[$index]
@array[@indexes]
@array
$#array
are equivalent to
${ \@array }[$index]
@{ \@array }[@indexes]
@{ \@array }
$#{ \@array }
See the pattern? Wherever the NAME of an array is used, you can use a BLOCK that returns a reference to an array instead. That means you can use
${ $ref }[$index]
@{ $ref }[@indexes]
@{ $ref }
$#{ $ref }
This is illustrated in Perl Dereferencing Syntax.
Note that you can omit the curlies if the BLOCK contains nothing but a simple scalar.
$$ref[$index]
@$ref[@indexes]
@$ref
$#$ref
There's also an "arrow" syntax which is considered clearer.
$ref->[$index]
$ref->@[@indexes]
$ref->@*
$ref->$#*
Perl is confused about the type
Perl struggles to understand that '{$array_reference}' is an array type
Well, it's not an array type. Perl doesn't "struggle"; you just have wrong expectations.
The general rule (as explained in perldoc perlreftut) is: You can always use a reference in curly braces in place of a variable name.
Thus:
@array # a whole array
@{ $array_ref } # same thing with a reference
$array[$i] # an array element
${ $array_ref }[$i] # same thing with a reference
$#array # last index of an array
$#{ $array_ref } # same thing with a reference
On the other hand, what's going on with
my @array = {$array_reference};
is that you're using the syntax for a hash reference constructor, { LIST }. The warning occurs because the list in question is supposed to have an even number of elements (for keys and values):
my $hash_ref = {
key1 => 'value1',
key2 => 'value2',
};
What you wrote is treated as
my @array = ({
    $array_reference => undef,
});
i.e. an array containing a single element, which is a reference to a hash containing a single key, which is a stringified reference (and whose value is undef).
The syntactic difference between a dereference and a hashref constructor is that a dereference starts with a sigil (such as $, @, or %) whereas a hashref constructor starts with just a bare {.
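A small sketch contrasting the two readings of the braces; the variable names @deref and $key are illustrative, and the exact address inside the stringified key will differ from run to run:

```perl
use strict;
use warnings;

my $array_reference = [ 1, 2, 3 ];

my @deref = @{ $array_reference };                # sigil + block: dereference
my @array = ( { $array_reference => undef } );    # bare braces: anonymous hash

my ($key) = keys %{ $array[0] };

print scalar(@deref), "\n";   # 3
print ref($array[0]), "\n";   # HASH
print "key looks like $key\n";
```

The key is the reference turned into a plain string, so it can no longer be dereferenced.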
Technically speaking the { } in the dereference syntax form an actual block of code:
print ${
print "one\n"; # yeah, I just put a statement in the middle of an expression
print "two\n";
["three"] # the last expression in this block is implicitly returned
# (and dereferenced by the surrounding $ [0] construct outside)
}[0], "\n";
For (hopefully) obvious reasons, no one actually does this in real code.
The syntax is
my $lastindex = $#$array_reference;
which assigns to $lastindex the index of the last element of the anonymous array whose reference is in the variable $array_reference.
The code
my @ary = { $ra }; # works but you get a warning
doesn't throw "an error" but rather a warning. In other words, you do get @ary with one element, a reference to an anonymous hash. However, a hash needs to have an even number of elements so you also get a warning that that isn't so.
Your last attempt dereferences the array with @{$array_reference} -- which returns a list, not an array variable. A "list" is a fleeting collection of scalars in memory (think of copying scalars on stack to go elsewhere); there is no notion of "index" for such a thing. For this reason $#@{$ra} isn't even parsed as intended and is a syntax error.
The syntax $#ary works only with a variable @ary, and then there is the $#$arrayref syntax. You can in general write $#{$arrayref} since the curlies allow for an arbitrary expression that evaluates to an array reference but there is no reason for that since you do have a variable with an array reference.
I'd agree readily that much of this syntax takes some getting-used-to, to put it that way.

How to get sub array?

I have the code below:
@a = ((1,2,3),("test","hello"));
print @a[1]
I was expecting it to print
testhello
But it gives me 2.
Sorry for the newbie question (Perl is a bit unnatural to me) but why does it happen and how can I get the result I want?
The way Perl constructs @a is such that it is equivalent to your writing,
@a = (1,2,3,"test","hello");
And that is why when you ask for the value at index 1 by writing @a[1] (really should be $a[1]), you get 2. To demonstrate this, if you were to do the following,
use strict;
use warnings;
my @a = ((1,2,3), ("test","hello"));
my @b = (1,2,3,"test","hello");
print "@a\n";
print "@b\n";
Both print the same line,
1 2 3 test hello
1 2 3 test hello
What you want is to create anonymous arrays within your array - something like this,
my @c = ([1,2,3], ["test","hello"]);
Then if you write the following,
use Data::Dumper;
print Dumper $c[1];
You will see this printed,
$VAR1 = [
          'test',
          'hello'
        ];
Perl lists are one-dimensional only, which means (1,2,(3,4)) is automatically flattened to (1,2,3,4). If you want a multidimensional array, you must use references for any dimension beyond the first (which are stored in scalars).
You can get an anonymous array reference with bracket notation [1,2,3,4] or reference an existing array with a backslash: my $ref = \@somearray.
So a construct such as my $aref = [1,2,[3,4]] is an array reference in which the first element of the referenced array is 1, the second element is 2, and the third element is another array reference.
(I find when working with multidimensional arrays, that it's less confusing just to use references even for the first dimension, but my @array = (1,2,[3,4]) is fine too.)
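The flattening is easy to confirm by counting elements. A minimal sketch, with @flat and @nested as illustrative names for the two variants from the question and the answers:

```perl
use strict;
use warnings;

my @flat   = ( (1, 2, 3), ("test", "hello") );   # inner parentheses only group
my @nested = ( [1, 2, 3], ["test", "hello"] );   # two array references

print scalar(@flat), "\n";        # 5
print scalar(@nested), "\n";      # 2
print $flat[1], "\n";             # 2
print @{ $nested[1] }, "\n";      # testhello
```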
By the way, when you stringify a perl reference, you get some gibberish indicating the type of reference and the memory location, like "ARRAY(0x7f977b02ac58)".
Dereference an array reference to an array with @, or get a specific element of the reference with ->.
Example:
my $ref = ['A','B',['C','D']];
print $ref; # prints ARRAY(0x001)
print join ',', @{$ref}; # prints A,B,ARRAY(0x002)
print join ',', @$ref; # prints A,B,ARRAY(0x002) (shortcut for above)
print $ref->[0]; # prints A
print $ref->[1]; # prints B
print $ref->[2]; # prints ARRAY(0x002)
print $ref->[2]->[0]; # prints C
print $ref->[2][0]; # prints C (shortcut for above)
print $ref->[2][1]; # prints D
print join ',', @{$ref->[2]}; # prints C,D
I think you're after an array of arrays. So, you need to create an array of array references by using square brackets, like this:
@a = ([1,2,3],["test","hello"]);
Then you can print the second array as follows:
print @{$a[1]};
Which will give you the output you were expecting: testhello
It's just a matter of wrong syntax:
print $a[1]

Perl "Not an ARRAY reference" error

I'll be glad if someone can enlighten me as to my mistake:
my %mymap;
@mymap{"balloon"} = {1,2,3};
print $mymap{"balloon"}[0] . "\n";
$mymap{'balloon'} is a hash reference, not an array reference. The expression {1,2,3} creates an anonymous hash:
{
  '1' => 2,
  '3' => undef
}
You assigned it to a slice of %mymap corresponding to the list of keys: ('balloon'). Since the key list was 1 item and the value list was one item, you did the same thing as
$mymap{'balloon'} = { 1 => 2, 3 => undef };
If you had used strict and warnings it would have clued you in to your error. I got:
Scalar value @mymap{"balloon"} better written as $mymap{"balloon"} at - line 3.
Odd number of elements in anonymous hash at - line 3.
If you had put 'use strict; use warnings;' at the top of your code, you would probably have had better error messages.
What you're doing is creating a hash called mymap. A hash stores data as key => value pairs.
You then want to assign an array reference to the key balloon. Your small code snippet had two issues: 1. you did not address the mymap hash correctly (a single element is $mymap{...}, not @mymap{...}), and 2. if you want to store a list, you should use square brackets:
my %mymap;
$mymap{"balloon"} = [1,2,3];
print $mymap{"balloon"}[0] . "\n";
this prints '1'.
You can also just use an array:
my @balloon = (1,2,3);
print $balloon[0] . "\n";
Well, first off, always use strict; use warnings;. If you had, it might have told you about what is wrong here.
Here's what you do in your program:
my %mymap; # declare hash %mymap
@mymap{"balloon"} = {1,2,3}; # hash slice with a single key; an anonymous
# hash (not an array) is assigned to it
print $mymap{"balloon"}[0] . "\n"; # tries to use that hash reference as an array reference
For it to do what you expect, do:
my %mymap = ( 'balloon' => [ 1,2,3 ] );
print $mymap{'balloon'}[0];
Okay, a few things...
%mymap is a hash. $mymap{"balloon"} is a scalar--namely, the value of the hash %mymap corresponding to the key "balloon". @mymap{"balloon"} is an attempt at what's called a hash slice--basically, you can use these to assign a bunch of values to a bunch of keys at once: @hash{@keys}=@values.
So, if you want to assign an array reference to $mymap{"balloon"}, you'd need something like:
$mymap{"balloon"}=[1,2,3];
To access the elements, you can use -> like so:
$mymap{"balloon"}->[0] #equals 1
$mymap{"balloon"}->[1] #equals 2
$mymap{"balloon"}->[2] #equals 3
Or, you can omit the arrows: $mymap{"balloon"}[0], etc.
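For comparison, this is roughly what a hash slice looks like when it is used on purpose. The second key 'kite' and its values are invented for the example:

```perl
use strict;
use warnings;

my %mymap;
my @keys   = ( 'balloon', 'kite' );       # 'kite' is an illustrative extra key
my @values = ( [ 1, 2, 3 ], [ 4, 5 ] );

@mymap{@keys} = @values;                  # hash slice: one value per key

print $mymap{balloon}[0], "\n";              # 1
print scalar( @{ $mymap{kite} } ), "\n";     # 2
```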

How do I access the array's element stored in my hash in Perl?

# I have a hash
my %my_hash;
# I have an array
@my_array = ["aa" , "bbb"];
# I store the array in my hash
$my_hash{"Kunjan"} = @my_array;
# But I can't print my array's element
print $my_hash{"Kunjan"}[0];
I am new to Perl. Please help me.
Your array syntax is incorrect. You are creating an anonymous array reference, and @my_array is a single-element array containing that reference.
You can either work with the reference properly, as a scalar:
$my_array = ["aa" , "bbb"];
$my_hash{"Kunjan"} = $my_array;
Or you can work with the list as a list, creating the reference only when putting it into the hash:
@my_array = ("aa" , "bbb");
$my_hash{"Kunjan"} = \@my_array;
If you had only put this at the top of your script:
use strict;
use warnings;
...you would have gotten some error messages indicating what was wrong:
Global symbol "@my_array" requires explicit package name at kunjan-array.pl line 8.
Global symbol "@my_array" requires explicit package name at kunjan-array.pl line 11.
So, declare the array first with my @my_array; and then you would get:
Can't use string ("1") as an ARRAY ref while "strict refs" in use at kunjan-array.pl line 14.
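The string "1" in that message is the element count: assigning an array to a hash element evaluates the array in scalar context. A minimal sketch, with the hash keys count and ref chosen only for illustration:

```perl
use strict;
use warnings;

my @my_array = ( [ "aa", "bbb" ] );   # one element: a reference, as in the question
my %my_hash;

$my_hash{count} = @my_array;          # scalar context: stores the element count
$my_hash{ref}   = \@my_array;         # stores a reference to the array

print $my_hash{count}, "\n";          # 1
print $my_hash{ref}[0][1], "\n";      # bbb
```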
You created an arrayref and attempted to assign it to an array - see perldoc perldata for how to declare an array
You attempted to assign an array to a hash (you can only assign scalars, such as an arrayref - see perldoc perlref for more about references)
You need to dereference the hash element to get at the array element, e.g. $my_hash{"Kunjan"}->[0] - again see perldoc perlref for how to dereference a hashref
You have a few errors in your program:
my @my_array = ("aa" , "bbb");
$my_hash{"Kunjan"} = \@my_array;
print $my_hash{"Kunjan"}[0];
I made three changes:
Added my in front of @my_array on the first line
Changed the [...] to (...) on the first line
Added a \ in front of @my_array on the second line
Try these amendments:
my %my_hash;
# ["aa" , "bbb"] produces an array reference. Use () instead
my @my_array = ("aa" , "bbb");
# 'Kunjan' hash is given reference to @my_array
$my_hash{ Kunjan } = \@my_array;
# bareword for hash key is nicer on the eye IMHO
print $my_hash{ Kunjan }[0];
However there is still one thing you need to consider if you use this method:
unshift @my_array, 'AA';
print $my_hash{ Kunjan }[0]; # => AA - probably not what you wanted!
So what you are probably after is:
$my_hash{ Kunjan } = ["aa" , "bbb"];
Then the hash is no longer referencing @my_array.
/I3az/
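If the data already lives in a named array, copying it into a fresh anonymous array gives the same independence. A sketch, with the hash keys shared and copy chosen only for illustration:

```perl
use strict;
use warnings;

my @my_array = ( "aa", "bbb" );
my %my_hash;

$my_hash{shared} = \@my_array;       # points at @my_array itself
$my_hash{copy}   = [ @my_array ];    # new anonymous array with copied elements

unshift @my_array, 'AA';             # later change to the named array

print $my_hash{shared}[0], "\n";     # AA
print $my_hash{copy}[0], "\n";       # aa
```

Note that [ @my_array ] is a shallow copy: nested references inside the array would still be shared.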
Others have already explained nicely what's what, but I would like to add that (especially if you're new to Perl) it would be well worth spending some time reading the perldsc and perllol docs.