How can I create a multi-dimensional array in perl? - perl

I was creating a multi-dimensional array this way:
#!/usr/bin/perl
use warnings;
use strict;
my #a1 = (1, 2);
my #a2 = (#a1, 3);
But it turns out that I still got a one-dimensional array...
What's the right way in Perl?

You get a one-dimensional array because the array #a1 is expanded inside the parentheses. So, assuming:
my #a1 = (1, 2);
my #a2 = (#a1, 3);
Then your second statement is equivalent to my #a2 = (1,2,3);.
When creating a multi-dimensional array, you have a few choices:
Direct assignment of each value
Dereferencing an existing array
Inserting a reference
The first option is basically $array[0][0] = 1; and is not very exciting.
The second is doing this: my #a2 = (\#a1, 3);. Note that this makes a reference to the namespace for the array #a1, so if you later change #a1, the values inside #a2 will also change. It is not always a recommended option.
A variation of the second option is doing this: my #a2 = ([1,2], 3);. The brackets will create an anonymous array, which has no namespace, only a memory address, and will only exist inside #a2.
The third option, a bit more obscure, is doing this: my $a1 = [1,2]; my #a2 = ($a1, 3);. It will do exactly the same thing as 2, only the array reference is already in a scalar variable, called $a1.
Note the difference between () and [] when assigning to arrays. Brackets [] create an anonymous array, which returns an array reference as a scalar value (for example, that can be held by $a1, or $a2[0]).
Parentheses, on the other hand, do nothing at all really, except change the precedence of operators.
Consider this piece of code:
my #a2 = 1, 2, 3;
print "#a2";
This will print 1. If you use warnings, you will also get a warning such as: Useless use of a constant in void context. Basically, this happens:
my #a2 = 1;
2, 3;
Because commas (,) have a lower precedence than equal sign =. (See "Operator Precedence and Associativity" in perldoc perlop.)
Parentheses simply negate the default precedence of = and ,, and group 1,2,3 together in a list, which is then passed to #a2.
So, in short, brackets, [], have some magic in them: They create anonymous arrays. Parentheses, (), just change precedence, much like in math.
There is much to read in the documentation. Someone here once showed me a very good link for dereferencing, but I don't recall what it was. In perldoc perlreftut you will find a basic tutorial on references. And in perldoc perldsc you will find documentation on data structures (thanks Oesor for reminding me).

I would propose to work through perlreftut, perldsc and perllol, preferably in the same day and preferably using Data::Dumper to print data structures.
The tutorials complement each other and I think they would take better effect together. Visualizing data structures helped me a lot to believe they actually work (seriously) and to see my mistakes.

Arrays contain scalars, so you need to add a reference.
my #a1 = (1,2);
my #a2 = (\#a1, ,3);
You'll want to read http://perldoc.perl.org/perldsc.html.

The most important thing to understand
about all data structures in
Perl--including multidimensional
arrays--is that even though they might
appear otherwise, Perl #ARRAY s and
%HASH es are all internally
one-dimensional. They can hold only
scalar values (meaning a string,
number, or a reference). They cannot
directly contain other arrays or
hashes, but instead contain references
to other arrays or hashes.
Now, because the top level contains only references, if you try to print out your array in with a simple print() function, you'll get something that doesn't look very nice, like this:
#AoA = ( [2, 3], [4, 5, 7], [0] );
print $AoA[1][2];
7
print #AoA;
ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)
That's because Perl doesn't (ever) implicitly dereference your variables. If you want to get at the thing a reference is referring to, then you have to do this yourself using either prefix typing indicators, like ${$blah} , #{$blah} , #{$blah[$i]} , or else postfix pointer arrows, like $a->[3] , $h->{fred} , or even $ob->method()->[3]
Source: perldoc
Now coming to your question. Here's your code:
my #a1 = (1,2);
my #a2 = (#a1,3);
Notice that the arrays contain scalar values. So you have to use reference and you can add a reference by using the \ keyword before an array's name which is to be referenced.
Like this:
my #a2 = (\#a1, ,3);

Inner arrays should be scalar references in the outer one:
my #a2 = (\#a1,3); # first element is a reference to a1
print ${$a2[0]}[1]; # print second element of inner array

This is a simple example of a 2D array as ref:
my $AoA = undef;
for(my $i=0; $i<3; $i++) {
for(my $j=0; $j<3; $j++) {
$AoA->[$i]->[$j] = rand(); # Assign some value
}
}

Related

Perl assigning #ARGV array to a variable

When I assign the Perl #ARGV array to a variable, if I don't use the quotes, it gives me the number of strings in the array, and not the strings in the array.
What is this called - I thought it was dereferencing, but it is not. Right now I am calling it one more thing I need to memorize in Perl.
#!/usr/bin/perl
use strict ;
use warnings;
my $str = "#ARGV" ;
#my $str = #ARGV ;
#my $str = 'geeks, for, geeks';
my #spl = split(', ' , $str);
foreach my $i (#spl) {
print "$i\n" ;
}
If you assign an array to a scalar in Perl, you get the number of elements in the array.
my #array = (1, 1, 2, 3, 5, 8, 13);
my $scalar = #array; # $scalar contains 7
This is known as "evaluating an array in scalar context".
If you expand an array in a double-quoted string in Perl, you get the elements of the array separated by spaces.
my #array = (1, 1, 2, 3, 5, 8, 13);
my $scalar = "#array"; $scalar contains '1 1 2 3 5 8 13'
This is known as "interpolating an array in a double-quoted string".
Actually, in that second example, the elements are separated by the current contents of the $" variable. And the default value of that variable is a space. But you can change it.
my #array = (1, 1, 2, 3, 5, 8, 13);
$" = '+';
my $scalar = "#array"; $scalar contains '1+1+2+3+5+8+13'
To store a reference to the array, you either take a reference to the array.
my $scalar = \#array;
Or create a new, anonymous array using the elements of of the original array.
my $scalar = [ #array ];
Because we don't know what you are actually trying to do, we can't recommend which of these is the best approach.
Perl works by context. The one that you see here is scalar versus array context. In scalar context, you want one thing, so Perl gives you the one thing the probably makes sense. Recognize the context and you can probably suss out what's going on.
When you have a scalar on the left side of an assignment, you have scalar context because you want to end up with one thing:
my $one_thing = ...
Put an array on the right side, and you have an array in scalar context. The design of Perl decided that the most common thing people probably want in that case is the number of elements:
my $one_thing = #array;
This works with some other builtins too. The localtime builtin returns a single string in scalar context (a timestamp):
my $uid = localtime; # Tue Mar 17 11:39:47 2020
But, in array context, you want possibly multiple things (where that could be two, or one, or zero, or ten thousand, or...). In that case, localtime returns a list of things:
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
localtime();
You already know some of this though, probably. The + operator uses its operands as numbers, but the . operator uses them as strings:
my $sum = '123' + 14;
my $string = '123' . 14;
Perl's philosophy is that it is going to try to do what the verbs (operators, builtins, functions) are trying to do, not what the nouns (variable or value type) might imply. Many languages tell the verbs what to do based on the nouns, so fitting Perl into one of those mental modules usually doesn't work out. You don't have to memorize a lot; I've been doing this quite awhile and I still refer to the docs often.
We go through a lot of this philosophical explanation in Learning Perl.
The idiom you are looking for is one of
my $str = \#ARGV;
my $str = [ #ARGV ];
These both assign an array reference to the scalar variable $str. You can then get back the elements of #ARGV when you dereference $str. For example,
for my $i (#$str) {
print "$i\n";
}
(Some people prefer #{$str}, which does the same thing)
\ is the reference operator, which returns a reference to whatever is on its right hand side.
[...] creates a new array reference out of whatever is contained between the brackets.
"#array" is a stringify operation on an array, and equivalent to join($", #array)
And finally, a scalar assignment from an array, like
$n = #array
returns the number of elements in the array.
The total number of elements in the Array can sometimes be required. Since such situations are often encountered, we also have to learn how to get the total number.
#array = ("a".."z");
$re = $#array;
print ("$re\n");
We may need to add one to the number we get to reach the total number.
$ay = ("a", "b", "c");
$re = $#ay;
$re = $re +1;
print ("$re\n");
result: 3

Why doesn't this deref of a reference work as a one-liner?

Given:
my #list1 = ('a');
my #list2 = ('b');
my #list0 = ( \#list1, \#list2 );
then
my #listRef = $list0[1];
my #list = #$listRef; # works
but
my #list = #$($list0[1]); # gives an error message
I can't figure out why. What am I missing?
There is one simple de-referencing rule that covers this. Loosely put:
What follows the sigil need be the correct reference for it, or a block that evaluates to that.
A specific case from perlreftut
You can always use an array reference, in curly braces, in place of the name of an array.
In your case then, it should be
my #list = #{ $list0[1] };
(not index [2] since your #list0 has two elements)  Spaces are there only for readability.
The attempted #$($list0[2]) is a syntax error, first because the ( (following the $) isn't allowed in an identifier (variable name), what presumably follows that $.
A block {} though would be allowed after the $ and would be evaluated, and must yield a scalar reference in this case, to be dereferenced by that $ in front of it; but then the first # would be in error. That can then fixed as well but this gets messy if pushed, and wasn't meant to go that far. While the exact rules are (still) a little murky, see Identifier Parsing in perldata.
The #$listRef earlier is correct syntax in general. But it refers to a scalar variable $listRef (which must be an array reference since it's getting dereferenced into an array by the first #), and there is no such a thing in the example -- you have an array variable #listRef.
So with use strict; in effect this, too, would fail to compile.
Dereferencing an arrayref to assign a new array is expensive as it has to copy all elements (and to construct the new array variable), while it's rarely needed (unless you actually want a copy). With the array reference on hand ($ar) all that one may need is readily available
#$ar; # list of elements
$ar->[$index]; # specific element
#$ar[#indices]; # slice -- list of some elements, like #$ar[0,2..5,-1]
$ar->#[0,-1]; # slice, with new "postfix dereferencing" (stable at v5.24)
$#$ar; # last index in the anonymous array referred by $ar
See Slices in perldata and Postfix reference slicing in perlref
You need
#{ $list0[1] }
Whenever you can use the name of a variable, you can use a block that evaluates to a reference. That means the syntax for getting the elements of an array are
#NAME # If you have the name
#BLOCK # If you have a reference
That means that
my #array1 = 4..5;
my #array2 = #array1;
and
my $array1 = [ 4..5 ];
my #array2 = #{ $array1 }
are equivalent.
When the only thing in the block is a simple scalar ($NAME or $BLOCK), you can omit the curlies. That means that
#{ $array1 }
is equivalent to
#$array1
That's why #$listRef works, and it's why #{ $list0[1] } can't be simplified.
See Perl Dereferencing Syntax.
You have a lot going on there and multiple levels of inadvertent references, so let's go through it:
First, you start by making a list of two items, each of which is an array reference. You store that in an array:
my #list0 = ( \#list2, \#list2 );
Then you ask for the item with index 2, which is a single item, and store that in an array:
my #listRef = $list0[2];
However, there is no item with index 2 because Perl indexes from zero. The value in #listRef in undefined. Not only that, but you've asked for a single item and stored it in an array instead of a scalar. That's probably not what you meant.
You say this following line works, but I don't think you know that because it won't give you the value you were expecting even if you didn't get an error. Something else is happening. You haven't declared or used a variable $listRef, so Perl creates it for you and gives it the value undef. When you try to dereference it, Perl uses "autovivification" to create the reference. This is the process where Perl helpfully creates a reference structure for you if you start with undef:
my #list = #$listRef; # works
There is nothing in that array so #list should be empty.
Fix that to get the last item, which has index of 1, and fix it so you are assigning the single value (the reference) to a scalar variable:
my $listRef = $list0[1];
Data::Dumper is handy here:
use Data::Dumper;
my #list2 = qw(a b c);
my #list0 = ( \#list2, \#list2 );
my $listRef = $list0[1];
print Dumper($listRef);
You get the output:
$VAR1 = [
'a',
'b',
'c'
];
Perl has some features that can catch these sorts of variable naming mistakes and will help you track down problems. Add these to the top of your program:
use strict;
use warnings;
For the rest, you might want to check out my book Intermediate Perl which explains all this reference stuff.
And, recent Perls have a new feature called postfix dereferencing that allows you to write dereferences from left to right:
my #items = ( \#list2, \#list2 );
my #items_of_last_ref = $items[1]->#*;
my #list = #$#listRef; # works
I doubt that works. That may not throw a syntax error but it sure as hell does not do what you think it does. For once
my #list0 = ( \#list2, \#list2 );
defines an array with 2 elements and you access
my #listRef = $list0[2];
the third element. So #listRef is an array that contains one element which is undef. The following code doesn't make sense either.
Unless the question is purely academic (answered by zdim already), I assume you want the second element of #list into a separate array, I would write
my #list = #{ $list0[1] };
The question is not complete and not clear on desired outcome.
OP tries to access an element $list0[2] of array #list0 which does not exist -- array has elements with indexes 0 and 1.
Perhaps #listRef should be $listRef instead in the post.
Bellow is my vision of described problem
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my #list1 = qw/word1 word2 word3 word4/;
my #list2 = 1000..1004;
my #list0 = (\#list1, \#list2);
my $ref_array = $list0[0];
map{ say } #{$ref_array};
$ref_array = $list0[1];
map{ say } #{$ref_array};
say "Element: " . #{$ref_array}[2];
output
word1
word2
word3
word4
1000
1001
1002
1003
1004
Element: 1002

Perl dereferencing in non-strict mode

In Perl, if I have:
no strict;
#ARY = (58, 90);
To operate on an element of the array, say it, the 2nd one, I would write (possibly as part of a larger expression):
$ARY[1] # The most common way found in Perldoc's idioms.
Though, for some reason these also work:
#ARY[1]
#{ARY[1]}
Resulting all in the same object:
print (\$ARY[1]);
print (\#ARY[1]);
print (\#{ARY[1]});
Output:
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
What is the syntax rules that enable this sort of constructs? How far could one devise reliable program code with each of these constructs, or with a mix of all of them either? How interchangeable are these expressions? (always speaking in a non-strict context).
On a concern of justifying how I come into this question, I agree "use strict" as a better practice, still I'm interested at some knowledge on build-up non-strict expressions.
In an attemp to find myself some help to this uneasiness, I came to:
The notion on "no strict;" of not complaining about undeclared
variables and quirk syntax.
The prefix dereference having higher precedence than subindex [] (perldsc § "Caveat on precedence").
The clarification on when to use # instead of $ (perldata § "Slices").
The lack of "[]" (array subscript / slice) description among the Perl's operators (perlop), which lead me to think it is not an
operator... (yet it has to be something else. But, what?).
For what I learned, none of these hints, put together, make me better understand my issue.
Thanks in advance.
Quotation from perlfaq4:
What is the difference between $array[1] and #array[1]?
The difference is the sigil, that special character in front of the array name. The $ sigil means "exactly one item", while the # sigil means "zero or more items". The $ gets you a single scalar, while the # gets you a list.
Please see: What is the difference between $array[1] and #array[1]?
#ARY[1] is indeed a slice, in fact a slice of only one member. The difference is it creates a list context:
#ar1[0] = qw( a b c ); # List context.
$ar2[0] = qw( a b c ); # Scalar context, the last value is returned.
print "<#ar1> <#ar2>\n";
Output:
<a> <c>
Besides using strict, turn warnings on, too. You'll get the following warning:
Scalar value #ar1[0] better written as $ar1[0]
In perlop, you can read that "Perl's prefix dereferencing operators are typed: $, #, %, and &." The standard syntax is SIGIL { ... }, but in the simple cases, the curly braces can be omitted.
See Can you use string as a HASH ref while "strict refs" in use? for some fun with no strict refs and its emulation under strict.
Extending choroba's answer, to check a particular context, you can use wantarray
sub context { return wantarray ? "LIST" : "SCALAR" }
print $ary1[0] = context(), "\n";
print #ary1[0] = context(), "\n";
Outputs:
SCALAR
LIST
Nothing you did requires no strict; other than to hide your error of doing
#ARY = (58, 90);
when you should have done
my #ARY = (58, 90);
The following returns a single element of the array. Since EXPR is to return a single index, it is evaluated in scalar context.
$array[EXPR]
e.g.
my #array = qw( a b c d );
my $index = 2;
my $ele = $array[$index]; # my $ele = 'c';
The following returns the elements identified by LIST. Since LIST is to return 0 or more elements, it must be evaluated in list context.
#array[LIST]
e.g.
my #array = qw( a b c d );
my #indexes ( 1, 2 );
my #slice = $array[#indexes]; # my #slice = qw( b c );
\( $ARY[$index] ) # Returns a ref to the element returned by $ARY[$index]
\( #ARY[#indexes] ) # Returns refs to each element returned by #ARY[#indexes]
${foo} # Weird way of writing $foo. Useful in literals, e.g. "${foo}bar"
#{foo} # Weird way of writing #foo. Useful in literals, e.g. "#{foo}bar"
${foo}[...] # Weird way of writing $foo[...].
Most people don't even know you can use these outside of string literals.

Perl: dereferencing an array is using scalar context?

I have a strange issue with Perl and dereferencing.
I have an INI file with array values, under two different sections e.g.
[Common]
animals =<<EOT
dog
cat
EOT
[ACME]
animals =<<EOT
cayote
bird
EOT
I have a sub routine to read the INI file into an %INI hash and cope with multi-line entries.
I then use an $org variable to determine whether we use the common array or a specific organisation array.
#array = #{$INI{$org}->{animals}} || #{$INI{Common}->{animals}};
The 'Common' array works fine, i.e. if $org is anything but 'ACME' I get the values (dog cat) but if $org equals 'ACME'` I get a value of 2 back?
Any ideas??
Derefencing arrays is of course not forcing scalar context. But using || is. Therefore things like $val = $special_val || "the default"; work just fine while your example doesn't.
Therefore #array will contain either a single number (the number of elements in the first array) or, if that is 0, the elements of the second array.
The perlop perldoc page even lists this example speficially:
In particular, this means that you shouldn't use this for selecting
between two aggregates for assignment:
#a = #b || #c; # this is wrong
#a = scalar(#b) || #c; # really meant this
#a = #b ? #b : #c; # this works fine, though
Depending on what you want, the solution could be:
my #array = #{$INI{$org}->{animals}}
? #{$INI{$org}->{animals}}
: #{$INI{Common}->{animals}};

How do I determine the number of elements in an array reference?

Here is the situation I am facing...
$perl_scalar = decode_json( encode ('utf8',$line));
decode_json returns a reference. I am sure this is an array. How do I find the size of $perl_scalar?? As per Perl documentation, arrays are referenced using #name. Is there a workaround?
This reference consist of an array of hashes. I would like to get the number of hashes.
If I do length($perl_scalar), I get some number which does not match the number of elements in array.
That would be:
scalar(#{$perl_scalar});
You can get more information from perlreftut.
You can copy your referenced array to a normal one like this:
my #array = #{$perl_scalar};
But before that you should check whether the $perl_scalar is really referencing an array, with ref:
if (ref($perl_scalar) eq "ARRAY") {
my #array = #{$perl_scalar};
# ...
}
The length method cannot be used to calculate length of arrays. It's for getting the length of the strings.
You can also use the last index of the array to calculate the number of elements in the array.
my $length = $#{$perl_scalar} + 1;
$num_of_hashes = #{$perl_scalar};
Since you're assigning to a scalar, the dereferenced array is evaluated in a scalar context to the number of elements.
If you need to force scalar context then do as KARASZI says and use the scalar function.
You can see the entire structure with Data::Dumper:
use Data::Dumper;
print Dumper $perl_scalar;
Data::Dumper is a standard module that is installed with Perl. For a complete list of all the standard pragmatics and modules, see perldoc perlmodlib.