Perl: dereferencing an array is using scalar context?

I have a strange issue with Perl and dereferencing.
I have an INI file with array values, under two different sections e.g.
[Common]
animals =<<EOT
dog
cat
EOT
[ACME]
animals =<<EOT
cayote
bird
EOT
I have a subroutine to read the INI file into an %INI hash and cope with multi-line entries.
I then use an $org variable to determine whether we use the common array or a specific organisation array.
@array = @{$INI{$org}->{animals}} || @{$INI{Common}->{animals}};
The 'Common' array works fine, i.e. if $org is anything but 'ACME' I get the values (dog cat), but if $org equals 'ACME' I get a value of 2 back?
Any ideas??

Dereferencing arrays is of course not forcing scalar context. But using || is. Therefore things like $val = $special_val || "the default"; work just fine while your example doesn't.
Therefore @array will contain either a single number (the number of elements in the first array) or, if that is 0, the elements of the second array.
The perlop perldoc page even lists this example specifically:
In particular, this means that you shouldn't use this for selecting
between two aggregates for assignment:
@a = @b || @c; # this is wrong
@a = scalar(@b) || @c; # really meant this
@a = @b ? @b : @c; # this works fine, though
Depending on what you want, the solution could be:
my @array = @{$INI{$org}->{animals}}
    ? @{$INI{$org}->{animals}}
    : @{$INI{Common}->{animals}};
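To see the two behaviours side by side, here is a minimal sketch. The %INI hash is filled in by hand as a stand-in for whatever the INI reader builds, so its exact shape is an assumption:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the hash the INI reader would build.
my %INI = (
    Common => { animals => [ 'dog', 'cat' ] },
    ACME   => { animals => [ 'coyote', 'bird' ] },
);
my $org = 'ACME';

# || evaluates its left operand in scalar context, so a non-empty
# array contributes its element count instead of its elements.
my @wrong = @{ $INI{$org}{animals} } || @{ $INI{Common}{animals} };

# The ternary tests only the condition in scalar context; the chosen
# branch is then evaluated in list context.
my @right = @{ $INI{$org}{animals} }
          ? @{ $INI{$org}{animals} }
          : @{ $INI{Common}{animals} };

print "wrong: @wrong\n";   # wrong: 2
print "right: @right\n";   # right: coyote bird
```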

Related

Perl assigning @ARGV array to a variable

When I assign the Perl @ARGV array to a variable, if I don't use the quotes, it gives me the number of strings in the array, and not the strings in the array.
What is this called - I thought it was dereferencing, but it is not. Right now I am calling it one more thing I need to memorize in Perl.
#!/usr/bin/perl
use strict ;
use warnings;
my $str = "@ARGV" ;
#my $str = @ARGV ;
#my $str = 'geeks, for, geeks';
my @spl = split(', ' , $str);
foreach my $i (@spl) {
print "$i\n" ;
}

If you assign an array to a scalar in Perl, you get the number of elements in the array.
my @array = (1, 1, 2, 3, 5, 8, 13);
my $scalar = @array; # $scalar contains 7
This is known as "evaluating an array in scalar context".
If you expand an array in a double-quoted string in Perl, you get the elements of the array separated by spaces.
my @array = (1, 1, 2, 3, 5, 8, 13);
my $scalar = "@array"; # $scalar contains '1 1 2 3 5 8 13'
This is known as "interpolating an array in a double-quoted string".
Actually, in that second example, the elements are separated by the current contents of the $" variable. And the default value of that variable is a space. But you can change it.
my @array = (1, 1, 2, 3, 5, 8, 13);
$" = '+';
my $scalar = "@array"; # $scalar contains '1+1+2+3+5+8+13'
To store a reference to the array, you can either take a reference to the array.
my $scalar = \@array;
Or create a new, anonymous array using the elements of the original array.
my $scalar = [ @array ];
Because we don't know what you are actually trying to do, we can't recommend which of these is the best approach.
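As a rough illustration of how the four forms differ, the sketch below changes the array after the assignments. The variable names are chosen for the example only:

```perl
use strict;
use warnings;

my @array = (1, 1, 2, 3, 5, 8, 13);

my $count  = @array;        # scalar context: number of elements
my $joined = "@array";      # interpolation: elements joined with $"
my $ref    = \@array;       # reference to the same array
my $copy   = [ @array ];    # reference to a new, independent array

push @array, 21;            # change the original afterwards

print "$count\n";               # 7
print "$joined\n";              # 1 1 2 3 5 8 13
print scalar(@$ref), "\n";      # 8, the reference sees the push
print scalar(@$copy), "\n";     # 7, the copy does not
```

If later changes to the original should be visible, the plain reference is the one to pick; if a snapshot is wanted, the anonymous copy is.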

Perl works by context. The one that you see here is scalar versus array context. In scalar context, you want one thing, so Perl gives you the one thing that probably makes sense. Recognize the context and you can probably suss out what's going on.
When you have a scalar on the left side of an assignment, you have scalar context because you want to end up with one thing:
my $one_thing = ...
Put an array on the right side, and you have an array in scalar context. The design of Perl decided that the most common thing people probably want in that case is the number of elements:
my $one_thing = @array;
This works with some other builtins too. The localtime builtin returns a single string in scalar context (a timestamp):
my $uid = localtime; # Tue Mar 17 11:39:47 2020
But, in array context, you want possibly multiple things (where that could be two, or one, or zero, or ten thousand, or...). In that case, localtime returns a list of things:
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
localtime();
You already know some of this though, probably. The + operator uses its operands as numbers, but the . operator uses them as strings:
my $sum = '123' + 14;
my $string = '123' . 14;
Perl's philosophy is that it is going to try to do what the verbs (operators, builtins, functions) are trying to do, not what the nouns (variable or value type) might imply. Many languages tell the verbs what to do based on the nouns, so fitting Perl into one of those mental models usually doesn't work out. You don't have to memorize a lot; I've been doing this quite a while and I still refer to the docs often.
We go through a lot of this philosophical explanation in Learning Perl.
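One way to watch context at work is the builtin wantarray, which tells a subroutine how it was called. The sub name context below is made up for this sketch:

```perl
use strict;
use warnings;

# Hypothetical sub that reports the context it was called in.
# wantarray is true in list context, false but defined in scalar
# context, and undef in void context.
sub context {
    return wantarray ? 'list' : defined(wantarray) ? 'scalar' : 'void';
}

my $one   = context();      # scalar assignment
my ($two) = context();      # parentheses make this a list assignment
my @many  = context();      # array on the left: list context

print "$one $two @many\n";  # scalar list list
```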

The idiom you are looking for is one of
my $str = \@ARGV;
my $str = [ @ARGV ];
These both assign an array reference to the scalar variable $str. You can then get back the elements of @ARGV when you dereference $str. For example,
for my $i (@$str) {
print "$i\n";
}
(Some people prefer @{$str}, which does the same thing)
\ is the reference operator, which returns a reference to whatever is on its right hand side.
[...] creates a new array reference out of whatever is contained between the brackets.
"@array" is a stringify operation on an array, and equivalent to join($", @array)
And finally, a scalar assignment from an array, like
$n = @array
returns the number of elements in the array.

The total number of elements in an array is often needed. $#array gives the index of the last element:
@array = ("a".."z");
$re = $#array;
print ("$re\n");
Because indexes start at 0, we need to add one to that number to reach the total count.
@ay = ("a", "b", "c");
$re = $#ay;
$re = $re + 1;
print ("$re\n");
result: 3
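A short sketch comparing the last index with the count taken directly in scalar context, which avoids the manual +1. The variable names are illustrative:

```perl
use strict;
use warnings;

my @array = ('a' .. 'z');
my @empty;

my $last_index = $#array;          # index of the last element
my $count      = scalar @array;    # number of elements

print "$last_index $count\n";      # 25 26
print "$#empty\n";                 # -1 for an empty array
```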

Why doesn't this deref of a reference work as a one-liner?

Given:
my @list1 = ('a');
my @list2 = ('b');
my @list0 = ( \@list1, \@list2 );
then
my @listRef = $list0[1];
my @list = @$listRef; # works
but
my @list = @$($list0[1]); # gives an error message
I can't figure out why. What am I missing?

There is one simple de-referencing rule that covers this. Loosely put:
What follows the sigil need be the correct reference for it, or a block that evaluates to that.
A specific case from perlreftut
You can always use an array reference, in curly braces, in place of the name of an array.
In your case then, it should be
my @list = @{ $list0[1] };
(not index [2] since your @list0 has two elements). Spaces are there only for readability.
The attempted @$($list0[2]) is a syntax error, first because the ( (following the $) isn't allowed in an identifier (variable name), which is what presumably follows that $.
A block {} though would be allowed after the $ and would be evaluated, and must yield a scalar reference in this case, to be dereferenced by that $ in front of it; but then the first @ would be in error. That can then be fixed as well, but this gets messy if pushed, and wasn't meant to go that far. While the exact rules are (still) a little murky, see Identifier Parsing in perldata.
The @$listRef earlier is correct syntax in general. But it refers to a scalar variable $listRef (which must be an array reference since it's getting dereferenced into an array by the first @), and there is no such thing in the example -- you have an array variable @listRef.
So with use strict; in effect this, too, would fail to compile.
Dereferencing an arrayref to assign a new array is expensive as it has to copy all elements (and to construct the new array variable), while it's rarely needed (unless you actually want a copy). With the array reference on hand ($ar) all that one may need is readily available
@$ar; # list of elements
$ar->[$index]; # specific element
@$ar[@indices]; # slice -- list of some elements, like @$ar[0,2..5,-1]
$ar->@[0,-1]; # slice, with new "postfix dereferencing" (stable at v5.24)
$#$ar; # last index in the anonymous array referred by $ar
See Slices in perldata and Postfix reference slicing in perlref
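The forms above can be sketched against the question's data. @list2 is given a second element here purely so the output is easier to tell apart:

```perl
use strict;
use warnings;

my @list1 = ('a');
my @list2 = ('b', 'c');
my @list0 = ( \@list1, \@list2 );

my @copy       = @{ $list0[1] };    # block form, needed for an expression
my $ref        = $list0[1];         # keep the reference in a scalar
my @same       = @$ref;             # braces optional for a simple scalar
my $first      = $list0[1][0];      # arrow is implied between subscripts
my $last_index = $#{ $list0[1] };   # last index of the referenced array

print "@copy | @same | $first | $last_index\n";   # b c | b c | b | 1
```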

You need
@{ $list0[1] }
Whenever you can use the name of a variable, you can use a block that evaluates to a reference. That means the syntax for getting the elements of an array is
@NAME # If you have the name
@BLOCK # If you have a reference
That means that
my @array1 = 4..5;
my @array2 = @array1;
and
my $array1 = [ 4..5 ];
my @array2 = @{ $array1 };
are equivalent.
When the only thing in the block is a simple scalar ($NAME or $BLOCK), you can omit the curlies. That means that
@{ $array1 }
is equivalent to
@$array1
That's why @$listRef works, and it's why @{ $list0[1] } can't be simplified.
See Perl Dereferencing Syntax.

You have a lot going on there and multiple levels of inadvertent references, so let's go through it:
First, you start by making a list of two items, each of which is an array reference. You store that in an array:
my @list0 = ( \@list2, \@list2 );
Then you ask for the item with index 2, which is a single item, and store that in an array:
my @listRef = $list0[2];
However, there is no item with index 2 because Perl indexes from zero. The value in @listRef is undefined. Not only that, but you've asked for a single item and stored it in an array instead of a scalar. That's probably not what you meant.
You say this following line works, but I don't think you know that because it won't give you the value you were expecting even if you didn't get an error. Something else is happening. You haven't declared or used a variable $listRef, so Perl creates it for you and gives it the value undef. When you try to dereference it, Perl uses "autovivification" to create the reference. This is the process where Perl helpfully creates a reference structure for you if you start with undef:
my @list = @$listRef; # works
There is nothing in that array so @list should be empty.
Fix that to get the last item, which has index of 1, and fix it so you are assigning the single value (the reference) to a scalar variable:
my $listRef = $list0[1];
Data::Dumper is handy here:
use Data::Dumper;
my @list2 = qw(a b c);
my @list0 = ( \@list2, \@list2 );
my $listRef = $list0[1];
print Dumper($listRef);
You get the output:
$VAR1 = [
'a',
'b',
'c'
];
Perl has some features that can catch these sorts of variable naming mistakes and will help you track down problems. Add these to the top of your program:
use strict;
use warnings;
For the rest, you might want to check out my book Intermediate Perl which explains all this reference stuff.
And, recent Perls have a new feature called postfix dereferencing that allows you to write dereferences from left to right:
my @items = ( \@list2, \@list2 );
my @items_of_last_ref = $items[1]->@*;

my @list = @$#listRef; # works
I doubt that works. That may not throw a syntax error but it sure as hell does not do what you think it does. For one,
my @list0 = ( \@list2, \@list2 );
defines an array with 2 elements and you access
my @listRef = $list0[2];
the third element. So @listRef is an array that contains one element which is undef. The following code doesn't make sense either.
Unless the question is purely academic (answered by zdim already), I assume you want the second element of @list0 in a separate array, so I would write
my @list = @{ $list0[1] };

The question is not complete and not clear on desired outcome.
OP tries to access an element $list0[2] of array @list0 which does not exist -- the array has elements with indexes 0 and 1.
Perhaps @listRef should be $listRef instead in the post.
Below is my vision of the described problem
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my @list1 = qw/word1 word2 word3 word4/;
my @list2 = 1000..1004;
my @list0 = (\@list1, \@list2);
my $ref_array = $list0[0];
map{ say } @{$ref_array};
$ref_array = $list0[1];
map{ say } @{$ref_array};
say "Element: " . ${$ref_array}[2];
output
word1
word2
word3
word4
1000
1001
1002
1003
1004
Element: 1002

How to get sub array?

I have the code below:
@a = ((1,2,3),("test","hello"));
print @a[1]
I was expecting it to print
testhello
But it gives me 2.
Sorry for the newbie question (Perl is a bit unnatural to me) but why does it happen and how can I get the result I want?

The way Perl constructs @a is such that it is equivalent to your writing,
@a = (1,2,3,"test","hello");
And that is why when you ask for the value at index 1 by writing @a[1] (really should be $a[1]), you get 2. To demonstrate this, if you were to do the following,
use strict;
use warnings;
my @a = ((1,2,3), ("test","hello"));
my @b = (1,2,3,"test","hello");
print "@a\n";
print "@b\n";
Both print the same line,
1 2 3 test hello
1 2 3 test hello
What you want is to create anonymous arrays within your array - something like this,
my @c = ([1,2,3], ["test","hello"]);
Then if you write the following,
use Data::Dumper;
print Dumper $c[1];
You will see this printed,
$VAR1 = [
'test',
'hello'
];
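Putting the flattened and nested versions next to each other makes the difference visible. This is a minimal sketch, and the names @flat and @nested are invented for it:

```perl
use strict;
use warnings;

my @flat   = ( (1, 2, 3), ("test", "hello") );   # inner parentheses vanish
my @nested = ( [1, 2, 3], ["test", "hello"] );   # two array references

print scalar(@flat), "\n";      # 5
print scalar(@nested), "\n";    # 2
print $flat[1], "\n";           # 2
print @{ $nested[1] }, "\n";    # testhello
print "@{ $nested[1] }\n";      # test hello
```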

Perl lists are one-dimensional only, which means (1,2,(3,4)) is automatically flattened to (1,2,3,4). If you want a multidimensional array, you must use references for any dimension beyond the first (which are stored in scalars).
You can get an anonymous array reference with bracket notation [1,2,3,4] or reference an existing array with a backslash: my $ref = \@somearray.
So a construct such as my $aref = [1,2,[3,4]] is an array reference in which the first element of the referenced array is 1, the second element is 2, and the third element is another array reference.
(I find when working with multidimensional arrays, that it's less confusing just to use references even for the first dimension, but my @array = (1,2,[3,4]) is fine too.)
By the way, when you stringify a perl reference, you get some gibberish indicating the type of reference and the memory location, like "ARRAY(0x7f977b02ac58)".
Dereference an array reference to an array with @, or get a specific element of the reference with ->.
Example:
my $ref = ['A','B',['C','D']];
print $ref; # prints ARRAY(0x001)
print join ',', @{$ref}; # prints A,B,ARRAY(0x002)
print join ',', @$ref; # prints A,B,ARRAY(0x002) (shortcut for above)
print $ref->[0]; # prints A
print $ref->[1]; # prints B
print $ref->[2]; # prints ARRAY(0x002)
print $ref->[2]->[0]; # prints C
print $ref->[2][0]; # prints C (shortcut for above)
print $ref->[2][1]; # prints D
print join ',', @{$ref->[2]}; # prints C,D

I think you're after an array of arrays. So, you need to create an array of array references by using square brackets, like this:
@a = ([1,2,3],["test","hello"]);
Then you can print the second array as follows:
print @{$a[1]};
Which will give you the output you were expecting: testhello

It's just a matter of wrong syntax:
print $a[1]

How can I create a multi-dimensional array in perl?

I was creating a multi-dimensional array this way:
#!/usr/bin/perl
use warnings;
use strict;
my @a1 = (1, 2);
my @a2 = (@a1, 3);
But it turns out that I still got a one-dimensional array...
What's the right way in Perl?

You get a one-dimensional array because the array @a1 is expanded inside the parentheses. So, assuming:
my @a1 = (1, 2);
my @a2 = (@a1, 3);
Then your second statement is equivalent to my @a2 = (1,2,3);.
When creating a multi-dimensional array, you have a few choices:
Direct assignment of each value
Referencing an existing array
Inserting a reference
The first option is basically $array[0][0] = 1; and is not very exciting.
The second is doing this: my @a2 = (\@a1, 3);. Note that this makes a reference to the namespace for the array @a1, so if you later change @a1, the values inside @a2 will also change. It is not always a recommended option.
A variation of the second option is doing this: my @a2 = ([1,2], 3);. The brackets will create an anonymous array, which has no namespace, only a memory address, and will only exist inside @a2.
The third option, a bit more obscure, is doing this: my $a1 = [1,2]; my @a2 = ($a1, 3);. It will do exactly the same thing as 2, only the array reference is already in a scalar variable, called $a1.
Note the difference between () and [] when assigning to arrays. Brackets [] create an anonymous array, which returns an array reference as a scalar value (for example, that can be held by $a1, or $a2[0]).
Parentheses, on the other hand, do nothing at all really, except change the precedence of operators.
Consider this piece of code:
my @a2 = 1, 2, 3;
print "@a2";
This will print 1. If you use warnings, you will also get a warning such as: Useless use of a constant in void context. Basically, this happens:
my @a2 = 1;
2, 3;
Because commas (,) have a lower precedence than the equal sign =. (See "Operator Precedence and Associativity" in perldoc perlop.)
Parentheses simply negate the default precedence of = and ,, and group 1,2,3 together in a list, which is then passed to @a2.
So, in short, brackets, [], have some magic in them: They create anonymous arrays. Parentheses, (), just change precedence, much like in math.
There is much to read in the documentation. Someone here once showed me a very good link for dereferencing, but I don't recall what it was. In perldoc perlreftut you will find a basic tutorial on references. And in perldoc perldsc you will find documentation on data structures (thanks Oesor for reminding me).
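The difference between storing a reference to a named array and storing an anonymous copy can be sketched as follows. The names @by_ref and @by_copy are made up for the example:

```perl
use strict;
use warnings;

my @a1 = (1, 2);
my @by_ref  = ( \@a1, 3 );       # shares storage with @a1
my @by_copy = ( [ @a1 ], 3 );    # anonymous copy of the current contents

push @a1, 99;                    # later change to the named array

print "@{ $by_ref[0] }\n";     # 1 2 99
print "@{ $by_copy[0] }\n";    # 1 2
```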

I would propose to work through perlreftut, perldsc and perllol, preferably in the same day and preferably using Data::Dumper to print data structures.
The tutorials complement each other and I think they would take better effect together. Visualizing data structures helped me a lot to believe they actually work (seriously) and to see my mistakes.

Arrays contain scalars, so you need to add a reference.
my @a1 = (1,2);
my @a2 = (\@a1, 3);
You'll want to read http://perldoc.perl.org/perldsc.html.

The most important thing to understand
about all data structures in
Perl--including multidimensional
arrays--is that even though they might
appear otherwise, Perl @ARRAY s and
%HASH es are all internally
one-dimensional. They can hold only
scalar values (meaning a string,
number, or a reference). They cannot
directly contain other arrays or
hashes, but instead contain references
to other arrays or hashes.
Now, because the top level contains only references, if you try to print out your array with a simple print() function, you'll get something that doesn't look very nice, like this:
@AoA = ( [2, 3], [4, 5, 7], [0] );
print $AoA[1][2];
7
print @AoA;
ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)
That's because Perl doesn't (ever) implicitly dereference your variables. If you want to get at the thing a reference is referring to, then you have to do this yourself using either prefix typing indicators, like ${$blah} , @{$blah} , @{$blah[$i]} , or else postfix pointer arrows, like $a->[3] , $h->{fred} , or even $ob->method()->[3]
Source: perldoc
Now coming to your question. Here's your code:
my @a1 = (1,2);
my @a2 = (@a1,3);
Notice that arrays contain scalar values. So you have to use a reference, and you can create one by putting the \ operator before the name of the array to be referenced.
Like this:
my @a2 = (\@a1, 3);

Inner arrays should be scalar references in the outer one:
my @a2 = (\@a1,3); # first element is a reference to a1
print ${$a2[0]}[1]; # print second element of inner array

This is a simple example of a 2D array as ref:
my $AoA = undef;
for(my $i=0; $i<3; $i++) {
for(my $j=0; $j<3; $j++) {
$AoA->[$i]->[$j] = rand(); # Assign some value
}
}
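Reading such a structure back works the same way in reverse. The sketch below fills only the diagonal (an arbitrary choice for the example) and prints each row, showing undefined cells as 0:

```perl
use strict;
use warnings;

my $AoA;                              # starts undefined
$AoA->[$_][$_] = 1 for 0 .. 2;        # autovivifies outer and inner arrays

# Each element of @$AoA is itself an array reference (one row).
for my $row (@$AoA) {
    print join(' ', map { $_ // 0 } @$row), "\n";
}
```

Note that the rows have different lengths here, because Perl only creates as many inner elements as the highest index assigned in each row.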

= and , operators in Perl

Please explain this apparently inconsistent behaviour:
$a = b, c;
print $a; # this prints: b
$a = (b, c);
print $a; # this prints: c

The = operator has higher precedence than ,.
And the comma operator throws away its left argument and returns the right one.
Note that the comma operator behaves differently depending on context. From perldoc perlop:
Binary "," is the comma operator. In
scalar context it evaluates its left
argument, throws that value away, then
evaluates its right argument and
returns that value. This is just like
C's comma operator.
In list context, it's just the list
argument separator, and inserts both
its arguments into the list. These
arguments are also evaluated from left
to right.
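The cases described in that passage can be tried in one short script. This is only a sketch: it uses quoted strings instead of barewords so that it runs under strict, and it switches off the 'void' warning category so the demonstration stays quiet:

```perl
use strict;
use warnings;
no warnings 'void';   # silence "Useless use of a constant" for this demo

my $x;
$x = 'b', 'c';            # parsed as ($x = 'b'), 'c'
my $y = ('b', 'c');       # comma operator in scalar context yields 'c'
my ($z) = ('b', 'c');     # list assignment: first element goes to $z
my $n = () = ('b', 'c');  # number of elements on the right-hand side

print "$x $y $z $n\n";    # b c b 2
```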

As eugene's answer seems to leave the OP with some questions, I'll try to explain based on that:
$a = "b", "c";
print $a;
Here the left argument is $a = "b"; because = has a higher precedence than , it will be evaluated first. After that $a contains "b".
The right argument is "c" and will be returned, as I show soon.
At that point when you print $a it is obviously printing b to your screen.
$a = ("b", "c");
print $a;
Here the term ("b","c") will be evaluated first because of the higher precedence of parentheses. It returns "c" and this will be assigned to $a.
So here you print "c".
$var = ($a = "b","c");
print $var;
print $a;
Here $a contains "b" and $var contains "c".
Once you get the precedence rules, this is perfectly consistent.

Since eugene and mugen have answered this question nicely with good examples already, I am going to set up some concepts then ask some conceptual questions of the OP to see if it helps to illuminate some Perl concepts.
The first concept is what the sigils $ and @ mean (we won't discuss % here). @ means multiple items (said "these things"). $ means one item (said "this thing"). To get the first element of an array @a you can do $first = $a[0], get the last element: $last = $a[-1]. N.B. not @a[0] or @a[-1]. You can slice by doing @shorter = @longer[1,2].
The second concept is the difference between void, scalar and list context. Perl has the concept of the context in which your containers (scalars, arrays etc.) are used. An easy way to see this is that if you store a list (we will get to this) as an array @array = ("cow", "sheep", "llama") then we store the array as a scalar $size = @array we get the length of the array. We can also force this behavior by using the scalar operator such as print scalar @array. I will say it one more time for clarity: An array (not a list) in scalar context will return, not an element (as a list does) but rather the length of the array.
Remember from before you use the $ sigil when you only expect one item, i.e. $first = $a[0]. In this way you know you are in scalar context. Now when you call $length = @array you can see clearly that you are calling the array in scalar context, and thus you trigger the special property of an array in scalar context: you get its length.
This has another nice feature for testing if there are elements in the array. print '@array contains items' if @array; print '@array is empty' unless @array. The if/unless tests force scalar context on the array, thus the if sees the length of the array not elements of it. Since all numerical values are 'truthy' except zero, if the array has non-zero length, the statement if @array evaluates to true and you get the print statement.
Void context means that the return value of some operation is ignored. A useful operation in void context could be something like incrementing. $n = 1; $n++; print $n; In this example $n++ (increment after returning) was in void context in that its return value "1" wasn't used (stored, printed etc).
The third concept is the difference between a list and an array. A list is an ordered set of values, an array is a container that holds an ordered set of values. You can see the difference for example in the gymnastics one must do to get particular element after using sort without storing the result first (try pop sort { $a cmp $b } @array for example, which doesn't work because pop does not act on a list, only an array).
Now we can ask, when you attempt your examples, what would you want Perl to do in these cases? As others have said, this depends on precedence.
In your first example, since the = operator has higher precedence than the ,, you haven't actually assigned a list to the variable, you have done something more like ($a = "b"), ("c") which effectively does nothing with the string "c". In fact it was called in void context. With warnings enabled, since this operation does not accomplish anything, Perl attempts to warn you that you probably didn't mean to do that with the message: Useless use of a constant in void context.
Now, what would you want Perl to do when you attempt to store a list to a scalar (or use a list in a scalar context)? It will not store the length of the list, this is only a behavior of an array. Therefore it must store one of the values in the list. While I know it is not canonically true, this example is very close to what happens.
my @animals = ("cow", "sheep", "llama");
my $return;
foreach my $animal (@animals) {
$return = $animal;
}
print $return;
And therefore you get the last element of the list (the canonical difference is that the preceding values were never stored then overwritten, however the logic is similar).
There are ways to store something that looks like a list in a scalar, but this involves references. Read more about that in perldoc perlreftut.
Hopefully this makes things a little more clear. Finally I will say, until you get the hang of Perl's precedence rules, it never hurts to put in explicit parentheses for lists and function's arguments.
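As a small sketch of the list-versus-array point made earlier in this answer: a list slice can pick an element straight from the result of sort, whereas pop needs a real array to work on. The variable names are illustrative:

```perl
use strict;
use warnings;

my @animals = ("cow", "sheep", "llama");

my $count = @animals;                               # array in scalar context
my $last  = ( sort { $a cmp $b } @animals )[-1];    # list slice on sort's result

my @sorted = sort { $a cmp $b } @animals;           # or store it first...
my $popped = pop @sorted;                           # ...then pop the array

print "$count $last $popped\n";   # 3 sheep sheep
```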

There is an easy way to see how Perl handles both of the examples, just run them through with:
perl -MO=Deparse,-p -e'...'
As you can see, the difference is because the order of operations is slightly different than you might suspect.
perl -MO=Deparse,-p -e'$a = a, b;print $a'
(($a = 'a'), '???');
print($a);
perl -MO=Deparse,-p -e'$a = (a, b);print $a'
($a = ('???', 'b'));
print($a);
Note: you see '???', because the original value got optimized away.
