How do I get a slice from an array reference? - perl

Let us say that we have following array:
my @arr=('Jan','Feb','Mar','Apr');
my @arr2=@arr[0..2];
How can we do the same thing if we have array reference like below:
my $arr_ref=['Jan','Feb','Mar','Apr'];
my $arr_ref2; # How can we do something similar to @arr[0..2]; using $arr_ref ?

To get a slice starting with an array reference, replace the array name with a block containing the array reference. I've used whitespace to spread out the parts, but it's still the same thing:
my @slice = @ array [1,3,2];
my @slice = @ { $aref } [1,3,2];
If the reference inside the block is a simple scalar (so, not an array or hash element or a lot of code), you can leave off the braces:
my @slice = @$aref[1,3,2];
Then, if you want a reference from that, you can use the anonymous array constructor:
my $slice_ref = [ @$aref[1,3,2] ];
With the postfix dereference feature (experimental in v5.20, stable since v5.24):
use v5.20;
use feature qw(postderef);
no warnings qw(experimental::postderef);
my @slice = $aref->@[1,3,2];
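As a minimal sketch of the slicing forms described above, the following uses the month list from the question. The variable names @first_three and @picked are illustrative only.

```perl
use strict;
use warnings;

my $aref = [ 'Jan', 'Feb', 'Mar', 'Apr' ];

# Slice with the block form: @{ EXPR }[ LIST ]
my @first_three = @{ $aref }[ 0 .. 2 ];

# Same idea with the braces omitted, allowed because $aref is a simple scalar
my @picked = @$aref[ 1, 3, 2 ];

# Wrap the slice in [ ] to get a new array reference
my $slice_ref = [ @$aref[ 0 .. 2 ] ];

print "@first_three\n";            # Jan Feb Mar
print "@picked\n";                 # Feb Apr Mar
print scalar(@$slice_ref), "\n";   # 3
```

The slice copies the selected elements, so changing $slice_ref afterwards does not touch the array behind $aref.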

Just slice the reference (the syntax is similar to dereferencing it, see the comments), and then turn the resulting list back into a ref:
my $arr_ref2=[@$arr_ref[0..2]];

my $arr_ref2 = [ @$arr_ref[0..2] ];

Related

What are the advantages of anonymous arrays in Perl?

What are the advantages of anonymous array in Perl?
Array references (anonymous arrays are one kind) allow Perl to treat an array as a single item. This lets you build complex, nested data structures as well. The same applies to hash, code, and all the other reference types too, but here I'll show only the array reference. You can read Intermediate Perl for much more.
For the immediate, literal question, consider that there are two ways to make an array: with a named variable, and without a named variable ("anonymous"):
my @named_array = ( 1, 3, 7 );
[ 1, 3, 7 ];
The first line takes a list and stores it in a named array variable. You are probably used to seeing that everywhere.
That second line, [ 1, 3, 7 ], doesn't do anything. It's just a value.
But, consider this analogue, where you store a scalar value in a scalar variable (overloaded use of "scalar" there), and the next line that is just the value:
my $number = 6;
6;
Now here's the trick. You know that you can pass a scalar variable to a subroutine. You could write that as this:
my $number = 6;
some_sub( $number );
But why bother with the variable at all if that's the only use of it? Get rid of it altogether and pass the value directly:
some_sub( 6 );
It's the same thing with anonymous references. You can make the named version first and take a reference to it:
my @array = ( 1, 3, 7 );
some_sub( \@array );
But just like the scalar example, you don't need to clutter your code with a named array if it's only there so you can get a reference to it. Just make the reference directly:
some_sub( [ 1, 3, 7 ] );
But there's more to the story, and you have to know a little about how Perl works to understand it.
Why references at all?
Perl is mostly built around scalars (single items) and lists (multiple items). A scalar variable holds a scalar value, and an array variable holds a list (see What’s the difference between a list and an array?).
There are many features where you can use only a scalar, including single list elements, hash keys, and hash values:
$array[$i] = $single_item;
$hash{$single_item} = $other_single_item;
Other places always take a list, such as the argument list to a subroutine:
sub some_sub {
    my @args = @_;
    ...
}
Even if you call some_sub with two arrays, you end up with a single list stored in @_. You can't tell where @array_1 stopped and @array_2 started. This is all one list whose size is the combined sizes of the two arrays:
some_sub( @array_1, @array_2 );
References are a way to treat something as a single item. When you get that single item, you dereference it to get back to the original.
This means that you can store an array reference as a hash value:
$hash{$key} = \@some_array; # ref to named variable
$hash{$key} = [ 1, 3, 7 ]; # anonymous array directly
Or, you can create a list where each item is an array reference rather than the single, "flat" list you saw before:
my @Array_of_Arrays = ( \@array_1, \@array_2 , [1,3,7], ... );
my $eighth_of_tenth = $Array_of_Arrays[9][7]; # indexes start at 0
The Perl Data Structures Cookbook (perldsc) has many examples of different sorts of complex data structures which you build with references.
You can pass references to subroutines so the array elements don't mix. This argument list is exactly two elements, and inside the subroutine you know which array you are dealing with:
some_sub( \@array_1, \@array_2 );
If you were curious about another aspect of this, you can update your question.
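To make the point about passing references concrete, here is a small sketch. The subroutine sizes is a hypothetical helper invented for this example; it reports how many elements each array holds.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the element count of each array it is given.
sub sizes {
    my ( $first, $second ) = @_;    # two scalars, each an array reference
    return ( scalar(@$first), scalar(@$second) );
}

my @array_1 = ( 1, 3, 7 );
my @array_2 = ( 'a', 'b' );

# Passing references: @_ holds exactly two items, so the boundary survives.
my ( $n1, $n2 ) = sizes( \@array_1, \@array_2 );
print "$n1 $n2\n";                  # 3 2

# An anonymous array works the same way, with no named variable needed.
my ( $n3, $n4 ) = sizes( [ 4 .. 9 ], \@array_2 );
print "$n3 $n4\n";                  # 6 2
```

Had the call been sizes( @array_1, @array_2 ), the subroutine would have received five plain scalars and could not tell the two arrays apart.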

Why doesn't this deref of a reference work as a one-liner?

Given:
my @list1 = ('a');
my @list2 = ('b');
my @list0 = ( \@list1, \@list2 );
then
my @listRef = $list0[1];
my @list = @$listRef; # works
but
my @list = @$($list0[1]); # gives an error message
I can't figure out why. What am I missing?
There is one simple de-referencing rule that covers this. Loosely put:
What follows the sigil needs to be the correct reference for it, or a block that evaluates to that.
A specific case from perlreftut
You can always use an array reference, in curly braces, in place of the name of an array.
In your case then, it should be
my @list = @{ $list0[1] };
(not index [2], since your @list0 has only two elements). The spaces are there only for readability.
The attempted @$($list0[2]) is a syntax error, first because the ( (following the $) isn't allowed in an identifier (variable name), which is what presumably follows that $.
A block {} though would be allowed after the $ and would be evaluated, and must yield a scalar reference in this case, to be dereferenced by that $ in front of it; but then the first @ would be in error. That can then be fixed as well, but this gets messy if pushed, and wasn't meant to go that far. While the exact rules are (still) a little murky, see Identifier Parsing in perldata.
The @$listRef earlier is correct syntax in general. But it refers to a scalar variable $listRef (which must be an array reference since it's getting dereferenced into an array by the first @), and there is no such thing in the example -- you have an array variable @listRef.
So with use strict; in effect this, too, would fail to compile.
Dereferencing an arrayref to assign a new array is expensive as it has to copy all elements (and to construct the new array variable), while it's rarely needed (unless you actually want a copy). With the array reference on hand ($ar) all that one may need is readily available
@$ar; # list of elements
$ar->[$index]; # specific element
@$ar[@indices]; # slice -- list of some elements, like @$ar[0,2..5,-1]
$ar->@[0,-1]; # slice, with new "postfix dereferencing" (stable at v5.24)
$#$ar; # last index in the anonymous array referred by $ar
See Slices in perldata and Postfix reference slicing in perlref
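As a rough illustration of working through the reference instead of copying the array first, the sketch below uses sample values chosen only for this example:

```perl
use strict;
use warnings;

my $ar = [ 10, 20, 30, 40, 50, 60 ];

my $count   = @$ar;                    # scalar context: number of elements
my $last_ix = $#$ar;                   # last index
my $third   = $ar->[2];                # single element
my @some    = @$ar[ 0, 2 .. 3, -1 ];   # slice: a copy of selected elements

# Modifying through the reference changes the one underlying array.
$ar->[0] = 11;

print "$count $last_ix $third\n";      # 6 5 30
print "@some\n";                       # 10 30 40 60
print "$ar->[0]\n";                    # 11
```

Only the slice makes a copy, which is why @some still holds 10 after the first element is changed through the reference.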
You need
@{ $list0[1] }
Whenever you can use the name of a variable, you can use a block that evaluates to a reference. That means the syntax for getting the elements of an array is
@NAME # If you have the name
@BLOCK # If you have a reference
That means that
my @array1 = 4..5;
my @array2 = @array1;
and
my $array1 = [ 4..5 ];
my @array2 = @{ $array1 };
are equivalent.
When the only thing in the block is a simple scalar ($NAME or $BLOCK), you can omit the curlies. That means that
@{ $array1 }
is equivalent to
@$array1
That's why @$listRef works, and it's why @{ $list0[1] } can't be simplified.
See Perl Dereferencing Syntax.
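The rule can be checked with the arrays from the question. This is only a sketch; the names @from_block, $list_ref and @from_scalar are made up for the example.

```perl
use strict;
use warnings;

my @list1 = ('a');
my @list2 = ('b');
my @list0 = ( \@list1, \@list2 );

# An array element is not a simple scalar variable, so the braces are required.
my @from_block = @{ $list0[1] };

# Copy the reference into a scalar first, and the braces become optional.
my $list_ref    = $list0[1];
my @from_scalar = @$list_ref;

print "@from_block @from_scalar\n";   # b b
```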
You have a lot going on there and multiple levels of inadvertent references, so let's go through it:
First, you start by making a list of two items, each of which is an array reference. You store that in an array:
my @list0 = ( \@list2, \@list2 );
Then you ask for the item with index 2, which is a single item, and store that in an array:
my @listRef = $list0[2];
However, there is no item with index 2 because Perl indexes from zero. The value in @listRef is undefined. Not only that, but you've asked for a single item and stored it in an array instead of a scalar. That's probably not what you meant.
You say this following line works, but I don't think you know that because it won't give you the value you were expecting even if you didn't get an error. Something else is happening. You haven't declared or used a variable $listRef, so Perl creates it for you and gives it the value undef. When you try to dereference it, Perl uses "autovivification" to create the reference. This is the process where Perl helpfully creates a reference structure for you if you start with undef:
my @list = @$listRef; # works
There is nothing in that array so @list should be empty.
Fix that to get the last item, which has index of 1, and fix it so you are assigning the single value (the reference) to a scalar variable:
my $listRef = $list0[1];
Data::Dumper is handy here:
use Data::Dumper;
my @list2 = qw(a b c);
my @list0 = ( \@list2, \@list2 );
my $listRef = $list0[1];
print Dumper($listRef);
You get the output:
$VAR1 = [
'a',
'b',
'c'
];
Perl has some features that can catch these sorts of variable naming mistakes and will help you track down problems. Add these to the top of your program:
use strict;
use warnings;
For the rest, you might want to check out my book Intermediate Perl which explains all this reference stuff.
And, recent Perls have a new feature called postfix dereferencing that allows you to write dereferences from left to right:
my @items = ( \@list2, \@list2 );
my @items_of_last_ref = $items[1]->@*;
my @list = @$#listRef; # works
I doubt that works. That may not throw a syntax error, but it sure as hell does not do what you think it does. For one thing,
my @list0 = ( \@list2, \@list2 );
defines an array with 2 elements and you access
my @listRef = $list0[2];
the third element. So @listRef is an array that contains one element, which is undef. The following code doesn't make sense either.
Unless the question is purely academic (answered by zdim already), I assume you want the second element of @list0 in a separate array. I would write
my @list = @{ $list0[1] };
The question is not complete and not clear on desired outcome.
OP tries to access an element $list0[2] of array @list0 which does not exist -- the array has elements with indexes 0 and 1.
Perhaps @listRef should be $listRef instead in the post.
Below is my take on the problem as described:
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
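my @list1 = qw/word1 word2 word3 word4/;
my @list2 = 1000..1004;
my @list0 = (\@list1, \@list2);
my $ref_array = $list0[0];
say for @{$ref_array};    # print each element of the first array
$ref_array = $list0[1];
say for @{$ref_array};    # print each element of the second array
say "Element: " . $ref_array->[2];    # single element, so use -> rather than a slice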
output
word1
word2
word3
word4
1000
1001
1002
1003
1004
Element: 1002

Perl: Calling a module method inside [ ]

Using an example, let a perl program start in the following fashion:
use strict;
use warnings;
use Time::HiRes;
What's the difference between
my $request_start_epoch = [Time::HiRes::gettimeofday];
and
my $request_start_epoch = Time::HiRes::gettimeofday;
?
The former calls the function in list context, assembles an anonymous array containing the elements of the returned list, and sets $request_start_epoch to a reference to that array.
The latter calls the function in scalar context and stores its return-value in $request_start_epoch.
These will almost always be different; the only time they would be the same is if the function's behavior in scalar context is to wrap up its list-context results in an anonymous array and return a reference to it. I've never seen any method written like that, but I'm sure someone somewhere has done it at some point!
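The difference between the two calls can be seen in a short sketch. It relies on the documented behaviour of Time::HiRes: gettimeofday returns seconds and microseconds in list context and floating-point seconds in scalar context. The variable names are illustrative.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# List context inside [ ]: a reference to a two-element array
# holding (seconds, microseconds).
my $start_ref = [ gettimeofday ];

# Scalar context: a single floating-point number of seconds.
my $start_float = gettimeofday;

print ref($start_ref), "\n";           # ARRAY
print scalar(@$start_ref), "\n";       # 2
print ref( \$start_float ), "\n";      # SCALAR

# tv_interval expects the array-reference form and returns elapsed seconds.
my $elapsed = tv_interval($start_ref);
printf "%.6f\n", $elapsed;
```

The array-reference form is the one to keep if the start time will later be handed to tv_interval.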
The brackets [] convert what is returned by gettimeofday to an array reference. In list context gettimeofday returns the seconds and microseconds, so in your case it would be a two-element array.
Creating an array reference.
$arr_ref = [ 1,2,3,4,5 ];
Dereferencing it.
@{ $arr_ref };
Accessing an element.
${ $arr_ref }[0]

How can I create a multi-dimensional array in perl?

I was creating a multi-dimensional array this way:
#!/usr/bin/perl
use warnings;
use strict;
my @a1 = (1, 2);
my @a2 = (@a1, 3);
But it turns out that I still got a one-dimensional array...
What's the right way in Perl?
You get a one-dimensional array because the array @a1 is expanded inside the parentheses. So, assuming:
my @a1 = (1, 2);
my @a2 = (@a1, 3);
Then your second statement is equivalent to my @a2 = (1,2,3);.
When creating a multi-dimensional array, you have a few choices:
1. Direct assignment of each value
2. Referencing an existing array
3. Inserting a reference
The first option is basically $array[0][0] = 1; and is not very exciting.
The second is doing this: my @a2 = (\@a1, 3);. Note that this stores a reference to the named array @a1, so if you later change @a1, the values seen through @a2 will also change. It is not always a recommended option.
A variation of the second option is doing this: my @a2 = ([1,2], 3);. The brackets create an anonymous array, which has no name, only a memory address, and exists only inside @a2.
The third option, a bit more obscure, is doing this: my $a1 = [1,2]; my @a2 = ($a1, 3);. It will do exactly the same thing as 2, only the array reference is already in a scalar variable, called $a1.
Note the difference between () and [] when assigning to arrays. Brackets [] create an anonymous array, which returns an array reference as a scalar value (for example, that can be held by $a1, or $a2[0]).
Parentheses, on the other hand, do nothing at all really, except change the precedence of operators.
Consider this piece of code:
my @a2 = 1, 2, 3;
print "@a2";
This will print 1. If you use warnings, you will also get a warning such as: Useless use of a constant in void context. Basically, this happens:
my @a2 = 1;
2, 3;
This is because the comma (,) has lower precedence than the assignment operator =. (See "Operator Precedence and Associativity" in perldoc perlop.)
Parentheses simply override the default precedence of = and , and group 1,2,3 together in a list, which is then assigned to @a2.
So, in short, brackets, [], have some magic in them: They create anonymous arrays. Parentheses, (), just change precedence, much like in math.
There is much to read in the documentation. Someone here once showed me a very good link for dereferencing, but I don't recall what it was. In perldoc perlreftut you will find a basic tutorial on references. And in perldoc perldsc you will find documentation on data structures (thanks Oesor for reminding me).
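The options above behave differently once the named array changes. The sketch below, with made-up names @flat, @by_ref and @by_copy, shows flattening, storing a reference to the named array, and storing an anonymous copy:

```perl
use strict;
use warnings;

my @a1 = ( 1, 2 );

my @flat    = ( @a1, 3 );        # @a1 is flattened: three scalars
my @by_ref  = ( \@a1, 3 );       # first element is a reference to @a1
my @by_copy = ( [ @a1 ], 3 );    # first element is a new anonymous array

$a1[0] = 99;                     # change the named array afterwards

print scalar(@flat), "\n";       # 3
print scalar(@by_ref), "\n";     # 2
print "$by_ref[0][0]\n";         # 99 (shares storage with @a1)
print "$by_copy[0][0]\n";        # 1  (independent copy)
```

Copying with [ @a1 ] is the usual choice when later changes to @a1 should not show up in the nested structure.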
I would propose to work through perlreftut, perldsc and perllol, preferably in the same day and preferably using Data::Dumper to print data structures.
The tutorials complement each other and I think they would take better effect together. Visualizing data structures helped me a lot to believe they actually work (seriously) and to see my mistakes.
Arrays contain scalars, so you need to add a reference.
my @a1 = (1,2);
my @a2 = (\@a1, 3);
You'll want to read http://perldoc.perl.org/perldsc.html.
The most important thing to understand about all data structures in Perl--including multidimensional arrays--is that even though they might appear otherwise, Perl @ARRAYs and %HASHes are all internally one-dimensional. They can hold only scalar values (meaning a string, number, or a reference). They cannot directly contain other arrays or hashes, but instead contain references to other arrays or hashes.
Now, because the top level contains only references, if you try to print out your array with a simple print() function, you'll get something that doesn't look very nice, like this:
@AoA = ( [2, 3], [4, 5, 7], [0] );
print $AoA[1][2];
7
print @AoA;
ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)
That's because Perl doesn't (ever) implicitly dereference your variables. If you want to get at the thing a reference is referring to, then you have to do this yourself using either prefix typing indicators, like ${$blah} , @{$blah} , @{$blah[$i]} , or else postfix pointer arrows, like $a->[3] , $h->{fred} , or even $ob->method()->[3].
Source: perldoc
Now coming to your question. Here's your code:
my @a1 = (1,2);
my @a2 = (@a1,3);
Notice that arrays contain scalar values. So you have to use a reference, and you can make one by putting the \ operator before the name of the array to be referenced.
Like this:
my @a2 = (\@a1, 3);
Inner arrays should be scalar references in the outer one:
my @a2 = (\@a1,3); # first element is a reference to a1
print ${$a2[0]}[1]; # print second element of inner array
This is a simple example of a 2D array as ref:
my $AoA = undef;
for(my $i=0; $i<3; $i++) {
    for(my $j=0; $j<3; $j++) {
        $AoA->[$i]->[$j] = rand(); # Assign some value
    }
}
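Reading such a structure back works the same way in reverse. The following sketch fills a 3x3 grid with predictable values (the formula i*3+j is an arbitrary choice for the example, instead of rand()) and then walks it row by row:

```perl
use strict;
use warnings;

my $AoA;
for my $i ( 0 .. 2 ) {
    for my $j ( 0 .. 2 ) {
        # Autovivification creates the outer and inner arrays on demand.
        $AoA->[$i][$j] = $i * 3 + $j;
    }
}

# Each row is itself an array reference, so dereference it to print it.
for my $row (@$AoA) {
    print "@$row\n";
}
# prints:
# 0 1 2
# 3 4 5
# 6 7 8

my $rows = @$AoA;               # number of rows
my $cols = @{ $AoA->[0] };      # number of columns in the first row
print "$rows x $cols\n";        # 3 x 3
```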

How do I determine the number of elements in an array reference?

Here is the situation I am facing...
$perl_scalar = decode_json( encode ('utf8',$line));
decode_json returns a reference. I am sure this is an array. How do I find the size of $perl_scalar? As per the Perl documentation, arrays are referenced using @name. Is there a workaround?
This reference consists of an array of hashes. I would like to get the number of hashes.
If I do length($perl_scalar), I get some number which does not match the number of elements in array.
That would be:
scalar(@{$perl_scalar});
You can get more information from perlreftut.
You can copy your referenced array to a normal one like this:
my @array = @{$perl_scalar};
But before that you should check whether the $perl_scalar is really referencing an array, with ref:
if (ref($perl_scalar) eq "ARRAY") {
    my @array = @{$perl_scalar};
# ...
}
The length function cannot be used to calculate the length of an array. It's for getting the length of a string.
You can also use the last index of the array to calculate the number of elements in the array.
my $length = $#{$perl_scalar} + 1;
$num_of_hashes = @{$perl_scalar};
Since you're assigning to a scalar, the dereferenced array is evaluated in a scalar context to the number of elements.
If you need to force scalar context then do as KARASZI says and use the scalar function.
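The counting approaches mentioned so far can be compared side by side. This sketch assumes JSON::PP (a core module) as the decoder and a made-up sample string in $line; the question does not say which JSON module or input was used.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Assumed sample input: a JSON array of three objects.
my $line        = '[{"id":1},{"id":2},{"id":3}]';
my $perl_scalar = decode_json($line);

die "expected an array\n" unless ref($perl_scalar) eq 'ARRAY';

my $count      = @$perl_scalar;             # scalar assignment gives the count
my $also_count = scalar( @{$perl_scalar} ); # explicit scalar context
my $from_index = $#{$perl_scalar} + 1;      # last index plus one

print "$count $also_count $from_index\n";   # 3 3 3

# length() stringifies the reference (something like "ARRAY(0x...)"),
# so it reports the length of that string, not the element count.
print length($perl_scalar), "\n";
```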
You can see the entire structure with Data::Dumper:
use Data::Dumper;
print Dumper $perl_scalar;
Data::Dumper is a standard module that is installed with Perl. For a complete list of all the standard pragmas and modules, see perldoc perlmodlib.