Perl Hash Slice, Replication x Operator, and sub params

Ok, I understand Perl hash slices and the "x" operator, but can someone explain the following code example (slightly simplified)?
sub test{
    my %hash;
    @hash{@_} = (undef) x @_;
}
Example Call to sub:
test('one', 'two', 'three');
This line is what throws me:
@hash{@_} = (undef) x @_;
It is creating a hash where the keys are the parameters to the sub and initializing to undef, so:
%hash:
'one' => undef,
'two' => undef,
'three' => undef
The right-hand operand of the x operator should be a number; how is it that @_ is interpreted as the length of the sub's parameter array? I would expect you'd at least have to do this:
@hash{@_} = (undef) x scalar @_;

To figure out this code you need to understand three things:
The repetition operator. The x operator is the repetition operator. In list context, if the operator's left-hand argument is enclosed in parentheses, it will repeat the items in a list:
my @x = ('foo') x 3; # ('foo', 'foo', 'foo')
Arrays in scalar context. When an array is used in scalar context, it returns its size. The x operator imposes scalar context on its right-hand argument.
my @y = (7,8,9);
my $n = 10 * @y; # $n is 30
Hash slices. The hash slice syntax provides a way to access multiple hash items at once. A hash slice can retrieve hash values, or it can be assigned to. In the case at hand, we are assigning to a hash slice.
# Right side creates a list of repeated undef values -- the size of @_.
# We assign that list to a set of hash keys -- also provided by @_.
@hash{@_} = (undef) x @_;
Less obscure ways to do the same thing:
@hash{@_} = ();
$hash{$_} = undef for @_;
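The three pieces above can be put together in a small runnable sketch. The sub name keys_from_args and the sample arguments are made up for illustration; only the slice assignment itself comes from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash whose keys are the arguments
# and whose values are all undef.
sub keys_from_args {
    my %hash;
    # Left side: hash slice keyed by @_ (list context).
    # Right side: x puts @_ in scalar context, so it is the argument count.
    @hash{@_} = (undef) x @_;
    return \%hash;
}

my $h = keys_from_args('one', 'two', 'three');

print join(',', sort keys %$h), "\n";              # one,three,two
print scalar(grep { defined } values %$h), "\n";   # 0 (every value is undef)
```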

In scalar context, an array evaluates to its length. From perldoc perldata:
If you evaluate an array in scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value, like the C comma operator, nor of built-in functions, which return whatever they feel like returning.)
Although I cannot find more information on it currently, it seems that the replication operator evaluates its second argument in scalar context, causing the array to evaluate to its length.
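A quick way to check this for yourself is to compare the bare array with an explicit scalar on the right-hand side of x. This is only a sketch with made-up data; it shows the observed behaviour rather than quoting the documentation.

```perl
use strict;
use warnings;

my @args = ('a', 'b', 'c');   # hypothetical data

my @with_array    = ('-') x @args;          # @args is evaluated in scalar context
my @with_explicit = ('-') x scalar @args;   # the same thing, spelled out

print scalar(@with_array), "\n";      # 3
print scalar(@with_explicit), "\n";   # 3
```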

Related

Why does perl only dereference the last index when the range operator is used?

I have an array, @array, of array references. If I use the range operator to print elements 1 through 3 of @array, print @array[1..3], perl prints the array references for elements 1 through 3.
Why, when I try to dereference the array references indexed between 1 and 3, @{@array[1..3]}, does perl only dereference and print the last element indexed by the range operator?
Is there a way to use the range operator while dereferencing an array?
Example Code
#!/bin/perl
use strict;
use warnings;
my @array = ();
foreach my $i (0..10) {
    push @array, [rand(1000), int(rand(3))];
}
foreach my $i (@array) {
    print "@$i\n";
}
print "\n\n================\n\n";
print @{@array[1..3]};
print "\n\n================\n\n";
From perldata:
Slices in scalar context return the last item of the slice.
@{ ... } dereferences a scalar value as an array; this implies that the value being dereferenced is in scalar context. From the perldata quote above we know that a slice in scalar context returns its last element. Therefore the result is the last element, dereferenced.
A more reasonable approach would be to loop through your slice and print each individual array reference:
use strict;
use warnings;
use feature qw(say);
my @array_of_arrayrefs = (
    [qw(1 2 3)],
    [qw(4 5 6)],
    [qw(7 8 9)],
    [qw(a b c)],
);
foreach my $aref ( @array_of_arrayrefs[1..3] ) {
    say join ',', @$aref;
}
@{@array[1..3]} is a strange-looking construct. @{ ... } is the array dereference operator. It needs a reference, which is a type of scalar. But @array[ ... ] produces a list.
This is one of those situations where you need to remember the rule for list evaluation in scalar context. The rule is that there is no general rule. Each list-producing operator does its own thing. In this case, apparently the array slice operator used in scalar context returns the last element of the list. @array[1..3] in scalar context is the same as $array[3].
As you have noticed, this is not useful. Array slices aren't meant to be used in scalar context.
If you want to flatten a 2-dimensional nested array structure into a 1-dimensional list, use map:
print join ' ', map { @$_ } @array[1..3];
You still use the range operator for slicing. You just need some kind of looping construct (e.g. map) to apply the array dereference operator separately to each element of the outer array.
The @{ ... } construction dereferences the scalar value of the code within the braces as an array.
I'm unclear what you expect from @{ @array[1..3] }, but the list @array[1..3] in scalar context returns just the last element of the list -- $array[3] -- so you are asking for @{ $array[3] }, which I guess is what you got.
If you explain what you want to print then I am sure we can help, but dereferencing a list makes little sense.
@array[1..3] is a list of 3 array references. You can't dereference them all at once, so you should iterate over this list and dereference each element separately:
print @$_ for @array[1..3];
print "@$_\n" for @array[1..3]; # for better looking output
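The answers above can be compared side by side in a short sketch. The data is made up so the output is predictable; the variable names $last_ref and @flat are just for illustration.

```perl
use strict;
use warnings;

my @array = ([1, 2], [3, 4], [5, 6], [7, 8]);   # hypothetical data

# A slice in scalar context yields its last element, here $array[3].
my $last_ref = @array[1..3];

# Dereference each element of the slice separately to get everything.
my @flat = map { @$_ } @array[1..3];

print "@$last_ref\n";   # 7 8
print "@flat\n";        # 3 4 5 6 7 8
```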

Returning lists in Perl causes different results

use feature qw(say);
my @names = ( (), 'my', 'name' );
sub fn1 {
    my @names = ( 'my', 'name' );
    return ( (), @names ); # <<< this must be flattened
}
sub fn2 {
    return ( (), 'my', 'name' );
}
say length fn1(); # 1
say length fn2(); # 4
say length @names; # 1
say length scalar ( (), 'my', 'name' ); # 4
say scalar fn1(); # 2
say scalar fn2(); # name
say scalar @names; # 2
say scalar ( (), 'my', 'name' ); # name
# but these produce the same output
say fn1(); # myname
say fn2(); # myname
say @names; # myname
say ( (), 'my', 'name' ); # myname
Why is the variable @names in subroutine fn1 not flattened, so that an array rather than a list is returned?
It seems that Perl returns two lists: one empty, and the second the array @names.
This list of two elements in scalar context returns its last element, the array.
An array in scalar context is its length: the number 2.
The length of the string '2' is 1, but this differs from fn2.
Is there any way to flatten @names in fn1?
In case there's confusion, length does not return the number of elements in the array. It returns the size of a string.
What you're getting caught out by is the list vs array problem in Perl. In short, an array is a variable like @array. A list is something like (1,2,3,4). The problem comes when Perl evaluates them in scalar context.
An @array in scalar context will return its number of elements. A list in scalar context acts like the C comma operator: it returns its last element.
$ perl -wle 'sub foo { @a = ("first","second"); return @a } print scalar foo()'
2
$ perl -wle 'sub foo { return("first","second") } print scalar foo()'
second
So when you try sub foo { return("first","second") } print length foo(), length puts the call to foo() in scalar context. foo() treats the list in return("first", "second") as the comma operator and returns the last element, "second". And you get 6.
$ perl -wle 'sub foo { return("first","second") } print length foo()'
6
Long story short, don't return lists, return arrays.
UPDATE: I've figured out your confusion, and my own. () is doing something: it's converting what should be a simple array into a C comma operator. return( (), @names ) and return( @names ) are not the same thing. Let's pick them apart.
In scalar context, return( @names ) says "evaluate this array in scalar context" and returns the number of elements. In list context it will return all of @names. Simple. This is what you want to be using.
return( (), @names ) on the other hand is the comma operator. In list context it will return the whole list, which flattens to @names. In scalar context it will return the last element, which is @names, which will then be evaluated in scalar context, returning its number of elements.
To see how this is true, if we add another element onto the end that's what we get back.
sub foo { return (), @names, 42 }
print scalar foo; # 42
Part of what's going on here is return is not acting like a normal function. It does not squash its arguments into an array before acting on them. You are returning the literal evaluation of the expression (), @names, 42 which is the comma operator.
I can see now why you started using (), but all you are doing is forcing the expression to be evaluated as the comma operator. Stop using it, it's a crutch and it's very hard to understand. What you need to do instead is learn about return contexts.
My very strong rule of thumb remains, don't return lists, return arrays.
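The difference between returning an array and returning a literal list can be seen in a minimal sketch. The sub names returns_array and returns_list are invented for this illustration.

```perl
use strict;
use warnings;

my @names = ('my', 'name');

sub returns_array { return @names }           # array: scalar context gives the count
sub returns_list  { return ('my', 'name') }   # comma operator: scalar context gives the last item

my $count = returns_array();   # 2
my $last  = returns_list();    # 'name'
my @all   = returns_array();   # ('my', 'name') in list context

print "$count\n";   # 2
print "$last\n";    # name
print "@all\n";     # my name
```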
length and scalar force scalar context, while say forces list context.
Note that the lengths match the strings:
String  Length
--------------
2       1
name    4
2       1
name    4
I got the answer: http://perldoc.perl.org/functions/scalar.html
If I really want to return an array (NOT A LIST) from a subroutine, regardless of calling context, I must write:
return @{ [ @names ] }
instead of
return ( (), @names );
because calling function fn1 in scalar context really means:
return scalar ( (), @names );
I should have looked at the description of the return operator: http://perldoc.perl.org/functions/return.html
Evaluation of EXPR may be in list, scalar, or void context, depending on how the return value will be used
More detailed description from Schwern:
In scalar context, return( @names ) says "evaluate this array in scalar context" and returns the number of elements. In list context it will return all of @names. Simple. This is what you want to be using.
return( (), @names ) on the other hand is the comma operator. In list context it will return the whole list which flattens to @names. In scalar context it will return the last element, which is @names, which will then be evaluated in scalar context returning its number of elements.
To see how this is true, if we add another element onto the end that's what we get back.
sub foo { return (), @names, 42 }
print scalar foo; # 42

Why can I use @list to call an array, but can't use %dict to call a hash in Perl? [duplicate]

Today I start my perl journey, and now I'm exploring the data type.
My code looks like:
@list=(1,2,3,4,5);
%dict=(1,2,3,4,5);
print "$list[0]\n"; # using [ ] to wrap index
print "$dict{1}\n"; # using { } to wrap key
print "@list[2]\n";
print "%dict{2}\n";
It seems $ + var_name works for both array and hash, but @ + var_name can be used to call an array, while % + var_name can't be used to call a hash.
Why?
@list[2] works because it is a slice of a list.
In Perl 5, a sigil indicates--in a non-technical sense--the context of your expression. Except for some of the non-standard behavior that slices have in a scalar context, the basic thought is that the sigil represents what you want to get out of the expression.
If you want a scalar out of a hash, it's $hash{key}.
If you want a scalar out of an array, it's $array[0]. However, Perl allows you to get slices of the aggregates. And that allows you to retrieve more than one value in a compact expression. Slices take a list of indexes. So,
@list = @hash{ qw<key1 key2> };
gives you a list of items from the hash. And,
@list2 = @list[0..3];
gives you the first four items from the array. For your case, @list[2] still has a "list" of indexes, it's just that list is the special case of a "list of one".
As scalar and list contexts were rather well defined, and there was no "hash context", it stayed pretty stable at $ for scalar and @ for "lists", and until recently Perl did not support addressing any variable with %. So neither %hash{@keys} nor %hash{key} had meaning. Now, however, you can dump out pairs of indexes with values by putting the % sigil on the front.
my %hash = qw<a 1 b 2>;
my @list = %hash{ qw<a b> }; # yields ( 'a', 1, 'b', 2 )
my @l2 = %list[0..2]; # yields ( 0, 'a', 1, '1', 2, 'b' )
So, I guess, if you have an older version of Perl, you can't, but if you have 5.20 or later, you can.
But for a completist's sake, slices have a non-intuitive way that they work in a scalar context. Because the standard behavior of putting a list into a scalar context is to count the list, if a slice worked with that behavior:
( $item = @hash{ @keys } ) == scalar @keys;
Which would make the expression:
$item = @hash{ @keys };
no more valuable than:
scalar @keys;
So, Perl seems to treat it like the expression:
$s = ( $hash{$keys[0]}, $hash{$keys[1]}, ... , $hash{$keys[$#keys]} );
And when a comma-delimited list is evaluated in a scalar context, it assigns the last expression. So it really ends up that
$item = @hash{ @keys };
is no more valuable than:
$item = $hash{ $keys[-1] };
But it makes writing something like this:
$item = @hash{ source1(), source2(), @array3, $banana, ( map { "$_" } source4() ) };
slightly easier than writing:
$item = $hash{ [ source1(), source2(), @array3, $banana, ( map { "$_" } source4() ) ]->[-1] };
But only slightly.
Arrays are interpolated within double quotes, so you see the actual contents of the array printed.
On the other hand, %dict{1} works, but is not interpolated within double quotes. So, something like my %partial_dict = %dict{1,3} is valid and does what you expect i.e. %partial_dict will now have the value (1,2,3,4). But "%dict{1,3}" (in quotes) will still be printed as %dict{1,3}.
Perl Cookbook has some tips on printing hashes.
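To see the three slice forms next to each other, here is a small sketch. It assumes Perl 5.20 or later (needed for the % forms) and uses made-up data with an even number of hash elements rather than the hash from the question.

```perl
use v5.20;      # key/value slices need Perl 5.20 or later; also enables strict and say
use warnings;

my %dict = (1 => 2, 3 => 4, 5 => 6);   # hypothetical data
my @list = ('a', 'b', 'c');

my @values  = @dict{1, 3};   # value slice: (2, 4)
my %partial = %dict{1, 3};   # key/value slice: (1 => 2, 3 => 4)
my @pairs   = %list[0, 1];   # index/value slice: (0, 'a', 1, 'b')

say "@values";                                               # 2 4
say "@pairs";                                                # 0 a 1 b
say join ',', map { "$_=$partial{$_}" } sort keys %partial;  # 1=2,3=4
say "%dict{1,3}";            # % is not interpolated: prints %dict{1,3}
```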

How to clear a Perl hash

Let's say we define an anonymous hash like this:
my $hash = {};
And then use the hash afterwards. Then it's time to empty or clear the hash for
reuse. After some Google searching, I found:
%{$hash} = ()
and:
undef %{$hash}
Both will serve my needs. What's the difference between the two? Are they both identical ways to empty a hash?
%$hash_ref = (); makes more sense than undef-ing the hash. Undef-ing the hash says that you're done with the hash. Assigning an empty list says you just want an empty hash.
Yes, they are absolutely identical. Both remove any existing keys and values from the table and set the hash to the empty list.
See perldoc -f undef:
undef EXPR
undef   Undefines the value of EXPR, which must be an lvalue. Use only
        on a scalar value, an array (using "@"), a hash (using "%"), a
        subroutine (using "&"), or a typeglob (using "*")...
Examples:
undef $foo;
undef $bar{'blurfl'}; # Compare to: delete $bar{'blurfl'};
undef @ary;
undef %hash;
However, you should not use undef to remove the value of anything except a scalar. For other variable types, set it to the "empty" version of that type -- e.g. for arrays or hashes, @foo = (); %bar = ();
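Both forms from the question can be tried on a hash reference in a short sketch like the one below. The data and the variable names $after_assign and $after_undef are made up. It only checks what is visible from Perl code: the hash ends up with no keys either way and the reference stays usable. Any difference in how memory is released internally is not shown here.

```perl
use strict;
use warnings;

my $hash = { a => 1, b => 2 };   # hypothetical data

%$hash = ();                     # assign the empty list
my $after_assign = keys %$hash;  # 0

$hash->{c} = 3;                  # the same reference is reusable
undef %$hash;                    # undef the hash itself
my $after_undef = keys %$hash;   # 0

print "$after_assign $after_undef\n";   # 0 0
```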

What is the difference between the scalar and list contexts in Perl?

What is the difference between the scalar and list contexts in Perl and does this have any parallel in other languages such as Java or Javascript?
Various operators in Perl are context sensitive and produce different results in list and scalar context.
For example:
my(@array) = (1, 2, 4, 8, 16);
my($first) = @array;
my(@copy1) = @array;
my @copy2 = @array;
my $count = @array;
print "array: @array\n";
print "first: $first\n";
print "copy1: @copy1\n";
print "copy2: @copy2\n";
print "count: $count\n";
Output:
array: 1 2 4 8 16
first: 1
copy1: 1 2 4 8 16
copy2: 1 2 4 8 16
count: 5
Now:
$first contains 1 (the first element of the array), because the parentheses in my($first) provide list context, but there's only space for one value in $first.
both @copy1 and @copy2 contain a copy of @array,
and $count contains 5 because it is a scalar context, and @array evaluates to the number of elements in the array in a scalar context.
More elaborate examples could be constructed too (the results are an exercise for the reader):
my($item1, $item2, @rest) = @array;
my(@copy3, @copy4) = @array, @array;
There is no direct parallel to list and scalar context in other languages that I know of.
Scalar context is what you get when you're looking for a single value. List context is what you get when you're looking for multiple values. One of the most common places to see the distinction is when working with arrays:
@x = @array; # copy an array
$x = @array; # get the number of elements in an array
Other operators and functions are context sensitive as well:
$x = 'abc' =~ /(\w+)/; # $x = 1
($x) = 'abc' =~ /(\w+)/; # $x = 'abc'
@x = localtime(); # (seconds, minutes, hours...)
$x = localtime(); # 'Thu Dec 18 10:02:17 2008'
How an operator (or function) behaves in a given context is up to the operator. There are no general rules for how things are supposed to behave.
You can make your own subroutines context sensitive by using the wantarray function to determine the calling context. You can force an expression to be evaluated in scalar context by using the scalar keyword.
In addition to scalar and list contexts you'll also see "void" (no return value expected) and "boolean" (a true/false value expected) contexts mentioned in the documentation.
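As a minimal sketch of a context-sensitive subroutine, the following uses wantarray to report how it was called. The sub name context_name is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical sub: report the context it was called in.
sub context_name {
    return 'list'   if wantarray;           # true in list context
    return 'scalar' if defined wantarray;   # false but defined in scalar context
    return 'void';                          # undef in void context
}

my @l = context_name();   # list context
my $s = context_name();   # scalar context

print "$l[0]\n";   # list
print "$s\n";      # scalar
```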
This simply means that a data-type will be evaluated based on the mode of the operation. For example, an assignment to a scalar means the right-side will be evaluated as a scalar.
I think the best means of understanding context is learning about wantarray. So imagine that = is a subroutine that implements wantarray:
sub = {
    return if ( ! defined wantarray ); # void: just return (doesn't make sense for =)
    return @_ if ( wantarray ); # list: return the array
    return $#_ + 1; # scalar: return the count of @_
}
The examples in this post work as if the above subroutine is called by passing the right-side as the parameter.
As for parallels in other languages, yes, I still maintain that virtually every language supports something similar. Polymorphism is similar in all OO languages. As another example, Java converts objects to String in certain contexts. And every untyped scripting language I've used has similar concepts.