perl: why \(1,2) is SCALAR? How to dereference it?

Here's the code:
$ref = \(1,2);
print "$ref\n";
print "@$ref\n";
It gives me:
SCALAR(0xa15a68)
Not an ARRAY reference at ./test.pl line 23.
I was trying to write a wrapper function that calls a function with unknown result type.
It should call the real function, save the result, do some other stuff and return the saved result. But it turned out that for the caller that expects a scalar value, returning an array and returning a list in parentheses are different things. This is why I tried to use references.

What you want is this:
$ref = [ 1, 2 ];
In your code:
$ref = \(1,2);
What the right hand side does is create a list of references, instead of a reference to an array. It's the same as this:
$ref = ( \1, \2 );
Since you're assigning that list to a scalar, all but the last item are thrown away, and $ref is set to a reference to the scalar value 2, which is probably not what you want.
See perldoc perlref.
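A small sketch of the difference, in case it helps to see both forms side by side. The variable names are only for illustration:

```perl
use strict;
use warnings;

# \(LIST) yields a list of references; assigned to a scalar, only the last one is kept.
my $last_ref = \(1, 2);

# [ LIST ] builds an anonymous array and yields a single reference to it.
my $array_ref = [ 1, 2 ];

print ref($last_ref), "\n";     # SCALAR
print $$last_ref, "\n";         # 2
print ref($array_ref), "\n";    # ARRAY
print "@$array_ref\n";          # 1 2
```

Only the second form can be dereferenced with @$ref.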
Note that this behavior is consistent for non-literal values as well, such as subroutine calls. If you call a sub like this:
$val = function();
then the function sub is called in scalar context, and can choose to return a different value than if it were called in list context. (See perldoc -f wantarray.) If it chooses to return a list anyway, all but the last element of the list will be discarded, and that last element will be assigned to $val:
sub fun1() { return 1; }
sub fun2() { return (1,2); }
my $f1 = fun1();
my $f2 = fun2();
# $f1 is 1, and $f2 is 2
my $r1 = \( fun1() );
my $r2 = \( fun2() );
# $r1 is a ref to 1, and $r2 is a ref to 2
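For the wrapper described in the question, one possible approach is to check wantarray inside the wrapper and call the real function in the same context. This is only a sketch: wrap_call and context_name are made-up names, and the "other stuff" is left as a comment.

```perl
use strict;
use warnings;

# Hypothetical wrapper: calls $code in the caller's context, saves the
# result, leaves room for extra work, then returns the saved result.
sub wrap_call {
    my ($code, @args) = @_;
    my $want = wantarray;    # true: list, false but defined: scalar, undef: void

    my @saved;
    if ($want) {
        @saved = $code->(@args);       # called in list context
    }
    elsif (defined $want) {
        $saved[0] = $code->(@args);    # called in scalar context
    }
    else {
        $code->(@args);                # called in void context
    }

    # ... other work would go here ...

    return $want ? @saved : $saved[0];
}

# Hypothetical function whose result depends on the calling context.
sub context_name { return wantarray ? 'list' : 'scalar' }

my $scalar_result = wrap_call(\&context_name);
my @list_result   = wrap_call(\&context_name);
print "$scalar_result @list_result\n";    # scalar list
```

The wrapped function sees the same context the caller gave the wrapper, so no references are needed to carry the result.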

The unary reference operator \ returns a reference to its operand. If that operand is a parenthesized list, then a list of references is returned.
If a list is used in scalar context, the last element of that list is returned:
1, 2 # list of 1, 2
\(1, 2) # list of \1, \2
$ref = \(1, 2) # list is used in scalar context
$ref = \2 # equivalent
Lists and Arrays are distinct: Lists are a syntactic construct, while an array is a data structure.

Related

return lists in perl cause different results

my @names = ( (), 'my', 'name' );
sub fn1 {
    my @names = ( 'my', 'name' );
    return ( (), @names ); #<<< this must be flattened
}
sub fn2 {
    return ( (), 'my', 'name' );
}
say length fn1(); #1
say length fn2(); #4
say length @names; #1
say length scalar ( (), 'my', 'name' ); #4
say scalar fn1(); #2
say scalar fn2(); #name
say scalar @names; #2
say scalar ( (), 'my', 'name' ); #name
# but these produce the same output
say fn1(); #myname
say fn2(); #myname
say @names; #myname
say ( (), 'my', 'name' ); #myname
Why is the variable @names in subroutine fn1 not flattened, so that an array is returned instead of a list?
It seems that perl returns two lists: one is empty and the second is the array @names.
This list of two elements in scalar context returns its last element, the array.
An array in scalar context is its length: the number 2.
The length of the string '2' is 1, but this differs from fn2.
Is there any way to flatten @names in fn1?
In case there's confusion, length does not return the number of elements in the array. It returns the size of a string.
What you're getting caught out by is the list vs array problem in Perl. In short, an array is a variable like @array. A list is something like (1,2,3,4). The problem comes when Perl evaluates them in scalar context.
An @array in scalar context will return its number of elements. A list in scalar context acts like the C comma operator: it returns its last element.
$ perl -wle 'sub foo { @a = ("first","second"); return @a } print scalar foo()'
2
$ perl -wle 'sub foo { return("first","second") } print scalar foo()'
second
So when you try sub foo { return("first","second") } print length foo(), length puts the call to foo() in scalar context. foo() treats the list in return("first", "second") as the comma operator and returns the last element, "second". And you get 6.
$ perl -wle 'sub foo { return("first","second") } print length foo()'
6
Long story short, don't return lists, return arrays.
UPDATE: I've figured out your confusion, and my own. () is doing something: it's converting what should be a simple array into a C comma operator. return( (), @names ) and return( @names ) are not the same thing. Let's pick them apart.
In scalar context, return( @names ) says "evaluate this array in scalar context" and returns the number of elements. In list context it will return all of @names. Simple. This is what you want to be using.
return( (), @names ) on the other hand is the comma operator. In list context it will return the whole list, which flattens to @names. In scalar context it will return the last element, which is @names, which will then be evaluated in scalar context, returning its number of elements.
To see how this is true, if we add another element onto the end that's what we get back.
sub foo { return (), @names, 42 }
print scalar foo; # 42
Part of what's going on here is return is not acting like a normal function. It does not squash its arguments into an array before acting on them. You are returning the literal evaluation of the expression (), @names, 42 which is the comma operator.
I can see now why you started using (), but all you are doing is forcing the expression to be evaluated as the comma operator. Stop using it, it's a crutch and it's very hard to understand. What you need to do instead is learn about return contexts.
My very strong rule of thumb remains, don't return lists, return arrays.
length and scalar force scalar context, while say forces list context.
Note that the lengths match the strings:
String Length
--------------
2 1
name 4
2 1
name 4
I found the answer: http://perldoc.perl.org/functions/scalar.html
If I really want to return an array (NOT a LIST) from a subroutine, regardless of the calling context, I must write:
return @{ [ @names ] }
instead of
return ( (), @names );
because calling the function fn1 in scalar context really means:
return scalar ( (), @names );
I had to look at the description of the return operator: http://perldoc.perl.org/functions/return.html
Evaluation of EXPR may be in list, scalar, or void context, depending on how the return value will be used
More detailed description from Schwern:
In scalar context, return( @names ) says "evaluate this array in scalar context" and returns the number of elements. In list context it will return all of @names. Simple. This is what you want to be using.
return( (), @names ) on the other hand is the comma operator. In list context it will return the whole list, which flattens to @names. In scalar context it will return the last element, which is @names, which will then be evaluated in scalar context, returning its number of elements.
To see how this is true, if we add another element onto the end that's what we get back.
sub foo { return (), @names, 42 }
print scalar foo; # 42

Why I can't do "shift subroutine_name()" in Perl?

Why does this code return a "Not an ARRAY reference" error?
sub Prog {
my $var1 = 1;
my $var2 = 2;
($var1, $var2);
}
my $variable = shift &Prog;
print "$variable\n";
If I use an intermediate array, I avoid the error:
my @intermediate_array = &Prog;
my $variable = shift @intermediate_array;
print "$variable\n";
The above code now outputs "1".
The subroutine Prog returns a list of scalars. The shift function only operates on an array. Arrays and lists are not the same thing. Arrays have storage, but lists do not.
If what you want is to get the first element of the list that Prog returns, do this:
sub Prog {
return ( 'this', 'that' );
}
my $var = (Prog())[0];
print "$var\n";
I changed the sub invocation to Prog() instead of &Prog because the latter is decidedly old style.
You can also assign the first element to a scalar like others are showing:
my ($var) = Prog();
This is roughly the same as:
my ($var, $ignored_var) = Prog();
and then ignoring $ignored_var. If you want to make it clear that you're ignoring the second value without actually giving it a variable, you can do this:
my ($var, undef) = Prog();
Prog is returning a list, not an array. Operations like shift modify the array and cannot be used on lists.
You can instead do:
my ($variable) = Prog; # $variable is now 1:
# Prog is evaluated in list context
# and the results assigned to the list ($variable)
Note that you don't need the &.
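If you really do want shift, one option is to give the list some storage first by wrapping the call in an anonymous array. A minimal sketch, reusing the Prog sub from the answer above (the variable names are just for illustration):

```perl
use strict;
use warnings;

sub Prog { return ('this', 'that') }

# List slice: index straight into the returned list.
my $by_slice = (Prog())[0];

# Anonymous array: [ ... ] copies the list into real storage that shift can modify.
my $by_shift = shift @{ [ Prog() ] };

print "$by_slice $by_shift\n";    # this this
```

The list slice is usually the simpler choice, since the anonymous array is built only to be thrown away.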

Dereferencing in case of $_[0], $_[1] ..... so on

please see the below code:
$scalar = 10;
subroutine(\$scalar);
sub subroutine {
my $subroutine_scalar = ${$_[0]}; #note you need the {} brackets, or this doesn't work!
print "$subroutine_scalar\n";
}
In the code above you can see the comment "note you need the {} brackets, or this doesn't work!". Please explain why we can't write the same statement as:
my $subroutine_scalar = $$_[0];
i.e. without using the curly brackets.
Many people have already given correct answers here. I wanted to add an example I found illuminating. You can read the documentation in perldoc perlref for more information.
Your problem is one of ambiguity: you have two operations, $$ and [0], working on the same identifier _, and the result depends on which operation is performed first. We can make it less ambiguous by using the supporting curly braces ${ ... }. $$_[0] could (for a human anyway) possibly mean:
${$_}[0] -- dereference the array reference in $_, then take its first element.
${$_[0]} -- take element 0 of the array @_ and dereference it.
As you can see, these two cases refer to completely different variables, @_ and $_.
Of course, for Perl it is not ambiguous: we simply get the first option, since dereferencing is performed before key lookup. We need the supporting curly braces to override this order, and that is why your example does not "work" without them.
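To see both readings in one place, here is a rough sketch where $_ and @_ each hold something valid, so neither form dies. The sub name show_both and the strings are made up for the demonstration:

```perl
use strict;
use warnings;

sub show_both {
    # Make $_ an array reference so that both readings are valid here.
    local $_ = ['element of the array that $_ refers to'];

    my $without_braces = $$_[0];        # parsed as ${$_}[0], i.e. $_->[0]
    my $with_braces    = ${ $_[0] };    # dereferences the first argument in @_

    return ($without_braces, $with_braces);
}

my $scalar = 10;
my ($from_underscore, $from_args) = show_both(\$scalar);
print "$from_underscore\n";    # element of the array that $_ refers to
print "$from_args\n";          # 10
```

Without the braces, the argument passed in @_ is never touched at all.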
You might consider a slightly less confusing functionality for your subroutine. Instead of trying to do two things at once (get the argument and dereference it), you can do it in two stages:
sub foo {
my $n = shift;
print $$n;
}
Here, we take the first argument off @_ with shift, and then dereference it. Clean and simple.
Most often, you will not be using references to scalar variables, however. And in those cases, you can make use of the arrow operator ->
my @array = (1,2,3);
foo(\@array);
sub foo {
my $aref = shift;
print $aref->[0];
}
I find using the arrow operator to be preferable to the $$ syntax.
${ $x }[0] grabs the value of element 0 in the array referenced by $x.
${ $x[0] } grabs the value of the scalar referenced by element 0 of the array @x.
>perl -E"$x=['def']; @x=\'abc'; say ${ $x }[0];"
def
>perl -E"$x=['def']; @x=\'abc'; say ${ $x[0] };"
abc
$$x[0] is short for ${ $x }[0].
>perl -E"$x=['def']; @x=\'abc'; say $$x[0];"
def
my $subroutine_scalar = $$_[0];
is same as
my $subroutine_scalar = $_->[0]; # $_ is array reference
On the other hand,
my $subroutine_scalar = ${$_[0]};
dereferences the scalar ref in the first element of the @_ array, and can be written as
my ($sref) = @_;
my $subroutine_scalar = ${$sref}; # or $$sref for short
Because $$_[0] means ${$_}[0].
Consider these two pieces of code which both print 10:
sub subroutine1 {
my $scalar = 10;
my $ref_scalar = \$scalar;
my @array = ($ref_scalar);
my $subroutine_scalar = ${$array[0]};
print "$subroutine_scalar\n";
}
sub subroutine2 {
my @array = (10);
my $ref_array = \@array;
my $subroutine_scalar = $$ref_array[0];
print "$subroutine_scalar\n";
}
In subroutine1, @array is an array containing a reference to $scalar. So the first step is to get the first element with $array[0], and then dereference it.
In subroutine2, by contrast, @array is an array containing the scalar 10, and $ref_array is a reference to it. So the first step is to get the array through $ref_array, and then index into it.

Perl Hash Slice, Replication x Operator, and sub params

Ok, I understand perl hash slices, and the "x" operator in Perl, but can someone explain the following code example from here (slightly simplified)?
sub test{
my %hash;
@hash{@_} = (undef) x @_;
}
Example Call to sub:
test('one', 'two', 'three');
This line is what throws me:
@hash{@_} = (undef) x @_;
It is creating a hash where the keys are the parameters to the sub and initializing to undef, so:
%hash:
'one' => undef,
'two' => undef,
'three' => undef
The rvalue of the x operator should be a number; how is it that @_ is interpreted as the length of the sub's parameter array? I would expect you'd at least have to do this:
@hash{@_} = (undef) x scalar @_;
To figure out this code you need to understand three things:
The repetition operator. The x operator is the repetition operator. In list context, if the operator's left-hand argument is enclosed in parentheses, it will repeat the items in a list:
my #x = ('foo') x 3; # ('foo', 'foo', 'foo')
Arrays in scalar context. When an array is used in scalar context, it returns its size. The x operator imposes scalar context on its right-hand argument.
my @y = (7,8,9);
my $n = 10 * @y; # $n is 30
Hash slices. The hash slice syntax provides a way to access multiple hash items at once. A hash slice can retrieve hash values, or it can be assigned to. In the case at hand, we are assigning to a hash slice.
# Right side creates a list of repeated undef values -- the size of @_.
# We assign that list to a set of hash keys -- also provided by @_.
@hash{@_} = (undef) x @_;
Less obscure ways to do the same thing:
@hash{@_} = ();
$hash{$_} = undef for @_;
In scalar context, an array evaluates to its length. From perldoc perldata:
If you evaluate an array in scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value, like the C comma operator, nor of built-in functions, which return whatever they feel like returning.)
Although I cannot find more information on it currently, it seems that the replication operator evaluates its second argument in scalar context, causing the array to evaluate to its length.
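A quick way to check this is to split the line from the question into two steps and count what the x operator produced. This is a sketch only; the intermediate @values array and the return value are additions for the sake of the check:

```perl
use strict;
use warnings;

sub test {
    my %hash;
    my @values = (undef) x @_;    # x sees @_ in scalar context: its element count
    @hash{@_} = @values;          # hash slice assignment: one value per key
    return (scalar @values, \%hash);
}

my ($count, $hash) = test('one', 'two', 'three');
print "$count\n";                           # 3
print join(',', sort keys %$hash), "\n";    # one,three,two
```

Three arguments give three undef values, so no explicit scalar is needed on the right of x.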

What is the difference between the scalar and list contexts in Perl?

What is the difference between the scalar and list contexts in Perl and does this have any parallel in other languages such as Java or Javascript?
Various operators in Perl are context sensitive and produce different results in list and scalar context.
For example:
my(@array) = (1, 2, 4, 8, 16);
my($first) = @array;
my(@copy1) = @array;
my @copy2 = @array;
my $count = @array;
print "array: @array\n";
print "first: $first\n";
print "copy1: @copy1\n";
print "copy2: @copy2\n";
print "count: $count\n";
Output:
array: 1 2 4 8 16
first: 1
copy1: 1 2 4 8 16
copy2: 1 2 4 8 16
count: 5
Now:
$first contains 1 (the first element of the array), because the parentheses in my($first) provide a list context, but there's only space for one value in $first.
both @copy1 and @copy2 contain a copy of @array,
and $count contains 5 because it is a scalar context, and @array evaluates to the number of elements in the array in a scalar context.
More elaborate examples could be constructed too (the results are an exercise for the reader):
my($item1, $item2, @rest) = @array;
my(@copy3, @copy4) = @array, @array;
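For anyone who wants to check their answers, here is a sketch that runs both lines and prints what ends up where. The 'void' warning is switched off because the second statement leaves an array in void context on purpose:

```perl
use strict;
use warnings;
no warnings 'void';    # the last assignment deliberately leaves an array in void context

my @array = (1, 2, 4, 8, 16);

my ($item1, $item2, @rest) = @array;
print "$item1 $item2 | @rest\n";    # 1 2 | 4 8 16

# Assignment binds tighter than the comma, so this parses as
#   (my(@copy3, @copy4) = @array), @array;
# and @copy3, being the first array on the left, takes every element.
my (@copy3, @copy4) = @array, @array;
print scalar(@copy3), " ", scalar(@copy4), "\n";    # 5 0
```

An array on the left side of a list assignment is greedy, so anything after it receives nothing.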
There is no direct parallel to list and scalar context in other languages that I know of.
Scalar context is what you get when you're looking for a single value. List context is what you get when you're looking for multiple values. One of the most common places to see the distinction is when working with arrays:
@x = @array; # copy an array
$x = @array; # get the number of elements in an array
Other operators and functions are context sensitive as well:
$x = 'abc' =~ /(\w+)/; # $x = 1
($x) = 'abc' =~ /(\w+)/; # $x = 'abc'
@x = localtime(); # (seconds, minutes, hours...)
$x = localtime(); # 'Thu Dec 18 10:02:17 2008'
How an operator (or function) behaves in a given context is up to the operator. There are no general rules for how things are supposed to behave.
You can make your own subroutines context sensitive by using the wantarray function to determine the calling context. You can force an expression to be evaluated in scalar context by using the scalar keyword.
In addition to scalar and list contexts you'll also see "void" (no return value expected) and "boolean" (a true/false value expected) contexts mentioned in the documentation.
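As a rough illustration of wantarray and scalar, here is a sketch of a sub that records the context it was called in. The names record_context and $seen are invented for this example:

```perl
use strict;
use warnings;

my $seen;

# Hypothetical sub that notes which context it was called in.
sub record_context {
    $seen = wantarray         ? 'list'
          : defined wantarray ? 'scalar'
          :                     'void';
    return;
}

record_context();                           # nothing uses the result
my $void_seen = $seen;

my @list = record_context();                # list assignment
my $list_seen = $seen;

my $scalar = record_context();              # scalar assignment
my $scalar_seen = $seen;

my @forced = scalar(record_context());      # scalar() overrides the list context
my $forced_seen = $seen;

print "$void_seen $list_seen $scalar_seen $forced_seen\n";    # void list scalar scalar
```

wantarray returns true for list context, a defined false value for scalar context, and undef for void context.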
This simply means that a data-type will be evaluated based on the mode of the operation. For example, an assignment to a scalar means the right-side will be evaluated as a scalar.
I think the best means of understanding context is learning about wantarray. So imagine that = is a subroutine that implements wantarray:
sub = {
    return if ( ! defined wantarray ); # void: just return (doesn't make sense for =)
    return @_ if ( wantarray ); # list: return the array
    return $#_ + 1; # scalar: return the count of @_
}
The examples in this post work as if the above subroutine is called by passing the right-side as the parameter.
As for parallels in other languages, yes, I still maintain that virtually every language supports something similar. Polymorphism is similar in all OO languages. As another example, Java converts objects to String in certain contexts. And every untyped scripting language I've used has similar concepts.