Scalar vs list context - perl

Wonder what is the rationale behind following two examples giving different results when in both cases do {} returns lists.
perl -wE 'say my $r = do { (44); }'
44
perl -wE 'say my $r = do { my ($x) = map $_, 44; }'
1

In both cases the assignment to $r is forcing scalar context on the do. However in the first case scalar context on a list returns the last value of the list, '44'.
In the second instance the assignment to my ($x) forces a list context, The result of assigning to a list in scalar context is the number of elements on the right hand side of the assignment. So you get.
map $_, 44 returns a list of length 1 containing (44)
my ($x) = assigns the results above in list context, because of the brackets around $x, to the list ($x) making $x = 44
The do block is in scalar context because of the assignment to $r, note the lack of brackets, and as I said above this returns the length of the right hand side of the list assignment. 1 in this case.
See what happens if you do this:
perl -wE 'say my $r = () = (1,3,5,7)'

First of all, neither do returns a list. They are evaluated in scalar context so they must return a single scalar, not an arbitrary number of scalars ("a list").
In the first case, the do returns the result of evaluating 44 in scalar context. This returns 44.
scalar( 44 ) ⇒ 44
In the second case, the do returns the result of evaluating a list assignment in scalar context. This returns the number of elements returned by the right-hand side of the assignment.
scalar( () = 44 ) ⇒ 1
I believe the real cause of your confusion is that you don't know about the assignment operators and how they are affected by context. If so, see Scalar vs List Assignment Operator for the answer to your real question.

Related

Perl "reverse comma operator" (Example from the book Programming Perl, 4th Edition)

I'm reading "Programming Perl" and ran into a strange example that does not seem to make sense. The book describes how the comma operator in Perl will return only the last result when used in a scalar context.
Example:
# After this statement, $stuff = "three"
$stuff = ("one", "two", "three");
The book then gives this example of a "reverse comma operator" a few pages later (page 82)
# A "reverse comma operator".
return (pop(#foo), pop(#foo))[0];
However to me this doesn't seem to be reverse at all..?
Example:
# After this statement, $stuff = "three"
$stuff = reverse_comma("one", "two", "three");
# Implementation of the "reverse comma operator"
sub reverse_comma {
return (pop(#_), pop(#_))[0];
}
How is this in any way reverse of the normal comma operator? The results are the same, not reversed!
Here is a link to the exact page. The example is near the bottom.
It's a bad example, and should be forgotten.
What it's demonstrating is simple:
Normally, if you have a sequence of expressions separated by commas in scalar context, that can be interpreted an instance of the comma operator, which evaluates to the last thing in the sequence.
However, if you put that sequence in parentheses and stick [0] at the end, it turns that sequence into a list and takes its first element, e.g.
my $x = (1, 2, 3)[0];
For some reason, the book calls this the "reverse comma operator". This is a misnomer; it's just a list that's having its first element taken.
The book is confusing matters further by using the pop function twice in the arguments. These are evaluated from left to right, so the first pop evaluates to "three" and the second one to "two".
In any case: Don't ever use either the comma or "reverse comma" operators in real code. Both are likely to prove confusing to future readers.
It's a cute and clever example, but thinking too hard about it distracts from its purpose. That section of the book is showing off list slices. Anything beyond slicing a list, no matter what is in the list, is not germane to the purpose of the examples.
You're only on page 82 of a very big book (we literally couldn't fit any more pages in because we were at the limit of the binding method), so there's not much we could throw at you. Among the other list slices examples, there this clever one that I wouldn't use in real code. That's the curse of contrived examples though.
But let's say there were a reverse comma operator. It would have to evaluate both side of the comma. Many answers go right for "just return the first thing". That's not the feature though. You have to visit every expression even though you keep one of them.
Consider this much much advanced version with a series of anonymous subroutines that I immediately dereference, each of which prints something then returns a result:
use v5.10;
my $scalar = (
sub { say "First"; 35 } -> (),
sub { say "wantarray is ", 0+wantarray } -> (),
sub { say "Second"; 27 } -> (),
sub { say "Third"; 137 } -> ()
);
The parens are there only for precedence since the assignment operator binds more tightly than the comma operator. There's no list here, even though it looks like there is one.
The output shows that Perl evaluated each even though it kept on the last one:
First
wantarray is 0
Second
Third
Scalar is [137]
The poorly-named wantarray built-in returns false, noting that the subroutine thinks it is in scalar context.
Now, suppose that you wanted to flip that around so it still evaluates every expression but keeps the first one. You can use a literal list access:
my $scalar = (
sub { say "First"; 35 } -> (),
sub { say "wantarray is ", 0+wantarray } -> (),
sub { say "Second"; 27 } -> (),
sub { say "Third"; 137 } -> ()
)[0];
With the addition on the subscription, the righthand side is now a list and I pull out the first item. Notice that the second subroutine thinks it is in list context now. I get the result of the first subroutine:
First
wantarray is 1
Second
Third
Scalar is [35]
But, let's put this in a subroutine. I still need to call each subroutine even though I won't use the results of the other ones:
my $scalar = reverse_comma(
sub { say "First"; 35 },
sub { say "wantarray is ", 0+wantarray },
sub { say "Second"; 27 },
sub { say "Third"; 137 }
);
say "Scalar is [$scalar]";
sub reverse_comma { ( map { $_->() } #_ )[0] }
Or, would I use the results of the other ones? What if I did something slightly different. I'll add a side effect of setting $last to the evaluated expression:
use v5.10;
my $last;
my $scalar = reverse_comma(
sub { say "First"; 35 },
sub { say "wantarray is ", 0+wantarray },
sub { say "Second"; 27 },
sub { say "Third"; 137 }
);
say "Scalar is [$scalar]";
say "Last is [$last]";
sub reverse_comma { ( map { $last = $_->() } #_ )[0] }
Now I see the feature that makes the scalar comma interesting. It evaluates all the expressions, and some of them might have side effects:
First
wantarray is 0
Second
Third
Scalar is [35]
Last is [137]
It's not a huge secret that tchrist is handy with shell scripts. The Perl to csh converter was basically his inbox (or comp.lang.perl). You'll see some shell like idioms popping up in his examples. Something like that trick to swap two numerical values with one statement:
use v5.10;
my $x = 17;
my $y = 137;
say "x => $x, y => $y";
$x = (
$x = $x + $y,
$y = $x - $y,
$x - $y,
);
say "x => $x, y => $y";
The side effects are important there.
So, back to the Camel, we have an example of where the thing that has the side effect is the pop array operator:
use v5.10;
my #foo = qw( g h j k );
say "list: #foo";
my $scalar = sub { return ( pop(#foo), pop(#foo) )[0] }->();
say "list: #foo";
This shows off that each expression on either side of all the commas are evaluated. I threw in the subroutine wrapper since we didn't show a complete example:
list: g h j k
list: g h
But, none of this was the point of that section, which was indexing into a list literal. The point of the example was not to return a different result than the other examples or Perl features. It was to return the same thing assuming the comma operator acted differently. The section is about list slices and is showing off list slices, so the stuff in the list wasn't the important part.
Let's make the examples more similar.
The comma operator:
$scalar = ("one", "two", "three");
# $scalar now contains "three"
The reverse comma operator:
$scalar = ("one", "two", "three")[0];
# $scalar now contains "one"
It's a "reverse comma" because $scalar gets the result of the first expression, where the normal comma operator gives the last expression. (If you know anything about Lisp, it's like the difference between progn and prog1.)
An implementation of the "reverse comma operator" as a subroutine would look something like this:
sub reverse_comma {
return shift #_;
}
An ordinary comma will evaluate its operands and then return the value of the right-hand operand
my $v = $a, $b
sets $v to the value of $b
For the purpose of demonstrating list slices, the Camel is proposing some code that behaves like the comma operator but instead evaluates its operands and then return the value of the left-hand operand
Something like that can be done with a list slice, like this
my $v = ($a, $b)[0]
which sets $v to the value of $a
That's all there is to it really. The book isn't trying to suggest that there should be a reverse comma subroutine, it is simply considering the problem of evaluating two expressions in order and returning the first. The order of evaluation is relevant only when the two expressions have side effects, which is why the example in the book uses pop, which changes the array as well as returning a value
The imaginary problem is this
Suppose I want a subroutine to remove the last two elements of an array, and then return the value of what used to be the last element
Ordinarily that would require a temporary variable, like this
my $last = pop #foo;
pop #foo;
return $last;
But as an example of a list slice the code suggests that this would also work
# A "reverse comma operator".
return (pop(#foo), pop(#foo))[0];
Please understand that this isn't a recommendation. There are a few ways to do this. Another single-statement way is
return scalar splice #foo, -2;
but that doesn't use a list slice, which is the topic of that section of the book. In reality I doubt if the book's authors would propose anything other than the simple solution with the temporary variable. It is purely an example of what a list slice can do
I hope that helps

Why do I get the last value in a list in scalar context in perl?

I had assumed that, in perl, $x=(2,3,4,5) and ($x)=(2,3,4,5) would give me the same result, but was surprised at what happened in my tests. I am wondering why this behavior is the way it is and why wantarray behaves differently. Here are my tests and the results:
>perl -e '$x=(1,2,3,5);print("$x\n")'
5
>perl -e '($x)=(1,2,3,5);print("$x\n")'
1
>perl -e '$x=(wantarray ? (1,2,3,5) : 4);print("$x\n")'
4
>perl -e '($x)=(wantarray ? (1,2,3,5) : 4);print("$x\n")'
4
Is this behavior consistent/reliable across all platforms?
Whoops. wantarray is for context of subroutine calls...
>perl -e '$x=test();sub test{return(1,2,3,5)};print("$x\n")'
5
>perl -e '($x)=test();sub test{return(1,2,3,5)};print("$x\n")'
1
>perl -e '$x=test();sub test{return(wantarray ? (1,2,3,5) : 4)};print("$x\n")'
4
>perl -e '($x)=test();sub test{return(wantarray ? (1,2,3,5) : 4)};print("$x\n")'
1
So I guess it is consistent, but why does the list return the last value in scalar context?
So I guess it is consistent, but why does the list return the last value in scalar context?
Because it's useful.
my $x = f() || ( warn('!!!'), 'default' );
Well, more useful than any other alternative, at least. It's also consistent with its stronger cousin, ;.
sub f { x(), y(), z() };
is the same as
sub f { x(); y(); z() };
Each operator determines decides what it returns in both scalar and list context.[1]
There are operators that only even return a single scalar.
say time; # Returns the number of seconds since epoch
say scalar( time ); # Ditto.
But operators that normally returns more than one scalar and those that return a variable number of scalars cannot possible return that in scalar context, so they will return something else. It's up to each one to decide what that is.
say scalar( 4,5,6 ); # Last item evaluated in scalar context.
say scalar( #a ); # Number of elements in #a.
say scalar( grep f(), g() ); # Number of matching items.
say scalar( localtime ); # Formatted timestamp.
The list operator (e.g. x,y,z) returns what the last item of the list (z) returns when evaluated in scalar context. For example,
my $x = (f(),g(),#a);
is a weird way of writing
f();
g();
my $x = #a;
Notes
Same goes for subs, though it's common to write subs that are useless to call in scalar context.
You have two types of assigment.
The first one is in scalar context and that is how the comma operators behave in this context (it evaluates both arguments but discard the one to the left and return the value to the right).
The second one is list context (when you enclose a variable in parentheses you are asking for list context). In list context the comma operator just separates the arguments and returns the whole list.
From the Documentation
Comma Operator
Binary "," is the comma operator. In scalar context it evaluates its
left argument, throws that value away, then evaluates its right
argument and returns that value. This is just like C's comma operator.
In list context, it's just the list argument separator, and inserts
both its arguments into the list. These arguments are also evaluated
from left to right.
http://perldoc.perl.org/perlop.html#Comma-Operator

Perl5 = (equals) operator precedence

$a,$b,$c = 1,2,3;
print "$a, $b, $c\n";
returns
, , 1
So does = (equals) take higher precedence than the tuple construction - doing this?
$a,$b,($c=1),2,3;
Yes. There's a precedence table in perlop. Assignment operators are level 19, and comma is level 20. In general, Perl's operators have the same precedence as the corresponding C operators (for those operators that have a corresponding C operator).
If you meant ($a,$b,$c) = (1,2,3); you have to use the parens.
The comma operator as you used it (in scalar context) is not for tuple construction, it's for evaluating several expressions and returning the last one.
Perl does things differently depending on context, it decides what to do depending on if it's expecting a scalar value, a list, nothing at all, ... See perldoc perldata's section on Context for an introduction.
So, if you do:
perl -e '$a = (1 and 4,2,0); print"$a\n"'
You get 0, because 4,2,0 is evaluated in scalar context, and behaves like C's comma operator, evaluating expressions between commas and returning the result of the last one.
If you force 4,2,0 to be evaluated in list context:
perl -e '$a = (1 and #a=(4,2,0)); print"$a\n"'
You get 3, because assigning to an array forces list context (the additional parenthesis are there to solve the precedence issue cjm mentioned), and the value of a list in scalar context (forced by being the RHS of an and in scalar context) is the number of elements it has (logical and in Perl returns the last expression evaluated, instead of a boolean value as in other programming languages).
So, as cjm said, you need to do:
($a,$b,$c) = (1,2,3);
to deal with precedence and force list context.
Notice the difference between:
$ perl -e '$a,$b,$c = (7,6,8); print "$a $b $c\n"'
8
The comma operator is evaluated in scalar context, and returns 8.
$ perl -e '($a,$b,$c) = (7,6,8); print "$a $b $c\n"'
7 6 8
The comma operator is evaluated in list context, and returns a list.
$ perl -e '$a,$b,$c = () = (7,6,8); print "$a $b $c\n"'
3
The comma operator is evaluated in list context, returning a list, then the assignment to $c forces scalar context, returning the number of elements in the list.

= and , operators in Perl

Please explain this apparently inconsistent behaviour:
$a = b, c;
print $a; # this prints: b
$a = (b, c);
print $a; # this prints: c
The = operator has higher precedence than ,.
And the comma operator throws away its left argument and returns the right one.
Note that the comma operator behaves differently depending on context. From perldoc perlop:
Binary "," is the comma operator. In
scalar context it evaluates its left
argument, throws that value away, then
evaluates its right argument and
returns that value. This is just like
C's comma operator.
In list context, it's just the list
argument separator, and inserts both
its arguments into the list. These
arguments are also evaluated from left
to right.
As eugene's answer seems to leave some questions by OP i try to explain based on that:
$a = "b", "c";
print $a;
Here the left argument is $a = "b" because = has a higher precedence than , it will be evaluated first. After that $a contains "b".
The right argument is "c" and will be returned as i show soon.
At that point when you print $a it is obviously printing b to your screen.
$a = ("b", "c");
print $a;
Here the term ("b","c") will be evaluated first because of the higher precedence of parentheses. It returns "c" and this will be assigned to $a.
So here you print "c".
$var = ($a = "b","c");
print $var;
print $a;
Here $a contains "b" and $var contains "c".
Once you get the precedence rules this is perfectly consistent
Since eugene and mugen have answered this question nicely with good examples already, I am going to setup some concepts then ask some conceptual questions of the OP to see if it helps to illuminate some Perl concepts.
The first concept is what the sigils $ and # mean (we wont descuss % here). # means multiple items (said "these things"). $ means one item (said "this thing"). To get first element of an array #a you can do $first = $a[0], get the last element: $last = $a[-1]. N.B. not #a[0] or #a[-1]. You can slice by doing #shorter = #longer[1,2].
The second concept is the difference between void, scalar and list context. Perl has the concept of the context in which your containers (scalars, arrays etc.) are used. An easy way to see this is that if you store a list (we will get to this) as an array #array = ("cow", "sheep", "llama") then we store the array as a scalar $size = #array we get the length of the array. We can also force this behavior by using the scalar operator such as print scalar #array. I will say it one more time for clarity: An array (not a list) in scalar context will return, not an element (as a list does) but rather the length of the array.
Remember from before you use the $ sigil when you only expect one item, i.e. $first = $a[0]. In this way you know you are in scalar context. Now when you call $length = #array you can see clearly that you are calling the array in scalar context, and thus you trigger the special property of an array in list context, you get its length.
This has another nice feature for testing if there are element in the array. print '#array contains items' if #array; print '#array is empty' unless #array. The if/unless tests force scalar context on the array, thus the if sees the length of the array not elements of it. Since all numerical values are 'truthy' except zero, if the array has non-zero length, the statement if #array evaluates to true and you get the print statement.
Void context means that the return value of some operation is ignored. A useful operation in void context could be something like incrementing. $n = 1; $n++; print $n; In this example $n++ (increment after returning) was in void context in that its return value "1" wasn't used (stored, printed etc).
The third concept is the difference between a list and an array. A list is an ordered set of values, an array is a container that holds an ordered set of values. You can see the difference for example in the gymnastics one must do to get particular element after using sort without storing the result first (try pop sort { $a cmp $b } #array for example, which doesn't work because pop does not act on a list, only an array).
Now we can ask, when you attempt your examples, what would you want Perl to do in these cases? As others have said, this depends on precedence.
In your first example, since the = operator has higher precedence than the ,, you haven't actually assigned a list to the variable, you have done something more like ($a = "b"), ("c") which effectively does nothing with the string "c". In fact it was called in void context. With warnings enabled, since this operation does not accomplish anything, Perl attempts to warn you that you probably didn't mean to do that with the message: Useless use of a constant in void context.
Now, what would you want Perl to do when you attempt to store a list to a scalar (or use a list in a scalar context)? It will not store the length of the list, this is only a behavior of an array. Therefore it must store one of the values in the list. While I know it is not canonically true, this example is very close to what happens.
my #animals = ("cow", "sheep", "llama");
my $return;
foreach my $animal (#animals) {
$return = $animal;
}
print $return;
And therefore you get the last element of the list (the canonical difference is that the preceding values were never stored then overwritten, however the logic is similar).
There are ways to store a something that looks like a list in a scalar, but this involves references. Read more about that in perldoc perlreftut.
Hopefully this makes things a little more clear. Finally I will say, until you get the hang of Perl's precedence rules, it never hurts to put in explicit parentheses for lists and function's arguments.
There is an easy way to see how Perl handles both of the examples, just run them through with:
perl -MO=Deparse,-p -e'...'
As you can see, the difference is because the order of operations is slightly different than you might suspect.
perl -MO=Deparse,-p -e'$a = a, b;print $a'
(($a = 'a'), '???');
print($a);
perl -MO=Deparse,-p -e'$a = (a, b);print $a'
($a = ('???', 'b'));
print($a);
Note: you see '???', because the original value got optimized away.

What is the difference between the scalar and list contexts in Perl?

What is the difference between the scalar and list contexts in Perl and does this have any parallel in other languages such as Java or Javascript?
Various operators in Perl are context sensitive and produce different results in list and scalar context.
For example:
my(#array) = (1, 2, 4, 8, 16);
my($first) = #array;
my(#copy1) = #array;
my #copy2 = #array;
my $count = #array;
print "array: #array\n";
print "first: $first\n";
print "copy1: #copy1\n";
print "copy2: #copy2\n";
print "count: $count\n";
Output:
array: 1 2 4 8 16
first: 1
copy1: 1 2 4 8 16
copy2: 1 2 4 8 16
count: 5
Now:
$first contains 1 (the first element of the array), because the parentheses in the my($first) provide an array context, but there's only space for one value in $first.
both #copy1 and #copy2 contain a copy of #array,
and $count contains 5 because it is a scalar context, and #array evaluates to the number of elements in the array in a scalar context.
More elaborate examples could be constructed too (the results are an exercise for the reader):
my($item1, $item2, #rest) = #array;
my(#copy3, #copy4) = #array, #array;
There is no direct parallel to list and scalar context in other languages that I know of.
Scalar context is what you get when you're looking for a single value. List context is what you get when you're looking for multiple values. One of the most common places to see the distinction is when working with arrays:
#x = #array; # copy an array
$x = #array; # get the number of elements in an array
Other operators and functions are context sensitive as well:
$x = 'abc' =~ /(\w+)/; # $x = 1
($x) = 'abc' =~ /(\w+)/; # $x = 'abc'
#x = localtime(); # (seconds, minutes, hours...)
$x = localtime(); # 'Thu Dec 18 10:02:17 2008'
How an operator (or function) behaves in a given context is up to the operator. There are no general rules for how things are supposed to behave.
You can make your own subroutines context sensitive by using the wantarray function to determine the calling context. You can force an expression to be evaluated in scalar context by using the scalar keyword.
In addition to scalar and list contexts you'll also see "void" (no return value expected) and "boolean" (a true/false value expected) contexts mentioned in the documentation.
This simply means that a data-type will be evaluated based on the mode of the operation. For example, an assignment to a scalar means the right-side will be evaluated as a scalar.
I think the best means of understanding context is learning about wantarray. So imagine that = is a subroutine that implements wantarray:
sub = {
return if ( ! defined wantarray ); # void: just return (doesn't make sense for =)
return #_ if ( wantarray ); # list: return the array
return $#_ + 1; # scalar: return the count of the #_
}
The examples in this post work as if the above subroutine is called by passing the right-side as the parameter.
As for parallels in other languages, yes, I still maintain that virtually every language supports something similar. Polymorphism is similar in all OO languages. Another example, Java converts objects to String in certain contexts. And every untyped scripting language i've used has similar concepts.