I have a function, which depends on calling context and i wanted to use this function as as argument to other function. Surprisingly i discovered that this second function is called in list context now. I tried force scalar context with +() but it does not work as i expected. So only way was to call it implicitly with scalar.
use 5.010;
say first( 1, second( 'y' ) );
say first( 1, +( second( 'y' ) ) );
say first( 1, scalar second( 'y' ) );
sub first {
my $x = shift;
my $y = shift;
return "$x + $y";
}
sub second {
my $y = shift;
if ( wantarray ) {
qw/ array array /;
} else {
'scalar';
}
}
__END__
1 + array
1 + array
1 + scalar
Arguments to function are treated as list, but does it mean that every argument in that list implies list context too? If yes, then why?
And, using scalar works, but which other ways i have to call this function in scalar context (without intermediate variable)?
It makes sense that function arguments are evaluated in list context:
In Perl, all subroutines map lists to lists.
If a sub takes a list, it makes sense to evaluate all arguments in list context, which makes functions composable. Consider map:
map { ... } 1, 2, 3; # makes sense
map { ... } foo(); # can we "return 1, 2, 3" for the same effect?
Should the foo() be called in scalar context, this would be equivalent to return 3 when we intended to return the list 1, 2, 3. Using list context enables us to compose list transformations without to many explicit loops. The Schwartzian Transform is a prime example of this:
my #sorted =
map { $_->[1] }
sort { $a->[0] <=> $b->[0] }
map { [make_key($_), $_] }
#input;
There are two ways to evaluate an argument in list context:
Use an expression that forces scalar context, like scalar or scalar operations.
Use prototypes like ($). This destroys composability of functions, requires your subroutine to be predeclared, can have confusing semantics, and is action at a distance. Prototypes should therefore be avoided.
Your +(...) did not force scalar context because unary plus does not impose context. It is commonly used to disambiguate parens (e.g. used for precedence) from a function application, e.g. map +($_ => 2*$_), 1, 2, 3.
The unary plus is distinct from the binary plus, an arithmetic operator which imposes scalar context on its operands and coerces them to numbers.
Why function arguments induce list context?
Subroutines accept a variable number of scalars as arguments. What other choice is there?
Arguments to function are treated as list, but does it mean that every argument in that list implies list context too? If yes, then why?
Yes. Because you want to be able to build lists from the contents of hashes and arrays. There's a million reason why that's useful.
%h = (%h, ...); # Add to a hash
f( $x, #opts ); # Composing argument lists
etc
using scalar works, but which other ways i have to call this function in scalar context (without intermediate variable)?
Kinda.
say first( 1, "".second( 'y' ) ); # Side-effect: stringification
say first( 1, 0+.second( 'y' ) ); # Side-effect: numificatiion
say first( 1, !!second( 'y' ) ); # Side-effect: conversion to boolean
Subrountine prototypes can also enforce scalar context, but they're generally seen as bad for that very reason.
Related
I'm reading "Programming Perl" and ran into a strange example that does not seem to make sense. The book describes how the comma operator in Perl will return only the last result when used in a scalar context.
Example:
# After this statement, $stuff = "three"
$stuff = ("one", "two", "three");
The book then gives this example of a "reverse comma operator" a few pages later (page 82)
# A "reverse comma operator".
return (pop(#foo), pop(#foo))[0];
However to me this doesn't seem to be reverse at all..?
Example:
# After this statement, $stuff = "three"
$stuff = reverse_comma("one", "two", "three");
# Implementation of the "reverse comma operator"
sub reverse_comma {
return (pop(#_), pop(#_))[0];
}
How is this in any way reverse of the normal comma operator? The results are the same, not reversed!
Here is a link to the exact page. The example is near the bottom.
It's a bad example, and should be forgotten.
What it's demonstrating is simple:
Normally, if you have a sequence of expressions separated by commas in scalar context, that can be interpreted an instance of the comma operator, which evaluates to the last thing in the sequence.
However, if you put that sequence in parentheses and stick [0] at the end, it turns that sequence into a list and takes its first element, e.g.
my $x = (1, 2, 3)[0];
For some reason, the book calls this the "reverse comma operator". This is a misnomer; it's just a list that's having its first element taken.
The book is confusing matters further by using the pop function twice in the arguments. These are evaluated from left to right, so the first pop evaluates to "three" and the second one to "two".
In any case: Don't ever use either the comma or "reverse comma" operators in real code. Both are likely to prove confusing to future readers.
It's a cute and clever example, but thinking too hard about it distracts from its purpose. That section of the book is showing off list slices. Anything beyond slicing a list, no matter what is in the list, is not germane to the purpose of the examples.
You're only on page 82 of a very big book (we literally couldn't fit any more pages in because we were at the limit of the binding method), so there's not much we could throw at you. Among the other list slices examples, there this clever one that I wouldn't use in real code. That's the curse of contrived examples though.
But let's say there were a reverse comma operator. It would have to evaluate both side of the comma. Many answers go right for "just return the first thing". That's not the feature though. You have to visit every expression even though you keep one of them.
Consider this much much advanced version with a series of anonymous subroutines that I immediately dereference, each of which prints something then returns a result:
use v5.10;
my $scalar = (
sub { say "First"; 35 } -> (),
sub { say "wantarray is ", 0+wantarray } -> (),
sub { say "Second"; 27 } -> (),
sub { say "Third"; 137 } -> ()
);
The parens are there only for precedence since the assignment operator binds more tightly than the comma operator. There's no list here, even though it looks like there is one.
The output shows that Perl evaluated each even though it kept on the last one:
First
wantarray is 0
Second
Third
Scalar is [137]
The poorly-named wantarray built-in returns false, noting that the subroutine thinks it is in scalar context.
Now, suppose that you wanted to flip that around so it still evaluates every expression but keeps the first one. You can use a literal list access:
my $scalar = (
sub { say "First"; 35 } -> (),
sub { say "wantarray is ", 0+wantarray } -> (),
sub { say "Second"; 27 } -> (),
sub { say "Third"; 137 } -> ()
)[0];
With the addition on the subscription, the righthand side is now a list and I pull out the first item. Notice that the second subroutine thinks it is in list context now. I get the result of the first subroutine:
First
wantarray is 1
Second
Third
Scalar is [35]
But, let's put this in a subroutine. I still need to call each subroutine even though I won't use the results of the other ones:
my $scalar = reverse_comma(
sub { say "First"; 35 },
sub { say "wantarray is ", 0+wantarray },
sub { say "Second"; 27 },
sub { say "Third"; 137 }
);
say "Scalar is [$scalar]";
sub reverse_comma { ( map { $_->() } #_ )[0] }
Or, would I use the results of the other ones? What if I did something slightly different. I'll add a side effect of setting $last to the evaluated expression:
use v5.10;
my $last;
my $scalar = reverse_comma(
sub { say "First"; 35 },
sub { say "wantarray is ", 0+wantarray },
sub { say "Second"; 27 },
sub { say "Third"; 137 }
);
say "Scalar is [$scalar]";
say "Last is [$last]";
sub reverse_comma { ( map { $last = $_->() } #_ )[0] }
Now I see the feature that makes the scalar comma interesting. It evaluates all the expressions, and some of them might have side effects:
First
wantarray is 0
Second
Third
Scalar is [35]
Last is [137]
It's not a huge secret that tchrist is handy with shell scripts. The Perl to csh converter was basically his inbox (or comp.lang.perl). You'll see some shell like idioms popping up in his examples. Something like that trick to swap two numerical values with one statement:
use v5.10;
my $x = 17;
my $y = 137;
say "x => $x, y => $y";
$x = (
$x = $x + $y,
$y = $x - $y,
$x - $y,
);
say "x => $x, y => $y";
The side effects are important there.
So, back to the Camel, we have an example of where the thing that has the side effect is the pop array operator:
use v5.10;
my #foo = qw( g h j k );
say "list: #foo";
my $scalar = sub { return ( pop(#foo), pop(#foo) )[0] }->();
say "list: #foo";
This shows off that each expression on either side of all the commas are evaluated. I threw in the subroutine wrapper since we didn't show a complete example:
list: g h j k
list: g h
But, none of this was the point of that section, which was indexing into a list literal. The point of the example was not to return a different result than the other examples or Perl features. It was to return the same thing assuming the comma operator acted differently. The section is about list slices and is showing off list slices, so the stuff in the list wasn't the important part.
Let's make the examples more similar.
The comma operator:
$scalar = ("one", "two", "three");
# $scalar now contains "three"
The reverse comma operator:
$scalar = ("one", "two", "three")[0];
# $scalar now contains "one"
It's a "reverse comma" because $scalar gets the result of the first expression, where the normal comma operator gives the last expression. (If you know anything about Lisp, it's like the difference between progn and prog1.)
An implementation of the "reverse comma operator" as a subroutine would look something like this:
sub reverse_comma {
return shift #_;
}
An ordinary comma will evaluate its operands and then return the value of the right-hand operand
my $v = $a, $b
sets $v to the value of $b
For the purpose of demonstrating list slices, the Camel is proposing some code that behaves like the comma operator but instead evaluates its operands and then return the value of the left-hand operand
Something like that can be done with a list slice, like this
my $v = ($a, $b)[0]
which sets $v to the value of $a
That's all there is to it really. The book isn't trying to suggest that there should be a reverse comma subroutine, it is simply considering the problem of evaluating two expressions in order and returning the first. The order of evaluation is relevant only when the two expressions have side effects, which is why the example in the book uses pop, which changes the array as well as returning a value
The imaginary problem is this
Suppose I want a subroutine to remove the last two elements of an array, and then return the value of what used to be the last element
Ordinarily that would require a temporary variable, like this
my $last = pop #foo;
pop #foo;
return $last;
But as an example of a list slice the code suggests that this would also work
# A "reverse comma operator".
return (pop(#foo), pop(#foo))[0];
Please understand that this isn't a recommendation. There are a few ways to do this. Another single-statement way is
return scalar splice #foo, -2;
but that doesn't use a list slice, which is the topic of that section of the book. In reality I doubt if the book's authors would propose anything other than the simple solution with the temporary variable. It is purely an example of what a list slice can do
I hope that helps
my #names = ( (), 'my', 'name' );
sub fn1 {
my #names = ( 'my', 'name' );
return ( (), #names ); #<<< this must be flatted
}
sub fn2 {
return ( (), 'my', 'name' );
}
say length fn1(); #1
say length fn2(); #4
say length #names; #1
say length scalar ( (), 'my', 'name' ); #4
say scalar fn1(); #2
say scalar fn2(); #name
say scalar #names; #2
say scalar ( (), 'my', 'name' ); #name
# but this produce same output
say fn1(); #myname
say fn2(); #myname
say #names; #myname
say ( (), 'my', 'name' ); #myname
Why in subroutine fn1 variable #names does not flatted and array instead of list is returned?
It seems that perl return two lists: one is empty and second is array #names.
This list of two elements in scalar context return last element - arrray.
Array in scalar context is its length: number 2
lenght of string '2' is 1, but this is differ from fn2
Is there any possibility to flat #names in fn1?
In case there's confusion, length does not return the number of elements in the array. It returns the size of a string.
What you're getting caught out by is the list vs array problem in Perl. In short, an array is a variable like #array. A list is something like (1,2,3,4). The problem comes when Perl evaluates them in scalar context.
An #array in scalar context will return its number of elements. A list in scalar context acts like the C comma operator, it returns its last element.
$ perl -wle 'sub foo { #a = ("first","second"); return #a } print scalar foo()'
2
$ perl -wle 'sub foo { return("first","second") } print scalar foo()'
second
So when you try sub foo { return("first","second") } print length foo() length puts the call to foo() in scalar context. foo() treats the list in return("first", "second") as the comma operator and returns the last element returning "second". And you get 6.
$ perl -wle 'sub foo { return("first","second") } print length foo()'
6
Long story short, don't return lists, return arrays.
UPDATE: I've figured out your confusion, and my own. () is doing something, it's converting what should be a simple array into a C comma operator. return( (), #names ) and return( #names ) are not the same thing. Let's pick them apart.
In scalar context, return( #name ) says "evaluate this array in scalar context" and returns the number of elements. In list context it will return all of #names. Simple. This is what you want to be using.
return( (), #names ) on the other hand is the comma operator. In list context it will return the whole list which flattens to #names. In scalar context it will return the last element, which is #names which will then be evaluated in scalar context returning its number of elements.
To see how this is true, if we add another element onto the end that's what we get back.
sub foo { return (), #names, 42 }
print scalar foo; # 42
Part of what's going on here is return is not acting like a normal function. It does not squash its arguments into an array before acting on them. You are returning the literal evaluation of the expression (), #names, 42 which is the comma operator.
I can see now why you started using (), but all you are doing is forcing the expression to be evaluated as the comma operator. Stop using it, it's a crutch and it's very hard to understand. What you need to do instead is learn about return contexts.
My very strong rule of thumb remains, don't return lists, return arrays.
length and scalar force scalar context, while say forces list context.
Note that the lengths match the strings:
String Length
--------------
2 1
name 4
2 1
name 4
I got answer: http://perldoc.perl.org/functions/scalar.html
If I really want to return array (NOT LIST) from subroutine, regardless of calling context I must write:
return #{ [ #names ] }
instead of
return ( (), #names );
becase calling function fn1 in scalar context really mean:
return scalar ( (), #names );
I must look description of return operator: http://perldoc.perl.org/functions/return.html
Evaluation of EXPR may be in list, scalar, or void context, depending on how the return value will be used
More detailed description from Schwern:
In scalar context, return( #name ) says "evaluate this array in scalar context" and returns the number of elements. In list context it will return all of #names. Simple. This is what you want to be using.
return( (), #names ) on the other hand is the comma operator. In list context it will return the whole list which flattens to #names. In scalar context it will return the last element, which is #names, which will then be evaluated in scalar context returning its number of elements.
To see how this is true, if we add another element onto the end that's what we get back.
sub foo { return (), #names, 42 }
print scalar foo; # 42
Why do we use function protoypes in Perl?
What are the different prototypes available? How to use them?
Example: $$,$#,\## what do they mean?
You can find the description in the official documentation: http://perldoc.perl.org/perlsub.html#Prototypes
But more important: read why you should not use function prototytpes" Why are Perl 5's function prototypes bad?
To write some functions, prototypes are absolutely neccessary, as they change the way arguments are passed, the sub invocations are parsed, and in what context the arguments are evaluated.
Below are discussions on prototypes with the builtins open and bless, as well as the effect on user-written code like a fold_left subroutine. I come to the conclusion that there are a few scenarios where they are useful, but they are generally not a good mechanism to cope with signatures.
Example: CORE::open
Some builtin functions have prototypes, e.g open. You can get the prototype of any function like say prototype "CORE::open". We get *;$#. This means:
The * takes a bareword, glob, globref or scalar. E.g. STDOUT or my $fh.
The ; makes the following arguments optional.
The $ evaluates the next item in scalar context. We'll see in a minute why this is good.
The # allows any number of arguments.
This allows invocations like
open FOO; (very bad style, equivalent to open FOO, our $FOO)
open my $fh, #array;, which parses as open my $fh, scalar(#array). Useless
open my $fh, "<foo.txt"; (bad style, allows shell injection)
open my $fh, "<", "foo.txt"; (good three-arg-open)
open my $fh, "-|", #command; (now #command is evaluated in list context, i.e. is flattened)
So why should the second argument have scalar context? (1) either you use traditional two-arg-open. Then it isn't difficult to access the first element. (2) Or you want 3-arg-open (rather: multiarg). Then having an explicit mode in the source code is neccessary, which is good style and reduces action at a distance. So this forces you to decide between the outdated flexible 2-arg or the safe multi-arg.
Further restrictions, like that the < mode can only take one filename, while -| takes at least one string (the command) plus any number of arguments, are implemented on a non-syntactic level.
Example: CORE::bless
Another interesting example is the bless function. Its prototype is $;$. I.e. takes one or two scalars.
This allows bless $self; (blesses into current package), or the better bless $self, $class. However, my #array = ($self, $class); bless #array does not work, as scalar context is imposed on the first arg. So the first argument is not a reference, but the number 2. This reduces action at a distance, and fails rather than providing a probably wrong interpretation: both bless $array[0], $array[1] or bless \#array could have been meant here. So prototypes help and augment input validation, but are no substitute for it.
Example fold_left
Let us define a function fold_left that takes a list and an action as arguments. It performs this action on the first two values of the list, and replaces them with the result. This loops until only one element, the return value is left.
Simple implementation:
sub fold_left {
my $code = shift;
while ($#_) { # loop while more than one element
my ($x, $y) = splice #_, 0, 2;
unshift #_, $code->($x, $y);
}
return $_[0];
}
This can be called like
my $sum = fold_left sub{ $_[0] + $_[1] }, 1 .. 10;
my $str = fold_left sub{ "$_[0] $_[1]" }, 1 .. 10;
my $undef = fold_left;
my $runtime_error = fold_left \"foo", 1..10;
But this is unsatisfactory: we know that the first argument is a sub, so the sub keyword is redundant. Also, We can call it without a sub, which we want to be illegal. With prototypes, we can work around that:
sub fold_left (&#) { ... }
The & states that we'll take a coderef. If this is the first argument, this allows the sub keyword and the comma after the sub block to be omitted. Now we can do
my $sum = fold_left { $_[0] + $_[1] } 1 .. 10; # aka List::Util::sum(1..10);
my $str = fold_left { "$_[0] $_[1]" } 1 .. 10; # aka join " ", 1..10;
my $compile_error1 = fold_left; # ERROR: not enough arguments
my $compile_error2 = fold_left "foo", 1..10; # ERROR: type of arg 1 must be sub{} or block.
which is reminiscent of map {...} #list
On backslash prototypes
Backslash prototypes allow to capture typed references to arguments without imposing context. This is good when we want to pass an array without flattening it. E.g.
sub mypush (\##) {
my ($arrayref, #push_these) = #_;
my $len = #$arrayref;
#$arrayref[$len .. $len + $#push_these] = #push_these;
}
my #array;
mypush #array, 1, 2, 3;
You can think of the \ protecting the # like in regexes, thus requiring a literal # character on the argument. This is where prototypes are a sad story: Requiring literal characters is a bad idea. We can't even pass a reference directly, we have to dereference it first:
my $array = [];
mypush #$array, 1, 2, 3;
even though the called code sees and wants exactly that reference. From v14 on, the + can be used instead. It accepts an array, arrayref, hash or hashref (actually, it's like $ on scalar arguments, and \[#%] on hashes and arrays). This proto does no type validation, It'll just make sure you receive a reference unless the argument already is scalar.
sub mypush (+#) { ... }
my #array;
mypush #array, 1, 2, 3;
my $array_ref = [];
mypush $array_ref, 1, 2, 3; # works as well! yay
my %hash;
mypush %hash, 1, 2, 3; # syntactically legal, but will throw fatal on dereferencing.
mypush "foo", 1, 2, 3; # ditto
Conclusion
Prototypes are a great way to bend Perl to your will. Recently I was investigating how pattern matching from functional languages can be implemented in Perl. The match itself has the prototype $% (one scalar thing which is to be matched, and an even number of further arguments. These are pairs of patterns and code).
They are also a great way to shoot yourself in the foot, and can be downright ugly. From List::MoreUtils:
sub each_array (\#;\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#\#) {
return each_arrayref(#_);
}
This allows you to call it as each_array #a, #b, #c ..., but it isn't much effort to directly do each_arrayref \#a, \#b, \#c, ..., which imposes no limit on the number of parameters, and is more flexible.
Especially parameters like sub foo ($$$$$$;$$) indicate a code smell, and that you should move to named parameters, Method::Signatures, or Params::Validate.
In my experience, good prototypes are
#, % to slurp any (or an even) number of args. Note that # as sole prototype is equivalent to no prototype at all.
& leading codeblocks for nicer syntax.
$ iff you need to pad a slurpy # or %, but not on their own.
I actively dislike \# etc, and have yet to see a good use for _ aside from length (_ can be the last required argument in a prototype. If no explicit value is given, $_ is used.)
Having a good documentation and requiring the user of your subs to include the occasional backslash before your arguments is generally preferable to unexpected action at a distance or having scalar context imposed surprisingly.
Prototypes can be overridden like &foo(#args), and aren't honoured on method calls, so they are already useless here.
Ok, I understand perl hash slices, and the "x" operator in Perl, but can someone explain the following code example from here (slightly simplified)?
sub test{
my %hash;
#hash{#_} = (undef) x #_;
}
Example Call to sub:
test('one', 'two', 'three');
This line is what throws me:
#hash{#_} = (undef) x #_;
It is creating a hash where the keys are the parameters to the sub and initializing to undef, so:
%hash:
'one' => undef,
'two' => undef,
'three' => undef
The rvalue of the x operator should be a number; how is it that #_ is interpreted as the length of the sub's parameter array? I would expect you'd at least have to do this:
#hash{#_} = (undef) x scalar #_;
To figure out this code you need to understand three things:
The repetition operator. The x operator is the repetition operator. In list context, if the operator's left-hand argument is enclosed in parentheses, it will repeat the items in a list:
my #x = ('foo') x 3; # ('foo', 'foo', 'foo')
Arrays in scalar context. When an array is used in scalar context, it returns its size. The x operator imposes scalar context on its right-hand argument.
my #y = (7,8,9);
my $n = 10 * #y; # $n is 30
Hash slices. The hash slice syntax provides a way to access multiple hash items at once. A hash slice can retrieve hash values, or it can be assigned to. In the case at hand, we are assigning to a hash slice.
# Right side creates a list of repeated undef values -- the size of #_.
# We assign that list to a set of hash keys -- also provided by #_.
#hash{#_} = (undef) x #_;
Less obscure ways to do the same thing:
#hash{#_} = ();
$hash{$_} = undef for #_;
In scalar context, an array evaluates to its length. From perldoc perldata:
If you evaluate an array in scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value, like the C comma operator, nor of built-in functions, which return whatever they feel like returning.)
Although I cannot find more information on it currently, it seems that the replication operator evaluates its second argument in scalar context, causing the array to evaluate to its length.
What is the difference between the scalar and list contexts in Perl and does this have any parallel in other languages such as Java or Javascript?
Various operators in Perl are context sensitive and produce different results in list and scalar context.
For example:
my(#array) = (1, 2, 4, 8, 16);
my($first) = #array;
my(#copy1) = #array;
my #copy2 = #array;
my $count = #array;
print "array: #array\n";
print "first: $first\n";
print "copy1: #copy1\n";
print "copy2: #copy2\n";
print "count: $count\n";
Output:
array: 1 2 4 8 16
first: 1
copy1: 1 2 4 8 16
copy2: 1 2 4 8 16
count: 5
Now:
$first contains 1 (the first element of the array), because the parentheses in the my($first) provide an array context, but there's only space for one value in $first.
both #copy1 and #copy2 contain a copy of #array,
and $count contains 5 because it is a scalar context, and #array evaluates to the number of elements in the array in a scalar context.
More elaborate examples could be constructed too (the results are an exercise for the reader):
my($item1, $item2, #rest) = #array;
my(#copy3, #copy4) = #array, #array;
There is no direct parallel to list and scalar context in other languages that I know of.
Scalar context is what you get when you're looking for a single value. List context is what you get when you're looking for multiple values. One of the most common places to see the distinction is when working with arrays:
#x = #array; # copy an array
$x = #array; # get the number of elements in an array
Other operators and functions are context sensitive as well:
$x = 'abc' =~ /(\w+)/; # $x = 1
($x) = 'abc' =~ /(\w+)/; # $x = 'abc'
#x = localtime(); # (seconds, minutes, hours...)
$x = localtime(); # 'Thu Dec 18 10:02:17 2008'
How an operator (or function) behaves in a given context is up to the operator. There are no general rules for how things are supposed to behave.
You can make your own subroutines context sensitive by using the wantarray function to determine the calling context. You can force an expression to be evaluated in scalar context by using the scalar keyword.
In addition to scalar and list contexts you'll also see "void" (no return value expected) and "boolean" (a true/false value expected) contexts mentioned in the documentation.
This simply means that a data-type will be evaluated based on the mode of the operation. For example, an assignment to a scalar means the right-side will be evaluated as a scalar.
I think the best means of understanding context is learning about wantarray. So imagine that = is a subroutine that implements wantarray:
sub = {
return if ( ! defined wantarray ); # void: just return (doesn't make sense for =)
return #_ if ( wantarray ); # list: return the array
return $#_ + 1; # scalar: return the count of the #_
}
The examples in this post work as if the above subroutine is called by passing the right-side as the parameter.
As for parallels in other languages, yes, I still maintain that virtually every language supports something similar. Polymorphism is similar in all OO languages. Another example, Java converts objects to String in certain contexts. And every untyped scripting language i've used has similar concepts.