Meaning of the Perl syntax construction involving a comma

I've encountered a piece of code in a book that looks like this:
#for (some_condition) {
#do something not particularly related to the question
$var = $anotherVar+1, next if #some other condition with $var
#}
I have no idea what the comma (",") between $anotherVar+1 and next does. What is this syntax construction called, and is it even correct?

The comma operator is described in perlop. You can use it to separate expressions: it evaluates its left operand first, then its right operand. In this case, the right operand is next, which changes the flow of the program.
Basically, this is a shorter way of writing
if ($var eq "...") {
$var = $anotherVar + 1;
next
}
The comma can be used in a similar way in C, where you can find it often in for loops:
for (i = 0, j = 10; i < 10; i++, j--)
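As a rough, runnable sketch of the same idea in Perl (the helper name collect_even and its inputs are made up for illustration and are not from the book), a single statement modifier can guard both the assignment and the next:

```perl
use strict;
use warnings;

# Hypothetical helper: keeps even numbers; for odd ones it records
# the number plus one and skips to the next iteration.
sub collect_even {
    my @input = @_;
    my ($bumped, @kept) = (0);
    for my $n (@input) {
        # Both expressions left of "if" run only when the condition holds:
        # first the assignment, then "next" skips the push below.
        $bumped = $n + 1, next if $n % 2;
        push @kept, $n;
    }
    return ($bumped, @kept);
}

my ($bumped, @kept) = collect_even(1 .. 5);
print "last bumped: $bumped; kept: @kept\n";    # last bumped: 6; kept: 2 4
```

Because = binds more tightly than the comma, the statement parses as ($bumped = $n + 1), next, and the whole comma expression sits under the if modifier.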

The comma is an operator, in any context. In list context where it's usually seen, it is one way to concatenate values into a list. Here it is being used in scalar context, where it runs the preceding expression, the following expression, and then returns the following expression. This is a holdover from how it works in C and similar languages, when it's not an argument separator; see https://learn.microsoft.com/en-us/cpp/cpp/comma-operator?view=vs-2019.
my @stuff = ('a', 'b'); # comma in list context, forms a list and assigns it
my $stuff = ('a', 'b'); # comma in scalar context, assigns "b"
my $stuff = 'a', 'b'; # assignment has higher precedence
# assignment done first then comma operator evaluated in greater context
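The three cases can be checked with a small sketch. The helper name comma_contexts is hypothetical, and the void warnings are switched off only so the discarded constants do not clutter the output:

```perl
use strict;
use warnings;
no warnings 'void';    # silence "Useless use of a constant in void context"

# Hypothetical helper that returns what each form ends up storing.
sub comma_contexts {
    my @as_list = ('a', 'b');    # list context: both values are kept
    my $last    = ('a', 'b');    # scalar context: comma operator yields 'b'
    my $first;
    $first = 'a', 'b';           # parsed as ($first = 'a'), 'b'
    return (scalar(@as_list), $last, $first);
}

print join(' ', comma_contexts()), "\n";    # 2 b a
```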

Consider the following code:
$x=1;
$y=1;
$x++ , $y++ if 0; # note the comma! both x and y are one statement
print "With comma: $x $y\n";
$x=1;
$y=1;
$x++ ; $y++ if 0; # note the semicolon! so two separate statements
print "With semicolon: $x $y\n";
The output is as follows:
With comma: 1 1
With semicolon: 2 1
A comma is similar to a semicolon, except that both sides of the comma are treated as a single statement. This means that in a situation where only one statement is expected, both sides of the comma are evaluated (or, as with the if 0 above, both are skipped).

Related

Perl "reverse comma operator" (Example from the book Programming Perl, 4th Edition)

I'm reading "Programming Perl" and ran into a strange example that does not seem to make sense. The book describes how the comma operator in Perl will return only the last result when used in a scalar context.
Example:
# After this statement, $stuff = "three"
$stuff = ("one", "two", "three");
The book then gives this example of a "reverse comma operator" a few pages later (page 82)
# A "reverse comma operator".
return (pop(@foo), pop(@foo))[0];
However to me this doesn't seem to be reverse at all..?
Example:
# After this statement, $stuff = "three"
$stuff = reverse_comma("one", "two", "three");
# Implementation of the "reverse comma operator"
sub reverse_comma {
return (pop(@_), pop(@_))[0];
}
How is this in any way reverse of the normal comma operator? The results are the same, not reversed!
Here is a link to the exact page. The example is near the bottom.
It's a bad example, and should be forgotten.
What it's demonstrating is simple:
Normally, if you have a sequence of expressions separated by commas in scalar context, that can be interpreted as an instance of the comma operator, which evaluates to the last thing in the sequence.
However, if you put that sequence in parentheses and stick [0] at the end, it turns that sequence into a list and takes its first element, e.g.
my $x = (1, 2, 3)[0];
For some reason, the book calls this the "reverse comma operator". This is a misnomer; it's just a list that's having its first element taken.
The book is confusing matters further by using the pop function twice in the arguments. These are evaluated from left to right, so the first pop evaluates to "three" and the second one to "two".
In any case: Don't ever use either the comma or "reverse comma" operators in real code. Both are likely to prove confusing to future readers.
It's a cute and clever example, but thinking too hard about it distracts from its purpose. That section of the book is showing off list slices. Anything beyond slicing a list, no matter what is in the list, is not germane to the purpose of the examples.
You're only on page 82 of a very big book (we literally couldn't fit any more pages in because we were at the limit of the binding method), so there's not much we could throw at you. Among the other list slice examples, there's this clever one that I wouldn't use in real code. That's the curse of contrived examples though.
But let's say there were a reverse comma operator. It would have to evaluate both sides of the comma. Many answers go right for "just return the first thing". That's not the feature though. You have to visit every expression even though you keep one of them.
Consider this much more advanced version with a series of anonymous subroutines that I immediately dereference, each of which prints something then returns a result:
use v5.10;
my $scalar = (
sub { say "First"; 35 } -> (),
sub { say "wantarray is ", 0+wantarray } -> (),
sub { say "Second"; 27 } -> (),
sub { say "Third"; 137 } -> ()
);
say "Scalar is [$scalar]";
The parens are there only for precedence since the assignment operator binds more tightly than the comma operator. There's no list here, even though it looks like there is one.
The output shows that Perl evaluated each even though it kept only the last one:
First
wantarray is 0
Second
Third
Scalar is [137]
The poorly-named wantarray built-in returns false, noting that the subroutine thinks it is in scalar context.
Now, suppose that you wanted to flip that around so it still evaluates every expression but keeps the first one. You can use a literal list access:
my $scalar = (
sub { say "First"; 35 } -> (),
sub { say "wantarray is ", 0+wantarray } -> (),
sub { say "Second"; 27 } -> (),
sub { say "Third"; 137 } -> ()
)[0];
say "Scalar is [$scalar]";
With the addition of the subscript, the right-hand side is now a list and I pull out the first item. Notice that the second subroutine thinks it is in list context now. I get the result of the first subroutine:
First
wantarray is 1
Second
Third
Scalar is [35]
But, let's put this in a subroutine. I still need to call each subroutine even though I won't use the results of the other ones:
my $scalar = reverse_comma(
sub { say "First"; 35 },
sub { say "wantarray is ", 0+wantarray },
sub { say "Second"; 27 },
sub { say "Third"; 137 }
);
say "Scalar is [$scalar]";
sub reverse_comma { ( map { $_->() } @_ )[0] }
Or, would I use the results of the other ones? What if I did something slightly different. I'll add a side effect of setting $last to the evaluated expression:
use v5.10;
my $last;
my $scalar = reverse_comma(
sub { say "First"; 35 },
sub { say "wantarray is ", 0+wantarray },
sub { say "Second"; 27 },
sub { say "Third"; 137 }
);
say "Scalar is [$scalar]";
say "Last is [$last]";
sub reverse_comma { ( map { $last = $_->() } @_ )[0] }
Now I see the feature that makes the scalar comma interesting. It evaluates all the expressions, and some of them might have side effects:
First
wantarray is 0
Second
Third
Scalar is [35]
Last is [137]
It's not a huge secret that tchrist is handy with shell scripts. The Perl to csh converter was basically his inbox (or comp.lang.perl). You'll see some shell-like idioms popping up in his examples. Something like this trick to swap two numerical values with one statement:
use v5.10;
my $x = 17;
my $y = 137;
say "x => $x, y => $y";
$x = (
$x = $x + $y,
$y = $x - $y,
$x - $y,
);
say "x => $x, y => $y";
The side effects are important there.
So, back to the Camel, we have an example of where the thing that has the side effect is the pop array operator:
use v5.10;
my @foo = qw( g h j k );
say "list: @foo";
my $scalar = sub { return ( pop(@foo), pop(@foo) )[0] }->();
say "list: @foo";
This shows off that each expression on either side of all the commas is evaluated. I threw in the subroutine wrapper since we didn't show a complete example:
list: g h j k
list: g h
But, none of this was the point of that section, which was indexing into a list literal. The point of the example was not to return a different result than the other examples or Perl features. It was to return the same thing assuming the comma operator acted differently. The section is about list slices and is showing off list slices, so the stuff in the list wasn't the important part.
Let's make the examples more similar.
The comma operator:
$scalar = ("one", "two", "three");
# $scalar now contains "three"
The reverse comma operator:
$scalar = ("one", "two", "three")[0];
# $scalar now contains "one"
It's a "reverse comma" because $scalar gets the result of the first expression, where the normal comma operator gives the last expression. (If you know anything about Lisp, it's like the difference between progn and prog1.)
An implementation of the "reverse comma operator" as a subroutine would look something like this:
sub reverse_comma {
return shift @_;
}
An ordinary comma evaluates its operands and then returns the value of the right-hand operand, so
my $v = ($a, $b);
sets $v to the value of $b. (The parentheses are needed because assignment binds more tightly than the comma.)
For the purpose of demonstrating list slices, the Camel is proposing some code that behaves like the comma operator but instead evaluates its operands and then returns the value of the left-hand operand.
Something like that can be done with a list slice, like this:
my $v = ($a, $b)[0];
which sets $v to the value of $a.
That's all there is to it really. The book isn't trying to suggest that there should be a reverse comma subroutine, it is simply considering the problem of evaluating two expressions in order and returning the first. The order of evaluation is relevant only when the two expressions have side effects, which is why the example in the book uses pop, which changes the array as well as returning a value
The imaginary problem is this
Suppose I want a subroutine to remove the last two elements of an array, and then return the value of what used to be the last element
Ordinarily that would require a temporary variable, like this
my $last = pop @foo;
pop @foo;
return $last;
But as an example of a list slice the code suggests that this would also work
# A "reverse comma operator".
return (pop(@foo), pop(@foo))[0];
Please understand that this isn't a recommendation. There are a few ways to do this. Another single-statement way is
return scalar splice @foo, -2;
but that doesn't use a list slice, which is the topic of that section of the book. In reality I doubt if the book's authors would propose anything other than the simple solution with the temporary variable. It is purely an example of what a list slice can do
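To compare the three approaches side by side, here is a small sketch. The helper names drop_two_temp, drop_two_slice and drop_two_splice are invented for this illustration, and they take an array reference only so that each one can be tried on a fresh copy of the data:

```perl
use strict;
use warnings;

# All three helpers (names are illustrative) remove the last two elements
# of the array and return what used to be the last one.
sub drop_two_temp {
    my ($aref) = @_;
    my $last = pop @$aref;
    pop @$aref;
    return $last;
}

sub drop_two_slice {
    my ($aref) = @_;
    return (pop(@$aref), pop(@$aref))[0];    # the book's list slice
}

sub drop_two_splice {
    my ($aref) = @_;
    return scalar splice @$aref, -2;         # last element removed
}

for my $code (\&drop_two_temp, \&drop_two_slice, \&drop_two_splice) {
    my @foo = qw( g h j k );
    my $got = $code->(\@foo);
    print "returned $got, left: @foo\n";     # returned k, left: g h
}
```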
I hope that helps

How exactly does Perl handle operator chaining?

So I have this bit of code that does not work:
print $userInput."\n" x $userInput2; #$userInput = string & $userInput2 is a integer
It prints once if the number is greater than 0, but it doesn't print the remaining copies when the number is greater than 1. I come from a Java background, and I assumed that the concatenation would happen first and that the result would then be repeated by the x operator. But of course that does not happen. Now it works when I do the following:
$userInput .= "\n";
print $userInput x $userInput2;
I am new to Perl so I'd like to understand exactly what goes on with chaining, and if I can even do so.
You're asking about operator precedence. ("Chaining" usually refers to chaining of method calls, e.g. $obj->foo->bar->baz.)
The Perl documentation page perlop starts off with a list of all the operators in order of precedence level. x has the same precedence as other multiplication operators, and . has the same precedence as other addition operators, so of course x is evaluated first. (i.e., it "has higher precedence" or "binds more tightly".)
As in Java you can resolve this with parentheses:
print(($userInput . "\n") x $userInput2);
Note that you need two pairs of parentheses here. If you'd only used the inner parentheses, Perl would treat them as indicating the arguments to print, like this:
# THIS DOESN'T WORK
print($userInput . "\n") x $userInput2;
This would print the string once, then duplicate print's return value some number of times. Putting space before the ( doesn't help since whitespace is generally optional and ignored. In a way, this is another form of operator precedence: function calls bind more tightly than anything else.
If you really hate having more parentheses than strictly necessary, you can defeat Perl with the unary + operator:
print +($userInput . "\n") x $userInput2;
This separates the print from the (, so Perl knows the rest of the line is a single expression. Unary + has no effect whatsoever; its primary use is exactly this sort of situation.
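A quick way to see the precedence difference without the print parsing issue is to build the strings first. This is only a sketch, and repeat_forms is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: builds both strings instead of printing them.
sub repeat_forms {
    my ($text, $n) = @_;
    my $default = $text . "\n" x $n;      # x binds tighter: $text . ("\n" x $n)
    my $grouped = ($text . "\n") x $n;    # parentheses force the concatenation first
    return ($default, $grouped);
}

my ($default, $grouped) = repeat_forms('line', 2);
print length($default), ' ', length($grouped), "\n";    # 6 10
```

Since $grouped is assigned in scalar context, x performs string repetition there even though its left operand is parenthesized.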
This is due to precedence of . (concatenation) operator being less than the x operator. So it ends up with:
use strict;
use warnings;
my $userInput = "line";
my $userInput2 = 2;
print $userInput.("\n" x $userInput2);
And outputs:
line[newline]
[newline]
This is what you want:
print (($userInput."\n") x $userInput2);
This prints out:
line
line
As has already been mentioned, this is a precedence issue, in that the repetition operator x has higher precedence than the concatenation operator .. However, that is not all that's going on here, and also, the issue itself comes from a bad solution.
First off, when you say
print (($foo . "\n") x $count);
What you are doing is changing the context of the repetition operator to list context.
(LIST) x $count
The above statement really means this (if $count == 3):
print ( $foo . "\n", $foo . "\n", $foo . "\n" ); # list with 3 elements
From perldoc perlop:
Binary "x" is the repetition operator. In scalar context or if the left operand is not enclosed in parentheses, it returns a string consisting of the left operand repeated the number of times specified by the right operand. In list context, if the left operand is enclosed in parentheses or is a list formed by qw/STRING/, it repeats the list. If the right operand is zero or negative, it returns an empty string or an empty list, depending on the context.
The solution works as intended because print takes list arguments. However, if you had something else that takes scalar arguments, such as a subroutine:
foo(("text" . "\n") x 3);
sub foo {
# @_ is now the list ("text\n", "text\n", "text\n");
my ($string) = @_; # error enters here
# $string is now "text\n"
}
This is a subtle difference which might not always give the desired result.
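The difference is easy to observe by counting the arguments a subroutine receives. In this sketch, count_args is a hypothetical helper:

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it received.
sub count_args { return scalar @_ }

my $as_list   = count_args(("text\n") x 3);    # list repetition: 3 arguments
my $as_string = count_args("text\n" x 3);      # string repetition: 1 argument
print "$as_list $as_string\n";                 # 3 1
```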
A better solution for this particular case is to not use the concatenation operator at all, because it is redundant:
print "$foo\n" x $count;
Or even use more mundane methods:
for (1 .. $count) {
print "$foo\n";
}
Or
use feature 'say';
...
say $foo for 1 .. $count;

Why does print ($a = a..c) produce: 1E0

print (a..c) # this prints: abc
print ($a = "abc") # this prints: abc
print ($a = a..c); # this prints: 1E0
I would have thought it would print: abc
use strict;
print ($a = "a".."c"); # this prints 1E0
Why? Is it just my computer?
edit: I've got a partial answer (the range operator .. returns a boolean value in scalar context - thanks) but what I don't understand is:
why does: print ($a = "a"..."c") produce 1 instead of 0
why does: print ($a = "a".."c") produce 1E0 instead of 1 or 0
There are a number of subtle things going on here. The first is that .. is really two completely different operators depending on the context in which it's called. In list context it creates a list of values (incrementing by one) between the given starting and ending points.
@numbers = 1 .. 3; # 1, 2, 3
@letters = 'a' .. 'c'; # a, b, c (Yes, Perl can increment strings)
Because print interprets its arguments in list context:
print 'a' .. 'c'; # <-- this
print 'a', 'b', 'c'; # <-- is equivalent to this
In scalar context, .. is the flip-flop operator. From Range Operators in perlop:
It is false as long as its left operand is false. Once the left
operand is true, the range operator stays true until the right operand
is true, AFTER which the range operator becomes false again.
Assignment to a scalar value as in $a = ... creates scalar context. That means that the .. in print ($a = 'a' .. 'c') is an instance of the flip-flop operator, not the list creation operator.
The flip-flop operator is designed to be used when filtering lines in a file. e.g.
while (<$fh>) {
print if /first/ .. /last/;
}
would print all of the lines in a file starting with the one that contained first and ending with the one that contained last.
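The same filtering can be tried without a file. The following is a minimal sketch that applies the flip-flop to a list of strings; the helper name between_markers and the BEGIN/END markers are assumptions made for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: keeps the items from the first one matching /BEGIN/
# through the next one matching /END/, inclusive.
sub between_markers {
    my @kept;
    for my $line (@_) {
        # The statement modifier supplies scalar context, so ".." is the
        # flip-flop here, not the list-building range operator.
        push @kept, $line if $line =~ /BEGIN/ .. $line =~ /END/;
    }
    return @kept;
}

my @kept = between_markers(qw( a BEGIN b c END d ));
print "@kept\n";    # BEGIN b c END
```

The operator keeps its state between calls, so input that never reaches an END item would leave the flip-flop switched on for the next call.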
The flip-flop operator has some additional magic designed to make it easy to filter based on the line number.
while (<$fh>) {
print if 10 .. 20;
}
will print lines 10 through 20 of a file. It does this by employing special case behavior:
If either operand of scalar .. is a constant expression, that
operand is considered true if it is equal (==) to the current input
line number (the $. variable).
The strings a and c are both constant expressions so they trigger this special case. They aren't numbers, but they're used as numbers (== is a numeric comparison). Perl will convert scalar values between strings and numbers as needed. In this case, both values numify to 0. Therefore
print ($a = 'a' .. 'c'); # <-- this
print ($a = 0 .. 0); # <-- is effectively this
print ($a = ($. == 0) .. ($. == 0)); # <-- which is really this
We're getting close to the bottom of the mystery. On to the next bit. More from perlop:
The value returned is either the empty string for false, or a sequence
number (beginning with 1) for true. The sequence number is reset for
each range encountered. The final sequence number in a range has the
string "E0" appended to it
If you haven't read any lines from a file yet, $. will be undef which is 0 in a numerical context. 0 == 0 is true, so the .. returns a true value. It's the first true value, so it's 1. Because both the left-hand and right-hand sides are true the first true value is also the last true value and the E0 "this is the last value" suffix is appended to the return value. That is why print ($a = 'a' .. 'c') prints 1E0. If you were to set $. to a non-zero value the .. would be false and return the empty string.
print ($a = 'a' .. 'c'); # prints "1E0"
$. = 1;
print ($a = 'a' .. 'c'); # prints nothing
The very final piece of the puzzle (and I might be going too far now) is that the assignment operator returns a value. In this case that's the value assigned to $a, namely 1E0 (see note 1). This value is what is ultimately spit out by the print.
1: Technically, the assignment produces an lvalue for the item assigned to, i.e. it returns an lvalue for the variable $a, which then evaluates to 1E0.
It's a matter of list context vs. scalar context, as explained in perldoc perlop:
In scalar context, ".." returns a boolean value. The operator is
bistable, like a flip-flop, and emulates the line-range (comma)
operator of sed, awk, and various editors. Each ".." operator
maintains its own boolean state, even across calls to a subroutine
that contains it. It is false as long as its left operand is false.
Once the left operand is true, the range operator stays true until the
right operand is true, AFTER which the range operator becomes false
again. It doesn't become false till the next time the range operator
is evaluated. It can test the right operand and become false on the
same evaluation it became true (as in awk), but it still returns true
once. If you don't want it to test the right operand until the next
evaluation, as in sed, just use three dots ("...") instead of two. In
all other regards, "..." behaves just like ".." does.
[snip]
The final sequence number in a range has the string "E0" appended to
it, which doesn't affect its numeric value, but gives you something to
search for if you want to exclude the endpoint.
EDIT in response to DanD man's comment:
I find it a bit hard to digest too; frankly, I rarely use the .. operator, and even more rarely in scalar context. But for example, the expression 5..10 in an input loop implicitly compares to the current value of $. (that's part of the description that I didn't quote; see the manual). On lines 5 through 9, it yields a true value (experiment shows that it's a number, but the documentation doesn't say so). On line 10, it yields a number with "E0" appended to it -- i.e., it's in exponential notation, but with the same value it would have without the "E0".
The point of the "E0" tweak is to let you detect whether you're in a specified range and to flag the last line in the range for special treatment. Without the "E0", you wouldn't be able to treat the final match specially.
An example:
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
my $dotdot = 2..4;
print "On line $., 2..4 yields \"$dotdot\"\n";
}
Given 5 lines of input, this prints:
On line 1, 2..4 yields ""
On line 2, 2..4 yields "1"
On line 3, 2..4 yields "2"
On line 4, 2..4 yields "3E0"
On line 5, 2..4 yields ""
This lets you detect whether a line is inside or outside the range and when it's the last line in the range.
But scalar .. is probably more commonly used just for its boolean result, often in one-liners; for example, perl -ne 'print if 2..4' will print lines 2, 3, and 4 of whatever input you give it. It's deliberately similar to sed -n '2,4p'.
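To illustrate using the "E0" suffix for special treatment of the final line, here is a sketch that reads from an in-memory filehandle so it needs no input file. The helper name label_lines and the labels it returns are invented for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: labels each line of $text relative to lines 2..4.
sub label_lines {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @labels;
    while (my $line = <$fh>) {
        my $state = 2 .. 4;    # constant operands are compared with $.
        push @labels, $state eq ''     ? 'outside'
                    : $state =~ /E0\z/ ? 'last'
                    :                    'inside';
    }
    close $fh;
    return @labels;
}

print join(' ', label_lines("a\nb\nc\nd\ne\n")), "\n";
# outside inside inside last outside
```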
The answer can be found by consulting perldoc's perlop page:
Binary ".." is the range operator, which is really two different operators depending on the context. In list context, it returns a list of values counting (up by ones) from the left value to the right value...
This is the familiar usage, which is invoked by print "a" .. "c"; because arguments to functions are evaluated in list context. (If they were evaluated in scalar context, then print @list would print the size of @list, which is almost definitely not what people usually want.)
In scalar context, ".." returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each ".." operator maintains its own boolean state, even across calls to a subroutine that contains it. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again. It doesn't become false till the next time the range operator is evaluated. It can test the right operand and become false on the same evaluation it became true (as in awk), but it still returns true once. If you don't want it to test the right operand until the next evaluation, as in sed, just use three dots ("...") instead of two. In all other regards, "..." behaves just like ".." does.
It goes into further detail, but the bolded sections are the important parts to understanding how the operator works. Scalar context is forced by $a =, i.e. assignment to a scalar lvalue. If you did @a =, it would print what you expect.
Note that "a" .. "c" doesn't produce the string "abc"; it produces the list ("a", "b", "c"). You will get similar results if you use the list directly (though the value printed when the list is forced into scalar context will differ).

What pseudo-operators exist in Perl 5?

I am currently documenting all of Perl 5's operators (see the perlopref GitHub project) and I have decided to include Perl 5's pseudo-operators as well. To me, a pseudo-operator in Perl is anything that looks like an operator, but is really more than one operator or some other piece of syntax. I have documented the four I am familiar with already:
()= the countof operator
=()= the goatse/countof operator
~~ the scalar context operator
}{ the Eskimo-kiss operator
What other names exist for these pseudo-operators, and do you know of any pseudo-operators I have missed?
=head1 Pseudo-operators
There are idioms in Perl 5 that appear to be operators, but are really a
combination of several operators or pieces of syntax. These pseudo-operators
have the precedence of the constituent parts.
=head2 ()= X
=head3 Description
This pseudo-operator is the list assignment operator (aka the countof
operator). It is made up of two items C<()>, and C<=>. In scalar context
it returns the number of items in the list X. In list context it returns an
empty list. It is useful when you have something that returns a list and
you want to know the number of items in that list and don't care about the
list's contents. It is needed because the comma operator returns the last
item in the sequence rather than the number of items in the sequence when it
is placed in scalar context.
It works because the assignment operator returns the number of items
available to be assigned when its left hand side has list context. In the
following example there are five values in the list being assigned to the
list C<($x, $y, $z)>, so C<$count> is assigned C<5>.
my $count = my ($x, $y, $z) = qw/a b c d e/;
The empty list (the C<()> part of the pseudo-operator) triggers this
behavior.
=head3 Example
sub f { return qw/a b c d e/ }
my $count = ()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats = ()= $string =~ /cat/g; #$cats is now 3
print scalar( ()= f() ), "\n"; #prints "5\n"
=head3 See also
L</X = Y> and L</X =()= Y>
=head2 X =()= Y
This pseudo-operator is often called the goatse operator for reasons better
left unexamined; it is also called the list assignment or countof operator.
It is made up of three items C<=>, C<()>, and C<=>. When X is a scalar
variable, the number of items in the list Y is returned. If X is an array
or a hash it returns an empty list. It is useful when you have something
that returns a list and you want to know the number of items in that list
and don't care about the list's contents. It is needed because the comma
operator returns the last item in the sequence rather than the number of
items in the sequence when it is placed in scalar context.
It works because the assignment operator returns the number of items
available to be assigned when its left hand side has list context. In the
following example there are five values in the list being assigned to the
list C<($x, $y, $z)>, so C<$count> is assigned C<5>.
my $count = my ($x, $y, $z) = qw/a b c d e/;
The empty list (the C<()> part of the pseudo-operator) triggers this
behavior.
=head3 Example
sub f { return qw/a b c d e/ }
my $count =()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats =()= $string =~ /cat/g; #$cats is now 3
=head3 See also
L</=> and L</()=>
=head2 ~~X
=head3 Description
This pseudo-operator is named the scalar context operator. It is made up of
two bitwise negation operators. It provides scalar context to the
expression X. It works because the first bitwise negation operator provides
scalar context to X and performs a bitwise negation of the result; since the
result of two bitwise negations is the original item, the value of the
original expression is preserved.
With the addition of the Smart match operator, this pseudo-operator is even
more confusing. The C<scalar> function is much easier to understand and you
are encouraged to use it instead.
=head3 Example
my @a = qw/a b c d/;
print ~~@a, "\n"; #prints 4
=head3 See also
L</~X>, L</X ~~ Y>, and L<perlfunc/scalar>
=head2 X }{ Y
=head3 Description
This pseudo-operator is called the Eskimo-kiss operator because it looks
like two faces touching noses. It is made up of a closing brace and an
opening brace. It is used when using C<perl> as a command-line program with
the C<-n> or C<-p> options. It has the effect of running X inside of the
loop created by C<-n> or C<-p> and running Y at the end of the program. It
works because the closing brace closes the loop created by C<-n> or C<-p>
and the opening brace creates a new bare block that is closed by the loop's
original ending. You can see this behavior by using the L<B::Deparse>
module. Here is the command C<perl -ne 'print $_;'> deparsed:
LINE: while (defined($_ = <ARGV>)) {
print $_;
}
Notice how the original code was wrapped with the C<while> loop. Here is
the deparsing of C<perl -ne '$count++ if /foo/; }{ print "$count\n"'>:
LINE: while (defined($_ = <ARGV>)) {
++$count if /foo/;
}
{
print "$count\n";
}
Notice how the C<while> loop is closed by the closing brace we added and the
opening brace starts a new bare block that is closed by the closing brace
that was originally intended to close the C<while> loop.
=head3 Example
# count unique lines in the file FOO
perl -nle '$seen{$_}++ }{ print "$_ => $seen{$_}" for keys %seen' FOO
# sum all of the lines until the user types control-d
perl -nle '$sum += $_ }{ print $sum'
=head3 See also
L<perlrun> and L<perlsyn>
=cut
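As a small runnable sketch of the countof idiom described in the POD above (the helper name count_matches is hypothetical), the empty-list assignment can be wrapped in a subroutine:

```perl
use strict;
use warnings;

# Hypothetical helper: counts matches without keeping them.
sub count_matches {
    my ($string, $pattern) = @_;
    # List assignment evaluated in scalar context yields the number of
    # items on its right-hand side.
    my $count = () = $string =~ /$pattern/g;
    return $count;
}

print count_matches('cat cat dog cat', 'cat'), "\n";    # 3
```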
Nice project, here are a few:
scalar x!! $value # conditional scalar include operator
(list) x!! $value # conditional list include operator
'string' x/pattern/ # conditional include if pattern
"@{[ list ]}" # interpolate list expression operator
"${\scalar}" # interpolate scalar expression operator
!! $scalar # scalar -> boolean operator
+0 # cast to numeric operator
.'' # cast to string operator
{ ($value or next)->depends_on_value() } # early bail out operator
# aka using next/last/redo with bare blocks to avoid duplicate variable lookups
# might be a stretch to call this an operator though...
sub{\@_}->( list ) # list capture "operator", like [ list ] but with aliases
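Two of these idioms are sketched below in a runnable form. The helper name build_args, the '--verbose' flag and the file name input.txt are made up for the illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: builds an argument list, including '--verbose'
# only when the flag is true (the "conditional list include" idiom).
sub build_args {
    my ($verbose) = @_;
    return ('run', ('--verbose') x !!$verbose, 'input.txt');
}

my @args = build_args(1);
# "@{[ ... ]}" interpolates the result of an arbitrary list expression.
print "args: @args (count: @{[ scalar @args ]})\n";
# args: run --verbose input.txt (count: 3)
```

The list-repetition form only behaves this way when build_args is called in list context, as it is here.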
In Perl these are generally referred to as "secret operators".
A partial list of "secret operators" can be had here. The best and most complete list is probably in the possession of Philippe Bruhat aka BooK and his Secret Perl Operators talk, but I don't know where it's available. You might ask him. You can probably glean some more from Obfuscation, Golf and Secret Operators.
Don't forget the Flaming X-Wing =<>=~.
The Fun With Perl mailing list will prove useful for your research.
The "goes to" and "is approached by" operators:
$x = 10;
say $x while $x --> 4;
# prints 9 through 4
$x = 10;
say $x while 4 <-- $x;
# prints 9 through 5
They're not unique to Perl.
From this question, I discovered the %{{}} operator to cast a list as a hash. Useful in
contexts where a hash argument (and not a hash assignment) are required.
@list = (a,1,b,2);
print values @list; # arg 1 to values must be hash (not array dereference)
print values %{@list} # prints nothing
print values (%temp=@list) # arg 1 to values must be hash (not list assignment)
print values %{{@list}} # success: prints 12
If @list does not contain any duplicate keys (odd-elements), this operator also provides a way to access the odd or even elements of a list:
@even_elements = keys %{{@list}} # @list[0,2,4,...]
@odd_elements = values %{{@list}} # @list[1,3,5,...]
The Perl secret operators now have some reference (almost official, but they are "secret") documentation on CPAN: perlsecret
You have two "countof" (pseudo-)operators, and I don't really see the difference between them.
From the examples of "the countof operator":
my $count = ()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats = ()= $string =~ /cat/g; #$cats is now 3
From the examples of "the goatse/countof operator":
my $count =()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats =()= $string =~ /cat/g; #$cats is now 3
Both sets of examples are identical, modulo whitespace. What is your reasoning for considering them to be two distinct pseudo-operators?
How about the "Boolean one-or-zero" operator: 1&!!
For example:
my %result_of = (
" 1&!! '0 but true' " => 1&!! '0 but true',
" 1&!! '0' " => 1&!! '0',
" 1&!! 'text' " => 1&!! 'text',
" 1&!! 0 " => 1&!! 0,
" 1&!! 1 " => 1&!! 1,
" 1&!! undef " => 1&!! undef,
);
for my $expression ( sort keys %result_of){
print "$expression = " . $result_of{$expression} . "\n";
}
gives the following output:
1&!! '0 but true' = 1
1&!! '0' = 0
1&!! 'text' = 1
1&!! 0 = 0
1&!! 1 = 1
1&!! undef = 0
The << >> operator, for multi-line comments:
<<q==q>>;
This is a
multiline
comment
q

How can I tell if a set of parens in Perl code will act as grouping parens or form a list?

In perl, parentheses are used for overriding precedence (as in most programming languages) as well as for creating lists. How can I tell if a particular pair of parens will be treated as a grouping construct or a one-element list?
For example, I'm pretty sure this is a scalar and not a one-element list: (1 + 1)
But what about more complex expressions? Is there an easy way to tell?
Three key principles are useful here:
Context is king. The evaluation of your example (1 + 1) depends on the context.
$x = (1 + 1); # Scalar context. $x will equal 2. Parentheses do nothing here.
@y = (1 + 1); # List context. @y will contain one element: (2).
# Parens do nothing (see below), aside from following
# syntax conventions.
In a scalar context, there is no such thing as a list. To see this, try to assign what appears to be a list to a scalar variable. The way to think about this is to focus on the behavior of the comma operator: in scalar context it evaluates its left argument, throws that value away, then evaluates its right argument, and returns that value. In list context, the comma operator inserts both arguments into the list.
@arr = (12, 34, 56); # Right side returns a list.
$x = (12, 34, 56); # Right side returns 56. Also, we get warnings
# about 12 and 34 being used in void context.
$x = (@arr, 7); # Right side returns 7. And we get a warning
# about using an array in a void context.
Parentheses do not create lists. The comma operator creates the list (provided that we are in list context). When typing lists in Perl code, the parentheses are needed for precedence reasons -- not for list-creation reasons. A few examples:
The parentheses have no effect: we are evaluating an array in scalar
context, so the right side returns the array size.
$x = (@arr);
Parentheses are not needed to create a list with one element.
@arr = 33; # Works fine, with @arr equal to (33).
But parentheses are needed with multiple items -- for precedence reasons.
@arr = 12, 34, 56; # @arr equals (12). And we get warnings about using
# 34 and 56 in void context.
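One way to check these principles is to ask a subroutine which context it was called in. This is only a sketch, and the probe name context is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical probe: reports the context it was called in.
sub context { return wantarray ? 'list' : 'scalar' }

my $plain   = context();      # scalar assignment
my $grouped = (context());    # still scalar: the parentheses only group
my @array   = context();      # list context without any parentheses
my ($first) = context();      # parentheses on the left make it a list assignment

print "$plain $grouped $array[0] $first\n";    # scalar scalar list list
```

The context comes from what is on the left of the assignment, not from parentheses around the right-hand side.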
Context.
Parentheses don't have the role you think they have in creating a list.
Examples:
$x = 1 + 1; # $x is 2.
$x = (1 + 1); # $x is 2.
@x = 1 + 1; # @x is (2).
@x = (1 + 1); # @x is (2).
$x = (1 + 1, 1 + 2); # $x is 3.
@x = (1 + 1, 1 + 2); # @x is (2, 3).
Roughly speaking, in list context the comma operator separates items of a list; in scalar context the comma operator is the C "serial comma", which evaluates its left and right sides, and returns the value of the right side. In scalar context, parentheses group expressions to override the order of operations, and in list context, parentheses do... the exact same thing, really. The reason they're relevant in assigning to arrays is this:
# Comma has precedence below assignment.
# @a is assigned (1), 2 and 3 are discarded.
@a = 1, 2, 3;
# @a is (1, 2, 3).
@a = (1, 2, 3);
As for your question "is it a scalar or a one-element list", it's just not a meaningful question to ask of an expression in isolation, because of context. In list context, everything is a list; in scalar context, nothing is.
Recommended reading: perlop, perldata, Programming Perl.