Is ~~ a short-circuit operator? - perl

From the Smart matching in detail section in perlsyn:
The smart match operator
short-circuits whenever possible.
Does ~~ have anything in common with short circuit operators (&&, ||, etc.) ?

The meaning of short-circuiting here is that evaluation will stop as soon as the boolean outcome is established.
perl -E "#x=qw/a b c d/; for (qw/b w/) { say qq($_ - ), $_ ~~ #x ? q(ja) : q(nein) }"
For the input b, Perl won't look at the elements following b in #x. The grep built-in, on the other hand, to which the document you quote makes reference, will process the entire list even though all that's needed might be a boolean.
perl -E "#x=qw/a b c/; for (qw/b d/) { say qq($_ - ), scalar grep $_, #x ? q(ja) : q(nein) }"

Yes, in the sense that when one of the arguments is an Array or a Hash, ~~ will only check elements until it can be sure of the result.
For instance, in sub x { ... }; my %h; ...; %h ~~ \&x, the smart match returns true only if x returns true for all the keys of %h; if one call returns false, the match can return false at once without checking the rest of the keys. This is similar to the && operator.
On the other hand, in /foo/ ~~ %h, the smart match can return true if it finds just one key that matches the regular expression; this is similar to ||.

Related

Perl dereferencing in non-strict mode

In Perl, if I have:
no strict;
#ARY = (58, 90);
To operate on an element of the array, say it, the 2nd one, I would write (possibly as part of a larger expression):
$ARY[1] # The most common way found in Perldoc's idioms.
Though, for some reason these also work:
#ARY[1]
#{ARY[1]}
Resulting all in the same object:
print (\$ARY[1]);
print (\#ARY[1]);
print (\#{ARY[1]});
Output:
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
SCALAR(0x9dbcdc)
What is the syntax rules that enable this sort of constructs? How far could one devise reliable program code with each of these constructs, or with a mix of all of them either? How interchangeable are these expressions? (always speaking in a non-strict context).
On a concern of justifying how I come into this question, I agree "use strict" as a better practice, still I'm interested at some knowledge on build-up non-strict expressions.
In an attemp to find myself some help to this uneasiness, I came to:
The notion on "no strict;" of not complaining about undeclared
variables and quirk syntax.
The prefix dereference having higher precedence than subindex [] (perldsc § "Caveat on precedence").
The clarification on when to use # instead of $ (perldata § "Slices").
The lack of "[]" (array subscript / slice) description among the Perl's operators (perlop), which lead me to think it is not an
operator... (yet it has to be something else. But, what?).
For what I learned, none of these hints, put together, make me better understand my issue.
Thanks in advance.
Quotation from perlfaq4:
What is the difference between $array[1] and #array[1]?
The difference is the sigil, that special character in front of the array name. The $ sigil means "exactly one item", while the # sigil means "zero or more items". The $ gets you a single scalar, while the # gets you a list.
Please see: What is the difference between $array[1] and #array[1]?
#ARY[1] is indeed a slice, in fact a slice of only one member. The difference is it creates a list context:
#ar1[0] = qw( a b c ); # List context.
$ar2[0] = qw( a b c ); # Scalar context, the last value is returned.
print "<#ar1> <#ar2>\n";
Output:
<a> <c>
Besides using strict, turn warnings on, too. You'll get the following warning:
Scalar value #ar1[0] better written as $ar1[0]
In perlop, you can read that "Perl's prefix dereferencing operators are typed: $, #, %, and &." The standard syntax is SIGIL { ... }, but in the simple cases, the curly braces can be omitted.
See Can you use string as a HASH ref while "strict refs" in use? for some fun with no strict refs and its emulation under strict.
Extending choroba's answer, to check a particular context, you can use wantarray
sub context { return wantarray ? "LIST" : "SCALAR" }
print $ary1[0] = context(), "\n";
print #ary1[0] = context(), "\n";
Outputs:
SCALAR
LIST
Nothing you did requires no strict; other than to hide your error of doing
#ARY = (58, 90);
when you should have done
my #ARY = (58, 90);
The following returns a single element of the array. Since EXPR is to return a single index, it is evaluated in scalar context.
$array[EXPR]
e.g.
my #array = qw( a b c d );
my $index = 2;
my $ele = $array[$index]; # my $ele = 'c';
The following returns the elements identified by LIST. Since LIST is to return 0 or more elements, it must be evaluated in list context.
#array[LIST]
e.g.
my #array = qw( a b c d );
my #indexes ( 1, 2 );
my #slice = $array[#indexes]; # my #slice = qw( b c );
\( $ARY[$index] ) # Returns a ref to the element returned by $ARY[$index]
\( #ARY[#indexes] ) # Returns refs to each element returned by #ARY[#indexes]
${foo} # Weird way of writing $foo. Useful in literals, e.g. "${foo}bar"
#{foo} # Weird way of writing #foo. Useful in literals, e.g. "#{foo}bar"
${foo}[...] # Weird way of writing $foo[...].
Most people don't even know you can use these outside of string literals.

Why do I get the last value in a list in scalar context in perl?

I had assumed that, in perl, $x=(2,3,4,5) and ($x)=(2,3,4,5) would give me the same result, but was surprised at what happened in my tests. I am wondering why this behavior is the way it is and why wantarray behaves differently. Here are my tests and the results:
>perl -e '$x=(1,2,3,5);print("$x\n")'
5
>perl -e '($x)=(1,2,3,5);print("$x\n")'
1
>perl -e '$x=(wantarray ? (1,2,3,5) : 4);print("$x\n")'
4
>perl -e '($x)=(wantarray ? (1,2,3,5) : 4);print("$x\n")'
4
Is this behavior consistent/reliable across all platforms?
Whoops. wantarray is for context of subroutine calls...
>perl -e '$x=test();sub test{return(1,2,3,5)};print("$x\n")'
5
>perl -e '($x)=test();sub test{return(1,2,3,5)};print("$x\n")'
1
>perl -e '$x=test();sub test{return(wantarray ? (1,2,3,5) : 4)};print("$x\n")'
4
>perl -e '($x)=test();sub test{return(wantarray ? (1,2,3,5) : 4)};print("$x\n")'
1
So I guess it is consistent, but why does the list return the last value in scalar context?
So I guess it is consistent, but why does the list return the last value in scalar context?
Because it's useful.
my $x = f() || ( warn('!!!'), 'default' );
Well, more useful than any other alternative, at least. It's also consistent with its stronger cousin, ;.
sub f { x(), y(), z() };
is the same as
sub f { x(); y(); z() };
Each operator determines decides what it returns in both scalar and list context.[1]
There are operators that only even return a single scalar.
say time; # Returns the number of seconds since epoch
say scalar( time ); # Ditto.
But operators that normally returns more than one scalar and those that return a variable number of scalars cannot possible return that in scalar context, so they will return something else. It's up to each one to decide what that is.
say scalar( 4,5,6 ); # Last item evaluated in scalar context.
say scalar( #a ); # Number of elements in #a.
say scalar( grep f(), g() ); # Number of matching items.
say scalar( localtime ); # Formatted timestamp.
The list operator (e.g. x,y,z) returns what the last item of the list (z) returns when evaluated in scalar context. For example,
my $x = (f(),g(),#a);
is a weird way of writing
f();
g();
my $x = #a;
Notes
Same goes for subs, though it's common to write subs that are useless to call in scalar context.
You have two types of assigment.
The first one is in scalar context and that is how the comma operators behave in this context (it evaluates both arguments but discard the one to the left and return the value to the right).
The second one is list context (when you enclose a variable in parentheses you are asking for list context). In list context the comma operator just separates the arguments and returns the whole list.
From the Documentation
Comma Operator
Binary "," is the comma operator. In scalar context it evaluates its
left argument, throws that value away, then evaluates its right
argument and returns that value. This is just like C's comma operator.
In list context, it's just the list argument separator, and inserts
both its arguments into the list. These arguments are also evaluated
from left to right.
http://perldoc.perl.org/perlop.html#Comma-Operator

Will there be any difference between 42, 42.0, "42.0", "42"

During the process of testing my Perl code using "Smart Match(~~)" I have faced this problem. Will there be any difference between 42, 42.0, "42.0", "42"
$var1 = "42";
$var2 = "42.0";
$a = $var1 ~~ $var2;
I am getting $a as 0; which means $var1 and $var2 are not equal.
Please Explain.
The smart match operator will "usually do what you want". Please read this as "not always".
42 ~~ 42.0 returns true.
42 ~~ "42.0" returns true as well: the string is compared to a number, and therefore seen as a number. Ditto for "42" ~~ 42.0.
"42" ~~ "42.0" returns false: both arguments are strings, and these strings do not compare as "equal", although their numerical meaning would. You wouldn't want Perl to view "two" ~~ "two-point-oh" as true.
A string can be forced to it's numeric interpretation by adding zero:
0+"42" ~~ "42.0" returns true again, as the first string is forced to the number 42, and the second follows suit.
The perldoc perlsyn or perldoc perlop page defines how smart matching works:
Object Any invokes ~~ overloading on $object, or falls back:
Any Num numeric equality $a == $b
Num numish[4] numeric equality $a == $b
undef Any undefined !defined($b)
Any Any string equality $a eq $b
You can see that string equality is the default.
You may want to reconsider using smart match for now. The current implementation is considered by the Perl community to be a mistake, among others because of your question and amon's answer.
Work is ongoing for a "saner" and simpler, but incompatible version of smart match which might be in the next major release of Perl (5.18). It will simply outlaw your example: $a ~~ $b will not be allowed when $b is a simple scalar value (like 42 or "42").
If you have too much time on your hand, you could peruse the Perl5 porters archives, for instance this thread.
Contextual typing.
$Var1 and $Var2 were last used as strings (in the assignment) so now they behave as strings. ~~ will do string comparison if both arguments are strings.
If you use one of them as a number - you don't even need to assign to it - then it will behave as a number and ~~ will use a numeric comparison.
This script will output NO YES:
my $v1 = "42";
my $v2 = "42.0";
print (($v1 ~~ $v2) ? 'YES ' : 'NO ');
$v1 + 0;
print (($v1 ~~ $v2) ? 'YES ' : 'NO ');
Reference for ~~ operator

How to test if a value exist in a hash?

Let's say I have this
#!/usr/bin/perl
%x = ('a' => 1, 'b' => 2, 'c' => 3);
and I would like to know if the value 2 is a hash value in %x.
How is that done?
Fundamentally, a hash is a data structure optimized for solving the converse question, knowing whether the key 2 is present. But it's hard to judge without knowing, so let's assume that won't change.
Possibilities presented here will depend on:
how often you need to do it
how dynamic the hash is
One-time op
grep $_==2, values %x (also spelled grep {$_==1} values %x) will return a list of as many 2s as are present in the hash, or, in scalar context, the number of matches. Evaluated as a boolean in a condition, it yields just what you want.
grep works on versions of Perl as old as I can remember.
use List::Util qw(first); first {$_==2} values %x returns only the first match, undef if none. That makes it faster, as it will short-circuit (stop examining elements) as soon as it succeeds. This isn't a problem for 2, but take care that the returned element doesn't necessarily evaluate to boolean true. Use defined in those cases.
List::Util is a part of the Perl core since 5.8.
use List::MoreUtils qw(any); any {$_==2} values %x returns exactly the information you requested as a boolean, and exhibits the short-circuiting behavior.
List::MoreUtils is available from CPAN.
2 ~~ [values %x] returns exactly the information you requested as a boolean, and exhibits the short-circuiting behavior.
Smart matching is available in Perl since 5.10.
Repeated op, static hash
Construct a hash that maps values to keys, and use that one as a natural hash to test key existence.
my %r = reverse %x;
if ( exists $r{2} ) { ... }
Repeated op, dynamic hash
Use a reverse lookup as above. You'll need to keep it up to date, which is left as an exercise to the reader/editor. (hint: value collisions are tricky)
Shorter answer using smart match (Perl version 5.10 or later):
print 2 ~~ [values %x];
my %reverse = reverse %x;
if( defined( $reverse{2} ) ) {
print "2 is a value in the hash!\n";
}
If you want to find out the keys for which the value is 2:
foreach my $key ( keys %x ) {
print "2 is the value for $key\n" if $x{$key} == 2;
}
Everyone's answer so far was not performance-driven. While the smart-match (~~) solution short circuits (e.g. stops searching when something is found), the grep ones do not.
Therefore, here's a solution which may have better performance for Perl before 5.10 that doesn't have smart match operator:
use List::MoreUtils qw(any);
if (any { $_ == 2 } values %x) {
print "Found!\n";
}
Please note that this is just a specific example of searching in a list (values %x) in this case and as such, if you care about performance, the standard performance analysis of searching in a list apply as discussed in detail in this answer
grep and values
my %x = ('a' => 1, 'b' => 2, 'c' => 3);
if (grep { $_ == 2 } values %x ) {
print "2 is in hash\n";
}
else {
print "2 is not in hash\n";
}
See also: perldoc -q hash
Where $count would be the result:
my $count = grep { $_ == 2 } values %x;
This will not only show you if it's a value in the hash, but how many times it occurs as a value. Alternatively you can do it like this as well:
my $count = grep {/2/} values %x;

What pseudo-operators exist in Perl 5?

I am currently documenting all of Perl 5's operators (see the perlopref GitHub project) and I have decided to include Perl 5's pseudo-operators as well. To me, a pseudo-operator in Perl is anything that looks like an operator, but is really more than one operator or a some other piece of syntax. I have documented the four I am familiar with already:
()= the countof operator
=()= the goatse/countof operator
~~ the scalar context operator
}{ the Eskimo-kiss operator
What other names exist for these pseudo-operators, and do you know of any pseudo-operators I have missed?
=head1 Pseudo-operators
There are idioms in Perl 5 that appear to be operators, but are really a
combination of several operators or pieces of syntax. These pseudo-operators
have the precedence of the constituent parts.
=head2 ()= X
=head3 Description
This pseudo-operator is the list assignment operator (aka the countof
operator). It is made up of two items C<()>, and C<=>. In scalar context
it returns the number of items in the list X. In list context it returns an
empty list. It is useful when you have something that returns a list and
you want to know the number of items in that list and don't care about the
list's contents. It is needed because the comma operator returns the last
item in the sequence rather than the number of items in the sequence when it
is placed in scalar context.
It works because the assignment operator returns the number of items
available to be assigned when its left hand side has list context. In the
following example there are five values in the list being assigned to the
list C<($x, $y, $z)>, so C<$count> is assigned C<5>.
my $count = my ($x, $y, $z) = qw/a b c d e/;
The empty list (the C<()> part of the pseudo-operator) triggers this
behavior.
=head3 Example
sub f { return qw/a b c d e/ }
my $count = ()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats = ()= $string =~ /cat/g; #$cats is now 3
print scalar( ()= f() ), "\n"; #prints "5\n"
=head3 See also
L</X = Y> and L</X =()= Y>
=head2 X =()= Y
This pseudo-operator is often called the goatse operator for reasons better
left unexamined; it is also called the list assignment or countof operator.
It is made up of three items C<=>, C<()>, and C<=>. When X is a scalar
variable, the number of items in the list Y is returned. If X is an array
or a hash it it returns an empty list. It is useful when you have something
that returns a list and you want to know the number of items in that list
and don't care about the list's contents. It is needed because the comma
operator returns the last item in the sequence rather than the number of
items in the sequence when it is placed in scalar context.
It works because the assignment operator returns the number of items
available to be assigned when its left hand side has list context. In the
following example there are five values in the list being assigned to the
list C<($x, $y, $z)>, so C<$count> is assigned C<5>.
my $count = my ($x, $y, $z) = qw/a b c d e/;
The empty list (the C<()> part of the pseudo-operator) triggers this
behavior.
=head3 Example
sub f { return qw/a b c d e/ }
my $count =()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats =()= $string =~ /cat/g; #$cats is now 3
=head3 See also
L</=> and L</()=>
=head2 ~~X
=head3 Description
This pseudo-operator is named the scalar context operator. It is made up of
two bitwise negation operators. It provides scalar context to the
expression X. It works because the first bitwise negation operator provides
scalar context to X and performs a bitwise negation of the result; since the
result of two bitwise negations is the original item, the value of the
original expression is preserved.
With the addition of the Smart match operator, this pseudo-operator is even
more confusing. The C<scalar> function is much easier to understand and you
are encouraged to use it instead.
=head3 Example
my #a = qw/a b c d/;
print ~~#a, "\n"; #prints 4
=head3 See also
L</~X>, L</X ~~ Y>, and L<perlfunc/scalar>
=head2 X }{ Y
=head3 Description
This pseudo-operator is called the Eskimo-kiss operator because it looks
like two faces touching noses. It is made up of an closing brace and an
opening brace. It is used when using C<perl> as a command-line program with
the C<-n> or C<-p> options. It has the effect of running X inside of the
loop created by C<-n> or C<-p> and running Y at the end of the program. It
works because the closing brace closes the loop created by C<-n> or C<-p>
and the opening brace creates a new bare block that is closed by the loop's
original ending. You can see this behavior by using the L<B::Deparse>
module. Here is the command C<perl -ne 'print $_;'> deparsed:
LINE: while (defined($_ = <ARGV>)) {
print $_;
}
Notice how the original code was wrapped with the C<while> loop. Here is
the deparsing of C<perl -ne '$count++ if /foo/; }{ print "$count\n"'>:
LINE: while (defined($_ = <ARGV>)) {
++$count if /foo/;
}
{
print "$count\n";
}
Notice how the C<while> loop is closed by the closing brace we added and the
opening brace starts a new bare block that is closed by the closing brace
that was originally intended to close the C<while> loop.
=head3 Example
# count unique lines in the file FOO
perl -nle '$seen{$_}++ }{ print "$_ => $seen{$_}" for keys %seen' FOO
# sum all of the lines until the user types control-d
perl -nle '$sum += $_ }{ print $sum'
=head3 See also
L<perlrun> and L<perlsyn>
=cut
Nice project, here are a few:
scalar x!! $value # conditional scalar include operator
(list) x!! $value # conditional list include operator
'string' x/pattern/ # conditional include if pattern
"#{[ list ]}" # interpolate list expression operator
"${\scalar}" # interpolate scalar expression operator
!! $scalar # scalar -> boolean operator
+0 # cast to numeric operator
.'' # cast to string operator
{ ($value or next)->depends_on_value() } # early bail out operator
# aka using next/last/redo with bare blocks to avoid duplicate variable lookups
# might be a stretch to call this an operator though...
sub{\#_}->( list ) # list capture "operator", like [ list ] but with aliases
In Perl these are generally referred to as "secret operators".
A partial list of "secret operators" can be had here. The best and most complete list is probably in possession of Philippe Bruhad aka BooK and his Secret Perl Operators talk but I don't know where its available. You might ask him. You can probably glean some more from Obfuscation, Golf and Secret Operators.
Don't forget the Flaming X-Wing =<>=~.
The Fun With Perl mailing list will prove useful for your research.
The "goes to" and "is approached by" operators:
$x = 10;
say $x while $x --> 4;
# prints 9 through 4
$x = 10;
say $x while 4 <-- $x;
# prints 9 through 5
They're not unique to Perl.
From this question, I discovered the %{{}} operator to cast a list as a hash. Useful in
contexts where a hash argument (and not a hash assignment) are required.
#list = (a,1,b,2);
print values #list; # arg 1 to values must be hash (not array dereference)
print values %{#list} # prints nothing
print values (%temp=#list) # arg 1 to values must be hash (not list assignment)
print values %{{#list}} # success: prints 12
If #list does not contain any duplicate keys (odd-elements), this operator also provides a way to access the odd or even elements of a list:
#even_elements = keys %{{#list}} # #list[0,2,4,...]
#odd_elements = values %{{#list}} # #list[1,3,5,...]
The Perl secret operators now have some reference (almost official, but they are "secret") documentation on CPAN: perlsecret
You have two "countof" (pseudo-)operators, and I don't really see the difference between them.
From the examples of "the countof operator":
my $count = ()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats = ()= $string =~ /cat/g; #$cats is now 3
From the examples of "the goatse/countof operator":
my $count =()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats =()= $string =~ /cat/g; #$cats is now 3
Both sets of examples are identical, modulo whitespace. What is your reasoning for considering them to be two distinct pseudo-operators?
How about the "Boolean one-or-zero" operator: 1&!!
For example:
my %result_of = (
" 1&!! '0 but true' " => 1&!! '0 but true',
" 1&!! '0' " => 1&!! '0',
" 1&!! 'text' " => 1&!! 'text',
" 1&!! 0 " => 1&!! 0,
" 1&!! 1 " => 1&!! 1,
" 1&!! undef " => 1&!! undef,
);
for my $expression ( sort keys %result_of){
print "$expression = " . $result_of{$expression} . "\n";
}
gives the following output:
1&!! '0 but true' = 1
1&!! '0' = 0
1&!! 'text' = 1
1&!! 0 = 0
1&!! 1 = 1
1&!! undef = 0
The << >> operator, for multi-line comments:
<<q==q>>;
This is a
multiline
comment
q