In Perl, is there any difference between direct glob aliasing and aliasing via the stash? - perl

In Perl, is there ever any difference between the following two constructs:
*main::foo = *main::bar
and
$main::{foo} = $main::{bar}
They appear to have the same function (aliasing all of the slots in *main::foo to those defined in *main::bar), but I am just wondering if this equivalency always holds.

Maybe not the kind of difference you were looking for, but there are two big differences between *main::foo and $main::{foo}; the former looks up the glob in the stash at compile time, creating it if necessary, while the latter looks for the glob in the stash at run time, and won't create it.
This may make a difference to anything else poking about in the stash, and it certainly can affect whether you get a used only once warning.

The following script:
#!/usr/bin/env perl
#mytest.pl
no warnings;
$bar = "this";
#bar = qw/ 1 2 3 4 5 /;
%bar = qw/ key value /;
open bar, '<', 'mytest.pl' or die $!;
sub bar {
return "Sub defined as 'bar()'";
}
$main::{foo} = $main::{bar};
print "The scalar \$foo holds $foo\n";
print "The array \#foo holds #foo\n";
print "The hash \%foo holds ", %foo, "\n";
my $line = <foo>;
print "The filehandle 'foo' is reads ", $line;
print 'The function foo() replies "', foo(), "\"\n";
Outputs:
The scalar $foo holds this
The array #foo holds 1 2 3 4 5
The hash %foo holds keyvalue
The filehandle 'foo' is reads #!/usr/bin/env perl
The function foo() replies "Sub defined as 'bar()'"
So if *main::foo = *main::bar; doesn't do the same thing as $main::{foo} = $main::{bar};, I'm at a loss as to how to detect a practical difference. ;) However, from a syntax perspective, there may be situations where it's easier to use one method versus another. ...the usual warnings about mucking around in the symbol table always apply.

Accessing the stash as $A::{foo} = $obj allows you to place anything on the symbol table while *A::foo = $obj places $obj on the expected slot of the typeglob according to $obj type.
For example:
DB<1> $ST::{foo} = [1,2,3]
DB<2> *ST::bar = [1,2,3]
DB<3> x #ST::foo
Cannot convert a reference to ARRAY to typeglob at (eval 7)[/usr/local/perl/blead-debug/lib/5.15.0/perl5db.pl:646] line 2.
at (eval 7)[/usr/local/perl/blead-debug/lib/5.15.0/perl5db.pl:646] line 2
eval '($#, $!, $^E, $,, $/, $\\, $^W) = #saved;package main; $^D = $^D | $DB::db_stop;
#ST::foo;
;' called at /usr/local/perl/blead-debug/lib/5.15.0/perl5db.pl line 646
DB::eval called at /usr/local/perl/blead-debug/lib/5.15.0/perl5db.pl line 3442
DB::DB called at -e line 1
DB<4> x #ST::bar
0 1
1 2
2 3
DB<5> x \%ST::
0 HASH(0x1d55810)
'bar' => *ST::bar
'foo' => ARRAY(0x1923e30)
0 1
1 2
2 3

See also "Scalars vs globs (*{} should not return FAKE globs)"
https://github.com/perl/perl5/issues/10625

Related

Why is a list of undef not a read-only or constant value in Perl?

Consider the following programs in Perl.
use strict;
use warnings;
my #foo = qw(a b c);
undef = shift #foo;
print scalar #foo;
This will die with an error message:
Modification of a read-only value attempted at ...
Using a constat will give a different error:
1 = shift #foo;
Can't modify constant item in scalar assignment at ...
Execution of ... aborted due to compilation errors.
The same if we do this:
(1) = shift #foo;
All of those make sense to me. But putting undef in a list will work.
(undef) = shift #foo;
Now it prints 2.
Of course this is common practice if you have a bunch of return values and only want specific ones, like here:
my (undef, undef ,$mode, undef ,$uid, $gid, undef ,$size) = stat($filename);
The 9th line of code example in perldoc -f undef shows this, butthere is no explaination.
My question is, how is this handled internally by Perl?
Internally, Perl has different operators for scalar assignment and list assignment, even though both of them are spelled = in the source code. And the list assignment operator has the special case for undef that you're asking about.

Where does a Perl subroutine get values missing from the actual parameters?

I came across the following Perl subroutine get_billable_pages while chasing a bug. It takes 12 arguments.
sub get_billable_pages {
my ($dbc,
$bill_pages, $page_count, $cover_page_count,
$domain_det_page, $bill_cover_page, $virtual_page_billing,
$job, $bsj, $xqn,
$direction, $attempt,
) = #_;
my $billable_pages = 0;
if ($virtual_page_billing) {
my #row;
### Below is testing on the existence of the 11th and 12th parameters ###
if ( length($direction) && length($attempt) ) {
$dbc->xdb_execute("
SELECT convert(int, value)
FROM job_attribute_detail_atmp_tbl
WHERE job = $job
AND billing_sub_job = $bsj
AND xqn = $xqn
AND direction = '$direction'
AND attempt = $attempt
AND attribute = 1
");
}
else {
$dbc->xdb_execute("
SELECT convert(int, value)
FROM job_attribute_detail_tbl
WHERE job = $job
AND billing_sub_job = $bsj
AND xqn = $xqn
AND attribute = 1
");
}
$cnt = 0;
...;
But is sometimes called with only 10 arguments
$tmp_det = get_billable_pages(
$dbc2,
$row[6], $row[8], $row[7],
$domain_det_page, $bill_cover_page, $virtual_page_billing,
$job1, $bsj1, $row[3],
);
The function does a check on the 11th and 12th arguments.
What are the 11th and 12th arguments when the function is passed only 10 arguments?
Is it a bug to call the function with only 10 arguments because the 11th and 12th arguments end up being random values?
I am thinking this may be the source of the bug because the 12th argument had a funky value when the program failed.
I did not see another definition of the function which takes only 10 arguments.
The values are copied out of the parameter array #_ to the list of scalar variables.
If the array is shorter than the list, then the excess variables are set to undef. If the array is longer than the list, then excess array elements are ignored.
Note that the original array #_ is unmodified by the assignment. No values are created or lost, so it remains the definitive source of the actual parameters passed when the subroutine is called.
ikegami suggested that I should provide some Perl code to demonstrate the assignment of arrays to lists of scalars. Here is that Perl code, based mostly on his edit
use strict;
use warnings;
use Data::Dumper;
my $x = 44; # Make sure that we
my $y = 55; # know if they change
my #params = (8); # Make a dummy parameter array with only one value
($x, $y) = #params; # Copy as if this is were a subroutine
print Dumper $x, $y; # Let's see our parameters
print Dumper \#params; # And how the parameter array looks
output
$VAR1 = 8;
$VAR2 = undef;
$VAR1 = [ 8 ];
So both $x and $y are modified, but if there are insufficient values in the array then undef is used instead. It is as if the source array was extended indefinitely with undef elements.
Now let's look at the logic of the Perl code. undef evaluates as false for the purposes of conditional tests, but you apply the length operator like this
if ( length($direction) && length($attempt) ) { ... }
If you have use warnings in place as you should, Perl would normally produce a Use of uninitialized value warning. However length is unusual in that, if you ask for the length of an undef value (and you are running version 12 or later of Perl 5) it will just return undef instead of warning you.
Regarding "I did not see another definition of the function which takes only 10 arguments", Perl doesn't have function templates like C++ and Java - it is up to the code in the subroutine to look at what it has been passed and behave accordingly.
No, it's not a bug. The remaining arguments are "undef" and you can check for this situation
sub foo {
my ($x, $y) = #_;
print " x is undef\n" unless defined $x;
print " y is undef\n" unless defined $y;
}
foo(1);
prints
y is undef

How to create an anonymous array ([]) with 'empty slots'?

I can create an array with 'empty slots' in it:
$ perl -wde 1
...
DB<1> $x[2] = 0
DB<2> x \#x
0 ARRAY(0x103d5768)
0 empty slot
1 empty slot
2 0
or
DB<3> $#y = 4
DB<4> x \#y
0 ARRAY(0x103d5718)
0 empty slot
1 empty slot
2 empty slot
3 empty slot
4 empty slot
Please note: this is not the same as assigning undef.
But how do I specify that for an anonymous array using [ and ]?
This will not work:
DB<5> x [,,0]
syntax error at (eval 27)[/usr/local/lib/perl5/5.10.0/perl5db.pl:638] line 2, near "[,"
And this fails too, since I only get the assigned value:
DB<6> x []->[2] = 0
0 0
Bonus question: how can I check for an 'empty array slot' in my Perl script?
Background: In my test scripts I would like to be able to compare array contents precisely. For example I want to distinguish between 'not assigned' and 'assigned with an undef value'.
Thanks for any insights.
use feature qw/ say /;
use strict;
use warnings;
my $aref;
$#{$aref} = 4;
$aref->[2] = undef;
$aref->[3] = '';
foreach my $idx ( 0 .. $#{$aref} ) {
say "Testing $idx.";
say "\t$idx exists." if exists $aref->[$idx];
say "\t$idx defined." if defined $aref->[$idx];
}
OUTPUT:
Testing 0.
Testing 1.
Testing 2.
2 exists.
Testing 3.
3 exists.
3 defined.
Testing 4.
We pre-allocated five spots in the anonymous array, #{$aref}. The top index is 4. We are able to find what the top index is the same way we created it; by testing the value of $#{$aref}. We can test for existence. We know everything between 0 and 4 was created. But Perl only reports "exists" for array elements that have specifically had something assigned to them (even if it's undef). Therefore, $aref->[2] is reported to exist, but isn't defined. Just for fun, we assigned '' to $aref->[3] to see a test report defined once. But the short story is that even though the array is pre-extended, we can still test for the difference between an element being initialized with undef, and an element being undef through array pre-extension, by using 'exists'.
I can't say that's documented behavior of exists. So there's no guarantee it wouldn't change someday. But it works on 5.8, 5.10, 5.12, and 5.14.
So, looking for a simple way to find which elements were initialized, which were defined, and which were not, here's an example:
use feature qw/ say /;
use strict;
use warnings;
my $aref;
$#{$aref} = 4;
$aref->[2] = undef;
$aref->[3] = '';
my #initialized = grep { exists $aref->[$_] } 0 .. $#{$aref};
my #defined = grep { defined $aref->[$_] } 0 .. $#{$aref};
my #uninitialized = grep { not exists $aref->[$_] } 0 .. $#{$aref};
my #init_undef = grep { exists $aref->[$_] and not defined $aref->[$_] } 0 .. $#{$aref};
say "Top index is $#{$aref}.";
say "These elements are initialized: #initialized.";
say "These elements are not initialized: #uninitialized.";
say "These elements were initialized with 'undef': #init_undef.";
say "These elements are defined: #defined."
That should do:
$a=[];
$#$a=4;
Update (replying to #hexcoder): In one statement:
$#{$a=[]}=4
And in one statement that returns the array:
$a = (map(($_,$#$_=4),[]))[0]
Though, not that I recommend using that construction...
Background: In my test scripts I would like to be able to compare array contents precisely. For example I want to distinguish between 'not assigned' and 'assigned with an undef value'.
You can check if the index is past the end. Beyond that, there's not much you can do.
$x = [];
undef $x->[9999];
print scalar #$x;
prints 10000. The undef $x->[9999] is equivalent to $x->[9999] = undef; Because none of the elements 0 to 9998 exist, perl will magically assign all of the intervening elements to undef.
You can only do that kind of thing from XS code (see for example Devel::Peek). Some, but not all, of it is exposed by the *::Util packages. (I've been working on a debugging/tracing package, so I know more about this than anyone should need to....)

What pseudo-operators exist in Perl 5?

I am currently documenting all of Perl 5's operators (see the perlopref GitHub project) and I have decided to include Perl 5's pseudo-operators as well. To me, a pseudo-operator in Perl is anything that looks like an operator, but is really more than one operator or a some other piece of syntax. I have documented the four I am familiar with already:
()= the countof operator
=()= the goatse/countof operator
~~ the scalar context operator
}{ the Eskimo-kiss operator
What other names exist for these pseudo-operators, and do you know of any pseudo-operators I have missed?
=head1 Pseudo-operators
There are idioms in Perl 5 that appear to be operators, but are really a
combination of several operators or pieces of syntax. These pseudo-operators
have the precedence of the constituent parts.
=head2 ()= X
=head3 Description
This pseudo-operator is the list assignment operator (aka the countof
operator). It is made up of two items C<()>, and C<=>. In scalar context
it returns the number of items in the list X. In list context it returns an
empty list. It is useful when you have something that returns a list and
you want to know the number of items in that list and don't care about the
list's contents. It is needed because the comma operator returns the last
item in the sequence rather than the number of items in the sequence when it
is placed in scalar context.
It works because the assignment operator returns the number of items
available to be assigned when its left hand side has list context. In the
following example there are five values in the list being assigned to the
list C<($x, $y, $z)>, so C<$count> is assigned C<5>.
my $count = my ($x, $y, $z) = qw/a b c d e/;
The empty list (the C<()> part of the pseudo-operator) triggers this
behavior.
=head3 Example
sub f { return qw/a b c d e/ }
my $count = ()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats = ()= $string =~ /cat/g; #$cats is now 3
print scalar( ()= f() ), "\n"; #prints "5\n"
=head3 See also
L</X = Y> and L</X =()= Y>
=head2 X =()= Y
This pseudo-operator is often called the goatse operator for reasons better
left unexamined; it is also called the list assignment or countof operator.
It is made up of three items C<=>, C<()>, and C<=>. When X is a scalar
variable, the number of items in the list Y is returned. If X is an array
or a hash it it returns an empty list. It is useful when you have something
that returns a list and you want to know the number of items in that list
and don't care about the list's contents. It is needed because the comma
operator returns the last item in the sequence rather than the number of
items in the sequence when it is placed in scalar context.
It works because the assignment operator returns the number of items
available to be assigned when its left hand side has list context. In the
following example there are five values in the list being assigned to the
list C<($x, $y, $z)>, so C<$count> is assigned C<5>.
my $count = my ($x, $y, $z) = qw/a b c d e/;
The empty list (the C<()> part of the pseudo-operator) triggers this
behavior.
=head3 Example
sub f { return qw/a b c d e/ }
my $count =()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats =()= $string =~ /cat/g; #$cats is now 3
=head3 See also
L</=> and L</()=>
=head2 ~~X
=head3 Description
This pseudo-operator is named the scalar context operator. It is made up of
two bitwise negation operators. It provides scalar context to the
expression X. It works because the first bitwise negation operator provides
scalar context to X and performs a bitwise negation of the result; since the
result of two bitwise negations is the original item, the value of the
original expression is preserved.
With the addition of the Smart match operator, this pseudo-operator is even
more confusing. The C<scalar> function is much easier to understand and you
are encouraged to use it instead.
=head3 Example
my #a = qw/a b c d/;
print ~~#a, "\n"; #prints 4
=head3 See also
L</~X>, L</X ~~ Y>, and L<perlfunc/scalar>
=head2 X }{ Y
=head3 Description
This pseudo-operator is called the Eskimo-kiss operator because it looks
like two faces touching noses. It is made up of an closing brace and an
opening brace. It is used when using C<perl> as a command-line program with
the C<-n> or C<-p> options. It has the effect of running X inside of the
loop created by C<-n> or C<-p> and running Y at the end of the program. It
works because the closing brace closes the loop created by C<-n> or C<-p>
and the opening brace creates a new bare block that is closed by the loop's
original ending. You can see this behavior by using the L<B::Deparse>
module. Here is the command C<perl -ne 'print $_;'> deparsed:
LINE: while (defined($_ = <ARGV>)) {
print $_;
}
Notice how the original code was wrapped with the C<while> loop. Here is
the deparsing of C<perl -ne '$count++ if /foo/; }{ print "$count\n"'>:
LINE: while (defined($_ = <ARGV>)) {
++$count if /foo/;
}
{
print "$count\n";
}
Notice how the C<while> loop is closed by the closing brace we added and the
opening brace starts a new bare block that is closed by the closing brace
that was originally intended to close the C<while> loop.
=head3 Example
# count unique lines in the file FOO
perl -nle '$seen{$_}++ }{ print "$_ => $seen{$_}" for keys %seen' FOO
# sum all of the lines until the user types control-d
perl -nle '$sum += $_ }{ print $sum'
=head3 See also
L<perlrun> and L<perlsyn>
=cut
Nice project, here are a few:
scalar x!! $value # conditional scalar include operator
(list) x!! $value # conditional list include operator
'string' x/pattern/ # conditional include if pattern
"#{[ list ]}" # interpolate list expression operator
"${\scalar}" # interpolate scalar expression operator
!! $scalar # scalar -> boolean operator
+0 # cast to numeric operator
.'' # cast to string operator
{ ($value or next)->depends_on_value() } # early bail out operator
# aka using next/last/redo with bare blocks to avoid duplicate variable lookups
# might be a stretch to call this an operator though...
sub{\#_}->( list ) # list capture "operator", like [ list ] but with aliases
In Perl these are generally referred to as "secret operators".
A partial list of "secret operators" can be had here. The best and most complete list is probably in possession of Philippe Bruhad aka BooK and his Secret Perl Operators talk but I don't know where its available. You might ask him. You can probably glean some more from Obfuscation, Golf and Secret Operators.
Don't forget the Flaming X-Wing =<>=~.
The Fun With Perl mailing list will prove useful for your research.
The "goes to" and "is approached by" operators:
$x = 10;
say $x while $x --> 4;
# prints 9 through 4
$x = 10;
say $x while 4 <-- $x;
# prints 9 through 5
They're not unique to Perl.
From this question, I discovered the %{{}} operator to cast a list as a hash. Useful in
contexts where a hash argument (and not a hash assignment) are required.
#list = (a,1,b,2);
print values #list; # arg 1 to values must be hash (not array dereference)
print values %{#list} # prints nothing
print values (%temp=#list) # arg 1 to values must be hash (not list assignment)
print values %{{#list}} # success: prints 12
If #list does not contain any duplicate keys (odd-elements), this operator also provides a way to access the odd or even elements of a list:
#even_elements = keys %{{#list}} # #list[0,2,4,...]
#odd_elements = values %{{#list}} # #list[1,3,5,...]
The Perl secret operators now have some reference (almost official, but they are "secret") documentation on CPAN: perlsecret
You have two "countof" (pseudo-)operators, and I don't really see the difference between them.
From the examples of "the countof operator":
my $count = ()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats = ()= $string =~ /cat/g; #$cats is now 3
From the examples of "the goatse/countof operator":
my $count =()= f(); #$count is now 5
my $string = "cat cat dog cat";
my $cats =()= $string =~ /cat/g; #$cats is now 3
Both sets of examples are identical, modulo whitespace. What is your reasoning for considering them to be two distinct pseudo-operators?
How about the "Boolean one-or-zero" operator: 1&!!
For example:
my %result_of = (
" 1&!! '0 but true' " => 1&!! '0 but true',
" 1&!! '0' " => 1&!! '0',
" 1&!! 'text' " => 1&!! 'text',
" 1&!! 0 " => 1&!! 0,
" 1&!! 1 " => 1&!! 1,
" 1&!! undef " => 1&!! undef,
);
for my $expression ( sort keys %result_of){
print "$expression = " . $result_of{$expression} . "\n";
}
gives the following output:
1&!! '0 but true' = 1
1&!! '0' = 0
1&!! 'text' = 1
1&!! 0 = 0
1&!! 1 = 1
1&!! undef = 0
The << >> operator, for multi-line comments:
<<q==q>>;
This is a
multiline
comment
q

Is Perl's flip-flop operator bugged? It has global state, how can I reset it?

I'm dismayed. OK, so this was probably the most fun Perl bug I've ever found. Even today I'm learning new stuff about Perl. Essentially, the flip-flop operator .. which returns false until the left-hand-side returns true, and then true until the right-hand-side returns false keep global state (or that is what I assume.)
Can I reset it (perhaps this would be a good addition to Perl 4-esque hardly ever used reset())? Or, is there no way to use this operator safely?
I also don't see this (the global context bit) documented anywhere in perldoc perlop is this a mistake?
Code
use feature ':5.10';
use strict;
use warnings;
sub search {
my $arr = shift;
grep { !( /start/ .. /never_exist/ ) } #$arr;
}
my #foo = qw/foo bar start baz end quz quz/;
my #bar = qw/foo bar start baz end quz quz/;
say 'first shot - foo';
say for search \#foo;
say 'second shot - bar';
say for search \#bar;
Spoiler
$ perl test.pl
first shot
foo
bar
second shot
Can someone clarify what the issue with the documentation is? It clearly indicates:
Each ".." operator maintains its own boolean state.
There is some vagueness there about what "Each" means, but I don't think the documentation would be well served by a complex explanation.
Note that Perl's other iterators (each or scalar context glob) can lead to the same problems. Because the state for each is bound to a particular hash, not a particular bit of code,each can be reset by calling (even in void context) keys on the hash. But for glob or .., there is no reset mechanism available except by calling the iterator until it is reset. A sample glob bug:
sub globme {
print "globbing $_[0]:\n";
print "got: ".glob("{$_[0]}")."\n" for 1..2;
}
globme("a,b,c");
globme("d,e,f");
__END__
globbing a,b,c:
got: a
got: b
globbing d,e,f:
got: c
Use of uninitialized value in concatenation (.) or string at - line 3.
got:
For the overly curious, here are some examples where the same .. in the source is a different .. operator:
Separate closures:
sub make_closure {
my $x;
return sub {
$x if 0; # Look, ma, I'm a closure
scalar( $^O..!$^O ); # handy values of true..false that don't trigger ..'s implicit comparison to $.
}
}
print make_closure()->(), make_closure()->();
__END__
11
Comment out the $x if 0 line to see that non-closures have a single .. operation shared by all "copies", with the output being 12.
Threads:
use threads;
sub coderef { sub { scalar( $^O..!$^O ) } }
coderef()->();
print threads->create( coderef() )->join(), threads->create( coderef() )->join();
__END__
22
Threaded code starts with whatever the state of the .. had been before thread creation, but changes to its state in the thread are isolated from affecting anything else.
Recursion:
sub flopme {
my $recurse = $_[0];
flopme($recurse-1) if $recurse;
print " "x$recurse, scalar( $^O..!$^O ), "\n";
flopme($recurse-1) if $recurse;
}
flopme(2)
__END__
1
1
2
1
3
2
4
Each depth of recursion is a separate .. operator.
The trick is not use the same flip-flop so you have no state to worry about. Just make a generator function to give you a new subroutine with a new flip-flop that you only use once:
sub make_search {
my( $left, $right ) = #_;
sub {
grep { !( /\Q$left\E/ .. /\Q$right\E/ ) } #{$_[0]};
}
}
my $search_sub1 = make_search( 'start', 'never_existed' );
my $search_sub2 = make_search( 'start', 'never_existed' );
my #foo = qw/foo bar start baz end quz quz/;
my $count1 = $search_sub1->( \#foo );
my $count2 = $search_sub2->( \#foo );
print "count1 $count1 and count2 $count2\n";
I also write about this in Make exclusive flip-flop operators.
The "range operator" .. is documented in perlop under "Range Operators". Looking through the doucmentation, it appears that there isn't any way to reset the state of the .. operator. Each instance of the .. operator keeps its own state, which means there isn't any way to refer to the state of any particular .. operator.
It looks like it's designed for very small scripts such as:
if (101 .. 200) { print; }
The documentation states that this is short for
if ($. == 101 .. $. == 200) { print; }
Somehow the use of $. is implicit there (toolic points out in a comment that that's documented too). The idea seems to be that this loop runs once (until $. == 200) in a given instance of the Perl interpreter, and therefore you don't need to worry about resetting the state of the .. flip-flop.
This operator doesn't seem too useful in a more general reusable context, for the reasons you've identified.
A workaround/hack/cheat for your particular case is to append the end value to your array:
sub search {
my $arr = shift;
grep { !( /start/ .. /never_exist/ ) } #$arr, 'never_exist';
}
This will guarantee that the RHS of range operator will eventually be true.
Of course, this is in no way a general solution.
In my opinion, this behavior is not clearly documented. If you can construct a clear explanation, you could apply a patch to perlop.pod via perlbug.
I found this problem, and as far as I know there's no way to fix it. The upshot is - don't use the .. operator in functions, unless you are sure you are leaving it in the false state when you leave the function, otherwise the function may return different output for the same input (or exhibit different behaviour for the same input).
Each use of the .. operator maintains its own state. Like Alex Brown said, you need to leave it in the false state when you leave the function. Maybe you could do something like:
sub search {
my $arr = shift;
grep { !( /start/ || $_ eq "my magic reset string" ..
/never_exist/ || $_ eq "my magic reset string" ) }
(#$arr, "my magic reset string");
}