I recently hit a bug where the use warnings FATAL => ... pragma treats warnings that are muted elsewhere as a reason to die. Consider the following sample:
use strict;
# In one file:
no warnings;
my %hash;
Foo->bar( my $temp = $hash{ +undef } ); # this lives
Foo->bar( $hash{ +undef } ); # this dies
# Elsewhere
package Foo;
use warnings FATAL => qw(uninitialized);
sub bar {
my ($self, $param) = @_; # perfectly safe
$param = "(undef)"
unless defined $param; # even safer
print "Param: $param\n";
}
This can of course be fixed wholesale by using the same warnings policy throughout the project, or case by case by ruling out undefs at each call site (see the # this lives line).
My question is whether there is an acceptable solution for package Foo that doesn't require changing any of its callers, and whether this is really a bug in Perl itself.
It's not a bug. You are experiencing a side effect of a feature that prevents needless autovivification of hash elements passed to subs.
Perl passes by reference. That means that changes to the arguments within the function will change the parameters on the outside.
$ perl -E'
sub f { $_[0] = "xyz"; }
f($x);
say $x;
'
xyz
This applies to hash elements too.
$ perl -E'
sub f { $_[0] = "xyz"; }
my %h;
f($h{x});
say $h{x};
'
xyz
The sub doesn't know anything about the hash, so the hash element must be created before the sub is entered for there to be something to which to assign.
...or does it? It would be generally undesirable for f($h{x}) to always create $h{x} if the element doesn't exist. As such, Perl postpones doing the hash lookup until $_[0] is accessed, at which point it's known whether the element needs to be vivified or not. This is why the warning is coming from within the sub.
Specifically, Perl doesn't pass $h{x} to the sub when you call f($h{x}). Instead, it passes a magical scalar that contains both a reference to %h and the key value (x). This postpones doing the hash lookup until $_[0] is accessed, where it's known whether $_[0] is used somewhere assignable or not.
If $_[0] is used in a manner in which it doesn't change (i.e. if it's used as an rvalue), the hash element is looked up without vivifying it.
If $_[0] is used in a manner in which it can change (i.e. if it's used as an lvalue), the hash element is vivified and returned.
$ perl -E'
sub f { my $x = $_[0]; } # $_[0] returns undef without vivifying $h{x}
sub g { $_[0] = "xyz"; } # $_[0] vivifies and returns $h{x}
my %h;
f($h{x});
say 0+keys(%h);
g($h{x});
say 0+keys(%h);
'
0
1
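As for a fix confined to package Foo, one possibility is to relax the uninitialized category only around the argument copy. This is a minimal sketch, assuming (as described above) that the deferred lookup and its warning happen when @_ is read inside the sub. The nested block and the added return value are illustrative and not part of the original code.

```perl
use strict;

package Foo;
use warnings FATAL => qw(uninitialized);

sub bar {
    my ($self, $param);
    {
        # Assumption: the deferred hash lookup (and its warning) fires
        # when @_ is read, so relax the category only for that read.
        no warnings 'uninitialized';
        ($self, $param) = @_;
    }
    $param = "(undef)" unless defined $param;
    print "Param: $param\n";
    return $param;    # added for illustration only
}

package main;
no warnings;

my %hash;
Foo->bar( $hash{ +undef } );    # expected to survive under the assumption above
```

The rest of Foo keeps its fatal warnings; only the one statement that touches the caller's arguments is exempt.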
I am still unclear about why the by-ref portion reports an undefined value for %Q and an uninitialized $_. I have been looking through perlreftut and still can't see what I have done wrong. Passing the hash as a flat array works without issue.
Doing it by ref with testRef(\%mkPara) passes a scalar hash reference to the subroutine, right? So, does my %Q = %{$_} not turn it back into a hash?
use strict;
use diagnostics;
use warnings;
my %mkPara = ('aa'=>2,'bb'=>3,'cc'=>4,'dd'=>5);
sub testFlat
{
my %P = @_;
print "$P{'aa'}, $P{'bb'}, ", $P{'cc'}*$P{'dd'}, "\n";
}
sub testRef
{
my %Q = %{$_}; #can't use an undefined value as HASH reference
#print $_->{'aa'}, "\n";#Use of uninitialized value
print $Q{'aa'},"\n";
}
#testFlat(%mkPara);
testRef(\%mkPara);
When you use arguments in a function call (\%mkPara in your case), you can access them through the @_ array inside the function.
Here, you pass a single argument to the function: \%mkPara, which you can then reach as the first element of @_, that is, $_[0].
$_ is the default variable for some builtin functions/operators (print, m//, s///, chomp and a lot more), and is usually seen in while or for loops. But in your code, you have no reason to use it: you never set it to anything, so it is still undef, hence the error "Can't use an undefined value as a HASH reference".
So your function should actually be :
sub testRef
{
my %Q = %{$_[0]}; # instead of %{$_}
print $_[0]->{'aa'}, "\n"; # instead of $_->{'aa'}
print $Q{'aa'},"\n";
}
If needed, you can find more about functions on perlsub.
However, as @Ikegami pointed out in the comments, my %Q = %{$_[0]}; creates a copy of the hash you sent to the function, which in most cases (including this one, where you just print one key of the hash) is wasteful, as you could just use the hashref directly (as you do with $_[0]->{'aa'}).
You can use hash references like this (roughly the same example as in @Zaid's answer):
sub testRef
{
my ( $Q ) = @_;
print $Q->{aa} ;
print $_, "\n" for keys %$Q;
}
testRef(\%mkPara);
There are quite a lot of resources about references online, for instance perlreftut that you were already looking at.
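To make the copy-versus-reference difference concrete, here is a small sketch. The sub names bump_copy and bump_ref are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

my %mkPara = ( aa => 2, bb => 3 );

# Copies the hash: changes stay local to the sub.
sub bump_copy {
    my %copy = %{ $_[0] };
    $copy{aa}++;
    return $copy{aa};
}

# Works through the reference: changes reach the caller's hash.
sub bump_ref {
    my ($ref) = @_;
    $ref->{aa}++;
    return $ref->{aa};
}

bump_copy( \%mkPara );
print "after bump_copy: $mkPara{aa}\n";    # still 2
bump_ref( \%mkPara );
print "after bump_ref:  $mkPara{aa}\n";    # now 3
```

Besides avoiding the copy, the reference form is the one to use whenever the sub is meant to update the caller's data.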
This can seem a bit tricky at first, but the reason is that $_ is not the same as @_.
From perlvar:
$_ is the implicit/"default" variable that does not have to be spelled out explicitly for certain functions (e.g. split )
Within a subroutine the array @_ contains the parameters passed to that subroutine
So the reason why
my %Q = %{$_};
says you can't use an undefined value as hash reference is because $_ is not defined.
What you really need here is
my %Q = %{$_[0]};
because that is the first element of @_, which is what was passed to testRef in the first place.
In practice I tend to find myself doing things a little differently because it lends itself to flexibility for future modifications:
sub testRef {
my ( $Q ) = @_;
print $_, "\n" for keys %$Q; # just as an example
}
I've read that Perl uses call-by-reference when executing subroutines. I wrote a simple piece of code to check this property, but it behaves as if Perl were call-by-value:
$x=50;
$y=70;
sub interchange {
($x1, $y1) = @_;
$z1 = $x1;
$x1 = $y1;
$y1 = $z1;
print "x1:$x1 y1:$y1\n";
}
&interchange ($x, $y);
print "x:$x y:$y\n";
This produces the following output:
$ perl example.pl
x1:70 y1:50
x:50 y:70
If arguments were treated in a call-by-reference way, shouldn't x be equal to x1 and y equal to y1?
Perl is always call by reference. Your statement ($x1, $y1) = @_ copies the original argument values into new variables; it is @_ itself that holds aliases to the original parameters.
From perlsub manpage:
Any arguments passed in show up in the array @_ . Therefore, if you called a function with two arguments, those would be stored in $_[0] and $_[1] . The array @_ is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element $_[0] is updated, the corresponding argument is updated (or an error occurs if it is not updatable).
To modify the values outside of the sub, you would have to modify the elements of @_.
The following sub interchange does modify the values:
sub interchange {
($x1, $y1) = @_; # this line copies the values to 2 new variables
$z1 = $x1;
$x1 = $y1;
$y1 = $z1;
$_[0] = $x1; # this line added to change value outside sub
$_[1] = $y1; # this line added to change value outside sub
print "x1:$x1 y1:$y1\n";
}
This gives the output:
x1:70 y1:50
x:70 y:50
More info here: http://www.cs.cf.ac.uk/Dave/PERL/node51.html
But, to quote the article:
You can see that the function was able to affect the @array variable in the main program. Generally, this is considered bad programming practice because it does not isolate what the function does from the rest of the program.
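For comparison, the same swap can be written without any intermediate copies by assigning straight through the aliases. This is only a sketch; the name swap_in_place is made up for the example.

```perl
use strict;
use warnings;

# Swap the caller's two variables by assigning through the @_ aliases.
sub swap_in_place {
    @_[ 0, 1 ] = @_[ 1, 0 ];
    return;
}

my ( $x, $y ) = ( 50, 70 );
swap_in_place( $x, $y );
print "x:$x y:$y\n";    # x:70 y:50
```

The caveat quoted above still applies: a sub that silently rewrites its arguments is easy to misuse, so it is worth making that behaviour obvious in the name.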
I'm just starting with Perl as well, and I believe you're misunderstanding just what you're passing to the subroutine. When you pass $x and $y, you are passing the scalar values that $x and $y are set to. You need to explicitly pass a reference, which also happens to be a scalar (scalars being the only thing you are ever allowed to pass to subroutines). I understand why you would think things are call-by-reference, since for arrays and hashes you need to pass references.
This code should do what you're looking for:
#!/usr/bin/perl
$x=50;
$y=70;
sub interchange {
($x1, $y1) = @_;
$z1 = $$x1; # Dereferencing $x1
$$x1 = $$y1; # Dereferencing $x1 and $y1
$$y1 = $z1; # Dereferencing $y1
print "x1:$$x1 y1:$$y1\n";
}
&interchange (\$x, \$y); # Passing references to $x and $y, not their values
print "x:$x y:$y\n";
I pass in references to $x and $y using \$x and \$y. Then, I use $$x1 and $$y1 to dereference them within the subroutine.
I wrote a script in perl which does multi-threading, I then tried to convert it over into an object. However, I can't seem to figure out how to lock on a member variable. The closest I've come to is:
#!/usr/bin/perl
package Y;
use warnings;
use strict;
use threads;
use threads::shared;
sub new
{
my $class = shift;
my $val :shared = 0;
my $self =
{
x => \$val
};
bless $self, $class;
is_shared($self->{x}) or die "nope";
return $self;
}
package MAIN;
use warnings;
use strict;
use threads;
use threads::shared;
use Data::Dumper;
my $x = new Y();
{
lock($x->{x});
}
print Dumper('0'); # prints: $VAR1 = '0';
print Dumper($x->{x}); # prints: $VAR1 = \'0';
print "yes\n" if ($x->{x} == 0); # prints nothing
#print "yes\n" if ($$x->{x} == 0); # dies with msg: Not a SCALAR reference
my $tmp = $x->{x}; # this works. Must be an order of precedence thing.
print "yes\n" if ($$tmp == 0); # prints: yes
#++$$x->{x}; # dies with msg: Not a SCALAR reference
++$$tmp;
print Dumper($x->{x}); # prints: $VAR1 = \'1';
This allows me to put a lock on the member var x, but it means I'd be needing 2 member variables as the actual member var isn't really capable of being manipulated by assigning to it, incrementing it, etc. I can't even test against it.
EDIT:
I'm thinking that I should rename this question "How do you dereference a member variable in perl?" as the problem seems to boil down to that. Using $$x->{x} is invalid syntax and you can't force precedence rules with parentheses, i.e. $($x->{x}) doesn't work. Using a temporary works but it is a nuisance.
I don't get what you are trying to do with threads and locking, but there are some simple errors in the way you use references.
$x->{x}
is a reference to a scalar, so the expressions
$x->{x} == 0
++$$x->{x}
both look suspect. $$x->{x} is parsed as ${$x}->{x} (dereference $x, then treat the result as a hash reference and look up the value with key x). I think you mean to say
${$x->{x}} == 0
++${$x->{x}}
where ${$x->{x}} means to treat $x as a hash reference, to look up the value for key x in that hash, and then to dereference that value.
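A minimal sketch of the braces form in action, leaving the threading aside. The $obj structure here is a hypothetical stand-in for the object in the question.

```perl
use strict;
use warnings;

my $val = 0;
my $obj = { x => \$val };    # hypothetical object: key x holds a scalar ref

print "zero\n" if ${ $obj->{x} } == 0;    # braces pick what gets dereferenced
++${ $obj->{x} };                         # increments $val through the ref
print "val is now $val\n";                # val is now 1
```

No temporary variable is needed: the block inside ${ ... } can be any expression that yields a scalar reference.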
I just looked in disbelief at this sequence:
my $line;
$rc = getline($line); # read next line and store in $line
I had understood all along that Perl arguments were passed by value, so whenever I've needed to pass in a large structure, or pass in a variable to be updated, I've passed a ref.
Reading the fine print in perldoc, however, I've learned that @_ is composed of aliases to the variables mentioned in the argument list. After reading the next bit of data, getline() returns it with $_[0] = $data;, which stores $data directly into $line.
I do like this - it's like passing by reference in C++. However, I haven't found a way to assign a more meaningful name to $_[0]. Is there any?
You can, but it's not very pretty:
use strict;
use warnings;
sub inc {
# manipulate the local symbol table
# to refer to the alias by $name
our $name; local *name = \$_[0];
# $name is an alias to first argument
$name++;
}
my $x = 1;
inc($x);
print $x; # 2
The easiest way is probably just to use a loop, since loops alias their arguments to a name; i.e.
sub my_sub {
for my $arg ( $_[0] ) {
# code here sees $arg as an alias for $_[0]
}
}
A version of @Steve's code that allows for multiple distinct arguments:
sub my_sub {
SUB:
for my $thisarg ( $_[0] ) {
for my $thatarg ($_[1]) {
# code here sees $thisarg and $thatarg as aliases
last SUB;
}
}
}
Of course this brings multilevel nesting and its own readability issues, so use it only when absolutely necessary.
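On newer Perls there is another option that avoids both the glob trick and the nested loops: the refaliasing feature. This is a sketch assuming Perl 5.22 or later, where the feature exists but is still marked experimental. The sub name inc_both is invented for the example.

```perl
use strict;
use warnings;
use feature 'refaliasing';                  # assumes Perl 5.22 or newer
no warnings 'experimental::refaliasing';    # the feature is experimental

sub inc_both {
    \my $first  = \$_[0];    # $first is now another name for the caller's variable
    \my $second = \$_[1];
    $first++;
    $second++;
    return;
}

my ( $p, $q ) = ( 1, 10 );
inc_both( $p, $q );
print "$p $q\n";    # 2 11
```

Because the feature is experimental, its behaviour may change between releases, so the loop-based aliasing remains the more portable choice.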
With warnings enabled, perl usually prints Use of uninitialized value $foo if $foo is used in an expression and hasn't been assigned a value, but in some cases it's OK, and the variable is treated as false, 0, or '' without a warning.
What are the cases where an uninitialized/undefined variable can be used without a warning?
Summary
Boolean tests
Incrementing or decrementing an undefined value
Appending to an undefined value
Autovivification
Other mutators
Boolean tests
According to the perlsyn documentation,
The number 0, the strings '0' and '', the empty list (), and undef are all false in a boolean context. All other values are true.
Because the undefined value is false, the following program
#! /usr/bin/perl
use warnings;
my $var;
print "A\n" if $var;
$var && print "B\n";
$var and print "C\n";
print "D\n" if !$var;
print "E\n" if not $var;
$var or print "F\n";
$var || print "G\n";
outputs D through G with no warnings.
Incrementing or decrementing an undefined value
There's no need to explicitly initialize a scalar to zero if your code will increment or decrement it at least once:
#! /usr/bin/perl
use warnings;
my $i;
++$i while "aaba" =~ /a/g;
print $i, "\n";
The code above outputs 3 with no warnings.
Appending to an undefined value
Similar to the implicit zero, there's no need to explicitly initialize a scalar to the empty string if you'll append to it at least once:
#! /usr/bin/perl
use warnings;
use strict;
my $str;
for (<*>) {
$str .= substr $_, 0, 1;
}
print $str, "\n";
Autovivification
One example is "autovivification." From the Wikipedia article:
Autovivification is a distinguishing feature of the Perl programming language involving the dynamic creation of data structures. Autovivification is the automatic creation of a variable reference when an undefined value is dereferenced. In other words, Perl autovivification allows a programmer to refer to a structured variable, and arbitrary sub-elements of that structured variable, without expressly declaring the existence of the variable and its complete structure beforehand.
For example:
#! /usr/bin/perl
use warnings;
my %foo;
++$foo{bar}{baz}{quux};
use Data::Dumper;
$Data::Dumper::Indent = 1;
print Dumper \%foo;
Even though we don't explicitly initialize the intermediate keys, Perl takes care of the scaffolding:
$VAR1 = {
'bar' => {
'baz' => {
'quux' => '1'
}
}
};
Without autovivification, the code would require more boilerplate:
my %foo;
$foo{bar} = {};
$foo{bar}{baz} = {};
++$foo{bar}{baz}{quux}; # finally!
Don't confuse autovivification with the undefined values it can produce. For example with
#! /usr/bin/perl
use warnings;
my %foo;
print $foo{bar}{baz}{quux}, "\n";
use Data::Dumper;
$Data::Dumper::Indent = 1;
print Dumper \%foo;
we get
Use of uninitialized value in print at ./prog.pl line 6.
$VAR1 = {
'bar' => {
'baz' => {}
}
};
Notice that the intermediate keys autovivified.
Other examples of autovivification:
reference to array
my $a;
push @$a => "foo";
reference to scalar
my $s;
++$$s;
reference to hash
my $h;
$h->{foo} = "bar";
Sadly, Perl does not (yet!) autovivify the following:
my $code;
$code->("Do what I need please!");
Other mutators
In an answer to a similar question, ysth reports
Certain operators deliberately omit the "uninitialized" warning for your convenience because they are commonly used in situations where a 0 or "" default value for the left or only operand makes sense.
These are: ++ and -- (either pre- or post-), +=, -=, .=, |=, ^=, &&=, ||=.
Being "defined-or," //= happily mutates an undefined value without warning.
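One way to check these claims on your own Perl is to capture warnings with a handler and count them. A small sketch follows; the variable names are arbitrary and only a few of the listed operators are exercised.

```perl
use strict;
use warnings;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };    # collect instead of printing

my ( $count, $total, $text, $default, $plain );
$count++;                # no warning: undef treated as 0
$total   += 5;           # no warning
$text    .= "abc";       # no warning: undef treated as ''
$default //= "unset";    # no warning: defined-or assignment

my $sum = $plain + 1;    # warns: plain addition is not exempt

print scalar(@caught), " warning(s)\n";    # 1 warning(s)
print $caught[0];
```

Only the plain addition is reported; the mutating operators quietly start from 0 or the empty string.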
So far the cases I've found are:
autovivification (gbacon's answer)
boolean context, like if $foo or $foo || $bar
with ++ or --
left side of +=, -=, or .=
Are there others?
Always fix warnings even the pesky annoying ones.
Undefined warnings can be turned off. You can do that by creating a new scope for the operation. See perldoc perllexwarn for more info. This method works across all versions of perl.
{
no warnings 'uninitialized';
my $bar;
my $foo = "foo" . $bar; # no "uninitialized" warning inside this block
}
For a lot of the binary operators, you can use the new Perl 5.10 stuff, ~~ and //. See perldoc perlop for more info.
use warnings;
my $foo = undef;
my $bar = $foo // ''; ## same as $bar = defined $foo ? $foo : ''
There is also the //= variant, which sets the variable if it is undefined:
$foo //= '';
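The difference from the older || idiom shows up with values that are defined but false. A short sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $zero = 0;
my $with_or         = $zero || 'default';    # 0 is false, so the default wins
my $with_defined_or = $zero // 'default';    # 0 is defined, so it is kept

print "||: $with_or\n";            # ||: default
print "//: $with_defined_or\n";    # //: 0
```

So // is the safer choice for defaults whenever 0 or the empty string is a legitimate value.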
The smart match (~~) operator is kind of cool and permits smart comparisons. Check it out in perldoc perlsyn:
use feature 'say'; # say is not enabled by default
use warnings;
my $foo = "string";
say $foo eq undef; # triggers warnings
say $foo ~~ undef; # no undef warnings
The real answer should be: why would you want to turn on that warning? undef is a perfectly good value for a variable (as anyone who's ever worked with a database can tell you), and it often makes sense to differentiate between true (something happened), false (nothing happened) and undef (an error occurred).
Rather than saying
use strict;
use warnings;
say
use common::sense;
and you'll get all the benefits of warnings, but with the annoying ones like undefined variables turned off.
common::sense is available from the CPAN.