Perl: passing hash by ref using rule1 - perl

I am still unclear about why the by-ref portion reports an undefined value for %Q and an uninitialized $_. I have been looking through perlreftut and am still unable to see what I have done wrong. Passing the hash as a flat array has no issue.
Doing it by ref with testRef(\%mkPara) passes a scalar hash reference to the subroutine, right? So, does my %Q = %{$_} not turn it back into a hash?
use strict;
use diagnostics;
use warnings;
my %mkPara = ('aa'=>2,'bb'=>3,'cc'=>4,'dd'=>5);
sub testFlat
{
my %P = @_;
print "$P{'aa'}, $P{'bb'}, ", $P{'cc'}*$P{'dd'}, "\n";
}
sub testRef
{
my %Q = %{$_}; #can't use an undefined value as HASH reference
#print $_->{'aa'}, "\n";#Use of uninitialized value
print $Q{'aa'},"\n";
}
#testFlat(%mkPara);
testRef(\%mkPara);

When you use arguments in a function call (\%mkPara in your case), you can access them through the @_ array inside the function.
Here, you pass a single argument to the function: \%mkPara, which you can then access as the first element of @_, that is $_[0].
$_ is the default variable for some builtin functions/operators (print, m//, s///, chomp and a lot more). It is usually seen in while or for loops. But in your code you have no reason to use it: you never set it to anything, so it is still undef, hence the error "Can't use an undefined value as a HASH reference".
So your function should actually be :
sub testRef
{
my %Q = %{$_[0]}; # instead of %{$_}
print $_[0]->{'aa'}, "\n"; # instead of $_->{'aa'}
print $Q{'aa'},"\n";
}
If needed, you can find more about functions on perlsub.
However, as @Ikegami pointed out in the comments, using my %Q = %{$_[0]}; creates a copy of the hash you sent to the function, which in most cases (including this one, where you just print one value of the hash) is wasteful, since you could just use the hashref directly (as you do with $_[0]->{'aa'}).
You can use hash references like this (roughly the same example as in @Zaid's answer):
sub testRef
{
my ( $Q ) = @_;
print $Q->{aa} ;
print $_, "\n" for keys %$Q;
}
testRef(\%mkPara);
There are quite a lot of resources about references online, for instance perlreftut that you were already looking at.
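To make the copy-versus-reference point concrete, here is a minimal sketch. The subs bump_copy and bump_ref are hypothetical names invented for this illustration; they are not from the question.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for this sketch only.
sub bump_copy {
    my %copy = %{ $_[0] };   # dereference into a new hash: a shallow copy
    $copy{aa}++;             # changes the copy only
    return $copy{aa};
}

sub bump_ref {
    my ($ref) = @_;          # keep the reference itself
    $ref->{aa}++;            # changes the caller's hash
    return $ref->{aa};
}

my %mkPara = (aa => 2, bb => 3);
bump_copy(\%mkPara);
print "after bump_copy: $mkPara{aa}\n";   # still 2
bump_ref(\%mkPara);
print "after bump_ref: $mkPara{aa}\n";    # now 3
```

Working through the reference avoids the copy, but it also means the sub can change the caller's data, so which form is preferable depends on whether that is wanted.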

This can seem a bit tricky at first, but the reason is that $_ is not the same as @_.
From perlvar:
$_ is the implicit/"default" variable that does not have to be spelled out explicitly for certain functions (e.g. split )
Within a subroutine the array @_ contains the parameters passed to that subroutine
So the reason why
my %Q = %{$_};
says you can't use an undefined value as hash reference is because $_ is not defined.
What you really need here is
my %Q = %{$_[0]};
because that is the first element of @_, which is what was passed to testRef in the first place.
In practice I tend to find myself doing things a little differently because it lends itself to flexibility for future modifications:
sub testRef {
my ( $Q ) = @_;
print $_, "\n" for keys %$Q; # just as an example
}

Related

How to have sub use $_ when parameter omitted?

How can I get a perl sub to use $_ when the parameter is omitted, like chr does? Is this the best way?
my @chars = map { chr } @numbers; # example
my @trimmed_names = map { trim } @names;
sub trim
{
my $str = shift || $_;
$str =~ s/^\s+|\s+$//g;
return $str;
}
The $_ is directly seen in a sub called in its scope, so you can indeed just use it
sub trim { s/^\s+|\s+$//gr } # NOTE: doesn't change $_
where with /r modifier the changed string is returned and original isn't changed, crucial here.
However, this can be tricky and can (easily) result in subtle and very hard-to-find bugs. Here is one ready example. If we changed the $_ in the sub during processing, like
sub trim { # WARNING: caller's data changed
s/^\s+|\s+$//g;
return $_;
}
then the elements of @names in the caller have been changed, which is generally not expected. This is because the upper-scope $_ that the sub changes is itself an alias to the current element in map's body.† As $_ is a convenient default for many things we'd have to keep track of everything used in the sub. So I'd indeed first copy $_, or safer yet localize it, in the sub and work with that.
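The difference between the two versions can be seen in a small sketch. The sub names trim_in_place and trim_copy are made up here to keep the two variants apart; the /r modifier requires Perl 5.14 or later.

```perl
use strict;
use warnings;

# Hypothetical subs for illustration.
sub trim_in_place { s/^\s+|\s+$//g; return $_ }   # edits the caller's $_
sub trim_copy     { return s/^\s+|\s+$//gr }      # /r returns a changed copy

my @names = ('  ann ', ' bob');

my @safe = map { trim_copy() } @names;
print "[$names[0]]\n";    # original still has its spaces

my @risky = map { trim_in_place() } @names;
print "[$names[0]]\n";    # original is now trimmed too
```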
Finally, in order to use either a passed parameter or $_ (at the point of the call)
sub trim {
my $str = @_ ? shift : $_;
$str =~ s/^\s+|\s+$//gr; # /r returns the trimmed copy
}
my @trimmed_names = map { trim } @names; # may omit () if sub declared before
This works because the visibility of $_ is unrelated to the argument list in @_, so one can also pass arguments. Here we also get the (much) safer copying of $_.
The shift || $_ from the question would dismiss a 0 or '' (empty string) in @_, which is in principle valid input; the shift // $_ would dismiss an undef, also a possible input. Thanks to ikegami's comment on this. Thus explicitly test whether there is anything in @_.
While passing a variable that's undef isn't valid here it may be valid input in general. More to the point, the premise here is to use an argument if provided, so we should do that and then (hopefully) detect an error from the calling code (if passing undef shouldn't have happened), instead of quietly side-stepping it, by switching to $_.
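The three ways of choosing between the argument and $_ can be compared side by side. This is only a sketch; pick_or, pick_defined and pick_count are hypothetical names that isolate the selection step and do no trimming.

```perl
use strict;
use warnings;

# Three hypothetical variants that differ only in how they pick the input.
sub pick_or      { my $s = shift || $_;     return $s }
sub pick_defined { my $s = shift // $_;     return $s }
sub pick_count   { my $s = @_ ? shift : $_; return $s }

$_ = 'fallback';

print pick_or(0), "\n";        # fallback: 0 is false, so it is discarded
print pick_defined(0), "\n";   # 0
print pick_count(0), "\n";     # 0

# Only the @_ test keeps an explicitly passed undef.
print defined(pick_defined(undef)) ? "replaced by \$_\n" : "undef kept\n";
print defined(pick_count(undef))   ? "replaced by \$_\n" : "undef kept\n";
```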
So, my answer is a qualified "yes" -- that's one way to do it; but I may find it uncomfortable to work with a codebase where user's subs mix scopes. This example trim in map is perfectly safe as it stands, but where else may such a function wind up used? Why not just pass arguments?
Note: In order to be able to call a user-defined sub without parentheses we must have it declared in the source before the point of invocation so that the interpreter knows what that bareword (trim) is, since without parens it doesn't have any hints.
† I think it's worth recalling at this point that arguments to a sub are aliased, not copied, so if elements of @_ themselves are changed then caller's data gets changed. This isn't directly related to $_ but the behavior can be.
You can use the _ prototype.
sub trim(_) { $_[0] =~ s/^\s+|\s+\z//rg }
Otherwise, you can simply use $_ if no arguments were provided.
sub trim { ( @_ ? $_[0] : $_ ) =~ s/^\s+|\s+\z//rg }
Either way,
say for map trim, @strings;
-or-
say for map trim($_), @strings;
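As a complete script, the prototype version might look like the sketch below. The sample data is invented, and the sub has to be defined before the calls so that the parser has already seen the prototype.

```perl
use strict;
use warnings;

# The (_) prototype makes a call without arguments receive $_.
# Defined before its first use so the prototype is known at the call sites.
sub trim(_) { return $_[0] =~ s/^\s+|\s+\z//gr }

my @strings = ('  one', 'two  ');

print join('|', map { trim } @strings), "\n";   # one|two
print trim('  three  '), "\n";                  # three
```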

run a subroutine by using argument

My code:
my $aaa = "abc";
sub myp{
print "$_";
}
myp($aaa);
I hope myp can print the argument it get.
But it said
Use of uninitialized value $_ in string at ./arg line 17.
The arguments to a subroutine in Perl are passed in the @_ array. This is not the same as the $_ variable.
A common idiom is to "unpack" these arguments in the first line of a function, e.g.
sub example {
my ($arg1, $arg2) = @_;
print "$arg1 and $arg2";
}
It's also possible to refer to arguments directly as elements of @_, e.g. as $_[0], but this is much harder to read and as such is best avoided.
I usually do something like:
my $first_arg = shift @_;
my $second_arg = shift @_;
You can also use the method of the other response:
my ($first_arg, $second_arg) = @_;
But be careful saying:
my $first_arg = @_;
Since you will get the number of arguments passed to the subroutine.
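The parentheses on the left-hand side are what make the difference. A minimal sketch, with the hypothetical subs count_args and first_arg:

```perl
use strict;
use warnings;

# Hypothetical subs: same right-hand side, different context on the left.
sub count_args { my $n = @_;       return $n }       # scalar context: number of arguments
sub first_arg  { my ($first) = @_; return $first }   # list context: first argument

print count_args('a', 'b', 'c'), "\n";   # 3
print first_arg('a', 'b', 'c'), "\n";    # a
```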
When you refer to $_ you are referencing the default variable; in this case you probably want @_ instead. To get a specific argument, use $_[narg]. Also be careful when passing arrays to subroutines. If you do:
some_sub(@myarray);
you pass the elements of the array as the argument list itself. To pass the array as a single argument, you should say instead:
some_sub(\@myarray);
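A small sketch of what each call style delivers in @_. The subs flat_count and ref_count are hypothetical, as is the extra argument added to show the flattening.

```perl
use strict;
use warnings;

# Hypothetical subs showing what arrives in @_ for each call style.
sub flat_count { return scalar @_ }                        # every element is its own argument
sub ref_count  { my ($aref) = @_; return scalar @$aref }   # first argument is the array reference

my @myarray = (10, 20, 30);

print flat_count(@myarray, 'extra'), "\n";   # 4: the array and 'extra' are merged
print ref_count(\@myarray, 'extra'), "\n";   # 3: the array stays separate
```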

FATAL uninitialized warnings - action at a distance

I recently hit a bug where the use warnings FATAL ... pragma treats muted warnings from elsewhere as a reason to die. Consider the following sample:
use strict;
# In one file:
no warnings;
my %hash;
Foo->bar( my $temp = $hash{ +undef } ); # this lives
Foo->bar( $hash{ +undef } ); # this dies
# Elsewhere
package Foo;
use warnings FATAL => qw(uninitialized);
sub bar {
my ($self, $param) = @_; # perfectly safe
$param = "(undef)"
unless defined $param; # even safer
print "Param: $param\n";
}
Now this of course can be fixed big time using the same policy regarding warnings throughout the project. Or this can be fixed every time it occurs by ruling out undefs in specific places (see # this lives line).
My question is whether there is an acceptable solution for package Foo which doesn't require changing anything that calls it, and whether this is really a bug in Perl itself.
It's not a bug. You are experiencing a side-effect of a feature that prevents needless autovivification of hash elements passed to subs.
Perl passes by reference. That means that changes to the arguments within the function will change the parameters on the outside.
$ perl -E'
sub f { $_[0] = "xyz"; }
f($x);
say $x;
'
xyz
This applies to hash elements too.
$ perl -E'
sub f { $_[0] = "xyz"; }
my %h;
f($h{x});
say $h{x};
'
xyz
The sub doesn't know anything about the hash, so the hash element must be created before the sub is entered for there to be something to which to assign.
...or does it? It would be generally undesirable for f($h{x}) to always create $h{x} if the element doesn't exist. As such, Perl postpones doing the hash lookup until $_[0] is accessed, at which point it's known whether the element needs to be vivified or not. This is why the warning is coming from within the sub.
Specifically, Perl doesn't pass $h{x} to the sub when you call f($h{x}). Instead, it passes a magical scalar that contains both a reference to %h and the key value (x). This postpones doing the hash lookup until $_[0] is accessed, where it's known whether $_[0] is used somewhere assignable or not.
If $_[0] is used in a manner in which it doesn't change (i.e. if it's used as an rvalue), the hash element is looked up without vivifying it.
If $_[0] is used in a manner in which it can change (i.e. if it's used as an lvalue), the hash element is vivified and returned.
$ perl -E'
sub f { my $x = $_[0]; } # $_[0] returns undef without vivifying $h{x}
sub g { $_[0] = "xyz"; } # $_[0] vivifies and returns $h{x}
my %h;
f($h{x});
say 0+keys(%h);
g($h{x});
say 0+keys(%h);
'
0
1

perl subroutine argument lists - "pass by alias"?

I just looked in disbelief at this sequence:
my $line;
$rc = getline($line); # read next line and store in $line
I had understood all along that Perl arguments were passed by value, so whenever I've needed to pass in a large structure, or pass in a variable to be updated, I've passed a ref.
Reading the fine print in perldoc, however, I've learned that @_ is composed of aliases to the variables mentioned in the argument list. After reading the next bit of data, getline() returns it with $_[0] = $data;, which stores $data directly into $line.
I do like this - it's like passing by reference in C++. However, I haven't found a way to assign a more meaningful name to $_[0]. Is there any?
You can, but it's not very pretty:
use strict;
use warnings;
sub inc {
# manipulate the local symbol table
# to refer to the alias by $name
our $name; local *name = \$_[0];
# $name is an alias to first argument
$name++;
}
my $x = 1;
inc($x);
print $x; # 2
The easiest way is probably just to use a loop, since loops alias their arguments to a name; i.e.
sub my_sub {
for my $arg ( $_[0] ) {
# code here sees $arg as an alias for $_[0]
}
}
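A runnable version of the loop idiom might look like this. The sub add_suffix and its data are hypothetical, chosen only to show the write going through to the caller.

```perl
use strict;
use warnings;

# Hypothetical sub: appends a suffix to the caller's variable through an alias.
sub add_suffix {
    for my $text ($_[0]) {   # $text is an alias for the first argument
        $text .= '.bak';     # writes through to the caller's variable
    }
    return;
}

my $file = 'notes';
add_suffix($file);
print "$file\n";    # notes.bak
```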
A version of @Steve's code that allows for multiple distinct arguments:
sub my_sub {
SUB:
for my $thisarg ( $_[0] ) {
for my $thatarg ($_[1]) {
# code here sees $thisarg and $thatarg as aliases
last SUB;
}
}
}
Of course this brings multilevel nesting and its own code readability issues, so use it only when absolutely necessary.

How to "keys %h" if $h is an object?

$h below is an object, but it only contains a regular hash.
my $h = YAML::Syck::LoadFile('have_seen.yaml');
If it was a normal hash then the number of keys would just be keys $h.
Question
How to get the numbers of keys when the hash is in an object?
Update
This is code
#!/usr/bin/perl
use strict;
use YAML::Syck;
my $h = YAML::Syck::LoadFile('h.yaml');
my $links = 100;
print $links - keys $h . "\n";
The yaml file contains
---
010711: 1
---
$h is not an object, but a plain hashref. This is really an operator precedence problem. Use parentheses to bind the argument to the keys function tight.
print $links - keys($h) . "\n";
As Greg Bacon pointed out, on old Perls it is necessary to manually dereference first with %$h or %{ $h } (which is the better style).
Use the keys operator as in
print scalar keys %$h;
Most of the time, an explicit scalar is unnecessary, e.g.,
my $n = keys %$h;
But it’s usually a bad idea to go poking into the internals of an object. Use the public interface instead. Why do you want to do it this way?
My code was also producing the same error Type of argument to keys on reference must be unblessed hashref or arrayref but the difference is that the hash was produced from my own object.
sub getAttributes {
my $self = shift;
return $self->{ATTRIBUTES};
}
I tried a few ways to get keys to de-reference what is returned by $instance->getAttributes() but it seems that once it has been blessed the keys function doesn't want to know.
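One approach that should work regardless of blessing is to dereference explicitly with %{ ... } before calling keys. The sketch below assumes a hypothetical class Thing whose ATTRIBUTES slot holds a blessed hashref; the class names and data are invented for illustration.

```perl
use strict;
use warnings;

package Thing;   # hypothetical class for this sketch

sub new {
    my ($class) = @_;
    # The attribute hash is blessed on purpose, to mimic the situation described.
    my $attrs = bless { color => 'red', size => 3 }, 'Thing::Attrs';
    return bless { ATTRIBUTES => $attrs }, $class;
}

sub getAttributes {
    my $self = shift;
    return $self->{ATTRIBUTES};
}

package main;

my $instance = Thing->new;
my @names = sort keys %{ $instance->getAttributes };   # explicit dereference

print "@names\n";             # color size
print scalar(@names), "\n";   # 2
```

The explicit %{ ... } form hands keys a real hash rather than a reference, so whether the reference is blessed no longer matters.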