Passing a hash to a subroutine without changing it input - perl

I'm trying to debug some strange behavior while handling with a hash in Perl.
I'm passing a hash (not ref) to a subroutine and for some reason it updates it.
some_sub($a,%{$hash});
sub some_sub {
my ($a,%hash) = #_;
my #struct;
while (my ($dir, $data) = each %hash) {
foreach my $id (keys(%{$data})) {
my $entry = $data->{$id};
$entry->{id} = $id;
my $parent = $data->{$entry->{id}};
unless ($parent) {
push #struct, $entry
} else {
push #{$parent->{children}},$entry;
}
}
}
}
my %h= %{$hash};
print Dumper(\%h);
The sub some_sub does change %hash but only for the inner scope, so it should not change the data of the outside %hash. Also, I pass the hash as a hash and not as a hash ref. I suspected the sub some_sub inserts memory addresses into the %hash, but I'm not sure.
How should I debug and solve this issue?
EDIT: I also tried to pass a hash ref to the subroutine and do a dereferencing of the hash ref into another hash while doing all of the operations on the new hash.

Every value in a hash is a scalar. If you have a nested hash, the inner hash is stored as a scalar - a hash reference. Therefore, when changing the nested structures, the changes happen in the referenced hash, which is referenced from the original hash, too.
#! /usr/bin/perl
use warnings;
use strict;
sub change {
my %hash2 = #_;
for my $key (keys %hash2) {
++$_ for values $hash2{$key};
}
}
my %hash = (a => {b => 12, c => 24});
change(%hash);
use Data::Dumper; print Dumper \%hash;
Output:
$VAR1 = {
'a' => {
'b' => 13,
'c' => 25
}
};
The process of obtaining a structure that's similar as the original but contains different references is called cloning or deep copying. See Clone or dclone from Storable.

Arguments are passed to a function as a flat list of scalars, so
some_sub($a, %{$hashref})
has the keys and values of the hash passed as a list after $a
some_sub($a, key, value, ...);
since a function call always takes merely a list of scalars.
These key-value pairs are assigned to a hash in the function so when you work with that hash you directly use references from the calling code, your hash values. So data in the caller gets changed if those references are written to.
The details aren't given but in general one way to avoid changing caller's data in the sub is by introducing local variables for each reference the processing encounters. But then those may themselves contain references so you'd still need to be very careful.
It is simpler to make a full deep copy of the hash, ff the data structure isn't huge. For example
use Storable qw(dclone);
some_sub($v, $hashref);
sub some_sub {
my ($var, $hr) = #_;
my $cloned_hashref = dclone($hr);
# work away with $cloned_hashref
}

Related

Why is the Hashref passed to the Net::Ping constructor, set to an empty hashref after Net::Ping->new($args)?

What am I missing here?
When passing arguments to Net::Ping like this, then $args and $args_copy will both be set to an empty hashref after initializing the constructor Net::Ping->new($args).
use strict;
use warnings;
use Data::Dumper qw(Dumper);
use Net::Ping;
sub _ping {
my ($args) = #_;
my $p = Net::Ping->new($args);
$p->close();
}
my $args = { proto => 'udp' };
my $args_copy = $args;
print Dumper $args; # $VAR1 = { 'proto' => 'udp' }
print Dumper $args_copy; # $VAR1 = { 'proto' => 'udp' }
_ping($args);
print Dumper $args; # $VAR1 = {}
print Dumper $args_copy; # $VAR1 = {}
I see the same behavior on both Strawberry Perl and WSL2 running Ubuntu 20.04.4 LTS with Perl v5.30.0.
This is interesting, a class (constructor) deleting caller's data.
The shown code passes a reference to the Net::Ping constructor and that data gets cleared, and right in the constructor (see below).
To avoid having $args cleared, if that is a problem, pass its copy instead
_ping( { %$args } );
This first de-references the hash and then constructs an anonymous hash reference with it,† and passes that. So $args is safe.
The constructor new uses data from #_ directly (without making local copies), and as it then goes through the keys it also deletes them, I presume for convenience in further processing. (I find that scary, I admit.)
Since a reference is passed to new the data in the calling code can get changed.‡
† When copying a hash (or array) with a complex data structure in it -- when its values themselves contain references -- we need to make a deep copy. One way is to use Storable for it
use Storable qw(dclone);
my $deep_copy = dclone $complex_data_structure;
Here that would mean _ping( dclone $args );. It seems that new can only take a reference to a flat hash (or scalars) so this wouldn't be necessary.
‡ When a sub works directly with the references it gets then it can change data in the caller
sub sub_takes_ref {
my ($ref_data) = #_;
for my $k (keys %$ref_data) {
$ref_data->{$k} = ...; # !!! data in caller changed !!!
}
}
...
my $data = { ... }; # a hashref
sub_takes_ref( $data );
However, if a local copy of arguments is made in the sub then caller's data cannot be changed
use Storable qw(dclone); # for deep copy below
sub sub_takes_ref {
my ($ref_data) = #_;
my $local_copy_of_data = dclone $ref_data;
for my $k (keys %$local_copy_of_data) {
$local_copy_of_data->{$k} = ...; # data in caller safe
}
}
(Just remember to not touch $ref_data but to use the local copy.)
This way of changing data in the caller is of course useful when the sub is meant to work on data structures with large amounts of data, since this way they don't have to be copied. But when that is not the purpose of the sub then we need to be careful, or just make a local copy to be safe.

Subroutine arguments as key-value pairs without a temp variable

In Perl, I've always liked the key-value pair style of argument passing,
fruit( apples => red );
I do this a lot:
sub fruit {
my %args = #_;
$args{apples}
}
Purely for compactness and having more than one way to do it, is there a way to either:
access the key-value pairs without assigning #_ to a hash? I.e. in a single statement?
have the subroutine's arguments automatically become a hash reference, perhaps via a subroutine prototype?
Without:
assigning to a temp variable my %args = #_;
having the caller pass by reference i.e. fruit({ apples => red }); purely for aesthetics
Attempted
${%{\#_}}{apples}
Trying to reference #_, interpret that as a hash ref, and access a value by key.
But I get an error that it's not a hash reference. (Which it isn't ^.^ ) I'm thinking of C where you can cast pointers, amongst other things, and avoid explicit reassignment.
I also tried subroutine prototypes
sub fruit (%) { ... }
...but the arguments get collapsed into #_ as usual.
You can't perform a hash lookup (${...}{...}) without having a hash. But you could create an anonymous hash.
my $apples = ${ { #_ } }{apples};
my $oranges = ${ { #_ } }{oranges};
You could also use the simpler post-dereference syntax
my $apples = { #_ }->{apples};
my $oranges = { #_ }->{oranges};
That would be very inefficient though. You'd be creating a new hash for each parameter. That's why a named hash is normally used.
my %args = #_;
my $apples = $args{apples};
my $oranges = $args{oranges};
An alternative, however, would be to use a hash slice.
my ($apples, $oranges) = #{ { #_ } }{qw( apples oranges )};
The following is the post-derefence version, but it's only available in 5.24+[1]:
my ($apples, $oranges) = { #_ }->#{qw( apples oranges )};
It's available in 5.20+ if you use the following:
use feature qw( postderef );
no warnings qw( experimental::postderef );
If you're more concerned about compactness than efficiency, you can do it this way:
sub fruit {
print( +{#_}->{apples}, "\n" );
my $y = {#_}->{pears};
print("$y\n");
}
fruit(apples => 'red', pears => 'green');
The reason +{#_}->{apples} was used instead of {#_}->{apples} is that it conflicts with the print BLOCK LIST syntax of print without it (or some other means of disambiguation).

Filling hash by reference in a procedure

I'm trying to call a procedure, which is filling a hash by reference. The reference to the hash is given as a parameter. The procedure fills the hash, but when I return, the hash is empty. Please see the code below.
What is wrong?
$hash_ref;
genHash ($hash_ref);
#hash is empty
sub genHash {
my ($hash_ref)=(#_);
#cut details; filling hash in a loop like this:
$hash_ref->{$lid} = $sid;
#hash is generetad , filled and i can dump it
}
You might want to initialize hashref first,
my $hash_ref = {};
as autovivification happens inside function to another lexical variable.
(Not so good) alternative is to use scalars inside #_ array which are directly aliased to original variables,
$_[0]{$lid} = $sid;
And btw, consider use strict; use warnings; to all your scripts.
The caller's $hash_ref is undefined. The $hash_ref in the sub is therefore undefined too. $hash_ref->{$lid} = $sid; autovivifies the sub's $hash_ref, but nothing assigns that hash reference to the caller's $hash_ref.
Solution 1: Actually passing in a hash ref to assign to the caller's $hash_ref.
sub genHash {
my ($hash_ref) = #_;
...
}
my $hash_ref = {};
genHash($hash_ref);
Solution 2: Taking advantage of the fact that Perl passes by reference.
sub genHash {
my $hash_ref = $_[0] ||= {};
...
}
my $hash_ref;
genHash($hash_ref);
-or-
genHash(my $hash_ref);
Solution 3: If the hash is going to be empty initially, why not just create it in the sub?
sub genHash {
my %hash;
...
return \%hash;
}
my $hash_ref = genHash();

perl how to reference hash itself

This might seem to be an odd thing to do, but how do I reference a hash while 'inside' the hash itself? Here's what I'm trying to do:
I have a hash of hashes with a sub at the end, like:
my $h = { A => [...], B => [...], ..., EXPAND => sub { ... } };
. I'm looking to implement EXPAND to see if the key C is present in this hash, and if so, insert another key value pair D.
So my question is, how do I pass the reference to this hash to the sub, without using the variable name of the hash? I expect to need to do this to a few hashes and I don't want to keep having to change the sub to reference the name of the hash it's currently in.
What you've got there is some nested array references, not hashes. Let's assume you actually meant that you have something like this:
my $h = { A => {...}, B => {...}, ..., EXPAND() };
In that case, you can't reference $h from within its own definition, because $h does not exist until the expression is completely evaluated.
If you're content to make it two lines, then you can do this:
my $h = { A=> {...}, B => {...} };
$h = { %$h, EXPAND( $h ) };
The general solution is to write a function that, given a hash and a function to expand that hash, returns that hash with the expansion function added to it. We can close over the hash in the expansion function so that the hash's name doesn't need to be mentioned in it. That looks like this:
use strict;
use warnings;
use 5.010;
sub add_expander {
my ($expanding_hash, $expander_sub) = #_;
my $result = { %$expanding_hash };
$result->{EXPAND} = sub { $expander_sub->($result) };
return $result;
}
my $h = add_expander(
{
A => 5,
B => 6,
},
sub {
my ($hash) = #_;
my ($maxkey) = sort { $b cmp $a } grep { $_ ne 'EXPAND' } keys %$hash;
my $newkey = chr(ord($maxkey) + 1);
$hash->{$newkey} = 'BOO!';
}
);
use Data::Dumper;
say Dumper $h;
$h->{EXPAND}->();
say Dumper $h;
Notice that we are creating $h but that the add_expander call contains no mention of $h. Instead, the sub passed into the call expects the hash it is meant to expand as its first argument. Running add_expander on the hash on the sub creates a closure that will remember which hash the expander is associated with and incorporates it into the hash.
This solution assumes that what should happen when a hash is expanded can vary by subject hash, so add_expander takes an arbitrary sub. If you don't need that degree of freedom, you can incorporate the expansion sub into add_expander.
The hash being built (potentially) happens after EXPAND() runs. I would probably use something like this:
$h = EXPAND( { A=>... } )
Where EXPAND(...) returns the modified hashref or a clone if the original needs to remain intact.

How do I pass a hash to subroutine?

Need help figuring out how to do this. My code:
my %hash;
$hash{'1'}= {'Make' => 'Toyota','Color' => 'Red',};
$hash{'2'}= {'Make' => 'Ford','Color' => 'Blue',};
$hash{'3'}= {'Make' => 'Honda','Color' => 'Yellow',};
&printInfo(%hash);
sub printInfo{
my (%hash) = %_;
foreach my $key (keys %_{
my $a = $_{$key}{'Make'};
my $b = $_{$key}{'Color'};
print "$a $b\n";
}
}
The easy way, which may lead to problems when the code evolves, is simply by assigning the default array #_ (which contains all key-value-pairs as an even list) to the %hash which then rebuilds accordingliy. So your code would look like this:
sub printInfo {
my %hash = #_;
...
}
The better way would be to pass the hash as reference to the subroutine. This way you could still pass more parameters to your subroutine.
printInfo(\%hash);
sub PrintInfo {
my %hash = %{$_[0]};
...
}
An introduction to using references in Perl can be found in the perlreftut
You're so very, very close. There is no %_ for passing hashes, it must be passed in #_. Luckily, Hashes are assigned using a list context, so
sub printInfo {
my %hash = #_;
...
}
will make it work!
Also note, using the & in front of the subroutine call has been, in most cases, unnecessary since at least Perl 5.000. You can call Perl subroutines just like in other languages these days, with just the name and arguments. (As #mob points out in the comments, there are some instances where this is still necessary; see perlsub to understand this more, if interested.)
The best way to pass hashes and arrays is by reference. A reference is simply a way to talk about a complex data structure as a single data point -- something that can be stored in a scalar variable (like $foo).
Read up on references, so you understand how to create a reference and dereference a reference in order to get your original data back.
The very basics: You precede your data structure with a backslash to get the reference to that structure.
my $hash_ref = \%hash;
my $array_ref = \#array;
my $scalar_ref = \$scalar; #Legal, but doesn't do much for you...
A reference is a memory location of the original structure (plus a clue about the structure):
print "$hash_ref\n";
Will print something like:
HASH(0x7f9b0a843708)
To get the reference back into a useable format, you simply put the reference into the correct sigil in front:
my %new_hash = %{ $hash_ref };
You should learn about using references since this is the way you can create extremely complex data structures in Perl, and how Object Oriented Perl works.
Let's say you want to pass three hashes to your subroutine. Here are the three hashes:
my %hash1 = ( this => 1, that => 2, the => 3, other => 4 );
my %hash2 = ( tom => 10, dick => 20, harry => 30 );
my %hash3 = ( no => 100, man => 200, is => 300, an => 400, island => 500 );
I'll create the references for them
my $hash_ref1 = \%hash1;
my $hash_ref2 = \%hash2;
my $hash_ref3 = \%hash3;
And now just pass the references:
mysub ( $hash_ref1, $hash_ref2, $hash_ref3 );
The references are scalar data, so there's no problem passing them to my subroutine:
sub mysub {
my $sub_hash_ref1 = shift;
my $sub_hash_ref2 = shift;
my $sub_hash_ref3 = shift;
Now, I just dereference them, and my subroutine can use them.
my %sub_hash1 = %{ $sub_hash_ref1 };
my %sub_hash2 = %{ $sub_hash_ref2 };
my %sub_hash3 = %{ $sub_hash_ref3 };
You can see what a reference is a reference to by using the ref command:
my $ref_type = ref $sub_hash_ref; # $ref_type is now equal to "HASH"
This is useful if you want to make sure you're being passed the correct type of data structure.
sub mysub {
my $hash_ref = shift;
if ( ref $hash_ref ne "HASH" ) {
croak qq(You need to pass in a hash reference);
}
Also note that these are memory references, so modifying the reference will modify the original hash:
my %hash = (this => 1, is => 2, a => 3 test => 4);
print "$hash{test}\n"; # Printing "4" as expected
sub mysub ( \%hash ); # Passing the reference
print "$hash{test}\n"; # This is printing "foo". See subroutine:
sub mysub {
my $hash_ref = shift;
$hash_ref->{test} = "foo"; This is modifying the original hash!
}
This can be good -- it allows you to modify data passed to the subroutine, or bad -- it allows you to unintentionally modify data passed to the original subroutine.
I believe you want
my %hash;
$hash{'1'}= {'Make' => 'Toyota','Color' => 'Red',};
$hash{'2'}= {'Make' => 'Ford','Color' => 'Blue',};
$hash{'3'}= {'Make' => 'Honda','Color' => 'Yellow',};
printInfo(%hash);
sub printInfo{
my %hash = #_;
foreach my $key (keys %hash){
my $a = $hash{$key}{'Make'};
my $b = $hash{$key}{'Color'};
print "$a $b\n";
}
}
In the line printInfo(%hash) the %hash is expanded to a list with the alternating key-value pairs.
In printInfo, the #_ is this list that, and assigned to %hash it creates again the keys with their corresponding value from the alternating elements in the list.
You can pass them as
The argument list do_hash_thing( %hash )
A reference to the hash in the argument list
`do_hash_thing( #args_before, \%hash, #args_after )
As a reference by prototype, working like keys and other hash operators.
The list works like so:
sub do_hash_thing {
my %hash = #_;
...
}
do_hash_thing( %hash );
This also allows you to "stream" hash arguments as well:
do_hash_thing( %hash_1, %hash_2, parameter => 'green', other => 'pair' );
By reference works like this:
sub do_hash_thing {
my $hash_ref = shift;
...
}
do_hash_thing( \%hash, #other_args );
Here by prototype (\%#). The prototype makes perl look for a hash in the first argument and pass it by reference.
sub do_hash_thing (\%#) {
my $hash_ref = shift;
...
}
do_hash_thing( %hash => qw(other args) );
# OR
do_hash_thing %hash => qw(other args);
Caveat: prototypes don't work on methods.