perl passing hash reference not behaving as expected - perl

(NOTE: I sort of figured this out, see all the way at the end)
It's kind of late and I've been staring at this code for far too long. I finally wrote a short test program to test hashes and passing them by reference and it is not behaving as I expect. I'm sure there is something very simple I'm missing ... can anyone spot it?
#!/usr/bin/perl
use Data::Dumper;
my %hash = ();
print "BEFORE ADDING KEYS\n";
print Dumper (\%hash);
test (\%hash, 10);
print "AFTER ADDING KEYS\n";
print Dumper (\%hash);
sub test {
my %hash = %{$_[0]};
my $number = $_[1];
if ($number == 0) { return; }
print "BEFORE ADDING KEY HASH_REF=$_[0] NUMBER=$number\n";
print Dumper (\%hash);
$hash{$number} = $number;
print "AFTER ADDING KEY\n";
print Dumper (\%hash);
test ($_[0], $number - 1);
}
I expect this code to add the numbers 10 to 1 into my hash, but instead the hash gets wiped out and doesn't contain anything once the test routine finishes recursing. What am I missing? Here is the output:
BEFORE ADDING KEYS
$VAR1 = {};
BEFORE ADDING KEY HASH_REF=HASH(0xdb82fd0) NUMBER=10
$VAR1 = {};
AFTER ADDING KEY
$VAR1 = {
'10' => 10
};
BEFORE ADDING KEY HASH_REF=HASH(0xdb82fd0) NUMBER=9
$VAR1 = {};
AFTER ADDING KEY
$VAR1 = {
'9' => 9
};
...
BEFORE ADDING KEY HASH_REF=HASH(0xdb82fd0) NUMBER=1
$VAR1 = {};
AFTER ADDING KEY
$VAR1 = {
'1' => 1
};
AFTER ADDING KEYS
$VAR1 = {};
Changing this line:
$hash{$number} = $number;
to:
$_[0]->{$number} = $number;
made everything work as expected. Why does the first statement modify a fresh local %hash when I would expect this local %hash to be pointing to the same de-referenced hash reference I originally passed into the routine?

Everything works as intended. Your first statement in the test sub makes a copy of the passed hash:
my %hash = %{$_[0]};
To mutate the passed hash, work with the hashref instead, like this:
my $hashref= $_[0];
$hashref->{key} = 'val';
This approach changes the original hash, not its copy.
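As a minimal sketch of that fix applied to the recursive routine from the question (the name add_keys is made up for this illustration, and the debugging output is left out), every level of the recursion writes through the same reference:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Recursively add the keys $number .. 1 to the hash behind $hashref.
sub add_keys {
    my ($hashref, $number) = @_;
    return if $number == 0;
    $hashref->{$number} = $number;      # writes through the reference
    add_keys($hashref, $number - 1);    # pass the same reference down
}

my %hash;
add_keys(\%hash, 10);

# Print the keys in numeric order.
print join(',', sort { $a <=> $b } keys %hash), "\n";
```

Because no copy is ever made, the caller's %hash ends up holding all ten keys.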

Related

Reading input file and creating new hash

My input files have percentage exposures and I read in only the highest value.
I keep getting errors
odd number of elements in anonymous hash
and
Can't use string as an ARRAY ref while "strict refs" in use
I tried forcing the numbers to be read in as integers but fell flat. Any advice? I'd like to create a hash with key and highest value.
while ( <$DATA_FILE> ) {
chomp;
my @line_values = split( /,/, $_ );
my $state_id = $line_values[0];
# skip header row
next if ( $state_id eq $HEADER_VALUE );
for ( @line_values ) {
tr/%%//d
};
# assign data used as hash keys to variables
my $var1 = int( $line_values[1] );
my $var2 = int( $line_values[2] );
if ( $var1 > $var2 ) {
%report_data = ( { $state_id } => { \@$var1 } )
}
else {
%report_data = ( { $state_id } => { \@$var2 } )
}
} # end while
print \%report_data;
# close file
close( $DATA_FILE ) || printf( STDERR "Failed to close $file_path\n" );
}
It's hard to be sure without any indication of what the input and expected output should be, but at a guess your if blocks should look like this
if ( $var1 > $var2 ) {
$report_data{$state_id} = $var1;
}
else {
$report_data{$state_id} = $var2;
}
or, more simply
$report_data{$state_id} = $var1 > $var2 ? $var1 : $var2;
This line is the culprit:
%report_data = ({$state_id} => {\@$var1})
{ } creates a hashref. You are doing two odd things in this line: using a hashref with a single key ($state_id) as the key in %report_data, and using as the value a hashref with a single key (\@$var1, which tries to dereference the scalar $var1 as an array (@$var1) and then takes a reference to the result; I'm confused, so no wonder the interpreter is as well).
That'd make more sense as
%report_data = ($state_id => $var1);
But that would reset the hash %report_data for every line you read.
What you want is to just set the key of that hash, so first, define the variable before the loop:
my %report_data = ();
and in your while loop, just set a key to a value:
$report_data{ $state_id } = $var1; # or $var2
Finally, you are trying to print a reference to the hash, which makes little sense. I'd suggest a simple loop:
for my $key (keys %report_data) {
print $key . " = " . $report_data{ $key } . "\n";
}
This iterates over all the keys in the hash and then prints them with their values, and looks easy to read and understand, I hope.
In general: don't try to use references unless you know how they work. They are not a magic bullet that you fire at your code and all the problems go poof; quite the opposite, they can become that painful bullet that ends up in your own foot. It's good to learn about references, though, because you will need them when you advance and work more with Perl. perlreftut is a good place to start.
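Putting the pieces from both answers together, here is a rough sketch of the whole loop. The sample rows and the header value 'state' are invented for the illustration, since the question does not show the real input file:

```perl
use strict;
use warnings;

# Hypothetical sample rows: a state id followed by two percentage columns.
my @lines = ("state,a,b", "TX,40%,55%", "CA,70%,20%");

my %report_data;                           # declared once, before the loop
for my $line (@lines) {
    my @line_values = split /,/, $line;
    my $state_id    = shift @line_values;
    next if $state_id eq 'state';          # skip the header row
    tr/%//d for @line_values;              # strip the percent signs
    my ($var1, $var2) = @line_values;
    # Keep only the larger of the two values for this state.
    $report_data{$state_id} = $var1 > $var2 ? $var1 : $var2;
}

for my $key (sort keys %report_data) {
    print "$key = $report_data{$key}\n";
}
```

No references are needed here: plain scalars are stored as the values of an ordinary hash.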

Perl: Passing by reference does not modify the hash

My understanding was that in Perl we pass hashes to functions by reference
Consider the following example, where we modify the hash in the modifyHash function
#!/usr/local/bin/perl
my %hash;
$hash{"A"} = "1";
$hash{"B"} = "2";
print (keys %hash);
print "\n";
modifyHash(\%hash);
print (keys %hash);
print "\n";
sub modifyHash {
my $hashRef = @_[0];
my %myHash = %$hashRef;
$myHash{"C"} = "3";
print (keys %myHash);
print "\n";
}
The output of this script is:
AB
ABC
AB
I would have expected it to be:
AB
ABC
ABC
...as we pass the hash by reference.
What concept am I missing here about passing hashes to functions?
That's because when you do my %myHash = %$hashRef;, you're copying the contents of the hash that $hashRef points to into %myHash. It is the same thing as my %myHash = %hash;, so you're not working on the referenced hash at all.
To work on the hash specified by the reference, try this...
sub modifyHash {
my $hashRef = $_[0];
$hashRef->{"C"} = "3";
print (keys %$hashRef);
print "\n";
}
As pointed out by ThisSuitIsBlackNot in the comments below, @_[0] is better written as $_[0]. You should always use use strict; and use warnings;, which would have caught this. Because you're sending in a reference, you could also have used my $hashRef = shift;.
The problem is with the assignment:
my %myHash = %$hashRef;
This is akin to saying:
$x = 5;
$y = $x;
You're not setting $y to reference the same spot in memory, you're just giving the value of $x to $y. In your example, you're creating a new hash (%myHash) and giving it the value of the hash stored at $hashRef. Any future changes are to the new hash, not the original.
If you want to manipulate the original, you should do something like:
${$hashRef}{"C"} = "3";
or
$hashRef->{"D"} = 4;
There might be a more elegant way of doing it, but as far as I know you want to work with the hash reference.
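The difference between the copy and the reference can be checked directly. The following is only an illustrative sketch (the keys D and E are arbitrary); it relies on the fact that two references compare equal with == when they point at the same variable:

```perl
use strict;
use warnings;

my %hash    = (A => 1, B => 2);
my $hashRef = \%hash;

my %copy = %$hashRef;    # a new hash holding the same key/value pairs
$copy{C} = 3;            # changes only %copy

$hashRef->{D}   = 4;     # changes %hash itself
${$hashRef}{E}  = 5;     # same effect, written without the arrow

print join('', sort keys %hash), "\n";    # ABDE
print join('', sort keys %copy), "\n";    # ABC

print "hashRef points at %hash\n"       if $hashRef == \%hash;
print "hashRef does not point at copy\n" if $hashRef != \%copy;
```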

Build hash of hash in perl

I'm new to using perl and I'm trying to build a hash of a hash from a tsv. My current process is to read in a file and construct a hash and then insert it into another hash.
my %hoh = ();
while (my $line = <$tsv>)
{
chomp $line;
my %hash;
my @data = split "\t", $line;
my $id;
my $iter = each_array(@columns, @data);
while(my($k, $v) = $iter->())
{
$hash{$k} = $v;
if($k eq 'Id')
{
$id = $v;
}
}
$hoh{$id} = %hash;
}
print "dump: ", Dumper(%hoh);
This outputs:
dump
$VAR1 = '1234567890';
$VAR2 = '17/32';
$VAR3 = '1234567891';
$VAR4 = '17/32';
.....
Instead of what I would expect:
dump
{
'1234567890' => {
'k1' => 'v1',
'k2' => 'v2',
'k3' => 'v3',
'k4' => 'v4',
'id' => '1234567890'
},
'1234567891' => {
'k1' => 'v1',
'k2' => 'v2',
'k3' => 'v3',
'k4' => 'v4',
'id' => '1234567891'
},
........
};
My limited understanding is that when I do $hoh{$id} = %hash; it's inserting a reference to %hash? What am I doing wrong? Also, is there a more succinct way to use my columns and data arrays as key/value pairs in my %hash object?
-Thanks in advance,
Niru
To get a reference, you have to use \:
$hoh{$id} = \%hash;
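%hash is the hash, not the reference to it. In scalar context, it returns the string X/Y, where X is the number of used buckets and Y the total number of buckets in the hash (i.e. nothing useful). That is the behaviour that produced 17/32 above; since Perl 5.26 a hash in scalar context returns the number of keys instead.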
To get a reference to a hash variable, you need to use \%hash (as choroba said).
A more succinct way to assign values to columns is to assign to a hash slice, like this:
my %hoh = ();
while (my $line = <$tsv>)
{
chomp $line;
my %hash;
@hash{@columns} = split "\t", $line;
$hoh{$hash{Id}} = \%hash;
}
print "dump: ", Dumper(\%hoh);
A hash slice (@hash{@columns}) means essentially the same thing as ($hash{$columns[0]}, $hash{$columns[1]}, $hash{$columns[2]}, ...) up to however many columns you have. By assigning to it, I'm assigning the first value from split to $hash{$columns[0]}, the second value to $hash{$columns[1]}, and so on. It does exactly the same thing as your while ... $iter loop, just without the explicit loop (and it doesn't extract the $id).
There's no need to compare each $k to 'Id' inside a loop; just store it in the hash as a normal field and extract it afterwards with $hash{Id}. (Aside: Is your column header Id or id? You use Id in your loop, but id in your expected output.)
If you don't want to keep the Id field in the individual entries, you could use delete (which removes the key from the hash and returns the value):
$hoh{delete $hash{Id}} = \%hash;
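For a version that can be run without the TSV file, here is a small sketch of the slice approach. The column names and the two rows are placeholders invented for the example:

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical column names and rows standing in for the TSV file.
my @columns = qw(Id k1 k2);
my @rows    = ("1234567890\tv1\tv2", "1234567891\tv3\tv4");

my %hoh;
for my $line (@rows) {
    my %hash;                               # a fresh hash for every row
    @hash{@columns} = split /\t/, $line;    # hash slice assignment
    $hoh{ $hash{Id} } = \%hash;             # store a reference, not the hash
}

$Data::Dumper::Sortkeys = 1;                # stable key order in the dump
print Dumper(\%hoh);
```

Each inner hash can then be reached with two subscripts, for example $hoh{'1234567890'}{k1}.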
Take a look at the documentation included in Perl. The command perldoc is very helpful. You can also look at the Perldoc webpage too.
One of the tutorials is a tutorial on Perl references. It will help clarify a lot of your questions and explains referencing and dereferencing.
I also recommend that you look at CPAN. This is an archive of various Perl modules that can do many various tasks. Look at Text::CSV. This module will do exactly what you want, and even though it says "CSV", it works with tab separated files too.
You left out the backslash in front of the hash you're trying to take a reference to. You have:
$hoh{$id} = %hash;
Probably want:
$hoh{$id} = \%hash;
Also, when you dump a hash with Data::Dumper, pass a reference to it. Dumper takes a list of arguments, so a hash passed directly is flattened into its keys and values, and each one is dumped as a separate $VAR.
You have:
print "dump: ", Dumper(%hoh);
You should have:
print "dump: ", Dumper( \%hoh );
My attempt at the program:
#! /usr/bin/env perl
#
use warnings;
use strict;
use autodie;
use feature qw(say);
use Data::Dumper;
use constant {
FILE => "test.txt",
};
open my $fh, "<", FILE;
#
# First line with headers
#
my $line = <$fh>;
chomp $line;
my @headers = split /\t/, $line;
my %hash_of_hashes;
#
# Rest of file
#
while ( my $line = <$fh> ) {
chomp $line;
my %line_hash;
my @values = split /\t/, $line;
for my $index ( ( 0..$#values ) ) {
$line_hash{ $headers[$index] } = $values[ $index ];
}
$hash_of_hashes{ $line_hash{id} } = \%line_hash;
}
say Dumper \%hash_of_hashes;
You should only store a reference to a variable if you do so in the last line before the variable goes out of scope. In your script, you declare %hash inside the while loop, so placing this statement as the last in the loop is safe:
$hoh{$id} = \%hash;
If it's not the last statement (or you're not sure it's safe), create an anonymous structure to hold the contents of the variable:
$hoh{$id} = { %hash };
This makes a copy of %hash, which is slower, but any subsequent changes to %hash will not affect what you stored.
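To see where the two forms actually differ, consider this sketch, in which %hash is deliberately declared outside the loop (unlike in the question) so that the same variable is reused on every pass:

```perl
use strict;
use warnings;

my (%by_ref, %by_copy);
my %hash;                          # declared OUTSIDE the loop on purpose

for my $id (1 .. 2) {
    %hash = (Id => $id);
    $by_ref{$id}  = \%hash;        # every entry points at the same hash
    $by_copy{$id} = { %hash };     # each entry gets its own anonymous copy
}

print "by_ref:  $by_ref{1}{Id} $by_ref{2}{Id}\n";      # by_ref:  2 2
print "by_copy: $by_copy{1}{Id} $by_copy{2}{Id}\n";    # by_copy: 1 2
```

With my %hash declared inside the loop body, each iteration gets a fresh hash, so storing \%hash there does not run into this sharing.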

How do I return an array and a hashref?

I want to make a subroutine that adds elements (keys with values) to a previously-defined hash. This subroutine is called in a loop, so the hash grows. I don’t want the returning hash to overwrite existing elements.
At the end, I would like to output the whole accumulated hash.
Right now it doesn’t print anything. The final hash looks empty, but that shouldn’t be.
I’ve tried it with hash references, but it does not really work. In a short form, my code looks as follows:
sub main{
my %hash;
%hash=("hello"=>1); # entry for testing
my $counter=0;
while($counter>5){
my(@var, $hash)=analyse($one, $two, \%hash);
print ref($hash);
# try to dereference the returning hash reference,
# but the error msg says: its not an reference ...
# in my file this is line 82
%hash=%{$hash};
$counter++;
}
# here trying to print the final hash
print "hash:", map { "$_ => $hash{$_}\n" } keys %hash;
}
sub analyse{
my $one=shift;
my $two=shift;
my %hash=%{shift @_};
my @array; # gets filled some where here and will be returned later
# adding elements to %hash here as in
$hash{"j"} = 2; #used for testing if it works
# test here whether the key already exists or
# otherwise add it to the hash
return (@array, \%hash);
}
but this doesn’t work at all: the subroutine analyse receives the hash, but its returned hash reference is empty or I don’t know. At the end nothing got printed.
First it said, it's not a reference, now it says:
Can't use an undefined value as a HASH reference
at C:/Users/workspace/Perl_projekt/Extractor.pm line 82.
Where is my mistake?
I am thankful for any advice.
Arrays get flattened in Perl, so your hashref gets slurped into @var and $hash is left undefined.
Try something like this:
my ($array_ref, $hash_ref) = analyze(...);
sub analyze {
...
return (\@array, \%hash);
}
If you pass the hash by reference (as you're doing), you needn't return it as a subroutine return value. Your manipulation of the hash in the subroutine will stick.
my %h = ( test0 => 0 );
foreach my $i ( 1..5 ) {
do_something($i, \%h);
}
while ( my ($k, $v) = each %h ) { print "$k = $v\n"; }
sub do_something {
my $num = shift;
my $hash = shift;
$hash->{"test${num}"} = $num; # note the use of the -> dereference operator
}
Your use of the @array inside the subroutine will need a separate question :-)
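Both suggestions can be combined in one runnable sketch. The body of analyse below is a made-up stand-in, since the question does not show how the array is really filled; the point is only that two references come back as two scalars and that the caller's hash is modified in place:

```perl
use strict;
use warnings;

# Hypothetical stand-in for analyse(): returns an array ref and a hash ref.
sub analyse {
    my ($one, $two, $hash_ref) = @_;
    my @array = ($one, $two);
    # Add the key only if it is not there yet, so existing entries survive.
    $hash_ref->{j} = 2 unless exists $hash_ref->{j};
    return (\@array, $hash_ref);    # two scalars, so nothing gets flattened
}

my %hash = (hello => 1);
my ($array_ref, $hash_ref) = analyse('a', 'b', \%hash);

print "array: @$array_ref\n";
print "$_ => $hash{$_}\n" for sort keys %hash;
```

Note that the loop condition in the question, while($counter>5), is false on the first check when $counter starts at 0, so that loop body never runs.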

Perl modifying hash reference in subroutine

I am having trouble understanding the hash references and changing the hash in place, instead of returning it. I want to write a sub routine which will return a value from hash and also modify the hash. I was facing some issues while coding for it. So, I wrote the following basic code to understand modifying the hash in place.
#!/usr/local/bin/perl
#Check hash and array references
#Author: Sidartha Karna
use warnings;
use strict;
use Data::Dumper;
sub checkHashRef{
my ($hashRef, $arrVal) = @_;
my %hashDeref = %{$hashRef};
$hashDeref{'check'} = 2;
push(@{$arrVal}, 3);
print "There:" ;
print Dumper $hashRef;
print Dumper %hashDeref;
print Dumper $arrVal
}
my %hashVal = ('check', 1);
my @arrVal = (1, 2);
checkHashRef(\%hashVal, \@arrVal);
print "here\n";
print Dumper %hashVal;
print Dumper @arrVal;
The output observed is:
There:$VAR1 = {
'check' => 1
};
$VAR1 = 'check';
$VAR2 = 2;
$VAR1 = [
1,
2,
3
];
here
$VAR1 = 'check';
$VAR2 = 1;
$VAR1 = 1;
$VAR2 = 2;
$VAR3 = 3;
From the output, I inferred that, changes to hashDeref are not modifying the data in the reference. Is my understanding correct? Is there a way to modify the hash variable in place instead of returning it.
This is making a (shallow) copy of %hashVal:
my %hashDeref = %{$hashRef};
The hash-ref $hashRef still points to %hashVal but %hashDeref doesn't, it is just a copy. If you want to modify the passed hash-ref in-place, then work with the passed hash-ref:
sub checkHashRef{
my ($hashRef, $arrVal) = @_;
$hashRef->{'check'} = 2;
#...
That will leave your changes in %hashVal. In the array case, you never make a copy, you just dereference it in-place:
push(@{$arrVal}, 3);
and the change made through $arrVal shows up in @arrVal.
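The word "shallow" matters once the hash holds references itself. In this sketch (the list key is added purely for illustration), the copy gets its own top-level pairs but still shares the inner array with the original:

```perl
use strict;
use warnings;

my %hashVal = (check => 1, list => [1, 2]);

my %copy = %hashVal;          # shallow copy: top-level pairs only
$copy{check} = 2;             # does not touch %hashVal
push @{ $copy{list} }, 3;     # the inner array is shared, so this does

print "check: $hashVal{check}\n";        # check: 1
print "list:  @{ $hashVal{list} }\n";    # list:  1 2 3
```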