Perl hash, array and references - perl

I have these 3 lines of code in a sub and I'm trying to write them together on one line only, but I'm quite lost:
my %p = @_;
my $arr = $p{name};
my @a = @$arr;
What's the correct way of doing this?
Thank you!

my %p = @_;
@_ is assumed to contain key-value pairs, which are then used to construct the hash %p.
my $arr = $p{name};
The argument list is assumed to have contained something along the lines of name, [1, 2, 3], so that $p{name} is a reference to an array.
my @a = @$arr;
Dereference that array reference to get the array @a.
Here is an invocation that might work with this prelude in a sub:
func(this => 'that', name => [1, 2, 3]);
If you want to reduce the whole prelude to a single statement, you can use:
my @a = @{ { @_ }->{name} };
as in:
#!/usr/bin/env perl
use strict;
use warnings;
use YAML::XS;
func(this => 'that', name => [1, 2, 3]);
sub func {
my @a = @{ { @_ }->{name} };
print Dump \@a;
}
Output:
---
- 1
- 2
- 3
If the array pointed to by name is large, and if you do not need a shallow copy, however, it may be better to just stick with references:
my $aref = { @_ }->{ name };
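The difference between the copying one-liner and the reference version can be sketched as follows. This is only an illustration: the sub names with_copy and with_ref and the sample data are made up for the example, not taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: copies the array found under the "name" key.
sub with_copy {
    my @items = @{ { @_ }->{name} };   # shallow copy of the caller's array
    push @items, 'extra';              # changes only the copy
    return scalar @items;
}

# Hypothetical helper: keeps working through the reference.
sub with_ref {
    my $aref = { @_ }->{name};         # same array as the caller's
    push @$aref, 'extra';              # changes the caller's array too
    return scalar @$aref;
}

my @data = (1, 2, 3);

with_copy(name => \@data);
my $size_after_copy = @data;           # still 3: the sub modified its own copy

with_ref(name => \@data);
my $size_after_ref = @data;            # now 4: the sub modified @data itself

print "after with_copy: $size_after_copy, after with_ref: $size_after_ref\n";
```

So the reference version avoids the copy, at the cost that any change made inside the sub is visible to the caller.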

OK so what you're doing is:
Assign a list of elements passed to the sub, to a hash.
extract a value from that hash (that appears to be an array reference)
dereference that into a standalone array.
Now, I'm going to have to make some guesses as to what you're putting in:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
sub test {
my %p = @_;
my $arr = $p{name};
my @a = @$arr;
print Dumper \@a;
}
my %input = ( fish => [ "value", "another value" ],
name => [ "more", "less" ], );
test ( %input );
So with that in mind:
sub test {
print join "\n", @{{@_}->{name}},"\n";
}
But actually, I'd suggest what you probably want to do is pass in the hashref in the first place:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
sub test {
my ( $input_hashref ) = @_;
print Dumper \@{$input_hashref -> {name}};
}
my %input = ( fish => [ "value", "another value" ],
name => [ "more", "less" ], );
test ( \%input );
Also:
Don't use single-letter variable names. It's bad style.
That goes double for a and b, because $a and $b are for sorting. (And using @a is confusing as a result.)

Related

what is the difference between taking reference to a variable using \ and {},[] in perl?

The code below works fine, but if I replace push @array,{%hash} with push @array,\%hash then it doesn't. Can someone please help me understand the difference? I believe {%hash} refers to an anonymous hash. Does it mean an anonymous hash lives longer than a reference to a named hash (\%hash)?
use strict;
use warnings;
use Data::Dumper;
my @array;
my %hash;
%hash = ('a' => 1,
'b' => 2,
'c' => 3,);
push @array,{%hash};
%hash = ('e' => 1,
'f' => 2,
'd' => 3,);
push @array,{%hash};
print Dumper \@array;
output
$VAR1 = [
{
'c' => 3,
'a' => 1,
'b' => 2
},
{
'e' => 1,
'd' => 3,
'f' => 2
}
];
UPDATE
Below is the actual code I am working on. I think that in this case taking a copy is the only possible solution. Please correct me if I am wrong.
use Data::Dumper;
use strict;
use warnings;
my %csv_data;
my %temp_hash;
my @cols_of_interest = qw(dev_file test_file diff_file status);
<DATA>; #Skipping the header
while (my $row = <DATA>) {
chomp $row;
my @array = split /,/,$row;
@temp_hash{@cols_of_interest} = @array[3..$#array];
push @{$csv_data{$array[0]}{$array[1] . ':' . $array[2]}},{%temp_hash};
}
print Dumper \%csv_data;
__DATA__
dom,type,id,dev_file,test_file,diff_file,status
A,alpha,1234,dev_file_1234_1.txt,test_file_1234_1.txt,diff_file_1234_1.txt,pass
A,alpha,1234,dev_file_1234_2.txt,test_file_1234_2.txt,diff_file_1234_2.txt,fail
A,alpha,1234,dev_file_1234_3.txt,test_file_1234_3.txt,diff_file_1234_3.txt,pass
B,beta,4567,dev_file_4567_1.txt,test_file_4567_1.txt,diff_file_4567_1.txt,pass
B,beta,4567,dev_file_4567_2.txt,test_file_4567_2.txt,diff_file_4567_2.txt,fail
C,gamma,3435,dev_file_3435_1.txt,test_file_3435_1.txt,diff_file_3435_1.txt,pass
D,hexa,6768,dev_file_6768_1.txt,test_file_6768_1.txt,diff_file_6768_1.txt,fail
Both \%hash and {%hash} create references, but they reference two different things.
\%hash is a ref to %hash. If dereferenced, its values will change with the values in %hash.
{%hash} creates a new anonymous hash reference from the values in %hash. It creates a copy. It's the simplest way of creating a shallow copy of a data structure in Perl. If you alter %hash, this copy is not affected.
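A minimal sketch of the difference, using made-up values. It also shows why pushing \%hash twice and reassigning %hash in between, as in the question, ends up with two elements that point at the same hash.

```perl
use strict;
use warnings;

my %hash = ( a => 1 );

my $alias = \%hash;       # reference to %hash itself
my $copy  = { %hash };    # reference to a new hash holding a shallow copy

$hash{a} = 42;            # change the original

print "alias: $alias->{a}\n";   # follows %hash
print "copy:  $copy->{a}\n";    # keeps the value it was created with

my @array;
push @array, \%hash;
%hash = ( b => 2 );       # reuses the same hash, so the first element changes too
push @array, \%hash;

print "same hash twice\n" if $array[0] == $array[1];
```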
How long a variable lives has nothing to do with what kind the variable is, or how it was created. Only the scope is relevant for that. References in Perl are a special case here, because there is an internal ref counter that keeps track of references to a value, so that it is kept alive if there are still references around somewhere even if it goes out of scope. That's why this works:
sub frobnicate {
my %hash = ( foo => 'bar' );
return \%hash;
}
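As a small sketch of that behaviour, calling frobnicate twice (the variable names $first and $second are only for illustration) gives two independent hashes, each kept alive by the reference that was returned:

```perl
use strict;
use warnings;

sub frobnicate {
    my %hash = ( foo => 'bar' );
    return \%hash;               # the returned reference keeps this hash alive
}

my $first  = frobnicate();
my $second = frobnicate();       # a separate hash: my %hash is fresh on every call

$first->{foo} = 'changed';

print "first:  $first->{foo}\n";
print "second: $second->{foo}\n";
```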
If you want to disassociate the reference from the initial value, you need to turn it into a weak reference via weaken from Scalar::Util. That way, the ref count will not be influenced by it, but it will still be related to the value, while a copy would not be.
See perlref and perlreftut for more information on references. This question deals with how to see the ref count. A description for that is also available in the chapter Reference Counts and Mortality in perlguts.
You can't really compare \ to {} and [] since they don't do the same thing at all.
{ LIST } is short for my %anon = LIST; \%anon
[ LIST ] is short for my @anon = LIST; \@anon
Maybe you meant to compare
 
my %hash = ...;
push @a, \%hash;
 
push @a, { ... };
 
my %hash = ...;
push @a, { %hash };
The first snippet places a reference to %hash in @a. This is presumably found in a loop. As long as my %hash is found in the loop, a reference to a new hash will be placed in @a each time.
The second snippet does the same, just using an anonymous hash.
The third snippet makes a copy of %hash, and places a reference to that copy in @a. It gives the impression of wastefulness, so it's discouraged. (It's actually not that wasteful because it allows %hash to be reused.)
You could also write your code
# In reality, the two blocks below are probably the body of one sub or one loop.
{
my %hash = (
a => 1,
b => 2,
c => 3,
);
push @a, \%hash;
}
{
my %hash = (
d => 3,
e => 1,
f => 2,
);
push @a, \%hash;
}
or
push @a, {
a => 1,
b => 2,
c => 3,
};
push @a, {
d => 3,
e => 1,
f => 2,
};
my @cols_of_interest = qw( dev_file test_file diff_file status );
my %csv_data;
if (defined( my $row = <DATA> )) {
chomp $row;
my @cols = split(/,/, $row);
my %cols_of_interest = map { $_ => 1 } @cols_of_interest;
my @cols_to_delete = grep { !$cols_of_interest{$_} } @cols;
while ( my $row = <DATA> ) {
chomp $row;
my %row; @row{@cols} = split(/,/, $row);
delete @row{@cols_to_delete};
push @{ $csv_data{ $row{dev_file} }{ "$row{test_file}:$row{diff_file}" } }, \%row;
}
}
Better yet, let's use a proper CSV parser.
use Text::CSV_XS qw( );
my @cols_of_interest = qw( dev_file test_file diff_file status );
my $csv = Text::CSV_XS->new({
auto_diag => 2,
binary => 1,
});
my @cols = $csv->header(\*DATA);
my %cols_of_interest = map { $_ => 1 } @cols_of_interest;
my @cols_to_delete = grep { !$cols_of_interest{$_} } @cols;
my %csv_data;
while ( my $row = $csv->getline_hr(\*DATA) ) {
delete @$row{@cols_to_delete};
push @{ $csv_data{ $row->{dev_file} }{ "$row->{test_file}:$row->{diff_file}" } }, $row;
}

Size of array inside hash

EDIT: Providing code as per demand from mppac:
I wanted the length of array inside hash.
Why is below showing undef?
$ >cat test.pl
#!/usr/bin/perl
use Data::Dumper;
my %static_data_key = (
'NAME' =>['RAM','SHYAM','RAVI','HARI'],
);
print Dumper(\%static_data_key);
$ >./test.pl
$VAR1 = {
'NAME' => [
'RAM',
'SHYAM',
'RAVI',
'HARI'
]
};
The return value of a Perl array in scalar context is the array's size. For example:
my @array = ( 'a', 'b', 'c' );
my $size = @array;
print "$size\n";
This code will print '3'. When dereferenced, anonymous arrays share this characteristic:
my $aref = [ 'a', 'b', 'c' ];
print $aref, "\n"; # ARRAY(0x1e33148)... useless in this case.
my $size = @{$aref}; # Dereference $aref, in scalar context.
print "$size\n"; # 3
The code I'm demonstrating takes a few unnecessary steps to lend clarity. Now consider this:
print scalar @{[ 'a', 'b', 'c']}, "\n"; # 3
Here we're constructing an anonymous array and immediately dereferencing it. We obtain its return value in scalar context, which happens to be 3.
Finally, let's put that anonymous array into a hash:
my %hash = (
NAME => [ 'joseph', 'frank', 'pete' ]
);
print scalar @{$hash{NAME}}, "\n";
Read that last line from the middle outward: first we obtain the value stored in the NAME element within %hash. That is a reference to an anonymous array, so we dereference it with @{ ... }, and we use scalar to force scalar context. The output is 3.
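As a small sketch building on the same %hash, scalar context can also come from a scalar assignment or a numeric comparison, and the last index is available through the $#{ ... } form. The variable names $count and $last_index are only for illustration.

```perl
use strict;
use warnings;

my %hash = (
    NAME => [ 'joseph', 'frank', 'pete' ],
);

my $count      = @{ $hash{NAME} };    # scalar assignment gives the element count
my $last_index = $#{ $hash{NAME} };   # index of the last element (count - 1)

print "count: $count, last index: $last_index\n";

# A numeric comparison also supplies scalar context, so scalar() is not needed here.
print "more than two names\n" if @{ $hash{NAME} } > 2;
```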
#!/usr/bin/perl
# your code goes here
use strict;
use warnings;
my %static_data_key = (
'NAME' =>['RAM','SHYAM','RAVI','HARI'],
);
print scalar @{$static_data_key{'NAME'}};

sum hash of hash values using perl

I have a Perl script that parses an Excel file and does the following : It counts for each value in column A, the number of elements it has in column B, the script looks like this :
use strict;
use warnings;
use Spreadsheet::XLSX;
use Data::Dumper;
use List::Util qw( sum );
my $col1 = 0;
my %hash;
my $excel = Spreadsheet::XLSX->new('inout_chartdata_ronald.xlsx');
my $sheet = ${ $excel->{Worksheet} }[0];
$sheet->{MaxRow} ||= $sheet->{MinRow};
my $count = 0;
# Iterate through each row
foreach my $row ( $sheet->{MinRow}+1 .. $sheet->{MaxRow} ) {
# The cell in column 1
my $cell = $sheet->{Cells}[$row][$col1];
if ($cell) {
# The adjacent cell in column 2
my $adjacentCell = $sheet->{Cells}[$row][ $col1 + 1 ];
# Use a hash of hashes
$hash{ $cell->{Val} }{ $adjacentCell->{Val} }++;
}
}
print "\n", Dumper \%hash;
The output looks like this :
$VAR1 = {
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
This works great, my question is : How can I access the elements of this output $VAR1 in order to do : for value 13, klm + hij = 3 and get a final output like this :
$VAR1 = {
'13' => {
'somename' => 3,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
So basically what I want to do is loop through my final hash of hashes and access its specific elements based on a unique key and finally do their sum.
Any help would be appreciated.
Thanks
I used @do_sum to indicate what changes you want to make. The new key is hardcoded in the script. Note that the new key is not created if none of the keys exists in the subhash (the $found flag).
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my %hash = (
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
);
my @do_sum = qw(klm hij);
for my $num (keys %hash) {
my $found;
my $sum = 0;
for my $key (@do_sum) {
next unless exists $hash{$num}{$key};
$sum += $hash{$num}{$key};
delete $hash{$num}{$key};
$found = 1;
}
$hash{$num}{somename} = $sum if $found;
}
print Dumper \%hash;
It sounds like you need to learn about Perl References, and maybe Perl Objects which are just a nice way to deal with references.
As you know, Perl has three basic data-structures:
Scalars ($foo)
Arrays (@foo)
Hashes (%foo)
The problem is that these data structures can only contain scalar data. That is, each element in an array can hold a single value or each key in a hash can hold a single value.
In your case %hash is a Hash where each entry in the hash references another hash. For example:
Your %hash has an entry in it with a key of 13. This doesn't contain a plain scalar value, but a reference to another hash with three keys in it: klm, hij, and lkm. You can reference this via this syntax:
${ $hash{13} }{klm} = 1
${ $hash{13} }{hij} = 2
${ $hash{13} }{lkm} = 4
The curly braces may or may not be necessary. However, %{ $hash{13} } references that hash contained in $hash{13}, so I can now reference the keys of that hash. You can imagine this getting more complex as you talk about hashes of hashes of arrays of hashes of arrays. Fortunately, Perl includes an easier syntax:
$hash{13}->{klm} = 1
$hash{13}->{hij} = 2
$hash{13}->{lkm} = 4
Read up about hashes and how to manipulate them. After you get comfortable with this, you can start working on learning about Object Oriented Perl which handles references in a safer manner.
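A short sketch of walking such a hash of hashes, using the data from the question. The %total hash and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my %hash = (
    13 => { klm => 1, hij => 2, lkm => 4 },
    12 => { abc => 2, efg => 2 },
);

# The arrow between adjacent subscripts is optional, so these name the same element.
my $with_arrow    = $hash{13}->{klm};
my $without_arrow = $hash{13}{klm};

# Sum every value in each inner hash.
my %total;
for my $outer (sort keys %hash) {
    my $inner = $hash{$outer};                  # a hash reference
    $total{$outer} += $_ for values %$inner;    # dereference to get at the values
    print "$outer: $total{$outer}\n";
}
```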

Perl 2x2 array addition using Subroutines and Reference

My question in Perl is:
Define 2x2 arrays using anonymous lists. Pass the arrays to a subroutine and add them together. Return a reference to sum array and print the values from the main part of the program.
My script is:
#!/usr/bin/perl
use strict;
use warnings;
my @array = ([1,2],[4,5]);
my $refarray = \@array;
print sumarray($refarray);
sub sumarray
{
$refarray = shift;
foreach (@{$refarray})
{
$refarray = ($refarray[0]->[0]+$refarray[1]->[0],$refarray[0]->[1]+$refarray[1]->[1]);
}
return $refarray;
}
Where am I going wrong? Please help. Thanks in advance.
I am getting the output as 0.
If I use use strict; and use warnings; I will get the error message as
Global symbol "@refarray" requires explicit package name at line 23.
Global symbol "@refarray" requires explicit package name at line 23.
Global symbol "@refarray" requires explicit package name at line 23.
Global symbol "@refarray" requires explicit package name at line 23.
Execution aborted due to compilation errors.
A few problems with your code:
First, in your for-loop, you are modifying your reference $refarray, which you should not do.
Second, $refarray[0]->[0] will not compile under use strict, because it refers to an element of an array @refarray that was never declared. Since $refarray is a reference to an array, you should either access its elements with the arrow, as in $refarray->[0][0], or dereference it explicitly, as in $$refarray[0]->[0].
Having said that, I think you should replace your subroutine with this one:
use strict;
use warnings;
my #array = ([1,2],[4,5]);
my $refarray = \@array;
my $sum = sumarray($refarray);
print "$sum->[0] $sum->[1]\n";
sub sumarray {
my $ref = shift;
return [$ref->[0][0] + $ref->[1][0], $ref->[0][1] + $ref->[1][1]];
}
OUTPUT: -
5 7
Try this:
#!/usr/bin/perl -w
my $r = sumarray([1,2],[3,4]);
print $r->[0], " ", $r->[1], "\n";
sub sumarray {
my ($a, $b) = @_;
return [
$a->[0]+$b->[0],
$a->[1]+$b->[1]
];
}
It could be expressed very simply with a combination of the list operations sum (from the core module List::Util) and map.
Code
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use List::Util 'sum';
sub sum_square {
my @arrays = @_;
return [ map {sum @$_} @arrays ];
}
say join ', ' => @{ sum_square([1,2], [4,5]) };
Output:
3, 9
Since this is Perl, it could be expressed in a more compact way.
Shortification
sub sum_square { [ map {sum @$_} @_ ] }
say join ', ' => @{ sum_square([1,2], [4,5]) };
Shortification2
say join ', ' => map {sum @$_} [1,2], [4,5];
Edit: sum the other way round
If the function should be a column sum instead of a line sum, this modification should do the trick (iterate over the indices of the first array):
sub sum_square {
my @arrays = @_;
return [ map { my $i = $_; # $i: all indices of the first array
sum map $_->[$i] => @arrays # sum over all $i-th values of @arrays
} 0 .. $#{$arrays[0]} ];
}
Output:
5, 7
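If the exercise is read as adding two separate 2x2 arrays element by element, a sketch along the following lines might fit. The sub name add_matrices and the second matrix's values are assumptions made for illustration, and both arguments are assumed to be array references with the same dimensions.

```perl
use strict;
use warnings;

# add_matrices is a hypothetical name; it expects two array-of-array references
# of equal dimensions and returns a reference to a new array of arrays.
sub add_matrices {
    my ($left, $right) = @_;
    my @sum;
    for my $i (0 .. $#$left) {
        for my $j (0 .. $#{ $left->[$i] }) {
            $sum[$i][$j] = $left->[$i][$j] + $right->[$i][$j];
        }
    }
    return \@sum;
}

my $sum = add_matrices( [ [1, 2], [4, 5] ], [ [10, 20], [30, 40] ] );

# Print one row per line from the main part of the program.
print join(' ', @$_), "\n" for @$sum;
```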

perl list context of hash refs

Why does this work? That is, line 2:
DB<1> $a = {'a'=>1}; $b = {'a'=>2, 'b'=>0};
DB<2> $c = ($a, $b);
DB<3> print $c;
HASH(0x8743e68)
DB<4> print $c->{a},$c->{b};
20
I understand if I carefully use %$a and %$b that perl would know what I meant, but with just bare refs in the list, why does it work?
Or maybe it just looks like it works and I really did something else?
There is no list context in
$c = ($a, $b);
Instead, what you are seeing is the comma operator in action:
Binary "," is the comma operator. In scalar context it evaluates its left argument, throws that value away, then evaluates its right argument and returns that value. This is just like C's comma operator.
To see this more clearly, take a look at:
#!/usr/bin/perl
use strict; use warnings;
my $x = {a => 1};
my $y = {a => 2, b => 0};
my $z = ($x, $y);
print "\$x = $x\t\$y = $y\t\$z = $z\n";
my @z = ($x, $y);
print "@z\n";
First, I used warnings. Therefore, when I run this script, I get the warning:
Useless use of private variable in void context at C:\Temp\t.pl line 7.
Always enable warnings.
Now, the output shows what's happening:
$x = HASH(0x39cbc) $y = HASH(0x39dac) $z = HASH(0x39dac)
HASH(0x39cbc) HASH(0x39dac)
Clearly, $z refers to the same anonymous hash as does $y. No copying of values was done.
And, $z[0] refers to the same anonymous hash as does $x, and $z[1] refers to the same anonymous hash as do $y and $z.
Note that parentheses alone do not create list context. In the case of
my @z = ($x, $y);
they are necessary because = binds more tightly than the comma operator.
my @z = $x, $y;
would assign $x to $z[0] and discard $y (and emit a warning) whereas
my @z = 1 .. 5;
would work as expected.
Finally, if you wanted to assign to $z a new anonymous hash which contains copies of the anonymous hashes to which both $x and $y point, you would do
#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper;
my $x = {a => 1};
my $y = {a => 2, b => 0};
my $z = { %$x, %$y };
print Dumper $z;
which would output:
$VAR1 = {
'a' => 2,
'b' => 0
};
because hash keys are, by definition, unique. If you want to preserve all values associated with the keys of both hashes, you need to do something slightly more complicated (and use anonymous arrayrefs as values in the "union" hash):
#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper;
my $x = {a => 1};
my $y = {a => 2, b => 0};
my $z;
push @{ $z->{$_} }, $x->{$_} for keys %$x;
push @{ $z->{$_} }, $y->{$_} for keys %$y;
print Dumper $z;
Output:
$VAR1 = {
'a' => [
1,
2
],
'b' => [
0
]
};