Perl, convert numerically-keyed hash to array - perl

If I have a hash in Perl that contains complete and sequential integer mappings (i.e., all keys from 0 to n are mapped to something, and there are no keys outside this range), is there a means of converting this to an array?
I know I could iterate over the key/value pairs and place them into a new array, but something tells me there should be a built-in means of doing this.

You can extract all the values from a hash with the values function:
my @vals = values %hash;
If you want them in a particular order, then you can put the keys in the desired order and then take a hash slice from that:
my @sorted_vals = @hash{sort { $a <=> $b } keys %hash};

If your original data source is a hash:
# first find the max key value, if you don't already know it:
use List::Util 'max';
my $maxkey = max keys %hash;
# get all the values, in order
my @array = @hash{0 .. $maxkey};
Or if your original data source is a hashref:
my $maxkey = max keys %$hashref;
my @array = @{$hashref}{0 .. $maxkey};
This is easy to test using this example:
my %hash;
@hash{0 .. 9} = ('a' .. 'j');
# insert code from above, and then print the result...
use Data::Dumper;
print Dumper(\%hash);
print Dumper(\@array);
$VAR1 = {
'6' => 'g',
'3' => 'd',
'7' => 'h',
'9' => 'j',
'2' => 'c',
'8' => 'i',
'1' => 'b',
'4' => 'e',
'0' => 'a',
'5' => 'f'
};
$VAR1 = [
'a',
'b',
'c',
'd',
'e',
'f',
'g',
'h',
'i',
'j'
];
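The slice above trusts that the keys really are 0 .. n. A minimal sketch of a checked version follows; the helper name hash_to_array is hypothetical, not a built-in, and it simply dies if any expected key is missing:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the values of %$h ordered by key,
# or dies if the keys are not exactly 0 .. n-1.
sub hash_to_array {
    my ($h) = @_;
    my $n = keys %$h;                       # number of keys in the hash
    for my $i (0 .. $n - 1) {
        die "missing key $i\n" unless exists $h->{$i};
    }
    return @{$h}{0 .. $n - 1};              # hash slice through a reference
}

my %hash;
@hash{0 .. 9} = ('a' .. 'j');
my @array = hash_to_array(\%hash);
print "@array\n";    # a b c d e f g h i j
```

If the hash has n keys and every key from 0 to n-1 exists, there is no room for any other key, so the existence check is enough to validate the whole range.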

OK, this is not very "built in", but it works. It's also, IMHO, preferable to any solution involving "sort", as it's faster.
map { $array[$_] = $hash{$_} } keys %hash; # Or use foreach instead of map
Otherwise, less efficient:
my @array = map { $hash{$_} } sort { $a<=>$b } keys %hash;

Perl does not provide a built-in to solve your problem.
If you know that the keys cover a particular range 0..N, you can leverage that fact:
my $n = keys(%hash) - 1;
my @keys_and_values = map { $_ => $hash{$_} } 0 .. $n;
my @just_values = @hash{0 .. $n};

This will leave keys not defined in %hashed_keys as undef:
# if we're being nitpicky about when and how much memory
# is allocated for the array (for run-time optimization):
my @keys_arr = (undef) x scalar(keys %hashed_keys);
@keys_arr[(keys %hashed_keys)] =
@hashed_keys{(keys %hashed_keys)};
And, if you're using references:
@{$keys_arr}[(keys %{$hashed_keys})] =
@{$hashed_keys}{(keys %{$hashed_keys})};
Or, more dangerously, as it assumes what you said is true (it may not always be true … Just sayin'!):
@keys_arr = @hashed_keys{(sort {$a <=> $b} keys %hashed_keys)};
But this is sort of beside the point. If they were integer-indexed to begin with, why are they in a hash now?

As DVK said, there is no built in way, but this will do the trick:
my @array = map {$hash{$_}} sort {$a <=> $b} keys %hash;
or this:
my @array;
keys %hash; # reset the each iterator
while (my ($k, $v) = each %hash) {
$array[$k] = $v;
}
Benchmark to see which is faster; my guess would be the second.
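One way to run that benchmark is with the core Benchmark module. The following is only a sketch: the sub names and the 1000-element test hash are assumptions made for illustration, and the timings will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Test data (assumed size): keys 0 .. 999 mapped to 1 .. 1000.
my %hash;
@hash{0 .. 999} = (1 .. 1000);

# Sort the keys numerically, then look each value up.
sub by_sort {
    return [ map { $hash{$_} } sort { $a <=> $b } keys %hash ];
}

# Walk the hash with each and store every value at its index.
sub by_each {
    my @array;
    keys %hash;    # reset the each iterator
    while (my ($k, $v) = each %hash) {
        $array[$k] = $v;
    }
    return \@array;
}

# Take a hash slice over the known key range.
sub by_slice {
    return [ @hash{0 .. keys(%hash) - 1} ];
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, { sort => \&by_sort, each => \&by_each, slice => \&by_slice });
```

All three subs return the same array, so the table compares like with like.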

@a = @h{sort { $a <=> $b } keys %h};

Combining FM's and Ether's answers allows one to avoid defining an otherwise unnecessary scalar:
my @array = @hash{ 0 .. $#{[ keys %hash ]} };
The neat thing is that unlike with the scalar approach, $# works above even in the unlikely event that the default index of the first element, $[, is non-zero.
Of course, that would mean writing something silly and obfuscated like so:
my @array = @hash{ $[ .. $#{[ keys %hash ]} }; # Not recommended
But then there is always the remote chance that someone needs it somewhere (wince)...

$Hash_value =
{
'54' => 'abc',
'55' => 'def',
'56' => 'test',
};
while (my ($key,$value) = each %{$Hash_value})
{
print "\n $key > $value";
}

We can write a while loop as below:
$j = 0;
while (($a1, $b1) = each(%hash1)) {
$arr[$j][0] = $a1;
@{ $arr[$j] }[1 .. 6] = @$b1;
$j++;
}
$a1 contains the key and $b1 contains a reference to the array of values.
In the above example I have a hash of arrays, and each array contains 6 elements.

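An easy way is to do @array = %hash, but note that this flattens the hash into alternating keys and values in no particular order, rather than giving the values ordered by key.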
For example,
my %hash = (
"0" => "zero",
"1" => "one",
"2" => "two",
"3" => "three",
"4" => "four",
"5" => "five",
"6" => "six",
"7" => "seven",
"8" => "eight",
"9" => "nine",
"10" => "ten",
);
my @array = %hash;
print "@array"; would produce the following output,
3 three 9 nine 5 five 8 eight 2 two 4 four 1 one 10 ten 7 seven 0 zero
6 six

Related

Sort Hash Key and Value simultaneously Perl

I have a hash, and I want to sort its keys numerically in ascending order and
sort the values of each key alphabetically in ascending order.
#!/usr/bin/perl
use warnings;
use strict;
use List::MoreUtils;
use Tie::IxHash;
my %KEY_VALUE;
#tie %KEY_VALUE,'Tie::IxHash';
my %KEY_VALUE= (
0 => [ 'A', 'C', 'B', 'A' ,'D'],
5 => [ 'D', 'F', 'E', ],
2 => [ 'Z', 'X', 'Y' ],
4 => [ 'E', 'R', 'M' ],
3 => [ 'A', 'B', 'B', 'A' ],
1 => [ 'C', 'C', 'F', 'E' ],
);
#while (my ($k, $av) = each %KEY_VALUE)
#{
# print "$k @$av\n ";
#}
#Sort the key numerically
foreach my $key (sort keys %KEY_VALUE)
{
print "$key\n";
}
#To sort the value alphabetically
foreach my $key (sort {$KEY_VALUE{$a} cmp $KEY_VALUE{$b}} keys %KEY_VALUE){
print "$key: $KEY_VALUE{$key}\n";
}
The wanted result is like this, and I want to print out the sorted keys and values.
%KEY_VALUE= (
0 => [ 'A','A','B','C','D'],
1 => [ 'C','C','E','F' ],
2 => [ 'X','Y','Z' ],
3 => [ 'A', 'A', 'B', 'B' ],
4 => [ 'E','M','R' ],
5 => [ 'D','E','F', ],
);
Additional problem: how do I print each key along with the index (position in the sorted list) of the first occurrence of each distinct value?
Wanted Output:
KEY= 0 VALUE:0 2 3 4 #The scalar value of first A B C D, start with 0
KEY= 1 VALUE:0 2 3 #The scalar value of first C E F
KEY= 2 VALUE:0 1 2 #The scalar value of first X Y Z
KEY= 3 VALUE:0 2 #The scalar value of first A B
KEY= 4 VALUE:0 1 2 #The scalar value of first E M R
KEY= 5 VALUE:0 1 2 #The scalar value of first D E F
Hash keys have no defined order. Generally you sort the keys as you're iterating through the hash.
The values can be sorted as you iterate through the hash.
# Iterate through the keys in numeric order.
for my $key (sort {$a <=> $b } keys %hash) {
# Get the value
my $val = $hash{$key};
# Sort it in place
@$val = sort { $a cmp $b } @$val;
# Display it
say "$key -> @$val";
}
Note that by default sort sorts in ASCII order as strings. That means sort keys %KEY_VALUE is not sorting as numbers but as strings. sort(2,3,10) is (10,2,3). "10" is less than "2" like "ah" is less than "b". Be sure to use sort { $a <=> $b } for numeric sorting and sort { $a cmp $b } for strings.
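A short sketch of the difference between the two comparisons, using a made-up list of keys:

```perl
use strict;
use warnings;

my @keys = (2, 3, 10, 1);

my @as_strings = sort @keys;                  # default: string comparison
my @as_numbers = sort { $a <=> $b } @keys;    # numeric comparison

print "@as_strings\n";    # 1 10 2 3
print "@as_numbers\n";    # 1 2 3 10
```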
You could use a different data structure such as Tie::IxHash, though tying has a significant performance penalty. Generally it's better to sort in place unless your hash gets very large.
You can't sort a hash; at best you can print it sorted (or keep the sorted keys in another array). Finding the position of the first occurrence of a value can be done with first_index; we remove duplicates with uniq. Both come from List::MoreUtils.
use List::MoreUtils qw(first_index uniq);
foreach my $key (sort { $a <=> $b } keys %KEY_VALUE) {
my @value = sort @{$KEY_VALUE{$key}};
my @indices = map { my $e = $_; first_index { $_ eq $e } @value } uniq @value;
print "$key: " . (join ', ', @indices) . "\n";
}
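The same result can be reached with core Perl only. This is a sketch that swaps first_index and uniq for a %seen hash; it uses the data from the question and prints lines in the wanted "KEY= ... VALUE:..." form:

```perl
use strict;
use warnings;

my %KEY_VALUE = (
    0 => [ 'A', 'C', 'B', 'A', 'D' ],
    5 => [ 'D', 'F', 'E' ],
    2 => [ 'Z', 'X', 'Y' ],
    4 => [ 'E', 'R', 'M' ],
    3 => [ 'A', 'B', 'B', 'A' ],
    1 => [ 'C', 'C', 'F', 'E' ],
);

my @lines;
for my $key (sort { $a <=> $b } keys %KEY_VALUE) {
    # Sort this key's values alphabetically.
    my @sorted = sort { $a cmp $b } @{ $KEY_VALUE{$key} };

    # Record the index at which each distinct value first appears.
    my (%seen, @indices);
    for my $i (0 .. $#sorted) {
        push @indices, $i unless $seen{ $sorted[$i] }++;
    }
    push @lines, "KEY= $key VALUE:@indices";
}
print "$_\n" for @lines;
```

For key 0 the sorted values are A A B C D, so the first occurrences sit at indices 0, 2, 3 and 4.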

Adding hash keys

I am adding data to a hash using an incrementing numeric key starting at 0. The first key/value pair is fine. When I add the second one, the first key/value pair points back to the second. Each addition after that replaces the value of the second key and then points back to it. The Dumper output would be something like this
$VAR1 = { '0' => { ... } };
after the first key/value is added. After the second one is added I get
$VAR1 = { '1' => { ... }, '0' => $VAR1->{'1'} };
After the third key/value is added, it looks like this.
$VAR1 = { '1' => { ... }, '0' => $VAR1->{'1'}, '2' => $VAR1->{'1'} };
My question is why is it doing this? I want each key/value to show up in the hash. When I iterate through the hash I get the same data for every key/value. How do I get rid of the reference pointers to the second added key?
You are setting the value of every element to a reference to the same hash. Data::Dumper is merely reflecting that.
If you're using Data::Dumper as a serializing tool (yuck!), then you should set $Data::Dumper::Purity to 1 to get something eval can process.
use Data::Dumper qw( Dumper );
my %h2 = (a=>5,b=>6,c=>7);
my %h;
$h{0} = \%h2;
$h{1} = \%h2;
$h{2} = \%h2;
print("$h{0}{c} $h{2}{c}\n");
$h{0}{c} = 9;
print("$h{0}{c} $h{2}{c}\n");
{
local $Data::Dumper::Purity = 1;
print(Dumper(\%h));
}
Output:
7 7
9 9
$VAR1 = {
'0' => {
'c' => 9,
'a' => 5,
'b' => 6
},
'1' => {},
'2' => {}
};
$VAR1->{'0'} = $VAR1->{'1'};
$VAR1->{'2'} = $VAR1->{'1'};
If, on the other hand, you didn't mean to store references to the same hash, you could use
# Shallow copies
$h{0} = { %h2 }; # { ... } means do { my %anon = ( ... ); \%anon }
$h{1} = { %h2 };
$h{2} = { %h2 };
or
# Deep copies
use Storable qw( dclone );
$h{0} = dclone(\%h2);
$h{1} = dclone(\%h2);
$h{2} = dclone(\%h2);
Output:
7 7
9 7
$VAR1 = {
'0' => {
'a' => 5,
'b' => 6,
'c' => 9
},
'1' => {
'a' => 5,
'b' => 6,
'c' => 7
},
'2' => {
'a' => 5,
'b' => 6,
'c' => 7
}
};
You haven't posted the actual code you're using to build the hash, but I assume it looks something like this:
foreach my $i (1 .. 3) {
%hash2 = (number => $i, foo => "bar", baz => "whatever");
$hash1{$i} = \%hash2;
}
(Actually, I'll guess that, in your actual code, you're probably reading data from a file in a while (<>) loop and assigning values to %hash2 based on it, but the foreach loop will do for demonstration purposes.)
If you run the code above and dump the resulting %hash1 using Data::Dumper, you'll get the output:
$VAR1 = {
'1' => {
'baz' => 'whatever',
'number' => 3,
'foo' => 'bar'
},
'3' => $VAR1->{'1'},
'2' => $VAR1->{'1'}
};
Why does it look like that? Well, it's because the values in %hash1 are all references pointing to the same hash, namely %hash2. When you assign new values to %hash2 in your loop, those values will overwrite the old values in %hash2, but it will still be the same hash. Data::Dumper is just highlighting that fact.
So, how can you fix it? Well, there are (at least) two ways. One way is to replace \%hash2, which gives a reference to %hash2, with { %hash2 }, which copies the contents of %hash2 into a new anonymous hash and returns a reference to that:
foreach my $i (1 .. 3) {
%hash2 = (number => $i, foo => "bar", baz => "whatever");
$hash1{$i} = { %hash2 };
}
The other (IMO preferable) way is to declare %hash2 as a (lexically scoped) local variable within the loop using my:
foreach my $i (1 .. 3) {
my %hash2 = (number => $i, foo => "bar", baz => "whatever");
$hash1{$i} = \%hash2;
}
This way, each iteration of the loop will create a new, different hash named %hash2, while the hashes created on previous iterations will continue to exist (since they're referenced from %hash1) independently.
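To check whether the values of a hash point at one shared hash or at separate ones, you can compare their addresses. The sketch below uses refaddr from the core Scalar::Util module; the helper name distinct_targets and the variable names are made up for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper: counts how many distinct hashes the values of %$h refer to.
sub distinct_targets {
    my ($h) = @_;
    my %seen;
    $seen{ refaddr($_) } = 1 for values %$h;
    return scalar keys %seen;
}

my (%shared, %separate, %outer);
foreach my $i (1 .. 3) {
    %outer = (number => $i);
    $shared{$i} = \%outer;        # the same hash on every iteration

    my %inner = (number => $i);
    $separate{$i} = \%inner;      # a fresh hash on every iteration
}

my $n_shared   = distinct_targets(\%shared);
my $n_separate = distinct_targets(\%separate);
print "shared: $n_shared, separate: $n_separate\n";    # shared: 1, separate: 3
```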
By the way, you wouldn't have had this problem in the first place if you'd followed standard Perl best practices, specifically:
Always use strict; (and use warnings;). This would've forced you to declare %hash2 with my (although it wouldn't have forced you to do so inside the loop).
Always declare local variables in the smallest possible scope. In this case, since %hash2 is only used within the loop, you should've declared it inside the loop, like above.
Following these best practices, the example code above would look like this:
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my %hash1;
foreach my $i (1 .. 3) {
my %hash2 = (number => $i, foo => "bar", baz => "whatever");
$hash1{$i} = \%hash2;
}
print Dumper(\%hash1);
which, as expected, will print:
$VAR1 = {
'1' => {
'baz' => 'whatever',
'number' => 1,
'foo' => 'bar'
},
'3' => {
'baz' => 'whatever',
'number' => 3,
'foo' => 'bar'
},
'2' => {
'baz' => 'whatever',
'number' => 2,
'foo' => 'bar'
}
};
It's hard to see what the problem is when you don't post the code or the actual results of Data::Dumper.
There is one thing you should know about Data::Dumper: When you dump an array or (especially) a hash, you should dump a reference to it. Otherwise, Data::Dumper will treat it like a series of variables. Also notice that hashes do not remain in the order you create them. I've enclosed an example below. Make sure that your issue isn't related to a confusing Data::Dumper output.
Another question: If you're keying your hash by sequential keys, would you be better off with an array?
If you can, please edit your question to post your code and the ACTUAL results.
use strict;
use warnings;
use autodie;
use feature qw(say);
use Data::Dumper;
my @array = qw(one two three four five);
my %hash = (one => 1, two => 2, three => 3, four => 4);
say "Dumped Array: " . Dumper @array;
say "Dumped Hash: " . Dumper %hash;
say "Dumped Array Reference: " . Dumper \@array;
say "Dumped Hash Reference: " . Dumper \%hash;
The output:
Dumped Array: $VAR1 = 'one';
$VAR2 = 'two';
$VAR3 = 'three';
$VAR4 = 'four';
$VAR5 = 'five';
Dumped Hash: $VAR1 = 'three';
$VAR2 = 3;
$VAR3 = 'one';
$VAR4 = 1;
$VAR5 = 'two';
$VAR6 = 2;
$VAR7 = 'four';
$VAR8 = 4;
Dumped Array Reference: $VAR1 = [
'one',
'two',
'three',
'four',
'five'
];
Dumped Hash Reference: $VAR1 = {
'three' => 3,
'one' => 1,
'two' => 2,
'four' => 4
};
The reason it is doing this is you are giving it the same reference to the same hash.
Presumably in a loop construct.
Here is a simple program which has this behaviour.
use strict;
use warnings;
# always use the above two lines until you
# understand completely why they are recommended
use Data::Printer;
my %hash;
my %inner; # <-- wrong place to put it
for my $index (0..5){
$inner{int rand} = $index; # <- doesn't matter
$hash{$index} = \%inner;
}
p %hash;
To fix it just make sure that you are creating a fresh hash reference every time through the loop.
use strict;
use warnings;
use Data::Printer;
my %hash;
for my $index (0..5){
my %inner; # <-- place the declaration here instead
$inner{int rand} = $index; # <- doesn't matter
$hash{$index} = \%inner;
}
p %hash;
If you are only going to use numbers for your indexes, and they are monotonically increasing starting from 0, then I would recommend using an array.
An array would be faster and more memory efficient.
use strict;
use warnings;
use Data::Printer;
my @array; # <--
for my $index (0..5){
my %inner;
$inner{int rand} = $index;
$array[$index] = \%inner; # <--
}
p @array;

Perl - Hash of Array: Key look-up by Value

I have a hash of arrays (HoA). I have been processing the values of this HoA using $arrayrefs. However, now I need to retrieve the $key based on the $arrayrefs.
my %a = ( 1 => "ONE" ,
2 => "TWO" ,
3 => " Three", );
my %aa = ( 4 => [ 'ONE' , 'TWO', 'THREE'],
5 => ['one' , 'two', 'three'],
6 => ['more', 'dos', 'some'],
);
my @array = ('ONE' , 'TWO', 'THREE');
my $array_ref = \@array;
# returns the $key where the $value is 'ONE'
my ($any_match) = grep { $a{$_} eq 'ONE' } keys %a;
print $any_match."\n"; # this returns '1', as expected.. Good!
my ($match) = grep { $aa{$_} eq @$array_ref } keys %aa;
print $match."\n"; # <--- error: says that match is uninitialized
In the last print statement, I would like it to return 4. Does anyone know how to do this?
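You can't compare arrays with eq. A simple solution is to turn both arrays into strings and compare the strings using eq. Joining with a separator that cannot occur in the data (here "\0") avoids false matches such as ('ab', 'c') versus ('a', 'bc'):
my ($match) = grep { join("\0", @{$aa{$_}}) eq join("\0", @$array_ref) } keys %aa;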
For comparing arrays you could also utilize one of many modules from CPAN, e.g. Array::Compare, List::Compare, etc.
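If you would rather avoid a module, an element-by-element comparison is short enough to write by hand. A minimal sketch, assuming all elements are defined strings; the helper name arrays_equal is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: true if both array refs hold the same strings in the same order.
sub arrays_equal {
    my ($x, $y) = @_;
    return 0 unless @$x == @$y;                 # different lengths can never match
    for my $i (0 .. $#$x) {
        return 0 unless $x->[$i] eq $y->[$i];
    }
    return 1;
}

my %aa = (
    4 => [ 'ONE',  'TWO', 'THREE' ],
    5 => [ 'one',  'two', 'three' ],
    6 => [ 'more', 'dos', 'some' ],
);
my $array_ref = [ 'ONE', 'TWO', 'THREE' ];

my ($match) = grep { arrays_equal($aa{$_}, $array_ref) } keys %aa;
print "$match\n";    # 4
```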
Always use strict; and use warnings;. Add use v5.10; because Perl's smart matching (v5.10+) will be used to compare the arrays:
my ($match) = grep { @{$aa{$_}} ~~ @$array_ref } keys %aa;
The smart match operator ~~ is used here to compare the arrays. Note that smart matching has been marked experimental since Perl 5.18, so newer perls emit a warning for this code.

Perl extract range of elements from a hash

If I have a hash:
%hash = ("Dog",1,"Cat",2,"Mouse",3,"Fly",4);
How can I extract the first X elements of this hash. For example if I want the first 3 elements, %newhash would contain ("Dog",1,"Cat",2,"Mouse",3).
I'm working with large hashes (~ 8000 keys).
"first X elements of this hash" doesn't mean anything. First three elements in order by numeric value?
my %hash = ( 'Dog' => 1, 'Cat' => 2, 'Mouse' => 3, 'Fly' => 4 );
my @hashkeys = sort { $hash{$a} <=> $hash{$b} } keys %hash;
splice(@hashkeys, 3);
my %newhash;
@newhash{@hashkeys} = @hash{@hashkeys};
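The same steps can be wrapped in a small sub. This is only a sketch; the name first_n_by_value is made up, and it assumes "first" means "smallest numeric value":

```perl
use strict;
use warnings;

# Hypothetical helper: returns a new hash holding the $n entries
# of %$h with the smallest numeric values.
sub first_n_by_value {
    my ($h, $n) = @_;
    my @keys = sort { $h->{$a} <=> $h->{$b} } keys %$h;
    splice(@keys, $n) if @keys > $n;    # drop everything past the first $n keys
    my %subset;
    @subset{@keys} = @{$h}{@keys};      # copy the selected pairs with hash slices
    return %subset;
}

my %hash    = (Dog => 1, Cat => 2, Mouse => 3, Fly => 4);
my %newhash = first_n_by_value(\%hash, 3);
print join(', ', map { "$_=$newhash{$_}" } sort keys %newhash), "\n";    # Cat=2, Dog=1, Mouse=3
```

The guard before splice avoids a warning when the hash has fewer than $n entries.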
You might want to use something like this:
my %hash = ("Dog",1,"Cat",2,"Mouse",3,"Fly",4);
for ( (sort keys %hash)[0..2] ) {
say $hash{$_};
}
You should build an array first:
my %hash = ("Dog" => 1,"Cat"=>2,"Mouse"=>3,"Fly"=>4);
my @array;
foreach my $value (sort {$hash{$a} <=> $hash{$b} }
keys %hash)
{
push(@array,{$value=>$hash{$value}});
}
#get range:
my @part=@array[0..2];
#print part of result:
print $part[0]{'Dog'}."\n";

sum hash of hash values using perl

I have a Perl script that parses an Excel file and does the following : It counts for each value in column A, the number of elements it has in column B, the script looks like this :
use strict;
use warnings;
use Spreadsheet::XLSX;
use Data::Dumper;
use List::Util qw( sum );
my $col1 = 0;
my %hash;
my $excel = Spreadsheet::XLSX->new('inout_chartdata_ronald.xlsx');
my $sheet = ${ $excel->{Worksheet} }[0];
$sheet->{MaxRow} ||= $sheet->{MinRow};
my $count = 0;
# Iterate through each row
foreach my $row ( $sheet->{MinRow}+1 .. $sheet->{MaxRow} ) {
# The cell in column 1
my $cell = $sheet->{Cells}[$row][$col1];
if ($cell) {
# The adjacent cell in column 2
my $adjacentCell = $sheet->{Cells}[$row][ $col1 + 1 ];
# Use a hash of hashes
$hash{ $cell->{Val} }{ $adjacentCell->{Val} }++;
}
}
print "\n", Dumper \%hash;
The output looks like this :
$VAR1 = {
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
This works great, my question is : How can I access the elements of this output $VAR1 in order to do : for value 13, klm + hij = 3 and get a final output like this :
$VAR1 = {
'13' => {
'somename' => 3,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
};
So basically what I want to do is loop through my final hash of hashes and access its specific elements based on a unique key and finally do their sum.
Any help would be appreciated.
Thanks
I used @do_sum to indicate what changes you want to make. The new key is hardcoded in the script. Note that the new key is not created if none of the keys exists in the subhash (the $found flag).
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my %hash = (
'13' => {
'klm' => 1,
'hij' => 2,
'lkm' => 4,
},
'12' => {
'abc' => 2,
'efg' => 2
}
);
my @do_sum = qw(klm hij);
for my $num (keys %hash) {
my $found;
my $sum = 0;
for my $key (@do_sum) {
next unless exists $hash{$num}{$key};
$sum += $hash{$num}{$key};
delete $hash{$num}{$key};
$found = 1;
}
$hash{$num}{somename} = $sum if $found;
}
print Dumper \%hash;
It sounds like you need to learn about Perl References, and maybe Perl Objects which are just a nice way to deal with references.
As you know, Perl has three basic data-structures:
Scalars ($foo)
Arrays (@foo)
Hashes (%foo)
The problem is that these data structures can only contain scalar data. That is, each element in an array can hold a single value or each key in a hash can hold a single value.
In your case %hash is a Hash where each entry in the hash references another hash. For example:
Your %hash has an entry in it with a key of 13. This doesn't contain a scalar value, but a reference to another hash with three keys in it: klm, hij, and lkm. You can reference this via this syntax:
${ $hash{13} }{klm} = 1
${ $hash{13} }{hij} = 2
${ $hash{13} }{lkm} = 4
The curly braces may or may not be necessary. However, %{ $hash{13} } dereferences the hash reference stored in $hash{13}, so I can now reference the keys of that hash. You can imagine this getting more complex as you talk about hashes of hashes of arrays of hashes of arrays. Fortunately, Perl includes an easier syntax:
$hash{13}->{klm} = 1
$hash{13}->{hij} = 2
$hash{13}->{lkm} = 4
Read up about hashes and how to manipulate them. After you get comfortable with this, you can start working on learning about Object Oriented Perl which handles references in a safer manner.
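As a small illustration of the dereferencing syntax, the sketch below reads the same element three ways and then totals each inner hash. The data comes from the question; the variable names are chosen for this example only.

```perl
use strict;
use warnings;
use List::Util qw(sum);

my %hash = (
    13 => { klm => 1, hij => 2, lkm => 4 },
    12 => { abc => 2, efg => 2 },
);

# Three equivalent ways to read the same element.
my $klm_a = ${ $hash{13} }{klm};    # explicit dereference
my $klm_b = $hash{13}->{klm};       # arrow syntax
my $klm_c = $hash{13}{klm};         # the arrow is optional between subscripts

# Walk the outer hash and total the values of each inner hash.
my %total;
for my $outer (sort { $a <=> $b } keys %hash) {
    $total{$outer} = sum(values %{ $hash{$outer} });
    print "$outer: $total{$outer}\n";
}
```

This prints 12: 4 followed by 13: 7.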