Perl: multidimensional hash - perl

suppose I have the following data
cluster1:d(A),f(C)s,(A)
cluster2:r(D),h(D),f(A)
I want this out put
output:
cluster1:A->2
cluster1:C->1
cluster2:D->2
cluster2:A->1
here is my try,but it is not correct , the part that I am trying to count characters has a problem that I cant fix
the code is a part of very big code ,and I want exactly multidimensional hash
use strict;
use Data::Dumper;
my %count;
while (<DATA>) {
my %HoH;
my ( $cluster, $ch ) = split (/:/,$_);
$HoH{$cluster}={split /[()]+/,$ch};
for my $clust ( keys %HoH ) {
for my $character ( keys %{ $HoH{$clust} } ) {
$count{$clust}{$HoH{$clust}{$character}}++;
}
}
}
print Dumper(\%count);
foreach my $name (sort keys %count) {
foreach my $subject (keys %{$count{$name}}) {
print "$name:$subject->$count{$name}{$subject}\n";
}
}
DATA
cluster1:d(A),f(C)s,(A)
cluster2:r(D),h(D),f(A)

It would be nice if you try to understand the below code so that you can get an idea for solving the problem:-
use strict;
use Data::Dumper;
my $data = "cluster1:A,B,C,A";
my %cluster = ();
my ($cluster_key, $cluster_val ) = split (':', $data);
my #cluster1_data = split(',', $cluster_val);
foreach my $val ( #cluster1_data ) {
$cluster{$cluster_key}{$val}++;
}
print Dumper(\%cluster);
foreach my $clus ( keys %cluster ) {
my $clus_ref = $cluster{$clus};
foreach my $clu ( keys %{ $clus_ref } ){
my $count = $clus_ref->{$clu};
print"$clus:$clu->$count\n";
}
}
Output:
$VAR1 = {
'cluster1' => {
'A' => 2,
'C' => 1,
'B' => 1
}
};
cluster1:A->2
cluster1:C->1
cluster1:B->1

What do you expect $count{$cluster}{$characters}+=1; to do exactly? You have to loop over your input data to populate %count if you expect to get the desired result:
while (<DATA>) {
next unless /^(cluster\d+):(.+)/;
$count{$1}{$_}++ for split/,/, $2;
}
If you also add sort to the second foreach you'll get the output you want.
EDIT: This solves the question for the updated input and requirements:
my %count;
while (<DATA>) {
next unless /^(cluster\d+):(.+)/;
my $cluster = $1;
$count{$cluster}{$_}++ for $2 =~ /\((\w)\)/g;
}
for my $key (sort keys %count) {
for my $value (sort {
$count{$key}{$b} <=> $count{$key}{$a}
} keys %{$count{$key}}) {
print "$key:$value->$count{$key}{$value}\n";
}
}

Related

How to print the previous Key Value in Perl?

In this code, I'm checking if a certain key is present or not.
Here I am checking if key "Uri" present. I am getting output as "3".
use strict;
use warnings;
my %Names = (
Martha =>2,
Vivek =>9,
Jason =>6,
Socrates=>7,
Uri =>3,
Nitin =>1,
Plato =>0,
);
if (exists $Names{Uri} ) {
print "$Names{Uri}\n";
}
foreach my $name (sort {$Names{$a} cmp $Names{$b}} keys %Names)
{
print $name, $Names{$name}."\n";
}
Output
3
Plato 0
Nitin 1
Martha 2
Uri 3
Jason 6
Socrates 7
Vivek 9
But, I want the previous key value present before that key. For example:
If I search for key "Uri" Output should be "2"
If I search for key "Vivek" Output should be "7"
If I search for key "Plato" Output should be "0"
Does anyone know how to do it?
Create a sorted array of the hash values, then search through the array to get the value just lower than the value of your search key.
use strict;
use warnings;
my %Names = (
Martha =>2,
Vivek =>9,
Jason =>6,
Socrates=>7,
Uri =>3,
Nitin =>1,
Plato =>0,
);
my #vals = sort {$a <=> $b} values %Names;
get_prev('Uri');
get_prev('Vivek');
get_prev('Plato');
sub get_prev {
my $k = shift;
if (exists $Names{$k}) {
for (#vals) {
if ($Names{$k} == $vals[$_]) {
my $idx = ($_ == 0) ? 0 : $_ - 1;
print $vals[$idx], "\n";
last;
}
}
}
}
Prints:
2
7
0
If you want to print them all:
my $prev;
for my $name (
sort { $Names{$a} <=> $Names{$b} } # Note to use of <=> for numerical comparisons.
keys(%Names)
) {
say "$name $Names{$prev}" if $prev;
$prev = $name;
}
Similarly, to print just one
my $find = 'Uri';
my $prev;
for my $name (
sort { $Names{$a} <=> $Names{$b} }
keys(%Names)
) {
if ($name eq $find) {
say "$name $Names{$prev}" if $prev;
last;
}
$prev = $name;
}
The above hat would be an expensive way to perform multiple lookups.
For that, we'd build a mapping from names to the previous names.
my %prev_name_lkup;
my $prev;
for my $name (
sort { $Names{$a} <=> $Names{$b} }
keys(%Names)
) {
$prev_name_lkup{$name} = $prev if $prev;
$prev = $name;
}
This could also be done as follows:
my #sorted_names =
sort { $Names{$a} <=> $Names{$b} }
keys(%Names);
my %prev_name_lkup =
map { $sorted_names[$_-1] => $sorted_names[$_] }
1..$#sorted_names;
Either way, the lookups would look like this:
say "Uri $Names{$prev_name_lkup{Uri}}";

Perl sorting hash by values in the hash

I think I have the right idea but there's some syntax/convention thing I'm messing up, because I get the error "Global symbol %timeHash requires explicit package name".
Code:
foreach $key (sort hashValueDescendingNum (keys(%timeHash))) {
print "\t$key\t\t $timeHash{$key}\n";
}
sub hashValueDescendingNum {
my $hash = shift;
$hash{$b} <=> $hash{$a};
}
Inline
foreach my $key (sort { $timeHash{$b} <=> $timeHash{$a} } keys %timeHash) {
print "\t$key\t\t $timeHash{$key}\n";
}
Using a custom sort function the way you are trying to will not work well, because then your sub would need to access the original hash.
foreach my $key (sort hashValueDescendingNum (keys(%timeHash))) {
print "\t$key\t\t $timeHash{$key}\n";
}
sub hashValueDescendingNum {
$timeHash{$b} <=> $timeHash{$a}; # Ew.
}
Instead you can abstract it further:
foreach my $key (sortedHashKeysByValueDescending(%timeHash)) {
print "\t$key\t\t $timeHash{$key}\n";
}
sub sortedHashKeysByValueDescending {
my %hash = #_;
my #keys = sort { $hash{$b} <=> $hash{$a} } keys %hash;
return #keys;
}
The code is not efficient because it passes around the %hash though, references would be better:
foreach my $key (sortedHashKeysByValueDescending(\%timeHash)) {
print "\t$key\t\t $timeHash{$key}\n";
}
sub sortedHashKeysByValueDescending {
my $hash = shift;
return sort { $hash->{$b} <=> $hash->{$a} } keys %$hash;
}
use List::UtilsBy qw( rev_nsort_by );
foreach my $key ( rev_nsort_by { $timeHash{$_} } keys %timeHash ) {
...
}

compare multiple hashes for common keys merge values

I have a working bit of code here where I am comparing the keys of six hashes together to find the ones that are common amongst all of them. I then combine the values from each hash into one value in a new hash. What I would like to do is make this scaleable. I would like to be able to easily go from comparing 3 hashes to 100 without having to go back into my code and altering it. Any thoughts on how I would achieve this? The rest of the code already works well for different input amounts, but this is the one part that has me stuck.
my $comparison = List::Compare->new([keys %{$posHashes[0]}], [keys %{$posHashes[1]}], [keys %{$posHashes[2]}], [keys %{$posHashes[3]}], [keys %{$posHashes[4]}], [keys %{$posHashes[5]}]);
my %comboHash;
for ($comparison->get_intersection) {
$comboHash{$_} = ($posHashes[0]{$_} . $posHashes[1]{$_} . $posHashes[2]{$_} . $posHashes[3]{$_} . $posHashes[4]{$_} . $posHashes[5]{$_});
}
my %all;
for my $posHash (#posHashes) {
for my $key (keys(%$posHash)) {
push #{ $all{$key} }, $posHash->{$key};
}
}
my %comboHash;
for my $key (keys(%all)) {
next if #{ $all{$key} } != #posHashes;
$comboHash{$key} = join('', #{ $all{$key} });
}
Just make a subroutine and pass it hash references
my $combination = combine(#posHashes);
sub combine {
my #hashes = #_;
my #keys;
for my $href (#hashes) {
push #keys, keys %$href;
}
# Insert intersection code here..
# .....
my %combo;
for my $href (#hashes) {
for my $key (#intersection) {
$combo{$key} .= $href->{$key};
}
}
return \%combo;
}
Create a subroutine:
sub combine_hashes {
my %result = ();
my #hashes = #_;
my $first = shift #hashes;
for my $element (keys %$first) {
my $count = 0;
for my $href (#hashes) {
$count += (grep {$_ eq $element} (keys %$href));
}
if ($count > $#hashes) {
$result{$element} = $first->{$element};
$result{$element} .= $_->{$element} for #hashes;
}
}
\%result;
}
and call it by:
my %h = %{combine_hashes(\%h1, \%h2, \%h3)};
...or as:
my %h = %{combine_hashes(#posHashes)};
There is pretty straightforward solution:
sub merge {
my $first = shift;
my #hashes = #_;
my %result;
KEY:
for my $key (keys %$first) {
my $accu = $first->{$key};
for my $hash (#hashes) {
next KEY unless exists $hash->{$key};
$accu .= $hash->{$key};
}
$result{$key} = $accu;
}
return \%result;
}
You have to call it with references to hashes and it will return also hash reference e.g.:
my $comboHashRef = merge(#posHashes);

Optimize perl hash mess

I have Perl code, which looks messy:
my $x = $h->[1];
foreach my $y (keys %$x) {
my $ax = $x->{$y};
foreach my $ay (keys %$ax) {
if (ref($ax->{$ay}) eq 'JE::Object::Proxy') {
my $bx = $ax->{$ay};
if ($$bx->{class_info}->{name} eq 'HTMLImageElement') {
print $$bx->{value}->{src}, "\n";
}
}
}
}
Is it possible to optimize the code above to not use any variables, just $h, as that one is an input?
Here's my crack at it:
print $$_->{value}{src}, "\n" for grep {
ref $_ eq 'JE::Object::Proxy' &&
$$_->{class_info}{name} eq 'HTMLImageElement'
} map {
values %$_
} values %{ $h->[1] };
You're using keys, when you really just want values.
foreach my $h ( grep { ref() eq 'HASH' } values %$x ) {
foreach my $obj (
grep { ref() eq 'JE::Object::Proxy'
and $_->{class_info}{name} eq 'HTMLImageElement'
} values %$h
) {
say $obj->{value}{src};
}
}
A lot of the "messiness" can be cleaned up by reducing your line count and minimizing how much nested code you have. Use the each command to get the next key and its associated value from the hash in one line. [EDIT: as Axeman pointed out, you really only need the values, so I'm replacing my use of each with values]. Also, use a pair of next statement to skip the print statement.
for my $ax (values %{$h->[1]} ) {
for my $bx (values %$ax ) {
next unless ref($bx) eq 'JE::Object::Proxy';
next unless $$bx->{class_info}->{name} eq 'HTMLImageElement';
print "$$bx->{value}->{src}\n";
}
}
Just removing the helper variables is easy, something like this should do it:
foreach my $y (keys %{$h->[1]}) {
foreach my $ax (%{$h->[1]->{$y}) {
foreach my $ay (keys %$ax) {
if(ref($h->[1]->{$y}->{$ay}) eq 'JE::Object::Proxy') {
if($h->[1]->{$y}->{$ay}->{class_info}->{name} eq 'HTMLImageElement') {
print $h->[1]->{$y}->{$ay}->{value}->{src}, "\n";
}
}
}
}
}
You could also remove the duplicated if:
foreach my $y (keys %{$h->[1]}) {
foreach my $ax (%{$h->[1]->{$y}) {
foreach my $ay (keys %$ax) {
if(ref($h->[1]->{$y}->{$ay}) eq 'JE::Object::Proxy' && $h->[1]->{$y}->{$ay}->{class_info}->{name} eq 'HTMLImageElement') {
print $h->[1]->{$y}->{$ay}->{value}->{src}, "\n";
}
}
}
}
But I don't really see how to make it more readable: it is a iteration over a three dimensional structure.

array to hash in perl

I have a source list from which I am picking up random items and populating the destination list. The item that are in the list have a particular format. For example:
item1{'name'}
item1{'date'}
etc and many more fields.
while inserting into the destination list I check for unique names on items and insert it into that list. For this I have to traverse the entire destination list to check if an item with a given name exists and if not insert it.
I thought it would be nice if I make the destination list as hash instead of a list again so that I can look up for the item faster and efficiently. I am new to Perl and am not getting how to do this. Anybody, Please help me on how to insert an item, find for a particular item name, and delete an item in hash?
How can I make both the name and date as key and the entire item as value?
my %hash;
Insert an item $V with a key $K?
$hash{$K} = $V
Find for a particular name / key $K?
if (exists $hash{$K}) {
print "it is in there with value '$hash{$K}'\n";
} else {
print "it is NOT in there\n"
}
Delete a particular name / key?
delete $hash{$K}
Make name and date as key and entire item as value?
Easy Way: Just string everything together
set: $hash{ "$name:$date" } = "$name:$date:$field1:$field2"
get: my ($name2,$date2,$field1,$field2) = split ':', $hash{ "$name:$date" }
del: delete $hash{ "$name:$date" }
Harder Way: Store as a hash in the hash (google "perl object")
set:
my %temp;
$temp{"name"} = $name;
$temp{"date"} = $date;
$temp{"field1"} = $field1;
$temp{"field2"} = $field2
$hash{"$name:$date"} = \$temp;
get:
my $find = exists $hash{"$name:$date"} ? $hash{"$name:$date"} : undef;
if (defined find) { # i.e. it was found
printf "field 1 is %s\n", $find->{"field1"}
} else {
print "Not found\n";
}
delete:
delete $hash{"$name:$date"}
It is not easy to understand what you are asking because you do not describe the input and the desired outputs specifically.
My best guess is something along the lines of:
#!/usr/bin/perl
use strict; use warnings;
my #list = (
q(item1{'name'}),
q(item1{'date'}),
);
my %lookup;
for my $entry ( #list ) {
my ($name, $attrib) = $entry =~ /([^{]+){'([^']+)'}/;
$lookup{ $name }{ $attrib } = $entry;
}
for my $entry ( keys %lookup ) {
my %entry = %{ $lookup{$entry} };
print "#entry{keys %entry}\n"
}
use YAML;
print Dump \%lookup;
Output:
item1{'date'} item1{'name'}
---
item1:
date: "item1{'date'}"
name: "item1{'name'}"
If you know what items, you are going to need and what order you'll need them in
for keys, then re parsing the key is of questionable value. I prefer to store
them in levels.
$hash{ $h->{name} }{ $h->{date} } = $h;
# ... OR ...
$hash{ $h->{date} }{ $h->{name} } = $h;
foreach my $name ( sort keys %hash ) {
my $name_hash = $hash{$name};
foreach my $date ( keys %$name_hash ) {
print "\$hash{$name}{$date} => " . Dumper( $name_hash->{$date} ) . "\n";
}
}
For arbitrary levels, you may want a traversal function
sub traverse_hash (&#) {
my ( $block, $hash_ref, $path ) = #_;
$path = [] unless $path;
my ( #res, #results );
my $want = wantarray;
my $want_something = defined $want;
foreach my $key ( %$hash_ref ) {
my $l_path = [ #$path, $key ];
my $value = $hash_ref->{$key};
if ( ref( $value ) eq 'HASH' ) {
#res = traverse_hash( $block, $value, $l_path );
push #results, #res if $want_something && #res;
}
elsif ( $want_something ) {
#res = $block->( $l_path, $value );
push #results, #res if #res;
}
else {
$block->( $path, $value );
}
}
return unless $want_something;
return $want ? #results : { #results };
}
So this does the same thing as above:
traverse_hash {
my ( $key_path, $value ) = #_;
print( '$hash{' . join( '}{', #$key_path ) . '} => ' . ref Dumper( $value ));
();
} \%hash
;
Perl Solution
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
sub main{
my %hash;
my #keys = qw(firstname lastname age); # hash's keys
# fname lname age
# --------|--------|-----
my #arr = ( [ 'foo1', 'bar1', '1' ],
[ 'foo2', 'bar2', '2' ],
[ 'foo3', 'bar3', '3' ]
);
# test if array set up correctly
print "\$arr[1][1] : $arr[1][1] \n"; # bar2
# loads the multidimensional array into the hash
for my $row (0..$#arr){
for my $col ( 0..$#{$arr[$row]} ){
my $itemnum = "item" . ($row+1); # using the item# format you used
$hash{$itemnum}->{$keys[$col]} = $arr[$row][$col];
}
}
# manually add a 4th item
$hash{item4} = {"firstname", "foo", "lastname", "bar", "age", "35"};
# How to Retrieve
# -----------------------
# single item pull
print "item1->firstname : $hash{item1}->{firstname} \n"; # foo1
print "item3->age : $hash{item3}->{age} \n"; # 3
# whole line 1
{ local $, = " ";
print "full line :" , %{$hash{item2}} , "\n"; # firstname foo2 lastname bar2 age 2
}
# whole line 2
foreach my $key (sort keys %{$hash{item2}}){
print "$key : $hash{item2}{$key} \n";
}
# Clearer description
#print "Hash:\n", Dumper %hash;
}
main();
This should be used in addition to the accepted answer. Your question was a little vague on the array to hash requirement, perhaps this is the model you are looking for?