executing a function within an array within a hash in perl - perl

I have a Perl data structurte like so
%myhash = (
k1 => v1,
kArray => [
{
name => "anonymous hash",
...
},
\&funcThatReturnsHash,
{
name => "another anonymous hash",
...
}
]
);
Elsewhere I iterate through the list in kArray which contains a bunch of hashes. I would like to either process the actual hash OR the hash returned by the function.
foreach my $elem( #{myhash{kArray}} ) {
if (ref($elem) == "CODE") {
%thisHash = &$elem;
}
else {
%thisHash = %$elem;
}
...
}
However ref ($elem) is always scalar or undefined. I tried func, &func, \&func, \%{&func}, in %myhash to no effect.
how do I extract the hash within the function in the main body?

Apart from the code sample you give being invalid Perl, the main problems seem to be that you are using == to compare strings instead of eq, and you are assigning a hash reference to a hash variable %thishash. I assure you that ref $elem never returns SCALAR with the data you show
It would help you enormously if you followed the common advice to use strict and use warnings at the top of your code
This will work for you
for my $elem ( #{ $myhash{kArray} } ) {
my $this_hash;
if ( ref $elem eq 'CODE' ) {
$this_hash = $elem->();
}
else {
$this_hash = $elem;
}
# Do stuff with $this_hash
}
or you could just use a map like this
use strict;
use warnings;
use 5.010;
use Data::Dump;
my %myhash = (
k1 => v1,
kArray => [
{
name => "anonymous hash",
},
\&funcThatReturnsHash,
{
name => "another anonymous hash",
}
]
);
for my $hash ( map { ref eq 'CODE' ? $_->() : $_ } #{ $myhash{kArray} } ) {
say $hash->{name};
}
sub funcThatReturnsHash {
{ name => 'a third anonymous hash' };
}
output
anonymous hash
a third anonymous hash
another anonymous hash

If you turn on strict and warnings, you'll see that:
foreach my $elem(#{mynahs{kArray}}) {
Isn't valid. You need at the very least a $ before mynahs.
But given something like this - your approach works - here's an example using map to 'run' the code references:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
sub gimme_hash {
return { 'fish' => 'paste' };
}
my $stuff =
[ { 'anon1' => 'value' },
\&gimme_hash,
{ 'anon2' => 'anothervalue' }, ];
my $newstuff = [ map { ref $_ eq "CODE" ? $_->() : $_ } #$stuff ];
print Dumper $newstuff;
Turns that hash into:
$VAR1 = [
{
'anon1' => 'value'
},
{
'fish' => 'paste'
},
{
'anon2' => 'anothervalue'
}
];
But your approach does work:
foreach my $element ( #$stuff ) {
my %myhash;
if ( ref $element eq "CODE" ) {
%myhash = %{$element -> ()};
}
else {
%myhash = %$element;
}
print Dumper \%myhash;
}
Gives:
$VAR1 = {
'anon1' => 'value'
};
$VAR1 = {
'fish' => 'paste'
};
$VAR1 = {
'anon2' => 'anothervalue'
};

Related

Creating hash of hash dynamically in perl

I am trying to create a hash of hash of - the nesting depth depends on the number of arguments passed into #aGroupByFields array.
In the below implementation, I am getting the desired hash structure.But I have hard coded the fields [ example - $phBugRecord->{createdBy} ] instead of deriving it from the array.
I am not sure how to dynamically create this.
my (#aGroupByFields) = ['createdBy','status','devPriority'];
# In real case,these are passed in as arguments
my (%hTemp);
# This is the final hash which will be structured according to above fields
# %hBugDetails is the hash containing details of all bugs
foreach my $phBugRecord ( #{ $hBugDetails{records} } ) {
# The below statement needs to be generated dynamically as
# opposed to the hard-coded values.
push(
#{
$hTemp{ $phBugRecord->{createdBy} }{ $phBugRecord->{status} }
{ $phBugRecord->{devPriority} }
},
$phBugRecord
);
}
Any pointer will be a great help.Thanks.
Here is a working implementation with Data::Diver.
use strict;
use warnings;
use Data::Diver 'DiveVal';
use Data::Printer;
my %hBugDetails = (
records => [
{
createdBy => 'created_by1',
status => 'status1',
devPriority => 'dev_priority1',
foo => 'foo1',
bar => 'bar1',
},
{
createdBy => 'created_by1',
status => 'status2',
devPriority => 'dev_priority2',
foo => 'foo',
bar => 'bar',
},
],
);
# we want to group by these fields
my #group_by = ( 'createdBy', 'status', 'devPriority' );
my $grouped_bugs = {}; # for some reason we need to start with an empty hashref
foreach my $bug ( #{ $hBugDetails{records} } ) {
# this will auto-vivify the hash for us
push #{ DiveVal( $grouped_bugs, map { $bug->{$_} } #group_by ) }, $bug;
}
p $grouped_bugs;
The output looks like this.
\ {
created_by1 {
status1 {
dev_priority1 [
[0] {
bar "bar1",
createdBy "created_by1",
devPriority "dev_priority1",
foo "foo1",
status "status1"
}
]
},
status2 {
dev_priority2 [
[0] {
bar "bar",
createdBy "created_by1",
devPriority "dev_priority2",
foo "foo",
status "status2"
}
]
}
}
}
Note that I renamed your variables. It was very hard to read the code like that. It makes more sense to just use speaking names instead of cryptic abbreviations for the type of variable. The sigil already does that for you.
This code will do what you need
my #aGroupByFields = qw/ createdBy status devPriority /;
my %hTemp;
for my $phBugRecord ( #{ $hBugDetails{records} } ) {
my $hash = \%hTemp;
for my $field ( #aGroupByFields ) {
my $key = $phBugRecord->{$field};
if ( $field eq $aGroupByFields[-1] ) {
push #{ $hash->{ $key } }, $phBugRecord;
}
else {
$hash = $hash->{ $key } //= {};
}
}
}

Find key for greatest value in hash of hashes in Perl

I have a hash of hashes containing keys, values, and counts of the form ((k1, v1), c1). I am trying to write a subroutine that returns the value of the key passed as a parameter with the greatest count. For example, if I had:
%hash = (
"a" => {
"b" => 2,
"c" => 1,
},
"d" => {
"e" => 4,
},
);
and made the call:
print &function("a");
it should print "b" because key "a" has the highest count of 2 with "b" as its value. Here is the code I have so far:
sub function() {
$key = $_[0];
if(exists($hash{$key})) {
while (my ($value, $count) = each %{$hash{$key}}) {
#logic goes here
}
} else {
return "$key does not exist";
}
}
The sub doesn't need to know anything about the outer hash, so it makes far more sense to call the sub as follows:
print key_with_highest_val($hash{a});
The sub simply needs to iterate over all the elements of that hash, keeping track of the highest value seen, and the key at which it was seen.
sub key_with_highest_val {
my ($h) = #_;
my $hi_v;
my $hi_k;
for my $k (keys(%$h)) {
my $v = $h->{$k};
if (!defined($hi_v) || $v > $hi_v) {
$hi_v = $v;
$hi_k = $k;
}
}
return $hi_k;
}
As Chris Charley points out, List::Util's reduce can simply this function. With the calling convention I recommended above, the reduce solution becomes the following:
use List::Util qw( reduce );
sub key_with_highest_val {
my ($h) = #_;
return reduce { $h->{$a} >= $h->{$b} ? $a : $b } keys(%$h);
}
Both versions return an arbitrary key among those that tied when there's a tie.
Use the reduce function from List::Util (which is part of core perl).
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw/reduce/;
my %hash = (
"a" => {
"b" => 2,
"c" => 1,
},
"d" => {
"e" => 4,
},
);
my $key = 'a';
print "For key: $key, max key is ", max_key($key, %hash), "\n";
sub max_key {
my ($key, %hash) = #_;
return "$key does not exist" unless exists $hash{$key};
my $href = $hash{$key};
return reduce { $href->{$a} > $href->{$b} ? $a : $b } keys %$href;
}
You should always include use strict and use warnings at the top of your programs to catch errors so you can find and fix them. This requires declaring of your variables with my, like my $key = 'a';, for example and my %hash = ...
This program prints:
For key: a, max key is b
This code makes the following assumptions:
The values of your nested hashes are always numeric.
You don't have duplicate values.
Anything else is left as an exercise for the reader.
use strict;
use warnings;
use Data::Dump;
use List::Util qw(max);
my %hash = (
a => {
b => 2,
c => 1,
},
d => {
e => 4,
},
);
dd(max_hash_value(\%hash, $_)) for 'a' .. 'd';
sub max_hash_value {
my ($hash_ref, $search_key) = #_;
return unless $hash_ref->{$search_key};
my %lookup;
while (my ($key, $value) = each(%{$hash_ref->{$search_key}})) {
$lookup{$value} = $key;
}
return $lookup{max(keys(%lookup))};
}
Output:
"b"
()
()
"e"

not able to access hash of hash of array values

I have written the following code in Perl. The code is reading a pdb file and getting some values. Ignore the top part of the code,where everything is working perfect.
Problem is in the sub-routine part, where I try to store arrays in the hash3 with model as key another key position
the array values can be accessed inside the if condition using this :
$hash3{$model}{$coordinates}[1].
but when I go out of all foreach loop and try to access the elements I only get one value.
Please look at the end foreach loop and tell me is it the wrong way to access the hash values.
The pdb file I am using can be downloaded from this link http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=1NZS
#!/usr/bin/perl
open(IN,$ARGV[0]);
my #phosphosites;
my $model=1;
my %hash3;
while(<IN>)
{
#findmod(#line);
#finddist;
#findfreq;
if((/^MODRES/) && (/PHOSPHO/))
{
#line=split;
push(#phosphosites, $line[2]);
#print "$line[4]";
}
foreach $elements (#phosphosites){
if(/^HETATM\s+\d+\s+CA\s+$i/)
{
#line1=split;
#print "$line1[5]";
#print "$line1[6] $line1[7] $line1[8]\n";
push(#phosphositesnum, $line1[5]);
}
}
$pos=$line1[5];
#findspatial(\#line,\#line1);
}
my #ori_data=removeDuplicates(#phosphositesnum);
sub removeDuplicates {
my %seen = ();
my #vals = ();
foreach my $i (#_) {
unless ($seen{$i}) {
push #vals, $i;
$seen{$i} = 1;
}
}
return #vals;
}
$a=(#phosphosites);
print "$a\n";
print "#phosphosites\n";
print "#ori_data\n";
close(IN);
open(IN1,$ARGV[0]);
my (#data)=<IN1>;
spatial(\#ori_data);
sub spatial {
my #spatial_array1=#{$_[0]};
foreach $coordinates(#spatial_array1)
{
$model=1;
{foreach $data1(#data){
if($data1=~ m/^HETATM\s+\d+\s+CA\s+[A-Z]*\s+[A-Z]*\s+$coordinates/)
{
#cordivals=split(/\s+/,$data1);
push #{ $sphash{$model} },[$cordivals[6], $cordivals[7], $cordivals[8]];
$hash3{$model}{$coordinates}= \#cordivals;
#print "$model $coordinates $hash3{$model}{$coordinates}[6] $hash3{$model}{$coordinates}[7] $hash3{$model}{$coordinates}[8]\n";
#print "$model $sphash{$model}[$i][0] $sphash{$model}[$i][1] $sphash{$model}[$i][2]\n";
}
elsif($data1=~ m/^ENDMDL/)
{
$model++;
}
#print "$model $coordinates $hash3{$model}{$coordinates}[6] $hash3{$model}{$coordinates}[7] $hash3{$model}{$coordinates}[8]\n";
}
}
}
#foreach $z1 (sort keys %hash3)
# {
# foreach $z2(#spatial_array1){
# print "$z1 $z2";
# print "$hash3{$z1}{$z2}[6]\n";
# print "$z2\n";
# }
# }
}
After using the Data::Dumper option it is giving me this kind of output
$VAR1 = {
'11' => {
'334' => [
'HETATM',
'115',
'CA',
'SEP',
'A',
'343',
'-0.201',
'-2.884',
'1.022',
'1.00',
'99.99',
'C'
],
'342' => $VAR1->{'11'}{'334'},
'338' => $VAR1->{'11'}{'334'},
'335' => $VAR1->{'11'}{'334'},
'340' => $VAR1->{'11'}{'334'},
'343' => $VAR1->{'11'}{'334'},
'336' => $VAR1->{'11'}{'334'}
},
'7' => {
'334' => $VAR1->{'11'}{'334'},
'342' => $VAR1->{'11'}{'334'},
'338' => $VAR1->{'11'}{'334'},
'335' => $VAR1->{'11'}{'334'},
'340' => $VAR1->{'11'}{'334'},
'343' => $VAR1->{'11'}{'334'},
'336' => $VAR1->{'11'}{'334'}
},
'2' => {
'334' => $VAR1->{'11'}{'334'},
'342' => $VAR1->{'11'}{'334'},
...
Change:
#cordivals=split(/\s+/,$data1);
to:
my #cordivals=split(/\s+/,$data1);
What seems to be happening is that all the hash elements contain references to the same array variable, because you're not making the variable local to that iteration.
In general, you should use my with all variables.

How can I extract all global variables from a script and get each data type in Perl?

I like to capture all global variables from an external Perl script with Perl. Currently I am hanging around the type detection.
How to determine the correct data type ('', 'SCALAR', 'HASH', 'ARRAY', 'CODE')?
Parser script:
my %allVariables = ();
{
do "scriptToBeParsed.pl";
foreach my $sym ( keys %main:: ) {
# Get all normal variables and scalar/hash/array references:
if ( ref( *{"$sym"} ) =~ m/^(?:|SCALAR|HASH|ARRAY)$/ ) {
$allVariables{"$sym"} = *{"$sym"};
}
}
}
Script to be parsed:
$someVariable1 = 'Yes, I like to be captured';
$otherVariable2 = \'And I also want to be captured';
%anotherVariable3 = ( 'Capture' => 'me' );
#lameVariable4 = ( 'Capture', 'me' );
$fooVariable5 = { 'Capture' => 'me' };
$barVariable6 = [ 'Capture', 'me' ];
$subVariable7 = sub { return "Don't capture me!" };
sub dontCaptureMe { return "Don't capture me!" }
In my example ref( *{"$sym"} ) returns always 'GLOB' (of course).
Another approach would be to use the has-like access of the typeglob, which is explained in Chapter 8 of brian d foy's Mastering Perl on page 131f.
package test;
no strict;
no warnings;
$someVariable1 = 'Yes, I like to be captured';
$otherVariable2 = \'And I also want to be captured';
%anotherVariable3 = ( 'Capture' => 'me' );
#lameVariable4 = ( 'Capture', 'me' );
$fooVariable5 = { 'Capture' => 'me' };
$barVariable6 = [ 'Capture', 'me' ];
$subVariable7 = sub { return "Don't capture me!" };
sub dontCaptureMe { return "Don't capture me!" }
say $dontCaptureMe;
my %allVariables = ();
{
do "scriptToBecomeParsed.pl";
foreach my $sym ( keys %test:: ) {
for (qw( SCALAR HASH ARRAY CODE IO)) {
if (*{"$sym"}{$_}) {
$allVariables{$_}->{"$sym"} = *{"$sym"}{$_};
}
}
}
}
print Data::Dumper::Dumper \%allVariables;
This will produce the following output:
$VAR1 = {
'CODE' => {
'dontCaptureMe' => sub { "DUMMY" }
},
'ARRAY' => {
'lameVariable4' => [
'Capture',
'me'
]
},
'HASH' => {
'anotherVariable3' => {
'Capture' => 'me'
}
},
'SCALAR' => {
'someVariable1' => \'Yes, I like to be captured',
'__ANON__' => \undef,
'subVariable7' => \sub { "DUMMY" },
'dontCaptureMe' => \undef,
'otherVariable2' => \\'And I also want to be captured',
'BEGIN' => \undef,
'barVariable6' => \[
'Capture',
'me'
],
'anotherVariable3' => \undef,
'lameVariable4' => \undef,
'fooVariable5' => \{
'Capture' => 'me'
}
}
};
like you said
ref( *{"$sym"} ) returns always 'GLOB' (of course).
Because perl stores everything in the symbol table in a glob, it is impossible to tell which data type something is. This is because in perl it is perfectly valid to have an array, scalar, hash or whatever else with the same name... because of this, perl stores everything in globs to avoid collisions. What you could do is loop through all of the symbols in the symbol table and test each glob against all the possible things that it could be (the set isn't too large) and see which ones are set.
Alternatively, a more practical approach might be to just load the perl script as text and parse for $, %, #, sub, open (filehandle) to see what type everything is.

Equivalent of "shift" for a hash to create a $class->next() method

I almost feel like saying "it's me again!".
Anyway, here we go.
I like using while $object->next() style constructs. They appeal to me and seem "neat".
Now, when the thing I'm iterating over is an array, it's straightforward ("shift #ary or return undef")
sub next {
my ( $self, $args ) = #_;
my $next = shift #{ $self->{list_of_things} } or return undef;
my ( $car, $engine_size, $color )
= split( /\Q$opts->{fieldsep}/, $next );
$self->car = $host;
$self->engine_size = $engine_size;
$self->color = $color;
}
In this example I use AUTOLOAD to create the getters and setters and then have those instance variables available in my object during the while loop.
I'd like to do something similar but with the "list_of_things" being a %hash.
Here's a non-OO example that doesn't make it into the first iteration. Any ideas why?
(The total "list_of_things" is not that big - maybe 100 entries - so to do a keys(%{$hash}) every time doesn't seem too wasteful to me).
use strict;
use warnings;
use Data::Dumper;
my $list_of_things = {
volvo => {
color => "red",
engine_size => 2000,
},
bmw => {
color => "black",
engine_size => 2500,
},
mini => {
color => "british racing green",
engine_size => 1200,
}
};
sub next {
my $args = $_;
my #list = keys( %{$list_of_things} );
return undef if scalar #list == "0";
my $next = $list_of_things->{ $list[0] };
delete $list_of_things->{ $list[0] };
return $next;
}
while ( next()) {
print Dumper $_;
print scalar keys %{ $list_of_things }
}
Is there a better way of doing this? Am I doing something crazy?
EDIT:
I tried Ikegami's suggestion. Of course, Ikegami's example works flawlessly. When I try and abstract a little, so that all that is exposed to the object is a next->() method, I get the same "perl-going-to-100%-cpu" problem as in my original example.
Here's a non-OO example:
use Data::Dumper qw( Dumper );
sub make_list_iter {
my #list = #_;
return sub { #list ? shift(#list) : () };
}
sub next {
make_list_iter( keys %$hash );
}
my $hash = { ... };
while ( my ($k) = next->() ) {
print Dumper $hash->{$k};
}
It does not seem to get past the first step of the while() loop.
I am obviously missing something here...
If you don't want to rely on the hash's builtin iterator (used by each, keys and values), there's nothing stopping you from making your own.
use Data::Dumper qw( Dumper );
sub make_list_iter {
my #list = #_;
return sub { #list ? shift(#list) : () };
}
my $list_of_things = { ... };
my $i = make_list_iter(keys %$list_of_things);
while (my ($k) = $i->()) {
local $Data::Dumper::Terse = 1;
local $Data::Dumper::Indent = 0;
say "$k: " . Dumper($list_of_things->{$k});
}
The each operator is a builtin that iterates over hashes. It returns undef when it runs out of elements to return. So you could so something like
package SomeObject;
# creates new object instance
sub new {
my $class = shift;
return bless { hash_of_things => { #_ } }, $class
}
sub next {
my $self = shift;
my ($key,$value) = each %{ $self->{hash_of_things} };
return $key; # or return $value
}
Calling keys on the hash will reset the each iterator. It's good to know this so you can reset it on purpose:
sub reset {
my $self = shift;
keys %{ $self->{hash_of_things} }
}
and so you can avoid resetting it on accident.
The section on tie'ing hashes in perltie also has an example like this.
Here's how List::Gen could be used to create an iterator from a list:
use strict;
use warnings;
use List::Gen 'makegen';
my #list_of_things = ( # This structure is more suitable IMO
{
make => 'volvo',
color => 'red',
engine_size => 2000,
},
{
make => 'bmw',
color => 'black',
engine_size => 2500,
},
{
make => 'mini',
color => 'british racing green',
engine_size => 1200,
}
);
my $cars = makegen #list_of_things;
print $_->{make}, "\n" while $cars->next;
Well, if you don't need $list_of_things for later, you can always do something like
while(keys %$list_of_things)
{
my $temp=(sort keys %$list_of_things)[0];
print "key: $temp, value array: " . join(",",#{$list_of_things->{$temp}}) . "\n";
delete $list_of_things->{$temp};
}
And if you do need it, you can always assign it to a temporary hash reference and perform the same while loop on it.