Iterating over a complex data structure - perl

I have what appears to be a hash of a hash of an array of hashes. I'm trying to pull some values out and I'm stumped (this is way deeper than I would go with a structure. It looks like this.....
%htest = (
8569 => {
4587 => [
{
date=> "2011-01-15",
approved=> 1,
},
{
date=> "2011-01-12",
approved=> 1,
},
],
1254 => [
{
date=> "2011-01-12",
approved=> "",
},
{
date=> "",
approved=> 1,
},
],
},
);
Trying to iterate over this thing is giving me a massive headache. I'm trying to access the number of elements under the second hash value (4587 and 1254). The number of those elements where approved="1" and the number of elements where the date contains a value.
If I could iterate over them I sure I could shove what I need into a less complex structure but so far I'm at a loss.
I got this far...
while (my ($id, $surveyhash) = each %{ $htest{'8569'} } ){
print "$enumid = $subhash\n";
print Dumper $subhash."\n";
}
That gives me the "4587" and "1254" but trying to do a dumper on $subhash just gives me....
4587 = ARRAY(0x9a9ffb0)
$VAR1 = 'ARRAY(0x9a9ffb0)
';
1254 = ARRAY(0x9a91788)
$VAR1 = 'ARRAY(0x9a91788)
';
Any idea how to iterate over this monstrosity?
Janie

Your structure has typos, you need commas between the innermost hashes and a parenthesis at the end (rather than a curly bracket)
Once you fix it you can use something like this:
my $approved = 0, my $date_has_value = 0;
while ( my ($k,$vref) = each %htest ) {
while ( my ($k,$v) = each %$vref ) {
# Now you're inside the inner hash, so there will be 2 iterations
# with $k 4587 and 1254
foreach my $item (#$v) {
# Now each $item is a reference to the innermost hashes
$approved++ if $item->{approved} == 1;
$date_has_value++ if $item->{date};
}
}
}

Here's a fairly explicit iteration that should help get you started
my ($num_approved, $num_date) = (0, 0);
# outer hash
while (my ($ka, $va) = each %htest)
{
# inner hash
while (my ($kb, $vb) = each %{$va})
{
# each hash inside the array
foreach my $h (#{$vb})
{
$num_approved += ${$h}{"approved"} == 1;
$num_date += length(${$h}{"date"}) > 0;
}
}
}

Counting match cases can be done with "scalar grep".
my ($approved, $date_has_value) = (0, 0);
for my $v1 (values %htest) {
for my $v2 (values %$v1) {
$approved += scalar grep { $$_{approved} eq '1' } #$v2;
$date_has_value += scalar grep { $$_{date} ne '' } #$v2;
}
}

Related

How to group hash of numbers that are equals

I'm kinda new to programming and Perl... I would say I could do that in C like language, but I've got no idea how to do that in Perl.
To simplify things...
my $sql = [
{ ID => 1 },
{ ID => 2 },
{ ID => 2 },
{ ID => 2 },
{ ID => 3 }
];
my $number = 0;
foreach( #{ $sql } ) {
my #array = ();
if ( $number != $_->{ ID } ) {
push #array, $number;
if( $number ) {
push #array, '(' . $_->{ ID } . ')'; #how should I cycle that ?
}
}
$number = $_->{ ID };
print("#array");
}
I somehow need to get an output like this 1(2)(2)(2)3. I don't want to work with real code here, so I made a simple example how I use it now. I would be really grateful for help.
This code gives me 01 (2)2 (3) which is wrong. I've been thinking about it about for multiple hours so my results are getting worse and worse because I'm stuck and looping through the same mistakes.
Welcome to Perl and programming in general.
One straightforward way to do this is to count the number of times each value appears. If only once then print it without parentheses. If more than once then print it with parentheses and repeat according to the count for that value. I'm using the 'x' operator to repeat a string. '*' x 3 would give you '***' for example.
use strict;
use warnings;
my $sql = [
{ ID => 1 },
{ ID => 2 },
{ ID => 2 },
{ ID => 2 },
{ ID => 3 }
];
my $number = 0;
my %count_values;
foreach( #{ $sql } ) {
# make the number the key, count how many times it appears.
$count_values{$_->{ID}}++;
}
my $output_string = '';
foreach my $key_number ( sort {$a<=>$b} keys %count_values ){
# if just one occurance then don't add parentheses.
if ($count_values{$key_number} == 1){
$output_string .= $key_number;
} else {
$output_string .= "($key_number)" x $count_values{$key_number};
}
}
print $output_string;

Split string into a hash of hashes (perl)

at the moment im a little confused..
I am looking for a way to write a string with an indefinite number of words (separated by a slash) in a recursive hash.
These "strings" are output from a text database.
Given is for example
"office/1/hardware/mouse/count/200"
the next one can be longer or shorter..
This must be created from it:
{
office {
1{
hardware {
mouse {
count => 200
}
}
}
}
}
Any idea ?
Work backwards. Split the string. Use the last two elements to make the inner-most hash. While more words exist, make each one the key of a new hash, with the inner hash as its value.
my $s = "office/1/hardware/mouse/count/200";
my #word = split(/\//, $s);
# Bottom level taken explicitly
my $val = pop #word;
my $key = pop #word;
my $h = { $key => $val };
while ( my $key = pop #word )
{
$h = { $key => $h };
}
Simple recursive function should do
use strict;
use warnings;
use Data::Dumper;
sub foo {
my $str = shift;
my ($key, $rest) = split m|/|, $str, 2;
if (defined $rest) {
return { $key => foo($rest) };
} else {
return $key;
}
}
my $hash = foo("foo/bar/baz/2");
print Dumper $hash;
Gives output
$VAR1 = {
'foo' => {
'bar' => {
'baz' => '2'
}
}
};
But like I said in the comment: What do you intend to use this for? It is not a terribly useful structure.
If there are many lines to be read into a single hash and the lines have a variable number of fields, you have big problems and the other two answers will clobber data by either smashing sibling keys or overwriting final values. I'm supposing this because there is no rational reason to convert a single line into a hash.
You will have to walk down the hash with each field. This will also give you the most control over the process.
our $hash = {};
our $eolmark = "\000";
while (my $line = <...>) {
chomp $line;
my #fields = split /\//, $line;
my $count = #fields;
my $h = $hash;
my $i = 0;
map { (++$i == $count) ?
($h->{$_}{$eolmark} = 1) :
($h = $h->{$_} ||= {});
} #fields;
}
$h->{$_}{$eolmark} = 1 You need the special "end of line" key so that you can recognize the end of a record and still permit longer records to coexist. If you had two records
foo/bar/baz foo/bar/baz/quux, the second would overwrite the final value of the first.
$h = $h->{$_} ||= {} This statement is a very handy idiom to both create and populate a cache in one step and then take a shortcut reference to it. Never do a hash lookup more than once.
HTH

how to declare array reference in hash refrence

my $memType = [];
my $portOp = [];
my $fo = "aster.out.DRAMA.READ.gz";
if($fo =~/aster.out\.(.*)\.(.*)\.gz/){
push (#{$memType},$1);
push (#{$portOp},$2);
}
print Dumper #{$memType};
foreach my $mem (keys %{$portCapability->{#{$memType}}}){
//How to use the array ref memType inside a hash//
print "entered here\n";
//cannot post the rest of the code for obvious reasons//
}
I am not able to enter the foreach loop . Can anyone help me fix it?
Sorry this is not the complete code . Please help me.
%{$portCapability->{#{$memType}}}
This doesn't do what you may think it means.
You treat $portCapability->{#{$memType}} as a hash reference.
The #{$memType} is evaluated in scalar context, thus giving the size of the array.
I aren't quite sure what you want, but would
%{ $portCapability->{ $memType->[0] } }
work?
If, however, you want to slice the elements in $portCapability, you would need somethink like
#{ $portCapability }{ #$memType }
This evaluates to a list of hashrefs. You can then loop over the hashrefs, and loop over the keys in an inner loop:
for my $hash (#{ $portCapability }{ #$memType }) {
for my $key (keys %$hash) {
...;
}
}
If you want a flat list of all keys of the inner hashes, but don't need the hashes themselves, you could shorten above code to
for my $key (map {keys %$_} #{ $portCapability }{ #$memType }) {
...;
}
I think what you want is this:
my $foo = {
asdf => {
a => 1, b => 2,
},
foo => {
c => 3, d => 4
},
bar => {
e => 5, f => 6
}
};
my #keys = qw( asdf foo );
foreach my $k ( map { keys %{ $foo->{$_} } } #keys ) {
say $k;
}
But you do not know which of these $k belongs to which key of $foo now.
There's no direct way to get the keys of multiple things at the same time. It doesn't matter if these things are hashrefs that are stored within the same hashref under different keys, or if they are seperate variables. What you have to do is build that list yourself, by looking at each of the things in turn. That's simply done with above map statement.
First, look at all the keys in $foo. Then for each of these, return the keys inside that element.
my $memType = [];
my $portOp = [];
my $fo = “aster.out.DRAMA.READ.gz”;
if ($fo =~ /aster.out\.(\w+)\.(\w+)\.gz/ ) { #This regular expression is safer
push (#$memType, $1);
push (#$portOp, $2);
}
print Dumper “#$memType”; #should print “DRAMA”
#Now if you have earlier in your program the hash %portCapability, your code can be:
foreach $mem (#$memType) {
print $portCapability{$mem};
}
#or if you have the hash $portCapability = {…}, your code can be:
foreach $mem (#$memType) {
print $portCapability->{$mem};
}
#Hope it helps

Delete value from Perl hash of arrays of hashes

I'm trying to delete values from a hash of arrays of hashes that I created with the following code:
while ((my $Genotype1, my $Fitness1) = each (%Normalisedfithash)) {
while ((my $Parent1A, my $TallyP1) = each(%P1Tallyhash)) {
my $ParentTally = 0;
my $SecondParent = {
Parent2 => $Parent1A,
Tally => $ParentTally,
};
push #{ $StoredParentshash{$Genotype1}}, $SecondParent;
I have been trying to delete values from %StoredParentshash where Tally is zero. (I have further code which updates Tally, but some are not updated and I want them removed from the hash).
I have written the following:
for my $Parent (keys %StoredParentshash) {
my $aref1 = $StoredParentshash{$Parent};
for my $hashref1 (#$aref1) {
my $Tally = $hashref1->{'Tally'};
if ($Tally == 0){
delete $hashref1->{'Tally'};
delete $hashref1->{'Parent2'};
}
}
}
This code sort of deletes the data, but when I use Data::Dumper the structure I get back looks like this:
'7412' => [
{},
{
'Tally' => 1,
'Parent2' => '2136'
},
{},
{},
{},
How can I completely remove the keys where the Tally is zero rather than being left with {}?
Thanks!
The code that you say has generated the data structure is faulty, as it is missing two closing braces.
You must show either your actual code with balanced { .. } or a dump of %StoredParentshash before we can help you properly.
If Tally and Parent2 are the only keys in the SecondParent hashes, then you should write something like
for my $children (values %StoredParentshash) {
#$children = grep $_->{Tally} != 0, #$children;
}
Your data looks like:
my %StoredParentshash = (
key1 => [
{
Tally => ...,
...
},
...
],
...
);
And you want to delete some of the array elements. Generally, I use grep for that.
#array = grep keep_condition(), #array;
Here is no exception.
for my $array (values(%StoredParentshash)) {
#$array = grep $_->{Tally}, #$array;
}
And to delete any arrays that are now empty:
for my $key (keys(%StoredParentshash)) {
delete $StoredParentshash{$key} if !#{ $StoredParentshash{$key} };
}
Or combined:
for my $key (keys(%StoredParentshash)) {
my $array = $StoredParentshash{$key};
#$array = grep $_->{Tally}, #$array;
delete $StoredParentshash{$key} if !#$array;
}

split array in sections

I have an array with this type of content
a/a/a/test134.html
a/a/a/test223.html
a/b/b/test37.html
a/b/test41.html
a/b/test44.html
a/b/test432.html
a/d/test978.html
a/test.html
I need to split it by "directories" so that I can send each array for a directory into a function (please see code sample).
a/a/a/test134.html
a/a/a/test223.html
a/b/b/test37.html
a/b/test41.html
a/b/test44.html
a/b/test432.html
a/d/test978.html
a/test.html
This is what I have so far but I feel theres lots of bugs especially on end and beginning cases and is not clean enough to my liking.
for(my $i = 0; $i < scalar(#arrayKeys); $i++)
{
my($filename, $directory) = fileparse($arrayKeys[$i]);
my $currDir = $directory;
# $currDir ne $prevDir: takes care of changes in path
# $i + 1 == scalar(#arrayKeys): accounts for last row to be purged
if($currDir ne $prevDir || $i + 1 == scalar(#arrayKeys))
{
# if last row we need to push it
if($i + 1 == scalar(#arrayKeys))
{
push(#sectionArrayKeys, $arrayKeys[$i]);
}
# ensure for first entry run we don't output
if ($prevDir ne "")
{
&output(\#sectionArrayKeys);
}
# Clear Array and start new batch
#sectionArrayKeys = ();
push(#sectionArrayKeys, $arrayKeys[$i]);
}
else
{
push(#sectionArrayKeys, $arrayKeys[$i]);
}
$prevDir = $currDir;
}
Your script is confusing, but from what I understand, you want to split the array of paths into new arrays, depending on their path. Well, easiest way to keep them apart is using a hash, like so:
use warnings;
use strict;
my %dir_arrays;
while (<DATA>) {
chomp;
if (m{^(.+/)([^/]+)$}) {
push #{$dir_arrays{$1}}, $_; # or use $2 for just filename
}
}
use Data::Dumper;
print Dumper \%dir_arrays;
__DATA__
a/a/a/test134.html
a/a/a/test223.html
a/b/b/test37.html
a/b/test41.html
a/b/test44.html
a/b/test432.html
a/d/test978.html
a/test.html
Output:
$VAR1 = {
'a/b/' => [
'a/b/test41.html',
'a/b/test44.html',
'a/b/test432.html'
],
'a/d/' => [
'a/d/test978.html'
],
'a/b/b/' => [
'a/b/b/test37.html'
],
'a/a/a/' => [
'a/a/a/test134.html',
'a/a/a/test223.html'
],
'a/' => [
'a/test.html'
]
};
Now, to send these arrays to a function, do something like this:
for my $key (keys %dir_arrays) {
my_function($dir_arrays{$key}); # this sends an array reference
}
If you prefer to send an array instead of an array reference, just dereference it:
my_function(#{$dir_arrays{$key}});
Edit: Changed the script to store the full path, as it was more in line with the wanted output in the question.