Iterate over a nested hash in perl - perl

I have to iterate over a nested hash in perl and carry out some operations. The structure I have is
$featureGroup = [
{
featureType => "widget",
name => "dpx-shadow-fleet",
parameterMap => { dpxContext => "shadowAtf", dpxEndPoint => "/art/dp/ppd?" },
},
{
featureType => "widget",
name => "dpx-shadow-fleet",
parameterMap => { dpxContext => "shadowBtf", dpxEndPoint => "/art/dp/btf?" },
},
{
features => [
{
featuredesc => [
{
critical => 1,
featureType => "widget",
name => "dpx-ppd",
parameterMap => { dpxContext => "atf", dpxEndPoint => "/art/dp/" },
},
{
featureType => "widget",
name => "error",
parameterMap => { errorMessageId => "error" },
},
],
featureType => "sequence",
},
{
critical => 1,
features => ["encode-landing-image", "image-encoding-error"],
featureType => "sequence",
},
],
handler => "/gp/product/features/embed-landing-image.mi",
name => "embed-landing-image",
pfMetrics => { "" => undef, "start" => sub { "DUMMY" }, "stop" => sub { "DUMMY" } },
type => "custom-grid",
},
];
I want to iterate over the featuredesc subarray and get the value name. I am trying out this.
for(my $i = 0; $i < #$featureGroup; $i++){
if(defined $featureGroup->[$i]->{'features'}){
for(my $j = 0; $j < #$featureGroup->[$i]->{'features'} ; $j++){
print "$featureGroup->[$i]->{'features'}->{'featuredesc}->{name}";
}
}
}
But this is not working. I am not understanding where am I going wrong. Any pointers in the right direction would be useful.

You have a very complex data object there and you have already encountered problems dealing with it. While I could help you address your direct problem, I think you would benefit more from learning how to reduce the complexity.
Perl supports Object Oriented programming. This allows you to take data structures and attach subroutines to them that operate on them. You can read about Perl OO here. I will show you quickly how you can turn the $featureGroup list into a list of objects, and how to access the features that a single object contains. You should apply this technique to every hash in your datastructure (you can tone it back if you are sure that certain inner hashes should not be objects, but it is probably better to start by overdoing it and then scale back rather than the other way around).
This is one of the feature group hashes:
{
'featureType' => 'widget',
'name' => 'dpx-shadow-fleet',
'parameterMap' => {
'dpxContext' => 'shadowAtf',
'dpxEndPoint' => '/art/dp/ppd?'
}
}
In this one you have a featureType, name, and parameterMap. These fields do not appear in every object in your list (in fact the last hash looks quite different to the first two). I will show you how to create an object which requires those three parameters:
package Feature;
use Moose; # You may have to install this
has 'featureType' => (
'is' => 'rw',
'isa' => 'Str'
);
has 'name' => (
'is' => 'rw',
'isa' => 'Str'
);
has 'parameterMap' => (
'is' => 'rw',
'isa' => 'HashRef'
# You could make this accept another object type
# if you convert this inner hash
);
You can then construct your object like so:
my $f = new Feature(
'featureType' => 'widget',
'name' => 'dpx-shadow-fleet',
'parameterMap' => {
'dpxContext' => 'shadowAtf',
'dpxEndPoint' => '/art/dp/ppd?'
}
);
You are then able to access those fields by using the named accessors:
print $f->name; # dpx-shadow-fleet
At the moment this just seems like a longer way to use a hash, right? Well the real benefit comes from being able to define arbitrary subroutines on the class which hide complexity from the caller. So you want to operate on the features array in your original question. Lets define that as a field:
has features => (
is => 'rw',
isa => 'ArrayRef[HashRef]'
# This is an array containing hashes
# You _really_ want to turn the inner hashes into an object here!
);
Then we can operate on them in another subroutine. Lets define one that returns every feature that is a sequence (has a featureType of sequence):
sub get_sequences {
my ($self) = #_;
return grep { $_->{featureType} eq 'sequence' } #{ $self->features };
}
Now when you use an object of this type to get the sequence features all you need to do is:
$f->get_sequences();
If you apply this to all levels of your hash you will find that your code becomes easier to manage. Good luck!

Try this:
for(my $i = 0; $i < #$featureGroup; $i++){
if(defined $featureGroup->[$i]->{'features'}){
for(my $j = 0; $j<scalar #{$featureGroup->[$i]->{'features'}} ; $j++){
for(my $k=0;$k<scalar #{$featureGroup->[$i]->{'features'}->[$j]->{'featuredesc'}};$k++) {
if (defined $featureGroup->[$i]->{'features'}->[$j]->{'featuredesc'}->[$k]->{'name'}) {
print $featureGroup->[$i]->{'features'}->[$j]->{'featuredesc'}->[$k]->{'name'}."\n";
}
}
last if !defined $featureGroup->[$i+1]->{'features'};
}
}
}

Instead of iterated by index, I'd advise that you iterate by element.
This enables one to easily filter each step using grep or next
for my $group (grep {$_->{features}} #$featureGroup) {
for my $feature (grep {$_->{featuredesc}} #{$group->{features}}) {
for my $desc (#{$feature->{featuredesc}}) {
print "$desc->{name}\n"
}
}
}
Outputs:
dpx-ppd
error

Related

Array of Hashrefs: How to access based on hashref "column" values

I have an array of hashrefs built from a database using fethrow_hashref(). The data structure is built like so:
while (my $ref = $sth->fetchrow_hashref()) {
push #lines, $ref;
}
I sort the data in the query by program name ascending, so all of the references in the array are still in alphabetical order. Then, I go through each hash and find the value that is numerically equal to a '1'. I then take the caolumn name, and store it to compare to the rest of the hashrefs with that program name to ensure they all have a '1' in the same column.
my $pgm = "";
my $met_lvl = "";
my #devs = ();
my %errors = ();
my $error = "";
foreach my $line_ref (#lines) {
if ($pgm ne $line_ref->{"PROGRAM"}) {
if (#devs && $error) {
# print " Different number metal layers for $pgm: #devs \n";
$error = "";
}
#devs = ();
$pgm = $line_ref->{"PROGRAM"};
($met_lvl) = grep { $line_ref->{$_} == 1 } keys(%$line_ref);
push #devs, $line_ref->{"DEVICE"};
} elsif ($pgm eq $line_ref->{"PROGRAM"}) {
push #devs, $line_ref->{"DEVICE"};
my ($met_chk ) = grep { $line_ref->{$_} == 1 } keys(%$line_ref);
if ($met_chk ne $met_lvl) {
$errors{$line_ref->{"PROGRAM"}} = $line_ref->{"PROGRAM"};
$error = "YUP";
}
}
}
I'd like to be able to access the hashrefs individually, based on matching column names from the database. How can I access the hashrefs with "TEST" values for "PROGRAM" keys? I used Data::Dumper to provide an example of a few of the hashrefs I'd like to access based on "PROGRAM" value:
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV1',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'TEST'
};
$VAR455 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV2',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'TEST'
};
$VAR456 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV3',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NON_STANDARD',
'PROGRAM' => 'EXP'
};
$VAR457 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV4',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'FINAL'
};
I'd like to be able to access key values for the hashrefs which contain the same program name. I cannot even begin to figure out what type of operation to use for this. I assume map is the correct way to do it, but dereferencing the "PROGAM" value for each element (hashref) in the array is beyond the scope of my understanding. I hope I was able to define the problem well enough for you guys to be able to help.
Edit: The impetus for wanting to access hashrefs with the same 'PROGRAM" value is to be able to provide an output of selected values to print to a logfile. So, after I compare and find differences between those hashrefs with the same "PROGRAM" value, I want to access them all again, and print out the desired column values to the lofgile.
Looks like you need to exrtact subsets of your data (hashrefs) with the same PROGRAM name.
Can preprocess your data to build a hash with those names as keys, and arrayrefs (with suitable hashrefs) as values. Then process those groups one at a time.
use warnings;
use strict;
use feature 'say';
use Data::Dumper; # to print complex data below
... populate #lines with hashrefs as in the question or copy-paste a sample
# Build hash: ( TEST => [ hashrefs w/ TEST ], EXP => [ hashrefs w/ EXP ], ... )
my %prog_subset;
for my $hr (#lines) {
push #{ $prog_subset{$hr->{PROGRAM}} }, $hr;
# Or, using "postfix dereferencing" (stable from v5.24)
# push $prog_subset{$hr->{PROGRAM}}->#*, $hr;
}
foreach my $prog (keys %prog_subset) {
say "\nProcess hashrefs with PROGRAM being $prog";
foreach my $hr (#{ $prog_subset{$prog} }) {
say Dumper $hr;
}
}
(See postfix dereference)
Now %prog_subset contains keys TEST, EXP, FINAL (and whatever other PROGRAM names are in data), each having for value an arrayref of all hashrefs which have that PROGRAM name.
There are other ways, and there are libraries that can be leveraged, but this should do it.
OK! I found an example of this with the google machine. I replaced #lines = (); with $lines = [];. This allowed me to change the grep statement to (#found) = grep { $pgm eq $_->{PROGRAM} } #$lines;. Now the returned array is a list of the hashrefs that share the program name I'm looking for. Thanks for the help #zdim!

How do you get MotherDogRobot to birth an array of puppy objects using map and a hash of hashes?

Puppy meta data gets read in from config file using (General::Config) and creates this hash of hashes
$puppy_hashes = {
puppy_blue => { name => 'charlie', age => 4 },
puppy_red => { name => 'sam', age => 9 },
puppy_yellow => { name => 'jerry', age => 2 },
puppy_green => { name => 'phil', age => 5 },
}
the MotherDogRobot package consumes the puppies hash to birth an array of puppy objects (lol)
package MotherDogRobot;
use Moose;
use Puppy;
use Data::Dumper;
#moose includes warn and strict
sub init_puppy{
my($self,%options) = #_;
my $puppy = Puppy->new( %options );
return ($puppy);
}
sub birth_puppies{
my($self,$puppy_hashes) = #_;
my #keys = keys %{$puppy_hashes};
my #puppies = map { $self->init_puppy( $puppy_hashes->{$_} ) } #keys;
return(#puppies);
}
sub show_me_new_puppies{
my($self,$puppy_hashes) #_;
print Dumper($self->birth_puppies($puppy_hashes));
}
Error odd number of arguments
passing %options to Puppy->new(%options)
no luck birthing puppies -- which means I can't put lasers on their heads =/
UPDATE
I think the problem is that I'm passing a Hash Ref to init_puppy() instead of an array or hash, so when I try to pass %options to the new constructor, it's not getting a proper ( key => value) pair -- hence the odd number of arguments error.
But from this standpoint I've been looking at this code too long I cant figure out how to dereference this properly.
btw this is my official day 22 of using Perl!
you're using empty variables as if they're not empty, that is, you're not doing anything at all
print "hi $_ " for my #foo;
This assumes that the incomplete snippet you've shown is what you're really using
update: Similarly in sub init_puppy, you never initialize my($self,%options)=#_;
#!/usr/bin/perl --
use strict;
use warnings;
Main( #ARGV );
exit( 0 );
sub Main {
my $puppy_hashes = {
puppy_blue => { name => 'charlie', age => 4 },
puppy_red => { name => 'sam', age => 9 },
puppy_yellow => { name => 'jerry', age => 2 },
puppy_green => { name => 'phil', age => 5 },
};
for my $puppy ( MotherDogRobot->birth_puppies($puppy_hashes) ) {
print join ' ', $puppy, $puppy->name, $puppy->age, $puppy->dump, "\n";
}
}
BEGIN {
package Puppy;
BEGIN { $INC{'Puppy.pm'} = __FILE__; }
use Any::Moose;
has 'name' => ( is => 'rw', isa => 'Str' );
has 'age' => ( is => 'rw', isa => 'Int' );
package MotherDogRobot;
BEGIN { $INC{'MotherDogRobot.pm'} = __FILE__; }
use Moose;
use Puppy;
sub init_puppy {
my ( $self, %options ) = #_;
my $puppy = Puppy->new(%options);
return ($puppy);
}
sub birth_puppies {
my ( $self, $puppy_hashes ) = #_;
my #puppies = map { $self->init_puppy( %{$_} ) } values %$puppy_hashes;
return (#puppies);
}
no Moose;
}
The standard Moose constructor will accept both
->new( %{ $puppy_hashes->{$_} } )
and
->new( $puppy_hashes->{$_} )
if $puppy_hashes contains what you say it does, and $_ is an existing key.
Furthermore, Moose will not give the error Error odd number of argments when you pass no arguments. (You're not assigning anything to %config.)
I can't tell which part of what you said is wrong, but what you said doesn't add up.

Perl, #array in perl constructor

I write perl classes, but I don't know how to have a array or a hash in my $this variable ?
I have a pack.pm :
#!/usr/bin/perl -w
use strict;
use Parallel::ForkManager;
package Pack;
our $cgi = new CGI;
sub new {
my ($classe, $nom, $nbports, $gio) = #_;
my $this = {
"nom" => $nom,
"nbports" => $nbports,
"gio" => $gio
};
bless($this, $classe);
return $this;
}
...
1;
I would like to have a #tab, I can access via $this->tab, but I don't want to give it in arg to the instance.
How does it work in Perl ?
Thanks.
Given your answer to my comments, I think you want
my($this) = {
"nom" => $nom,
"nbports" => $nbports,
"gio" => $gio,
"tab" => []
};
i.e. set $this->{tab} to be a reference to a new anonymous array.
Now you can reference it as you wish, e.g.
$this->{"tab"}[0] = "new value";
print "Table contains ", scalar(#{$this->{"tab"}}), "entries\n";
Consider using Moose for your OO Perl needs.
I've created a Moose version of your object that includes an attribute with an attribute featuring Array trait delegation, inlcuding currying of delegated methods. Moose offers easy ways to generate powerful, encapsulated classes without writing reams of boilerplate.
I created a class Pack with attributes: nom, nbports, gio, and tab.
nom is a read-only string and is required when the object is created.
nbports is a read-only integer value and defaults to 32 when not provided.
gio is an optional, read-write boolean value.
tab is an array of strings. All sorts of behavior has been defined for tab:
all_tabs returns a list of the contents of tabs
add_tab pushes values onto the end of tabs
tab_count returns a count of the elements in tabs
alpha_tabs returns a list of the members of tabs alphabetical order
turn_tabs returns a list of the strings in tabs, but with the letters in reverse
Any attempts to set an attribute with be checked for type correctness.
Moose creates all the required methods to support these complex behaviors with the following code:
package Pack;
use Moose;
has 'nom' => (
is => 'ro',
isa => 'Str',
required => 1,
);
has 'nbports' => (
is => 'ro',
isa => 'Int',
default => 32,
);
has 'gio' => (
is => 'rw',
isa => 'Bool',
predicate => 'has_gio',
);
has 'tab' => (
traits => ['Array'],
is => 'ro',
isa => 'ArrayRef[Str]',
default => sub {[]},
handles => {
all_tabs => 'elements',
add_tab => 'push',
turn_tabs => [ 'map', sub { reverse } ],
tab_count => 'count',
alpha_tabs => [ 'sort', sub { lc($a) cmp lc($b) } ],
},
);
__PACKAGE__->meta->make_immutable;
no Moose;
1;
Usable like so:
my $p = Pack->new( nom => 'Roger', tab => [qw( fee fie foe fum )] );
my $gio_state = 'UNSET';
if( $p->has_gio ) {
$gio_state = $p->gio ? 'TRUE' : 'FALSE';
}
print "GIO is $gio_state\n";
my #turned = $p->turn_tabs; # eef eif eof muf
$p->add_tabs( 'faa', 'fim' );
my #sorted = $p->alpha_tabls; # faa fee fie fim foe fum
my $count = $p->tab_count; # 6
my $ports = $p->nbports; # 32
try with:
sub set_tab {
my ($self, #tab) = #_;
$self->{ tab } = \#tab;
}

Perl - Removing unwanted elements from an arrayref

I'm writing a script that parses the "pure-ftpwho -s" command to get a list of the current transfers. But when a user disconnects from the FTP and reconnects back and resumes a transfer, the file shows up twice. I want to remove the ghosted one with Perl. After parsing, here is what the arrayref looks like (dumped with Data::Dumper)
$VAR1 = [
{
'status' => 'DL',
'percent' => '20',
'speed' => '10',
'file' => 'somefile.txt',
'user' => 'user1',
'size' => '14648'
},
{
'status' => 'DL',
'percent' => '63',
'speed' => '11',
'file' => 'somefile.txt',
'user' => 'user1',
'size' => '14648'
},
{
'status' => 'DL',
'percent' => '16',
'speed' => '60',
'file' => 'somefile.txt',
'user' => 'user2',
'size' => '14648'
}
];
Here user1 and user2 are downloading the same file, but user1 appears twice because the first one is a "ghost". What's the best way to check and remove elements that I don't need (in this case the first element of the arrayref). The condition to check is that - If the "file" key and "user" key is the same, then delete the hashref that contains the smaller value of "percent" key (if they're the same then delete all except one).
If order in the original arrayref doesn't matter, this should work:
my %users;
my #result;
for my $data (#$arrayref) {
push #{ $users{$data->{user}.$data->{file}} }, $data;
}
for my $value (values %users) {
my #data = sort { $a->{percent} <=> $b->{percent} } #$value;
push #result, $data[-1];
}
This can definitely be improved for efficiency.
The correct solution in this case would have been to use a hash when parsing the log file. Put all information into a hash, say %log, keyed by user and file:
$log{$user}->{$file} = {
'status' => 'DL',
'percent' => '20',
'speed' => '10',
'size' => '14648'
};
etc. Latter entries in the log file would overwrite earlier ones. Alternatively, you can choose to overwrite entries with lower percent completed with ones that have higher completion rates.
Using a hash would get rid of a lot of completely superfluous code working around the choice of the wrong data structure.
For what it's worth, here's my (slightly) alternative approach. Again, it doesn't preserve the original order:
my %most_progress;
for my $data ( sort { $b->{percent} <=> $a->{percent} } #$data ) {
next if exists $most_progress{$data->{user}.$data->{file}};
$most_progress{$data->{user}.$data->{file}} = $data;
}
my #clean_data = values %most_progress;
This will preserve order:
use strict;
use warnings;
my $data = [ ... ]; # As posted.
my %pct;
for my $i ( 0 .. $#{$data} ){
my $r = $data->[$i];
my $k = join '|', $r->{file}, $r->{user};
next if exists $pct{$k} and $pct{$k}[1] >= $r->{percent};
$pct{$k} = [$i, $r->{percent}];
}
#$data = #$data[sort map $_->[0], values %pct];
my %check;
for (my $i = 0; $i <= $#{$arrayref}; $i++) {
my $transfer = $arrayref->[$i];
# check the transfer for user and file
my $key = $transfer->{user} . $transfer->{file};
$check{$key} = { } if ( !exists $check{$key} );
if ( $transfer->{percent} <= $check{$key}->{percent} ) {
# undefine this less advanced transfer
$arrayref->[$i] = undef;
} else {
# remove the other transfer
$arrayref->[$check{$key}->{index}] = undef if exists $check{$key}->{index};
# set the new standard
$check{$key} = { index => $i, percent => $transfer->{percent} }
}
}
# remove all undefined transfers
$arrayref = [ grep { defined $_ } #$arrayref ];
Variation on the theme with Perl6::Gather
use Perl6::Gather;
my #cleaned = gather {
my %seen;
for (sort { $b->{percent} <=> $a->{percent} } #$data) {
take unless $seen{ $_->{user} . $_->{file} }++;
}
};

How do I create and add anonymous hashes to a known Hash during script execution?

I'll attempt to illustrate this with an example. Take a common example of a Hash of Hashes:
my %HoH = (
flintstones => {
lead => "fred",
pal => "barney",
},
jetsons => {
lead => "george",
wife => "jane",
"his boy" => "elroy",
},
simpsons => {
lead => "homer",
wife => "marge",
kid => "bart",
},
);
For my purposes, I would like to be able to add an unnamed, or anonymous hashes to %HOH. I won't need (or be able to) define these sub-hashes until runtime. How can I accomplish this with Perl?
Everything I've read (and I have read through Perldocs and Google'd already) seems to show examples where all sub-hahes (e.g. "flintstones", "jetsons", and "simpsons") are defined.
What I am doing is attempting to build a parent Hash that will contain sub-hashes with rows from a CSV file:
%TopHash = (
%Line1 => {
cell01 => $some_value1a;
cell02 => $some_value2a;
cell03 => $some_value3a;
},
%Line2 => {
cell01 => $some_value1b;
cell02 => $some_value2b;
cell03 => $some_value3b;
},
%Line3 => {
cell01 => $some_value1c;
cell02 => $some_value2c;
cell03 => $some_value3c;
},
# etc
# etc
# etc
);
The number of "%LineX" hashes that I need is not known until runtime (because they represent the number of lines in a CSV that is read at runtime).
Any ideas? If it isn't clear already...I still am trying to wrap my head around Perl hashes.
To add an anonymous hash at runtime, assign it as you would a normal hash element:
$HoH{key} = { foo => 42 };
or
$HoH{key} = $hash_ref;
or
$HoH{key} = \%hash;
First you create the hash from the current line you're parsing
my %lineHash = (
cell01 => $some_value1a,
cell02 => $some_value1b,
cell03 => $some_value1c
);
or create a reference to a hash outright
my $lineHashRef = {
cell01 => $some_value2a,
cell02 => $some_value2b,
cell03 => $some_value2c
};
Then you add it to your overall hash, remembering that nested perl structures just contain references to the other structures.
$topHash{line1} = \%lineHash;
$topHash{line2} = $lineHashRef;
Updated
Example given a loop over an array of data to parse
my %topHash;
foreach my $i (0 .. $#data) {
my %tempHash;
// stuff here to parse $data[$i] and populate %tempHash
$topHash{"line$i"} = \%tempHash;
}
#!/usr/bin/perl
use strict;
my %HoH = (
line01 => {
cell01 => "cell0101",
cell02 => "cell0102",
cell03 => "cell0103"
}
);
$HoH{"line02"} =
{
cell01 => "cell0201",
cell02 => "cell0202",
cell03 => "cell0203"
};
foreach my $hohKey (keys %HoH)
{
my $newHash = $HoH{$hohKey};
print "Line Name: $hohKey\n";
foreach my $key (keys %$newHash)
{
print "\t$key => ", $newHash->{$key}, "\n";
}
}
Everytime you create a new hash from a line of data, you'll need to think of a unique key to store that data in your top hash table.
my $line = 1;
my %HoH;
while (<>) {
my ($cell01, $cell02, $cell03, #etc) = split /,/;
my $newHash = { cell01 => $cell01, cell02 => $cell02, ... };
my $key = "line$line";
$HoH{$key} = $newHash;
$line++;
}
Now keys(%HoH) will return a (unsorted) list like "line1","line2","line3",....
$HoH{"line5"} would return a reference to the data for the 5th line of your file.
%{$HoH{"line7"}} is kind of ugly syntax but it returns a hashtable of your data
from line 7.
$HoH{"line14"}{"cell02"} could be used to get at a specific piece of data.