Array of Hashrefs: How to access based on hashref "column" values - perl

I have an array of hashrefs built from a database using fethrow_hashref(). The data structure is built like so:
while (my $ref = $sth->fetchrow_hashref()) {
push #lines, $ref;
}
I sort the data in the query by program name ascending, so all of the references in the array are still in alphabetical order. Then, I go through each hash and find the value that is numerically equal to a '1'. I then take the caolumn name, and store it to compare to the rest of the hashrefs with that program name to ensure they all have a '1' in the same column.
my $pgm = "";
my $met_lvl = "";
my #devs = ();
my %errors = ();
my $error = "";
foreach my $line_ref (#lines) {
if ($pgm ne $line_ref->{"PROGRAM"}) {
if (#devs && $error) {
# print " Different number metal layers for $pgm: #devs \n";
$error = "";
}
#devs = ();
$pgm = $line_ref->{"PROGRAM"};
($met_lvl) = grep { $line_ref->{$_} == 1 } keys(%$line_ref);
push #devs, $line_ref->{"DEVICE"};
} elsif ($pgm eq $line_ref->{"PROGRAM"}) {
push #devs, $line_ref->{"DEVICE"};
my ($met_chk ) = grep { $line_ref->{$_} == 1 } keys(%$line_ref);
if ($met_chk ne $met_lvl) {
$errors{$line_ref->{"PROGRAM"}} = $line_ref->{"PROGRAM"};
$error = "YUP";
}
}
}
I'd like to be able to access the hashrefs individually, based on matching column names from the database. How can I access the hashrefs with "TEST" values for "PROGRAM" keys? I used Data::Dumper to provide an example of a few of the hashrefs I'd like to access based on "PROGRAM" value:
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV1',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'TEST'
};
$VAR455 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV2',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'TEST'
};
$VAR456 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV3',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NON_STANDARD',
'PROGRAM' => 'EXP'
};
$VAR457 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV4',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'FINAL'
};
I'd like to be able to access key values for the hashrefs which contain the same program name. I cannot even begin to figure out what type of operation to use for this. I assume map is the correct way to do it, but dereferencing the "PROGAM" value for each element (hashref) in the array is beyond the scope of my understanding. I hope I was able to define the problem well enough for you guys to be able to help.
Edit: The impetus for wanting to access hashrefs with the same 'PROGRAM" value is to be able to provide an output of selected values to print to a logfile. So, after I compare and find differences between those hashrefs with the same "PROGRAM" value, I want to access them all again, and print out the desired column values to the lofgile.

Looks like you need to exrtact subsets of your data (hashrefs) with the same PROGRAM name.
Can preprocess your data to build a hash with those names as keys, and arrayrefs (with suitable hashrefs) as values. Then process those groups one at a time.
use warnings;
use strict;
use feature 'say';
use Data::Dumper; # to print complex data below
... populate #lines with hashrefs as in the question or copy-paste a sample
# Build hash: ( TEST => [ hashrefs w/ TEST ], EXP => [ hashrefs w/ EXP ], ... )
my %prog_subset;
for my $hr (#lines) {
push #{ $prog_subset{$hr->{PROGRAM}} }, $hr;
# Or, using "postfix dereferencing" (stable from v5.24)
# push $prog_subset{$hr->{PROGRAM}}->#*, $hr;
}
foreach my $prog (keys %prog_subset) {
say "\nProcess hashrefs with PROGRAM being $prog";
foreach my $hr (#{ $prog_subset{$prog} }) {
say Dumper $hr;
}
}
(See postfix dereference)
Now %prog_subset contains keys TEST, EXP, FINAL (and whatever other PROGRAM names are in data), each having for value an arrayref of all hashrefs which have that PROGRAM name.
There are other ways, and there are libraries that can be leveraged, but this should do it.

OK! I found an example of this with the google machine. I replaced #lines = (); with $lines = [];. This allowed me to change the grep statement to (#found) = grep { $pgm eq $_->{PROGRAM} } #$lines;. Now the returned array is a list of the hashrefs that share the program name I'm looking for. Thanks for the help #zdim!

Related

Perl pass hash reference to Subroutine from Foreach loop (Array of Hash)

This may be something very simple for you but i have been trying for it for more than one hour.
Ok.. Here is my code,
#collection = [
{
'name' => 'Flo',
'model' => '23423',
'type' => 'Associate',
'id' => '1-23928-2392',
'age' => '23',
},
{
'name' => 'Flo1',
'model' => '23424',
'type' => 'Associate2',
'id' => '1-23928-23922',
'age' => '25',
}];
foreach my $row (#collection) {
$build_status = $self->build_config($row);
}
sub build_config {
my $self = shift;
my %collect_rows = #_;
#my %collect_rows = shift;
print Dumper %collect_rows; exit;
exit;
}
Where $row is really an Hash. But when i Print it.. It gives output like below (I am using Dumper),
$VAR1 = 'HASH(0x20a1d68)';
You are confusing references with hashes and arrays.
An array is declared with:
my #array = (1, 2, 3);
A hash is declared with:
my %hash = (1 => 2, 3 => 4);
That's because both hashes and arrays are simply lists (fancy lists, but I digress). The only time you need to use the [] and {} is when you want to use the values contained in the list, or you want to create a reference of either list (more below).
Note that the => is just a special (ie. fat) comma, that quotes the left-hand side, so although they do the same thing, %h = (a, 1) would break, %h = ("a", 1) works fine, and %h = (a => 1) also works fine because the a gets quoted.
An array reference is declared as such:
my $aref = [1, 2, 3];
...note that you need to put the array reference into a scalar. If you don't and do it like this:
my #array = [1, 2, 3];
... the reference is pushed onto the first element of #array, which is probably not what you want.
A hash reference is declared like this:
my $href = {a => 1, b => 2};
The only time [] is used on an array (not an aref) is when you're using it to use an element: $array[1];. Likewise with hashes, unless it's a reference, you only use {} to get at a key's value: $hash{a}.
Now, to fix your problem, you can keep using the references with these changes:
use warnings;
use strict;
use Data::Dumper;
# declare an array ref with a list of hrefs
my $collection = [
{
'name' => 'Flo',
...
},
{
'name' => 'Flo1',
...
}
];
# dereference $collection with the "circumfix" operator, #{}
# ...note that $row will always be an href (see bottom of post)
foreach my $row (#{ $collection }) {
my $build_status = build_config($row);
print Dumper $build_status;
}
sub build_config {
# shift, as we're only accepting a single param...
# the $row href
my $collect_rows = shift;
return $collect_rows;
}
...or change it up to use non-refs:
my #collection = (
{
'name' => 'Flo',
...
},
{
'name' => 'Flo1',
...
}
);
foreach my $row (#collection) {
my $build_status = build_config($row);
# build_config() accepts a single href ($row)
# and in this case, returns an href as well
print Dumper $build_status;
}
sub build_config {
# we've been passed in a single href ($row)
my $row = shift;
# return an href as a scalar
return $row;
}
I wrote a tutorial on Perl references you may be interested in guide to Perl references. There's also perlreftut.
One last point... the hashes are declared with {} inside of the array because they are indeed hash references. With multi-dimentional data, only the top level of the structure in Perl can contain multiple values. Everything else underneath must be a single scalar value. Therefore, you can have an array of hashes, but technically and literally, it's an array containing scalars where their value is a pointer (reference) to another structure. The pointer/reference is a single value.

How to incorporate hash inside hash in perl?

my %book = (
'name' => 'abc',
'author' => 'monk',
'isbn' => '123-890',
'issn' => '#issn',
);
my %chapter = (
'title' => 'xyz',
'page' => '90',
);
How do I incorporate %book inside %chapter through reference so that when I write "$chapter{name}", it should print 'abc'?
You can copy the keys/values of the %book into the %chapter:
#chapter{keys %book} = values %book;
Or something like
%chapter = (%chapter, %book);
Now you can say $chapter{name}, but changes in %book are not reflected in %chapter.
You can include the %book via reference:
$chapter{book} = \%book;
Now you could say $chapter{book}{name}, and changes do get reflected.
To have an interface that allows you to say $chapter{name} and that does reflect changes, some advanced techniques would have to be used (this is fairly trivial with tie magic), but don't go there unless you really have to.
You could write a subroutine to check a list of hashes for a key. This program demonstrates:
use strict;
use warnings;
my %book = (
name => 'abc',
author => 'monk',
isbn => '123-890',
issn => '#issn',
);
my %chapter = (
title => 'xyz',
page => '90',
);
for my $key (qw/ name title bogus / ) {
print '>> ', access_hash($key, \%book, \%chapter), "\n";
}
sub access_hash {
my $key = shift;
for my $hash (#_) {
return $hash->{$key} if exists $hash->{$key};
}
undef;
}
output
Use of uninitialized value in print at E:\Perl\source\ht.pl line 17.
>> abc
>> xyz
>>

Perl Hash of Hashes Storing References To Within - Can't use string ("") as a HASH ref

Alright I have a function which generates a hash tree that dumper prints out to look like this:
$VAR1 = {
'shaders' => {
'stock_gui.vert' => '',
'stock_font.vert' => '',
'stock_gui.frag' => '',
'stock_font.frag' => ''
},
'textures' => {},
'fonts' => {
'DroidSansMono.ttf' => '',
'small' => {
'DroidSansMono.ttf' => ''
}
}
};
Now I am trying to dfs iterate for example the fonts sub tree:
push (#stack, \%{$myHash->{'fonts'}});
Then in a loop:
my $tmpHash = pop(#stack);
Then a dumper of $tmpHash shows:
$VAR1 = {
'DroidSansMono.ttf' => '',
'small' => {
'DroidSansMono.ttf' => ''
}
};
The problem is trying access the children of the hash reference. I can count the keys and see the children. The dumper output looks okay. However trying to do something like:
foreach my $childKey ( keys $tmpHash ){
my $subChildrenCount = scalar keys(%{$tmpHash->{$childKey}});
}
Yields the error:
Can't use string ("") as a HASH ref while "strict refs" in use
I think this is because $tmpHash is a hash reference. I likely just need to dereference it somewhere. I've tried many things and all yields further issues. Any help appreciated.
If I try:
%{$tmpHash->{'small'}}
Then it works fine.
UPDATE:
If the string contains a '.' then this error occurs. Hard coding 'small' works. Hard coding 'stock_gui.vert' fails unless I escape the '.'. However the keys do not match if I escape the dot...
As you can see by running it yourself,
use strict;
use warnings;
my $tmpHash = {
'DroidSansMono.ttf' => '',
'small' => {
'DroidSansMono.ttf' => ''
}
};
my $subChildrenCount = scalar keys(%{$tmpHash->{'small'}});
the code you say gives that error does not actually give that error. I suspect you are actually doing
my $subChildrenCount = scalar keys(%{$tmpHash->{'DroidSansMono.ttf'}});
Your hash format doesn't make much sense. It mixes field names and actual data as keys.

troubleshooting "pseudo-hashes are deprecated" while using xml module

I am just learning how to use perl hashes and ran into this message in perl. I am using XML::Simple to parse xml output and using exists to check on the hash keys.
Message:
Pseudo-hashes are deprecated at ./h2.pl line 53.
Argument "\x{2f}\x{70}..." isn't numeric in exists at ./h2.pl line 53.
Bad index while coercing array into hash at ./h2.pl line 53.
I had the script working earlier with one test directory and then executed the script on another directory for testing when I got this message. How do I resolve/workaround this?
Code that the error references:
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;
#my $data = XMLin($xml);
my $data = XMLin($xml, ForceArray => [qw (file) ]);
my $size=0;
if (exists $data->{class}
and $data->{class}=~ /FileNotFound/) {
print "The directory: $Path does not exist\n";
exit;
} elsif (exists $data->{file}->{path}
and $data->{file}->{path} =~/test-out-00/) {
$size=$data->{file}->{size};
if ($size < 1024000) {
print "FILE SIZE:$size BYTES\n";
exit;
}
} else {
exit;
}
print Dumper( $data );
Working test case, data structure looks like this:
$VAR1 = {
'recursive' => 'no',
'version' => '0.20.202.1.1101050227',
'time' => '2011-09-30T02:49:39+0000',
'filter' => '.*',
'file' => {
'owner' => 'test_act',
'replication' => '3',
'blocksize' => '134217728',
'permission' => '-rw-------',
'path' => '/source/feeds/customer/test/test-out-00',
'modified' => '2011-09-30T02:48:41+0000',
'size' => '135860644',
'group' => '',
'accesstime' => '2011-09-30T02:48:41+0000'
'modified' => '2011-09-30T02:48:41+0000'
},
'exclude' => ''
};
recursive:no
version:0.20.202.1.1101050227
time:2011-10-01T07:06:16+0000
filter:.*
file:HASH(0x84c83ec)
path:/source/feeds/customer/test
directory:HASH(0x84c75d8)
exclude:
Data structure with seeing error:
$VAR1 = {
'recursive' => 'no',
'version' => '0.20.202.1.1101050227',
'time' => '2011-10-03T04:49:36+0000',
'filter' => '.*',
'file' => [
{
'owner' => 'test_act',
'replication' => '3',
'blocksize' => '134217728',
'permission' => '-rw-------',
'path' => '/source/feeds/customer/test/20110531/test-out-00',
'modified' => '2011-10-03T04:47:46+0000',
'size' => '121406618',
'group' => 'feeds',
'accesstime' => '2011-10-03T04:47:46+0000'
},
Test xml file:
<?xml version="1.0" encoding="UTF-8"?><listing time="2011-10-03T04:49:36+0000" recursive="no" path="/source/feeds/customer/test/20110531" exclude="" filter=".*" version="0.20.202.1.1101050227"><directory path="/source/feeds/customer/test/20110531" modified="2011-10-03T04:48:19+0000" accesstime="1970-01-01T00:00:00+0000" permission="drwx------" owner="test_act" group="feeds"/><file path="/source/feeds/customer/test/20110531/test-out-00" modified="2011-10-03T04:47:46+0000" accesstime="2011-10-03T04:47:46+0000" size="121406618" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group="feeds"/><file path="/source/feeds/customer/test/20110531/test-out-01" modified="2011-10-03T04:48:04+0000" accesstime="2011-10-03T04:48:04+0000" size="127528522" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group="feeds"/><file path="/source/feeds/customer/test/20110531/test-out-02" modified="2011-10-03T04:48:19+0000" accesstime="2011-10-03T04:48:19+0000" size="125452919" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group="feeds"/></listing>
The "Pseudo-hashes are deprecated" error means you're trying to access an array as a hash, which means that either $data->{file} or $data->{file}{path} is an arrayref.
You can check the data type by using print ref $data->{file}. The Data::Dumper module may also help you to see what is in your data structure (perhaps while setting $Data::Dumper::Maxdepth = N to limit the dump to N number of levels if the structure is big).
UPDATE
Now that you are using ForceArray, $data->{file} should always point to an arrayref, which may possibly have multiple references to path. Here is a modified segment of your code to handle that. But note that the logic of the if-then-exit conditions may have to change.
if (defined $data->{class} and $data->{class}=~ /FileNotFound/) {
print "The directory: $Path does not exist\n";
exit;
}
exit if ! defined $data->{file};
# filter the list for the first file entry named test-out-00
my ( $file ) = grep {
defined $_->{path} && $_->{path} =~ /test-out-00/
} #{ $data->{file} };
exit if ! defined $file;
$size = $file->{size};
if ($size < 1024000) {
print "FILE SIZE:$size BYTES\n";
exit;
}
When using XML::Simple, the ForceArray option is one of the most important to understand, especially in cases when your input data has nested elements that can occur 1 or more times. For example:
use XML::Simple;
use Data::Dumper;
my #xml_snippets = (
'<opt> <name x="3" y="4">B</name> <name x="5" y="6">C</name> </opt>',
'<opt> <name x="1" y="2">A</name> </opt>',
);
for my $xs (#xml_snippets){
my $data = XMLin($xs, ForceArray => 0);
print Dumper($data);
}
Output:
$VAR1 = {
'name' => [ # Array ref because there are 2 <name> elements.
{
'y' => '4',
'content' => 'B',
'x' => '3'
},
{
'y' => '6',
'content' => 'C',
'x' => '5'
}
]
};
$VAR1 = {
'name' => { # No intermediate array ref.
'y' => '2',
'content' => 'A',
'x' => '1'
}
};
By activating the ForceArray option, you can direct XML::Simple to produce consistent data structures that always use the intermediate array reference, even when there is only 1 of a particular nested element. You can activate the option globally or for specific tags, as illustrated here:
my $data = XMLin($xs, ForceArray => 1 ); # Globally.
my $data = XMLin($xs, ForceArray => [qw(name foo bar)]);
First, I recommend that you use ForceArray => [qw( file )] as previously discussed. That will cause an array to be returned for file, whether there's one or more file element. This is easier to handle than having two possible formats.
As I previously indicated, the problem is that you made no provision for looping over multiple file elements. You said you wanted to exit if the file doesn't exist, so that means you want
my $found;
for my $file (#{ $data->{file} }) {
if ($file->{path} =~ m{/test-out-00\z}) {
$found = $file;
last;
}
}
die("Test file not found\n") if !$found;
... do something with file data in $found ...

Perl - Removing unwanted elements from an arrayref

I'm writing a script that parses the "pure-ftpwho -s" command to get a list of the current transfers. But when a user disconnects from the FTP and reconnects back and resumes a transfer, the file shows up twice. I want to remove the ghosted one with Perl. After parsing, here is what the arrayref looks like (dumped with Data::Dumper)
$VAR1 = [
{
'status' => 'DL',
'percent' => '20',
'speed' => '10',
'file' => 'somefile.txt',
'user' => 'user1',
'size' => '14648'
},
{
'status' => 'DL',
'percent' => '63',
'speed' => '11',
'file' => 'somefile.txt',
'user' => 'user1',
'size' => '14648'
},
{
'status' => 'DL',
'percent' => '16',
'speed' => '60',
'file' => 'somefile.txt',
'user' => 'user2',
'size' => '14648'
}
];
Here user1 and user2 are downloading the same file, but user1 appears twice because the first one is a "ghost". What's the best way to check and remove elements that I don't need (in this case the first element of the arrayref). The condition to check is that - If the "file" key and "user" key is the same, then delete the hashref that contains the smaller value of "percent" key (if they're the same then delete all except one).
If order in the original arrayref doesn't matter, this should work:
my %users;
my #result;
for my $data (#$arrayref) {
push #{ $users{$data->{user}.$data->{file}} }, $data;
}
for my $value (values %users) {
my #data = sort { $a->{percent} <=> $b->{percent} } #$value;
push #result, $data[-1];
}
This can definitely be improved for efficiency.
The correct solution in this case would have been to use a hash when parsing the log file. Put all information into a hash, say %log, keyed by user and file:
$log{$user}->{$file} = {
'status' => 'DL',
'percent' => '20',
'speed' => '10',
'size' => '14648'
};
etc. Latter entries in the log file would overwrite earlier ones. Alternatively, you can choose to overwrite entries with lower percent completed with ones that have higher completion rates.
Using a hash would get rid of a lot of completely superfluous code working around the choice of the wrong data structure.
For what it's worth, here's my (slightly) alternative approach. Again, it doesn't preserve the original order:
my %most_progress;
for my $data ( sort { $b->{percent} <=> $a->{percent} } #$data ) {
next if exists $most_progress{$data->{user}.$data->{file}};
$most_progress{$data->{user}.$data->{file}} = $data;
}
my #clean_data = values %most_progress;
This will preserve order:
use strict;
use warnings;
my $data = [ ... ]; # As posted.
my %pct;
for my $i ( 0 .. $#{$data} ){
my $r = $data->[$i];
my $k = join '|', $r->{file}, $r->{user};
next if exists $pct{$k} and $pct{$k}[1] >= $r->{percent};
$pct{$k} = [$i, $r->{percent}];
}
#$data = #$data[sort map $_->[0], values %pct];
my %check;
for (my $i = 0; $i <= $#{$arrayref}; $i++) {
my $transfer = $arrayref->[$i];
# check the transfer for user and file
my $key = $transfer->{user} . $transfer->{file};
$check{$key} = { } if ( !exists $check{$key} );
if ( $transfer->{percent} <= $check{$key}->{percent} ) {
# undefine this less advanced transfer
$arrayref->[$i] = undef;
} else {
# remove the other transfer
$arrayref->[$check{$key}->{index}] = undef if exists $check{$key}->{index};
# set the new standard
$check{$key} = { index => $i, percent => $transfer->{percent} }
}
}
# remove all undefined transfers
$arrayref = [ grep { defined $_ } #$arrayref ];
Variation on the theme with Perl6::Gather
use Perl6::Gather;
my #cleaned = gather {
my %seen;
for (sort { $b->{percent} <=> $a->{percent} } #$data) {
take unless $seen{ $_->{user} . $_->{file} }++;
}
};