insert anonymous hash into anonymous hash for counting in a loop - perl

I'm trying to count starts and stops of some services i keep track of in logs.
I'm not going to past here entire code, but my way of doing hash is this:
I'm passing those starts and stops into anonymous hash .
First I'm creating anonymous hash filled with keys and values (in my case $knot is a key an zeros are values). Next im replaqcing values with another hash.
My code looks like this:
foreach $knot (#knots){
chomp $knot;
$variable = $variable."$knot;0;";
$Services = {split(/;/,$variable)};
}
my $data =
{
Starts=>'0',
Stops=>'0',
};
foreach my $key (keys %$Services) {
$Services->{$key} = $data;
}
print Dumper $Services;
Printing shows:
$VAR1 = {
' knot1' => {
'Stops' => '0',
'Starts' => '0'
},
' knot2' => $VAR1->{' knot1'},
' knot3' => $VAR1->{' knot1'},
' knot4' => $VAR1->{' knot1'},
' knot5' => $VAR1->{' knot1'},
and so on. Is there a better way of doing this? My way if i'm correct is badly written because changing knot1 starts/stops changes every other knot values.

Counting is very simple in Perl, thanks to Autovivification. You can just create anonymous data structures on the fly, like so:
use Data::Dumper;
my %hash = ();
$hash{apple}{green}++;
$hash{apple}{red} ++;
$hash{pear}{yellow}++;
$hash{apple}{green}++;
$hash{apple}{red} ++;
$hash{apple}{green}++;
print Dumper(\%hash);
This will produce the desired structure for counting:
$VAR1 = {
'apple' => {
'green' => 3,
'red' => 2
},
'pear' => {
'yellow' => 1
}
};
This also works in loops using variables (here using a reference to a hash):
my $hash_ref = {};
for my $fruit (qw( apple pear apple peach apple pear )) {
$hash_ref->{$fruit}++;
}
print Dumper($hash_ref);
resulting in:
$VAR1 = {
'peach' => 1,
'pear' => 2,
'apple' => 3
};

Related

Array of Hashrefs: How to access based on hashref "column" values

I have an array of hashrefs built from a database using fethrow_hashref(). The data structure is built like so:
while (my $ref = $sth->fetchrow_hashref()) {
push #lines, $ref;
}
I sort the data in the query by program name ascending, so all of the references in the array are still in alphabetical order. Then, I go through each hash and find the value that is numerically equal to a '1'. I then take the caolumn name, and store it to compare to the rest of the hashrefs with that program name to ensure they all have a '1' in the same column.
my $pgm = "";
my $met_lvl = "";
my #devs = ();
my %errors = ();
my $error = "";
foreach my $line_ref (#lines) {
if ($pgm ne $line_ref->{"PROGRAM"}) {
if (#devs && $error) {
# print " Different number metal layers for $pgm: #devs \n";
$error = "";
}
#devs = ();
$pgm = $line_ref->{"PROGRAM"};
($met_lvl) = grep { $line_ref->{$_} == 1 } keys(%$line_ref);
push #devs, $line_ref->{"DEVICE"};
} elsif ($pgm eq $line_ref->{"PROGRAM"}) {
push #devs, $line_ref->{"DEVICE"};
my ($met_chk ) = grep { $line_ref->{$_} == 1 } keys(%$line_ref);
if ($met_chk ne $met_lvl) {
$errors{$line_ref->{"PROGRAM"}} = $line_ref->{"PROGRAM"};
$error = "YUP";
}
}
}
I'd like to be able to access the hashrefs individually, based on matching column names from the database. How can I access the hashrefs with "TEST" values for "PROGRAM" keys? I used Data::Dumper to provide an example of a few of the hashrefs I'd like to access based on "PROGRAM" value:
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV1',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'TEST'
};
$VAR455 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV2',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'TEST'
};
$VAR456 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV3',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NON_STANDARD',
'PROGRAM' => 'EXP'
};
$VAR457 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV4',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'FINAL'
};
I'd like to be able to access key values for the hashrefs which contain the same program name. I cannot even begin to figure out what type of operation to use for this. I assume map is the correct way to do it, but dereferencing the "PROGAM" value for each element (hashref) in the array is beyond the scope of my understanding. I hope I was able to define the problem well enough for you guys to be able to help.
Edit: The impetus for wanting to access hashrefs with the same 'PROGRAM" value is to be able to provide an output of selected values to print to a logfile. So, after I compare and find differences between those hashrefs with the same "PROGRAM" value, I want to access them all again, and print out the desired column values to the lofgile.
Looks like you need to exrtact subsets of your data (hashrefs) with the same PROGRAM name.
Can preprocess your data to build a hash with those names as keys, and arrayrefs (with suitable hashrefs) as values. Then process those groups one at a time.
use warnings;
use strict;
use feature 'say';
use Data::Dumper; # to print complex data below
... populate #lines with hashrefs as in the question or copy-paste a sample
# Build hash: ( TEST => [ hashrefs w/ TEST ], EXP => [ hashrefs w/ EXP ], ... )
my %prog_subset;
for my $hr (#lines) {
push #{ $prog_subset{$hr->{PROGRAM}} }, $hr;
# Or, using "postfix dereferencing" (stable from v5.24)
# push $prog_subset{$hr->{PROGRAM}}->#*, $hr;
}
foreach my $prog (keys %prog_subset) {
say "\nProcess hashrefs with PROGRAM being $prog";
foreach my $hr (#{ $prog_subset{$prog} }) {
say Dumper $hr;
}
}
(See postfix dereference)
Now %prog_subset contains keys TEST, EXP, FINAL (and whatever other PROGRAM names are in data), each having for value an arrayref of all hashrefs which have that PROGRAM name.
There are other ways, and there are libraries that can be leveraged, but this should do it.
OK! I found an example of this with the google machine. I replaced #lines = (); with $lines = [];. This allowed me to change the grep statement to (#found) = grep { $pgm eq $_->{PROGRAM} } #$lines;. Now the returned array is a list of the hashrefs that share the program name I'm looking for. Thanks for the help #zdim!

Perl pass hash reference to Subroutine from Foreach loop (Array of Hash)

This may be something very simple for you but i have been trying for it for more than one hour.
Ok.. Here is my code,
#collection = [
{
'name' => 'Flo',
'model' => '23423',
'type' => 'Associate',
'id' => '1-23928-2392',
'age' => '23',
},
{
'name' => 'Flo1',
'model' => '23424',
'type' => 'Associate2',
'id' => '1-23928-23922',
'age' => '25',
}];
foreach my $row (#collection) {
$build_status = $self->build_config($row);
}
sub build_config {
my $self = shift;
my %collect_rows = #_;
#my %collect_rows = shift;
print Dumper %collect_rows; exit;
exit;
}
Where $row is really an Hash. But when i Print it.. It gives output like below (I am using Dumper),
$VAR1 = 'HASH(0x20a1d68)';
You are confusing references with hashes and arrays.
An array is declared with:
my #array = (1, 2, 3);
A hash is declared with:
my %hash = (1 => 2, 3 => 4);
That's because both hashes and arrays are simply lists (fancy lists, but I digress). The only time you need to use the [] and {} is when you want to use the values contained in the list, or you want to create a reference of either list (more below).
Note that the => is just a special (ie. fat) comma, that quotes the left-hand side, so although they do the same thing, %h = (a, 1) would break, %h = ("a", 1) works fine, and %h = (a => 1) also works fine because the a gets quoted.
An array reference is declared as such:
my $aref = [1, 2, 3];
...note that you need to put the array reference into a scalar. If you don't and do it like this:
my #array = [1, 2, 3];
... the reference is pushed onto the first element of #array, which is probably not what you want.
A hash reference is declared like this:
my $href = {a => 1, b => 2};
The only time [] is used on an array (not an aref) is when you're using it to use an element: $array[1];. Likewise with hashes, unless it's a reference, you only use {} to get at a key's value: $hash{a}.
Now, to fix your problem, you can keep using the references with these changes:
use warnings;
use strict;
use Data::Dumper;
# declare an array ref with a list of hrefs
my $collection = [
{
'name' => 'Flo',
...
},
{
'name' => 'Flo1',
...
}
];
# dereference $collection with the "circumfix" operator, #{}
# ...note that $row will always be an href (see bottom of post)
foreach my $row (#{ $collection }) {
my $build_status = build_config($row);
print Dumper $build_status;
}
sub build_config {
# shift, as we're only accepting a single param...
# the $row href
my $collect_rows = shift;
return $collect_rows;
}
...or change it up to use non-refs:
my #collection = (
{
'name' => 'Flo',
...
},
{
'name' => 'Flo1',
...
}
);
foreach my $row (#collection) {
my $build_status = build_config($row);
# build_config() accepts a single href ($row)
# and in this case, returns an href as well
print Dumper $build_status;
}
sub build_config {
# we've been passed in a single href ($row)
my $row = shift;
# return an href as a scalar
return $row;
}
I wrote a tutorial on Perl references you may be interested in guide to Perl references. There's also perlreftut.
One last point... the hashes are declared with {} inside of the array because they are indeed hash references. With multi-dimentional data, only the top level of the structure in Perl can contain multiple values. Everything else underneath must be a single scalar value. Therefore, you can have an array of hashes, but technically and literally, it's an array containing scalars where their value is a pointer (reference) to another structure. The pointer/reference is a single value.

Problems with sorting a hash of hashes by value in Perl

I'm rather inexperienced with hashes of hashes - so I hope someone can help a newbie...
I have the following multi-level hash:
$OCRsimilar{$ifocus}{$theWord}{"form"} = $theWord;
$OCRsimilar{$ifocus}{$theWord}{"score"} = $OCRscore;
$OCRsimilar{$ifocus}{$theWord}{"distance"} = $distance;
$OCRsimilar{$ifocus}{$theWord}{"similarity"} = $similarity;
$OCRsimilar{$ifocus}{$theWord}{"length"} = $ilength;
$OCRsimilar{$ifocus}{$theWord}{"frequency"} = $OCRHashDict{$ikey}{$theWord};
Later, I need to sort each second-level element ($theWord) according to the score value. I've tried various things, but have failed so far. The problem seems to be that the sorting introduces new empty elements in the hash that mess things up.
What I have done (for example - I'm sure this is far from ideal):
my #flat = ();
foreach my $key1 (keys { $OCRsimilar{$ifocus} }) {
push #flat, [$key1, $OCRsimilar{$ifocus}{$key1}{'score'}];
}
for my $entry (sort { $b->[1] <=> $a->[1] } #flat) {
print STDERR "#$entry[0]\t#$entry[1]\n";
}
If I check things with Data::Dumper, the hash contains for example this:
'uroadcast' => {
'HASH(0x7f9739202b08)' => {},
'broadcast' => {
'frequency' => '44',
'length' => 9,
'score' => '26.4893274374278',
'form' => 'broadcast',
'distance' => 1,
'similarity' => 1
}
}
If I don't do the sorting, the hash is fine. What's going on?
Thanks in advance for any kind of pointers...!
Just tell sort what to sort on. No other tricks are needed.
#!/usr/bin/perl
use warnings;
use strict;
my %OCRsimilar = (
focus => {
word => {
form => 'word',
score => .2,
distance => 1,
similarity => 1,
length => 4,
frequency => 22,
},
another => {
form => 'another',
score => .01,
distance => 1,
similarity => 1,
length => 7,
frequency => 3,
},
});
for my $word (sort { $OCRsimilar{focus}{$a}{score} <=> $OCRsimilar{focus}{$b}{score} }
keys %{ $OCRsimilar{focus} }
) {
print "$word: $OCRsimilar{focus}{$word}{score}\n";
}
Pointers: perlreftut, perlref, sort.
What seems suspicious to me is this construct:
foreach my $key1 (keys { $OCRsimilar{$ifocus} }) {
Try dereferencing the hash, so it becomes:
foreach my $key1 (keys %{ $OCRsimilar{$ifocus} }) {
Otherwise, you seem to be creating an anonymous hash and taking the keys of it, equivalent to this code:
foreach my $key1 (keys { $OCRsimilar{$ifocus} => undef }) {
Thus, I think $key1 would equal $OCRsimilar{$ifocus} inside the loop. Then, I think Perl will do auto-vivification when it encounters $OCRsimilar{$ifocus}{$key1}, adding the hash reference $OCRsimilar{$ifocus} as a new key to itself.
If you use warnings;, the program ought to complain Odd number of elements in anonymous hash.
Still, I don't understand why Perl doesn't do further auto-vivication and add 'score' as the key, showing something like 'HASH(0x7f9739202b08)' => { 'score' => undef }, in the Data dump.

Storing SOAP response as hash

I'm trying to store the response from a Perl (v5.12.4) SOAP call (SQL query) as a hash for further processing while maintaining the structure of the data. The columns will vary so I'd like to be able to build the column keys dynamically. Here is an example of the reply:
$VAR1 = {
'row' => [
#0
{
'column1' => '',
'column2' => 'f',
'column3' => '0',
'column4' => '',
'column5' => 'f',
'column6' => '0',
'column7' => 'f',
},
]
};
I'm interested in the column names and values. I've made some (comical) attempts at handling this, such as:
unless ($res->fault) {
$Data::Dumper::Indent = 3;
my (%reply) = $res->paramsall();
foreach my $keys (keys %reply) {
print "$keys\n";
}
}
This gets a "Reference found where even-sized list expected at script.pl line...". Any point in the right direction here is much appreciated.
Thanks
This :
$res->paramsall()
almost certainy returns a hashref.
So what you are doing is assigning the hashref to the 1st key of %reply.
my %reply = %{ $res->paramsall() };
Will probably work.
You should take a look at perlreftut for some methods to work with refrerence structures.
Dereferencing hashes with my %hash = %{ } makes a copy of the referenced hash.
You can work directly with hashrefs like this :
my $reply = $res->paramsall();
foreach my $key1 (keys %$reply) {
foreach my $key2 (keys %{ $reply->{$key1} }) {
print " $key2: " . $reply->{$key1}{$key2} . "\n";
}
}
In the 1st code example in your reply, this :
$reply{$key2}->{$key2}
should read :
$reply{$key1}{$key2}
since everything is already dereferenced.

Perl - Removing unwanted elements from an arrayref

I'm writing a script that parses the "pure-ftpwho -s" command to get a list of the current transfers. But when a user disconnects from the FTP and reconnects back and resumes a transfer, the file shows up twice. I want to remove the ghosted one with Perl. After parsing, here is what the arrayref looks like (dumped with Data::Dumper)
$VAR1 = [
{
'status' => 'DL',
'percent' => '20',
'speed' => '10',
'file' => 'somefile.txt',
'user' => 'user1',
'size' => '14648'
},
{
'status' => 'DL',
'percent' => '63',
'speed' => '11',
'file' => 'somefile.txt',
'user' => 'user1',
'size' => '14648'
},
{
'status' => 'DL',
'percent' => '16',
'speed' => '60',
'file' => 'somefile.txt',
'user' => 'user2',
'size' => '14648'
}
];
Here user1 and user2 are downloading the same file, but user1 appears twice because the first one is a "ghost". What's the best way to check and remove elements that I don't need (in this case the first element of the arrayref). The condition to check is that - If the "file" key and "user" key is the same, then delete the hashref that contains the smaller value of "percent" key (if they're the same then delete all except one).
If order in the original arrayref doesn't matter, this should work:
my %users;
my #result;
for my $data (#$arrayref) {
push #{ $users{$data->{user}.$data->{file}} }, $data;
}
for my $value (values %users) {
my #data = sort { $a->{percent} <=> $b->{percent} } #$value;
push #result, $data[-1];
}
This can definitely be improved for efficiency.
The correct solution in this case would have been to use a hash when parsing the log file. Put all information into a hash, say %log, keyed by user and file:
$log{$user}->{$file} = {
'status' => 'DL',
'percent' => '20',
'speed' => '10',
'size' => '14648'
};
etc. Latter entries in the log file would overwrite earlier ones. Alternatively, you can choose to overwrite entries with lower percent completed with ones that have higher completion rates.
Using a hash would get rid of a lot of completely superfluous code working around the choice of the wrong data structure.
For what it's worth, here's my (slightly) alternative approach. Again, it doesn't preserve the original order:
my %most_progress;
for my $data ( sort { $b->{percent} <=> $a->{percent} } #$data ) {
next if exists $most_progress{$data->{user}.$data->{file}};
$most_progress{$data->{user}.$data->{file}} = $data;
}
my #clean_data = values %most_progress;
This will preserve order:
use strict;
use warnings;
my $data = [ ... ]; # As posted.
my %pct;
for my $i ( 0 .. $#{$data} ){
my $r = $data->[$i];
my $k = join '|', $r->{file}, $r->{user};
next if exists $pct{$k} and $pct{$k}[1] >= $r->{percent};
$pct{$k} = [$i, $r->{percent}];
}
#$data = #$data[sort map $_->[0], values %pct];
my %check;
for (my $i = 0; $i <= $#{$arrayref}; $i++) {
my $transfer = $arrayref->[$i];
# check the transfer for user and file
my $key = $transfer->{user} . $transfer->{file};
$check{$key} = { } if ( !exists $check{$key} );
if ( $transfer->{percent} <= $check{$key}->{percent} ) {
# undefine this less advanced transfer
$arrayref->[$i] = undef;
} else {
# remove the other transfer
$arrayref->[$check{$key}->{index}] = undef if exists $check{$key}->{index};
# set the new standard
$check{$key} = { index => $i, percent => $transfer->{percent} }
}
}
# remove all undefined transfers
$arrayref = [ grep { defined $_ } #$arrayref ];
Variation on the theme with Perl6::Gather
use Perl6::Gather;
my #cleaned = gather {
my %seen;
for (sort { $b->{percent} <=> $a->{percent} } #$data) {
take unless $seen{ $_->{user} . $_->{file} }++;
}
};