Optimize perl hash mess - perl

I have Perl code, which looks messy:
my $x = $h->[1];
foreach my $y (keys %$x) {
my $ax = $x->{$y};
foreach my $ay (keys %$ax) {
if (ref($ax->{$ay}) eq 'JE::Object::Proxy') {
my $bx = $ax->{$ay};
if ($$bx->{class_info}->{name} eq 'HTMLImageElement') {
print $$bx->{value}->{src}, "\n";
}
}
}
}
Is it possible to optimize the code above to not use any variables, just $h, as that one is an input?

Here's my crack at it:
print $$_->{value}{src}, "\n" for grep {
ref $_ eq 'JE::Object::Proxy' &&
$$_->{class_info}{name} eq 'HTMLImageElement'
} map {
values %$_
} values %{ $h->[1] };

You're using keys, when you really just want values.
foreach my $h ( grep { ref() eq 'HASH' } values %$x ) {
foreach my $obj (
grep { ref() eq 'JE::Object::Proxy'
and $_->{class_info}{name} eq 'HTMLImageElement'
} values %$h
) {
say $obj->{value}{src};
}
}

A lot of the "messiness" can be cleaned up by reducing your line count and minimizing how much nested code you have. Use the each command to get the next key and its associated value from the hash in one line. [EDIT: as Axeman pointed out, you really only need the values, so I'm replacing my use of each with values]. Also, use a pair of next statement to skip the print statement.
for my $ax (values %{$h->[1]} ) {
for my $bx (values %$ax ) {
next unless ref($bx) eq 'JE::Object::Proxy';
next unless $$bx->{class_info}->{name} eq 'HTMLImageElement';
print "$$bx->{value}->{src}\n";
}
}

Just removing the helper variables is easy, something like this should do it:
foreach my $y (keys %{$h->[1]}) {
foreach my $ax (%{$h->[1]->{$y}) {
foreach my $ay (keys %$ax) {
if(ref($h->[1]->{$y}->{$ay}) eq 'JE::Object::Proxy') {
if($h->[1]->{$y}->{$ay}->{class_info}->{name} eq 'HTMLImageElement') {
print $h->[1]->{$y}->{$ay}->{value}->{src}, "\n";
}
}
}
}
}
You could also remove the duplicated if:
foreach my $y (keys %{$h->[1]}) {
foreach my $ax (%{$h->[1]->{$y}) {
foreach my $ay (keys %$ax) {
if(ref($h->[1]->{$y}->{$ay}) eq 'JE::Object::Proxy' && $h->[1]->{$y}->{$ay}->{class_info}->{name} eq 'HTMLImageElement') {
print $h->[1]->{$y}->{$ay}->{value}->{src}, "\n";
}
}
}
}
But I don't really see how to make it more readable: it is a iteration over a three dimensional structure.

Related

Perl: multidimensional hash

suppose I have the following data
cluster1:d(A),f(C)s,(A)
cluster2:r(D),h(D),f(A)
I want this out put
output:
cluster1:A->2
cluster1:C->1
cluster2:D->2
cluster2:A->1
here is my try,but it is not correct , the part that I am trying to count characters has a problem that I cant fix
the code is a part of very big code ,and I want exactly multidimensional hash
use strict;
use Data::Dumper;
my %count;
while (<DATA>) {
my %HoH;
my ( $cluster, $ch ) = split (/:/,$_);
$HoH{$cluster}={split /[()]+/,$ch};
for my $clust ( keys %HoH ) {
for my $character ( keys %{ $HoH{$clust} } ) {
$count{$clust}{$HoH{$clust}{$character}}++;
}
}
}
print Dumper(\%count);
foreach my $name (sort keys %count) {
foreach my $subject (keys %{$count{$name}}) {
print "$name:$subject->$count{$name}{$subject}\n";
}
}
DATA
cluster1:d(A),f(C)s,(A)
cluster2:r(D),h(D),f(A)
It would be nice if you try to understand the below code so that you can get an idea for solving the problem:-
use strict;
use Data::Dumper;
my $data = "cluster1:A,B,C,A";
my %cluster = ();
my ($cluster_key, $cluster_val ) = split (':', $data);
my #cluster1_data = split(',', $cluster_val);
foreach my $val ( #cluster1_data ) {
$cluster{$cluster_key}{$val}++;
}
print Dumper(\%cluster);
foreach my $clus ( keys %cluster ) {
my $clus_ref = $cluster{$clus};
foreach my $clu ( keys %{ $clus_ref } ){
my $count = $clus_ref->{$clu};
print"$clus:$clu->$count\n";
}
}
Output:
$VAR1 = {
'cluster1' => {
'A' => 2,
'C' => 1,
'B' => 1
}
};
cluster1:A->2
cluster1:C->1
cluster1:B->1
What do you expect $count{$cluster}{$characters}+=1; to do exactly? You have to loop over your input data to populate %count if you expect to get the desired result:
while (<DATA>) {
next unless /^(cluster\d+):(.+)/;
$count{$1}{$_}++ for split/,/, $2;
}
If you also add sort to the second foreach you'll get the output you want.
EDIT: This solves the question for the updated input and requirements:
my %count;
while (<DATA>) {
next unless /^(cluster\d+):(.+)/;
my $cluster = $1;
$count{$cluster}{$_}++ for $2 =~ /\((\w)\)/g;
}
for my $key (sort keys %count) {
for my $value (sort {
$count{$key}{$b} <=> $count{$key}{$a}
} keys %{$count{$key}}) {
print "$key:$value->$count{$key}{$value}\n";
}
}

compare multiple hashes for common keys merge values

I have a working bit of code here where I am comparing the keys of six hashes together to find the ones that are common amongst all of them. I then combine the values from each hash into one value in a new hash. What I would like to do is make this scaleable. I would like to be able to easily go from comparing 3 hashes to 100 without having to go back into my code and altering it. Any thoughts on how I would achieve this? The rest of the code already works well for different input amounts, but this is the one part that has me stuck.
my $comparison = List::Compare->new([keys %{$posHashes[0]}], [keys %{$posHashes[1]}], [keys %{$posHashes[2]}], [keys %{$posHashes[3]}], [keys %{$posHashes[4]}], [keys %{$posHashes[5]}]);
my %comboHash;
for ($comparison->get_intersection) {
$comboHash{$_} = ($posHashes[0]{$_} . $posHashes[1]{$_} . $posHashes[2]{$_} . $posHashes[3]{$_} . $posHashes[4]{$_} . $posHashes[5]{$_});
}
my %all;
for my $posHash (#posHashes) {
for my $key (keys(%$posHash)) {
push #{ $all{$key} }, $posHash->{$key};
}
}
my %comboHash;
for my $key (keys(%all)) {
next if #{ $all{$key} } != #posHashes;
$comboHash{$key} = join('', #{ $all{$key} });
}
Just make a subroutine and pass it hash references
my $combination = combine(#posHashes);
sub combine {
my #hashes = #_;
my #keys;
for my $href (#hashes) {
push #keys, keys %$href;
}
# Insert intersection code here..
# .....
my %combo;
for my $href (#hashes) {
for my $key (#intersection) {
$combo{$key} .= $href->{$key};
}
}
return \%combo;
}
Create a subroutine:
sub combine_hashes {
my %result = ();
my #hashes = #_;
my $first = shift #hashes;
for my $element (keys %$first) {
my $count = 0;
for my $href (#hashes) {
$count += (grep {$_ eq $element} (keys %$href));
}
if ($count > $#hashes) {
$result{$element} = $first->{$element};
$result{$element} .= $_->{$element} for #hashes;
}
}
\%result;
}
and call it by:
my %h = %{combine_hashes(\%h1, \%h2, \%h3)};
...or as:
my %h = %{combine_hashes(#posHashes)};
There is pretty straightforward solution:
sub merge {
my $first = shift;
my #hashes = #_;
my %result;
KEY:
for my $key (keys %$first) {
my $accu = $first->{$key};
for my $hash (#hashes) {
next KEY unless exists $hash->{$key};
$accu .= $hash->{$key};
}
$result{$key} = $accu;
}
return \%result;
}
You have to call it with references to hashes and it will return also hash reference e.g.:
my $comboHashRef = merge(#posHashes);

Perl Hashref Substitutions

I'm using DBI to connect to Sybase to grab records in a hash_ref element. The DBI::Sybase driver has a nasty habit of returning records with trailing characters, specifically \x00 in my case. I'm trying to write a function to clean this up for all elements in the hashref, the code I have below does the trick, but I can't find a way to make it leaner, and I know there is away to do this better:
#!/usr/bin/perl
my $dbh = DBI->connect('dbi:Sybase:...');
my $sql = qq {SELECT * FROM table WHERE age > 18;};
my $qry = $dbh->selectall_hashref($sql, 'Name');
foreach my $val(values %$qry) {
$qry->{$val} =~ s/\x00//g;
}
foreach my $key(keys %$qry) {
$qry->{$key} =~ s/\x00//g;
foreach my $val1(keys %{$qry->{$key}}) {
$qry->{$key}->{$val1} =~ s/\x00//g;
}
foreach my $key1(keys %{$qry->{$key}}) {
$qry->{$key}->{$key1} =~ s/\x00//g;
}
While I think that a regex substitution is not exactly an ideal solution (seems like it should be fixed properly instead), here's a handy way to solve it with chomp.
use Data::Dumper;
my %a = (
foo => {
a => "foo\x00",
b => "foo\x00"
},
bar => {
c => "foo\x00",
d => "foo\x00"
},
baz => {
a => "foo\x00",
a => "foo\x00"
}
);
$Data::Dumper::Useqq=1;
print Dumper \%a;
{
local $/ = "\x00";
chomp %$_ for values %a;
}
print Dumper \%a;
chomp will remove a single trailing value equal to whatever the input record separator $/ is set to. When used on a hash, it will chomp the values.
As you will note, we do not need to use the values directly, as they are aliased. Note also the use of a block around the local $/ statement to restrict its scope.
For a more manageable solution, it's probably best to make a subroutine, called recursively. I used chomp again here, but you can just as easily skip that and use s/\x00//g. Or tr/\x00//d, which basically does the same thing. chomp is only safer in that it only removes characters from the end of the string, like s/\x00$// would.
strip_null(\%a);
print Dumper \%a;
sub strip_null {
local $/ = "\x00";
my $ref = shift;
for (values %$ref) {
if (ref eq 'HASH') {
strip_null($_); # recursive strip
} else {
chomp;
}
}
}
First your code:
foreach my $val(values %$qry) {
$qry->{$val} =~ s/\x00//g;
# here you are using a value as if it was a key
}
foreach my $key(keys %$qry) {
$qry->{$key} =~ s/\x00//g;
foreach my $val1(keys %{$qry->{$key}}) {
$qry->{$key}->{$val1} =~ s/\x00//g;
}
foreach my $key1(keys %{$qry->{$key}}) {
$qry->{$key}->{$key1} =~ s/\x00//g;
}
# and this does the same thing twice...
what you should do is:
foreach my $x (values %$qry) {
foreach my $y (ref $x eq 'HASH' ? values %$x : $x) {
$y =~ s/(?:\x00)+$//
}
}
which will clean up only ending nulls in the values of two levels of the hash.
the body of the loop could also be written as:
if (ref $x eq 'HASH') {
foreach my $y (values %$x) {
$y =~ s/(?:\x00)+$//
}
}
else {
$x =~ s/(?:\x00)+$//
}
But that forces you to write the substitution twice, and you shouldn't repeat yourself.
Or if you really want to reduce the code, using the implicit $_ variable works well:
for (values %$qry) {
s/(?:\x00)+$// for ref eq 'HASH' ? values %$_ : $_
}

How to construct a hash of hashes

I need to compare two hashes, but I can't get the inner set of keys...
my %HASH = ('first'=>{'A'=>50, 'B'=>40, 'C'=>30},
'second'=>{'A'=>-30, 'B'=>-15, 'C'=>9});
foreach my $key (keys(%HASH))
{
my %innerhash = $options{$key};
foreach my $inner (keys(%innerhash))
{
print "Match: ".$otherhash{$key}->{$inner}." ".$HASH{$key}->{$inner};
}
}
$options{$key} is a scalar (you can tell be the leading $ sigil). You want to "dereference" it to use it as a hash:
my %HASH = ('first'=>{'A'=>50, 'B'=>40, 'C'=>30},
'second'=>{'A'=>-30, 'B'=>-15, 'C'=>9});
foreach my $key (keys(%HASH))
{
my %innerhash = %{ $options{$key} }; # <---- note %{} cast
foreach my $inner (keys(%innerhash))
{
print "Match: ".$otherhash{$key}->{$inner}." ".$HASH{$key}->{$inner};
}
}
When you're ready to really dive into these things, see perllol, perldsc and perlref.
I'm guessing you say "options" there where you mean "HASH"?
Hashes only store scalars, not other hashes; each value of %HASH is a hash reference that needs to be dereferenced, so your inner loop should be:
foreach my $inner (keys(%{ $HASH{$key} })
Or:
my %HASH = ('first'=>{'A'=>50, 'B'=>40, 'C'=>30},
'second'=>{'A'=>-30, 'B'=>-15, 'C'=>9});
foreach my $key (keys(%HASH))
{
my $innerhash = $HASH{$key};
foreach my $inner (keys(%$innerhash))
{
print "Match: ".$otherhash{$key}->{$inner}." ".$innerhash->{$inner};
}
}

Search object array for matching possible multiple values using different comparison operators

I have a function to search an array of objects for a matching value using the eq operator, like so:
sub find {
my ( $self, %params ) = #_;
my #entries = #{ $self->{_entries} };
if ( $params{filename} ) {
#entries = grep { $_->filename eq $params{filename} } #entries;
}
if ( $params{date} ) {
#entries = grep { $_->date eq $params{date} } #entries;
}
if ( $params{title} ) {
#entries = grep { $_->title eq $params{title} } #entries;
}
....
I wanted to also be able to pass in a qr quoted variable to use in the comparison instead but the only way I can think of separating the comparisons is using an if/else block, like so:
if (lc ref($params{whatever}) eq 'regexp') {
#use =~
} else {
#use eq
}
Is there a shorter way of doing it? Because of reasons beyond my control I'm using Perl 5.8.8 so I can't use the smart match operator.
TIA
This is Perl, so of course there's a CPAN module: Match::Smart. It works very similarly to Perl 5.10's smart match operator, only you type smart_match($a, $b) rather than $a ~~ $b.
You may wish to compare with the perlsyn documentation for 5.10 smartmatching as Match::Smart handles quite a few more situations.
Otherwise, I don't see anything wrong with:
sub smart_match {
my ($target, $param) = #_;
if (ref $param eq 'Regexp') {
return ($target =~ qr/$param/);
}
else {
return ($target eq $param);
}
}
#entries = grep { smart_match($_->date, $params{date}) } #entries;
I don't know exactly what you want your end result to be, but you can do:
for my $field (qw(filename date title)) {
my $p = $param($field};
#entries = (ref($p) eq 'regexp')
? grep { $_->$field =~ /$p/ } #entries
: grep { $_->$field eq $p } #entries;
}
Alternatively, you can turn even your 'eq' comparisions into regular expressions, e.g.:
my $entry = "string to be equal to";
my $re = qr/^\Q$entry\E/;
and that simplifies the logic in the for loop.