troubleshooting "pseudo-hashes are deprecated" while using xml module

troubleshooting "pseudo-hashes are deprecated" while using xml module - perl

I am just learning how to use perl hashes and ran into this message in perl. I am using XML::Simple to parse xml output and using exists to check on the hash keys.
Message:
Pseudo-hashes are deprecated at ./h2.pl line 53.
Argument "\x{2f}\x{70}..." isn't numeric in exists at ./h2.pl line 53.
Bad index while coercing array into hash at ./h2.pl line 53.
I had the script working earlier with one test directory and then executed the script on another directory for testing when I got this message. How do I resolve/workaround this?
Code that the error references:
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;
#my $data = XMLin($xml);
my $data = XMLin($xml, ForceArray => [qw (file) ]);
my $size=0;
if (exists $data->{class}
and $data->{class}=~ /FileNotFound/) {
print "The directory: $Path does not exist\n";
exit;
} elsif (exists $data->{file}->{path}
and $data->{file}->{path} =~/test-out-00/) {
$size=$data->{file}->{size};
if ($size < 1024000) {
print "FILE SIZE:$size BYTES\n";
exit;
}
} else {
exit;
}
print Dumper( $data );
Working test case, data structure looks like this:
$VAR1 = {
'recursive' => 'no',
'version' => '0.20.202.1.1101050227',
'time' => '2011-09-30T02:49:39+0000',
'filter' => '.*',
'file' => {
'owner' => 'test_act',
'replication' => '3',
'blocksize' => '134217728',
'permission' => '-rw-------',
'path' => '/source/feeds/customer/test/test-out-00',
'modified' => '2011-09-30T02:48:41+0000',
'size' => '135860644',
'group' => '',
'accesstime' => '2011-09-30T02:48:41+0000'
'modified' => '2011-09-30T02:48:41+0000'
},
'exclude' => ''
};
recursive:no
version:0.20.202.1.1101050227
time:2011-10-01T07:06:16+0000
filter:.*
file:HASH(0x84c83ec)
path:/source/feeds/customer/test
directory:HASH(0x84c75d8)
exclude:
Data structure with seeing error:
$VAR1 = {
'recursive' => 'no',
'version' => '0.20.202.1.1101050227',
'time' => '2011-10-03T04:49:36+0000',
'filter' => '.*',
'file' => [
{
'owner' => 'test_act',
'replication' => '3',
'blocksize' => '134217728',
'permission' => '-rw-------',
'path' => '/source/feeds/customer/test/20110531/test-out-00',
'modified' => '2011-10-03T04:47:46+0000',
'size' => '121406618',
'group' => 'feeds',
'accesstime' => '2011-10-03T04:47:46+0000'
},
Test xml file:
<?xml version="1.0" encoding="UTF-8"?><listing time="2011-10-03T04:49:36+0000" recursive="no" path="/source/feeds/customer/test/20110531" exclude="" filter=".*" version="0.20.202.1.1101050227"><directory path="/source/feeds/customer/test/20110531" modified="2011-10-03T04:48:19+0000" accesstime="1970-01-01T00:00:00+0000" permission="drwx------" owner="test_act" group="feeds"/><file path="/source/feeds/customer/test/20110531/test-out-00" modified="2011-10-03T04:47:46+0000" accesstime="2011-10-03T04:47:46+0000" size="121406618" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group="feeds"/><file path="/source/feeds/customer/test/20110531/test-out-01" modified="2011-10-03T04:48:04+0000" accesstime="2011-10-03T04:48:04+0000" size="127528522" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group="feeds"/><file path="/source/feeds/customer/test/20110531/test-out-02" modified="2011-10-03T04:48:19+0000" accesstime="2011-10-03T04:48:19+0000" size="125452919" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group="feeds"/></listing>

The "Pseudo-hashes are deprecated" error means you're trying to access an array as a hash, which means that either $data->{file} or $data->{file}{path} is an arrayref.
You can check the data type by using print ref $data->{file}. The Data::Dumper module may also help you to see what is in your data structure (perhaps while setting $Data::Dumper::Maxdepth = N to limit the dump to N number of levels if the structure is big).
UPDATE
Now that you are using ForceArray, $data->{file} should always point to an arrayref, which may possibly have multiple references to path. Here is a modified segment of your code to handle that. But note that the logic of the if-then-exit conditions may have to change.
if (defined $data->{class} and $data->{class}=~ /FileNotFound/) {
print "The directory: $Path does not exist\n";
exit;
}
exit if ! defined $data->{file};
# filter the list for the first file entry named test-out-00
my ( $file ) = grep {
defined $_->{path} && $_->{path} =~ /test-out-00/
} #{ $data->{file} };
exit if ! defined $file;
$size = $file->{size};
if ($size < 1024000) {
print "FILE SIZE:$size BYTES\n";
exit;
}

When using XML::Simple, the ForceArray option is one of the most important to understand, especially in cases when your input data has nested elements that can occur 1 or more times. For example:
use XML::Simple;
use Data::Dumper;
my #xml_snippets = (
'<opt> <name x="3" y="4">B</name> <name x="5" y="6">C</name> </opt>',
'<opt> <name x="1" y="2">A</name> </opt>',
);
for my $xs (#xml_snippets){
my $data = XMLin($xs, ForceArray => 0);
print Dumper($data);
}
Output:
$VAR1 = {
'name' => [ # Array ref because there are 2 <name> elements.
{
'y' => '4',
'content' => 'B',
'x' => '3'
},
{
'y' => '6',
'content' => 'C',
'x' => '5'
}
]
};
$VAR1 = {
'name' => { # No intermediate array ref.
'y' => '2',
'content' => 'A',
'x' => '1'
}
};
By activating the ForceArray option, you can direct XML::Simple to produce consistent data structures that always use the intermediate array reference, even when there is only 1 of a particular nested element. You can activate the option globally or for specific tags, as illustrated here:
my $data = XMLin($xs, ForceArray => 1 ); # Globally.
my $data = XMLin($xs, ForceArray => [qw(name foo bar)]);

First, I recommend that you use ForceArray => [qw( file )] as previously discussed. That will cause an array to be returned for file, whether there's one or more file element. This is easier to handle than having two possible formats.
As I previously indicated, the problem is that you made no provision for looping over multiple file elements. You said you wanted to exit if the file doesn't exist, so that means you want
my $found;
for my $file (#{ $data->{file} }) {
if ($file->{path} =~ m{/test-out-00\z}) {
$found = $file;
last;
}
}
die("Test file not found\n") if !$found;
... do something with file data in $found ...

Related

Array of Hashrefs: How to access based on hashref "column" values

I have an array of hashrefs built from a database using fethrow_hashref(). The data structure is built like so:
while (my $ref = $sth->fetchrow_hashref()) {
push #lines, $ref;
}
I sort the data in the query by program name ascending, so all of the references in the array are still in alphabetical order. Then, I go through each hash and find the value that is numerically equal to a '1'. I then take the caolumn name, and store it to compare to the rest of the hashrefs with that program name to ensure they all have a '1' in the same column.
my $pgm = "";
my $met_lvl = "";
my #devs = ();
my %errors = ();
my $error = "";
foreach my $line_ref (#lines) {
if ($pgm ne $line_ref->{"PROGRAM"}) {
if (#devs && $error) {
# print " Different number metal layers for $pgm: #devs \n";
$error = "";
}
#devs = ();
$pgm = $line_ref->{"PROGRAM"};
($met_lvl) = grep { $line_ref->{$_} == 1 } keys(%$line_ref);
push #devs, $line_ref->{"DEVICE"};
} elsif ($pgm eq $line_ref->{"PROGRAM"}) {
push #devs, $line_ref->{"DEVICE"};
my ($met_chk ) = grep { $line_ref->{$_} == 1 } keys(%$line_ref);
if ($met_chk ne $met_lvl) {
$errors{$line_ref->{"PROGRAM"}} = $line_ref->{"PROGRAM"};
$error = "YUP";
}
}
}
I'd like to be able to access the hashrefs individually, based on matching column names from the database. How can I access the hashrefs with "TEST" values for "PROGRAM" keys? I used Data::Dumper to provide an example of a few of the hashrefs I'd like to access based on "PROGRAM" value:
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV1',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'TEST'
};
$VAR455 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV2',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'TEST'
};
$VAR456 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV3',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NON_STANDARD',
'PROGRAM' => 'EXP'
};
$VAR457 = {
'PLM' => undef,
'SLM' => undef,
'QLM' => undef,
'DEVICE' => 'DEV4',
'TLM' => '1',
'DLM' => undef,
'ROUTING' => 'NORMAL',
'PROGRAM' => 'FINAL'
};
I'd like to be able to access key values for the hashrefs which contain the same program name. I cannot even begin to figure out what type of operation to use for this. I assume map is the correct way to do it, but dereferencing the "PROGAM" value for each element (hashref) in the array is beyond the scope of my understanding. I hope I was able to define the problem well enough for you guys to be able to help.
Edit: The impetus for wanting to access hashrefs with the same 'PROGRAM" value is to be able to provide an output of selected values to print to a logfile. So, after I compare and find differences between those hashrefs with the same "PROGRAM" value, I want to access them all again, and print out the desired column values to the lofgile.

Looks like you need to exrtact subsets of your data (hashrefs) with the same PROGRAM name.
Can preprocess your data to build a hash with those names as keys, and arrayrefs (with suitable hashrefs) as values. Then process those groups one at a time.
use warnings;
use strict;
use feature 'say';
use Data::Dumper; # to print complex data below
... populate #lines with hashrefs as in the question or copy-paste a sample
# Build hash: ( TEST => [ hashrefs w/ TEST ], EXP => [ hashrefs w/ EXP ], ... )
my %prog_subset;
for my $hr (#lines) {
push #{ $prog_subset{$hr->{PROGRAM}} }, $hr;
# Or, using "postfix dereferencing" (stable from v5.24)
# push $prog_subset{$hr->{PROGRAM}}->#*, $hr;
}
foreach my $prog (keys %prog_subset) {
say "\nProcess hashrefs with PROGRAM being $prog";
foreach my $hr (#{ $prog_subset{$prog} }) {
say Dumper $hr;
}
}
(See postfix dereference)
Now %prog_subset contains keys TEST, EXP, FINAL (and whatever other PROGRAM names are in data), each having for value an arrayref of all hashrefs which have that PROGRAM name.
There are other ways, and there are libraries that can be leveraged, but this should do it.

OK! I found an example of this with the google machine. I replaced #lines = (); with $lines = [];. This allowed me to change the grep statement to (#found) = grep { $pgm eq $_->{PROGRAM} } #$lines;. Now the returned array is a list of the hashrefs that share the program name I'm looking for. Thanks for the help #zdim!

How to print an element from an array which is inside the hash in perl

I'm trying to print the outputs from an API which are in multidimensional format.
use strict;
use warnings;
use Data::Dumper;
my $content={
'school_set' => 'SSET1234',
'result' => [
{
'school_name' => 'school_abc',
'display_value' => 'IL25',
'school_link' => 'example.com',
'status' => 'registerd',
'status_message' => 'only arts',
'school_id' => '58c388d40596191f',
}
],
'school_table' => 'arts_schools'
};
print "school_name is=".$content{result}[0]{school_name};
print "school_status is=".$content{result}[3]{status};
output
Global symbol "%content" requires explicit package name (did you forget to declare "my %content"?) at test8.pl line 20.
Global symbol "%content" requires explicit package name (did you forget to declare "my %content"?) at test8.pl line 21.
I have to print the outputs like below from the result.
school_name = school_abc
school_status = registered

If $content is a hash reference, you need to dereference it first. Use the arrow operator for that:
$content->{result}[0]{school_name}
The syntax without the arrow is only possible for %content.
my %content = ( result => [ { school_name => 'abc' } ] );
print $content{result}[0]{school_name};
If you want to print all the results, you have to loop over the array somehow. For example
#!/usr/bin/perl
use warnings;
use strict;
my $content = {
'result' => [
{
'school_name' => 'school_abc',
'status' => 'registerd',
},
{
'school_name' => 'school_def',
'status' => 'pending',
}
],
};
for my $school (#{ $content->{result} }) {
print "school_name is $school->{school_name}, status is $school->{status}\n";
}

Your data structure assumes an array, perhaps it would be useful to utilize loop output for the data of interest.
The data presented as hash reference and will require de-referencing to loop through an array.
Following code snippet is based on your posted code and demonstrates how desired output can be achieved.
use strict;
use warnings;
use feature 'say';
my $dataset = {
'school_set' => 'SSET1234',
'result' => [
{
'school_name' => 'school_abc',
'display_value' => 'IL25',
'school_link' => 'example.com',
'status' => 'registerd',
'status_message' => 'only arts',
'school_id' => '58c388d40596191f',
}
],
'school_table' => 'arts_schools'
};
for my $item ( #{$dataset->{result}} ) {
say "school_name is = $item->{school_name}\n"
. "school_status is = $item->{status}";
}
exit 0;
Output
school_name is = school_abc
school_status is = registerd

How to get "Var1" value of dumper in perl

When i use below code then it gives output, But i want "width", "file_media_type", "file_ext" values, But I am unable to get this value in individual. I am very new with Perl Please help me!
Code
use warnings ;
use strict;
use Image::Info qw[image_info];
use Data::Dumper;
my $file = 'd:\perl\test\a.jpg';
my $info = Dumper(image_info($file));
print $info;
Output
$VAR1 = {
'width' => 45,
'file_media_type' => 'image/png',
'file_ext' => 'png',
'PNG_Chunks' => [
'IHDR',
'gAMA',
'cHRM',
'IDAT',
'IEND'
],
'Chunk-cHRM' => ' z% Çâ ·  ÇF u0 O` :ù ?o',
'PNG_Filter' => 'Adaptive',
'color_type' => 'RGB',
'height' => 20,
'Gamma' => '0.45454',
'resolution' => '1/1',
'SampleFormat' => 'U8',
'Compression' => 'Deflate'
};

image_info($file) returns a hash reference. Looking at the dump you know the keys available (the keys are strings before =>)
$info = image_info($file);
foreach my $key ( qw/width file_media_type file_ext/ ){
print "$key:$info->{$key}\n";
}

How to incorporate hash inside hash in perl?

my %book = (
'name' => 'abc',
'author' => 'monk',
'isbn' => '123-890',
'issn' => '#issn',
);
my %chapter = (
'title' => 'xyz',
'page' => '90',
);
How do I incorporate %book inside %chapter through reference so that when I write "$chapter{name}", it should print 'abc'?

You can copy the keys/values of the %book into the %chapter:
#chapter{keys %book} = values %book;
Or something like
%chapter = (%chapter, %book);
Now you can say $chapter{name}, but changes in %book are not reflected in %chapter.
You can include the %book via reference:
$chapter{book} = \%book;
Now you could say $chapter{book}{name}, and changes do get reflected.
To have an interface that allows you to say $chapter{name} and that does reflect changes, some advanced techniques would have to be used (this is fairly trivial with tie magic), but don't go there unless you really have to.

You could write a subroutine to check a list of hashes for a key. This program demonstrates:
use strict;
use warnings;
my %book = (
name => 'abc',
author => 'monk',
isbn => '123-890',
issn => '#issn',
);
my %chapter = (
title => 'xyz',
page => '90',
);
for my $key (qw/ name title bogus / ) {
print '>> ', access_hash($key, \%book, \%chapter), "\n";
}
sub access_hash {
my $key = shift;
for my $hash (#_) {
return $hash->{$key} if exists $hash->{$key};
}
undef;
}
output
Use of uninitialized value in print at E:\Perl\source\ht.pl line 17.
>> abc
>> xyz
>>

Perl - Removing unwanted elements from an arrayref

I'm writing a script that parses the "pure-ftpwho -s" command to get a list of the current transfers. But when a user disconnects from the FTP and reconnects back and resumes a transfer, the file shows up twice. I want to remove the ghosted one with Perl. After parsing, here is what the arrayref looks like (dumped with Data::Dumper)
$VAR1 = [
{
'status' => 'DL',
'percent' => '20',
'speed' => '10',
'file' => 'somefile.txt',
'user' => 'user1',
'size' => '14648'
},
{
'status' => 'DL',
'percent' => '63',
'speed' => '11',
'file' => 'somefile.txt',
'user' => 'user1',
'size' => '14648'
},
{
'status' => 'DL',
'percent' => '16',
'speed' => '60',
'file' => 'somefile.txt',
'user' => 'user2',
'size' => '14648'
}
];
Here user1 and user2 are downloading the same file, but user1 appears twice because the first one is a "ghost". What's the best way to check and remove elements that I don't need (in this case the first element of the arrayref). The condition to check is that - If the "file" key and "user" key is the same, then delete the hashref that contains the smaller value of "percent" key (if they're the same then delete all except one).

If order in the original arrayref doesn't matter, this should work:
my %users;
my #result;
for my $data (#$arrayref) {
push #{ $users{$data->{user}.$data->{file}} }, $data;
}
for my $value (values %users) {
my #data = sort { $a->{percent} <=> $b->{percent} } #$value;
push #result, $data[-1];
}
This can definitely be improved for efficiency.

The correct solution in this case would have been to use a hash when parsing the log file. Put all information into a hash, say %log, keyed by user and file:
$log{$user}->{$file} = {
'status' => 'DL',
'percent' => '20',
'speed' => '10',
'size' => '14648'
};
etc. Latter entries in the log file would overwrite earlier ones. Alternatively, you can choose to overwrite entries with lower percent completed with ones that have higher completion rates.
Using a hash would get rid of a lot of completely superfluous code working around the choice of the wrong data structure.

For what it's worth, here's my (slightly) alternative approach. Again, it doesn't preserve the original order:
my %most_progress;
for my $data ( sort { $b->{percent} <=> $a->{percent} } #$data ) {
next if exists $most_progress{$data->{user}.$data->{file}};
$most_progress{$data->{user}.$data->{file}} = $data;
}
my #clean_data = values %most_progress;

This will preserve order:
use strict;
use warnings;
my $data = [ ... ]; # As posted.
my %pct;
for my $i ( 0 .. $#{$data} ){
my $r = $data->[$i];
my $k = join '|', $r->{file}, $r->{user};
next if exists $pct{$k} and $pct{$k}[1] >= $r->{percent};
$pct{$k} = [$i, $r->{percent}];
}
#$data = #$data[sort map $_->[0], values %pct];

my %check;
for (my $i = 0; $i <= $#{$arrayref}; $i++) {
my $transfer = $arrayref->[$i];
# check the transfer for user and file
my $key = $transfer->{user} . $transfer->{file};
$check{$key} = { } if ( !exists $check{$key} );
if ( $transfer->{percent} <= $check{$key}->{percent} ) {
# undefine this less advanced transfer
$arrayref->[$i] = undef;
} else {
# remove the other transfer
$arrayref->[$check{$key}->{index}] = undef if exists $check{$key}->{index};
# set the new standard
$check{$key} = { index => $i, percent => $transfer->{percent} }
}
}
# remove all undefined transfers
$arrayref = [ grep { defined $_ } #$arrayref ];

Variation on the theme with Perl6::Gather
use Perl6::Gather;
my #cleaned = gather {
my %seen;
for (sort { $b->{percent} <=> $a->{percent} } #$data) {
take unless $seen{ $_->{user} . $_->{file} }++;
}
};