XML::Simple: Parsing nested hashes/array - perl

I am trying to parse an XML file. The XML file can be found at http://pastebin.com/fvuwbrh9.
I have saved this xml file as packages.xml.
Goal: list all the names enclosed in <packagereq> tags in the XML (I mean the packagereq entries that fall under group in the Dumper output).
I wrote the script below, called rpm.pl:
#!/usr/bin/perl -w
use strict;
use XML::Simple;
use Data::Dumper;
my $ref = XMLin ('packages.xml');
#print Dumper ($ref);
foreach my $a ( keys %{ $ref->{group} } )
{
if ( exists $ref->{group}->{$a}->{packagelist} )
{
foreach my $b ( @{ $ref->{group}->{$a}->{packagelist}->{packagereq} } )
{
print $b->{content}."\n"; ### <<< referring the Dumper out put
}
}
}
Now my script gets halfway through and prints the package names, but then it terminates with the error below:
Not an ARRAY reference at rpm.pl line 29.
After this error, the script does not process the rest of the XML file.
The error makes me believe that somewhere the value of $ref->{group}->{$a}->{packagelist}->{packagereq} is not an ARRAY reference.
I have gone through the XML file (and the Dumper output) as carefully as I can, and packagereq always seems to point to an ARRAY reference, unless I overlooked something, which I doubt.
Could you explain why it is complaining about "Not an ARRAY reference"?
Thanks.

XML::Simple, the most complicated XML parser to use. Add the following:
my $ref = XMLin ('packages.xml',
KeyAttr => [qw( id )],
ForceArray => [qw( group packagereq ignoredep )],
);
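The error is consistent with XML::Simple's default folding: an element that occurs only once under its parent is returned as a single hash reference rather than a one-element array reference, so a group with exactly one packagereq breaks the @{ ... } dereference. ForceArray tells XMLin to always use arrays for the listed elements. If changing the XMLin call is not an option, a defensive alternative is to normalize the value before looping. The following is only a sketch: as_list is a hypothetical helper, and the two data structures are hand-built stand-ins for XMLin output, not taken from the actual file.

```perl
use strict;
use warnings;

# Hypothetical helper: return a flat list whether the value is
# an array reference (many elements) or a single item (one element).
sub as_list {
    my ($value) = @_;
    return () unless defined $value;
    return ref $value eq 'ARRAY' ? @$value : ($value);
}

# Stand-ins for what XMLin may return without ForceArray (assumed shapes)
my $many = [ { content => 'bash' }, { content => 'perl' } ];  # several <packagereq>
my $one  = { content => 'vim' };                              # a single <packagereq>

my @names = map { $_->{content} } as_list($many), as_list($one);
print "$_\n" for @names;
```

With ForceArray in place this helper is unnecessary, which is why the XMLin options above are the simpler fix.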

Related

Printing the return of IO::All with Data::Dumper?

Consider this snippet:
use Data::Dumper;
@targetDirsToScan = ("./");
use IO::All;
$io = io(@targetDirsToScan); # Create new directory object
@contents = $io->all(0); # Get all contents of dir
for my $contentry ( @contents ) {
print Dumper($contentry) ."\n";
}
This prints something like:
$VAR1 = bless( \*Symbol::GEN298, 'IO::All::File' );
$VAR1 = bless( \*Symbol::GEN307, 'IO::All::Dir' );
$VAR1 = bless( \*Symbol::GEN20, 'IO::All::File' );
...
I expected to get all the fields of the respective objects dumped. At first I thought this was a reference, so the fields would be printed if I dereferenced the variable, but I realized I don't really know how to dereference it.
So, how can I print out all the fields and contents of @contents, using the same kind of for my ... loop?
You can do this:
use Data::Dumper;
use IO::All;
$io = io('/tmp');
for my $file ( $io->all(0) ) {
print Dumper \%{*$file};
}
But you should seriously consider whether doing this is a good idea. One of the core tenets of object-oriented programming is encapsulation. You should not care about the guts of a blessed object - you should interact with it only via the methods it provides.
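To see why \%{*$file} works, it helps to know that such objects are blessed glob references, and a glob has a hash slot that a class can use to store per-object fields. The sketch below reproduces that layout with core modules only; My::Handle and the field names are made up for illustration and say nothing about what IO::All actually stores internally.

```perl
use strict;
use warnings;
use Symbol qw(gensym);
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;

# A blessed glob reference, similar in shape to a handle-based object
my $handle = bless gensym(), 'My::Handle';   # hypothetical class name

# Store fields in the hash slot of the glob (hypothetical field names)
${*$handle}{path}   = '/tmp/example.txt';
${*$handle}{is_dir} = 0;

# Dumping the object itself shows only the glob ...
print Dumper($handle);

# ... while dereferencing the glob's hash slot exposes the fields
my $fields = \%{*$handle};
print Dumper($fields);
```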

How to deal with calling data that is parsed into a hash in perl

I parsed the following XML using Perl and I'm trying to access the spectrum results, but I'm having difficulty since the result is a hash. I keep getting the error message "Reference found where even-sized list expected".
<message>
<cmd id="result_data">
<result-file-header>
<path>String</path>
<duration>Float</duration>
<spectra-count>Integer</spectra-count>
</result-file-header>
<scan-results count="Integer">
<scan-result>
<spectrum-index>Integer</spectrum-index>
<time-stamp>Integer</time-stamp>
<tic>Float</tic>
<start-mass>Float</start-mass>
<stop-mass>Float</stop-mass>
<spectrum count="Integer">mass,abundance;mass1,abundance1;
mass2,abundance2</spectrum>
</scan-result>
<scan-result>
<spectrum-index>Integer</spectrum-index>
<time-stamp>Integer</time-stamp>
<tic>Float</tic>
<start-mass>Float</start-mass>
<stop-mass>Float</stop-mass>
<spectrum count="Integer">mass3,abundance3;mass4,abundance4;
mass5,abundance5</spectrum>
</scan-result>
</scan-results>
</cmd>
</message>
Here is the Perl code I'm using:
my $file = "gapiparseddataexample1.txt";
unless(open FILE, '>'.$file) {
die "\nUnable to create $file\n";
}
use warnings;
use XML::Simple;
use Data::Dumper;
my $values= XMLin('samplegapi.xml', ForceArray => [ 'scan-result' ,'result-file-header']);
print Dumper($values);
my $results = $values->{'cmd'}->{'scan-results'}->{'scan-result'};
my $results1=$values->{'cmd'}->{'result-file-header'};
for my $data (@$results) {
print FILE "Spectrum Index",":",$data->{"spectrum-index"},"\n";
print FILE "Total Ion Count",":",$data->{tic},"\n";
%spectrum=$data->{spectrum};
print FILE "Spectrum",":",%spectrum, "\n";
for my $data1 (@$results1) {
print FILE "Duration",":",$data1->{duration},"\n";
}
}
I want to be able to print out the spectrum value pairs.
This:
my $spectrum = $data->{spectrum};
print FILE "Spectrum",":", $spectrum->{'content'}, "\n";
for my $data1 (@$results1) {
print FILE "Duration",":",$data1->{duration},"\n";
}
Should give you this (which I assume is what you want):
Spectrum:mass,abundance;mass1,abundance1;
mass2,abundance2
You'll probably want to remove the newline from 'content' so the value doesn't split over two lines.
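One possible way to strip the newline and get at the individual pairs is sketched below. The $content string is a stand-in for $spectrum->{content}, copied from the sample XML in the question; the variable names are illustrative.

```perl
use strict;
use warnings;

# Stand-in for $spectrum->{content}, including the embedded newline
my $content = "mass,abundance;mass1,abundance1;\nmass2,abundance2";

# Remove all whitespace (including the newline) from a copy of the string
(my $flat = $content) =~ s/\s+//g;

# Split into "mass,abundance" chunks, then split each chunk into a pair
my @pairs = map { [ split /,/ ] } grep { length } split /;/, $flat;

for my $pair (@pairs) {
    my ($mass, $abundance) = @$pair;
    print "$mass => $abundance\n";
}
```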
Explanation for anyone that's curious
The element's text has been placed under the content key because the element also has an attribute, in this case one called count:
<spectrum count="Integer">mass3,abundance3;mass4,abundance4;
mass5,abundance5</spectrum>
This sort of behaviour is common in other languages and other XML parsing libraries too (e.g. sometimes they shove it into an element with the key 0). Sometimes it happens even when elements don't have regular attributes but are of specific types.
If you were to dump $data->{spectrum} with Data::Dumper you'd see the structure (again, that usually applies in other languages and with other XML parsing libraries too).

Why do I get a filename of ARRAY(0x7fd5c22f7cd8) when trying to unzip with Perl's Archive::Extract?

I'm using Perl 5.12 on Mac 10.5.7. I have a JAR file, that I wish to unzip, and then process a file matching a file pattern. I'm having trouble figuring out how to iterate over the results of unzipping. I have ...
### build an Archive::Extract object ###
my $ae = Archive::Extract->new( archive => $dbJar );
### what if something went wrong?
my $ok = $ae->extract or die $ae->error;
### files from the archive ###
my @files = $ae->files;
for (@files) {
print "file : $_\n";
But there is only one thing returned from the loop -- "ARRAY(0x7fd5c22f7cd8)". I know there are lots of files in the archive, so I'm curious what I'm doing wrong here. - Dave
$ae->files returns an array reference, not an array. Try this:
my $files = $ae->files;
for (@$files) {
print "file : $_\n";
}
From the Perldoc of Archive::Extract:
$ae->files
This is an array ref holding all the paths from the archive. See extract() for details.
It's quite common for methods to return not arrays and hashes, but references to them. It's also quite common for methods to take references to arrays and hashes as arguments. This is because less data has to be passed back and forth between the method and your call.
You can do this:
for my $file ( @{ $ae->files } ) {
print "$file\n";
}
The @{...} dereferences the reference and makes it into a simple array. And yes, you can put method calls that return an array reference in the @{...} like I did.
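The difference between assigning the reference and dereferencing it can be seen without any archive at all. In this sketch $files is a hand-built array reference standing in for what $ae->files returns; the paths are made up.

```perl
use strict;
use warnings;

# Stand-in for the array reference returned by $ae->files (hypothetical paths)
my $files = [ 'META-INF/MANIFEST.MF', 'com/example/Main.class' ];

# Assigning the reference to an array stores ONE element: the reference itself
my @wrong = $files;
print "wrong: $_\n" for @wrong;    # prints something like ARRAY(0x...)

# Dereferencing with @{...} copies the actual elements
my @right = @{$files};
print "right: $_\n" for @right;
```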
As already mentioned, a very helpful package is Data::Dumper. This can take a reference and show you the data structure contained therein. It also will tell you if that data structure represents a blessed object which might be a clue that there is a method you can use to pull out the data.
For example, imagine an object called Company. One of the methods might be $company->Employees, which returns an array reference to Employee objects. You might not realize this and discover that you get something like ARRAY(0x7fd5c22f7cd8) returned. Pushing this through Data::Dumper might help you see the structure:
use Data::Dumper;
use feature qw(say);
use Company;
[...]
@employee_list = $company->Employees;
# say join "\n", @employee_list;   # Doesn't work.
say Dumper @employee_list;
This might print:
$VAR1 = [
bless( {
'FIRST' => 'Marion',
'LAST' => 'Librarian',
'POSITION' => 'Yes Man'
}, 'Employee' ),
bless( {
'FIRST' => 'Charlie',
'LAST' => 'australopithecus',
'POSITION' => 'Cheese Cube Eater'
}, 'Employee' )
];
Not only do you see that this is an array of hash references, but also that these are Employee objects. Thus, you should use the methods of the Employee class to get at the information you want.
use feature qw(say);
use Company;
use Employee;
[...]
for my $employee ( @{ $company->Employees } ) {
say "First Name: " . $employee->First;
say "Last Name: " . $employee->Last;
say "Position: " . $employee->Position;
print "\n";
}
Try doing this:
use Data::Dumper;
print Dumper @files;
That way, you will see the content of @files.
If you don't know how to process your data structure, paste the output of Dumper here.

Perl: Simple INI file info retrieval

I have a perl script that's reading an INI file like this:
[placeholder_title]
Hostname = 127.0.0.1
Port = 161
The library that I'm using for this is Config::Tiny.
Normally when reading the ini file, I would have something like this:
$Config = Config::Tiny->read( 'configfile.ini' );
my $Hostname_property = $Config->{placeholder_title}->{Hostname};
Now I have a case where the section title in the config file is decided by the user, so I don't exactly know what it is.
Before I actually had multiple sections in the config file, so I would iterate through them like this:
foreach my $Section (keys %{$Config}) {
my $Hostname_property = $Config->{$Section}->{Hostname};
my $Port_property = $Config->{$Section}->{Port};
But what if I were to have only 1 section in total?
Is there a particular keyword I can use to substitute for the section name?
I've tried the similar looping logic from the prior example something like this:
$Config = Config::Tiny->read( 'configfile.ini' );
my $Section = keys %{$Config};
my $Hostname_property = $Config->{$Section}->{Hostname};
print $Hostname_property, "\n";
But then I get an error that $Hostname_property is not initialized, so my $Section variable clearly isn't doing what I hoped it to do.
If anybody can help me out or at least point me in the right direction, it would be greatly appreciated.
Thank you.
The reason my $Section = keys %{$Config}; doesn't work is that you're calling keys in scalar context, so it's returning the number of keys. Try calling it in list context instead:
my ($Section) = keys %{$Config};
This will set $Section to the first key. ("first" in whatever order keys is returning the keys in. If there's only one key, that doesn't matter.)
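The context difference is easy to reproduce with a plain hash. Here %config is a hand-built stand-in for the object returned by Config::Tiny->read, using the section and keys from the question.

```perl
use strict;
use warnings;

# Stand-in for the parsed INI data (one user-chosen section)
my %config = (
    placeholder_title => { Hostname => '127.0.0.1', Port => 161 },
);

# Scalar context: keys returns the NUMBER of keys
my $count = keys %config;

# List context (note the parentheses): the first key is assigned
my ($section) = keys %config;

print "count: $count\n";
print "section: $section\n";
print "hostname: $config{$section}{Hostname}\n";
```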
It's okay for a hash to have only one key. Consequently, it's okay if there's only one section in your ini file.
For example, if we have a file called blah.ini with contents of
[title]
foo=bar
blah=baz
and if we run the following code:
use strict;
use warnings;
use Config::Tiny;
my $cfg=Config::Tiny->read("blah.ini");
use Data::Dumper;
print Dumper($cfg) . "\n";
Then we get the output
$VAR1 = bless( {
'title' => {
'blah' => 'baz',
'foo' => 'bar'
}
}, 'Config::Tiny' );
Consequently, we can do something like the following:
use strict;
use warnings;
use Config::Tiny;
my $cfg=Config::Tiny->read("blah.ini");
foreach my $title(sort keys %$cfg)
{
foreach my $setting (sort keys %{$cfg->{$title}})
{
print "title: $title,setting $setting, value $cfg->{$title}->{$setting}\n";
}
}
And the output is
title: title,setting blah, value baz
title: title,setting foo, value bar

How do I convert Data::Dumper output back into a Perl data structure?

I was wondering if you could shed some light on the code I've been working on for a couple of days.
I've been trying to convert a Perl-parsed hash back to XML using the XMLout() and XMLin() method and it has been quite successful with this format.
#!/usr/bin/perl -w
use strict;
# use module
use IO::File;
use XML::Simple;
use XML::Dumper;
use Data::Dumper;
my $dump = new XML::Dumper;
my ( $data, $VAR1 );
Topology:$VAR1 = {
'device' => {
'FOC1047Z2SZ' => {
'ChassisID' => '2009-09',
'Error' => undef,
'Group' => {
'ID' => 'A1',
'Type' => 'Base'
},
'Model' => 'CATALYST',
'Name' => 'CISCO-SW1',
'Neighbor' => {},
'ProbedIP' => 'TEST',
'isDerived' => 0
}
},
'issues' => [
'TEST'
]
};
# create object
my $xml = new XML::Simple (NoAttr=>1,
RootName=>'data',
SuppressEmpty => 'true');
# convert Perl array ref into XML document
$data = $xml->XMLout($VAR1);
#reads an XML file
my $X_out = $xml->XMLin($data);
# access XML data
print Dumper($data);
print "STATUS: $X_out->{issues}\n";
print "CHASSIS ID: $X_out->{device}{ChassisID}\n";
print "GROUP ID: $X_out->{device}{Group}{ID}\n";
print "DEVICE NAME: $X_out->{device}{Name}\n";
print "DEVICE NAME: $X_out->{device}{name}\n";
print "ERROR: $X_out->{device}{error}\n";
I can access all the elements in the XML with no problem.
But when I try to read the parsed hash from a file, a problem arises: I can't seem to access all the XML elements. I guess I wasn't able to unparse the file with the following code.
#!/usr/bin/perl -w
use strict;
#!/usr/bin/perl
# use module
use IO::File;
use XML::Simple;
use XML::Dumper;
use Data::Dumper;
my $dump = new XML::Dumper;
my ( $data, $VAR1, $line_Holder );
#this is the file that contains the parsed hash
my $saveOut = "C:/parsed_hash.txt";
my $result_Holder = IO::File->new($saveOut, 'r');
while ($line_Holder = $result_Holder->getline){
print $line_Holder;
}
# create object
my $xml = new XML::Simple (NoAttr=>1, RootName=>'data', SuppressEmpty => 'true');
# convert Perl array ref into XML document
$data = $xml->XMLout($line_Holder);
#reads an XML file
my $X_out = $xml->XMLin($data);
# access XML data
print Dumper($data);
print "STATUS: $X_out->{issues}\n";
print "CHASSIS ID: $X_out->{device}{ChassisID}\n";
print "GROUP ID: $X_out->{device}{Group}{ID}\n";
print "DEVICE NAME: $X_out->{device}{Name}\n";
print "DEVICE NAME: $X_out->{device}{name}\n";
print "ERROR: $X_out->{device}{error}\n";
Do you have any idea how I could access the $VAR1 inside the text file?
Regards,
newbee_me
$data = $xml->XMLout($line_Holder);
$line_Holder has only the last line of your file, not the whole file, and not the perl hashref that would result from evaling the file. Try something like this:
my $ref = do $saveOut;
The do function loads and evals a file for you. You may want to do it in separate steps, like:
use File::Slurp "read_file";
my $fileContents = read_file( $saveOut );
my $ref = eval( $fileContents );
You might want to look at the Data::Dump module as a replacement for Data::Dumper; its output is already ready to re-eval back.
Basically to load Dumper data you eval() it:
use strict;
use Data::Dumper;
my $x = {"a" => "b", "c"=>[1,2,3],};
my $q = Dumper($x);
$q =~ s{\A\$VAR\d+\s*=\s*}{};
my $w = eval $q;
print $w->{"a"}, "\n";
The regexp (s{\A\$VAR\d+\s*=\s*}{}) is used to remove $VAR1= from the beginning of string.
On the other hand, if you need a way to store a complex data structure and load it again, it's much better to use the Storable module and its store() and retrieve() functions.
This has worked for me for hashes of hashes. It may not work so well with structures that contain references to other structures, but it works well enough for simple structures such as arrays, hashes, or hashes of hashes.
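A minimal round trip with Storable might look like the sketch below. The data is a trimmed-down version of the hash from the question, and a temporary file is used instead of C:/parsed_hash.txt so the example can run anywhere; both choices are assumptions made for illustration.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

# Trimmed-down version of the structure from the question
my $data = {
    device => { Name => 'CISCO-SW1', Group => { ID => 'A1', Type => 'Base' } },
    issues => ['TEST'],
};

# Temporary file used only for this sketch; removed when the program exits
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

store($data, $path);           # write the structure to disk
my $copy = retrieve($path);    # read it back as a new reference

print "DEVICE NAME: $copy->{device}{Name}\n";
print "GROUP ID: $copy->{device}{Group}{ID}\n";
print "STATUS: $copy->{issues}[0]\n";
```

Unlike the eval approach, this never executes the file's contents as Perl code, though Storable files should still only be loaded from sources you trust.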
open(DATA,">",$file);
print DATA Dumper(\%g_write_hash);
close(DATA);
my %g_read_hash = %{ do $file };
Consider using the Data::Dump module as a replacement for Data::Dumper.
You can configure the variable name used in Data::Dumper's output with $Data::Dumper::Varname.
Example
use Data::Dumper;
$Data::Dumper::Varname = "foo";
my $string = Dumper($object);
eval($string);
...will create the variable $foo1 (Data::Dumper appends a counter to the prefix), which should contain the same data as $object.
If your data structure is complicated and you have strange results, you may want to consider Storable's freeze() and thaw() methods.