XML-SAX Error: Not an ARRAY reference - perl

(WinXP Pro, ActivePerl 5.10.1, and XML-SAX 0.96 used)
MyXML.xml looks like this:
<?xml version="1.0" standalone="yes"?>
<DocumentElement>
<subItem>
<bib>0006</bib>
<name>Stanley Cheruiyot Teimet</name>
<team>肯尼亚</team>
<time>2:17:59</time>
<rank>1</rank>
<comment />
<gender>Male</gender>
<distance>Full</distance>
<year>2010</year>
</subItem>
</DocumentElement>
MyPerl.pl
#!/usr/bin/perl -w
use strict;
use XML::Simple;
use Data::Dumper;
use utf8;
open FILE, ">myXML.txt" or die $!;
my $tree = XMLin('./myXML.xml');
print Dumper($tree);
print FILE "\n";
for (my $i = 0; $i < 1; $i++)
{
print FILE "('$tree->{subItem}->[$i]->{distance}')";
}
close FILE;
Output:
D:\learning\perl\im>mar.pl
$VAR1 = {
'subItem' => {
'distance' => 'Full',
'time' => '2:17:59',
'name' => 'Stanley Cheruiyot Teimet',
'bib' => '0006',
'comment' => {},
'team' => '肯尼亚',
'rank' => '1',
'year' => '2010',
'gender' => 'Male'
}
};
Not an ARRAY reference at D:\learning\perl\im\mar.pl line 41.
I don't know what the ARRAY reference error means. Dumper() works well, but I can't print the data to the TXT file.
Actually, the sample code ran well a few days ago. Since then, I remember upgrading Komodo Edit from V5 to the newest V6.
Today I tried to improve the script. At the beginning I fixed another error, "Could not find ParserDetails.ini", with Google's help. (I didn't get that error before!)
But now I get the ARRAY reference error. I have just reinstalled XML-SAX via PPM, and it still doesn't work.

The whole stack for XML parsing that you have set up works fine, as the dump of the parsed tree shows. XML::SAX is not the cause of the problem, and it is only involved indirectly.
The error comes simply from improper access to the data structure which was generated by XML::Simple.
I can guess what happened. In an earlier version of your program you had the ForceArray option enabled (that is good practice, see OPTIONS and STRICT_MODE in XML::Simple), and the algorithm for traversing the parsed tree was written to take this into account, i.e. there is array access involved.
In the current version of your program ForceArray is not enabled, so the traversing algorithm no longer matches the data structure. I suggest re-enabling the options that are recommended in the documentation.
#!/usr/bin/env perl
use utf8;
use strict;
use warnings FATAL => 'all';
use IO::File qw();
use XML::Simple qw(:strict);
use autodie qw(:all);
my $xs = XML::Simple->new(ForceArray => 1, KeyAttr => {}, KeepRoot => 1);
my $tree = $xs->parse_file('./myXML.xml');
{
open my $out, '>', 'myXML.txt';
$out->say;
for my $subitem (@{ $tree->{DocumentElement}->[0]->{subItem} }) {
$out->say($subitem->{distance}->[0]); # 'Full'
}
}
The tree looks like this now:
{
'DocumentElement' => [
{
'subItem' => [
{
'distance' => ['Full'],
'time' => ['2:17:59'],
'name' => ['Stanley Cheruiyot Teimet'],
'bib' => ['0006'],
'comment' => [{}],
'team' => ["\x{80af}\x{5c3c}\x{4e9a}"],
'rank' => ['1'],
'year' => ['2010'],
'gender' => ['Male']
}
]
}
]
}
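The mismatch can be reproduced without XML::Simple at all. The following is a minimal sketch, assuming hand-built copies of the two shapes shown in the dumps above (the variable names are only illustrative): array access dies on the shape parsed without ForceArray and works on the forced-array shape.

```perl
use strict;
use warnings;

# Shape produced without ForceArray (hand-built copy of the first dump).
my $flat = { subItem => { distance => 'Full' } };

# Shape produced with ForceArray => 1 and KeepRoot => 1 (hand-built copy).
my $forced = {
    DocumentElement => [ { subItem => [ { distance => ['Full'] } ] } ],
};

# Array access on a hash reference dies; eval traps the error for display.
my $value = eval { $flat->{subItem}->[0]->{distance} };
my $error = $@;
print "flat:   $error";

# The same kind of access matches the forced-array shape.
my $distance = $forced->{DocumentElement}->[0]->{subItem}->[0]->{distance}->[0];
print "forced: $distance\n";
```

The first access fails with the same "Not an ARRAY reference" message as in the question, because subItem holds a hash reference rather than an array reference.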

Perl: Add hash as sub hash to simple hash

I looked at the other two questions that seem to be about this, but they are a little obtuse and I can't relate them to what I want to do, which I think is simpler. I also think this will be a much clearer statement of a very common problem/task so I'm posting this for the benefit of others like me.
The Problem:
I have 3 files, each file a list of key=value pairs:
settings1.ini
key1=val1
key2=val2
key3=val3
settings2.ini
key1=val4
key2=val5
key3=val6
settings3.ini
key1=val7
key2=val8
key3=val9
No surprise, I want to read those key=value pairs into a hash to operate on them, so...
I have a hash of the filenames:
my %files = ( file1 => 'settings1.ini'
, file2 => 'settings2.ini'
, file3 => 'settings3.ini'
);
I can iterate through the filenames like so:
foreach my $fkey (keys %files) {
say $files{$fkey};
}
Ok.
Now I want to add the list of key=value pairs from each file to the hash as a sub-hash under each respective 'top-level' filename key, such that I can iterate through them like so:
foreach my $fkey (keys %files) {
say "File: $files{$fkey}";
foreach my $vkey (keys %{ $files{$fkey} }) {
say " $vkey: $files{$fkey}{$vkey}";
}
}
In other words, I want to add a second level to the hash such that it goes from just being (in pseudo terms) a single-layer list of values:
file1 => settings1.ini
file2 => settings2.ini
file3 => settings3.ini
to being a multi-layered list of values:
file1 => key1 => 'val1'
file1 => key2 => 'val2'
file1 => key3 => 'val3'
file2 => key1 => 'val4'
file2 => key2 => 'val5'
file2 => key3 => 'val6'
file3 => key1 => 'val7'
file3 => key2 => 'val8'
file3 => key3 => 'val9'
Where:
my $fkey = 'file2';
my $vkey = 'key3';
say $files{$fkey}{$vkey};
would print the value
'val6'
As a side note, I am trying to use File::Slurp to read in the key=value pairs. Doing this on a single level hash is fine:
my %new_hash = read_file($files{$fkey}) =~ m/^(\w+)=([^\r\n\*,]*)$/img;
but - to rephrase this whole problem - what I really want to do is 'graft' the new hash of key=value pairs onto the existing hash of filenames 'under' the top $file key as a 'child/branch' sub-hash.
Questions:
How do I do this, how do I build a multi-level hash one layer at a time like this?
Can I do this without having to pre-define the hash as multi-layered up front?
I use strict; and so I have seen the
Can't use string ("string") as a HASH ref while "strict refs" in use at script.pl line <lineno>.
which I don't fully understand...
Edit:
Thank you Timur Shtatland, Polar Bear and Dave Cross for your great answers. In mentally parsing your suggestions it occurred to me that I had slightly misled you by being a little inconsistent in my original question. I apologize. I also think I see now why I saw the 'strict refs' error. I have made some changes.
Note that my first mention of the initial hash of filename is correct. The subsequent foreach examples looping through %files, however, were incorrect because I went from using file1 as the first file key to using settings1.ini as the first file key. I think this is why Perl threw the strict refs error - because I tried to change the key from the initial string to a hash_ref pointing to the sub-hash (or vice versa).
Have I understood that correctly?
There are several CPAN modules written for ini files. You should study what is available and choose whichever you prefer.
Otherwise you can write your own code, something in the spirit of the following snippet:
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my @files = qw(settings1.ini settings2.ini settings3.ini);
my %hash;
for my $file (@files) {
$hash{$file} = read_settings($file);
}
say Dumper(\%hash);
sub read_settings {
my $fname = shift;
my %hash;
open my $fh, '<', $fname
or die "Couldn't open $fname";
while( <$fh> ) {
chomp;
my($k,$v) = split '=';
$hash{$k} = $v
}
close $fh;
return \%hash;
}
Output
$VAR1 = {
'settings1.ini' => {
'key2' => 'val2',
'key1' => 'val1',
'key3' => 'val3'
},
'settings2.ini' => {
'key2' => 'val5',
'key1' => 'val4',
'key3' => 'val6'
},
'settings3.ini' => {
'key1' => 'val7',
'key2' => 'val8',
'key3' => 'val9'
}
};
To build the hash one layer at a time, use anonymous hashes. Each value of %files here is a reference to a hash, for example, for $files{'settings1.ini'}:
# read the data into %new_hash, then:
$files{'settings1.ini'} = { %new_hash }
You do not need to predefine the hash as multi-layered (as hash of hashes) upfront.
Also, avoid reinventing the wheel. Use Perl modules for common tasks; in this case consider something like Config::IniFiles for parsing *.ini files.
SEE ALSO:
Anonymous hashes: perlreftut
Hashes of hashes: perldsc
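As a minimal sketch of the points above, using only core Perl and made-up values: assigning two levels deep creates the inner hash automatically, a whole hash can be grafted under a key as an anonymous hash, and treating a plain string value as a hash triggers the 'strict refs' error from the question.

```perl
use strict;
use warnings;

my %files;

# Assigning two levels deep creates the inner hash automatically
# (autovivification); nothing has to be declared up front.
$files{'settings1.ini'}{key1} = 'val1';

# Grafting a whole hash under a key stores a reference to a copy of it.
my %new_hash = (key1 => 'val4', key2 => 'val5');
$files{'settings2.ini'} = { %new_hash };

for my $fkey (sort keys %files) {
    for my $vkey (sort keys %{ $files{$fkey} }) {
        print "$fkey: $vkey=$files{$fkey}{$vkey}\n";
    }
}

# If the value under a key is still a plain string, using it as a hash
# is a symbolic reference, which "strict refs" rejects.
my %plain = (file1 => 'settings1.ini');
my $ok = eval { my @k = keys %{ $plain{file1} }; 1 };
my $strict_error = $ok ? '' : $@;
print "error: $strict_error";
```

The last part suggests why the error appears when a key still holds the filename string: the value has to be replaced by a hash reference before it can be dereferenced.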
Perl makes stuff like this ridiculously easy.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my %files;
# <> reads from the files given on the command line
# one line at a time.
while (<>) {
chomp;
my ($key, $val) = split /=/;
# $ARGV contains the name of the file that
# is currently being read.
$files{$ARGV}{$key} = $val;
}
say Dumper \%files;
Running this as:
$ perl readconf settings1.ini settings2.ini settings3.ini
Gives the following output:
$VAR1 = {
'settings3.ini' => {
'key2' => 'val8',
'key1' => 'val7',
'key3' => 'val9'
},
'settings2.ini' => {
'key3' => 'val6',
'key1' => 'val4',
'key2' => 'val5'
},
'settings1.ini' => {
'key3' => 'val3',
'key1' => 'val1',
'key2' => 'val2'
}
};

Parsing and iterating through an XML structure where each entry has multiple attributes

I'm trying to deal with an XML list of items (in this case, images) and iterate over each one. I don't really understand Perl or hashes, but I found a few explanations and examples (many here) and wrote something that seemed to work. The XML is a list of elements, each of which contains a unique 'id' attribute.
I'm using XMLin from XML::Simple to parse the XML.
When the list contains multiple elements, it iterates through by 'id'. But it seems that when there is only one, it gets confused, and treats each attribute of the element as its own value, which results in a run-time error.
Can't use string ("0") as a HASH ref while "strict refs" in use
I'm guessing that the problem is that the hash doesn't know that 'id' is the unique key, at least when there's only one entry. So I added code to dump the keys. I also added a line to print what $image is in the foreach loop. In the case that breaks, the line print "In loop; image ID=$image\n"; displays In loop; image ID=Serial. Since Serial is an attribute at the same level as id, I'm guessing this is the problem (not properly using id as the key).
Here's my code:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;
my $album_data_file = $ARGV[0];
my $album_file_list = $ARGV[1];
my $do_dump_data = $ARGV[2];
my $album_data = XMLin ( $album_data_file );
my $LIST_FILE;
if ( defined $album_file_list && "$album_file_list" ne "" )
{
if ( open ( $LIST_FILE, ">", "$album_file_list" ) )
{
print "Opened file $album_file_list as $LIST_FILE\n";
}
}
if ( defined $do_dump_data && $do_dump_data eq "true" )
{
use Data::Dumper;
print "data:\n\n";
print Dumper ( $album_data );
print "\n\n\n\n";
print "keys:\n\n";
print Dumper ( keys %{$album_data->{Images}->{Image}} );
print "\n\n\n\n";
}
foreach my $image ( keys %{$album_data->{Images}->{Image}} )
{
print "In loop; image ID=$image\n";
my $ref = $album_data->{Images}->{Image}->{$image};
#
# Write to files list: file name, ID, key, size, MD5
#
print $LIST_FILE ( "$ref->{FileName}\t$image\t$ref->{Key}"
. "\t$ref->{Size}\t$ref->{MD5Sum}\n" );
}
close ( $LIST_FILE );
Here's a sample XML file that breaks it:
<?xml version="1.0" encoding="utf-8"?>
<rsp stat="ok">
<method>images.get</method>
<Images>
<Image id="123" Key="xyz" Type="Album" Caption="Room 5083" FileName="MVI_2838.AVI" Format="MP4" Height="480" Keywords="China; Suite" LastUpdated="2014-04-19 11:49:45" Position="1" Serial="0" Size="116033" Width="640" Date="2014-04-19 11:46:24" Hidden="0" MD5Sum="6151e20053eeda87c688f8becae0d402" Watermark="0">
<Album id="345" Key="zzy" />
</Image>
</Images>
</rsp>
Here's the result of dumping the full $album_data:
$VAR1 = {
'method' => 'images.get',
'Images' => {
'Image' => {
'Serial' => '0',
'Format' => 'MP4',
'Keywords' => 'China; Suite',
'Type' => 'Album',
'Size' => '116033',
'MD5Sum' => '6151e20053eeda87c688f8becae0d402',
'id' => '123',
'Key' => 'xyz',
'LastUpdated' => '2014-04-19 11:49:45',
'Album' => {
'id' => '345',
'Key' => 'zzy'
},
'Position' => '1',
'Height' => '480',
'Date' => '2014-04-19 11:46:24',
'Caption' => 'Room 5083',
'FileName' => 'MVI_2838.AVI',
'Hidden' => '0',
'Width' => '640',
'Watermark' => '0',
}
},
'stat' => 'ok'
};
Here's the result of dumping the keys %{$album_data->{Images}->{Image}} construct:
$VAR1 = 'Serial';
$VAR2 = 'Format';
$VAR3 = 'Keywords';
$VAR5 = 'Type';
$VAR6 = 'Size';
$VAR7 = 'MD5Sum';
$VAR9 = 'id';
$VAR10 = 'Key';
$VAR11 = 'LastUpdated';
$VAR12 = 'Album';
$VAR14 = 'Position';
$VAR15 = 'Height';
$VAR16 = 'Date';
$VAR17 = 'Caption';
$VAR19 = 'FileName';
$VAR20 = 'Hidden';
$VAR23 = 'Width';
$VAR24 = 'Watermark';
$VAR27 = 'Duration';
According to XML::Simple #Status of this Module:
The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces. In particular, XML::LibXML is highly recommended.
The major problems with this module are the large number of options and the arbitrary ways in which these options interact - often with unexpected results.
Patches with bug fixes and documentation fixes are welcome, but new features are unlikely to be added.
XML::Simple is a useful module for quickly parsing XML if you're familiar with Perl's complex data structures. However, whenever the XML gets too complex, the module outlives its usefulness because of its arbitrary method of parsing certain structures depending on a lot of configuration variables.
I still use XML::Simple on rare occasions, but I'd advise you to look at either XML::Twig or the aforementioned XML::LibXML to avoid issues like this.
I appreciate Miller's caution that XML::Simple is discouraged and his warning that it is difficult to use due to so many options which interact in ways that are hard to define and manage. While looking into the replacement modules he suggested, I stumbled on some information that I should have been aware of prior to using XML::Simple in the first place. In particular, the fact that my script worked when multiple images are in the XML but fails when there is only one points out that if one is using XML::Simple, it is often critical to set the ForceArray option to the element(s) that are always supposed to be in an array, even if a particular XML file only happens to contain one. Otherwise, the element will sometimes be an array and sometimes a scalar, causing the exact run-time error I was seeing.
So, in my case, setting forcearray => [ 'Image' ] makes the code work (by forcing all <Image> elements into an array, even if there is only one), with less immediate effort than trying to figure out how to use a different XML parsing module (although I have no doubt that making the effort to do so will save time in the future).
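To make the failure mode concrete, here is a minimal sketch that avoids XML::Simple and uses hand-built copies of the two shapes instead. It assumes that, with several Image elements (or with ForceArray set for Image) and the default KeyAttr, the images end up in a hash keyed by id; the helper name file_names is made up for illustration.

```perl
use strict;
use warnings;

# Hand-built copy of what the loop expects: Image folded into a hash keyed by id.
my $folded = { 123 => { FileName => 'MVI_2838.AVI', Serial => '0' } };

# Hand-built copy of the single-element result without ForceArray:
# the attributes of the one <Image> sit directly in the hash.
my $single = { id => '123', FileName => 'MVI_2838.AVI', Serial => '0' };

# Returns the FileName of every entry, treating each value as a record.
sub file_names {
    my ($images) = @_;
    return map { $images->{$_}->{FileName} } sort keys %$images;
}

my @names = file_names($folded);
print "folded: @names\n";

# On the flat shape the values are plain strings, so the same code dies.
my @none = eval { file_names($single) };
my $error = $@;
print "single: $error";
```

On the flat shape every key is an attribute name and every value a string, so dereferencing a value produces the "Can't use string (...) as a HASH ref" error seen in the question.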

Perl reading zip files with IO::Uncompress::AnyUncompress

We are moving from our current build system (which is a mess) to one that uses Ant with Ivy. I'm cleaning up all the build files, and finding the jar dependencies. I thought it might be easier if I could automate it a bit, by going through the jars that are checked into the project, finding what classes they contain, then matching those classes with the various import statements in the Java code.
I have used Archive::Tar before, but Archive::Zip isn't a standard Perl module. (My concern is that someone is going to try my script, call me in the middle of the night and tell me it isn't working.)
I noticed that IO::Uncompress::AnyUncompress is a standard module, so I thought I could try IO::Uncompress::AnyUncompress, or at least IO::Uncompress::Unzip, which is also a standard module.
Unfortunately, the documentation for these modules gives no examples (according to the documentation, examples are a todo).
I'm able to successfully open my jar and create an object:
my $zip_obj = IO::Uncompress::AnyUncompress->new ( $zip_file );
Now, I want to see the contents. According to the documentation:
getHeaderInfo
Usage is
$hdr = $z->getHeaderInfo();
@hdrs = $z->getHeaderInfo();
This method returns either a hash reference (in scalar context) or a list or hash references (in array context) that contains information about each of the header fields in the compressed data stream(s).
Okay, this isn't an object like Archive::Tar or Archive::Zip returns, and there are no methods or subroutines mentioned to parse the data. I'll use Data::Dumper and see what hash keys are contained in the reference.
Here's a simple test program:
#! /usr/bin/env perl
use 5.12.0;
use warnings;
use IO::Uncompress::AnyUncompress;
use Data::Dumper;
my $obj = IO::Uncompress::AnyUncompress->new("testng.jar")
or die qq(You're an utter failure);
say qq(Dump of \$obj = ) . Dumper $obj;
my @headers = $obj->getHeaderInfo;
say qq(Dump of \$header = ) . Dumper $headers[0];
And here's my results:
Dump of $obj = $VAR1 = bless( \*Symbol::GEN0, 'IO::Uncompress::Unzip' );
Dump of $header = $VAR1 = {
'UncompressedLength' => 0,
'Zip64' => 0,
'MethodName' => 'Stored',
'Stream' => 0,
'Time' => 1181224440,
'MethodID' => 0,
'CRC32' => 0,
'HeaderLength' => 43,
'ExtraFieldRaw' => '¦- ',
'ExtraField' => [
[
'¦-',
''
]
],
'FingerprintLength' => 4,
'Type' => 'zip',
'TrailerLength' => 0,
'CompressedLength' => 0,
'Name' => 'META-INF/',
'Header' => 'PK
+N¦6 META-INF/¦- '
};
Some of that looks sort of useful. However, all of my entries return 'Name' => 'META-INF/', so it doesn't look like a file name.
Is it possible to use IO::Uncompress::AnyUncompress (or even IO::Uncompress::Unzip) to read through the archive and see what files are in its contents? And, if so, how do I parse that header?
Otherwise, I'll have to go with Archive::Zip and let people know they have to download and install it from CPAN on their systems.
The files in the archive are compressed in different data streams, so you need to iterate through the streams to get the individual files.
use strict;
use warnings;
use IO::Uncompress::Unzip qw(unzip $UnzipError);
my $zipfile = 'zipfile.zip';
my $u = new IO::Uncompress::Unzip $zipfile
or die "Cannot open $zipfile: $UnzipError";
die "Zipfile has no members"
if ! defined $u->getHeaderInfo;
for (my $status = 1; $status > 0; $status = $u->nextStream) {
my $name = $u->getHeaderInfo->{Name};
warn "Processing member $name\n" ;
if ($name =~ /\/$/) {
mkdir $name;
}
else {
unzip $zipfile => $name, Name => $name
or die "unzip failed: $UnzipError\n";
}
}
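If only the member names are needed, for example to list the classes in a jar, the same stream loop can collect names without extracting anything. This is a sketch under stated assumptions: it builds a tiny two-member archive in memory with IO::Compress::Zip (member names a.txt and dir/b.txt are made up) so that it runs without an external file, then follows the read-and-nextStream pattern.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw($ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# Build a tiny two-member archive in memory so the sketch is self-contained.
my $zipdata;
my $z = IO::Compress::Zip->new(\$zipdata, Name => 'a.txt')
    or die "zip failed: $ZipError\n";
$z->print("alpha\n");
$z->newStream(Name => 'dir/b.txt');
$z->print("beta\n");
$z->close;

my $u = IO::Uncompress::Unzip->new(\$zipdata)
    or die "Cannot open archive: $UnzipError\n";

my @names;
my $status;
for ($status = 1; $status > 0; $status = $u->nextStream) {
    # Each stream has its own header; Name is the member's path.
    push @names, $u->getHeaderInfo->{Name};

    # Read the member to its end before moving to the next stream.
    my $buff;
    while (($status = $u->read($buff)) > 0) { }
    last if $status < 0;
}
die "Error reading archive: $UnzipError\n" if $status < 0;

print "$_\n" for @names;
```

For a real jar, the in-memory buffer would be replaced by the file name passed to IO::Uncompress::Unzip->new.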

troubleshooting "pseudo-hashes are deprecated" while using xml module

I am just learning how to use perl hashes and ran into this message in perl. I am using XML::Simple to parse xml output and using exists to check on the hash keys.
Message:
Pseudo-hashes are deprecated at ./h2.pl line 53.
Argument "\x{2f}\x{70}..." isn't numeric in exists at ./h2.pl line 53.
Bad index while coercing array into hash at ./h2.pl line 53.
I had the script working earlier with one test directory and then executed the script on another directory for testing when I got this message. How do I resolve/workaround this?
Code that the error references:
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;
#my $data = XMLin($xml);
my $data = XMLin($xml, ForceArray => [qw (file) ]);
my $size=0;
if (exists $data->{class}
and $data->{class}=~ /FileNotFound/) {
print "The directory: $Path does not exist\n";
exit;
} elsif (exists $data->{file}->{path}
and $data->{file}->{path} =~/test-out-00/) {
$size=$data->{file}->{size};
if ($size < 1024000) {
print "FILE SIZE:$size BYTES\n";
exit;
}
} else {
exit;
}
print Dumper( $data );
Working test case, data structure looks like this:
$VAR1 = {
'recursive' => 'no',
'version' => '0.20.202.1.1101050227',
'time' => '2011-09-30T02:49:39+0000',
'filter' => '.*',
'file' => {
'owner' => 'test_act',
'replication' => '3',
'blocksize' => '134217728',
'permission' => '-rw-------',
'path' => '/source/feeds/customer/test/test-out-00',
'modified' => '2011-09-30T02:48:41+0000',
'size' => '135860644',
'group' => '',
'accesstime' => '2011-09-30T02:48:41+0000'
},
'exclude' => ''
};
recursive:no
version:0.20.202.1.1101050227
time:2011-10-01T07:06:16+0000
filter:.*
file:HASH(0x84c83ec)
path:/source/feeds/customer/test
directory:HASH(0x84c75d8)
exclude:
Data structure with seeing error:
$VAR1 = {
'recursive' => 'no',
'version' => '0.20.202.1.1101050227',
'time' => '2011-10-03T04:49:36+0000',
'filter' => '.*',
'file' => [
{
'owner' => 'test_act',
'replication' => '3',
'blocksize' => '134217728',
'permission' => '-rw-------',
'path' => '/source/feeds/customer/test/20110531/test-out-00',
'modified' => '2011-10-03T04:47:46+0000',
'size' => '121406618',
'group' => 'feeds',
'accesstime' => '2011-10-03T04:47:46+0000'
},
Test xml file:
<?xml version="1.0" encoding="UTF-8"?><listing time="2011-10-03T04:49:36+0000" recursive="no" path="/source/feeds/customer/test/20110531" exclude="" filter=".*" version="0.20.202.1.1101050227"><directory path="/source/feeds/customer/test/20110531" modified="2011-10-03T04:48:19+0000" accesstime="1970-01-01T00:00:00+0000" permission="drwx------" owner="test_act" group="feeds"/><file path="/source/feeds/customer/test/20110531/test-out-00" modified="2011-10-03T04:47:46+0000" accesstime="2011-10-03T04:47:46+0000" size="121406618" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group="feeds"/><file path="/source/feeds/customer/test/20110531/test-out-01" modified="2011-10-03T04:48:04+0000" accesstime="2011-10-03T04:48:04+0000" size="127528522" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group="feeds"/><file path="/source/feeds/customer/test/20110531/test-out-02" modified="2011-10-03T04:48:19+0000" accesstime="2011-10-03T04:48:19+0000" size="125452919" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group="feeds"/></listing>
The "Pseudo-hashes are deprecated" error means you're trying to access an array as a hash, which means that either $data->{file} or $data->{file}{path} is an arrayref.
You can check the data type by using print ref $data->{file}. The Data::Dumper module may also help you to see what is in your data structure (perhaps while setting $Data::Dumper::Maxdepth = N to limit the dump to N number of levels if the structure is big).
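A minimal sketch of both checks, using hand-built stand-ins for the parsed data (the paths are shortened, made-up values):

```perl
use strict;
use warnings;
use Data::Dumper;

# Hand-built stand-ins for the two results shown in the question.
my $one  = { file => { path => '/test/test-out-00' } };
my $many = { file => [ { path => '/test/a' }, { path => '/test/b' } ] };

# ref returns the kind of reference stored under the key.
my $type_one  = ref $one->{file};
my $type_many = ref $many->{file};
print "one: $type_one, many: $type_many\n";

# Limit the dump to the top level of a large structure.
local $Data::Dumper::Maxdepth = 1;
print Dumper($many);
```

A single file element yields HASH and several yield ARRAY, which is exactly the difference that makes the hash-style access fail.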
UPDATE
Now that you are using ForceArray, $data->{file} should always point to an arrayref, which may possibly have multiple references to path. Here is a modified segment of your code to handle that. But note that the logic of the if-then-exit conditions may have to change.
if (defined $data->{class} and $data->{class}=~ /FileNotFound/) {
print "The directory: $Path does not exist\n";
exit;
}
exit if ! defined $data->{file};
# filter the list for the first file entry named test-out-00
my ( $file ) = grep {
defined $_->{path} && $_->{path} =~ /test-out-00/
} @{ $data->{file} };
exit if ! defined $file;
$size = $file->{size};
if ($size < 1024000) {
print "FILE SIZE:$size BYTES\n";
exit;
}
When using XML::Simple, the ForceArray option is one of the most important to understand, especially in cases when your input data has nested elements that can occur 1 or more times. For example:
use XML::Simple;
use Data::Dumper;
my @xml_snippets = (
'<opt> <name x="3" y="4">B</name> <name x="5" y="6">C</name> </opt>',
'<opt> <name x="1" y="2">A</name> </opt>',
);
for my $xs (@xml_snippets){
my $data = XMLin($xs, ForceArray => 0);
print Dumper($data);
}
Output:
$VAR1 = {
'name' => [ # Array ref because there are 2 <name> elements.
{
'y' => '4',
'content' => 'B',
'x' => '3'
},
{
'y' => '6',
'content' => 'C',
'x' => '5'
}
]
};
$VAR1 = {
'name' => { # No intermediate array ref.
'y' => '2',
'content' => 'A',
'x' => '1'
}
};
By activating the ForceArray option, you can direct XML::Simple to produce consistent data structures that always use the intermediate array reference, even when there is only 1 of a particular nested element. You can activate the option globally or for specific tags, as illustrated here:
my $data = XMLin($xs, ForceArray => 1 ); # Globally.
my $data = XMLin($xs, ForceArray => [qw(name foo bar)]);
First, I recommend that you use ForceArray => [qw( file )] as previously discussed. That will cause an array to be returned for file, whether there's one or more file element. This is easier to handle than having two possible formats.
As I previously indicated, the problem is that you made no provision for looping over multiple file elements. You said you wanted to exit if the file doesn't exist, so that means you want
my $found;
for my $file (@{ $data->{file} }) {
if ($file->{path} =~ m{/test-out-00\z}) {
$found = $file;
last;
}
}
die("Test file not found\n") if !$found;
... do something with file data in $found ...

How to parse a multi-record XML file using XML::Simple in Perl

My data.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
<cd country="UK">
<title>Hide your heart</title>
<artist>Bonnie Tyler</artist>
<price>10.0</price>
</cd>
<cd country="CHN">
<title>Greatest Hits</title>
<artist>Dolly Parton</artist>
<price>9.99</price>
</cd>
<cd country="USA">
<title>Hello</title>
<artist>Say Hello</artist>
<price>0001</price>
</cd>
</catalog>
my test.pl
#!/usr/bin/perl
# use module
use XML::Simple;
use Data::Dumper;
# create object
$xml = new XML::Simple;
# read XML file
$data = $xml->XMLin("data.xml");
# access XML data
print "$data->{cd}->{country}\n";
print "$data->{cd}->{artist}\n";
print "$data->{cd}->{price}\n";
print "$data->{cd}->{title}\n";
Output:
Not a HASH reference at D:\learning\perl\t1.pl line 16.
Comment: I googled and found an article (handling a single XML record).
http://www.go4expert.com/forums/showthread.php?t=812
I tested the article's code, and it works quite well on my laptop.
Then I created my practice code above to try to access multiple records, but it failed. How can I fix it? Thank you.
Always use strict; and always use warnings;. Don't quote complex references like you're doing. You're right to use Dumper; it should have shown you that cd was an array ref, so you have to specify which cd.
#!/usr/bin/perl
use strict;
use warnings;
# use module
use XML::Simple;
use Data::Dumper;
# create object
my $xml = new XML::Simple;
# read XML file
my $data = $xml->XMLin("file.xml");
# access XML data
print $data->{cd}[0]{country};
print $data->{cd}[0]{artist};
print $data->{cd}[0]{price};
print $data->{cd}[0]{title};
If you do print Dumper($data), you will see that the data structure does not look like you think it does:
$VAR1 = {
'cd' => [
{
'country' => 'UK',
'artist' => 'Bonnie Tyler',
'price' => '10.0',
'title' => 'Hide your heart'
},
{
'country' => 'CHN',
'artist' => 'Dolly Parton',
'price' => '9.99',
'title' => 'Greatest Hits'
},
{
'country' => 'USA',
'artist' => 'Say Hello',
'price' => '0001',
'title' => 'Hello'
}
]
};
You need to access the data like so:
print "$data->{cd}->[0]->{country}\n";
print "$data->{cd}->[0]->{artist}\n";
print "$data->{cd}->[0]->{price}\n";
print "$data->{cd}->[0]->{title}\n";
In addition to what has been said by Evan, if you're unsure if you're stuck with one or many elements, ref() can tell you what it is, and you can handle it accordingly:
my $data = $xml->XMLin("file.xml");
if(ref($data->{cd}) eq 'ARRAY')
{
for my $cd (@{ $data->{cd} })
{
print Dumper $cd;
}
}
else # Chances are it's a single element
{
print Dumper $data->{cd};
}
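The ref() check can also be wrapped in a small helper so that the rest of the code always loops over a list. This is only a sketch: as_list is a hypothetical name, and the data is hand-built rather than parsed from a file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list whether given an array ref,
# a single value, or nothing at all.
sub as_list {
    my ($value) = @_;
    return ()      if !defined $value;
    return @$value if ref $value eq 'ARRAY';
    return ($value);
}

# Hand-built stand-ins for a one-record and a multi-record catalog.
my $single = { cd => { title => 'Hello' } };
my $multi  = { cd => [ { title => 'Hide your heart' }, { title => 'Greatest Hits' } ] };

my @titles = map { $_->{title} } as_list($single->{cd}), as_list($multi->{cd});
print "$_\n" for @titles;
```

Setting ForceArray for cd when parsing achieves the same consistency at the source; the helper is useful when the parsing options cannot be changed.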