Using XML::Simple for retrieving data in Perl

I've created the below XML file for retrieving data.
Input:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<ExecutionLogs>
<category cname='Condition1' cid='T1'>
<log>value1</log>
<log>value2</log>
<log>value3</log>
</category>
<category cname='Condition2' cid='T2'>
<log>value4</log>
<log>value5</log>
<log>value6</log>
</category>
<category cname='Condition3' cid='T3'>
<log>value7</log>
</category>
</ExecutionLogs>
I want the output like below,
Condition1 -> value1,value2,value3
Condition2 -> value4,value5,value6
Condition3 -> value7
I have tried the code below,
use strict;
use XML::Simple;
my $filename = "input.xml";
$config = XML::Simple->new();
$config = XMLin($filename);
@values = @{$config->{'category'}{'log'}};
Please help me on this. Thanks in advance.

A way to do this using XML::Twig:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
XML::Twig->new( twig_handlers => {
category => sub { print $_->att( 'cname'), ': ',
join( ',', $_->children_text( 'log')), "\n";
},
},
)
->parsefile( 'my.xml');
The handler is called each time a category element has been parsed. $_ is the element itself.

What I would do :
use strict; use warnings;
use XML::Simple;
my $filename = "input.xml";
my $config = XMLin($filename, ForceArray => [ 'log' ]);
# we want an array here ^---------------------^
my @category = @{ $config->{'category'} };
# ^------------------------^
# de-reference of an ARRAY ref
foreach my $hash (@category) {
print $hash->{cname}, ' -> ', join(",", @{ $hash->{log} }), "\n";
}
OUTPUT
Condition1 -> value1,value2,value3
Condition2 -> value4,value5,value6
Condition3 -> value7
NOTE
ForceArray => [ 'log' ] ensures that {category}->[N]->{log} is always an ARRAY ref, even when a category contains a single log element.
Without it, the last category ("Condition3") yields a plain string, and the code would try to dereference that string as an ARRAY ref.
Check XML::Simple#ForceArray
and
perldoc perlreftut
perldoc ref
perldoc perlref
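
To make the note above concrete, here is a minimal sketch that parses a trimmed inline copy of the question's XML twice, once with default options and once with ForceArray => [ 'log' ], and reports what {log} holds for each category. The inline string and the variable names $plain and $forced are just for illustration.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Inline copy of the question's XML, trimmed to two categories.
my $xml = <<'XML';
<ExecutionLogs>
  <category cname='Condition1' cid='T1'>
    <log>value1</log>
    <log>value2</log>
  </category>
  <category cname='Condition3' cid='T3'>
    <log>value7</log>
  </category>
</ExecutionLogs>
XML

my $plain  = XMLin($xml);                           # default options
my $forced = XMLin($xml, ForceArray => ['log']);    # log is always an array ref

for my $case ( [ 'default', $plain ], [ 'ForceArray', $forced ] ) {
    my ($label, $data) = @$case;
    for my $cat ( @{ $data->{category} } ) {
        # ref() returns 'ARRAY' for an array ref and '' for a plain string
        my $kind = ref( $cat->{log} ) || 'plain string';
        print "$label: $cat->{cname} log is $kind\n";
    }
}
```

With default options the single-log category comes back as a plain string, which is why @{ $hash->{log} } fails under use strict for "Condition3".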

Related

In Perl, I want to change the tag content (firstclip) to secondclip in my existing XML file

<?xml version="1.0"?>
<root>
<Arguments>
<apkName>Player
<testUseCase>PlayVideo</testUseCase>
<id>1</id>
<clipName>firstclip</clipName>
</apkName>
</Arguments>
</root>
I tried this code, but it's not working: it puts the player name in a new tag named content, and the order of the elements also changes.
use XML::Simple;
my $xml_file = "test.xml";
my $xml = XMLin(
$xml_file,
KeepRoot => 1,
ForceArray => 1,
);
$xml->{root}->[0]->{Arguments}->[0]->{apkName}->[0]->{clipName}->[0] = 'secondclip';
XMLout(
$xml,
XMLDecl =>1,
KeepRoot => 1 ,
NoAttr => 1,
OutputFile => $xml_file,
);
Don't use XML::Simple:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
#parse your XML.
my $twig = XML::Twig -> new -> parsefile ( 'your_file.xml' );
#search for, and modify 'clipName' nodes containing the text 'firstclip'.
$_ -> set_text('secondclip') for $twig -> findnodes('//clipName[string()="firstclip"]');
$twig -> set_pretty_print('indented');
$twig -> print;
Although, are you sure apkName actually looks like that? It seems odd that the closing tag would be where it is.
To rewrite your existing file - XML::Twig has a parsefile_inplace mechanism, but I'd suggest it's overly complicated for what you're trying to do, and instead you just want to
open ( my $output, '>', 'output.new.xml' ) or die $!;
print {$output} $twig -> sprint;

How to get the text contents of an XML child element based on an attribute of its parent

This is my XML data
<categories>
<category id="Id001" name="Abcd">
<project> ID_1234</project>
<project> ID_5678</project>
</category>
<category id="Id002" name="efgh">
<project> ID_6756</project>
<project> ID_4356</project>
</category>
</categories>
I need to get the text contents of each <project> element based on the name attribute of the containing <category> element.
I am using Perl with the XML::LibXML module.
For example, given category name Abcd I should get the list ID_1234, ID_5678.
Here is my code
my $parser = XML::LibXML->new;
$doc = $parser->parse_file( "/cctest/categories.xml" );
my @nodes = $doc->findnodes( '/categories/category' );
foreach my $cat ( @nodes ) {
my @catn = $cat->findvalue('@name');
}
This gives me the category names in array @catn. But how can I get the text values of each project?
You haven't shown what you've tried so far, or what your desired output is so I've made a guess at what you're looking for.
With XML::Twig you could do something like this:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> parse ( \*DATA );
foreach my $project ( $twig -> findnodes ( '//project' ) ) {
print join ",", (map { $project -> parent -> att($_) } qw ( id name )), $project -> text,"\n";
}
__DATA__
<categories>
<category id="Id001" name="Abcd">
<project> ID_1234</project>
<project> ID_5678</project>
</category>
<category id="Id002" name="efgh">
<project> ID_6756</project>
<project> ID_4356</project>
</category>
</categories>
Which produces:
Id001,Abcd, ID_1234,
Id001,Abcd, ID_5678,
Id002,efgh, ID_6756,
Id002,efgh, ID_4356,
It does this by using findnodes to locate any element 'project'.
Then extract the 'id' and 'name' attributes from the parent (the category), and print that - along with the text in this particular element.
xpath is a powerful tool for selecting data from XML, and with a more focussed question, we can give more specific answers.
So if you were seeking all the projects 'beneath' category "Abcd" you could:
foreach my $project ( $twig -> findnodes ( './category[@name="Abcd"]/project' ) ) {
print $project -> text,"\n";
}
This uses XML::LibXML, which is the library you're already using.
Your $cat variable contains an XML element object which you can process with the same findnodes() and findvalue() methods that you used on the top-level $doc object.
#!/usr/bin/perl
use strict;
use warnings;
# We use modern Perl here (specifically say())
use 5.010;
use XML::LibXML;
my $doc = XML::LibXML->new->parse_file('categories.xml');
foreach my $cat ($doc->findnodes('//category')) {
say $cat->findvalue('@name');
foreach my $proj ($cat->findnodes('project')) {
say $proj->findvalue('.');
}
}
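
If you only want the projects for one category name, as in the question's example, the attribute test can go straight into the XPath expression. A minimal sketch, assuming the XML is held in a string and using a hypothetical helper named projects_for; it also trims the leading space that the sample data has inside each <project>.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<categories>
<category id="Id001" name="Abcd">
<project> ID_1234</project>
<project> ID_5678</project>
</category>
<category id="Id002" name="efgh">
<project> ID_6756</project>
<project> ID_4356</project>
</category>
</categories>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# Hypothetical helper: trimmed text of every <project> under the
# <category> whose name attribute equals $name.
# Assumes $name contains no double quote, because it is interpolated
# directly into the XPath expression.
sub projects_for {
    my ($doc, $name) = @_;
    my @texts = map { $_->textContent }
                $doc->findnodes(qq{/categories/category[\@name="$name"]/project});
    s/^\s+|\s+$//g for @texts;    # strip surrounding whitespace
    return @texts;
}

print join(', ', projects_for($doc, 'Abcd')), "\n";
```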
You can try with XML::Simple
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;
my $XML_file = 'your XML file';
my $XML_data;
#Get data from your XML file
open(my $IN, '<:encoding(UTF-8)', $XML_file) or die "cannot open file $XML_file";
{
local $/;
$XML_data = <$IN>;
}
close($IN);
#Store XML data as hash reference
my $xmlSimple = XML::Simple->new(KeepRoot => 1);
my $hash_ref = $xmlSimple->XMLin($XML_data);
print Dumper $hash_ref;
The hash reference will be as below:
$VAR1 = {
'categories' => {
'category' => {
'efgh' => {
'id' => 'Id002',
'project' => [
' ID_6756',
' ID_4356'
]
},
'Abcd' => {
'id' => 'Id001',
'project' => [
' ID_1234',
' ID_5678'
]
}
}
}
};
Now to get data which you want:
foreach(@{$hash_ref->{'categories'}->{'category'}->{'Abcd'}->{'project'}}){
print "$_\n";
}
The result is:
ID_1234
ID_5678

How do I extract an attribute/property in Perl using XML::Twig module?

If I have the below sample XML, how do I extract the _Id from the field using XML::Twig?
<note>
<to _Id="100">Share</to>
<from>Jane</from>
<heading>Reminder</heading>
<body>A simple text</body>
</note>
I've tried combinations of the below with no luck.
sub getId {
my ($twig, $mod) = @_;
##my $to_id = $mod->field('to')->{'_Id'}; ## does not work
##my $to_id = $mod->{'atts'}->{_Id}; ## does not work
##my $to_id = $mod->id; ## does not work
$twig->purge;
}
This is one way to get 100. It uses the first_child method:
use warnings;
use strict;
use XML::Twig;
my $xml = <<XML;
<note>
<to _Id="100">Share</to>
<from>Jane</from>
<heading>Reminder</heading>
<body>A simple text</body>
</note>
XML
my $twig = XML::Twig->new(twig_handlers => { note => \&getId });
$twig->parse($xml);
sub getId {
my ($twig, $mod) = @_;
my $to_id = $mod->first_child('to')->att('_Id');
print "$to_id \n";
}
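
Another option, sketched here as an alternative rather than the only way, is to put the handler on the to element itself, so the element passed to the handler is already the one carrying the attribute. The @ids array is an illustrative name for collecting the values.

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = <<'XML';
<note>
<to _Id="100">Share</to>
<from>Jane</from>
</note>
XML

my @ids;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for each <to> element; the element is the second argument.
        to => sub {
            my ($t, $to) = @_;
            push @ids, $to->att('_Id');
        },
    },
);
$twig->parse($xml);

print "$_\n" for @ids;
```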

perl script to iterate over xml nodes using XML::LibXML

I am trying to come up with a Perl script to iterate over some nodes and get values from an XML file.
My XML file looks like the one below and is saved as spec.xml
<?xml version="1.0" encoding="UTF-8"?>
<WO xmlns="http://www.example.com/yyyy" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
<WOSet>
<SR>
<FINISHTIME>2013-07-29T18:21:38-05:00</FINISHTIME>
<STARTTIME xsi:nil="true" />
<TYPE>SR</TYPE>
<DESCRIPTION>Create CUST</DESCRIPTION>
<EXTERNALSYSTEMID />
<REPORTEDBY>PCAUSR</REPORTEDBY>
<REPORTEDEMAIL />
<STATUS>RESOLVED</STATUS>
<SRID>1001</SRID>
<UID>1</UID>
<SPEC>
<AVALUE>IT</AVALUE>
<ATTRID>CUST_DEPT</ATTRID>
<NALUE xsi:nil="true" />
<TVALUE />
</SPEC>
<SPEC>
<AVALUE>001</AVALUE>
<ATTRID>DEPT_CODE</ATTRID>
<NVALUE xsi:nil="true" />
<TVALUE />
</SPEC>
</SR>
</WOSet>
</WO>
When I run the script below, I get neither output nor an error that would give a clue about where to fix things.
I am not a Perl expert and would love for the experts here to throw some light on this.
#!/usr/bin/perl
use XML::LibXML;
use strict;
use warnings;
my $file = 'spec.xml';
my $parser = XML::LibXML->new();
my $tree = $parser->parse_file($file);
my $root = $tree->getDocumentElement;
foreach my $atrid ( $tree->findnodes('WO/WOSet/SR/SPEC') ) {
my $name = $atrid->findvalue('ATTRID');
my $value = $atrid->findvalue('AVALUE');
print $name;
print " = ";
print $value;
print ";\n";
}
My expected output is
CUST_DEPT = IT
DEPT_CODE = 001
The XML doesn't contain any element named WO in the null namespace. You want to match the elements named WO in the http://www.example.com/yyyy namespace.
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML qw( );
use XML::LibXML::XPathContext qw( );
my $file = 'spec.xml';
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($file);
my $root = $doc->getDocumentElement;
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(y => 'http://www.example.com/yyyy');
for my $atrid ( $xpc->findnodes('y:WO/y:WOSet/y:SR/y:SPEC') ) {
my $name = $xpc->findvalue('y:ATTRID', $atrid);
my $value = $xpc->findvalue('y:AVALUE', $atrid);
print "$name = $value\n";
}
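
If you would rather not register a prefix, a common XPath workaround is to match on local-name(), which ignores the namespace entirely. This is a sketch on a trimmed inline copy of the question's XML; note that it would also match same-named elements from any other namespace, so registering the namespace is the stricter choice.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed copy of spec.xml, keeping the default namespace declaration.
my $xml = <<'XML';
<WO xmlns="http://www.example.com/yyyy">
<WOSet><SR>
<SPEC><AVALUE>IT</AVALUE><ATTRID>CUST_DEPT</ATTRID></SPEC>
<SPEC><AVALUE>001</AVALUE><ATTRID>DEPT_CODE</ATTRID></SPEC>
</SR></WOSet>
</WO>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

my @pairs;
# local-name() compares only the element name, without its namespace.
for my $spec ( $doc->findnodes('//*[local-name()="SPEC"]') ) {
    my $name  = $spec->findvalue('*[local-name()="ATTRID"]');
    my $value = $spec->findvalue('*[local-name()="AVALUE"]');
    push @pairs, "$name = $value";
}

print "$_\n" for @pairs;
```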

How to parse a multi-record XML file using XML::Simple in Perl

My data.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
<cd country="UK">
<title>Hide your heart</title>
<artist>Bonnie Tyler</artist>
<price>10.0</price>
</cd>
<cd country="CHN">
<title>Greatest Hits</title>
<artist>Dolly Parton</artist>
<price>9.99</price>
</cd>
<cd country="USA">
<title>Hello</title>
<artist>Say Hello</artist>
<price>0001</price>
</cd>
</catalog>
my test.pl
#!/usr/bin/perl
# use module
use XML::Simple;
use Data::Dumper;
# create object
$xml = new XML::Simple;
# read XML file
$data = $xml->XMLin("data.xml");
# access XML data
print "$data->{cd}->{country}\n";
print "$data->{cd}->{artist}\n";
print "$data->{cd}->{price}\n";
print "$data->{cd}->{title}\n";
Output:
Not a HASH reference at D:\learning\perl\t1.pl line 16.
Comment: I googled and found the article(handle single xml record).
http://www.go4expert.com/forums/showthread.php?t=812
I tested with the article code, it works quite well on my laptop.
Then I wrote the practice code above to try to access multiple records, but it failed. How can I fix it? Thank you.
Always use strict; and always use warnings;. Don't quote complex references the way you're doing. You were right to load Data::Dumper; dumping $data would have shown you that cd is an array ref, so you have to specify which cd.
#!/usr/bin/perl
use strict;
use warnings;
# use module
use XML::Simple;
use Data::Dumper;
# create object
my $xml = new XML::Simple;
# read XML file
my $data = $xml->XMLin("file.xml");
# access XML data
print $data->{cd}[0]{country};
print $data->{cd}[0]{artist};
print $data->{cd}[0]{price};
print $data->{cd}[0]{title};
If you do print Dumper($data), you will see that the data structure does not look like you think it does:
$VAR1 = {
'cd' => [
{
'country' => 'UK',
'artist' => 'Bonnie Tyler',
'price' => '10.0',
'title' => 'Hide your heart'
},
{
'country' => 'CHN',
'artist' => 'Dolly Parton',
'price' => '9.99',
'title' => 'Greatest Hits'
},
{
'country' => 'USA',
'artist' => 'Say Hello',
'price' => '0001',
'title' => 'Hello'
}
]
};
You need to access the data like so:
print "$data->{cd}->[0]->{country}\n";
print "$data->{cd}->[0]->{artist}\n";
print "$data->{cd}->[0]->{price}\n";
print "$data->{cd}->[0]->{title}\n";
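
To print every record instead of only index 0, loop over the array ref. A minimal sketch on a shortened inline copy of data.xml; ForceArray => ['cd'] and KeyAttr => [] are XML::Simple options added here as an assumption about what you want, so that {cd} stays an array ref even if the file holds a single record.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

my $xml = <<'XML';
<catalog>
<cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>10.0</price></cd>
<cd country="CHN"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.99</price></cd>
</catalog>
XML

# ForceArray keeps {cd} an array ref even for a one-record file;
# KeyAttr => [] switches off folding on name/key/id attributes.
my $data = XMLin($xml, ForceArray => ['cd'], KeyAttr => []);

my @lines;
for my $cd ( @{ $data->{cd} } ) {
    # Hash slice on the record: pick the fields in a fixed order.
    push @lines, join ' | ', @{$cd}{qw(country artist title price)};
}

print "$_\n" for @lines;
```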
In addition to what has been said by Evan, if you're unsure if you're stuck with one or many elements, ref() can tell you what it is, and you can handle it accordingly:
my $data = $xml->XMLin("file.xml");
if(ref($data->{cd}) eq 'ARRAY')
{
for my $cd (@{ $data->{cd} })
{
print Dumper $cd;
}
}
else # Chances are it's a single element
{
print Dumper $data->{cd};
}
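
The same ref() check can be wrapped in a small helper so the calling code always gets a list. This is a sketch using only core Perl; as_list is a hypothetical name, and the two hashes are hand-written stand-ins for a one-record and a multi-record parse result, not actual parser output.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "one item or an array ref of items" into a flat list.
sub as_list {
    my ($value) = @_;
    return ()        unless defined $value;
    return @{$value} if ref($value) eq 'ARRAY';
    return ($value);
}

# Hand-written stand-ins for a single-record and a multi-record result.
my $one  = { cd => { artist => 'Bonnie Tyler' } };
my $many = { cd => [ { artist => 'Bonnie Tyler' }, { artist => 'Dolly Parton' } ] };

for my $data ($one, $many) {
    my @artists = map { $_->{artist} } as_list( $data->{cd} );
    print scalar(@artists), ": @artists\n";
}
```

The loop body is then identical for both shapes, which avoids duplicating the processing code in an if/else.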