XML::LibXML replace value - perl

I have the XML file below, and I want to write a Perl script that matches first on the book name and then on the author name; if both match, it should change the "value" attribute from false to true. For example, if the book name is abc and the author name is john, change the value from false to true.
use warnings;
use strict;
use XML::LibXML;
my $parser = XML::LibXML->new();
my $xmldoc = $parser->parse_file('b.xml');
for my $book ($xmldoc->findnodes('/library/book')) {
my $name = $book->findvalue('/@name');
if($name eq "abc")
{
print "yes" ;
}
}
<library>
<book name="abc" id="3">
<key name="history">
<default label="base" value="1"/>
</key>
<author name="john">
<default label="base" value="false"/>
</author>
</book>
<book name="xyz" id="4">
<key name="science">
<default label="base" value="1"/>
</key>
<author name="nik">
<default label="base" value="false"/>
</author>
</book>
</library>
I am new to Perl; can someone help me here? I have written some code to get as far as the first check, but even that is not working.

for my $default_author_node ($xmldoc->findnodes(
'/library/book[@name="abc"]/author[@name="john"]/default'
)) {
$default_author_node->setAttribute('value', 'true');
}
But you probably want the names to be variable.
Solution 1: Dynamically build the above XPath.
sub text_to_xpath {
my ($s) = @_;
return qq{"$s"} if $s !~ tr/"//;
return qq{'$s'} if $s !~ tr/'//;
$s =~ s/"/", '"', "/g;
return qq{concat("$s")};
}
my $target_book_xp = text_to_xpath($target_book);
my $target_author_xp = text_to_xpath($target_author);
for my $default_author_node ($xmldoc->findnodes(
"/library/book[\@name=$target_book_xp]/author[\@name=$target_author_xp]/default"
)) {
$default_author_node->setAttribute('value', 'true');
}
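To see what the quoting helper in Solution 1 produces, here is a small standalone sketch. It repeats the helper (using the binding operator =~ for the substitution) and prints the XPath literal for a few sample strings; the sample names are made up for illustration.

```perl
use strict;
use warnings;

# Quote an arbitrary Perl string as an XPath 1.0 string literal.
sub text_to_xpath {
    my ($s) = @_;
    return qq{"$s"} if $s !~ tr/"//;   # no double quotes: wrap in "..."
    return qq{'$s'} if $s !~ tr/'//;   # no single quotes: wrap in '...'
    # Both kinds of quote present: split at each " and join with concat().
    $s =~ s/"/", '"', "/g;
    return qq{concat("$s")};
}

# Hypothetical sample values, not taken from the question.
my @samples = ('john', 'O"Brien', q{it's}, q{say "it's"});
print text_to_xpath($_), "\n" for @samples;
```

XPath 1.0 has no escape character inside string literals, which is why the last case has to fall back to concat().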
Solution 2: Do the checking yourself.
This is what you were attempting. The problem is that the XPath /@name looks for attributes (@) named name on the root of the document (/), and the only node there is the root element (library). Just as with directory paths, don't use a leading / if you want to search relative to the context node. For example, $book->findvalue('@name') returns the name attribute of the current book.
for my $book_node ($xmldoc->findnodes('/library/book')) {
my $name = $book_node->getAttribute('name');
next if !defined($name) || $name ne $target_book;
for my $author_node ($book_node->findnodes('author')) {
my $name = $author_node->getAttribute('name');
next if !defined($name) || $name ne $target_author;
for my $default_author_node ($author_node->findnodes('default')) {
$default_author_node->setAttribute('value', 'true');
}
}
}

Related

unable to parse xml file using registered namespace

I am using XML::LibXML to parse an XML file. There seems to be some problem with using a registered namespace while accessing the node elements. I am planning to convert this XML data into a CSV file, so I am trying to access each and every element. To start with, I tried extracting the attribute values of the <country> and <state> tags. Below is the code I have come up with, but I am getting an error saying XPath error : Undefined namespace prefix.
use strict;
use warnings;
use Data::Dumper;
use XML::LibXML;
my $XML=<<EOF;
<DataSet xmlns="http://www.w3schools.com" xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd">
<exec>
<survey_region ver="1.1" type="x789" date="20160312"/>
<survey_loc ver="1.1" type="x789" date="20160312"/>
<note>Population survey</note>
</exec>
<country name="ABC" type="MALE">
<state name="ABC_state1" result="PASS">
<info>
<type>literacy rate comparison</type>
</info>
<comment><![CDATA[
Some random text
contained here
]]></comment>
</state>
</country>
<country name="XYZ" type="MALE">
<state name="XYZ_state2" result="FAIL">
<info>
<type>literacy rate comparison</type>
</info>
<comment><![CDATA[
any random text data
]]></comment>
</state>
</country>
</DataSet>
EOF
my $parser = XML::LibXML->new();
my $doc = $parser->parse_string($XML);
my $xc = XML::LibXML::XPathContext->new($doc);
$xc->registerNs('x','http://www.w3schools.com');
foreach my $camelid ($xc->findnodes('//x:DataSet')) {
my $country_name = $camelid->findvalue('./x:country/@name');
my $country_type = $camelid->findvalue('./x:country/@type');
my $state_name = $camelid->findvalue('./x:state/@name');
my $state_result = $camelid->findvalue('./x:state/@result');
print "state_name ($state_name)\n";
print "state_result ($state_result)\n";
print "country_name ($country_name)\n";
print "country_type ($country_type)\n";
}
Update
if I remove the name space from XML and change my XPath slightly it seems to work. Can someone help me understand the difference.
foreach my $camelid ($xc->findnodes('//DataSet')) {
my $country_name = $camelid->findvalue('./country/@name');
my $country_type = $camelid->findvalue('./country/@type');
my $state_name = $camelid->findvalue('./country/state/@name');
my $state_result = $camelid->findvalue('./country/state/@result');
print "state_name ($state_name)\n";
print "state_result ($state_result)\n";
print "country_name ($country_name)\n";
print "country_type ($country_type)\n";
}
This would be my approach
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
my $XML=<<EOF;
<DataSet xmlns="http://www.w3schools.com" xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd">
<exec>
<survey_region ver="1.1" type="x789" date="20160312"/>
<survey_loc ver="1.1" type="x789" date="20160312"/>
<note>Population survey</note>
</exec>
<country name="ABC" type="MALE">
<state name="ABC_state1" result="PASS">
<info>
<type>literacy rate comparison</type>
</info>
<comment><![CDATA[
Some random text
contained here
]]></comment>
</state>
</country>
<country name="XYZ" type="MALE">
<state name="XYZ_state2" result="FAIL">
<info>
<type>literacy rate comparison</type>
</info>
<comment><![CDATA[
any random text data
]]></comment>
</state>
</country>
</DataSet>
EOF
my $parser = XML::LibXML->new();
my $tree = $parser->parse_string($XML);
my $root = $tree->getDocumentElement;
my @country = $root->getElementsByTagName('country');
foreach my $citem(@country){
my $country_name = $citem->getAttribute('name');
my $country_type = $citem->getAttribute('type');
print "Country Name -- $country_name\nCountry Type -- $country_type\n";
my @state = $citem->getElementsByTagName('state');
foreach my $sitem(@state){
my @info = $sitem->getElementsByTagName('info');
my $state_name = $sitem->getAttribute('name');
my $state_result = $sitem->getAttribute('result');
print "State Name -- $state_name\nState Result -- $state_result\n";
foreach my $i (@info){
my $text = $i->getElementsByTagName('type');
print "Info --- $text\n";
}
}
print "\n";
}
Of course you can manipulate the data any way you'd like. If you are parsing from a file, change parse_string to parse_file.
For the individual elements in the XML, use getElementsByTagName to get the elements within the tags. This should be enough to get you going.
There seem to be two small mistakes here.
1. Call findvalue on the XPathContext object, passing the context node as the second parameter.
2. name is an attribute of country, not a node.
Therefore try:
my $country_name = $xc->findvalue('./x:country/@name', $camelid );
Update for the updated question ("if I remove the name space from XML and change my XPath slightly it seems to work. Can someone help me understand the difference."):
To understand what happens here, have a look at NOTE ON NAMESPACES AND XPATH in the XML::LibXML::Node documentation.
In your case, $camelid->findvalue('./x:state/@name'); calls findvalue on a node.
But: "The recommended way is to use the XML::LibXML::XPathContext module to define an explicit context for XPath evaluation, in which a document independent prefix-to-namespace mapping can be defined." That is what I did above.
Conclusion:
Calling find on a node will only work if the root element has no namespace
(or if you use the same prefix as in the XML document, if there is one).
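Putting the namespace advice together, here is a minimal sketch that runs every XPath through the XPathContext and collects one CSV-style row per state. The trimmed-down XML and the @rows variable are assumptions made for the example; only the namespace URI and the element and attribute names come from the question.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Reduced version of the question's document (same default namespace).
my $xml = <<'EOF';
<DataSet xmlns="http://www.w3schools.com">
  <country name="ABC" type="MALE">
    <state name="ABC_state1" result="PASS"/>
  </country>
  <country name="XYZ" type="MALE">
    <state name="XYZ_state2" result="FAIL"/>
  </country>
</DataSet>
EOF

my $doc = XML::LibXML->load_xml(string => $xml);
my $xc  = XML::LibXML::XPathContext->new($doc);
$xc->registerNs(x => 'http://www.w3schools.com');

my @rows;
for my $country ($xc->findnodes('/x:DataSet/x:country')) {
    # Pass the context node as the second argument so the
    # registered prefix is still known for the relative path.
    for my $state ($xc->findnodes('x:state', $country)) {
        push @rows, join ',',
            $country->getAttribute('name'),
            $country->getAttribute('type'),
            $state->getAttribute('name'),
            $state->getAttribute('result');
    }
}
print "$_\n" for @rows;
```

Attributes without a prefix are in no namespace, so getAttribute('name') and @name work without the x: prefix even though the elements need it.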

How to actually modify values of an XML file using XML::LibXML

I have an XML file (information.xml). I have to extract element and attribute values from this XML file and insert those element and attribute values into another XML file (build.xml). I have to change the build.xml file by filling the appropriate element values and tags from information.xml file.
I have to use XML::LibXML to do so. I am able to extract the element and attribute values from information.xml. But, I am unable to open and fill those values in build.xml
Example :
information.xml
<info>
<app version="10.5.10" long_name ="My Application">
<name> MyApp </name>
<owner>larry </owner>
<description> This is my first application</description>
</app>
</info>
build.xml
<build long_name="" version="">
<section type="Appdesciption">
<description> </description>
</section>
<section type="Appdetails">
<app_name> </app_name>
<owner></owner>
</section>
</build>
Now, my task is to extract value of owner from information.xml, open build.xml, search for owner tag in build.xml and put the extracted value there.
The Perl script looks like:
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
my $file1="/root/shubhra/myapp/information.xml";
my $file2="/root/shubhra/myapp/build.xml";
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($file1);
foreach my $line ($doc->findnodes('//info/app'))
{
my $owner= $line->findnodes('./owner'); # 1st way
print "\n",$owner->to_literal,"\n";
my ($long_name) = $line->findvalue('./@long_name'); # 2nd way
print "\n $long_name \n";
my $version = $line->findnodes('@version');
print "\n",$version->to_literal,"\n";
}
my $parser2 = XML::LibXML->new();
my $doc2 = $parser2->parse_file($file2);
foreach my $line2 ($doc2->findnodes('//build'))
{
my ($owner2)= $line2->findnodes('./section/owner/text()');
my ($version2)=$line2->findvalue('./@version');
print "\n Build.xml already has version : $version2 \n";
print "\n Build.xml already has owner :",$owner2->to_literal;
$owner2->setData("Windows Application 2"); # Not changing build.xml
$line2->setAttribute(q|version|,"60.60.60"); # Not changing build.xml
my $changedversion = $line2->getAttribute(q|version|);
#superficially changed but didn't changed build.xml content
print "\n The changed version is : $changedversion";
}
build.xml looks like :
<build long_name="" version="9.10.10">
<section type="Appdesciption">
<description> </description>
</section>
<section type="Appdetails">
<app_name> </app_name>
<owner>shubhra</owner>
</section>
</build>
my $doc3 = XML::LibXML->load_xml(location => $file2, no_blanks => 1);
my $xpath_expression = '/build/section/owner/text()';
my @nodes = $doc3->findnodes( $xpath_expression );
for my $node (@nodes) {
my $content = $node->toString;
$content = $owner; # $owner: the value extracted from information.xml
$node->setData($content);
}
$doc3->toFile($file2 . '.new', 1);
The following fails to find anything (setting $owner2 to undef) since owner has no text:
my ($owner2) = $line2->findnodes('./section/owner/text()');
You want
my ($owner2) = $line2->findnodes('./section/owner');
This entails changing
print "\n Build.xml already has owner :", $owner2->to_literal;
to
print "\n Build.xml already has owner :", $owner2->textContent;
and
$owner2->setData("Windows Application 2");
to
$owner2->removeChildNodes();
$owner2->appendText("Windows Application 2");
You imply you want the following to change build.xml, but it doesn't even mention build.xml:
$line2->setAttribute(q|version|, "60.60.60");
It does modify $doc2, but you'll need to add the following code to modify build.xml too:
$doc2->toFile('build.xml');
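For reference, the whole round trip could look roughly like the sketch below. It parses both documents from strings so that it runs on its own; with real files you would use load_xml(location => ...) and finish with toFile. The variable names ($info, $build, $owner_node) are chosen for the example, and the XML is a shortened version of the two files in the question.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened stand-ins for information.xml and build.xml.
my $info = XML::LibXML->load_xml(string => <<'EOF');
<info>
  <app version="10.5.10" long_name="My Application">
    <name>MyApp</name>
    <owner>larry</owner>
  </app>
</info>
EOF

my $build = XML::LibXML->load_xml(string => <<'EOF');
<build long_name="" version="">
  <section type="Appdetails">
    <app_name></app_name>
    <owner></owner>
  </section>
</build>
EOF

# Read the source values.
my ($app) = $info->findnodes('/info/app');
my $owner = $app->findvalue('owner');

# Copy the attributes onto the <build> root element.
my $root = $build->documentElement;
$root->setAttribute($_, $app->getAttribute($_)) for qw(version long_name);

# Select the element itself, not text(), so an empty <owner/> is found too.
my ($owner_node) = $build->findnodes('/build/section/owner');
$owner_node->removeChildNodes();
$owner_node->appendText($owner);

# The changes only exist in memory until the document is serialized.
print $build->toString;
# With a real file: $build->toFile('build.xml');
```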

print lines between x and y from the log using Perl

I have log file and that contain some xml messages like...
<fixsim xyz='tststtsts'>
<name test="test1">
<time t=234>
</time>
</name>
</fixsim>
here some normal log text
whoiwoei
blsdbndsnb
<fixsim xyz='tssts'
<name test="test2"
<time t=234>
</time>
</name>
</fixsim>
and so on....
From the above log file I want to grab the XML message (from <fixsim> to </fixsim>) that meets some condition. For example, I want the XML message having test="test2", so as output I should get
<fixsim xyz='tssts'
<name test="test2"
<time t=234>
</time>
</name>
</fixsim>
The following will get the XML docs:
process($_) for $log =~ m{<fixsim.*?</fixsim>}sg;
and so would
my $xml;
while (<$log_fh>) {
if ( my $count = m{<fixsim} .. m{</fixsim>} ) {
$xml .= $_;
if ($count =~ /E0\z/) {
process($xml);
$xml = undef;
}
}
}
process($xml) if defined($xml);
Once you have the XML, you can extract the field you need using your favorite XML parser.
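As a rough illustration of the first approach combined with the condition from the question, the sketch below slurps the log into a string, pulls out every <fixsim> block and keeps the ones containing test="test2". It filters with a plain pattern match rather than an XML parser, on the assumption that the blocks may not be well-formed (the second sample in the question is missing some closing brackets); $wanted and @matches are names invented for the example.

```perl
use strict;
use warnings;

# Sample log content, shortened from the question.
my $log = <<'EOF';
<fixsim xyz='tststtsts'>
<name test="test1">
<time t=234>
</time>
</name>
</fixsim>
here some normal log text
<fixsim xyz='tssts'
<name test="test2"
<time t=234>
</time>
</name>
</fixsim>
EOF

my $wanted = 'test2';

# In list context, m{...}sg returns every non-greedy <fixsim>...</fixsim> match;
# grep then keeps only the blocks whose test attribute has the wanted value.
my @matches = grep { /\btest="\Q$wanted\E"/ }
              $log =~ m{<fixsim.*?</fixsim>}sg;

print "$_\n" for @matches;
```

If the messages in the real log are well-formed, parsing each block and checking the attribute with XPath would be more robust than the pattern match.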

Check the particular value of attribute and extract sibling's value using XPath in Perl using XML::LibXML

If solve is user, I need the value of name (LINES).
If solve is imm and name is ARCH, I need the value of value (6M).
I tried the following:
use strict;
use warnings;
use XML::LibXML;
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file(...);
my $j = 0;
my @imm;
$imm[$j++] = $_->data
foreach
$doc->findnodes(
"//meters/meter"
."/name[../value[\@solve/string() = 'immediate']]"
."/text()");
Errors are
Possible unintended interpolation of @solve in string
Global symbol "@solve" requires explicit package name
Sample XML:
<rules>
<meters>
<meter>
<name>LINES</spirit:name>
<display>LINES</display>
<description>Specifies the number of lines</description>
<value
solve="user"
id="LINES"
order="11"
format="long"
dataType="integer"
min="2"
max="34">34</value>
</meter>
<meter>
<name>ARCH</name>
<display>Define ARCH</display>
<description>Define ARCH</description>
<value
format="string"
solve="imm"
id="ARCH"
>6M</value>
</meter>
</meters>
</rules>
<rules>
<meters>
<meter>
<name>LINES</spirit:name>
<display>LINES</display>
<description>Specifies the number of lines</description>
<value
solve="user"
id="LINES"
order="11"
format="long"
dataType="integer"
min="2"
max="34">34</value>
</meter>
<meter>
<name>ARCH</name>
<display>Define ARCH</display>
<description>Define ARCH</description>
<value
format="string"
solve="imm"
id="ARCH"
>6M</value>
</meter>
</meters>
</rules>
This works for me:
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file('XXX.xml');
my $j = 0;
my @imm;
$imm[$j++] = $_->data
for $doc->findnodes(
"//meters/meter"
."/name[../value[\@solve = 'imm']]"
."/text()");
print for @imm;
I made the following changes to your code:
I removed the /string() part from the XPath expression. It is invalid usage of a function as indicated by the error I got.
The value of solve is imm in your sample data, not immediate.
BTW, changing the middle of the XPath expression to
'/name[../value[@solve = "immediate"]]'
avoids the need of the backslash.
A quick and dirty hack to solve your issue, with XML::Simple:
use strict;
use warnings;
use XML::Simple ;
use Data::Dumper;
my $xml = XMLin(qq|
<xml>
<rules>
<meters>
<meter>
<name>LINES</name>
<display>LINES</display>
<description>Specifies the number of lines</description>
<value
solve="user"
id="LINES"
order="11"
format="long"
dataType="integer"
min="2"
max="34">34</value>
</meter>
<meter>
<name>ARCH</name>
<display>Define ARCH</display>
<description>Define ARCH</description>
<value
format="string"
solve="imm"
id="ARCH"
>6M</value>
</meter>
</meters>
</rules>
<rules>
<meters>
<meter>
<name>LINES</name>
<display>LINES</display>
<description>Specifies the number of lines</description>
<value
solve="user"
id="LINES"
order="11"
format="long"
dataType="integer"
min="2"
max="34">34</value>
</meter>
<meter>
<name>ARCH</name>
<display>Define ARCH</display>
<description>Define ARCH</description>
<value
format="string"
solve="imm"
id="ARCH"
>6M</value>
</meter>
</meters>
</rules>
</xml>|,
ForceArray => 1);
foreach my $rule( @{ $xml->{rules} } ){
foreach my $meter ( @{ $rule->{meters} } ){
foreach my $m ( @{ $meter->{meter} } ){
if ( $m->{value}->{LINES} and $m->{value}->{LINES}->{solve} eq 'user' ) {
print "user solver: ".$m->{name}->[0]."\n";
} elsif ( $m->{value}->{ARCH}->{solve} and $m->{name}->[0] and $m->{value}->{ARCH}->{solve} eq 'imm' and $m->{name}->[0] eq 'ARCH' ) {
print "imm solver: ".Dumper( $m->{value}->{ARCH}->{content})."\n";
} else {
print "no solver???".Dumper($m);
}
}
}
}
I hope you can use this as a starting point.
Observations
given that the foreaches contain only one instruction (the if), they could be rewritten into maps, although I would not encourage that
the if could also be rewritten using ternary operator and print the resulting string
taking into consideration all of the above, the whole XML processing could be rewritten in a single line consisting of 3 maps and 2 ternary ops - I think it would be almost unreadable

Dropdown-Menu with optgroup

I am trying to create a dynamic dropdown menu that receives its entries from an XML file at script startup.
First I tried a static version like this:
Tr(td([popup_menu( -name=>'betreff', -values=>[optgroup(-name=>'Mädels',
-values=>['Susi','Steffi',''], -labels=>{'Susi'=>'Petra','Steffi'=>'Paula'})
,optgroup(-name=>'Jungs', -values=>['moe', 'catch',''])])]));
that worked fine.
The problem starts when I try to put the -values parameter of popup_menu into a scalar variable.
It should look something like this:
$popup_values = "[optgroup(-name=>'Mädels', -values=>['Susi','Steffi',''],
-labels=>{'Susi'=>'Petra','Steffi'=>'Paula'}),optgroup(-name=>'Jungs',
-values=>['moe', 'catch',''])]"
or with single quotation marks.
The goal is to build that string by concatenating the syntax-corrected elements of the XML file. That's because I do not know a priori how many optgroups, or list elements within the optgroups, will exist.
Any idea?
Thx in advance
Jochen
So you have an XML file which you use to generate that string? Why not directly generate the data structure necessary for the popup_menu call? It's just an array (you can call optgroup while "analysing" the XML file)
If you really want to use the string-solution then you could use eval to transform the string to the data structure. Though this solution has certain security issues.
Reading From XML-File
Here's an example of how to transform from XML to the optgroup; this of course depends on what your XML file looks like.
use strict;
use warnings;
use XML::Simple;
use CGI qw/:standard/;
my $xmlString = join('', <DATA>);
my $xmlData = XMLin($xmlString);
my @popup_values;
foreach my $group (keys(%{$xmlData->{group}})) {
my (@values, %labels);
my $options = $xmlData->{group}->{$group}->{opt};
foreach my $option (keys(%{$options})) {
push @values, $option;
if(exists($options->{$option}->{label}) &&
'' ne $options->{$option}->{label}) {
$labels{$option} = $options->{$option}->{label};
}
}
push @popup_values, optgroup(-name => $group,
-labels => \%labels,
-values => \@values
);
}
print popup_menu(-name=>'betreff', -values=> \@popup_values);
__DATA__
<?xml version="1.0" encoding="UTF-8" ?>
<dropdown>
<group name="Mädels">
<opt name="Susi" label="Petra"/>
<opt name="Steffi" label="Paula"/>
<opt name="" />
</group>
<group name="Jungs">
<opt name="moe" />
<opt name="catch" />
<opt name="" />
</group>
</dropdown>