Perl: unable to extract sibling value using Twig::XPath syntax - perl

I recently started using XML::Twig::XPath, but the module does not seem to recognize some XPath syntax.
In the following XML, I want the value of the "Txt" node when the value of the sibling PlcAndNm node is "ext_1":
<?xml version="1.0" encoding="UTF-8"?>
<root>
<Document>
<RedOrdrV03>
<MsgId>
<Id>1</Id>
</MsgId>
<Xtnsn>
<PlcAndNm>ext_1</PlcAndNm>
<Txt>1234</Txt>
</Xtnsn>
<Xtnsn>
<PlcAndNm>ext_2</PlcAndNm>
<Txt>ABC</Txt>
</Xtnsn>
</RedOrdrV03>
</Document>
<Document>
<RedOrdrV03>
<MsgId>
<Id>2</Id>
</MsgId>
<Xtnsn>
<PlcAndNm>ext_1</PlcAndNm>
<Txt>9876</Txt>
</Xtnsn>
<Xtnsn>
<PlcAndNm>ext_2</PlcAndNm>
<Txt>DEF</Txt>
</Xtnsn>
</RedOrdrV03>
</Document>
</root>
I tried the expression //Xtnsn[PlcAndNm="ext_1"]/Txt, but I received an error.
This is the code:
use XML::Twig::XPath;
my $subelt_count = 1;
my @processed_elements;
my $xmlfile = 'c:/test_file.xml';
my $parser = XML::Twig->new(
twig_roots => { 'RedOrdrV03' => \&process_xml } ,
end_tag_handlers => { 'Document' },
);
$parser->parsefile($xmlfile);
sub process_xml {
my ( $twig, $elt ) = @_;
push( @processed_elements, $elt );
if ( @processed_elements >= $subelt_count ) {
my $MsgId = $twig->findvalue('RedOrdrV03/MsgId/Id');
my $Xtnsn_Txt1 = $twig->findvalue('//Xtnsn[PlcAndNm="ext_1"]/Txt');
print "MsgId: $MsgId - Xtnsn_Txt1: $Xtnsn_Txt1\n";
}
$_->delete for @processed_elements;
@processed_elements = ();
$twig->purge;
}
Is there a simple way of using XPath to obtain the value?
I know that one possibility is something like:
my $Xtnsn_Txt1 = $twig->first_elt( sub { $_[0]->tag eq 'PlcAndNm' && $_[0]->text eq 'ext_1' })->next_sibling()->text();
but I would prefer the simpler XPath syntax.
Thanks in advance for your help!

You can use this:
my $Xtnsn_Txt1 = $twig->findvalue('//Xtnsn/PlcAndNm[string()="ext_1"]/../Txt');

Another approach could be :
//Txt[preceding-sibling::PlcAndNm[.="ext_1"]]
You can also tweak your XPath expression slightly to see whether this works:
//Xtnsn[./PlcAndNm[contains(.,"ext_1")]]/Txt
EDIT : This works fine with the original XML::XPath module :
use XML::XPath;
use XML::XPath::Node::Element;
my $xp = XML::XPath->new(filename => 'pathtoyour.xml');
my $nodeset = $xp->find('//Xtnsn[PlcAndNm="ext_1"]/Txt');
foreach my $node ($nodeset->get_nodelist) {
print XML::XPath::Node::Element::string_value($node),"\n\n";
}
Output : 1234 9876
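One possible cause of the original error, offered as an assumption because the error message itself is not shown: the question's code loads XML::Twig::XPath but builds the parser with XML::Twig->new, so findvalue goes through XML::Twig's built-in XPath subset rather than the full XPath engine. Below is a minimal sketch that builds the twig with XML::Twig::XPath->new and evaluates both paths relative to each RedOrdrV03 element. The inline XML is a shortened copy of the sample, and the @rows array is only there for illustration.

```perl
use strict;
use warnings;
use XML::Twig::XPath;

# Shortened copy of the sample document from the question
my $xml = <<'END';
<root>
  <Document><RedOrdrV03>
    <MsgId><Id>1</Id></MsgId>
    <Xtnsn><PlcAndNm>ext_1</PlcAndNm><Txt>1234</Txt></Xtnsn>
    <Xtnsn><PlcAndNm>ext_2</PlcAndNm><Txt>ABC</Txt></Xtnsn>
  </RedOrdrV03></Document>
  <Document><RedOrdrV03>
    <MsgId><Id>2</Id></MsgId>
    <Xtnsn><PlcAndNm>ext_1</PlcAndNm><Txt>9876</Txt></Xtnsn>
    <Xtnsn><PlcAndNm>ext_2</PlcAndNm><Txt>DEF</Txt></Xtnsn>
  </RedOrdrV03></Document>
</root>
END

my @rows;    # one line of output per RedOrdrV03 element

# XML::Twig::XPath->new (not XML::Twig->new) enables the full XPath engine
my $twig = XML::Twig::XPath->new(
    twig_handlers => {
        RedOrdrV03 => sub {
            my ( $t, $elt ) = @_;
            # Both paths are relative to the current RedOrdrV03 element
            my $id  = $elt->findvalue('MsgId/Id');
            my $txt = $elt->findvalue('Xtnsn[PlcAndNm="ext_1"]/Txt');
            push @rows, "MsgId: $id - Xtnsn_Txt1: $txt";
            $t->purge;    # free the memory used by elements already handled
        },
    },
);
$twig->parse($xml);
print "$_\n" for @rows;
```

Calling findvalue on $elt instead of $twig keeps each lookup scoped to the element being handled, so the leading // is no longer needed.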

Related

XML SIMPLE PERL - Looping through child nodes problem

I have some perl code
my $res = $ua->get( $access->to_url );
if ($res->is_success) {
my $ref = XMLin( $res->content );
my $xml = new XML::Simple;
$data = $xml->XMLin($res->content,ForceArray => 1);
#print $res->content;
for my $purchase ( @{ $data->{PurchaseOrders}->{PurchaseOrder}} )
This part works fine.
However, when I try to loop through the child elements and there is only one child element,
I get the "not an array reference" error:
for my $item ( @{$purchase->{LineItems}->{LineItem}} )
{
$itemCode = $item->{ItemCode};
}
The XML structure is something like this
PurchaseOrders
PurchaseOrder
LineItems
LineItem
I am aware of the XML::Simple issue that requires ForceArray, but I am not sure how to force an array on the child nodes.
I found this question on Stack Overflow that seems very close to my exact problem, but I am struggling to apply it within my code:
perl, parsing XML using XML::Simple
$VAR1 = {
'PurchaseOrderID' => '82fa50d6-fd45-4fd2-b42d-035aaaa39a2c',
'LineAmountTypes' => 'Exclusive',
'SentToContact' => 'true',
'AttentionTo' => 'sxxxx',
'Status' => 'AUTHORISED',
'LineItems' => {
'LineItem' => {
'LineAmount' => '57.61',
'Quantity' => '1.0000',
'UnitAmount' => '57.6100',
'LineItemID' => 'e295d55d-68bd',
'Description' => 'xxx',
'ItemCode' => 'xxx',
'TaxAmount' => '11.52',
'AccountCode' => '310',
'TaxType' => 'INPUT2'
}
},
'UpdatedDateUTC' => '2018-10-26T14:19:19.053',
'CurrencyCode' => 'GBP',
'Contact' => {
I have included a snippet from my Dumper output. Please note that it is only the important part; everything is fine until it reaches the line items.
Here is the XML file as well:
<PurchaseOrder>
<PurchaseOrderID>82fa50</PurchaseOrderID>
<PurchaseOrderNumber>PO-0029</PurchaseOrderNumber>
<Date>2018-10-26T00:00:00</Date>
<DeliveryDate>2018-10-28T00:00:00</DeliveryDate>
<DeliveryAddress>Address</DeliveryAddress>
<AttentionTo>XXX</AttentionTo>
<SentToContact>true</SentToContact>
<Reference>000000078</Reference>
<CurrencyRate>1.000000</CurrencyRate>
<CurrencyCode>GBP</CurrencyCode>
<Contact>
<ContactID>f203ed00-8cd1-4e4d-9b76-f5e7d90a3c19</ContactID>
<ContactStatus>ACTIVE</ContactStatus>
<Name>XXX</Name>
<FirstName>XXXy</FirstName>
<LastName>XXX</LastName>
<Addresses>
<Address>
<AddressType>XXX</AddressType>
<AddressLine1>XXX</AddressLine1>
<AddressLine2>XXX</AddressLine2>
<City>XXX</City>
<Region>XXX</Region>
<PostalCode>XXX</PostalCode>
<Country>GBR</Country>
</Address>
<Address>
<AddressType>XXX</AddressType>
<AddressLine1>Unit 1-3</AddressLine1>
<AddressLine2>XXX</AddressLine2>
<City>XXX</City>
<Region>West Yorkshire</Region>
<PostalCode>POSTCODE</PostalCode>
<Country>GBR</Country>
</Address>
</Addresses>
<UpdatedDateUTC>2018-10-08T17:19:55.083</UpdatedDateUTC>
<DefaultCurrency>GBP</DefaultCurrency>
</Contact>
<BrandingThemeID>2ffe566f-7a88-486a-938c-639d27966197</BrandingThemeID>
<Status>AUTHORISED</Status>
<LineAmountTypes>Exclusive</LineAmountTypes>
<LineItems>
<LineItem>
<ItemCode>xxx</ItemCode>
<Description>des</Description>
<UnitAmount>57.6100</UnitAmount>
<TaxType>INPUT2</TaxType>
<TaxAmount>11.52</TaxAmount>
<LineAmount>57.61</LineAmount>
<AccountCode>310</AccountCode>
<Quantity>1.0000</Quantity>
<LineItemID>e295d55d-68bd-41b0-a0b1-cf1f2d5b7a4f</LineItemID>
</LineItem>
</LineItems>
<SubTotal>57.61</SubTotal>
<TotalTax>11.52</TotalTax>
<Total>69.13</Total>
<UpdatedDateUTC>2018-10-26T14:19:19.053</UpdatedDateUTC>
<HasAttachments>false</HasAttachments>
</PurchaseOrder>
You can avoid issues with ForceArray and confusing data structures by using an XML parser that returns an object that understands the XML tree. Mojo::DOM is a nice one if you know CSS.
use Mojo::DOM;
my $dom = Mojo::DOM->new->xml(1)->parse($res->decoded_content);
for my $purchase ($dom->find('PurchaseOrders > PurchaseOrder')->each) {
# $purchase is a Mojo::DOM object representing a PurchaseOrder element
for my $item ($purchase->find('LineItems > LineItem')->each) {
# It's unclear if ItemCode is an attribute or a sub-element; assuming sub-element
my $itemCode = $item->at('ItemCode')->text;
...
}
}
XML::LibXML is another option that can be used similarly but using XPath or DOM instead of CSS to locate elements.
use XML::LibXML qw( );
my $doc = XML::LibXML->load_xml(string => $res->decoded_content);
for my $purchase ($doc->findnodes('/PurchaseOrders/PurchaseOrder')) {
# $purchase is a XML::LibXML::Element object representing a PurchaseOrder element
for my $item ($purchase->findnodes('LineItems/LineItem')) {
# It's unclear if ItemCode is an attribute or a sub-element; assuming sub-element
my $itemCode = $item->findvalue('ItemCode');
...
}
}
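If you would rather stay with XML::Simple, ForceArray also accepts a list of element names, so only those elements are always folded into arrays. Below is a minimal sketch, assuming the response wraps PurchaseOrders in a single root element. The Response root name and the trimmed XML are illustrative assumptions, not the real API payload.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Trimmed, hypothetical response; <Response> is an assumed root element
my $xml = <<'END';
<Response>
  <PurchaseOrders>
    <PurchaseOrder>
      <PurchaseOrderNumber>PO-0029</PurchaseOrderNumber>
      <LineItems>
        <LineItem>
          <ItemCode>xxx</ItemCode>
        </LineItem>
      </LineItems>
    </PurchaseOrder>
  </PurchaseOrders>
</Response>
END

# Force only the repeatable elements into array references;
# KeyAttr => [] disables folding on id/name/key attributes
my $data = XMLin(
    $xml,
    ForceArray => [ 'PurchaseOrder', 'LineItem' ],
    KeyAttr    => [],
);

my @seen;    # "order number: item code" pairs, collected for illustration
for my $purchase ( @{ $data->{PurchaseOrders}{PurchaseOrder} } ) {
    for my $item ( @{ $purchase->{LineItems}{LineItem} } ) {
        push @seen, "$purchase->{PurchaseOrderNumber}: $item->{ItemCode}";
    }
}
print "$_\n" for @seen;
```

With the name list, a single LineItem is still returned as a one-element array, while scalar fields such as ItemCode stay plain strings.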

access sub child value by libxml::xpathcontext

I want to access the value of sub child and modify it. This is my xml
<config xmlns:xc="urn:ietf:params:xml:ns:netconf:base:1.0">
<outer1 xmlns="http://blablabla" >
<inner>
<name>
<prenom>Hello</prenom>
</name>
<profession>warrior</profession>
</inner>
<inner>
<name>
<prenom>Hello</prenom>
</name>
<org>wwf</org>
<profession>warrior</profession>
</inner>
</outer1>
</config>
and this is my code
my $dom = XML::LibXML->load_xml( location => $xml);
my $context = XML::LibXML::XPathContext->new( $dom->documentElement() );
$context->registerNs( 'u' => '"urn:ietf:params:xml:ns:netconf:base:1.0' );
$context->registerNs( 'u' => 'http://blablabla');
for my $node ($context->findnodes('//u:inner') ) {
for my $node2 ($node->findnodes('//u:name') ) {
#if (($node->findnodes('u:name', $node2) ->size) != 1) {next;}
my ($mh) = $node->findnodes('u:prenom', $node2);
my $size = $node->findnodes('u:prenom', $node2) ->size;
print "size $size";
if ($size != 1) {next;}
$mh ->removeChildNodes();
$mh->appendText('World12456');
print "mh = $mh";
}
}
I want to access prenom and change its content to 'World12456'. With the current code, I get this error:
XPath error : Undefined namespace prefix
error : xmlXPathCompiledEval: evaluation failed
Then I tried a different way:
for my $node ($context->findnodes('//u:inner') ) {
my ($mh) = $context->findnodes('u:name/prenom', $node);
my $size = $context->findnodes('u:name/prenom', $node) ->size;
print "size $size";
if ($size != 1) {next;}
$mh ->removeChildNodes();
$mh->appendText('World12456');
print "mh = $mh";
}
Then the size is 0 for both; it does not find the prenom tag. With
for my $node ($context->findnodes('//u:inner/name'))
it displays nothing.
I am sorry if this is a duplicate, but I have not found anything on accessing a sub-child with XPathContext.
I got it. I just needed to put the u: prefix on each element in the path:
for my $node ($context->findnodes('//u:inner/u:name'))
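Putting that fix together, here is a minimal sketch of the whole update. It assumes the default namespace http://blablabla from the sample, uses a shortened copy of the XML, and runs every query through the XPathContext so the registered u prefix is known. Calling findnodes on a plain node does not see prefixes registered on the context, which would explain the "Undefined namespace prefix" error.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened copy of the sample document, with the root element closed
my $dom = XML::LibXML->load_xml( string => <<'END' );
<config xmlns:xc="urn:ietf:params:xml:ns:netconf:base:1.0">
  <outer1 xmlns="http://blablabla">
    <inner>
      <name>
        <prenom>Hello</prenom>
      </name>
      <profession>warrior</profession>
    </inner>
  </outer1>
</config>
END

my $xpc = XML::LibXML::XPathContext->new($dom);
$xpc->registerNs( u => 'http://blablabla' );    # prefix for the default namespace

for my $inner ( $xpc->findnodes('//u:inner') ) {
    # Every step needs the prefix; the second argument is the context node
    my @prenom = $xpc->findnodes( 'u:name/u:prenom', $inner );
    next unless @prenom == 1;
    $prenom[0]->removeChildNodes;
    $prenom[0]->appendText('World12456');
}
print $dom->toString;
```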

Perl LibXML raw data from textContent?

Given the following XML:
<?xml version="1.0" encoding="utf-8" ?>
<Request>
<form_submit>
<form_submit id = 1424>
<form_id>1424</form_id>
<field1 id=’5’> <![CDATA[ test ]]> </field1>
<field2 id=’6’> <![CDATA[ test2 ]]> </field2>
</form_submit>
</form_submit>
</Request>
I'm trying to get the raw values for the field1 and field2 elements. I'm using the following code:
foreach my $node ( $xml_request->findnodes('Request/*/*/*[@id]') )
{
my $form_field_value = $node->textContent;
print "Value:\"$form_field_value\"\n";
}
But the output is:
Value:" test "
Value:" test2 "
How do I retrieve the exact data, raw and as is, with all the special characters? So that the output is:
Value:" <![CDATA[ test ]]> "
Value:" <![CDATA[ test2 ]]> "
Thank you.
I am not a libxml expert.
However, this is what I could figure out after playing with your XML and libxml a bit.
A CDATA section is a node of its own and is not part of the text.
The code below goes one level deep and calls toString() for CDATA child nodes
and textContent for other nodes.
foreach my $node ( $xml_request->findnodes('Request/*/*/*[@id]') )
{
my $text;
if($node->childNodes) {
foreach my $child ($node->childNodes()) {
if ($child->nodeType == XML::LibXML::XML_CDATA_SECTION_NODE) {
$text .= $child->toString;
} else {
$text .= $child->textContent;
}
}
} else {
$text = $node->textContent;
}
print qq{"$text"\n};
}
will print
" <![CDATA[ test ]]> "
" <![CDATA[ test2 ]]> "
Your sample data is invalid XML, and won't parse unless you replace 1424, ’5’ and ’6’ with "1424", "5" and "6".
You have asked for the text content and have got exactly that. To get what you need you must search for the children of the <fieldN> elements and use the toString method on them.
This code shows the idea. Note that the spaces before and after the CDATA, which would otherwise appear as separate text nodes, have been eliminated using a keep_blanks => 0 option on the object constructor.
use strict;
use warnings;
use XML::LibXML;
my $xml_request = XML::LibXML->load_xml(string => <<'END', keep_blanks => 0);
<?xml version="1.0" encoding="utf-8" ?>
<Request>
<form_submit>
<form_submit id = "1424">
<form_id>1424</form_id>
<field1 id="5"> <![CDATA[ test ]]> </field1>
<field2 id="6"> <![CDATA[ test2 ]]> </field2>
</form_submit>
</form_submit>
</Request>
END
foreach my $node ( $xml_request->findnodes('//form_submit/*[@id]/text()') ) {
my $form_field_value = $node->toString;
print qq(Value: "$form_field_value"\n);
}
output
Value: "<![CDATA[ test ]]>"
Value: "<![CDATA[ test2 ]]>"
Edit
ikegami has commented that the output requested in the question includes the whitespace surrounding the CDATA section. I don't know whether that is truly part of the requirement, but this edit provides a way to do that.
This would be clearer using XML::LibXML::Reader as it has a readInnerXml method (comparable to JavaScript's innerHTML ) that does exactly what is necessary. Instead, this program has to serialize all the children of the <fieldN> nodes and concatenate them with join.
This is a new foreach loop. The rest of the program remains unchanged except for the construction of $xml_request, which must have the keep_blanks option set to 1 or removed altogether.
foreach my $node ( $xml_request->findnodes('//*[starts-with(name(),"field")][@id]') ) {
my $form_field_value = join '', map $_->toString, $node->childNodes;
print qq(Value: "$form_field_value"\n);
}
output
Value: " <![CDATA[ test ]]> "
Value: " <![CDATA[ test2 ]]> "

Perl OpenOffice::OODoc Modifying header/footer style text

I am trying to figure out how to change text in a footer of an ODT file. The footer is kept in the styles.xml, however I can't seem to access it using selectElementsByContent or any other method:
my $a = odfContainer('test.odt');
my $styles = odfDocument(container => $a, part => 'styles');
foreach my $element ($styles->selectElementsByContent('mytest'))
{
#never runs...
}
The styles.xml in the odt is like:
<office:document-styles>
<office:master-styles>
<style:master-page>
<style:footer>
<text:p test:style-name="P49">
mytest
</text:p>
</style:footer>
</style:master-page>
</office:master-styles>
</office:document-styles>
What is the right way to change the text:p contents?
I ended up having to use odfXPath to loop through:
my $ss = odfXPath(file => 'myfile.odt' , part => 'styles');
my $i = 0; # position of the current text:p element
while (my $p = $ss->getElement('//text:p', $i))
{
if ($ss->getText($p) eq 'mytest') { $ss->setText($p,'foobar');}
$i++;
}
$ss->save('mynewfile.odt');

Example Perl code for generating XML from XSD using XML::Compile

Can anybody please show me an example of generating XML from an XSD using XML::Compile::Schema?
I tried to post my script along with the XSD, but I was not able to, so I am looking for any sample.
I wrote a tutorial on this a while ago: http://blogs.perl.org/users/brian_e_lozier/2011/10/using-xmlcompile-to-output-xsd-compliant-xml.html
In short, you need to:
Convert the XSD format to a Perl hash structure
Construct this hash and fill in the data
Convert the hash to XML
Packages required:
XML::Compile::Schema
XML::LibXML::Document
The following code creates a Perl structure from the XSD definition.
use XML::Compile::Schema;
use Data::Dumper;
my $filename = $ARGV[0] || "";
if(!$filename){
warn "Please provide the XSD schema file.";
exit 10;
}
my $schema = XML::Compile::Schema->new($filename);
my $hash;
print Dumper $schema->template('PERL' => 'Application');
Then the Perl data structure created by this program looks like:
{
MakeName =>
{
UniqueID => "anything",
_ => "example", },
MakeDetails =>
{
Name =>
{
UniqueID => "anything",
_ => "example", },
},
};
The rest of your job is to create the same structure in your program and fill in the content, like this:
my $hash = {
MakeName => {
UniqueID => 'xxxx',
_ => 'Name of the Make',
},
OtherFields => foo_bar_get_other_hash(),
};
....
## breathtaking moment, create the XML from this $hash
my $schema = XML::Compile::Schema->new("/opt/data/your.xsd");
my $doc = XML::LibXML::Document->new();
my $writer = $schema->compile(WRITER => 'Application');
my $xml;
## Create $xml in memory based on the schema and your $hash
eval{ $xml = $writer->($doc, $hash);};
if($@){
# Useful if the data is invalid against the schema definition
# or if any other error occurs
my $err_msg = $@->{message}->toString();
return ("", $err_msg);
}
## If you want to save this $xml to a file, convert it to string format first
$doc->setDocumentElement($xml);
my $ori_content = $doc->toString(1);
## Now $ori_content holds the full XML content.