Complex XML parsing with Perl and LibXML

I have XML:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>
<measCollecFile xmlns="">
<fileHeader fileFormatVersion="32.435 V7.2.0">
</fileHeader>
<measData>
<managedElement localDn="bs=8" swVersion="R21A"/>
<measInfo measInfoId="CORE,SIP_session_statistics">
<measType p="1">CPUUSAGE</measType>
<measType p="2">CPUMEM</measType>
<measType p="3">SYSMEM</measType>
<measValue measObjLdn="SGC.bsNo=17,networkRole=2">
<r p="1">10</r>
<r p="2">20</r>
<r p="3">30</r>
</measValue>
<measValue measObjLdn="SGC.bsNo=18,networkRole=2">
<r p="1">40</r>
<r p="2">50</r>
<r p="3">60</r>
</measValue>
</measInfo>
</measData>
</measCollecFile>
QUESTION:
I want to extract the 40 from the <r p="1">40</r> element. The only things given are <measType p="1">CPUUSAGE</measType> and <measValue measObjLdn="SGC.bsNo=18,networkRole=2">,
i.e. I only know that I need to find the CPUUSAGE of bsNo=18. The order of the data is always maintained.
Here is what I have tried so far:
my $qry="//measInfo[measType/text() = 'CPUUSAGE']/measValue";
my @nodes= $conn->findnodes($qry);
foreach my $vnode (@nodes) {
if ($vnode->getAttribute('measObjLdn') =~ /'bsNo=18'/) {
foreach my $node ($vnode) {
foreach my $p ($node->getChildnodes) {
if (ref($p)=~'Element'){
$no=$p->textContent;
print $no; # this prints the value of all the <r> elements
}
}
}
}
}
My challenge is that there can be many elements like CPUUSAGE, CPUMEM, and so on. How can I reach the <r> element at the matching position for a given measValue attribute (/'bsNo=18'/)?
And subsequently, how can I modify that 40 to some other desired value?

Your Perl code can't work because you match the attribute value against 'bsNo=18' including single quotes.
If you want to find the r element with the same p attribute as the CPUUSAGE node, you could either try the single XPath expression by ikegami or something like the following:
for my $type_node ($conn->findnodes('//measInfo/measType[.="CPUUSAGE"]')) {
my $p = $type_node->getAttribute('p');
my $qry = <<"EOF";
..
/measValue[contains(concat(\@measObjLdn, ','), 'bsNo=18,')]
/r[\@p='$p']
EOF
for my $r_node ($type_node->findnodes($qry)) {
print $r_node->textContent, "\n";
}
}
This first loops over all measType nodes whose content is CPUUSAGE, gets the p attribute then finds all the corresponding r nodes. This approach should be more efficient than a single XPath query.
To find the r node by position and modify its contents, try:
for my $type_node ($conn->findnodes('//measInfo/measType[.="CPUUSAGE"]')) {
my $pos = $type_node->findvalue('count(preceding-sibling::measType) + 1');
my $qry = <<"EOF";
..
/measValue[contains(concat(\@measObjLdn, ','), 'bsNo=18,')]
/r[$pos]
EOF
for my $r_node ($type_node->findnodes($qry)) {
$r_node->removeChildNodes;
$r_node->appendText('50');
}
}
print $conn->toString;
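For reference, here is a minimal self-contained sketch of the first approach wrapped in a helper. The helper name find_r_nodes and the trimmed-down sample document are my own assumptions for illustration, not part of the question, and it assumes XML::LibXML is installed. The type name and bsNo are interpolated straight into the XPath, so only pass trusted values.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed-down copy of the sample document (hypothetical stand-in for the real file).
my $xml = <<'XML';
<measCollecFile>
  <measData>
    <measInfo measInfoId="CORE,SIP_session_statistics">
      <measType p="1">CPUUSAGE</measType>
      <measType p="2">CPUMEM</measType>
      <measValue measObjLdn="SGC.bsNo=17,networkRole=2">
        <r p="1">10</r>
        <r p="2">20</r>
      </measValue>
      <measValue measObjLdn="SGC.bsNo=18,networkRole=2">
        <r p="1">40</r>
        <r p="2">50</r>
      </measValue>
    </measInfo>
  </measData>
</measCollecFile>
XML

my $conn = XML::LibXML->load_xml(string => $xml);

# Hypothetical helper: return the <r> nodes whose p attribute matches the
# given measType, inside the measValue for the given bsNo.
sub find_r_nodes {
    my ($doc, $type, $bs_no) = @_;
    my @found;
    for my $type_node ($doc->findnodes(qq{//measInfo/measType[.="$type"]})) {
        my $p = $type_node->getAttribute('p');
        push @found, $type_node->findnodes(
            qq{../measValue[contains(concat(\@measObjLdn, ','), 'bsNo=$bs_no,')]/r[\@p='$p']}
        );
    }
    return @found;
}

my ($r) = find_r_nodes($conn, 'CPUUSAGE', 18);
my $before = $r->textContent;

# Replace the text content of the matched node.
$r->removeChildNodes;
$r->appendText('99');
my $after = $r->textContent;

print "$before -> $after\n";
```

The trailing comma in 'bsNo=18,' is what keeps bsNo=18 from also matching something like bsNo=180.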

Related

how to check xml file line by line with perl script

I would like to compare two files: one is the user's input file (a text file) and the other is a config file (an XML file). If a value in the user's input file matches the config file, the match should be reported.
This is the user's input file:
L84A:FIP:70:155:15:18:
L83A:55FIP:70:155:15:
In the above file, L84A is the Design_ID, FIP is the Process_ID, and 70 through 18 are register_IDs.
This is the config file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Sigma>
<Run>
<DESIGN_ID>L83A</DESIGN_ID>
<PROCESS_ID>55FIP</PROCESS_ID>
<RegisterList>
<Register>70</Register>
<Register>155</Register>
</RegisterList>
</Run>
<Run>
<DESIGN_ID>L83A</DESIGN_ID>
<PROCESS_ID>FRP</PROCESS_ID>
<RegisterList>
<Register>141</Register>
<Register>149</Register>
<Register>151</Register>
</RegisterList>
</Run>
<Run>
<DESIGN_ID>L84A</DESIGN_ID>
<PROCESS_ID>55FIP</PROCESS_ID>
<RegisterList>
<Register>70</Register>
<Register>155</Register>
</RegisterList>
</Run>
</Sigma>
So in this case the output should show:
L84A: doesn't has FIP process ID in config file.
L83A:
55FIP
70 - existing register ID
155 - existing register ID
15 - no existing register ID.
My code doesn't check the respective process ID and register ID. It shows the output below:
L84A
FIP
70 - existing register ID
155 - existing register ID
15 - existing register ID
18 - no existing register ID
L83A
55FIP
70 - existing register ID
155 - existing register ID
15 - existing register ID
Below is my code:
use strict;
use warnings;
use vars qw($file1 $file1cnt @output);
use XML::Simple;
use Data::Dumper;
# create object
my $xml = new XML::Simple;
# read XML file
my $data = $xml->XMLin("sigma_loader.xml");
my $file1 = "userinput.txt";
readFileinString($file1, \$file1cnt);
while($file1cnt=~m/^((\w){4})\:([^\n]+)$/mig)
{
my $DID = $1;
my $reqconfig = $3;
while($reqconfig=~m/^((\w){5})\:([^\n]+)$/mig) #Each line from user request
{
my $example1 = $1; #check for FPP/QBPP process
my $example2 = $3; #display bin full lists.
if(Dumper($data) =~ $DID)
{
print"$DID\n";
if(Dumper($data) =~ $example1)
{
print"$example1\n";
my @second_values = split /\:/, $example2;
foreach my $sngletter(@second_values)
{
if( Dumper($data) =~ $sngletter)
{
print"$sngletter - existing register ID\n";
}
else
{
print"$sngletter - no existing register ID\n";
}
}
}
else
{
print"$DID doesn't has $example1 process ID in config file\n";
}
}
else
{
print"new Design ID deteced\n";
}
}
while($reqconfig=~m/^((\w){3})\:([^\n]+)$/mig) #Each line from user request
{
my $example1 = $1; #check for FPP/QBPP process
my $example2 = $3; #display bin full lists.
if(Dumper($data) =~ $DID)
{
print"$DID\n";
if(Dumper($data) =~ $example1)
{
print"$example1\n";
my @second_values = split /\:/, $example2;
foreach my $sngletter(@second_values)
{
if( Dumper($data) =~ $sngletter)
{
print"$sngletter - existing register ID\n";
}
else
{
print"$sngletter - no existing register ID\n";
}
}
}
else
{
print"$DID doesn't has $example1 process ID in config file\n";
}
}
else
{
print"new Design ID deteced\n";
}
}
}
sub readFileinString
#------------------>
{
my $File = shift;
my $string = shift;
use File::Basename;
my $filenames = basename($File);
open(FILE1, "<$File") or die "\nFailed Reading File: [$File]\n\tReason: $!";
read(FILE1, $$string, -s $File, 0);
close(FILE1);
}
There are a couple of things in your code that do not really make sense, like using Data::Dumper and parsing the output with a regular expression. I'm not going to review your code as that is off-topic on Stack Overflow, but instead going to give you an alternate solution and walk you through it.
Please note that XML::Simple is not a great tool. Its use is discouraged because it is very bad at handling certain cases. But for your very simple XML structure it will work, so I have kept it.
use strict;
use warnings;
use XML::Simple;
use feature 'say';
# read XML file and reorganise it for easier use
my $data;
foreach my $run (@{XMLin(\*DATA)->{Run}}) {
$data->{$run->{DESIGN_ID}}->{$run->{PROCESS_ID}} =
{map { $_ => 1 } @{$run->{RegisterList}->{Register}}};
}
# read the text file - I've skipped the read
my @user_input = qw(
L84A:FIP:70:155:15:18:
L83A:55FIP:70:155:15:
);
foreach my $line (@user_input) {
chomp $line;    # we don't need this in my example, but you do when you read from a file
my ($design_id, $process_id, @register_ids) = split /:/, $line;
# extra error checking just in case
if (not exists $data->{$design_id}) {
say "$design_id doesn't exist in data";
next;
}
if (not exists $data->{$design_id}->{$process_id}) {
say "$design_id: doesn't have $process_id";
next;
}
say "$design_id:";
say " $process_id";
foreach my $register_id (@register_ids) {
if (exists $data->{$design_id}->{$process_id}->{$register_id}) {
say " $register_id - existing register ID";
}
else {
say " $register_id - no existing register ID";
}
}
}
__DATA__
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Sigma>
<Run>
<DESIGN_ID>L83A</DESIGN_ID>
<PROCESS_ID>55FIP</PROCESS_ID>
<RegisterList>
<Register>70</Register>
<Register>155</Register>
</RegisterList>
</Run>
<Run>
<DESIGN_ID>L83A</DESIGN_ID>
<PROCESS_ID>FRP</PROCESS_ID>
<RegisterList>
<Register>141</Register>
<Register>149</Register>
<Register>151</Register>
</RegisterList>
</Run>
<Run>
<DESIGN_ID>L84A</DESIGN_ID>
<PROCESS_ID>55FIP</PROCESS_ID>
<RegisterList>
<Register>70</Register>
<Register>155</Register>
</RegisterList>
</Run>
</Sigma>
I've made a few assumptions.
You already know how to read the text file, so I've stuck that into an array line by line. Your file reading code has some issues though: you should be using three-arg open and lexical filehandles. Your call to open should look like this:
open my $fh, '<', $filename or die "$!: error...";
Alternatively, consider using Path::Tiny.
I'm taking the XML file from the __DATA__ section. This is like a filehandle.
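To make the first assumption concrete, here is a minimal sketch of reading the user input with three-arg open and a lexical filehandle. To keep it self-contained it opens an in-memory string instead of userinput.txt; the variable names and the @requests structure are my own assumptions for illustration.

```perl
use strict;
use warnings;

# Stand-in for the contents of userinput.txt so the sketch runs on its own.
# With a real file, pass the file name instead of the scalar reference.
my $content = "L84A:FIP:70:155:15:18:\nL83A:55FIP:70:155:15:\n";
open my $fh, '<', \$content or die "Cannot open input: $!";

my @requests;
while (my $line = <$fh>) {
    chomp $line;
    next unless length $line;    # skip empty lines

    # split drops the trailing empty field left by the final colon
    my ($design_id, $process_id, @register_ids) = split /:/, $line;
    push @requests, {
        design    => $design_id,
        process   => $process_id,
        registers => \@register_ids,
    };
}
close $fh;

printf "%s/%s: %s\n", $_->{design}, $_->{process}, join(',', @{ $_->{registers} })
    for @requests;
```

Each entry in @requests can then be checked against the lookup hash in the same way as the lines in @user_input.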
So let's look at my code.
When we read the XML structure, it looks like this straight out of XMLin.
\ {
Run [
[0] {
DESIGN_ID "L83A",
PROCESS_ID "55FIP",
RegisterList {
Register [
[0] 70,
[1] 155
]
}
},
[1] {
DESIGN_ID "L83A",
PROCESS_ID "FRP",
RegisterList {
Register [
[0] 141,
[1] 149,
[2] 151
]
}
},
[2] {
DESIGN_ID "L84A",
PROCESS_ID "55FIP",
RegisterList {
Register [
[0] 70,
[1] 155
]
}
}
]
}
This is not very useful for what we plan to do, so we have to rearrange it. I want to use exists on hash references later, to make it easier to look up if there are matches for the IDs we are looking at. This is called a lookup hash. We can throw away the ->{Run} key as XML::Simple combines all <Run> elements into an array reference, and the <Sigma> tag is just skipped because it's the root element.
Every Design ID can have multiple Processes, so we organise these two hierarchically, and we put in another lookup hash, where every register is a key, and we just use 1 as the value. The value does not matter.
This gives us a different data structure:
\ {
L83A {
55FIP {
70 1,
155 1
},
FRP {
141 1,
149 1,
151 1
}
},
L84A {
55FIP {
70 1,
155 1
}
}
}
That's much easier to understand and use later on.
Now we parse the user input, and iterate over each line. The format seems clear. It's a bit like a CSV file, but using colons :, so we can split. This gives us the two IDs, and all following values are registers, so we stick them in an array.
Your example doesn't have a case where the Design ID does not exist in the XML file, but given this is based on user input, we should check anyway. In the real world data is always dirty.
We can then check if the $process_id exists inside the $design_id in our data. If it does not, we tell the user and skip to the next line.
Then we have to iterate all the Register IDs. Either the $register_id exists in our second lookup hash, or it doesn't.
This gives us the exact output you're expecting.
L84A: doesn't have FIP
L83A:
55FIP
70 - existing register ID
155 - existing register ID
15 - no existing register ID
This code is much shorter, easier to read and runs faster. I've used Data::Printer to show the data structures.
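If you would rather avoid XML::Simple, the same lookup hash can be built with XML::LibXML. This is only a sketch under my own assumptions: the XML is a shortened copy of the config file, and %lookup is a name I made up. One thing it sidesteps is that XML::Simple, with default options, does not return an array reference for a <RegisterList> that contains only a single <Register>, which would break the @{...} dereference above.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened copy of the config file, inlined so the sketch runs on its own.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<Sigma>
  <Run>
    <DESIGN_ID>L83A</DESIGN_ID>
    <PROCESS_ID>55FIP</PROCESS_ID>
    <RegisterList>
      <Register>70</Register>
      <Register>155</Register>
    </RegisterList>
  </Run>
  <Run>
    <DESIGN_ID>L84A</DESIGN_ID>
    <PROCESS_ID>55FIP</PROCESS_ID>
    <RegisterList>
      <Register>70</Register>
    </RegisterList>
  </Run>
</Sigma>
XML

# design ID -> process ID -> register ID -> 1
my %lookup;
for my $run ($doc->findnodes('/Sigma/Run')) {
    my $design  = $run->findvalue('DESIGN_ID');
    my $process = $run->findvalue('PROCESS_ID');
    $lookup{$design}{$process}{ $_->textContent } = 1
        for $run->findnodes('RegisterList/Register');
}

print "L83A/55FIP has register 155\n" if exists $lookup{L83A}{'55FIP'}{155};
```

The rest of the program stays the same, apart from using %lookup instead of the $data reference.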

Perl: unable to extract sibling value using Twig::XPath syntax

I recently started using XML::Twig::XPath, but the module does not seem to recognize an XPath syntax.
In the following XML, I want the value of the "Txt" node if the value of the PlcAndNm node is "ext_1":
<?xml version="1.0" encoding="UTF-8"?>
<root>
<Document>
<RedOrdrV03>
<MsgId>
<Id>1</Id>
</MsgId>
<Xtnsn>
<PlcAndNm>ext_1</PlcAndNm>
<Txt>1234</Txt>
</Xtnsn>
<Xtnsn>
<PlcAndNm>ext_2</PlcAndNm>
<Txt>ABC</Txt>
</Xtnsn>
</RedOrdrV03>
</Document>
<Document>
<RedOrdrV03>
<MsgId>
<Id>2</Id>
</MsgId>
<Xtnsn>
<PlcAndNm>ext_1</PlcAndNm>
<Txt>9876</Txt>
</Xtnsn>
<Xtnsn>
<PlcAndNm>ext_2</PlcAndNm>
<Txt>DEF</Txt>
</Xtnsn>
</RedOrdrV03>
</Document>
</root>
I have tried the expression //Xtnsn[PlcAndNm="ext_1"]/Txt but I received an error.
This is the code:
use XML::Twig::XPath;
my $subelt_count = 1;
my @processed_elements;
my $xmlfile = 'c:/test_file.xml';
my $parser = XML::Twig->new(
twig_roots => { 'RedOrdrV03' => \&process_xml } ,
end_tag_handlers => { 'Document' },
);
$parser->parsefile($xmlfile);
sub process_xml {
my ( $twig, $elt ) = @_;
push( @processed_elements, $elt );
if ( @processed_elements >= $subelt_count ) {
my $MsgId = $twig->findvalue('RedOrdrV03/MsgId/Id');
my $Xtnsn_Txt1 = $twig->findvalue('//Xtnsn[PlcAndNm="ext_1"]/Txt');
print "MsgId: $MsgId - Xtnsn_Txt1: $Xtnsn_Txt1\n";
}
$_->delete for @processed_elements;
@processed_elements = ();
$twig->purge;
}
Is there a simple way of using xpath to obtain the value?
I know that one possibility is something like:
my $Xtnsn_Txt1 = $twig->first_elt( sub { $_[0]->tag eq 'PlcAndNm' && $_[0]->text eq 'ext_1' })->next_sibling()->text();
but I prefer using the simplest XPath syntax,
Thanks in advance for your help!
You can use this:
my $Xtnsn_Txt1 = $twig->findvalue('//Xtnsn/PlcAndNm[string()="ext_1"]/../Txt');
Another approach could be :
//Txt[preceding-sibling::PlcAndNm[.="ext_1"]]
You can also modify your XPath expression slightly to see if this works:
//Xtnsn[./PlcAndNm[contains(.,"ext_1")]]/Txt
EDIT : This works fine with the original XML::XPath module :
use XML::XPath;
use XML::XPath::Node::Element;
my $xp = XML::XPath->new(filename => 'pathtoyour.xml');
my $nodeset = $xp->find('//Xtnsn[PlcAndNm="ext_1"]/Txt');
foreach my $node ($nodeset->get_nodelist) {
print XML::XPath::Node::Element::string_value($node),"\n\n";
}
Output : 1234 9876
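If you need the MsgId paired with the matching Txt for each record, as in the question's print statement, here is a sketch that does it with XML::LibXML. Note that this swaps in a different module from the XML::Twig and XML::XPath code above, and it loads the whole document instead of processing it in chunks; the inlined XML is a condensed copy of the sample.

```perl
use strict;
use warnings;
use XML::LibXML;

# Condensed copy of the sample document, inlined so the sketch runs on its own.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<root>
  <Document><RedOrdrV03>
    <MsgId><Id>1</Id></MsgId>
    <Xtnsn><PlcAndNm>ext_1</PlcAndNm><Txt>1234</Txt></Xtnsn>
    <Xtnsn><PlcAndNm>ext_2</PlcAndNm><Txt>ABC</Txt></Xtnsn>
  </RedOrdrV03></Document>
  <Document><RedOrdrV03>
    <MsgId><Id>2</Id></MsgId>
    <Xtnsn><PlcAndNm>ext_1</PlcAndNm><Txt>9876</Txt></Xtnsn>
    <Xtnsn><PlcAndNm>ext_2</PlcAndNm><Txt>DEF</Txt></Xtnsn>
  </RedOrdrV03></Document>
</root>
XML

my @rows;
for my $order ($doc->findnodes('//RedOrdrV03')) {
    # Both paths are relative to the current <RedOrdrV03> element.
    my $msg_id = $order->findvalue('MsgId/Id');
    my $txt    = $order->findvalue('Xtnsn[PlcAndNm="ext_1"]/Txt');
    push @rows, "MsgId: $msg_id - Xtnsn_Txt1: $txt";
}

print "$_\n" for @rows;
```

Using relative paths inside the loop is what keeps each Txt tied to its own record; a leading // would search the whole document again.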

unable to parse xml file using registered namespace

I am using XML::LibXML to parse an XML file. There seems to be some problem with using a registered namespace while accessing the node elements. I am planning to convert this XML data into a CSV file, and I am trying to access each and every element. To start with, I tried extracting the attribute values of the <country> and <state> tags. Below is the code I have come up with, but I am getting an error saying XPath error : Undefined namespace prefix.
use strict;
use warnings;
use Data::Dumper;
use XML::LibXML;
my $XML=<<EOF;
<DataSet xmlns="http://www.w3schools.com" xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd">
<exec>
<survey_region ver="1.1" type="x789" date="20160312"/>
<survey_loc ver="1.1" type="x789" date="20160312"/>
<note>Population survey</note>
</exec>
<country name="ABC" type="MALE">
<state name="ABC_state1" result="PASS">
<info>
<type>literacy rate comparison</type>
</info>
<comment><![CDATA[
Some random text
contained here
]]></comment>
</state>
</country>
<country name="XYZ" type="MALE">
<state name="XYZ_state2" result="FAIL">
<info>
<type>literacy rate comparison</type>
</info>
<comment><![CDATA[
any random text data
]]></comment>
</state>
</country>
</DataSet>
EOF
my $parser = XML::LibXML->new();
my $doc = $parser->parse_string($XML);
my $xc = XML::LibXML::XPathContext->new($doc);
$xc->registerNs('x','http://www.w3schools.com');
foreach my $camelid ($xc->findnodes('//x:DataSet')) {
my $country_name = $camelid->findvalue('./x:country/@name');
my $country_type = $camelid->findvalue('./x:country/@type');
my $state_name = $camelid->findvalue('./x:state/@name');
my $state_result = $camelid->findvalue('./x:state/@result');
print "state_name ($state_name)\n";
print "state_result ($state_result)\n";
print "country_name ($country_name)\n";
print "country_type ($country_type)\n";
}
Update
If I remove the namespace from the XML and change my XPath slightly, it seems to work. Can someone help me understand the difference?
foreach my $camelid ($xc->findnodes('//DataSet')) {
my $country_name = $camelid->findvalue('./country/@name');
my $country_type = $camelid->findvalue('./country/@type');
my $state_name = $camelid->findvalue('./country/state/@name');
my $state_result = $camelid->findvalue('./country/state/@result');
print "state_name ($state_name)\n";
print "state_result ($state_result)\n";
print "country_name ($country_name)\n";
print "country_type ($country_type)\n";
}
This would be my approach
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
my $XML=<<EOF;
<DataSet xmlns="http://www.w3schools.com" xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd">
<exec>
<survey_region ver="1.1" type="x789" date="20160312"/>
<survey_loc ver="1.1" type="x789" date="20160312"/>
<note>Population survey</note>
</exec>
<country name="ABC" type="MALE">
<state name="ABC_state1" result="PASS">
<info>
<type>literacy rate comparison</type>
</info>
<comment><![CDATA[
Some random text
contained here
]]></comment>
</state>
</country>
<country name="XYZ" type="MALE">
<state name="XYZ_state2" result="FAIL">
<info>
<type>literacy rate comparison</type>
</info>
<comment><![CDATA[
any random text data
]]></comment>
</state>
</country>
</DataSet>
EOF
my $parser = XML::LibXML->new();
my $tree = $parser->parse_string($XML);
my $root = $tree->getDocumentElement;
my @country = $root->getElementsByTagName('country');
foreach my $citem(@country){
my $country_name = $citem->getAttribute('name');
my $country_type = $citem->getAttribute('type');
print "Country Name -- $country_name\nCountry Type -- $country_type\n";
my @state = $citem->getElementsByTagName('state');
foreach my $sitem(@state){
my @info = $sitem->getElementsByTagName('info');
my $state_name = $sitem->getAttribute('name');
my $state_result = $sitem->getAttribute('result');
print "State Name -- $state_name\nState Result -- $state_result\n";
foreach my $i (@info){
my $text = $i->getElementsByTagName('type');
print "Info --- $text\n";
}
}
print "\n";
}
Of course you can manipulate the data any way you'd like. If you are parsing from a file, change parse_string to parse_file.
For the individual elements in the XML, use getElementsByTagName to get the elements within the tags. This should be enough to get you going.
There seem to be two small mistakes here.
1. Call findvalue on the XPathContext object, passing the context node as a parameter.
2. name is an attribute of country, not a node.
Therefore try:
my $country_name = $xc->findvalue('./x:country/@name', $camelid );
Update for the updated question (why it works once the namespace is removed from the XML and the XPath is changed slightly):
To understand what happens here, have a look at NOTE ON NAMESPACES AND XPATH.
In your case, $camelid->findvalue('./x:state/@name'); calls findvalue on a node.
But: "The recommended way is to use the XML::LibXML::XPathContext module to define an explicit context for XPath evaluation, in which a document independent prefix-to-namespace mapping can be defined." That is what I did above.
Conclusion:
Calling find on a node will only work if the root element has no namespace
(or if you use the same prefix as in the XML document, if there is one).
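Putting both points together, here is a minimal sketch that walks countries and states through the XPathContext and builds CSV-style rows. The XML is a cut-down version of the sample, and the @rows array and the column order are my own assumptions about what the CSV should contain.

```perl
use strict;
use warnings;
use XML::LibXML;

# Cut-down version of the sample document, keeping the default namespace.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<DataSet xmlns="http://www.w3schools.com">
  <country name="ABC" type="MALE">
    <state name="ABC_state1" result="PASS"/>
  </country>
  <country name="XYZ" type="MALE">
    <state name="XYZ_state2" result="FAIL"/>
  </country>
</DataSet>
XML

# The prefix 'x' only exists in this context; the document itself has none.
my $xc = XML::LibXML::XPathContext->new($doc);
$xc->registerNs(x => 'http://www.w3schools.com');

my @rows;
for my $country ($xc->findnodes('/x:DataSet/x:country')) {
    # Always query through $xc and pass the context node as second argument.
    for my $state ($xc->findnodes('x:state', $country)) {
        push @rows, join ',',
            $country->getAttribute('name'),
            $country->getAttribute('type'),
            $state->getAttribute('name'),
            $state->getAttribute('result');
    }
}

print "$_\n" for @rows;
```

Attributes without a prefix are in no namespace, so getAttribute('name') and @name need no prefix even though the elements do.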

Organizing data with XPath in Perl

I am using this line of code to get two data entries from an XML file
perl xmlPerl.pl zbxml.xml "//zabbix_export/templates/template/items/item/name/text() | //zabbix_export/templates/template/items/item/description/text()"
Which takes the data, and displays it vertically. For example:
name1
description1
name2
description2
I used this in C# and had some code so that it would display like this:
name1 - description1
name2 - description2
name3 - (blank since there
isnt a description)
There were even some blanks in the description. Here is the C# code, since it may help.
XPathExpression expr;
expr = nav.Compile("/zabbix_export/templates/template/items/item/name | /zabbix_export/templates/template/items/item/description");
XPathNodeIterator iterator = nav.Select(expr);
//Iterate on the node set
List<string> listBox1 = new List<string>();
listBox1.Clear();
try
{
while (iterator.MoveNext())
{
XPathNavigator nav2 = iterator.Current.Clone();
// nav2.Value;
listBox1.Add(nav2.Value);
Console.Write(nav2.Value);
iterator.MoveNext();
nav2 = iterator.Current.Clone();
Console.Write("-" + nav2.Value + "\n");
Well, I am having to switch it to Perl now, and I am not sure if I should try and find some Perl code to do what I need, or if this can be done in XPath? I tried looking at some w3 tutorials, but didn't find what I was looking for.
Thanks!
edit -
would I need to edit this part of my xmlPerl.pl
# print each node in the list
foreach my $node ( $nodeset->get_nodelist ) {
print XML::XPath::XMLParser::as_string( $node ) . "\n";
}
It cannot be done with a single XPath 1.0 expression. It can be done with an XSL transformation:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="text()"/>
<xsl:template match="item">
<xsl:value-of select="concat(name,' - ',description,'&#10;')"/>
</xsl:template>
</xsl:stylesheet>
A simple Perl script that applies this XSLT will do the trick - see this for example (or any other command-line utility that applies an XSLT for that matter - like msxsl.exe)
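If you would rather stay in Perl than apply an XSLT, the pairing can also be done by selecting the item elements and reading name and description relative to each one. The sketch below uses XML::LibXML rather than the XML::XPath module from xmlPerl.pl, which is a swap on my part, and the inlined XML is an invented minimal stand-in for zbxml.xml.

```perl
use strict;
use warnings;
use XML::LibXML;

# Invented minimal stand-in for zbxml.xml; the third item has no description.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<zabbix_export><templates><template><items>
  <item><name>name1</name><description>description1</description></item>
  <item><name>name2</name><description>description2</description></item>
  <item><name>name3</name></item>
</items></template></templates></zabbix_export>
XML

my @lines;
for my $item ($doc->findnodes('/zabbix_export/templates/template/items/item')) {
    # findvalue returns an empty string when the element is missing
    my $name = $item->findvalue('name');
    my $desc = $item->findvalue('description');
    push @lines, "$name - $desc";
}

print "$_\n" for @lines;
```

Because each name and description is looked up inside its own item, a missing description leaves a blank instead of shifting the following values.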

Perl LibXML raw data from textContent?

Given the following XML:
<?xml version="1.0" encoding="utf-8" ?>
<Request>
<form_submit>
<form_submit id = 1424>
<form_id>1424</form_id>
<field1 id=’5’> <![CDATA[ test ]]> </field1>
<field2 id=’6’> <![CDATA[ test2 ]]> </field2>
</form_submit>
</form_submit>
</Request>
I'm trying to get the raw values for the field1 and field2 elements. I'm using the following code:
foreach my $node ( $xml_request->findnodes('Request/*/*/*[@id]') )
{
my $form_field_value = $node->textContent;
print "Value:\"$form_field_value\"\n";
}
But the output is:
Value:" test "
Value:" test2 "
How do I retrieve the exact data, raw and as is, with all the special characters? So that the output is:
Value:" <![CDATA[ test ]]> "
Value:" <![CDATA[ test2 ]]> "
Thank you.
I am not a libxml expert.
However, this is what I could figure out after playing with your XML and libxml a bit.
CDATA is a node/section of its own and is not part of the text.
The code below goes one level deep and calls toString() on CDATA child nodes
and textContent on other nodes.
foreach my $node ( $xml_request->findnodes('Request/*/*/*[@id]') )
{
my $text;
if($node->childNodes) {
foreach my $child ($node->childNodes()) {
if ($child->nodeType == XML::LibXML::XML_CDATA_SECTION_NODE) {
$text .= $child->toString;
} else {
$text .= $child->textContent;
}
}
} else {
$text = $node->textContent;
}
print qq{"$text"\n};
}
will print
" <![CDATA[ test ]]> "
" <![CDATA[ test2 ]]> "
Your sample data is invalid XML, and won't parse unless you replace 1424, ’5’ and ’6’ with "1424", "5" and "6".
You have asked for the text content and have got exactly that. To get what you need you must search for the children of the <fieldN> elements and use the toString method on them.
This code shows the idea. Note that the spaces before and after the CDATA, which would otherwise appear as separate text nodes, have been eliminated using a keep_blanks => 0 option on the object constructor.
use strict;
use warnings;
use XML::LibXML;
my $xml_request = XML::LibXML->load_xml(string => <<'END', keep_blanks => 0);
<?xml version="1.0" encoding="utf-8" ?>
<Request>
<form_submit>
<form_submit id = "1424">
<form_id>1424</form_id>
<field1 id="5"> <![CDATA[ test ]]> </field1>
<field2 id="6"> <![CDATA[ test2 ]]> </field2>
</form_submit>
</form_submit>
</Request>
END
foreach my $node ( $xml_request->findnodes('//form_submit/*[@id]/text()') ) {
my $form_field_value = $node->toString;
print qq(Value: "$form_field_value"\n);
}
output
Value: "<![CDATA[ test ]]>"
Value: "<![CDATA[ test2 ]]>"
Edit
ikegami has commented that the output requested in the question includes the whitespace surrounding the CDATA section. I don't know whether that is truly part of the requirement, but this edit provides a way to do that.
This would be clearer using XML::LibXML::Reader as it has a readInnerXml method (comparable to JavaScript's innerHTML ) that does exactly what is necessary. Instead, this program has to serialize all the children of the <fieldN> nodes and concatenate them with join.
This is a new foreach loop. The rest of the program remains unchanged except for the construction of $xml_request, which must have the keep_blanks option set to 1 or removed altogether.
foreach my $node ( $xml_request->findnodes('//*[starts-with(name(),"field")][@id]') ) {
my $form_field_value = join '', map $_->toString, $node->childNodes;
print qq(Value: "$form_field_value"\n);
}
output
Value: " <![CDATA[ test ]]> "
Value: " <![CDATA[ test2 ]]> "
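To see why textContent and toString differ here, it can help to look at the child nodes of a single field element. This is a small sketch with a made-up one-element document; it assumes the default parser options, so the blanks around the CDATA section are kept as text nodes.

```perl
use strict;
use warnings;
use XML::LibXML;

# Made-up single-element document; whitespace around the CDATA is significant.
my $doc = XML::LibXML->load_xml(
    string => '<field1 id="5"> <![CDATA[ a < b ]]> </field1>'
);
my $field = $doc->documentElement;

# Classify each child: the CDATA section is its own node between two text nodes.
my @kinds = map {
    $_->nodeType == XML::LibXML::XML_CDATA_SECTION_NODE() ? 'cdata' : 'text'
} $field->childNodes;

# Serialising the children keeps the CDATA markers; textContent drops them.
my $raw  = join '', map { $_->toString } $field->childNodes;
my $text = $field->textContent;

print "kinds: @kinds\n";
print qq(raw:  "$raw"\n);
print qq(text: "$text"\n);
```

The element has three children (text, CDATA, text), which is why the loops above have to serialise every child rather than only the CDATA node when the surrounding whitespace matters.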