Perl script to replace XML values

I have this XML file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<BroadsoftDocument protocol = "OCI" xmlns="C" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sessionId xmlns="">169.254.52.85,16602326,1324821125562</sessionId>
<command xsi:type="UserAddRequest14sp9" xmlns="">
<serviceProviderId>AtyafBahrain</serviceProviderId>
<groupId>LoadTest</groupId>
<userId>user_0002@atyaf.me</userId>
<lastName>0002</lastName>
<firstName>user</firstName>
<callingLineIdLastName>0002</callingLineIdLastName>
<callingLineIdFirstName>user</callingLineIdFirstName>
<password>123456</password>
<language>English</language>
<timeZone>Asia/Bahrain</timeZone>
<address/>
</command>
</BroadsoftDocument>
I need to replace the values of some fields (userId, firstName, password) and save the result under the same file name.
When I write the file back with the code below, the XML format gets disturbed:
XMLout( $xml, KeepRoot => 1, NoAttr => 1, OutputFile => $xml_file, );
Can you please advise how to edit the XML file without changing its syntax?

You can check out the XML::Simple parser for Perl; see its documentation on CPAN. I have used it for parsing XML files, but I think it should allow modification as well.

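# Open the XML file for reading and writing ($filename_1 holds its name)
open(my $fh, '+<', $filename_1) or die "Cannot open $filename_1: $!";
my @lines = <$fh>;
seek $fh, 0, 0;
foreach my $line (@lines)
{
# Find string_1 and replace it with string_2
$line =~ s/$str_1/$str_2/g;
# Write the line back to the file
print $fh $line;
}
# Drop leftover bytes if the new content is shorter than the old
truncate $fh, tell($fh);
close $fh;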
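
A DOM parser is another way to change element values without reshaping the rest of the document. The following is a minimal sketch using XML::LibXML, not something taken from the answers above. It works on a trimmed inline copy of the document (the default-namespace declarations are left out for brevity), and the replacement values in %new_value are hypothetical.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed inline copy of the document; with a real file you would call
# XML::LibXML->load_xml(location => $xml_file) instead.
my $xml = <<'EOT';
<?xml version="1.0"?>
<BroadsoftDocument protocol="OCI" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sessionId>169.254.52.85,16602326,1324821125562</sessionId>
<command xsi:type="UserAddRequest14sp9">
<userId>user_0002</userId>
<firstName>user</firstName>
<password>123456</password>
<address/>
</command>
</BroadsoftDocument>
EOT

my $doc = XML::LibXML->load_xml(string => $xml);

# Hypothetical replacement values, keyed by element name.
my %new_value = (
    userId    => 'user_0003',
    firstName => 'newuser',
    password  => 'secret',
);

for my $name (sort keys %new_value) {
    for my $node ($doc->findnodes("//command/$name")) {
        # Replace whatever text the element holds with the new value.
        $node->removeChildNodes();
        $node->appendText($new_value{$name});
    }
}

# Attributes and untouched elements are serialized as they were parsed.
my $result = $doc->toString;
print $result;
# To overwrite the original file: $doc->toFile($xml_file);
```

In the original document the command element resets the default namespace with xmlns="", so the same prefix-free XPath should match there as well, but that is worth verifying against the real file.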

Related

How to parse <rss> tag with XML::LibXML to find xmlns definitions

It seems that there is no consistent way that podcasts define their RSS feeds.
I ran into one that uses different schema definitions for the RSS.
What's the best way to scan for XML namespaces in an RSS feed using XML::LibXML?
E.g.
One feed might be
<rss
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">
Another might be
<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" version="2.0"
xmlns:atom="http://www.w3.org/2005/Atom">
I want to include in my script an assessment of all the namespaces being used so that when parsing the rss, the appropriate field names can be tracked.
Not sure what that will look like yet, as I'm not sure this module has the capability to do the <rss> tag attribute atomization that I want.
I'm not sure I understand exactly what kind of output you're looking for, but XML::LibXML is indeed able to list the namespaces:
use warnings;
use strict;
use XML::LibXML;
my $dom = XML::LibXML->load_xml(string => <<'EOT');
<rss
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">
</rss>
EOT
for my $ns ($dom->documentElement->getNamespaces) {
print $ns->getLocalName(), " / ", $ns->getData(), "\n";
}
Output:
content / http://purl.org/rss/1.0/modules/content/
wfw / http://wellformedweb.org/CommentAPI/
dc / http://purl.org/dc/elements/1.1/
atom / http://www.w3.org/2005/Atom
sy / http://purl.org/rss/1.0/modules/syndication/
slash / http://purl.org/rss/1.0/modules/slash/
I know that the OP has already accepted an answer, but for completeness' sake it should be mentioned that the recommended way to make searches on the DOM resilient is to use XML::LibXML::XPathContext:
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
my @examples = (
<<EOT
<rss xmlns:atom="http://www.w3.org/2005/Atom">
<atom:test>One Ring to rule them all,</atom:test>
</rss>
EOT
,
<<EOT
<rss xmlns:a="http://www.w3.org/2005/Atom">
<a:test>One Ring to find them,</a:test>
</rss>
EOT
,
<<EOT
<rss xmlns="http://www.w3.org/2005/Atom">
<test>The end...</test>
</rss>
EOT
,
);
my $xpc = XML::LibXML::XPathContext->new();
$xpc->registerNs('atom', 'http://www.w3.org/2005/Atom');
for my $example (@examples) {
my $dom = XML::LibXML->load_xml(string => $example)
or die "XML: $!\n";
for my $node ($xpc->findnodes("//atom:test", $dom)) {
printf("%-10s: %s\n", $node->nodeName, $node->textContent);
}
}
exit 0;
That is, you register your own local prefix for each namespace you are interested in, regardless of the prefix the document itself uses.
Output:
$ perl dummy.pl
atom:test : One Ring to rule them all,
a:test    : One Ring to find them,
test      : The end...
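
To track which namespaces a given feed declares, the two answers can be combined by collecting the declarations into a hash keyed by URI. This is only a sketch under that assumption; the hash name %prefix_for is hypothetical, and the feed is an inline sample based on the second example in the question.

```perl
use strict;
use warnings;
use XML::LibXML;

my $dom = XML::LibXML->load_xml(string => <<'EOT');
<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" version="2.0"
     xmlns:atom="http://www.w3.org/2005/Atom">
</rss>
EOT

# Map each declared namespace URI to the prefix this feed uses for it.
my %prefix_for;
for my $ns ($dom->documentElement->getNamespaces) {
    # Guard against an undefined prefix (default namespace).
    $prefix_for{ $ns->getData() } = $ns->getLocalName() // '';
}

# Decide what to parse based on the URI, not on the feed's own prefix.
my $itunes_uri = 'http://www.itunes.com/dtds/podcast-1.0.dtd';
if (exists $prefix_for{$itunes_uri}) {
    print "Feed declares the iTunes namespace as '$prefix_for{$itunes_uri}'\n";
}
```

Keying on the URI matters because two feeds can bind the same namespace to different prefixes, as the XPathContext example shows.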

How to actually modify values of an XML file using XML::LibXML

I have an XML file (information.xml). I have to extract element and attribute values from it and insert them into another XML file (build.xml), filling in the appropriate elements and attributes there.
I have to use XML::LibXML to do so. I am able to extract the element and attribute values from information.xml, but I am unable to open build.xml and fill in those values.
Example :
information.xml
<info>
<app version="10.5.10" long_name ="My Application">
<name> MyApp </name>
<owner>larry </owner>
<description> This is my first application</description>
</app>
</info>
build.xml
<build long_name="" version="">
<section type="Appdesciption">
<description> </description>
</section>
<section type="Appdetails">
<app_name> </app_name>
<owner></owner>
</section>
</build>
Now, my task is to extract value of owner from information.xml, open build.xml, search for owner tag in build.xml and put the extracted value there.
The Perl script looks like:
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
my $file1="/root/shubhra/myapp/information.xml";
my $file2="/root/shubhra/myapp/build.xml";
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($file1);
foreach my $line ($doc->findnodes('//info/app'))
{
my $owner= $line->findnodes('./owner'); # 1st way
print "\n",$owner->to_literal,"\n";
my ($long_name) = $line->findvalue('./@long_name'); # 2nd way
print "\n $long_name \n";
my $version = $line->findnodes('@version');
print "\n",$version->to_literal,"\n";
}
my $parser2 = XML::LibXML->new();
my $doc2 = $parser2->parse_file($file2);
foreach my $line2 ($doc2->findnodes('//build'))
{
my ($owner2)= $line2->findnodes('./section/owner/text()');
my ($version2)=$line2->findvalue('./@version');
print "\n Build.xml already has version : $version2 \n";
print "\n Build.xml already has owner :",$owner2->to_literal;
$owner2->setData("Windows Application 2"); # Not changing build.xml
$line2->setAttribute(q|version|,"60.60.60"); # Not changing build.xml
my $changedversion = $line2->getAttribute(q|version|);
# the value changes in memory, but the content of build.xml does not
print "\n The changed version is : $changedversion";
}
build.xml looks like :
<build long_name="" version="9.10.10">
<section type="Appdesciption">
<description> </description>
</section>
<section type="Appdetails">
<app_name> </app_name>
<owner>shubhra</owner>
</section>
</build>
my $doc3 = XML::LibXML->load_xml(location => $file2, no_blanks => 1);
my $xpath_expression = '/build/section/owner/text()';
my @nodes = $doc3->findnodes( $xpath_expression );
for my $node (@nodes) {
my $content = $node->toString;
$content = $owner;
$node->setData($content);
}
# Write the modified build.xml tree ($doc3), not the information.xml tree
$doc3->toFile($file2 . '.new', 1);
The following fails to find anything (setting $owner2 to undef) since owner has no text:
my ($owner2) = $line2->findnodes('./section/owner/text()');
You want
my ($owner2) = $line2->findnodes('./section/owner');
This entails changing
print "\n Build.xml already has owner :", $owner2->to_literal;
to
print "\n Build.xml already has owner :", $owner2->textContent;
and
$owner2->setData("Windows Application 2");
to
$owner2->removeChildNodes();
$owner2->appendText("Windows Application 2");
You imply you want the following to change build.xml, but it doesn't even mention build.xml:
$line2->setAttribute(q|version|, "60.60.60");
It does modify $doc2, but you'll need to add the following code to modify build.xml too:
$doc2->toFile('build.xml');
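
Putting the pieces of this answer together, the whole round trip might look like the sketch below. It uses inline, shortened copies of the two documents so that it runs without files; with real files you would load them with load_xml(location => ...) and finish with toFile as described above.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened inline copies of information.xml and build.xml.
my $info = XML::LibXML->load_xml(string => <<'EOT');
<info>
  <app version="10.5.10" long_name="My Application">
    <name>MyApp</name>
    <owner>larry</owner>
  </app>
</info>
EOT

my $build = XML::LibXML->load_xml(string => <<'EOT');
<build long_name="" version="">
  <section type="Appdetails">
    <app_name> </app_name>
    <owner></owner>
  </section>
</build>
EOT

# Read the values from the information.xml tree.
my ($app)   = $info->findnodes('/info/app');
my $owner   = $app->findvalue('./owner');
my $version = $app->getAttribute('version');

# Select the <owner> element itself (it has no text node yet) and fill it.
my ($owner2) = $build->findnodes('/build/section/owner');
$owner2->removeChildNodes();
$owner2->appendText($owner);
$build->documentElement->setAttribute(version => $version);

# Only the in-memory tree has changed so far. Serializing is what would
# update the file, for example: $build->toFile('build.xml');
my $result = $build->toString;
print $result;
```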

Could not open file in Perl

I am trying to convert plist files into JUnit-style XML. I have an XSL stylesheet which converts the plist to JUnit/ANT XML.
Here is the perl code which I run to convert the plist to XML:
my $parser = XML::LibXML->new();
my $xslt = XML::LibXSLT->new();
my $stylesheet = $xslt->parse_stylesheet_file("\\\~/Hudson/build/workspace/ui-automation/automation\\\ test\\\ suite/plist2junit.xsl");
my $counter = 1;
my @plistFiles = glob('Logs/*/*.plist');
foreach (@plistFiles){
#Escape the file path and specify absolute path
my $plistFile = $_;
$plistFile =~ s/([ ()])/\\$1/g;
$path2plist = "\\\~/Hudson/build/workspace/ui-automation/automation\\\ test\\\ suite/$plistFile";
#transform the plist file to xml
my $source = $parser->parse_file($path2plist);
my $results = $stylesheet->transform($source);
my $resultsFile = "\\\~/Hudson/build/workspace/ui-automation/automation\\\ test\\\ suite/JUnit/results$counter.xml";
#create the output file
unless(open FILE, '>'.$resultsFile) {
# Die with error message
die "\nUnable to create $resultsFile\n";
}
# Write results to the file.
$stylesheet->output_file($results, FILE);
close FILE;
$counter++;
}
After running the perl script on Hudson/Jenkins, it outputs this error message:
Couldn't open ~/Hudson/build/workspace/ui-automation/automation\ test\ suite/Logs/Run\ 1/Automation\ Results.plist: No such file or directory
The error is caused by my $source = $parser->parse_file($path2plist); in the code. I am unable to figure out why it cannot find/read the file.
Anyone know what might be causing the error?
There are three obvious errors in the path mentioned in the error message.
~/Hudson/build/workspace/ui-automation/automation\ test\ suite/Logs/Run\ 1/Automation\ Results.plist
Those are:
There's no directory named ~ in the current directory. Perhaps you meant to use the value of $ENV{HOME} there?
There's no directory named automation\ test\ suite anywhere on your disk, but there is probably one named automation test suite.
Similarly, there's no directory named Run\ 1 anywhere on your disk, but there is probably one named Run 1.
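
The three points above can be sketched as follows, using only core modules. The helper name absolute_plist_path is hypothetical, and the directory layout is copied from the question. Because no shell is involved when Perl opens a file, the spaces stay unescaped and the home directory comes from $ENV{HOME} rather than a literal ~.

```perl
use strict;
use warnings;
use File::Spec;

# Build the absolute path of a plist file found relative to the suite
# directory. Spaces are kept as they are: backslash escaping is shell
# syntax and does not belong in a path handed to Perl.
sub absolute_plist_path {
    my ($home, $relative) = @_;
    return File::Spec->catfile(
        $home, 'Hudson', 'build', 'workspace',
        'ui-automation', 'automation test suite', $relative,
    );
}

my $home = $ENV{HOME} // '';
my $path = absolute_plist_path($home, 'Logs/Run 1/Automation Results.plist');
print "$path\n";

# $path can now be passed to the parser unchanged, for example:
# my $source = $parser->parse_file($path);
```

The same applies to the stylesheet and results paths in the question, which are escaped the same way.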

Escape special characters in text

I am reading an XML file and adding some additional text, but I can't get the exact text because some special characters are automatically converted.
My input file is:
<book>
<book-meta>
<book-id pub-id-type="doi">1545</book-id>
<book-title>Regenerating <?tex?> the Curriculum</book-title>
</book-meta>
</book>
Script:
use strict;
use XML::Twig;
open(my $out, '>', 'Output.xml') or die "Can't create story file: $!\n";
my $story_file = XML::Twig->new(
twig_handlers => {
'book-id' => sub { $_->set_text('<?sample?>') },
},
# keep_atts_order is a constructor option, not a handler
keep_atts_order => 1,
pretty_print => 'indented',
);
$story_file->parsefile('sample.xml');
$story_file->print($out);
Output:
<book>
<book-meta>
<book-id pub-id-type="doi">&lt;?sample?&gt;</book-id>
<book-title>Regenerating <?tex?> the Curriculum</book-title>
</book-meta>
</book>
I would like output as:
<book>
<book-meta>
<book-id pub-id-type="doi"><?sample?></book-id>
<book-title>Regenerating <?tex?> the Curriculum</book-title>
</book-meta>
</book>
How can I keep this type of character unescaped in XML::Twig? I tried the set_asis option, but I can't get it to work.
XML::Twig is correctly inserting the string <?sample?> for you: you are asking for a PCDATA node to be added, and < must be replaced with &lt; in such a node. What you want, however, is a processing instruction node.
The easiest way to insert such a node using XML::Twig is using the set_inner_xml method, which will parse an XML tree fragment from a string and insert it as the contents of the current node.
If you replace
$_->set_text('<?sample?>')
with
$_->set_inner_xml('<?sample?>')
then your code should do what you want. The output I get is
<book>
<book-meta>
<book-id pub-id-type="doi"><?sample?></book-id>
<book-title>Regenerating <?tex?> the Curriculum</book-title>
</book-meta>
</book>
<? ..... ?> is not (part of) text but a processing instruction. When you add it to your XML with set_text, however, it is processed as text, hence the &lt;.
I'm not familiar with XML::Twig myself, but I think you should check for the possibility to add a processing instruction instead of text.

Perl using XML Path Context to extract out data

I have the following xml
<?xml version="1.0" encoding="utf-8"?>
<Response>
<Function Name="GetSomethingById">
<something idSome="1" Code="1" Description="TEST01" LEFT="0" RIGHT="750" />
</Function>
</Response>
and I want the attributes of the <something> node as a hash. I'm trying the following:
my $xpc = XML::LibXML::XPathContext->new(
XML::LibXML->new()->parse_string($xml) # $xml is containing the above xml
);
my @nodes = $xpc->findnodes('/Response/Function/something');
I'm expecting to have something like $nodes[0]->getAttributes. Any help?
my %attributes = map { $_->name => $_->value } $node->attributes();
Your XPath query seems to be wrong: you are searching for '/WSApiResponse/Function/something' while the root node of your XML is Response, not WSApiResponse.
From the docs of XML::LibXML::Node (the kind of object that findnodes() is expected to return), you should look for my @attrs = $nodes[0]->attributes() instead of $nodes[0]->getAttributes.
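
Combining the two suggestions above, a complete version might look like this sketch. It uses the XML from the question inline and the nodeName and getValue accessors on each attribute object.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'EOT';
<?xml version="1.0" encoding="utf-8"?>
<Response>
<Function Name="GetSomethingById">
<something idSome="1" Code="1" Description="TEST01" LEFT="0" RIGHT="750" />
</Function>
</Response>
EOT

my $xpc = XML::LibXML::XPathContext->new(
    XML::LibXML->load_xml(string => $xml)
);
my ($node) = $xpc->findnodes('/Response/Function/something');
die "No <something> element found\n" unless $node;

# attributes() returns one object per attribute; turn them into a hash
# of attribute name => attribute value.
my %attributes = map { ( $_->nodeName => $_->getValue ) } $node->attributes();

for my $name (sort keys %attributes) {
    print "$name = $attributes{$name}\n";
}
```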
I use XML::Simple for this type of thing. If the XML file is data.xml:
use strict;
use XML::Simple();
use Data::Dumper();
my $xml = XML::Simple::XMLin( "data.xml" );
print Data::Dumper::Dumper($xml);
my $href = $xml->{Function}->{something};
print Data::Dumper::Dumper($href);
Note: With XML::Simple the root tag maps to the result hash itself. Thus there is no $xml->{Response}