Print output using XML::LibXML - perl

my $doc = $parser->parse_string( $res->content );
my $root = $doc->getDocumentElement;
my #objects = $root->getElementsByTagName('OBJECT');
foreach my $object ( #objects ){
my $name = $object->firstChild;
print "OBJECT = " . $name . "\n";}
OUTPUT is:
OBJECT = XML::LibXML::Text=SCALAR(0x262e170)
OBJECT = XML::LibXML::Text=SCALAR(0x2ee4b00)
OBJECT = XML::LibXML::Text=SCALAR(0x262e170)
OBJECT = XML::LibXML::Text=SCALAR(0x2ee4b00)
Can anyone please explain why print prints the $name attribute values like this? Why does it print normal when I use the function getAttribute with virtually he same code?

getAttribute returns an attribute, while firstChild returns a text node, element, processing instruction, or a comment.
What you see is a normal Perl way of printing an object: it prints its class and address. Your version of XML::LibXML seems to be a bit antique, recent versions overload the stringification and the code produces the actual text node.
#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML;
my $doc = 'XML::LibXML'->load_xml( string => << '__XML__');
<root>
<OBJECT name="o1">hello</OBJECT>
</root>
__XML__
my #objects = $doc->getElementsByTagName('OBJECT');
for my $object (#objects) {
print 'OBJECT = ', $object->firstChild, "\n";
}
Output:
OBJECT = hello
In the old versions, one needed to call the nodeValue or data method.
print 'OBJECT = ', $object->firstChild->data, "\n";

Related

How do I iterate over methods for Perl object

I've created an Object such as
my $hex = Hexagram->new();
and it has various methods:
top
bot
chinese
title
meaning
This object will be created numerous times and each time I need to gather and test information for each of the above methods.
I would like to do something like
foreach my $method ( qw/top bot chinese title meaning/ )
{
&gather_info($hex,$method);
}
and then have something like
sub gather_info {
my ($hex,$method) = #_;
print "What is the $method? ";
my $response = <STDIN>;
chomp $response;
$hex->${method}($reponse);
.... and other actions ....
}
But this doesn't work. Instead, for each method I seem to have to write out the basic code structure again and again which just seems plain wasteful.
I've also tried something where I try to pass a reference to the method call such as in
foreach my $ra ( [\$hex->top, "top"],
[\$hex->bot, "bot"],....)
{
my ($object_method, $name) = #{$ra};
&rgather_info($object_method, $name);
}
where
sub $gather_info {
my ($rhex, $name) = #_;
print "What is the $name?";
my $response = <STDIN>;
chomp $response;
&{$rhex}($response);
.... and other actions ....
}
But this time I get an error about
Not a CODE reference at <program name> line <line number>,....
Any suggestions on how I can do this?
According to perlobj method calls can be made using a string variable.
$object->$method( #args );
So your foreach loop should have worked fine. Or this one, which is much less wordy:
use strict;
use warnings;
my $hex = Hexagram->new();
gather_info( $hex, $_ )
for qw/top bot chinese title meaning/;
sub gather_info {
my ($hex, $method) = #_;
print "What is $method?\n";
my $response = <STDIN>;
chomp $response;
$hex->$method( $response );
}
Make sure you have strict and warnings enabled and try again. Update you post with errors, etc.

MIME::Parser - Can't save binary attachment

I am using Perl to read messages, look for and save attachments. Attachments will always be binary pdf documents and there will never be more than one attachment. I need to read the subject, check for and save an attachment (if exists) an copy message to a folder for temporary storage.
The reading, printing, copying functions all work. I've tried a lot of different scenarios with MIME::Parser (I have MIME::Tools installed) but either get a blank file or file with 1 or 2 characters. I'd also like to know how to determine / set the file extension rather than just blindly rename to .pdf.
#!/usr/bin/perl
use Net::IMAP::Simple::SSL;
use Email::Simple;
use MIME::Parser;
print "Content-type: text/html\n\n";
$server = new Net::IMAP::Simple::SSL('xxx');
$server->login('xxx','xxx');
my $folder='inbox';
my ($unseen, $recent, $total) = $server->status($folder);
my $newm = $server->select('INBOX');
my $tmp=($total-9); #limit for testing
my $outputdir = "./temp";
my $parser = new MIME::Parser;
$parser->output_dir($outputdir);
for (my $i = $tmp; $i <= $total; $i++) {
if ($server->seen($i)) {
print "Message #$i has been seen before...<br />";
} else {
my $es=Email::Simple->new(join '', #{$server->top($i)});
print $es->header('Subject')." on ";
print $es->header('Date')."<br />";
print "You've just seen message #$i<br />" if $server->see($i)."<br />";
$msg = $server->get($i);
$parser->parse_data($msg);
$server->copy($i,'dump');
}
}
$server->quit();
exit;
Error
parse_data: wrong argument ref type: Net::IMAP::Simple::_message at mailextract.pl line x
Don't know why you're using two different parsers...
my $entity = $parser->parse_data($message);
my $from = $entity->head->get('From');
my $subject = $entity->head->get('Subject');
my $timestamp = $entity->head->get('Date');
for my $part ($entity->parts()) {
if ( $part->mime_type eq 'application/pdf' ) { ### Few different types in use, see what your
### messages get sent with
my $filename = $part->bodyhandle->path;
...
### Do whatever
}
}
Edit: And your error is happening because you're not feeding through the correct thing to be parsed, a Net::IMAP::Simple::_message instead of:
parse_data DATA
Instance method. Parse a MIME message that's already in core. This internally creates an "in memory" filehandle on a Perl scalar value
using PerlIO
You may supply the DATA in any of a number of ways...
A scalar which holds the message. A reference to this scalar will be used internally.
A ref to a scalar which holds the message. This reference will be used internally.
DEPRECATED
A ref to an array of scalars. The array is internally concatenated into a temporary string, and a reference to the new
string is used internally.
It is much more efficient to pass in a scalar reference, so please consider refactoring your code to use that interface instead.
If you absolutely MUST pass an array, you may be better off using
IO::ScalarArray in the calling code to generate a filehandle, and
passing that filehandle to parse()
Try $parser->parse($server->getfh($i));
#!/usr/bin/perl
use Net::IMAP::Simple::SSL;
use MIME::Parser;
print "Content-type: text/html\n\n";
$server = new Net::IMAP::Simple::SSL('xxx');
$server->login('xxx','xxx');
my $newm=0;
$newm = $server->select('INBOX');
if ($newm==0) {
$server->quit();
print "No New Messages.";
exit;
}
my $outputdir = "./temp";
my $parser = new MIME::Parser;
$parser->output_dir($outputdir);
for (my $i = 1; $i <= $newm; $i++) {
my $entity = $parser->parse($server->getfh($i));
my $from = $entity->head->get('From');
my $subject = $entity->head->get('Subject');
my $timestamp = $entity->head->get('Date');
print "#$i $from / $subject / $timestamp<br />";
for my $part ($entity->parts()) {
print " / ".$part->mime_type;
if ( $part->mime_type eq 'application/octet-stream' || $part->mime_type eq 'application/pdf' ) {
my $filename = $part->bodyhandle->path;
print " / $filename";
}
print "<br />";
}
$server->copy($i,'dump');
$server->delete($i);
}
$server->quit();

Doing XPath using Perl

I am coding with Perl on a Window 7 machine. I am able to extract data from the XML using the XPath code below
use strict;
use warning;
use XML::LibXML;
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($newfile);
my $query = "/tradenet/message/header/unique_ref_no/date/text( )";
my($node) = $doc->findnodes($query);
$node->setData("$file_seq_number");
However, when i use the same code on a different XML, the xpath from the second document looks as below:
/TradenetResponse/OutboundMessage/out:OutwardPermit/out:Declaration/out:Header/cac:UniqueReferenceNumber/cbc:SequenceNumeric
Together with the Perl code, this is what the extraction code looks like:
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($newfile);
my $query = "/TradenetResponse/OutboundMessage/out:OutwardPermit/out:Declaration/out:Header/cac:UniqueReferenceNumber/cbc:SequenceNumeric/text( )";
my($node) = $doc->findnodes($query);
$node->setData("$file_seq_number");
Using the second code, I am unable to retrieve the data from the second XML. I receive this error "Can't call method "setData"on an undefined value at Perl.pl line 5".
Does the ":" character in the second XPATH address affecting the code?
You have to define what out, cac, and cbc mean in order for the XPath query to find the appropriate nodes:
my $doc = $parser->parse_file($newfile);
my $xpath_context = XML::LibXML::XPathContext->new($doc->documentElement());
# These URIs need to be the same as the ones in the source document
$xpath_context->registerNs('out', 'http://example.com/out.xsd');
$xpath_context->registerNs('cac', 'http://example.com/cac.xsd');
$xpath_context->registerNs('cbc', 'http://example.com/cbc.xsd');
my $query = "/TradenetResponse/OutboundMessage/out:OutwardPermit/out:Declaration/out:Header/cac:UniqueReferenceNumber/cbc:SequenceNumeric/text( )";
my ($node) = $xpath_context->findnodes($query);
As promised, here is a working example. First, the test input file:
<?xml version="1.0"?>
<!-- input.xml -->
<TradenetResponse xmlns:a="http://example.com/out.xsd"
xmlns:b="http://example.com/cac.xsd"
xmlns:c="http://example.com/cbc.xsd">
<OutboundMessage>
<a:OutwardPermit>
<a:Declaration>
<a:Header>
<b:UniqueReferenceNumber>
<c:SequenceNumeric>1234</c:SequenceNumeric>
</b:UniqueReferenceNumber>
</a:Header>
</a:Declaration>
</a:OutwardPermit>
</OutboundMessage>
</TradenetResponse>
And here is the working Perl script:
#!/usr/bin/perl
# parse.pl
use strict;
use warnings;
use XML::LibXML;
my $parser = XML::LibXML->new();
my $newfile = "input.xml";
my $doc = $parser->parse_file($newfile);
my $xpath_context = XML::LibXML::XPathContext->new($doc->documentElement());
# These URIs need to be the same as the ones in the source document
$xpath_context->registerNs('out', 'http://example.com/out.xsd');
$xpath_context->registerNs('cac', 'http://example.com/cac.xsd');
$xpath_context->registerNs('cbc', 'http://example.com/cbc.xsd');
# Query wrapped for clarity
my $query = "/TradenetResponse/OutboundMessage/out:OutwardPermit" .
"/out:Declaration/out:Header/cac:UniqueReferenceNumber" .
"/cbc:SequenceNumeric/text()";
my ($node) = $xpath_context->findnodes($query);
print "Value: " . $node->getData() . "\n";
The output for me is:
sean#localhost:~xmltest$ ./parse.pl
Value: 1234

CppUnit output to TAP format converter

I seek a perl module to convert CppUnit output to TAP format. I want to use the prove command afterwards to run and check the tests.
Recently I was doing some converting from junit xml (not to TAP format though).
It was very easy to do by using XML::Twig module.
You code should look like this:
use XML::Twig;
my %hash;
my $twig = XML::Twig->new(
twig_handlers => {
testcase => sub { # this gets called per each testcase in XML
my ($t, $e) = #_;
my $testcase = $e->att("name");
my $error = $e->field("error") || $e->field("failure");
my $ok = defined $error ? "not ok" : "ok";
# you may want to collect
# testcase name, result, error message, etc into hash
$hash{$testcase}{result} = $ok;
$hash{$testcase}{error} = $error;
# ...
}
}
);
$twig->parsefile("test.xml");
$twig->purge();
# Now XML processing is done, print hash out in TAP format:
print "1..", scalar(keys(%hash)), "\n";
foreach my $testcase (keys %hash) {
# print out testcase result using info from hash
# don't forget to add leading space for errors
# ...
}
This should be relatively easy to polish into working state

How can I access attributes and elements from XML::LibXML in Perl?

I am having trouble understanding / using name spaces with XML::LibXML package in Perl. I can access an element successfully but not an attribute. I have the following code which accesses an XML file (http://pastebin.com/f3fb9d1d0).
my $tree = $parser->parse_file($file); # parses the file contents into the new libXML object.
my $xpc = XML::LibXML::XPathContext->new($tree);
$xpc->registerNs(microplateML => 'http://moleculardevices.com/microplateML');
I then try and access an element called common-name and an attribute called name.
foreach my $camelid ($xpc->findnodes('//microplateML:species')) {
my $latin_name = $camelid->findvalue('#name');
my $common_name = $camelid->findvalue('common-name');
print "$latin_name, $common_name" ;
}
But only the latin-name (#name) is printing out, the common-name is not. What am I doing wrong and how can I get the common-name to print out as well?
What does the #name do in this case? I presume it is an array, and that attributes should be put into an array as there can be more than one, but elements (like common-name) should not be because there should just be one?
I've been following the examples here: http://www.xml.com/pub/a/2001/11/14/xml-libxml.html
and here: http://perl-xml.sourceforge.net/faq/#namespaces_xpath, and trying to get their example camel script working with my namespace, hence the weird namespace.
Make sure you XML file is valid then use $node->getAttribute("someAttribute") to access attributes.
#name is a attribute name. You'd use it in findnodes() to specify elements with a given attribute set. Eg. a path like:
//camelids/species[#name="Camelus bactrianus"]/
Here is a simple/contrived example:
#!/usr/bin/perl -w
use XML::LibXML;
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file('/Users/castle/Desktop/animal.xml');
my $xc = XML::LibXML::XPathContext->new( $doc->documentElement() );
$xc->registerNs('ns', 'http://moleculardevices.com/microplateML');
my #n = $xc->findnodes('//ns:species');
foreach $nod (#n) {
print "A: ".$nod->getAttribute("name")."\n";
my #c = $xc->findnodes("./ns:common-name", $nod);
foreach $cod (#c) {
print "B: ".$cod->nodeName;
print " = ";
print $cod->getFirstChild()->getData()."\n";
}
}
Output is:
perl ./xmltest.pl
A: Camelus bactrianus
B: common-name = Bactrian Camel