XML::Simple not grabbing individual XML nodes - perl

I'm using the MediaWiki API to get search results. I simply want to grab the URL to the first result, the XML element marked 'Url'. There will eventually be other things I will want to do with the XML, but I suppose in getting an answer for this I will realize what I'm doing wrong and be able to do the other stuff. Here's the page I'm working with.
require HTTP::Request;
require LWP::UserAgent;
require XML::Simple;
my $url = URI->new("http://en.wikipedia.org/w/api.php?action=opensearch&search=rooney&limit=10&namespace=0&format=xml");
my $request = HTTP::Request->new(GET => $url);
my $ua = LWP::UserAgent->new;
my $response = $ua->request($request);
my $xml = XML::Simple->new();
my $data = $xml->XMLin($response->content);
Everything up to here seems to work fine. My HTTP request goes through alright (if I just print $response->content it returns the XML content fine and if I print $data, I am told that it is a hash.
In attempt to get the 'Url' element, I have tried numerous approaches based on the searching I've done. A few below:
print $data->{'Url'};
print $data->{Url};
print $data{Url}

Pro tip: use Data::Dumper to look inside your data structure.
use Data::Dumper;
print Dumper($data);
You'll get something like this ...
$VAR1 = {
'xmlns' => 'http://opensearch.org/searchsuggest2',
'Section' => {
'Item' => [
{
'Url' => {
'content' => 'http://en.wikipedia.org/wiki/Rooney',
'xml:space' => 'preserve'
},
'Description' => {
'content' => 'Rooney may refer to:',
'xml:space' => 'preserve'
},
'Text' => {
'content' => 'Rooney',
'xml:space' => 'preserve'
}
},
... much much more ...
from which you can deduce that the route to your desired data is through
$data->{Section}{Item}[0]{Url}{content}
You should also look into using something like XML::XPath, which makes it much easier to conduct this kind of search.

Related

Reuse LWP Useragent object with HTTP POST query in a for/while loop

I am using LWP Useragent to make multiple POST calls with basic Authorization, wherein POST URL parameters are read from a CSV file. Here is my code:
use strict;
use warnings;
use LWP::UserAgent;
use JSON 'from_json';
use MIME::Base64 'encode_base64';
use Data::Dumper;
my #assets;
my %data;
my $response;
my $csvfile = 'ScrappedData_Coins.csv';
my $dir = "CurrencyImages";
open (my $csv, '<', "$dir/$csvfile") || die "cant open";
foreach (<$csv>) {
chomp;
my #currencyfields = split(/\,/);
push(#assets, \#currencyfields);
}
close $csv;
my $url = 'https://example.org/objects?';
my %options = (
"username" => 'API KEY',
"password" => '' ); # Password field is left blank
my $ua = LWP::UserAgent->new(keep_alive=>1);
$ua->agent("MyApp/0.1");
$ua->default_header(
Authorization => 'Basic '. encode_base64( $options{username} . ':' . $options{password} )
);
my $count =0;
foreach my $row (#cryptoassets) {
$response = $ua->post(
$url,
Content_Type => 'multipart/form-data',
Content => {
'name'=>${$row}[1],
'lang' => 'en',
'description' => ${$row}[6],
'parents[0][Objects][id]' => 42100,
'Objects[imageFiles][0]' =>[${$row}[4]],
}
);
if ( $response->is_success ) {
my $json = eval { from_json( $response->decoded_content ) };
print Dumper $json;
}
else {
$response->status_line;
print $response;
}
}
sleep(2);
}
Basically, I want to reuse the LWP object. For this, I am creating the LWP object, its headers, and response objects once with the option of keep_alive true, so that connection is kept open between server and client. However, the response from the server is not what I want to achieve. One parameter value ('parents[0][Objects][id]' => 42100) seems to not get passed to the server in HTTP POST calls. In fact, its behavior is random, sometimes the parentID object value is passed, and sometimes not, while all other param values are passing correctly. Is this a problem due to the reusing of the LWP agent object or is there some other problem? Because when I make a single HTTP POST call, all the param values are passed correctly, which is not the case when doing it in a loop. I want to make 50+ POST calls.
Reusing the user-agent object would not be my first suspicion.
Mojo::UserAgent returns a complete transaction object when you make a request. It's easy for me to inspect the request even after I've sent it. It's one of the huge benefits that always annoyed my about LWP. You can do it, but you have to break down the work to form the request first.
In this case, create the query hash first, then look at it before you send it off. Does it have everything that you expect?
Then, look at the request. Does the request match the hash you just gave it?
Also, when does it go wrong? Is the first request okay but the second fails, or several are okay then one fails?
Instead of testing against your live system, you might try httpbin.org. You can send it requests in various ways
use Mojo::UserAgent;
use Mojo::Util qw(dumper);
my $hash = { ... };
say dumper( $hash );
my $ua = Mojo::UserAgent->new;
$ua->on( prepare => sub { ... } ); # add default headers, etc
my $tx = $ua->post( $url, form => $hash );
say "Request: " . $tx->req->to_string;
I found the solution myself. I was passing form parameter data (key/value pairs) using hashref to POST method. I changed it to arrayref and the problem was solved. I read how to pass data to POST method on CPAN page. Thus, reusing LWP object is not an issue as pointed out by #brian d foy.
CPAN HTTP LWP::UserAgent API
CPAN HTTP Request Common API

Perl JIRA POST error "headers must be presented as a hashref"

I am writing a Perl script to POST an attachment to JIRA using
REST::Client to access the API
but I am getting an error.
use REST::Client;
use warnings;
use strict;
use File::Slurp;
use MIME::Base64;
my $user = 'user';
my $pass = 'pass';
my $url = "http://******/rest/api/2/issue/BugID/attachments";
my $client = REST::Client->new();
$client->addHeader( 'Authorization', 'Basic' . encode_base64( $user . ':' . $pass ) );
$client->addHeader( 'X-Atlassian-Token', 'no-check' );
$client->setHost( $url );
# my %header = ('Authorization' => 'Basic'. encode_base64($user . ':' . $pass),'X-Atlassian-Token' => 'no-check');
my $attachment = "C:\\Folder\\Test.txt";
$client->POST(
$url,
'Content_Type' => 'form-data',
'Content' => [ 'file' => [$attachment] ]
);
if ( $client->responseCode() eq '200' ) {
print "Updated\n";
}
# print the result
print $client->responseContent() . "\n";
The error I get is
REST::Client exception: headers must be presented as a hashref at C:\Users\a\filename.pl line 24.
As shown in the code, I have tried setting headers in different ways but I still get same error.
Please suggest if there is any other method.
I have tried using JIRA module but it gives error too.
According to the documentation, the POST method:
Takes an optional body content and hashref of custom request headers.
You need to put your headers in a hashref, e.g.:
$client->POST($url, $content, {
foo => 'bar',
baz => 'qux'
});
But...it looks like you're expecting REST::Client to use HTTP::Request::Common to construct a multipart/form-data request. Unfortunately, that's not the case, so you'll have to build the content by hand.
You could use HTTP::Request::Common directly like this:
use strict;
use warnings 'all';
use 5.010;
use HTTP::Request::Common;
use REST::Client;
my $client = REST::Client->new;
my $url = 'http://www.example.com';
my $req = POST($url,
Content_Type => 'form-data',
Content => [ file => [ 'foo.txt' ] ]
);
$client->POST($url, $req->content(), {
$req->headers->flatten()
});
But this is a bit convoluted; I would recommend dropping REST::Client and using LWP::UserAgent instead. REST::Client is just a thin wrapper for LWP::UserAgent with a few convenience features, like prepending a default host to all requests. In this case, it's just getting in the way and I don't think the conveniences are worth the trouble.
From the documentation:
POST ( $url, [$body_content, %$headers] )
And you're doing:
$client->POST(
$url,
'Content_Type' => 'form-data',
'Content' => [ 'file' => [$attachment] ]
);
So - passing a list of scalars, with an arrayref at the end.
Perhaps you want something like:
$client->POST(
$url,
$attachment,
{ 'Content-Type' => 'form-data' }
);
Note the {} to construct an anonymous hash for the headers.
Although you probably want to open and include the 'attachment', because there's nothing in REST::Client about opening files and sending them automagically.

Using perl REST::Application to parse url query string

I'm using perl's REST::Application to generate html dynamically on my server with cgi script that looks somewhat like:
my $url_map = {
qr|^/?$| => {
GET => \&get_help,
},
qr|^/builders/?$| => {
GET => \&get_builders,
},
};
I would like to add support for query strings. For example, if the url would be www.example.com/query?param1=a&param2=b I would like to be able to call a procedure with those values as parameters. However, If only one of the parameters is given, it should be supported as well.
Possible solution could be using HTML::Entities and CGI:
Add relevant modules:
use CGI;
use HTML::Entities;
Add prpoer line to $url_map:
qr|^/query(.*)| => {
GET => \&get_query,
}
And implement:
sub get_uploads {
$rest->header(-type => "text/plain, charset=utf8");
my $cgi = CGI->new;
my %params = map { $_ => HTML::Entities::encode(join("; ", split("\0", $cgi->Vars->{$_}))) } $cgi->param;
.
.
.
}
At this stage %params will hold the query string.
For example, if I used the url in the question, %params would be:
$VAR1 = {
'param1' => 'a',
'param2' => 'b'
};

How do I work with just one key and value from Data::Dumper output

I have data dumper outputting a remotely hosted xml file into a local text file and I am getting the following info:
$VAR1 = {
'resource' => {
'005cd410-41d6-4e3a-a55f-c38732b73a24.xml' => {
'standard' => 'DITA',
'area' => 'holding',
'id' => 'Comp_UKCLRONLINE_UKCLR_2000UKCLR0278',
},
'003c2a5e-4af3-4e70-bf8b-382d0b4edda1.xml' => {
'standard' => 'DITA',
'area' => 'holding',
'id' => 'Comp_UKCLRONLINE_UKCLR_2000UKCLR0278',
},
etc. What I want to do is work with just one/key and value in each resource. Ie pick out the ID and then create a url from that.
I would normally use a regex on the file and pull the info I need from that but I'm thinking there must be an easier/proper way but can't think of the right term to use in a search and am therefore not finding it.
Here is the code I am using to write this output to a file:
#-----------------------------------------------
sub request_url {
#-----------------------------------------------
my $useragent = LWP::UserAgent->new;
my $request = HTTP::Request->new( GET => "http://digitalessence.net/resource.xml" );
$resource = $useragent->request( $request );
}
#-----------------------------------------------
sub file_write {
#-----------------------------------------------
open OUT, ">$OUT" or Log_message ("\n$DATE - $TIME - Could not create filelist.doc \t");
Log_message ("\n$DATE - $TIME - Opened the output file");
print OUT Dumper (XML::Simple->new()->XMLin( $resource->content ));
Log_message ("\n$DATE - $TIME - Written the output file");
}
thanks
I'm not really understanding your question, but I'm guessing you want to access some data from the hash.
You don't need a regex or other strage stuff; just `do` your data and get the value from the hassref you get back:
A simple one liner as an example (assuming your file is called `dumper.out`):
perl -Mstrict -wE 'my $hashref = do{ do "dumper.out" }; say $hashref->{resource}{"005cd410-41d6-4e3a-a55f-c38732b73a24.xml"}{id}'
HTH, Paul
Maybe you want to walk the data structure built by XML::Simple.
Each resource is inside an ARRAYREF you get using the resource key with $doc data structure.
use XML::Simple;
use LWP;
use Data::Dumper;
my $ua = LWP::UserAgent->new;
my $req = HTTP::Request->new( GET => "http://digitalessence.net/resource.xml" );
my $res = $ua->request( $req );
my $xs = XML::Simple->new();
my $doc = $xs->XMLin( $res->content );
printf "resources: %s\n", scalar keys %{ $doc->{ resource } };
foreach ( keys %{ $doc->{ resource } } ) {
printf "resource => %s, id => %s\n", $_, $doc->{ resource }->{ $_ }->{ id };
}
The output is this:
resources: 7
resource => 005cd410-41d6-4e3a-a55f-c38732b73a24.xml, id => Comp_UKCLRONLINE_UKCLR_2000UKCLR0278
resource => 003c2a5e-4af3-4e70-bf8b-382d0b4edda1.xml, id => Comp_UKCLRONLINE_UKCLR_2002UKCLR0059
resource => 0033d4d3-c397-471f-8cf5-16fb588b0951.xml, id => Comp_UKCLRONLINE_UKCLR_navParentTopic_67
resource => 002a770a-db47-41ef-a8bb-0c8aa45a8de5.xml, id => Comp_UKCLRONLINE_UKCLR_navParentTopic_308
resource => 000fff79-45b8-4ac3-8a57-def971790f16.xml, id => Comp_UKCLRONLINE_UKCLR_2002UKCLR0502
resource => 00493372-c090-4734-9a50-8f5a06489591.xml, id => Comp_UKCLRONLINE_COMPCS_2010_10_0002
resource => 004377bf-8e24-4a69-9411-7c6baca80b87.xml, id => Comp_CLJONLINE_CLJ_2002_01_11

How can I access Closure JavaScript minifier using Perl's LWP::UserAgent?

I'm trying to get Code Closure to work, but unfortunately, there's always an error thrown.
Here's the code:
use LWP::UserAgent;
use HTTP::Request::Common;
use HTTP::Response;
my $name = 'test.js';
my $agent = new LWP::UserAgent();
$agent->agent("curl/7.21.0 (x86_64-pc-linux-gnu) libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.18");
$res = $agent->request(POST 'http://closure-compiler.appspot.com/compile',
content_type => 'multipart/form-data',
content => [
output_info => 'compiled_code',
compilation_level => 'SIMPLE_OPTIMIZATIONS',
output_format => 'text',
js_code => [File::Spec->rel2abs($name)]
]);
if ($res->is_success) {
$minified = $res->decoded_content;
print $minified;die;
}
I get the following error:
Error(13): No output information to produce, yet compilation was requested.
Here's the api reference I used:
http://code.google.com/intl/de-DE/closure/compiler/docs/api-ref.html
Hope anyone knows what's going wrong here. Thanks.
#!/usr/bin/perl
use strict; use warnings;
use File::Slurp;
use LWP::UserAgent;
my $agent = LWP::UserAgent->new;
my $script = 'test.js';
my $response = $agent->post(
'http://closure-compiler.appspot.com/compile',
content_type => 'application/x-www-form-urlencoded',
content => [
compilation_level => 'SIMPLE_OPTIMIZATIONS',
output_info => 'compiled_code',
output_format => 'text',
js_code => scalar read_file($script),
],
);
if ($response->is_success) {
my $minified = $response->decoded_content;
print $minified;
}
Output:
C:\Temp> cat test.js
// ADD YOUR CODE HERE
function hello(name) {
alert('Hello, ' + name);
}
hello('New user');
C:\Temp> t
function hello(a){alert("Hello, "+a)}hello("New user");
Pass as js_code the actual code to compile. Try (removing the form-data content_type header):
use File::Slurp "read_file";
...
js_code => scalar( read_file($name) ),
I see you are trying to use POST's file upload feature; what in the API documentation do you see that makes you think that would work? If there is something there, I don't see it.