LWP getstore usage - perl

I'm pretty new to Perl. While I just created a simple scripts to retrieve a file with
getstore($url, $file);
But how do I know whether the task is done correctly or the connection interrupted in the middle, or authentication failed, or whatever response. I searched all the web and I found some, like a response list, and some talking about useragent stuff, which I totally can't understand, especially the operator $ua->.
What I wish is to an explanation about that operator stuff (I don't even know what -> used for), and the RC code meaning, and finally, how to use it.
Its a lot of stuff so I appreciate any answer given, even just partially. And, thanks first for whoever will to help. =)

The LWP::Simple module is just that: quite simplistic. The documentation states that the getstore function returns the HTTP status code which we can save into a variable. There are also the is_success and is_error functions that tell us whether a certain return value is ok or not.
my $url = "http://www.example.com/";
my $filename = "some-file.html";
my $rc = getstore($url, $filename)
if (is_error($rc)) {
die "getstore of <$url> failed with $rc";
}
Of course, this doesn't catch errors with the file system.
The die throws a fatal exception that terminates the execution of your script and displays itself on the terminal. If you don't want to abort execution use warn.
The LWP::Simple functions provide high-level controls for common tasks. If you need more control over the requests, you have to manually create an LWP::UserAgent. An user agent (abbreviated ua) is a browser-like object that can make requests to servers. We have very detailed control over these requests, and can even modify the exact header fields.
The -> operator is a general dereference operator, which you'll use a lot when you need complex data structures. It is also used for method calls in object-oriented programming:
$object->method(#args);
would call the method on $object with the #args. We can also call methods on class names. To create a new object, usually the new method is used on the class name:
my $object = The::Class->new();
Methods are just like functions, except that you leave it to the class of the object to figure out which function exactly will be called.
The normal workflow with LWP::UserAgent looks like this:
use LWP::UserAgent; # load the class
my $ua = LWP::UserAgent->new();
We can also provide named arguments to the new method. Because these UA objects are robots, it is considered good manners to tell everybody who sent this Bot. We can do so with the from field:
my $ua = LWP::UserAgent->new(
from => 'ss-tangerine#example.com',
);
We could also change the timeout from the default three minutes. These options can also be set after we constructed a new $ua, so we can do
$ua->timeout(30); # half a minute
The $ua has methods for all the HTTP requests like get and post. To duplicate the behaviour of getstore, we first have to get the URL we are interested in:
my $url = "http://www.example.com/";
my $response = $ua->get($url);
The $response is an object too, and we can ask it whether it is_success:
$response->is_success or die $response->status_line;
So if execution flows past this statement, everything went fine. We can now access the content of the request. NB: use the decoded_content method, as this manages transfer encodings for us:
my $content = $response->decoded_content;
We can now print that to a file:
use autodie; # automatic error handling
open my $fh, ">", "some-file.html";
print {$fh} $content;
(when handling binary files on Windows: binmode $fh after opening the file, or use the ">:raw" open mode)
Done!
To learn about LWP::UserAgent, read the documentation. To learn about objects, read perlootut. You can also visit the perl tag on SO for some book suggestions.

Related

Why does Mechanize onerror callback set content to error message?

use WWW::Mechanize;
$mech = new WWW::Mechanize( onerror => \&mecherror );
$mech->get("http://stackoverflow.comxxxx");
print $mech->content;
sub mecherror {
$mech->get("http://stackoverflow.com");
}
The output on line 4 is an error string relating to the first failed get and not the content of the get executed in sub mecherror. Why?
The onerror callback of WWW::Mechanize is meant to supply a response to error
onerror => \&func
Reference to a die-compatible function, such as Carp::croak, that is called when there's a fatal error.
It is clearly not intended for recovery or any use of the object that is being constructed.
That said, your call in onerror works, but the object doesn't get to know about it.
use warnings;
use strict;
use feature 'say';
use WWW::Mechanize;
my $mech = new WWW::Mechanize( onerror => \&mecherror );
$mech->get("http://stackoverflow.comxxxx");
say $mech->content;
sub mecherror {
my $response = $mech->get("http://stackoverflow.com");
# say $mech->content;
say "response is " . ref($response);
say $response->decoded_content;
}
This shows that we duly got an HTTP::Response object, and prints out the page. Then we may hope to pass a reference to the callback to connect it to the calling code. However, a mechanism for this is not provided -- this is not supported. We are warned against messing with internals, though.
As for why the object isn't updated, it depends on the callback implementation. From the source code we see that the code reference goes into object's data and is run when needed via the wrapper
sub die {
my $self = shift;
return unless my $handler = $self->{onerror};
return $handler->(#_);
}
A lot of other code is involved when this triggers, while nothing is done here to change the object's state. That is just unsupported and may result in undefined behavior.
Note that the callback here knows what $mech is because it is global, so it has the right object.
To summarize discussions in comments, it is plausible that the page retrieved by the callback gets overwritten by the error message. We get to see this when invoking content, and it appears to be due to this part of the method (see source)
$content = $self->response()->decoded_content(charset => 'none');
The decoded_content method is from HTTP::Response, inherited via LWP::UserAgent, and the error message indeed seems to have come from that class. (Neither W::M nor LWP::UA have a method named "decode_content.") This is summarized in W::M::content page
$mech->content(...)
Returns the content that the mech uses internally for the last page fetched. Ordinarily this is the same as $mech->response()->decoded_content(), [ ... ]
However, we anyway couldn't rely on the object being in a consistent state, as discussed.

Asynchronous Chat Server using Mojolicious

Hello Ladies and Gentleman! I am currently writing a minimalistic chat server that will somewhat resemble IRC. I am writing it in perl using the Mojolicious, but unfortunately have run into an issue. I have the following code:
#!/usr/bin/perl
use warnings;
use strict;
use Mojo::IOLoop::Server;
my $server = Mojo::IOLoop::Server->new;
$server->on(accept => sub {
my ($server, $handle) = #_;
my $data;
print $handle "Connected!\n";
while(1) {
$handle->recv($data, 4096);
if($data) {
print $server "$data";
}
}
});
$server->listen(port => $ARGV[0]);
$server->start;
$server->reactor->start unless $server->reactor->is_running;
Unfortunately, the print $server "$data"; line does not actually work. It gives off the error:
Mojo::Reactor::Poll: I/O watcher failed: Not a GLOB reference at ./server.pl line 20.
I have looked through the documentation for Mojolicious, but cannot find how to send the line I get from client A to the rest of the clients connected.
While $handle is something like a stream you can write on, $server is a Mojo::IOloop::Server object, so it's not a surprise you can't write on it like you're trying to do.
Even if I use Mojolicious quite often, I'm not familiar with every possibilities (there are a lot), but here what I would suggest : you need to store a list of all connected clients (in a hash or an array for instance), and when you receive a message, you iterate through that client list, to send the message to all of them.
You also need a way (not hard to do) to delete clients from your clients list when they disconnect.
Also I'm not quite sure about your infinite loop : I wouldn't be surprised if it was blocking the server on the 1st connected client.
It's better to use Mojolicious functions to do so :
$serv->on(message => sub { send the message to all clients });
And that function would be called every time a message is received.
Here is a good example, using Mojolicious::Light, pretty easy to understand I think : https://github.com/kraih/mojo/wiki/Writing-websocket-chat-using-Mojolicious-Lite

500 Internal Server Error in perl-cgi program

I am getting error as "Internal Server Error.The server encountered an internal error or misconfiguration and was unable to complete your request."
I am submitting a form in html and get its values.
HTML Code (index.cgi)
#!c:/perl/bin/perl.exe
print "Content-type: text/html; charset=iso-8859-1\n\n";
print "<html>";
print "<body>";
print "<form name = 'login' method = 'get' action = '/cgi-bin/login.pl'> <input type = 'text' name = 'uid'><br /><input type = 'text' name = 'pass'><br /><input type = 'submit'>";
print "</body>";
print "</html>";
Perl Code to fetch data (login.pl)
#!c:/perl/bin/perl.exe
use CGI::Carp qw(fatalsToBrowser);
my(%frmfields);
getdata(\%frmfields);
sub getdata {
my ($buffer) = "";
if (($ENV{'REQUEST_METHOD'} eq 'GET')) {
my (%hashref) = shift;
$buffer = $ENV{'QUERY_STRING'};
foreach (split(/&/,$buffer)) {
my ($key, $value) = split(/=/, $_);
$key = decodeURL($key);
$value= decodeURL($value);
$hashref{$key} = $value;
}
}
else{
read(STDIN,$buffer,$ENV{'CONTENT_LENGTH'})
}
}
sub decodeURL{
$_=shift;
tr/+/ /;
s/%(..)/pack('c', hex($1))/eg;
return($_);
}
The HTML page opens correctly but when i submit the form, i get internal server error.
Please help.
What does the web server's error log say?
Independent of what it says, you must stop parsing the form data yourself. There are modules for that, specifically CGI.pm. Using that, you can do this instead:
use CGI;
my $CGI = CGI->new();
my $uid = $CGI->param( 'uid' );
my $pass = $CGI->param( 'pass' );
# rest of your script
Much cleaner and much safer.
I agree with Tore that you must not parse this yourself. Your code has multiple errors. You don't allow multiple parameter values, you don't allow the ; alternate separator, you don't handle POST with a query string in the URL, and so on.
I don't know how long it will be online for free, but chapter 15 of my new "Beginning Perl" book covers Web programming. That should get you started on some decent basics. Note that the online version is an early, rough draft. The actual book also includes Chapter 19 which has a complete Web app example.
could it be this line that's the problem?
my (%hashref) = shift;
You're initialising a proper hash, but shift will give you a hash reference, since you did getdata(\%frmfields);. You probably want this, instead:
my $hashref = shift;
"500 Internal Server Error" just means that something didn't work the way the web server expected. Maybe you don't have CGI enabled. Maybe the script isn't executable. Maybe it's in a directory the web server isn't allowed to access. It's even possible that maybe the web server ran the script successfully and it worked perfectly, but didn't start its output with a valid set of HTTP headers. You need to look in the web server's error log to find out what it didn't like, which may or may not be a Perl issue.
Like everyone else has said, though, don't try to parse query strings and grovel though %ENV yourself. Use one of the many fine modules or frameworks which are available and already known to work correctly. CGI.pm is the granddaddy of them all and works well for smaller projects, but I'd recommend looking into a proper web application framework such as Dancer, Mojolicious, or Catalyst (there are many others, but those are the big three) if you're planning to build anything with more than a handful of relatively simple pages and forms.

Perl: Getting complete request from SOAP::WSDL object

I'm working with SOAP::WSDL and another company's custom WSDL file. Every time they make a change for me and I recreate my modules, something breaks. Finding the problem is rather tedious because I don't find a proper way to access the actual request that is sent to the SOAP server.
The only way to get to the request so far has been to use tcpdump in conjunction with wireshark to extract the request and result. That works, but since I don't have root privileges on the dev machine I have to get an admin over every time I want to do that. I feel there must be another way to get to the HTTP::Request object inside the SOAP::WSDL thing. But if the server returns a fault, I don't even have a response object, but rather a SOAP::WSDL::SOAP::Typelib::Fault11 object that has no visible relation to the request.
I've also tried using the debugger but I'm having trouble finding the actual request part. I've not yet understood how to tell the debuger to skip to a specific part deep inside a complex number of packages.
I stumbled across this, having the same problem myself. I found the answer is using both options that raina77ow listed.
$service->outputxml(1);
returns the whole SOAP envelope xml, but this needs to be combined with
$service->no_dispatch(1);
With no_dispatch set, the SOAP request is printed, instead of the reply from the request. Hopefully this can help others.
Have you tried to use SOAP::WSDL::Client tracing methods - and outputxml in particular? It returns the raw SOAP envelope which is to be sent to the server.
You can also use no_dispatch configuration method of SOAP::WSDL package:
When set, call() returns the plain request XML instead of dispatching
the SOAP call to the SOAP service. Handy for testing/debugging.
I found a way to at least print out the generated XML code.
First, I looked at SOAP::WSDL::Client as raina77ow suggested. That wasn't what I needed, though. But then I came across SOAP::WSDL::Factory::Serializer. There, it says:
Serializer objects may also be passed directly to SOAP::WSDL::Client
by using the set_serializer method.
A little fidgeting and I came up with a wrapper class for SOAP::WSDL::Serializer::XSD which is the default serializer used by SOAP::WSDL. A look at the code helped, too.
Here's the module I wrote. It uses SOAP::WSDL::Serializer::XSD as a base class and overloads the new and serialize methods. While it only passes arguments to new, it grabs the returned XML from serialize and prints it, which suffices for debugging. I'm not sure if there's a way to put it somewhere I can easily get it from.
package MySerializer;
use strict;
use warnings;
use base qw(SOAP::WSDL::Serializer::XSD);
sub new {
my $self = shift;
my $class = ref($self) || $self;
return $self if ref $self;
# Create the base object and return it
my $base_object = $class->SUPER::new(#_);
return bless ($base_object, $class);
}
sub serialize {
my ($self, $args_of_ref) = #_;
# This is basically a wrapper function that calls the real Serializer's
# serialize-method and grabs and prints the returned XML before it
# giving it back to the caller
my $xml = ref($self)->SUPER::serialize($args_of_ref);
print "\n\n$xml\n\n"; # here we go
return $xml;
}
1;
And here's how I call it:
my $serializer = MySerializer->new();
$self->{'_interface'} = Lib::Interfaces::MyInterface->new();
$self->{'_interface'}->set_serializer($serializer); # comment out to deactivate
It's easy to deactivate. Only put a comment in the set_serializer line.
Of course printing a block of XML to the command line is not very pretty, but it gets the job done. I only need it once in a while why coding/testing, so this is fine I guess.

Remote function call using SOAP::Lite

I'm trying to write a client application in Perl using SOAP::Lite. I am trying to call a specific function, but I cannot seem to get the parameters right. I keep getting a response back saying "Found more elements in the soap envelope than required by the WSDL", but no more information beyond that.
Is there any way in SOAP::Lite to directly find out the parameters needed for the remote procedure call?
Thank you.
I navigated by a combination of reading the WSDL and dumping out SOAP::Lite objects as I could manufacture them.
Below is the way that I was able to pick through the returns from SOAP::Lite. Keep in mind that I'm working around some of the bugs in SOAP::Lite by avoiding the SOAP::Schema::load call, and avoiding SL's dislike of more than one defined service in a WSDL, where it kindly croaks on you.
use strict;
use warnings;
use Data::Dumper qw<Dumper>;
use SOAP::Lite; # trace => 'all'; # <- trace can help
my $schema = SOAP::Schema->new( schema_url => $destination_URL )->parse();
my $services = $schema->services();
my $defintion;
foreach my $service ( values %$services ) {
$definition = $service->{$method_name};
}
print Dumper( $definition );
Most of variables that are not defined above are things that you would have to supply.