Perl - LWP::UserAgent; slow response - more then 4096 characters - perl

I'm trying to retrieve interface statistics data via HTTPS GET from a remote network device. The script is working, however when a response is larger than 4098 characters the connection hangs for 60 seconds. Please help ! I'm slowly going crazy.
my $interface_getAll = HTTP::Request->new( GET => $URL );
$interface_getAll->content( 'session_id=' . $sessionID . '&format=json&method=network.interface.fetchAllStatistics' );
my $res = $ua->request($interface_getAll);
$interface_getAll_rData = $res->content;
my $InterfaceValues = decode_json($interface_getAll_rData);
print Dumper($InterfaceValues);
I can't replicate this issue when using curl or wget, I get a response instantly. The API on the remote device responds to GET requests only. If you think that my code is invalid I'm open to any suggestions.
request URL example/format:
https://192.168.99.51:443/services/rest/V2.1/session_id=b121c16aa0c0361e9bebe5bd67e60a&format=json&method=network.interface.fetchStatistics

You say you need to fetch url
https://192.168.99.51:443/services/rest/V2.1/session_id=b121c16aa0c0361e9bebe5bd67e60a&format=json&method=network.interface.fetchStatistics
but your code is fetching url
https://192.168.99.51:443/services/rest/V2.1/
The following is how you fetch the correct url:
use URI;
my $url = URI->new("https://192.168.99.51:443/services/rest/V2.1/");
$url->query_form(
session_id => $sessionID,
format => 'json',
method => 'network.interface.fetchAllStatistics',
);
my $response = $ua->get($url);
Using $url->query_form ensures your values are properly encoded.

The API on the remote device responds to GET requests only.
Using a message body with a GET request has no defined behavior. In fact RFC 7231 clearly says
A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request.
Thus if your API really requires a GET with a message body you should consider the API broken.
But given the URL you show I would rather suggest that the API does not require a message body but that you simply use LWP in the wrong way.
According to your example URL the session_id=... is not inside the message body but actually part of the URL and should be added to this, i.e.
$URL = "https://192.168.99.51:443/services/rest/V2.1/session_id=...."
And while you could just append it directly to the URL a more clean and readable way would be to construct your URL with URI as suggested by ikegami#:
use URI;
my $url = URI->new("https://192.168.99.51:443/services/rest/V2.1/");
$url->query_form(
session_id => $sessionID,
format => 'json',
method => 'network.interface.fetchAllStatistics',
);
my $interface_getAll = HTTP::Request->new( GET => $url );
# no setting content() here

Related

Send HTTP streaming request in perl

I want to send xml request using HTTP streaming protocol . where transfer-encoding is "chunked". Currently i am using LWP::UserAgent to send the xml transaction.
my $userAgent = LWP::UserAgent->new;
my $starttime = time();
my $response = $userAgent->request(POST $url,
Content_Type => 'application/xml',
Transfer_Encoding => 'Chunked',
Content => $xml);
print "Response".Dumper($response);
But i am getting http status code 411 Length Required. Which means "client error response code indicates that the server refuses to accept the request without a defined "
How we can handle this while sending a request in chunked ?
LWP::UserAgent's API isn't designed to send a stream, but it is able to do so with minimal hacking.
use strict;
use warnings qw( all );
use HTTP::Request::Common qw( POST );
use LWP::UserAgent qw( );
my $ua = LWP::UserAgent->new();
# Don't provide any content.
my $request = POST('http://stackoverflow.org/junk',
Content_Type => 'application/xml',
);
# POST() insists on adding a Content-Length header.
# We need to remove it to get a chunked request.
$request->headers->remove_header('Content-Length');
# Here's where we provide the stream generator.
my $buffer = 'abc\n';
$request->content(sub {
return undef if !length($buffer); # Return undef when done.
return substr($buffer, 0, length($buffer), ''); # Return a chunk of data otherwise.
});
my $response = $ua->request($request);
print($response->status_line);
Using a proxy (Fiddler), we can see this does indeed send a chunked request:
There's no point in using a chunked request if you already have the entire document handy like in the example you gave. Instead, let's say wanted to upload the output of some external tool as it produced its output. To do that, you could use the following:
open(my $pipe, '-|:raw', 'some_tool');
$request->content(sub {
my $rv = sysread($pipe, my $buf, 64*1024);
die $! if !defined($rv);
return undef if !$rv;
return $buf;
});
But i am getting http status code 411 Length Required.
Not all servers understand a request with a chunked payload even though this is standardized in HTTP/1.1 (but not in HTTP/1.0). For example nginx only supports chunking within a request since version 1.3.9 (2012), see Is there a way to avoid nginx 411 Content-Length required errors?. If the server does not understand a request with chunked encoding there is nothing you can do from the client side, i.e. you simply cannot use chunked transfer encoding then. If you have control over the server make sure that the server actually supports it.
I've also never experienced browsers send such requests, probably since they cannot guarantee that the server will support such request. I've only seen mobile apps used where the server and app is managed by the same party and thus support for chunked requests can be guaranteed.

Fetch text/event-stream web response using WWW::Mechanize or LWP::UserAgent

I'm using WWW::Mechanize to fetch a web page that includes a Google Maps widget that receives constant data from a single response of type text/event-stream.
That kind of response is like a never ending response from the server that constantly returns updated data for the widget to work.
I'm trying to find out how to read that exact response from Perl. Using something like:
my $mech = WWW::Mechanize->new;
# Do some normal GET and POST requests to authenticate and set cookies for the session
# Now try to get that text/event-stream response
$mech->get('https://some.domain.com/event_stream_page');
But that doesn't work because the response never ends.
How can I make that request and start reading the response and do something with that data every time the server updates the stream?
Found a way to do this. Using a handler from LWP, from which WWW::Mechanize inherits:
$mech->add_handler (
'response_data',
sub {
my ($response, $ua, $h, $data) = #_;
# Your chunk of response is now in $data, do what you need
# If you plan on reading an infinite stream, it's a good idea to clean the response so it doesn't grow infinitely too!
$response->content(undef);
# Important to return a true value if you want to keep reading the response!
return 1;
},
);

Guzzle not sending PSR-7 POST body correctly

It is either not being sent, or not being received correctly. Using curl direct from the command line (using the -d option) or from PHP (using CURLOPT_POSTFIELDS) does work.
I start with a PSR-7 request:
$request = GuzzleHttp\Psr7\Request('POST', $url);
I add authentication header, which authenticates against the API correctly:
$request = $request->withHeader('Authorization', 'Bearer ' . $accessToken);
Then I add the request body:
// The parameter for the API function
$body = \GuzzleHttp\Psr7\stream_for('args=dot');
$request = $request->withBody($body);
I can send the message to the API:
$client = new \GuzzleHttp\Client();
$response = $client->send($request, ['timeout' => 2]);
The response I get back indicates that the "args" parameter was simply not seen by the API. I have tried moving the authentication token to the args:
'args=dot&access_token=123456789'
This should work, and does work with curl from the command line (-d access_token=123456789) but the API fails to see that parameter also when sending cia curl (6.x) as above.
I can see the message does contain the body:
var_dump((string)$request->getBody());
// string(8) "args=dot"
// The "=" is NOT URL-encoded in any way.
So what could be going wrong here? Are the parameters not being sent, or are they being sent in the wrong format (maybe '=' is being encoded?), or is perhaps the wrong content-type being used? It is difficult to see what is being sent "on the wire" when using Guzzle, since the HTTP message is formatted and sent many layer deep.
Edit: Calling up a local test script instead of the remote API, I get this raw message detail:
POST
CONNECTION: close
CONTENT-LENGTH: 62
HOST: acadweb.co.uk
USER-AGENT: GuzzleHttp/6.1.1 curl/7.19.7 PHP/5.5.9
args=dot&access_token=5e09d638965288937dfa0ca36366c9f8a44d4f3e
So it looks like the body is being sent, so I guess something else is missing to tell the remote API how to interpret that body.
Edit: the command-line curl that does work, sent to the same test script, gives me two additional header fields in the request:
CONTENT-TYPE: application/x-www-form-urlencoded
ACCEPT: */*
I'm going to guess it is the content-type header which is missing from the Guzzle request which is the source of the problem. So is this a Guzzle bug? Should it not always sent a Content-Type, based on the assumptions it makes that are listed in the documentation?
The Content-Type header was the issue. Normally, Guzzle will hold your hand and insert headers it deems necessary, and makes a good guess at the Content-Type based on what you have given it, and how you have given it.
With Guzzle's PSR-7 messages, none of that hand-holding is done. It strictly leaves all the headers for you to handle. So when adding POST parameters to a PSR-7 Request, you must explicitly set the Content-Type:
$params = ['Foo' => 'Bar'];
$body = \GuzzleHttp\Psr7\stream_for(http_build_query($params));
$request = $request->withBody($body);
$request = $request->withHeader('Content-Type', 'application/x-www-form-urlencoded');
The ability to pass in the params as an array and to leave Guzzle to work out the rest, does not apply to Guzzle's PSR-7 implementation. It's a bit clumsy, as you need to serialise the POST parameters into a HTTP query string, and then stick that into a stream, but there you have it. There may be an easier way to handle this (e.g. a wrapper class I'm not aware of), and I'll wait and see if any come up before accepting this answer.
Be aware also that if constructing a multipart/form-data Request message, you need to add the boundary string to the Content-Type:
$request = $request->withHeader('Content-Type', 'multipart/form-data; boundary=' . $boundary);
Where $boundary can be something like uniq() and is used in construction the multipart body.
The GuzzleHttp\Client provides all necessary wrapping.
$response = $client->post(
$uri,
[
'auth' => [null, 'Bearer ' . $token],
'form_params' => $parameters,
]);
Documentation available Guzzle Request Options
Edit: However, if your requests are being used within GuzzleHttp\Pool then, you can simply everything into the following:
$request = new GuzzleHttp\Psr7\Request(
'POST',
$uri,
[
'Authorization' => 'Bearer ' . $token,
'Content-Type' => 'application/x-www-form-urlencoded'
],
http_build_query($form_params, null, '&')
);

LWP Send Post request and get headers only in response

I have code like this
my $ua = new LWP::UserAgent;
$ua->timeout($timeout);
$ua->agent($useragent);
$response = $ua->post($domain,['login_name'=>$login,'login_password'=> $password])->as_string;
Content of page so large, thatI can't receive it. How to get only headers with sending post data?
I think this should do it for you.
my $ua = LWP::UserAgent->new();
$ua->timeout($timeout);
$ua->agent($useragent);
my $response = $ua->post(
$domain,
[ 'login_name' => $login, 'login_password' => $password ]
);
use Data::Dumper;
print Dumper( $response->headers() );
print $response->request()->content(), "\n";
To first, check if you can pass this login_name and login_password via HEAD (in url string: domain/?login_name=...&login_password=...). If this will not work, you are in bad case.
You cannot use POST with behavior of HEAD. LWP will wait full response.
Using POST the server will send you the content anyway, but you can avoid receiving all content using sockets tcp by yourself: gethostbyname, connect, sysread until you get /\r?\n\r?\n/ and close socket after this. Some traffic will be utilized anyway, but you can save memory and receive time.
Its not normal thing to do this with sockets, but sometimes when you have highload/big data - there is no better way than such mess.

How to detect a changed webpage?

In my application, I fetch webpages periodically using LWP. Is there anyway to check whether between two consecutive fetches the webpage has got changed in some respect (other than explicitly doing a comparison) ? Is there any signature(say CRC) that is being generated at lower protocol layers which can be extracted and compared against older signatures to see possible changes ?
There are two possible approaches. One is to use a digest of the page, e.g.
use strict;
use warnings;
use Digest::MD5 'md5_hex';
use LWP::UserAgent;
# fetch the page, etc.
my $digest = md5_hex $response->decoded_content;
if ( $digest ne $saved_digest ) {
# the page has changed.
}
Another option is to use an HTTP ETag, if the server provides one for the resource requested. You can simply store it and then set your request headers to include an If-None-Match field on subsequent requests. If the server ETag has remained the same, you'll get a 304 Not Modified status and an empty response body. Otherwise you'll get the new page. (And new ETag.) See Entity Tags in RFC2616.
Of course, the server could be lying, and sending the same ETag even though the content has changed. There's no way to know unless you look.
You should use the If-Modified-Since request header, noting the gotchas in the RFC. You send this header with the request. If the server supports it and thinks the content is newer, it sends it to you. If it thinks you have the most recent version, it returns a 304 with no message body.
However, as other answers have noted, the server doesn't have to tell you the truth, so you're sometimes stuck downloading the content and checking yourself. Many dynamic things will always claim to have new content because many developers have never thought about supporting basic HTTP things in their web apps.
For the LWP bits, you can create a single request with an extra header:
use HTTP::Request;
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
my $request = HTTP::Request->new( GET => $url );
$r->header( 'If-Modified-Since' => $time );
$ua->request( $request );
For all requests, you can set a request handler:
$ua->add_handler(
request_send => sub {
my($request, $ua, $h) = #_;
# ... look up time from local store
$r->header( 'If-Modified-Since' => $time );
}
);
However, LWP can do most of this for you with mirror if you want to save the files:
$ua->mirror( $url, $filename )