Get raw response headers from LWP? - perl

Is there a way to grab raw, unmodified response headers from an HTTP request made with LWP? This is for a diagnostic tool that needs to identify problems with possibly malformed headers.
The closest thing I've found is:
use LWP::UserAgent;
my $ua = new LWP::UserAgent;
my $response = $ua->get("http://somedomain.com");
print $response->headers()->as_string();
But this actually parses the headers, and then reconstructs a canonicalized, cleaned-up version of them from the parsed data. I really need the entire header text in exactly the form in which it was returned by the server, so anything malformed or non-standard will be clearly identifiable.
If it turns out there is no way to do this with LWP, is there perhaps some other Perl module that can do this?

Net::HTTP provides lower level access with less processing. Since it is a subclass of IO::Socket::INET you can read directly from the object after making the request.
use Net::HTTP;
# Make the request using Net::HTTP.
my $s = Net::HTTP->new(Host => "www.perl.com") || die $#;
$s->write_request(GET => "/", 'User-Agent' => "Mozilla/5.0");
# Read the raw headers.
my #headers;
while(my $line = <$s>) {
# Headers are done on a blank line.
last unless $line =~ /\S/;
push #headers, $line;
}
print #headers;

Based on an inspection of the HTTP::Response object (and the HTTP::Headers object it contains), the headers are discarded as they're parsed.
I would recommend you try WWW::Curl instead.
EDIT Snippet using WWW::Curl:
use WWW::Curl::Easy;
my ($header, $body);
my $curl = WWW::Curl::Easy->new;
$curl->setopt(CURLOPT_URL, $url_to_get); # get this URL
$curl->setopt(CURLOPT_WRITEHEADER, \$header); # save header text in this var
$curl->setopt(CURLOPT_WRITEDATA, \$body); # save body text in this var
my $code = $curl->perform;
if (0 == $code) {
# header text is in $header, body text in $body
} else {
print $curl->strerror($code).": ".$curl->errbuf."\n";
}

Related

Reuse LWP Useragent object with HTTP POST query in a for/while loop

I am using LWP Useragent to make multiple POST calls with basic Authorization, wherein POST URL parameters are read from a CSV file. Here is my code:
use strict;
use warnings;
use LWP::UserAgent;
use JSON 'from_json';
use MIME::Base64 'encode_base64';
use Data::Dumper;
my #assets;
my %data;
my $response;
my $csvfile = 'ScrappedData_Coins.csv';
my $dir = "CurrencyImages";
open (my $csv, '<', "$dir/$csvfile") || die "cant open";
foreach (<$csv>) {
chomp;
my #currencyfields = split(/\,/);
push(#assets, \#currencyfields);
}
close $csv;
my $url = 'https://example.org/objects?';
my %options = (
"username" => 'API KEY',
"password" => '' ); # Password field is left blank
my $ua = LWP::UserAgent->new(keep_alive=>1);
$ua->agent("MyApp/0.1");
$ua->default_header(
Authorization => 'Basic '. encode_base64( $options{username} . ':' . $options{password} )
);
my $count =0;
foreach my $row (#cryptoassets) {
$response = $ua->post(
$url,
Content_Type => 'multipart/form-data',
Content => {
'name'=>${$row}[1],
'lang' => 'en',
'description' => ${$row}[6],
'parents[0][Objects][id]' => 42100,
'Objects[imageFiles][0]' =>[${$row}[4]],
}
);
if ( $response->is_success ) {
my $json = eval { from_json( $response->decoded_content ) };
print Dumper $json;
}
else {
$response->status_line;
print $response;
}
}
sleep(2);
}
Basically, I want to reuse the LWP object. For this, I am creating the LWP object, its headers, and response objects once with the option of keep_alive true, so that connection is kept open between server and client. However, the response from the server is not what I want to achieve. One parameter value ('parents[0][Objects][id]' => 42100) seems to not get passed to the server in HTTP POST calls. In fact, its behavior is random, sometimes the parentID object value is passed, and sometimes not, while all other param values are passing correctly. Is this a problem due to the reusing of the LWP agent object or is there some other problem? Because when I make a single HTTP POST call, all the param values are passed correctly, which is not the case when doing it in a loop. I want to make 50+ POST calls.
Reusing the user-agent object would not be my first suspicion.
Mojo::UserAgent returns a complete transaction object when you make a request. It's easy for me to inspect the request even after I've sent it. It's one of the huge benefits that always annoyed my about LWP. You can do it, but you have to break down the work to form the request first.
In this case, create the query hash first, then look at it before you send it off. Does it have everything that you expect?
Then, look at the request. Does the request match the hash you just gave it?
Also, when does it go wrong? Is the first request okay but the second fails, or several are okay then one fails?
Instead of testing against your live system, you might try httpbin.org. You can send it requests in various ways
use Mojo::UserAgent;
use Mojo::Util qw(dumper);
my $hash = { ... };
say dumper( $hash );
my $ua = Mojo::UserAgent->new;
$ua->on( prepare => sub { ... } ); # add default headers, etc
my $tx = $ua->post( $url, form => $hash );
say "Request: " . $tx->req->to_string;
I found the solution myself. I was passing form parameter data (key/value pairs) using hashref to POST method. I changed it to arrayref and the problem was solved. I read how to pass data to POST method on CPAN page. Thus, reusing LWP object is not an issue as pointed out by #brian d foy.
CPAN HTTP LWP::UserAgent API
CPAN HTTP Request Common API

How to login in Perl using LWP::UserAgent and can check cookies value using HTTP::Cookies?

I'm trying to automate a test-case in perl using LWP::UserAgent. I need to check the cookie value post login to the application.
Have tried the following sample perl script, but while printing I'm getting output in HASH value :
my $ua = LWP::UserAgent->new;
$ua->agent("MyApp/0.1 ");
my $req = HTTP::Request->new(POST => 'http://' . $test_IP . '/login');
$req->content_type('application/x-www-form-urlencoded');
my $postdata = 'object={"login":{"username":"test","password":"test"}}';
$req->content($postdata);
print "REQUEST is";
print $req;
my $res = $ua->request($req);
print "RESPONSE is";
print $res;
# Check the outcome of the response
if ($res->is_success) {
print LOG $res->headers()->as_string();
}
else {
print $res->status_line, "\n";
}
my $cookie_jar = HTTP::Cookies->new();
print "cookie is";
$cookie_jar->extract_cookies( $res );
print $cookie_jar->extract_cookies( $res );
print $cookie_jar->as_string;
if( $cookie_jar->as_string =~ m/httponly/i )
{
print "Success";
}
else
{
print "FAILED";
}
Have received the output in HASH value :
REQUEST isHTTP::Request=HASH(0x69e85a0)RESPONSE isHTTP::Response=HASH(0x6aa1d98)cookie isHTTP::Response=HASH(0x6aa1d98)
Please suggest how can I login to the application and can check for the required values (here cookie value).
but while printing I'm getting output in HASH value
You're talking about these lines:
print "REQUEST is";
print $req;
my $res = $ua->request($req);
print "RESPONSE is";
print $res;
Both $req and $res are objects. If you print an object, you get the type of object that it is ("HTTP::Request" for $req and "HTTP::Response" for $res). You also see the reference to the object. And as both of these classes implement their objects as hash references, you get two hash references displayed.
I'm not sure what you expected to see, but there are two ways to get more useful information from an object.
You can call methods to access the attributes of the object. For example, later in your code, you use both $res->headers and $res->status_line. Both of these classes are subclasses of HTTP::Message, so they both inherit that class's as_string() method. So if you just want to see the string that gets sent as your HTTP request or the string that you get back as the HTTP response, you can call $req->as_string() or $res->as_string().
You can use a module like Data::Dumper to see all of the internals of your objects. You should really be treating objects as black boxes and just using their published interfaces (the methods as discussed above) but it can sometimes be useful to use a quick 'n' dirty statement like print Dumper $req.
As for creating a cookie jar, the synopsis for HTTP::Cookies is pretty clear. You create a cookie jar (almost) in the way that you do it and then associate that with your UserAgent object by calling the UA's cookie_jar() method. But you need to do that as soon as your UA is created - certainly before you use the UA to make any requests.

How to run cgi scripts in a simple perl web server

For an upcoming school project I need to implement a simple web server, It is from the book "Net Programming with Perl". I am trying to get my head around it all as it is all new to me. For now, all I want to do is have a cgi script run as the home page.
I need to get the code to run from the Web.pm script that comes with the webserver (chapter 15 of the book).
I can get it to serve the cgi file as the home page, but it just show the code. I have tried numerous things and the closest I got was the html that the script is supposed to generate was displayed in the command line window that was running the web server but the server output the message from the not_found subroutine.
Here is the beginning of the Web.pm code with the handle_connection and lookup_file subroutines (straight from the book) the subroutines that I have left our are:
invalid_request, redirect and not_found
package Web;
use strict;
use vars '#ISA','#EXPORT';
use IO::File;
use CGI;
require Exporter;
#ISA = 'Exporter';
#EXPORT = qw(handle_connection docroot);
# set to your home directory
my $DOCUMENT_ROOT = '.';
my $CRLF = "\015\012";
###############################
sub docroot {
$DOCUMENT_ROOT = shift if #_;
return $DOCUMENT_ROOT;
}
###############################
# Outline of Handle_Connection()
# Set the socket handle supplied as a parameter
# Set the standard end-of-line character for HTTP messages
# Read the contents from the socket handle into a request variable
# SECTION TO CHECK FOR ERRORS
# Check to make sure the main request line has the right string format. Call invalid_request() otherwise. Set $method to GET or HEAD, and $url to the supplied URL
# Call lookup_file() to find the specified $url in the file system. Call not_found() if lookup_file() fails
# If the type of 'file' return from lookup_file() is actually a directory, call redirect()
# Print the status line and the headers for the response to the socket handle (ie. to the client)
# If the HTTP method is “GET”, print the file requested in the URL to the socket handle (ie. to the client)
sub handle_connection {
my $c = shift; # socket
my ($fh,$type,$length,$url,$method);
local $/ = "$CRLF$CRLF"; # set end-of-line character
my $request = <$c>; # read the request header
print $request; # print request to the command line
# error checking
return invalid_request($c)
unless ($method,$url) = $request =~ m!^(GET|HEAD) (/.*) HTTP/1\.[01]!;
return not_found($c) unless ($fh,$type,$length) = lookup_file($url);
return redirect($c,"$url/") if $type eq 'directory';
# print the header to socket
print $c "HTTP/1.0 200 OK$CRLF";
print $c "Content-length: $length$CRLF";
print $c "Content-type: $type$CRLF";
print $c $CRLF;
return unless $method eq 'GET';
# print the content to socket
my $buffer;
while ( read($fh,$buffer,1024) ) {
print $c $buffer;
}
close $fh;
}
The cgi file is as follows
#!/usr/bin/perl -w
# from http://perl.about.com/od/cgiweb/a/perlcgipm.htm
use CGI qw/:standard/;
print header,
start_html('Hello World'),
h1('CGI.pm is simple.'),
end_html;
The cgi works fine on Apache.
I believe I need to make a system call, but all of my efforts have been unsuccessful.
Any help would be greatly appreciated.

How do I handle and send POST requests in Perl and FastCGI?

Unfortunately, I'm not familiar with Perl, so asking here. Actually I'm using FCGI with Perl.
I need to 1. accept a POST request -> 2. send it via POST to another url -> 3. get results -> 4. return results to the first POST request (4 steps).
To accept a POST request (step 1) I use the following code (found it somewhere in the Internet):
$ENV{'REQUEST_METHOD'} =~ tr/a-z/A-Z/;
if ($ENV{'REQUEST_METHOD'} eq "POST") {
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
}
else {
print ("some error");
}
#pairs = split(/&/, $buffer);
foreach $pair (#pairs) {
($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%(..)/pack("C", hex($1))/eg;
$FORM{$name} = $value;
}
The content of $name (it's a string) is the result of the first step. Now I need to send $name via POST request to some_url (step 2) which returns me another result (step 3), which I have to return as a result to the very first POST request (step 4).
Any help with this would be greatly appreciated.
Thank you.
To accept the POST, you can use the hand-rolled code you've shown, but the very best way is to make use of CGI (which is now a core module so it should be in your Perl distribution). For passing on a POST to somewhere else, you can use LWP::UserAgent
#/usr/bin/perl
use strict;
use warnings;
use CGI;
use LWP::UserAgent;
my $cgi = CGI->new; # Will process post upon instantiation
my %params = $cgi->Vars;
my $ua = LWP::UserAgent->new;
my $postTo = 'http://www.somewhere.com/path/to/script';
my $response = $ua->post($postTo, %params);
if ($response->is_success) {
print $response->decoded_content; # or maybe $response->content in your case
} else {
die $response->status_line;
}
}
I highly recommend that you do not try to solve this problem yourself but instead use existing libraries to make you life MUCH easier. The best part of Perl is the vast collection of existing libraries. See http://search.cpan.org/
Good starting places include CGI.pm or a web framework like Catalyst.
The code you've quoted is very buggy. Coincidentally, there was just a post by a popular Perl blogger dissecting this exact code.

What is the easiest way in pure Perl to stream from another HTTP resource?

What is the easiest way (without opening a shell to curl and reading from stdin) in Perl to stream from another HTTP resource? I'm assuming here that the HTTP resource I'm reading from is a potentially infinite stream (or just really, really long)
Good old LWP allows you to process the result as a stream.
E.g., here's a callback to yourFunc, reading/passing byte_count bytes to each call to yourFunc (you can drop that param if you don't care how large the data is to each call, and just want to process the stream as fast as possible):
use LWP;
...
$browser = LWP::UserAgent->new();
$response = $browser->get($url,
':content_cb' => \&yourFunc,
':read_size_hint' => byte_count,);
...
sub yourFunc {
my($data, $response) = #_;
# do your magic with $data
# $respose will be a response object created once/if get() returns
}
HTTP::Lite's request method allows you to specify a callback.
The $data_callback parameter, if used, is a way to filter the data as it is received or to handle large transfers. It must be a function reference, and will be passed: a reference to the instance of the http request making the callback, a reference to the current block of data about to be added to the body, and the $cbargs parameter (which may be anything). It must return either a reference to the data to add to the body of the document, or undef.
However, looking at the source, there seems to be a bug in sub request in that it seems to ignore the passed callback. It seems safer to use set_callback:
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Lite;
my $http = HTTP::Lite->new;
$http->set_callback(\&process_http_stream);
$http->http11_mode(1);
$http->request('http://www.example.com/');
sub process_http_stream {
my ($self, $phase, $dataref, $cbargs) = #_;
warn $phase, "\n";
return;
}
Output:
C:\Temp> ht
connect
content-length
done-headers
content
content-done
data
done
It looks like a callback passed to the request method is treated differently:
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Lite;
my $http = HTTP::Lite->new;
$http->http11_mode(1);
my $count = 0;
$http->request('http://www.example.com/',
\&process_http_stream,
\$count,
);
sub process_http_stream {
my ($self, $data, $times) = #_;
++$$times;
print "$$times====\n$$data\n===\n";
}
Wait, I don't understand. Why are you ruling out a separate process? This:
open my $stream, "-|", "curl $url" or die;
while(<$stream>) { ... }
sure looks like the "easiest way" to me. It's certainly easier than the other suggestions here...
Event::Lib will give you an easy interface to the fastest asynchronous IO method for your platform.
IO::Lambda is also quite nice for creating fast, responsive, IO applications.
Here is a version I ended up using via Net::HTTP
This is basically a copy of the example from the Net::HTTP man page / perl doc
use Net::HTTP;
my $s = Net::HTTP->new(Host => "www.example.com") || die $#;
$s->write_request(GET => "/somestreamingdatasource.mp3");
my ($code, $mess, %h) = $s->read_response_headers;
while (1) {
my $buf;
my $n = $s->read_entity_body($buf, 4096);
die "read failed: $!" unless defined $n;
last unless $n;
print STDERR "got $n bytes\n";
print STDOUT $buf;
}