Snooping on http headers between different plack middlewares - perl

If I understand right, the PSGI application works as next:
got the request from a browser
the request is "bubbles" thru some middlewares in the order as them is defined in the builder
the request comes to my app
my app produces some respond
this respond again bubbles thru some middlewares
finally the respon is send to the browser
I can easily debug-print all headers (e.g. cookies) when the request landed in my $app.
The question is:
How to debug-print the actual state of headers while the request coming thru many middlewares to my app and while the respond is going-out again thru middlewares.
So, Having an (simplyfied) app.psgi, like the next:
use strict;
use warnings;
use Plack::Builder;
my $app = sub { ... };
builder {
# <- debug-print the first request headers
# and the last respond headers here
enable "Debug";
# <- debug-print the actual state of request/respond headers here
enable "mid2";
# <- and here
enable "mid3";
# <- and here
$app; # <- and finally here - this is of course EASY
}
It is probably not as easy as something like,
print STDERR Dumper $dont_know_what->request->headers(); #HTTP::Headers ???
print STDERR Dumper $dont_know_what->respond->headers();
so adding a bounty :) ;)

One basic approach is to create a middleware that dumps the headers before executing the wrapped application and then right afterward. Then you enable this middleware at each point where you want to see the headers as you have pointed out in your pseudocode.
The following code does this by building an in-line middleware each time you enable it.
use Plack::Builder;
use Plack::Request;
use Plack::Response;
sub headers_around {
my $position = shift;
# build and return the headers_around middleware as a closure
return sub {
my $app = shift;
# gets called each request
return sub {
my $env = shift;
my $req = Plack::Request->new($env);
# display headers before next middleware
print STDERR "req headers before $position:\n" . $req->headers->as_string . "\n=====\n";
# execute the next app on the stack
my $res = $app->($env);
my $response = Plack::Response->new(#$res);
# display headers after previous middleware
print STDERR "res headers after $position:\n" . $response->headers->as_string . "\n=====\n";
return $res;
};
};
};
builder {
enable headers_around('Debug');
enable 'Debug';
enable headers_around('Lint');
enable 'Lint';
enable headers_around('StackTrace');
enable 'StackTrace', force => 1;
enable headers_around('App');
mount '/' => builder { sub {
return [ 200, [ 'Content-Type' => 'text/plain' ], [ 'Hello World' ] ];
}}
};
# now build the application enabling regular middleware with our inline middleware
builder {
enable headers_around('Debug');
enable 'Debug';
enable headers_around('Lint');
enable 'Lint';
enable headers_around('StackTrace');
enable 'StackTrace', force => 1;
enable headers_around('App');
mount '/' => builder { sub {
return [ 200, [ 'Content-Type' => 'text/plain' ], [ 'Hello World' ] ];
}}
};
When I run it with plackup I get the following output:
$ plackup --app between_middleware.psgi
HTTP::Server::PSGI: Accepting connections at http://0:5000/
req headers before Debug:
Connection: Keep-Alive
Accept: */*
Host: 0:5000
User-Agent: Wget/1.12 (linux-gnu)
=====
req headers before Lint:
Connection: Keep-Alive
Accept: */*
Host: 0:5000
User-Agent: Wget/1.12 (linux-gnu)
=====
req headers before StackTrace:
Connection: Keep-Alive
Accept: */*
Host: 0:5000
User-Agent: Wget/1.12 (linux-gnu)
=====
req headers before App:
Connection: Keep-Alive
Accept: */*
Host: 0:5000
User-Agent: Wget/1.12 (linux-gnu)
=====
res headers after App:
Content-Type: text/plain
=====
res headers after StackTrace:
Content-Type: text/plain
=====
res headers after Lint:
Content-Type: text/plain
=====
res headers after Debug:
Content-Type: text/plain
=====
127.0.0.1 - - [02/Apr/2014:19:37:30 -0700] "GET / HTTP/1.0" 200 11 "-" "Wget/1.12 (linux-gnu)"
Obviously you could turn this into an actual middleware like Ashley's and you may have to tweak it to send log messages using whatever facility you have in place.

Middleware
package ShowMeTheHeaders;
use parent "Plack::Middleware";
use Plack::Request;
use Plack::Response
require Text::Wrap;
my $_call_back = sub {
my $response = Plack::Response->new(#{+shift});
print "* Response Headers:\n",
Text::Wrap::wrap("\t", "\t", $response->headers->as_string);
return; # Explicit return suggested by docs.
};
sub call {
my $self = shift;
my $request = Plack::Request->new(shift);
print "* Request Headers:\n",
Text::Wrap::wrap("\t", "\t", $request->headers->as_string);
my $response = $self->app->($request);
Plack::Util::response_cb($response, $_call_back);
}
1;
You can do this without the objectification (Plack::Request and Plack::Response) but then you have to deal with raw attributes and keys for the header fields instead of the entirely more pleasant ->as_string. See also the “response callback” section of Plack::Middleware.
demo psgi
use warnings;
use strict;
use Plack::Builder;
my $app = sub {
[ 200,
[ "Content-Type" => "text/plain" ],
[ "O HAI, PLAK!" ]
];
};
builder {
enable "+ShowMeTheHeaders";
mount "/" => $app;
};

Related

Mailchimp v3.0 API, using Perl Curl

I'm trying to find a way to do a Curl request to MailChimps new API v3.0, that will subscribe a user to a given list. Here is what I have thus far:
use warnings;
use WWW::Curl::Easy;
use JSON;
my $apikey = 'xxxx';
my $listid = 'xxxx';
my $email = 'andy#test.co.uk';
my $endpoint = "https://us6.api.mailchimp.com/3.0/lists";
my $json = JSON::encode_json([
'email_address' => $email,
'status' => 'pending',
'merge_fields' => [
'FNAME' => "andy",
'LNAME' => "newby"
]
]);
my $curl = WWW::Curl::Easy->new;
my $url = "$endpoint/$listid/members/" . Digest::MD5::md5(lc($email));
$curl->setopt(CURLOPT_HEADER,1);
$curl->setopt(CURLOPT_URL, $endpoint);
$curl->setopt(CURLOPT_USERPWD, 'user:' . $apikey);
$curl->setopt(CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
$curl->setopt(CURLOPT_TIMEOUT, 10);
$curl->setopt(CURLOPT_CUSTOMREQUEST, 'PUT');
$curl->setopt(CURLOPT_SSL_VERIFYPEER, 0);
$curl->setopt(CURLOPT_POSTFIELDS, $json);
my $response_body;
$curl->setopt(CURLOPT_WRITEDATA,\$response_body);
# Starts the actual request
my $retcode = $curl->perform;
# Looking at the results...
if ($retcode == 0) {
print("Transfer went ok\n");
my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
print "Received response: $response_body\n";
} else {
# Error code, type of error, error message
print "An error happened: $retcode ".$curl->strerror($retcode)." ".$curl->errbuf."\n";
}
The documentation is pretty scarce, due to it being a new API. Has anyone had any success with the MailChimp v3 API, for subscribing someone in Perl? (I'm also open to suggestions for command line curl requests... but everything I tried with regards to that, failed with "internal server errors" coming back from MailChimp, which wasn't very helpful)
UPDATE: As suggested below, I enabled verbose, and it now spits out:
Hostname was NOT found in DNS cache
Trying 184.86.100.251...
Connected to us6.api.mailchimp.com (184.86.100.251) port 443 (#0)
successfully set certificate verify locations:
CAfile: none CApath: /etc/ssl/certs
SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
Server certificate:
subject: C=US; ST=GA; L=Atlanta; O=ROCKET SCIENCE GROUP; OU=Rocket Science Group; CN=*.api.mailchimp.com
start date: 2015-09-22 14:39:14 GMT
expire date: 2016-09-22 14:39:13 GMT
subjectAltName: us6.api.mailchimp.com matched
issuer: C=NL; L=Amsterdam; O=Verizon Enterprise Solutions; OU=Cybertrust; CN=Verizon Akamai SureServer CA G14-SHA2
SSL certificate verify ok.
Server auth using Basic with user 'user'
PUT /3.0/lists HTTP/1.1 Authorization: Basic xxxx Host: us6.api.mailchimp.com Accept: / Content-Type: application/json
Content-Length: 108
upload completely sent off: 108 out of 108 bytes < HTTP/1.1 405 Method Not Allowed
Server nginx is not blacklisted < Server: nginx < Content-Type: application/problem+json; charset=utf-8 < Content-Length: 253 <
X-Request-Id: 5f6ab08f-69e7-4c9b-b22a-91714331d5b7 < Link:
https://us6.api.mailchimp.com/schema/3.0/ProblemDetailDocument.json;
rel="describedBy" < Allow: GET, POST < Date: Tue, 13 Oct 2015 11:24:32
GMT < Connection: close < Set-Cookie: _AVESTA_ENVIRONMENT=prod; path=/
<
Closing connection 0 Transfer went ok Received response: HTTP/1.1 405 Method Not Allowed Server: nginx Content-Type:
application/problem+json; charset=utf-8 Content-Length: 253
X-Request-Id: 5f6ab08f-69e7-4c9b-b22a-91714331d5b7 Link:
https://us6.api.mailchimp.com/schema/3.0/ProblemDetailDocument.json;
rel="describedBy" Allow: GET, POST Date: Tue, 13 Oct 2015 11:24:32 GMT
Connection: close Set-Cookie: _AVESTA_ENVIRONMENT=prod; path=/
{"type":"http://kb.mailchimp.com/api/error-docs/405-method-not-allowed","title":"Method
Not Allowed","status":405,"detail":"The requested method and resource
are not compatible. See the Allow header for this resource's available
methods.","instance":""}
I'm not really sure what to make of that though :/
Working code: Thanks to TooMuchPete, I managed to get it going. For anyone who may come across this while trying to use the MailChimp API (3.0) in Perl, below is a working sample (you just need to replace the values of the email, name, api key, and list id);
use WWW::Curl::Easy;
use JSON;
use Digest::MD5;
my $apikey = 'xxxx-us6';
my $listid = 'xxxx';
my $email = 'andy#testr.co.uk';
my $endpoint = "https://us6.api.mailchimp.com/3.0/lists";
my $json = JSON::encode_json({
'email_address' => $email,
'status' => 'pending',
'merge_fields' => {
'FNAME' => "andy",
'LNAME' => "newby"
}
});
my $curl = WWW::Curl::Easy->new;
my $url = "$endpoint/$listid/members/" . Digest::MD5::md5(lc($email));
$curl->setopt(CURLOPT_HEADER,1);
$curl->setopt(CURLOPT_URL, $url);
$curl->setopt(CURLOPT_VERBOSE, 1);
$curl->setopt(CURLOPT_USERPWD, 'user:' . $apikey);
$curl->setopt(CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
$curl->setopt(CURLOPT_TIMEOUT, 10);
$curl->setopt(CURLOPT_CUSTOMREQUEST, 'PUT');
$curl->setopt(CURLOPT_SSL_VERIFYPEER, 0);
$curl->setopt(CURLOPT_POSTFIELDS, $json);
# A filehandle, reference to a scalar or reference to a typeglob can be used here.
my $response_body;
$curl->setopt(CURLOPT_WRITEDATA,\$response_body);
# Starts the actual request
my $retcode = $curl->perform;
# Looking at the results...
if ($retcode == 0) {
print("Transfer went ok\n");
my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
# judge result and next action based on $response_code
print "Received response: $response_body\n";
} else {
# Error code, type of error, error message
print "An error happened: $retcode ".$curl->strerror($retcode)." ".$curl->errbuf."\n";
}
I hope this saves someone the grief I had :)
You're attempting to connect to $endpoint instead of $url.
my $url = "$endpoint/$listid/members/" . Digest::MD5::md5(lc($email));
$curl->setopt(CURLOPT_HEADER,1);
$curl->setopt(CURLOPT_URL, $endpoint);
should be:
my $url = "$endpoint/$listid/members/" . Digest::MD5::md5(lc($email));
$curl->setopt(CURLOPT_HEADER,1);
$curl->setopt(CURLOPT_URL, $url);
I received an illegal characters response from MailChimp using the code above until I switched to the md5_base64() call.

Form a CURL request to PayMill in Perl

I'm out of my depth with curl.
I want to integrate PayMill into my site (which is written in Perl).
There isn't a Perl lib for Paymill yet, so I need to connect to them via curl.
I have completed the front end JS Paymill integration, and received a payment token from PayMill.
I now need to pass the token received from Paymill to my backend and use curl to ask PayMill to complete the transaction and charge the user.
At this point I'm stuck.
To make a transaction, the PayMill documentation says that I must do the following:
curl https://api.paymill.de/v2/transactions \
-u b94a7550bd908877cbae5d3cf0dc4b74: \
-d "amount=4200" \
-d "currency=EUR" \
-d "token=098f6bcd4621d373cade4e832627b4f6" \
-d "description=Test Transaction"
I believe -u is the Paymill secret key to authenticate my request although the documentation is not clear here.
I've had a look at WWW::Curl::Easy, Net:Curl::Easy and LWP::Curl, however nothing in the documentation for those methods makes it obvious to me how to form the query above.
I've tried (without really believing it would work), simply encoding a string in perl as described above;
my $request = '-u ' . $private_key . " ";
foreach my $key (keys %$params_in) {
$request .= '-d "' . lc($key) .'='.$params_in->{$key} . ' ';
}
And then passing $request to my attempt at curl as follows;
my $curl = WWW::Curl::Easy->new;
$curl->setopt(WWW::Curl::Easy::CURLOPT_HEADER(), 1);
$curl->setopt(WWW::Curl::Easy::CURLOPT_URL(), $paymill_server);
$curl->setopt(WWW::Curl::Easy::CURLOPT_POST(), 1);
$curl->setopt(WWW::Curl::Easy::CURLOPT_POSTFIELDS(), $request);
my $response;
$curl->setopt(WWW::Curl::Easy::CURLOPT_WRITEDATA(), \$response);
my $retcode = $curl->perform;
however that fails with an Access Denied error, which I assume is because Paymill is not finding my key because I'm messing up the Curl (assuming -u is supposed to be the secret_key).
I feel I'm missing something obvious here.
Could someone point me in the right direction re how to do this?
Thanks
UPDATE
Excellent answers, thanks everyone for your help, it's working now. I went with Matthias's solution in the end and the final complete solution for making a transaction looked as follows;
use LWP::UserAgent;
use MIME::Base64;
use JSON::XS;
my $ua = LWP::UserAgent->new;
$ua->default_header(Authorization => "Basic " . encode_base64(private_key));
my $response = $ua->post(https://api.paymill.de:443/v2/transactions , $params );
if ( $response->is_success ) {
my $obj = eval { decode_json $response->content } || {};
etc
}
Like other answers propose the best way would be using LWP::UserAgent for doing the requests.
Edit: Since PAYMILL is sending challenge responses since a while now I updated the code.
Since Paymill doesn't comply with RFC 2616, Section 14.47 (the API isn't sending a challenge response) LWP::UserAgent and similar are failing in sending a second request with the credentials. The solution is to "force" LWP::UserAgent to send the credentials with the first request by adding them as header:
use LWP::UserAgent;
use MIME::Base64;
my $ua = LWP::UserAgent->new;
# Use the following line no longer:
# $ua->default_header(Authorization => "Basic " . encode_base64("your PRIVATE key"))
$ua->credentials('api.paymill.de:443', '', 'YOUR PRIVATE KEY');
# Dumping only
use Data::Dumper;
print Dumper($ua->get("https://api.paymill.de:443/v2/clients"));
Disclosure: I work at Paymill.
I don't know if the authentication part with the user/password and your token is correct as I don't know what the 'Realm' is supposed to be. Still, have a go with LWP. It's not that I don't like Curl, I just don't know it, but I do know LWP.
use strict; use warnings;
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
$ua->credentials(
'api.paymill.de:80',
'Realm?',
'b94a7550bd908877cbae5d3cf0dc4b74'
);
my $response = $ua->post(
' https://api.paymill.de/v2/transactions',
{
amount => "4200",
currency => "EUR",
token => "098f6bcd4621d373cade4e832627b4f6",
description => "Test Transaction",
}
);
if ( $response->is_success ) {
print $response->decoded_content; # or whatever
} else {
die $response->status_line;
}
Edit: I read a little in the Paymill documentation. It says:
Authentication
Example
% curl https://api.paymill.de/v2/clients \
-u e73fa5e7b87620585b5ea5d73c4d23bb:
To authenticate at the Paymill API, you need the private key of your
test or live account. You have to use http basic access
authentification. Your key has to be set as the username. A password
isn’t required and you don’t have to insert one. But if you want, feel
free to insert an arbitrary string.
Note
Please keep your private keys secure and don’t pass them to anybody. These private keys have extreme secure information for
handling the transactions of your shop.
All your requests must be made via https. Requests which will be made in another way will fail. This is for security reasons of the
submitted data.
There is also a link to http://en.wikipedia.org/wiki/HTTP_Secure, which clears up the -u part pretty much I believe.
You can use LWP::Protocol::Net::Curl to integrate LWP and libcurl organically. Check this:
#!/usr/bin/env perl
use common::sense;
use Data::Printer;
use JSON::XS;
use LWP::Protocol::Net::Curl verbose => 1;
use LWP::UserAgent;
# create user agent
my $ua = LWP::UserAgent->new;
# POST request
my $res = $ua->post(
'https://b94a7550bd908877cbae5d3cf0dc4b74:#api.paymill.de/v2/transactions',
'Accept-Encoding' => 'gzip',
Content => {
amount => 4200,
currency => 'EUR',
token => '098f6bcd4621d373cade4e832627b4f6',
description => 'Test Transaction',
},
);
# parse received data
my $obj = eval { decode_json $res->content } // {};
# output
p $obj;
The output:
* About to connect() to api.paymill.de port 443 (#0)
* Trying 62.138.241.3...
* Connected to api.paymill.de (62.138.241.3) port 443 (#0)
* Connected to api.paymill.de (62.138.241.3) port 443 (#0)
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: none
* SSL connection using RC4-SHA
* Server certificate:
* subject: OU=Domain Control Validated; OU=PositiveSSL Wildcard; CN=*.paymill.de
* start date: 2012-07
* expire date: 2013-10
* subjectAltName: api.paymill.de matched
* issuer: C=GB; S
* SSL certificate verify ok.
* Server auth using Basic with user 'b94a7550bd908877cbae5d3cf0dc4b74'
> POST /v2/transactions HTTP/1.1
Authorization: Basic Yjk0YTc1NTBiZDkwODg3N2NiYWU1ZDNjZjBkYzRiNzQ6
User-Agent: libwww-perl/6.04 libcurl/7.28.0 OpenSSL/1.0.0e zlib/1.2.3.4 libidn/1.22 libssh2/1.2.8
Host: api.paymill.de
Accept: */*
Accept-Encoding: gzip
Content-Length: 92
Content-Type: application/x-www-form-urlencoded
* upload completely sent off: 92 out of 92 bytes
< HTTP/1.1 200 OK
< Server: nginx
< Date: Wed, 09 Jan 2013 17:22:54 GMT
< Content-Type: application/json
< Transfer-Encoding: chunked
< Connection: close
< Set-Cookie: PHPSESSID=rmdo5a8c6u107gma28lotmmn24; path=/
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< X-Server: hrtt-frn5-de13
<
* Closing connection #0
Printing in line 28 of paymill.pl:
\ {
data {
amount 4200,
client {
created_at 1357752174,
description undef,
email undef,
id "client_85cb0bfc837f31c81015",
payment [],
subscription undef,
updated_at 1357752174
},
created_at 1357752174,
currency "EUR",
description "Test Transaction",
id "tran_c672daa0538e2a04e919",
livemode false,
origin_amount 4200,
payment {
card_holder undef,
card_type "visa",
client "client_85cb0bfc837f31c81015",
country undef,
created_at 1357752174,
expire_month 12,
expire_year 2014,
id "pay_2732689f44928301c769",
last4 1111,
type "creditcard",
updated_at 1357752174
},
preauthorization undef,
refunds undef,
status "closed",
updated_at 1357752174
},
mode "test"
}

How do I change the order of HTTP request headers sent by Perl's LWP?

For a test i need to do a get requets to a website - unfortunatly when using perl lwp the "connection" appears in the header b4 the host. As a result the request gets filtered by the web application firewall. All i need is to remove or move down the connection line in the header. When i do the requets with my script:
use warnings;
use IO::Socket;
use LWP::UserAgent;
use LWP::Protocol::http;
use HTTP::Request;
my $ua = LWP::UserAgent->new();
push(#LWP::Protocol::http::EXTRA_SOCK_OPTS, SendTE => 0, PeerHTTPVersion => "1.1");
$ua->default_header(Cookie => 'XXX', User-Agent => 'whateva');
my $request = $ua->get('https://www.test.com/test.html?...');
....
The header looks like this:
GET /test.html?... HTTP/1.1
Connection: close
Host: www.test.com
User-Agent: whateva
Cookie: XXXX
BUT it should look like this to work (conenction comes after host):
GET /test.html?... HTTP/1.1
Host: www.test.com
Connection: close
User-Agent: whateva
Cookie: XXXX
How do i get rid of that connection line in LWP? I just need to re-oder it....Its not that it needs to be completly removed; I am happy to add it later in there again as
# $userAgent->default_header ("Connection" => "keep-alive");..
Thx a lot in advance!
To work around the bug in your firewall*, change
return _bytes(join($CRLF, "$method $uri HTTP/$ver", #h2, #h, "", $content));
in Net/HTTP.pm to
my #h3 = ( #h2, #h );
if (my ($idx) = grep /^Host:/, 0..$#h3) {
unshift(#h3, splice(#h3, $idx, 1));
}
return _bytes(join($CRLF, "$method $uri HTTP/$ver", #h3, "", $content));
* — According to the HTTP/1.1 spec, RFC 2616, "The order in which header fields with differing field names are received is not significant."

Send a plain string request with LWP

To get a response from a certain website, I have to give one exact request string, HTTP/1.1. I tried that one with telnet, it gives me the response I want (a redirect, but I need it).
But when I try to give the same request string to HTTP::Request->parse(), I merely get the message 400 URL must be absolute.
I am not sure if it's the website or LWP giving me that, because as I said, the response worked with telnet.
This is the code:
my $req = "GET / HTTP/1.1\n".
"Host: www.example-site.de\n".
"User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1\n".
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\n".
"Accept-Language: en-us,en;q=0.5\n".
"Accept-Encoding: gzip, deflate\n".
"Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\n".
"Keep-Alive: 115\n".
"Connection: keep-alive\n";
# Gives correct request string
print HTTP::Request->parse($req)->as_string;
my $ua = LWP::UserAgent->new( cookie_jar => {}, agent => '' );
my $response = $ua->request(HTTP::Request->parse($req));
# 400 error
print $response->as_string,"\n";
Anyone can help me here?
LWP::UserAgent dies with the error you are getting if there is no schema specified in request. It probably need it to properly work with it.
So, to make it work, you need to specify full url for your request:
my $req_str = "GET http://www.example.de/\n".
"User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1\n".
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\n".
"Accept-Language: en-us,en;q=0.5\n".
"Accept-Encoding: gzip, deflate\n".
"Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\n".
"Keep-Alive: 115\n".
"Connection: keep-alive\n";
Ok, I did it using Sockets. After all, I had the HTTP request and wanted the plain response. Here the code for people who are interested:
use IO::Sockets;
my $sock = IO::Socket::INET->new(
PeerAddr => 'www.example-site.de',
PeerPort => 80,
Proto => 'Tcp',
);
die "Could not create socket: $!\n" unless $sock;
print $sock, $req;
while(<$sock>) {
# Look for stuff I need
}
close $sock;
It's just important to remember to leave the while, as the HTTP response won't end with an EOF.
It looks to me like parsing the request isn't 100 % round-trip-safe, meaning you cannot feed the response back into a request.
Looks like a bug at first sight, but the module's been out for such a long time… On the other hand, I didn't even know you could use this module to parse a request, so maybe it's not so well tested.
The following test case should point you to the problem, which is that the URL isn't properly assembled for being fed to the $req->request method.
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use Test::More;
my $host = 'www.example.com';
my $url = '/bla.html';
my $req = <<"EOS";
GET $url HTTP/1.1
Host: $host
EOS
# (1) parse the request
my $reqo = HTTP::Request->parse($req);
isa_ok $reqo, 'HTTP::Request';
diag explain $reqo;
diag $reqo->as_string;
# (2) construct the request
my $reqo2 = HTTP::Request->new( GET => "http://$host$url" );
isa_ok $reqo2, 'HTTP::Request';
diag explain $reqo2;
diag $reqo2->as_string;
is $reqo->uri, $reqo2->uri, 'both URLs are identical';
my $ua = LWP::UserAgent->new( cookie_jar => {}, agent => '' );
for ( $reqo, $reqo2 ) {
my $response = $ua->request( $_ );
diag $response->as_string,"\n";
}
done_testing;

How can I determine if a URL redirects?

If I have a URL (eg. http://www.foo.com/alink.pl?page=2), I want to determine if I am being redirected to another link. I'd also like to know the final URL (eg. http://www.foo.com/other_link.pl). Finally, I want to be able to do this in Perl and Groovy.
In Perl:
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
my $request = HTTP::Request->new( GET => 'http://google.com/' );
my $response = $ua->request($request);
if ( $response->is_success and $response->previous ) {
print $request->url, ' redirected to ', $response->request->uri, "\n";
}
Well, I know nothing about either Perl or groovy, so I'll give you an another from an HTTP point of view, and you'll have to adapt.
Normally, you make an HTTP request, and you get back some HTML text along with a response code. The response code for Success is 200. Any response code in the 300 range is some form of a redirect.
Referring to James's answer - sample HTTP session:
$ telnet www.google.com 80
HEAD / HTTP/1.1
HOST: www.google.com
HTTP/1.1 302 Found
Location: http://www.google.it/
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: ##############################
Date: Thu, 30 Oct 2008 20:03:36 GMT
Server: ####
Content-Length: 218
Using HEAD instead of GET you get only the header. "302" means a temporary redirection, "Location:" is where you are redirected to.
A quick & dirty groovy script to show the concepts -- Note, this is using java.net.HttpURLConnection
In order to detect the redirect, you have to use setFollowRedirects(false). Otherwise, you end up on the redirected page anyway with a responseCode of 200. The downside is you then have to navigate the redirect yourself.
URL url = new URL ('http://google.com')
HttpURLConnection conn = url.openConnection()
conn.followRedirects = false
conn.requestMethod = 'HEAD'
println conn.responseCode
// Not ideal - should check response code too
if (conn.headerFields.'Location') {
println conn.headerFields.'Location'
}
301
["http://www.google.com/"]
In Perl you can use LWP::Useragent for that. I guess the easiest way is to add a response_redirect handler using add_handler.
I think this will work for 301 redirects.
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
my $request = HTTP::Request->new( GET => 'http://google.com/' );
my $response = $ua->request($request);
if ( $response->is_redirect ) {
print $request->url . " redirected to location " . $response->header('Location') . "\n";
}