Perl LWP Post form-data with my boundary - perl

I want to post data with content type multipart/form-data:
use strict;
use warnings;
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
$ua->post (
'http://127.0.0.1:12555',
'Content-Type' => 'form-data',
Content => {
'data1' => rand,
'data2' => rand,
}
);
And i tested the submitted data:
use strict;
use warnings;
use IO::Socket::INET;
use Socket qw ( inet_aton );
my $sock_listen = new IO::Socket::INET (
LocalHost => '127.0.0.1',
LocalPort => '12555',
Proto => 'tcp',
Listen => 3,
Reuse => 1,
);
$sock_listen->autoflush ();
my $sock;
while ( $sock = $sock_listen->accept ( ) )
{
my $data = '';
$sock->recv ( $data, 4096 );
print $data . "\n";
}
Test #1 result:
POST / HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: 127.0.0.1:12555
User-Agent: libwww-perl/6.05
Content-Length: 162
Content-Type: multipart/form-data; boundary=xYzZY
--xYzZY
Content-Disposition: form-data; name="data2"
0.876556396484375
--xYzZY
Content-Disposition: form-data; name="data1"
0.62921142578125
--xYzZY--
Test #2 result:
POST / HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: 127.0.0.1:12555
User-Agent: libwww-perl/6.05
Content-Length: 163
Content-Type: multipart/form-data; boundary=xYzZY
--xYzZY
Content-Disposition: form-data; name="data2"
0.896942138671875
--xYzZY
Content-Disposition: form-data; name="data1"
0.041656494140625
--xYzZY--
I added a data:
'data3' => '--xYzZY'
and got:
POST / HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: 127.0.0.1:12555
User-Agent: libwww-perl/6.05
Content-Length: 221
Content-Type: multipart/form-data; boundary=Tegj
--Tegj
Content-Disposition: form-data; name="data2"
0.34613037109375
--Tegj
Content-Disposition: form-data; name="data3"
--xYzZY
--Tegj
Content-Disposition: form-data; name="data1"
0.678955078125
--Tegj--
The question is how i can set the boundary manually to 32 chars string like browser's ----WebKitFormBoundary[...] using LWP?
Or can just use IO::Socket?

LWP allows you set the boundary manually when you do multipart/form-data requests. This feature is unfortunately not documented at all.
However, you have to do multipart explicitly. You can set your own boundary by appending the boundary as an additional field of the Content-Type. It will be converted to a header appropriately by HTTP::Request::Common.
my $ua = LWP::UserAgent->new;
$ua->post(
'http://127.0.0.1:12555',
'Content-Type' =>
'multipart/form-data;boundary=Nobody-has-the-intention-to-erect-a-wall',
# ^^^^^^^ ^^^^^^^^
Content => {
data1 => rand,
data2 => rand,
},
);
With your listener, this result in the following output.
POST / HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: 127.0.0.1:12555
User-Agent: libwww-perl/6.15
Content-Length: 269
Content-Type: multipart/form-data; boundary=Nobody-has-the-intention-to-erect-a-wall
--Nobody-has-the-intention-to-erect-a-wall
Content-Disposition: form-data; name="data2"
0.0575856828104122
--Nobody-has-the-intention-to-erect-a-wall
Content-Disposition: form-data; name="data1"
0.677908250902878
--Nobody-has-the-intention-to-erect-a-wall--
Note that HTTP::Request::Common will replace your boundary with a random string if it finds the boundary string in the body of any of the parts. It will not just add a number to your boundary.

The sole purpose of the boundary is to separate the message parts and the only requirement on it is that it doesn't appear anywhere in the message. I don't see a good reason to attempt to set it to be "the same" as anything else. Besides, no tool guarantees that it will always use the same one.
More importantly, setting it to a fixed string (without regard for the message) is dangerous: how does anyone know that such a string may not be in a message?
Finally, I don't think it is possible to do so, precisely because the boundary must be checked to ensure that it indeed isn't in the message; so no tool should provide a way to set it to a predefined string.
Have a look at HTTP::Request::Common's source. See how the sub boundary() badly mangles the string to return, and how much work goes into the boundary elsewhere. Then CHECK_BOUNDARY: block changes it further if it isn't good enough. This is clearly not meant to be set outside.
The post method of LWP::UserAgent exists as a shortcut for this module's one.
Note that simbabque found a way to set the boundary, which then also undergoes the checks.

Related

Extract REMOTE_ADDR using mod_perl

Please help how to extract REMOTE_ADDR value using mod_perl.
I have following perl script:
use Apache2::RequestUtil;
use Apache2::RequestRec;
$| = 1;
print "Content-type: text/plain\n\n";
my $r = Apache2::RequestUtil->request;
print $r->as_string();
the result of script:
GET /tmp/recheaders.pl HTTP/1.0
X-Scheme: http
REMOTE_ADDR: 81.204.25.44
Host: dom.net
X-Real-IP: 81.204.25.44
X-Forwarded-for: 81.204.25.44
Connection: close
User-Agent: Opera/9.80 (X11; Linux x86_64) Presto/2.12.388 Version/12.16
Accept: text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/webp, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
Accept-Language: en-US,en;q=0.9
Accept-Encoding: gzip, deflate
Cookie: SESSION_ID=963bd96302cd70047c9f017640c7cbb8;
Cache-Control: no-cache
HTTP/1.0 (null)
How can I extract only REMOTE_ADDR: 81.204.25.44 ? What API method and how should I use?
SOLUTION
I found more elegant solution, to fetch just REMOTE_ADDR using headers_in:
use Apache2::Request;
my $r = shift;
my $req = Apache2::Request->new($r);
my $remip = $req->headers_in->{REMOTE_ADDR};
print $remip;
Just use %ENV
print $ENV{REMOTE_ADDR};
Alternatively there is Apache2::Connection->remote_addr()
use Apache2::Connection ();
use Apache2::RequestRec ();
my $c = $r->connection;
# this connection's local and remote socket addresses
my $local_sa = $c->local_addr();
my $remote_sa = $c->remote_addr();

Snooping on http headers between different plack middlewares

If I understand right, the PSGI application works as next:
got the request from a browser
the request is "bubbles" thru some middlewares in the order as them is defined in the builder
the request comes to my app
my app produces some respond
this respond again bubbles thru some middlewares
finally the respon is send to the browser
I can easily debug-print all headers (e.g. cookies) when the request landed in my $app.
The question is:
How to debug-print the actual state of headers while the request coming thru many middlewares to my app and while the respond is going-out again thru middlewares.
So, Having an (simplyfied) app.psgi, like the next:
use strict;
use warnings;
use Plack::Builder;
my $app = sub { ... };
builder {
# <- debug-print the first request headers
# and the last respond headers here
enable "Debug";
# <- debug-print the actual state of request/respond headers here
enable "mid2";
# <- and here
enable "mid3";
# <- and here
$app; # <- and finally here - this is of course EASY
}
It is probably not as easy as something like,
print STDERR Dumper $dont_know_what->request->headers(); #HTTP::Headers ???
print STDERR Dumper $dont_know_what->respond->headers();
so adding a bounty :) ;)
One basic approach is to create a middleware that dumps the headers before executing the wrapped application and then right afterward. Then you enable this middleware at each point where you want to see the headers as you have pointed out in your pseudocode.
The following code does this by building an in-line middleware each time you enable it.
use Plack::Builder;
use Plack::Request;
use Plack::Response;
sub headers_around {
my $position = shift;
# build and return the headers_around middleware as a closure
return sub {
my $app = shift;
# gets called each request
return sub {
my $env = shift;
my $req = Plack::Request->new($env);
# display headers before next middleware
print STDERR "req headers before $position:\n" . $req->headers->as_string . "\n=====\n";
# execute the next app on the stack
my $res = $app->($env);
my $response = Plack::Response->new(#$res);
# display headers after previous middleware
print STDERR "res headers after $position:\n" . $response->headers->as_string . "\n=====\n";
return $res;
};
};
};
builder {
enable headers_around('Debug');
enable 'Debug';
enable headers_around('Lint');
enable 'Lint';
enable headers_around('StackTrace');
enable 'StackTrace', force => 1;
enable headers_around('App');
mount '/' => builder { sub {
return [ 200, [ 'Content-Type' => 'text/plain' ], [ 'Hello World' ] ];
}}
};
# now build the application enabling regular middleware with our inline middleware
builder {
enable headers_around('Debug');
enable 'Debug';
enable headers_around('Lint');
enable 'Lint';
enable headers_around('StackTrace');
enable 'StackTrace', force => 1;
enable headers_around('App');
mount '/' => builder { sub {
return [ 200, [ 'Content-Type' => 'text/plain' ], [ 'Hello World' ] ];
}}
};
When I run it with plackup I get the following output:
$ plackup --app between_middleware.psgi
HTTP::Server::PSGI: Accepting connections at http://0:5000/
req headers before Debug:
Connection: Keep-Alive
Accept: */*
Host: 0:5000
User-Agent: Wget/1.12 (linux-gnu)
=====
req headers before Lint:
Connection: Keep-Alive
Accept: */*
Host: 0:5000
User-Agent: Wget/1.12 (linux-gnu)
=====
req headers before StackTrace:
Connection: Keep-Alive
Accept: */*
Host: 0:5000
User-Agent: Wget/1.12 (linux-gnu)
=====
req headers before App:
Connection: Keep-Alive
Accept: */*
Host: 0:5000
User-Agent: Wget/1.12 (linux-gnu)
=====
res headers after App:
Content-Type: text/plain
=====
res headers after StackTrace:
Content-Type: text/plain
=====
res headers after Lint:
Content-Type: text/plain
=====
res headers after Debug:
Content-Type: text/plain
=====
127.0.0.1 - - [02/Apr/2014:19:37:30 -0700] "GET / HTTP/1.0" 200 11 "-" "Wget/1.12 (linux-gnu)"
Obviously you could turn this into an actual middleware like Ashley's and you may have to tweak it to send log messages using whatever facility you have in place.
Middleware
package ShowMeTheHeaders;
use parent "Plack::Middleware";
use Plack::Request;
use Plack::Response
require Text::Wrap;
my $_call_back = sub {
my $response = Plack::Response->new(#{+shift});
print "* Response Headers:\n",
Text::Wrap::wrap("\t", "\t", $response->headers->as_string);
return; # Explicit return suggested by docs.
};
sub call {
my $self = shift;
my $request = Plack::Request->new(shift);
print "* Request Headers:\n",
Text::Wrap::wrap("\t", "\t", $request->headers->as_string);
my $response = $self->app->($request);
Plack::Util::response_cb($response, $_call_back);
}
1;
You can do this without the objectification (Plack::Request and Plack::Response) but then you have to deal with raw attributes and keys for the header fields instead of the entirely more pleasant ->as_string. See also the “response callback” section of Plack::Middleware.
demo psgi
use warnings;
use strict;
use Plack::Builder;
my $app = sub {
[ 200,
[ "Content-Type" => "text/plain" ],
[ "O HAI, PLAK!" ]
];
};
builder {
enable "+ShowMeTheHeaders";
mount "/" => $app;
};

Extra spaces being added to PHP mail headers

I'm sending a multipart HTML email using PHP's mail() function. In my Postfix configuration I have my SMTP server set to Amazon's SES. Here is the PHP for sending the email:
$boundary = uniqid("HTMLDEMO");
$headers = "From: me#mydomain.com\r\n";
$headers .= "MIME-Version: 1.0\r\n";
$headers .= "Content-Type: multipart/alternative; boundary = ".$boundary."\r\n\r\n";
// plain text
$content = "--".$boundary."\r\n" .
"Content-Type: text/plain; charset=ISO-8859-1\r\n" .
"Content-Transfer-Encoding: base64\r\n\r\n" .
chunk_split(base64_encode($plaintext_message));
// HTML
$content .= "--".$boundary."\r\n" .
"Content-Type: text/html; charset=ISO-8859-1\r\n" .
"Content-Transfer-Encoding: text/html \r\n\r\n" .
"<html><body>".$html_message."</body></html>";
//send message
mail($to, $subject, $content, $headers);
When I echo the message content, this is what I see in the browser:
--HTMLDEMO527d8d851e72f
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: base64
VGhpcyBpcyB0aGUgcGxhaW4gdGV4dCB2ZXJzaW9uIQ==
--HTMLDEMO527d8d851e72f
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: text/html
<html><body><p>My message here.</p></body></html>
But when I view the message source in Gmail, I now see this (including the message headers):
From: me#mydomain.com
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary = HTMLDEMO527d8d851e72f
Message-ID: <blah-blah-blah#email.amazonses.com>
Date: Sat, 9 Nov 2013 01:19:02 +0000
X-SES-Outgoing: 2013.11.09-12.34.5.67
--HTMLDEMO527d8d851e72f
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: base64
VGhpcyBpcyB0aGUgcGxhaW4gdGV4dCB2ZXJzaW9uIQ==
--HTMLDEMO527d8d851e72f
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: text/html
The multipart headers are now double-spaced, causing the HTML to display as plain text. SES is clearly modifying the message headers (it added Message-ID, Date, and X-SES-Outgoing), so could that also be the culprit for the extra spaces in the multipart headers? When I send an identical email from a non-Amazon server, it comes through normally and renders the HTML like it should.
Also, when I send it as a simple HTML email (not multipart), then it works just fine.
Thanks.
I had got the same issue and I resolved it by changing the end of line character to '\n' instead of '\r\n'.

How do I change the order of HTTP request headers sent by Perl's LWP?

For a test i need to do a get requets to a website - unfortunatly when using perl lwp the "connection" appears in the header b4 the host. As a result the request gets filtered by the web application firewall. All i need is to remove or move down the connection line in the header. When i do the requets with my script:
use warnings;
use IO::Socket;
use LWP::UserAgent;
use LWP::Protocol::http;
use HTTP::Request;
my $ua = LWP::UserAgent->new();
push(#LWP::Protocol::http::EXTRA_SOCK_OPTS, SendTE => 0, PeerHTTPVersion => "1.1");
$ua->default_header(Cookie => 'XXX', User-Agent => 'whateva');
my $request = $ua->get('https://www.test.com/test.html?...');
....
The header looks like this:
GET /test.html?... HTTP/1.1
Connection: close
Host: www.test.com
User-Agent: whateva
Cookie: XXXX
BUT it should look like this to work (conenction comes after host):
GET /test.html?... HTTP/1.1
Host: www.test.com
Connection: close
User-Agent: whateva
Cookie: XXXX
How do i get rid of that connection line in LWP? I just need to re-oder it....Its not that it needs to be completly removed; I am happy to add it later in there again as
# $userAgent->default_header ("Connection" => "keep-alive");..
Thx a lot in advance!
To work around the bug in your firewall*, change
return _bytes(join($CRLF, "$method $uri HTTP/$ver", #h2, #h, "", $content));
in Net/HTTP.pm to
my #h3 = ( #h2, #h );
if (my ($idx) = grep /^Host:/, 0..$#h3) {
unshift(#h3, splice(#h3, $idx, 1));
}
return _bytes(join($CRLF, "$method $uri HTTP/$ver", #h3, "", $content));
* — According to the HTTP/1.1 spec, RFC 2616, "The order in which header fields with differing field names are received is not significant."

Perl MIME::Parser and encoding in nested bodys (message/rfc_822)

arghhh, it's not easy. I'm trying to parse some mails with perl. Let's take an example:
From: abc#def.de
Content-Type: multipart/mixed;
boundary="----_=_NextPart_001_01CBE273.65A0E7AA"
To: ghi#def.de
This is a multi-part message in MIME format.
------_=_NextPart_001_01CBE273.65A0E7AA
Content-Type: multipart/alternative;
boundary="----_=_NextPart_002_01CBE273.65A0E7AA"
------_=_NextPart_002_01CBE273.65A0E7AA
Content-Type: text/plain;
charset="UTF-8"
Content-Transfer-Encoding: base64
[base64-content]
------_=_NextPart_002_01CBE273.65A0E7AA
Content-Type: text/html;
charset="UTF-8"
Content-Transfer-Encoding: base64
[base64-content]
------_=_NextPart_002_01CBE273.65A0E7AA--
------_=_NextPart_001_01CBE273.65A0E7AA
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----_=_NextPart_003_01CBE272.13692C80"
From: bla#bla.de
To: xxx#xxx.de
This is a multi-part message in MIME format.
------_=_NextPart_003_01CBE272.13692C80
Content-Type: multipart/alternative;
boundary="----_=_NextPart_004_01CBE272.13692C80"
------_=_NextPart_004_01CBE272.13692C80
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
=20
Viele Gr=FC=DFe
------_=_NextPart_004_01CBE272.13692C80
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
<html>...</html>
------_=_NextPart_004_01CBE272.13692C80--
------_=_NextPart_003_01CBE272.13692C80
Content-Type: application/x-zip-compressed;
name="abc.zip"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="abc.zip"
[base64-content]
------_=_NextPart_003_01CBE272.13692C80--
------_=_NextPart_001_01CBE273.65A0E7AA--
This mail is sent from Outlook with another attached message. As you can see, this is a very complex mail with many different content types (text/plain, text/html, message/rfc_822, application/xyz)...
And the rfc_822 part is the problem. I've written a script in Perl 5.8 (Debian Squeeze) to parse this message with MIME::Parser.
use MIME::Parser;
my $parser = MIME::Parser->new;
$parser->output_to_core(1);
my $top_entity = $parser->parse(\*STDIN);
my $plain_body = "";
my $html_body = "";
my $content_type;
foreach my $part ($top_entity->parts_DFS) {
$content_type = $part->effective_type;
$body = $part->bodyhandle;
if ($body) {
if ($content_type eq 'text/plain') {
$plain_body = $plain_body . "\n" if ($plain_body ne '');
$plain_body = $plain_body . $body->as_string;
} elsif ($content_type eq 'text/html') {
$html_body = $html_body . "\n" if ($html_body ne '');
$html_body = $html_body . $body->as_string;
}
}
}
# parsing of attachment comes later
print $plain_body;
The first message part (base64-content) contains german umlauts, which are shown correctly at STDOUT. The nested rfc_822 message is parsed by MIME::Parser automatically and is pooled with the top level body as one entity. This nested rfc_822 contains also german umlauts in quoted-printable as you can see. But these are not shown correctly at STDOUT. When doing a
utf8::encode($plain_body);
before print, the quoted-printable umlauts are shown correctly, but not the base64 encoded ones. I'm trying now for hours to extract the rfc_822 seperatly and doing some encoding, but nothing helps. Who else can help?
Regards
Assuming that your console displays UTF-8, this make sense.
It correctly shows what you have decoded, but, of course, latin1 characters are not shown correctly.
Later, you do a conversion to UTF-8, but this does not make sense if the data is already UTF8. So only the former latin1 umlauts are shown.
There is no way to get this right without looking at the "charset" in the content-type and acting accordingly.