Detect duplicated header in HTTP::Response - perl

I have a problem with a HTTP::Response Perl object from a remote server that sometimes returns the HTTP response with duplicated 'Content-Length' headers.
When this occurs, if the content-length value is '43215', when I read the header value with:
print ($response->header('Content-length'));
the result is:
4321543215
How can I detect if the header is duplicated and access to the real value?

From the fine manual for HTTP::Headers:
A multi-valued field will be returned as separate values in list context and will be
concatenated with "," as separator in scalar context.
and this is list context:
print ($response->header('Content-length'))
So, $response->header() is returning both Content-length headers as a list and the result is, essentially:
print join('', 43215, 43215)
You can either use kork's $response->content_length() approach or grab all the Content-length headers in an array and use the first one as the length:
my #lengths = $response->header('Content-length');
my $length = $lengths[0];
If you end up getting multiple Content-length headers and they're different then someone is very confused.

You cannot detect this, at least not reliably. You could of course split the header value in the middle and try to find out if the left value is equal to the right but when you got sizes like 4444, you don't know if it's duplicated or not. The only chance to fix this is fixing it in the upstream server that sends you duplicated headers.
You could maybe try to access the content length via the content_length property:
$response->content_length
Maybe this is aware of duplicate headers, but i did not try it.

Related

Go (lang) parsing an email header and keeping order

I'm using net/mail library in Go, everything is great, however I want to pass in an original email and keep the order of the headers. This is important because the mail servers that pass the message on each add their headers in an order. Without order, its hard to know who received what, when and what headers each server added.
The net/mail library stores the headers in a map, which by definition has no concept of order. Seems a strange choice as header order is based only on order in the email, but it is the case.
Anyone got any suggestions as to how I can retain order the headers were read?
Thanks
The net/mail package uses the net/textproto package to parse the headers
(see ReadMessage()). Specifically, it uses ReadMIMEHeader() for
the headers, which is documented as:
The returned map m maps CanonicalMIMEHeaderKey(key) to a sequence of values
in the same order encountered in the input.
You can view the full source if you want, but the basic process is:
headers = make(map[string][]string)
for {
key, value := readNextHeader()
if key == "" {
return headers // End of headers
}
if headers[key] == nil {
headers[key] = []string{value}
} else {
headers[key] = append(headers[key], value)
}
}
It's true that the original order of the headers as they appeared in the message
is lost, but I'm not aware of any scenario where this truly matters. What
isn't lost is the order of the multi-values headers. The slice ensures they're
in the same order as they appeared in the email.
You can verify this with a simple program which loops over the headers and
compares the values (such as this one in the
Playground).
However, matching Received and Received-SPF headers is a bit more complex,
as:
not every Received header may have a corresponding Received-SPF header;
the Received-SPF header may not appear above the Received header; this is
recommended but not mandated by the RFC (besides, many programs don't
even follow the RFC, so this wouldn't be a guarantee anyway).
So you'll either need to parse the value of the headers and match them based on
that, or use the net/textproto package for more low-level access to the
headers. You can use the source of ReadMIMEHeader() as a starting point.

Net::IMAP::Simple get not getting the entire message

I'm pretty noobish in perl, so maybe I did something stupid.
My issue is that I'm currently using a script to get all the messages contained by an email box through the IMAP protocol, using Net::IMAP::Simple in PERL, but it does not gives me the entire body of the messages.
My entire code looks like:
use strict;
use Net::IMAP::Simple;
my $imap = Net::IMAP::Simple->new('xxxxxxxxxxxxxx') or die 'Impossible to connect '.$!;
$imap->login('xxxxxxxx', 'xxxxxxxxx') or die 'Login Error!';
my $nbmsg = $imap->select('INBOX') or die 'Impossible to reach this folder !';
print 'You have '.$nbmsg." messages in this folder !\n\n";
my $index = $imap->list() or die 'Impossible to list these messages !';
foreach my $msgnum (keys %$index) {
#if(!$imap->seen($msgnum)) {
my $msg = $imap->get($msgnum) or die 'Impossible to retrieve this message'.$msgnum.' !';
print $msg."\n\n";
# }
}
$imap->quit() or die 'quitting issue !';
And everytime that it is retrieving an email, it is giving me the first characters (which in my case are cryptics useless metadata generated by the bot that sending the messages), but not the entire body.
EDIT: Here is the body part displayed in the output:
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: BASE64
Q2UgbWVzc2FnZSBhIMOpdMOpIGfDqW7DqXLDqSBhdXRvbWF0aXF1ZW1lbnQgcGFyIGwnaW1wcmlt
YW50ZSBtdWx0aWZvbmN0aW9ucyBYZXJveCBYRVJPWF83ODMwLgogICBFbXBsYWNlbWVudCBkdSBz
eXN0w6htZSA6IAogICBBZHJlc3NlIElQIHN5c3TDqG1lIDogMTkyLjE2OC4xLjIwMAogICBBZHJl
c3NlIE1BQyBzeXN0w6htZSA6IDlDOjkzOjRFOjMzOjM1OkJECiAgIE1vZMOobGUgc3lzdMOobWUg
OiBYZXJveCBXb3JrQ2VudHJlIDc4MzAKICAgTnVtw6lybyBkZSBzw6lyaWUgc3lzdMOobWUgOiAz
OTEyNzk4ODk0CgpMJ0Fzc2lzdGFudCBkZSBjb21wdGV1ciBhIGVudm95w6kgbGUgcmVsZXbDqSBz
dWl2YW50IGF1IHNlcnZldXIgZGUgY29tbXVuaWNhdGlvbiBYZXJveCBTTWFydCBlU29sdXRpb25z
IGxlICAxNC8xMS8xNiAgIDA5OjI0OiAKICBUaXJhZ2VzIGVuIG5vaXIgOiAxMzIwNwogIFRpcmFn
ZXMgZW4gY291bGV1ciA6IDkyNjg3CiAgVG91cyBsZXMgdGlyYWdlcyA6IDEwNTg5NA==
It is always ending by this "==" btw, which is making me think that the module is shortening the output.
I looked after some details about it in the CPAN documentation but sadly didn't find anything.
Your messages are encoded as Base64. It's perfectly normal for emails to have that MIME type, though not required. You need to decode them. A good way to do that is to use MIME::Base64. Note that the == is part of the Base64 string. It's a padding to make the string have the right length.
use strict;
use warnings;
use MIME::Base64 'decode_base64';
my $decoded_msg = decode_base64($msg_body);
However, you need to get the body out of those message objects. The documentation is vague about that, it doesn't say what those objects are, and get only returns the raw message.
I suggest you install Data::Printer and use it to dump one of your $msg objects. That dump will include the internals of the object (which is likely a hash reference), and all the methods the object has. It's possible this object includes an accessor to get the already decoded content. If that's the case, you don't need to decode yourself. If not, grab the body out and decode it with decode_base64.
Update: I read the code, and it creates Net::IMAP::Simple::_message objects in the get method. There is a package definition at the top of the code. It's a bit complex, but it's obvious. It uses the arrayref of lines as the data structure behind the object, so I was wrong above.
q( package Net::IMAP::Simple::_message;
use overload fallback=>1, '""' => sub { local $"=""; "#{$_[0]}" };
sub new { bless $_[1] })
And further down:
return wantarray ? #lines : Net::IMAP::Simple::_message->new(\#lines)
So to get the body, you need to get rid of the header string. Once you've dumped out the object, you should see how many elements at the beginning of the array are the header and the empty line. I assume index 0 is the header line, and index 1 is the empty line. If you don't care about those, you can just throw them away.
This will change the object.
shift #$msg; # get rid of header
shift #$msg; # get rid of empty line
my $decoded_msg = decode_base64("$msg");

Overriding Content-Length header with Rack/Sinatra

In Sinatra, how can I override the Content-Length header in the response to set my own value?
The last line in my method returns the following:
[200, {'Content-Type' => component.content_type,
'Content-Length' => component.content_length.to_s}, component.content.data]
This way I was hoping to override the content value, but it results in an exception:
Unexpected error while processing request: Content-Length header was 2, but should be 0
I would like to return a different value for the content length. Is there a way to do this?
This error is being raised by the Rack::Lint middleware, so the quick fix would be to not use that piece of middleware. Depending on how you are starting your application that may be tricky though – Rack adds it in certain cases if you use rackup.
A better solution would be to change your client to use a HTTP HEAD request rather than a GET. In Sinatra defining a GET route automatically defines a matching HEAD route. Using HEAD will cause the server to send the headers but not the body, and you should be able to set the Content-Length to whatever you want. It will also avoid the Rack::Lint error.
Here is a gist explaining how to disable Rack::Lint:
module Rack
class Lint
def call(env = nil)
#app.call(env)
end
end
end
(Taken from https://gist.github.com/shtirlic/2146256).

Sending a video from Perl to a client over HTTP

I am currently making a perl script that will convert a file to webm/ogg/mp4 format and then send it back to the user but in embed video. It all works except that I can not send an EOF so the HTML5 video player knows what the end is and so he can correctly use the file (like going to a specific time and even knowing when the file has ended (now it just stops but you can't do anything anymore with the video.
Start-code:
elsif ($path =~ /^\/((\w|\d){11})\.webm$/ig) {
print "HTTP/1.0 200 OK\r\n";
$handler = \&resp_youtubemovie;
$handler->($cgi,$1);
Function to send webm file
sub resp_youtubemovie {
my $cgi = shift;
my $youtubeID = shift;
return if !ref $cgi;
open(movie,"<$youtubeID.webm");
local($/);
$movie = <movie>;
close(movie);
print "Content-type: movie/webm\n";
print $movie;
}
I've already tried with a while loop and a buffer but that doesn't work either, I've also tried to change the HTTP status code to 206 Partial Content because I wiresharked some other video streaming websites used it but it didn't matter. So anyone an idea how to open a movie file and stream it correctly?
Rather than doing this by hand, a framework like Dancer can take care of this. This will save you many, many, many headaches. It also allows you to take advantage of the Plack/PSGI superglue which figures out how to talk to web servers for you.
use Dancer;
get qr{/(\w{11}\.webm)$}i => sub {
my($video_file) = splat;
return send_file(
$video_file,
streaming => 1,
);
}
Using Dancer routes you should be able to adapt your existing code pretty easily especially if its a big if/elsif matching against various paths. Dancer does a very good job making simple things simple, it also gives you a huge amount of control over the exact HTTP response if you need it.
A few notes...
The content-type for webm is video/webm which may be the source of your problems. Dancer should just get it right. If not you can tell send_file the content type explicitly.
(\w|\d){11} is better written as \w{11} since \w includes \d.
You must use the 206 Partial Content HTTP status and you must also send:
The Accept-Range: bytes header.
A Content-Range: 0-2048/123456 header where you send the starting and ending byte index of the content followed by the total byte length of the content. The client will be sending you the byte ranges it wants in the request header. The client may send multiple byte ranges in a single request, in which case you'd also need to send the content with multipart word boundaries.
Finally, to get back to your question, if the client requests a byte range that isn't satisfiable then you send a 416 HTTP status and close the connection.

How can I extract non-standard HTTP headers using Perl's LWP?

I'm working with a web application that sends some non-standard HTTP headers in its response to a login request. The header in question is:
SSO_STATUS: LoginFailed
I tried to extract it with LWP::Response as $response->header('SSO_STATUS') but it does not work. It does work for standard headers such as Set-Cookie, Expires etc.
Is there a way to work with the raw headers?
if you see the documentation of HTTP::Headers, it states that
The header field name spelling is normally canonicalized
including the '_' to '-' translation. There are some application where
this is not appropriate. Prefixing field names with ':' allow you to
force a specific spelling. For example if you really want a header field
name to show up as foo_bar instead of "Foo-Bar", you might set it like
this:
$h->header(":foo_bar" => 1);
These field names are returned with the ':' intact for
$h->header_field_names and the $h->scan callback, but the colons do
not show in $h->as_string.
See this thread on Perlmonks.
You need to access the value of the header field as $response->header('SSO-STATUS').
The syntax for setting fields with underscores in names:
$response->header(':SSO_STATUS' => 'foo');