How do I do a chunked transfer-encoding upload with WWW:Mechanize? - perl

I'm attempting to use a particular web service, and I can successfully perform the upload with the following command:
curl -X POST --header "Transfer-Encoding: chunked" -d #Downloads/file.pdf https://some.webservice/upload
I get back a json response indicate success.
However, I'm unable to figure out how to do the same with WWW::Mechanize.
$mech->post("https://" . $server . "/upload", Content_Type => 'multipart/form-data', Content => [upID => $upid, name => $dlfile, userID => 0, userK => 0, file_0 => [$dlfile]]);
This receives a similar json response with a big fat error message in it.
Do I need to explicitly set the Transfer-Encoding header first? Is there some other trick to it? Google's not shedding much light on this, Perlmonks neither, and the documentation's a little obtuse.

You can do it using HTTP::Request::StreamingUpload
my $starttime = time();
my $req = HTTP::Request::StreamingUpload->new(
POST => $url,
path => $file,
headers => HTTP::Headers->new(
'Transfer-Encoding' => 'chunked'
),
);
my $gen = $req->content;
die unless ref($gen) eq "CODE";
my $total = 0;
$req->content(sub {
my $chunk = &$gen();
$total += length($chunk);
print "\r$total / $size bytes ("
. int($total/$size*100)
. "%) sent, "
. int($total/1000/(time()-$starttime+1))
. " k / sec ";
return $chunk;
});
my $resp = $ua->request($req);
print "\n";
unless ($resp->is_success) {
die "Failed uploading the file: ", $resp->status_line;
}
my $con = $resp->content;
return $con;

Do you really need WWW::Mechanize? It is a subclass of LWP::UserAgent with additional functionality that gives browser-like functionality like filling in and submitting forms, clicking links, a page history with a "back" operation etc. If you don't need all of that then you may as well use LWP::UserAgent directly
Either way, the post method is inherited unchanged from LWP::UserAgent, and it's fine to use it directly as you have done
The way to send a chunked POST is to set the Content to a reference to a subroutine. The subroutine must return the next chunk of data each time it is called, and finally ann empty string or undef when there is no more to send
Is the data supposed to be a JSON string?
It's easiest to write a factory subroutine that returns a closure, like this
sub make_callback {
my ($data) = shift;
sub { substr($data, 0, 512, "") }
}
Then you can call post like this
my $payload = to_json(...);
$mech->post(
"https://$server/upload",
Content_Type => 'multipart/form-data',
Content => make_callback($payload)
);
Please be aware that all of this is untested

Related

Delayed response to slash command with Mojolicious in Perl

I am trying to create a slack application in Perl with mojolicious and I am having the following use case:
Slack sends a request to my API from a slash command and needs a response in a 3 seconds timeframe. However, Slack also gives me the opportunity to send up to 5 more responses in a 30 minute timeframe but still needs an initial response in 3 seconds (it just sends a "late_response_url" in the initial call back so that I could POST something to that url later on). In my case I would like to send an initial response to slack to inform the user that the operation is "running" and after a while send the actual outcome of my slow function to Slack.
Currently, I can do this by spawning a second process using fork() and using one process to respond imidiately as Slack dictates and the second to do the rest of the work and respond later on.
I am trying to do this with Mojolicious' subprocesses to avoid using fork(). However I can't find a way to get this to work....
a sample code of what I am already doing with fork is like this:
sub withpath
{
my $c = shift;
my $user = $c->param('user_name');
my $response_body = {
response_type => "ephemeral",
text => "Running for $user:",
attachments => [
{ text => 'analyze' },
],
};
my $pid = fork();
if($pid != 0){
$c->render( json => $response_body );
}else{
$output = do_time_consuming_things()
$response_body = {
response_type => "in-channel",
text => "Result for $user:",
attachments => [
{ text => $output },
],
};
my $ua = Mojo::UserAgent->new;
my $tx = $ua->post(
$response_url,
{ Accept => '*/*' },
json => $response_body,
);
if( my $res = $tx->success )
{
print "\n success \n";
}
else
{
my $err = $tx->error;
print "$err->{code} response: $err->{message}\n" if $err->{code};
print "Connection error: $err->{message}\n";
}
}
}
So the problem is that no matter how I tried I couldn't replicate the exact same code with Mojolicious' subproccesses. Any ideas?
Thanks in advance!
Actually I just found a solution to my problem!
So here is my solution:
my $c = shift; #receive request
my $user = $c->param('user_name'); #get parameters
my $response_url = $c->param('response_url');
my $text = $c->param('text');
my $response_body = { #create the imidiate response that Slack is waiting for
response_type => "ephemeral",
text => "Running for $user:",
attachments => [
{ text => 'analyze' },
],
};
my $subprocess = Mojo::IOLoop::Subprocess->new; #create the subprocesses
$subprocess->run(
sub {do_time_consuming_things($user,$response_url,$text)}, #this callback is the
#actuall subprocess that will run in background and contains the POST request
#from my "fork" code (with the output) that should send a late response to Slack
sub {# this is a dummy subprocess doing nothing as this is needed by Mojo.
my ($subprocess, $err, #results) = #_;
say $err if $err;
say "\n\nok\n\n";
}
);
#and here is the actual imidiate response outside of the subprocesses in order
#to avoid making the server wait for the subprocess to finish before responding!
$c->render( json => $response_body );
So I actually simply had to put my code of do_time_consuming_things in the first callback (in order for it to run as a subprocess) and use the second callback (that is actually linked to the parent process) as a dummy one and keep my "imidiate" response in the main body of the whole function instead of putting it inside one of the subprocesses. See code comments for more information!

Image request failing. Possible HTTPS issue

I'm working with some Perl code that I don't understand:
my $tmpdir = "XXX";
my $src = "tmp" . time . int rand 10 . ".jpg";
while ( -s "$tmpdir/$src" ) {
$src = "tmp" . time . int rand 10 . ".jpg";
}
my $ua = LWP::UserAgent->new( keep_alive => 0, timeout => 10 );
my $req = HTTP::Request->new(
"GET",
"https://www.com/act/couponimage/1877",
);
my $res = $ua->request( $req, "$tmpdir/$src" );
if ( ! $res || ! -s "$tmpdir/$src" ) {
header( 301, 0, 0, "https://www.com/" );
exit 0;
}
For some reason it hits the point where it redirects to the main page (the header code).
There is clearly something wrong here because it never executes past the last if clause which redirects. Maybe it has to do with using HTTPS?
It is not a filesystem issue. The following coupon image path works fine
Xxx.com/img/coupon-2600.jpg
The request should be working fine,
but as pointed out, you really should check what version of the packages you are using.
For me ( perl 5.22 - LWP::UserAgent 6.26 ) the call $ua->req( $req ) needs to be $ua->request( $req->uri ). And the file to store the response in must be given with $ua->request( $req->uri, :content_file => $fn ) .
I think the original author assumed the content file would be created by giving the filename to $ua->request. With :content_file, that is what's happening for me. He put the while ( -s $src ) in place to ensure he got a filename that was not yet taken.
Thus, I believe the solution to your problem is to change
my $res = $ua->request($req, "$tmpdir/$src") to
$ua->request($req->uri, ':content_file' => "$tmpdir/$src" );
The docs mention that using :content_file means the actual response content is no longer stored in the object, so you may have to adapt your code.
If all you need is the file, I would adapt the final if to check for the file that should now exist ( which should, I believe, only happen upon successful response ).
If what you need is the $res object to use later, I would probably opt to check the response worked, then print the content to the file you want it in, so just do
my $res = $ua->request( $req->uri );
adapt the if accordingly and print content to file.
I would personally prefer the latter, as I can check the response for success, then print: this seems much clearer.

Perl: How can i test for a URL ( https ) accepting GET requests using "login" parameter

I have a CGI server side script that accepts GET and POST, with login parameters.
I want to test it to make sure it is not vulnerable. So the plan is to use Perl LWP, and send login parameters in GET and POST, and compare the results. the interface has been changed, so that only in POST we can send user-name and password in session cookies ( not sure if that is a great idea ) , so how do i test it ? Here is what i have so far:
#!/usr/bin/perl
use LWP;
print "This is libwww-perl-$LWP::VERSION\n";
# Create a user agent object
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
$ua->agent("MyApp/0.1 ");
# Create a request
#my $req = HTTP::Request->new(POST => 'http://search.cpan.org/search');
#my $req = HTTP::Request->new(GET => 'https://qa.co.net:443/cgi-bin/n-cu.cgi');
my $req = HTTP::Request->new(GET => 'https://qa.co.net:443/cgi-bin/n-cu.cgi?mode=frameset&JScript=1&remote_user&login=foo&password=foo HTTP/1.1');
$req->content_type('application/x-www-form-urlencoded');
$req->content('query=libwww-perl&mode=dist');
# Pass request to the user agent and get a response back
my $res = $ua->request($req);
# Check the outcome of the response
if ($res->is_success) {
print $res->content;
#print $res->code;
#print $res->message;
}
else {
print $res->status_line, "\n";
}
This is not going to do it, since it does not have the session cookie stuff. But might be a good start though. Is this the right way to test the GET and POST ?
Here is what was implemented in the cgi:
#cr_login for POST && login for GET -- leave GET param as it used to be.
if ($m eq 'GET' && defined($req->param('login'))) {
$msg = 'parameter "login" is invalid for this request type.';
+ my $seclog = $event_logging_directory . '/invalid_request.log';
+ open(S, ">>$seclog") or die $!;
+ my $logmsg = sprintf("%4d-%02d-%02d %02d:%02d:%02d",Today_and_Now())
+ . "|mode:" . $req->param('mode')
+ . "|login:" . $req->param('login')
+ . "|remote_addr:" . $ENV{REMOTE_ADDR}
+ . "|$msg\n";
+ print S $logmsg;
and :
POST request to n-cu.cgi should use parameter "cr_login". If the parameter "login" is passed in a post request, it should throw error and return to login screen.
GET request to n-cu.cgi should use the parameter "login". If the parameter "cr_login" is passed in a post request, it should throw error and return to login screen.
so here is how we do it:
Keep the session cookie and context alive :
my $browser = LWP::UserAgent->new(keep_alive => 10);
$browser->cookie_jar( {} );
$browser->agent('Mozilla/8.0');
#$browser->ssl_opts({ verify_hostname => 0 });
$browser->show_progress(1);
and later: print the response
print "Cookies:\n", Dumper($browser->cookie_jar()), "\n\n";
my $content = $response->as_string;
print "$content\n";
Sending password in a cookie? Nope.
Disallow GET for /login.
POST username and password to /login, over SSL.
In CGI, the GET/POST is indicated via the REQUEST_METHOD environment variable.
You cannot stop determined people from issuing a GET request to your server, but you can refuse to process it like so (untested code - you have to fill in details):
if ($ENV{REQUEST_METHOD} ne 'POST') {
# issue a redirect to a suitable error page, then return.
}
my $q = CGI->new();
my $user = $q->params('username');
my $password = $q->params('password');
my $encrypted_password = my_password_encryptor($password);
unless ( can_log_in($user, $encrypted_password) ) {
# issue an error message - redirect&return or fall-through...
}
else {
$session->set_user_logged_in();
}
Most people do not roll their own authentication or session handling. They mostly use one from CPAN, or one included with the larger app framework. If you're doing CGI, you can use CGI::Session.
You might give CGI::Application and/or its offspring a look. Those authors have already solved a bunch of the problems that you're encountering.

Posting to a WEB API from Perl results in a null value in [FromBody]

We have created a WEB API (in .NET framework 4.0) and gave the endpoint info to one of our clients. They created a program in Perl that posts to our endpoint.
Every post they have made so far arrives into our endpoint as null. When we initially started programming, we had that same issue in JQuery when posting by means of $.ajax. We solved it by adding a '=' at the beginning of the post data.
The Perl code they have submitted is the following:
sub _postPackages {
my ($self,$dataToSend) = #_;
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
$ua->agent("integrationapp/1.0 ");
# Create a request
my $req = HTTP::Request->new(POST => $self->{postAddress} );
$req->content_type("application/json;charset=utf-8");
$req->content($dataToSend->{data});
#print Data::Dumper->Dump([$req]);
# Pass request to the user agent and get a response back
my $res = $ua->request($req);
where postAddress is our endpoint and $dataToSend is the message data. Is it possible that they need to add the '=' at the beginning of the $dataToSend message.
Any help will be greatly appreciated.
This is a bit of pseudo code here..
But I'm guessing you want to do something like this:
# some post sub
my ($self, $data) = #_;
my $ua = $self->get_user_agent();
my $json_xs = $self->get_json_xs();
my $json_encoded = $json_xs->utf8->encode($data);
$self->set_post_data($json_encoded);
$self->set_api_call();
my $response_body = $ua->post(
$self->get_api_call(),
'Content' => $self->get_post_data(),
'Content-type' => "application/json;charset=UTF-8"
);
print STDERR "POSTING NEW RESOURCE: " . Dumper($self);

www::curl - how to upload (post) large files

I use WWW::Curl to upload files:
use WWW::Curl::Easy 4.14;
use WWW::Curl::Form;
my $url = 'http://example.com/backups/?sid=12313qwed323';
my $params = {
name => 'upload',
action => 'keep',
backup1 => [ '/tmp/backup1.zip' ], # 1st file for upload
};
my $form = WWW::Curl::Form->new();
foreach my $k (keys %{$params}) {
if (ref $params->{$k}) {
$form->formaddfile(#{$params->{$k}}[0], $k, 'multipart/form-data');
} else {
$form->formadd($k, $params->{$k});
}
}
my $curl = WWW::Curl::Easy->new() or die $!;
$curl->setopt(CURLOPT_HTTPPOST, $form);
$curl->setopt(CURLOPT_URL, $url);
my $body;
$curl->setopt(CURLOPT_WRITEDATA, \$body);
my $retcode = $curl->perform();
my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
nothing special here and this code works well.
I want to upload large files and I don't want to preload everything in the memory. At least that is what I heard that libcurl is doing.
CURLOPT_READFUNCTION accepts callbacks which returns parts of the content. That means that I cannot use WWW::Curl::Form to set POST parameters but that I have to return the whole content through this callback. Is that right?
I think that the code could look like this:
use WWW::Curl::Easy 4.14;
my $url = 'http://example.com/backups/?sid=12313qwed323'
my $params = {
name => 'upload',
action => 'keep',
backup1 => [ '/tmp/backup1.zip' ], # 1st file for upload
};
my $fields;
foreach my $k (keys %{$params}) {
$fields .= "$k=".(ref $params->{$k} ? '#'.#{$params->{$k}}[0] : uri_escape_utf8($params->{$k}))."&";
}
chop($fields);
my $curl = WWW::Curl::Easy->new() or die $!;
$curl->setopt(CURLOPT_POST, 1);
$curl->setopt(CURLOPT_POSTFIELDS, $fields); # is it needed with READFUNCTION??
$curl->setopt(CURLOPT_URL, $url);
my #header = ('Content-type: multipart/form-data', 'Transfer-Encoding: chunked');
$curl->setopt(CURLOPT_HTTPHEADER, \#header);
#$curl->setopt(CURLOPT_INFILESIZE, $size);
$curl->setopt(CURLOPT_READFUNCTION, sub {
# which data to return here?
# $params (without file) + file content?
return 0;
});
Which data does CURLOPT_READFUNCTION callback have to return? $params + File(s) content? In which format?
Do I really have to create the data (returned by CURLOPT_READFUNCTION) by myself or is there a simple way to create it in the right format?
Thanks
Test 16formpost.t is relevant. As you can see, it's completely disabled. This fact and my fruitless experiments with various return values for the callback function lets me believe the CURLOPT_READFUNCTION feature is known broken in the Perl binding.
I have to return the whole content through this callback. Is that right?
No, you can feed it the request body piecewise, suitable for chunked encoding. The callback will be necessarily called several times, according to the limit set in CURLOPT_INFILESIZE.
Which data does CURLOPT_READFUNCTION callback have to return?
A HTTP request body. Since you do a file upload, this means Content-Type multipart/form-data. Following is an example using HTTP::Message. CURLOPT_HTTPPOST is another way to construct this format.
use HTTP::Request::Common qw(POST);
use WWW::Curl::Easy 4.14;
my $curl = WWW::Curl::Easy->new or die $!;
$curl->setopt(CURLOPT_POST, 1);
$curl->setopt(CURLOPT_URL, 'http://localhost:5000');
$curl->setopt(CURLOPT_HTTPHEADER, [
'Content-type: multipart/form-data', 'Transfer-Encoding: chunked'
]);
$curl->setopt(CURLOPT_READFUNCTION, sub {
return POST(undef, Content_Type => 'multipart/form-data', Content => [
name => 'upload',
action => 'keep',
backup1 => [ '/tmp/backup1.zip' ], # 1st file for upload
])->content;
});
my $r = $curl->perform;
The CURLOPT_READFUNCTION callback is only used for chunked tranfer encoding. It may work, but I haven't been able to get it to and found that doing so wasn't necessary anyway.
My use case was for upload of data to AWS, where it's not ok to upload the data as multi-part form data. Instead, it's a straight POST of the data. It does require that you know how much data you're sending the server, though. This seems to work for me:
my $infile = 'file-to-upload.json';
my $size = -s $infile;
open( IN, $infile ) or die("Cannot open file - $infile. $! \n");
my $curl = WWW::Curl::Easy->new;
$curl->setopt(CURLOPT_HEADER, 1);
$curl->setopt(CURLOPT_NOPROGRESS, 1);
$curl->setopt(CURLOPT_POST, 1);
$curl->setopt(CURLOPT_URL, $myPostUrl);
$curl->setopt(CURLOPT_HTTPHEADER,
['Content-Type: application/json']); #For my use case
$curl->setopt(CURLOPT_POSTFIELDSIZE_LARGE, $size);
$curl->setopt(CURLOPT_READDATA, \*IN);
my $retcode = $curl->perform;
if ($retcode == 0) {
print("File upload success\n");
}
else {
print("An error happened: $retcode ".$curl->strerror($retcode)."\n");
}
The key is providing an open filehandle reference to CURLOPT_READDATA. After that, the core curl library handles the reads from it without any need for callbacks.