How to post array form using LWP - perl

I am having problems creating an array that I can pass as a form using LWP. Basic code is
my $ua = LWP::UserAgent->new();
my %form = { };
$form->{'Submit'} = '1';
$form->{'Action'} = 'check';
for (my $i=0; $i<1; $i++) {
$form->{'file_'.($i+1)} = [ './test.txt' ];
$form->{'desc_'.($i+1)} = '';
}
$resp = $ua->post('http://someurl/test.php', 'Content_Type' => 'multipart/form-data'
, 'Content => [ \%form ]');
if ($resp->is_success()) {
print "OK: ", $resp->content;
}
} else {
print $claimid->as_string;
}
I guess I am not creating the form array correctly or using the wrong type as when I check the _POST variables in test.php nothing has been set :(

The problem is that for some reason you've enclosed your form values in single quotes. You want to send the data structure. E.g.:
$resp = $ua->post('http://someurl/test.php',
'Content_Type' => 'multipart/form-data',
'Content' => \%form);
You want to either send the hash reference of %form, not the has reference contained within an array reference as you had ([ \%form ]). If you had wanted to send the data as an array reference, then you'd just use[ %form ]` which populates the array with the key/value pairs from the hash.
I'd suggest that you read the documentation for HTTP::Request::Common, the POST section in particular for a cleaner way of doing this.

Related

How do I do a chunked transfer-encoding upload with WWW:Mechanize?

I'm attempting to use a particular web service, and I can successfully perform the upload with the following command:
curl -X POST --header "Transfer-Encoding: chunked" -d #Downloads/file.pdf https://some.webservice/upload
I get back a json response indicate success.
However, I'm unable to figure out how to do the same with WWW::Mechanize.
$mech->post("https://" . $server . "/upload", Content_Type => 'multipart/form-data', Content => [upID => $upid, name => $dlfile, userID => 0, userK => 0, file_0 => [$dlfile]]);
This receives a similar json response with a big fat error message in it.
Do I need to explicitly set the Transfer-Encoding header first? Is there some other trick to it? Google's not shedding much light on this, Perlmonks neither, and the documentation's a little obtuse.
You can do it using HTTP::Request::StreamingUpload
my $starttime = time();
my $req = HTTP::Request::StreamingUpload->new(
POST => $url,
path => $file,
headers => HTTP::Headers->new(
'Transfer-Encoding' => 'chunked'
),
);
my $gen = $req->content;
die unless ref($gen) eq "CODE";
my $total = 0;
$req->content(sub {
my $chunk = &$gen();
$total += length($chunk);
print "\r$total / $size bytes ("
. int($total/$size*100)
. "%) sent, "
. int($total/1000/(time()-$starttime+1))
. " k / sec ";
return $chunk;
});
my $resp = $ua->request($req);
print "\n";
unless ($resp->is_success) {
die "Failed uploading the file: ", $resp->status_line;
}
my $con = $resp->content;
return $con;
Do you really need WWW::Mechanize? It is a subclass of LWP::UserAgent with additional functionality that gives browser-like functionality like filling in and submitting forms, clicking links, a page history with a "back" operation etc. If you don't need all of that then you may as well use LWP::UserAgent directly
Either way, the post method is inherited unchanged from LWP::UserAgent, and it's fine to use it directly as you have done
The way to send a chunked POST is to set the Content to a reference to a subroutine. The subroutine must return the next chunk of data each time it is called, and finally ann empty string or undef when there is no more to send
Is the data supposed to be a JSON string?
It's easiest to write a factory subroutine that returns a closure, like this
sub make_callback {
my ($data) = shift;
sub { substr($data, 0, 512, "") }
}
Then you can call post like this
my $payload = to_json(...);
$mech->post(
"https://$server/upload",
Content_Type => 'multipart/form-data',
Content => make_callback($payload)
);
Please be aware that all of this is untested

How to add multiple custom fields to a wp_query in a shortcode?

In a shortcode I can limit wp_query results by custom field values.
Example:
[my-shortcode meta_key=my-custom-field meta_value=100,200 meta_compare='IN']
And obviously it's possible to use multiple custom fields in a wp_query like WP_Query#Custom_Field_Parameters
But how can I use multiple custom fields in my shortcode? At the moment I do pass all the shortcode parameters with $atts.
On of a few different solutions might be to use a JSON encoded format for the meta values. Note that this isn't exactly user-friendly, but certainly would accomplish what you want.
You would of course need to generate the json encoded values first and ensure they are in the proper format. One way of doing that is just using PHP's built in functions:
// Set up your array of values, then json_encode them with PHP
$values = array(
array('key' => 'my_key',
'value' => 'my_value',
'operator' => 'IN'
),
array('key' => 'other_key',
'value' => 'other_value',
)
);
echo json_encode($values);
// outputs: [{"key":"my_key","value":"my_value","operator":"IN"},{"key":"other_key","value":"other_value"}]
Example usage in the shortcode:
[my-shortcode meta='[{"key":"my_key","value":"my_value","operator":"IN"},{"key":"other_key","value":"other_value"}]']
Which then you would parse out in your shortcode function, something like so:
function my_shortcode($atts ) {
$meta = $atts['meta'];
// decode it "quietly" so no errors thrown
$meta = #json_decode( $meta );
// check if $meta set in case it wasn't set or json encoded proper
if ( $meta && is_array( $meta ) ) {
foreach($meta AS $m) {
$key = $m->key;
$value = $m->value;
$op = ( ! empty($m->operator) ) ? $m->operator : '';
// put key, value, op into your meta query here....
}
}
}
Alternate Method
Another method would be to cause your shortcode to accept an arbitrary number of them, with matching numerical indexes, like so:
[my-shortcode meta-key1="my_key" meta-value1="my_value" meta-op1="IN
meta-key2="other_key" meta-value2="other_value"]
Then, in your shortcode function, "watch" for these values and glue them up yourself:
function my_shortcode( $atts ) {
foreach( $atts AS $name => $value ) {
if ( stripos($name, 'meta-key') === 0 ) {
$id = str_ireplace('meta-key', '', $name);
$key = $value;
$value = (isset($atts['meta-value' . $id])) ? $atts['meta-value' . $id] : '';
$op = (isset($atts['meta-op' . $id])) ? $atts['meta-op' . $id] : '';
// use $key, $value, and $op as needed in your meta query
}
}
}

POST array to LWP: only first entry getting posted

Here's section of code and i am trying to POST whole array using LWP but server is receiving only first row of array(0 index) while others are not getting sent to server, please guide what i am doing wrong
$data_post[0] = "text1";
$data_post[1] = "text2";
$data_post[2] = "texxt3";
$data_post[3] = "text4";
$data_post[4] ="text5";
my $ua= LWP::UserAgent->new();
my $response = $ua->post( $url, { 'istring' => #data_post} );
my $content = $response->decoded_content();
my $cgi = CGI->new();
print $cgi->header(), $content;
You can't assign an array to a hash key, only a scalar. Your attempt will expand the array and send this:
{ "istring" => "text1", "text2" => "texxt3", "text4" => "text5" }
Use an array ref instead by putting the "take a reference" operator in front of the array:
{ istring => \#data_post }

www::curl - how to upload (post) large files

I use WWW::Curl to upload files:
use WWW::Curl::Easy 4.14;
use WWW::Curl::Form;
my $url = 'http://example.com/backups/?sid=12313qwed323';
my $params = {
name => 'upload',
action => 'keep',
backup1 => [ '/tmp/backup1.zip' ], # 1st file for upload
};
my $form = WWW::Curl::Form->new();
foreach my $k (keys %{$params}) {
if (ref $params->{$k}) {
$form->formaddfile(#{$params->{$k}}[0], $k, 'multipart/form-data');
} else {
$form->formadd($k, $params->{$k});
}
}
my $curl = WWW::Curl::Easy->new() or die $!;
$curl->setopt(CURLOPT_HTTPPOST, $form);
$curl->setopt(CURLOPT_URL, $url);
my $body;
$curl->setopt(CURLOPT_WRITEDATA, \$body);
my $retcode = $curl->perform();
my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
nothing special here and this code works well.
I want to upload large files and I don't want to preload everything in the memory. At least that is what I heard that libcurl is doing.
CURLOPT_READFUNCTION accepts callbacks which returns parts of the content. That means that I cannot use WWW::Curl::Form to set POST parameters but that I have to return the whole content through this callback. Is that right?
I think that the code could look like this:
use WWW::Curl::Easy 4.14;
my $url = 'http://example.com/backups/?sid=12313qwed323'
my $params = {
name => 'upload',
action => 'keep',
backup1 => [ '/tmp/backup1.zip' ], # 1st file for upload
};
my $fields;
foreach my $k (keys %{$params}) {
$fields .= "$k=".(ref $params->{$k} ? '#'.#{$params->{$k}}[0] : uri_escape_utf8($params->{$k}))."&";
}
chop($fields);
my $curl = WWW::Curl::Easy->new() or die $!;
$curl->setopt(CURLOPT_POST, 1);
$curl->setopt(CURLOPT_POSTFIELDS, $fields); # is it needed with READFUNCTION??
$curl->setopt(CURLOPT_URL, $url);
my #header = ('Content-type: multipart/form-data', 'Transfer-Encoding: chunked');
$curl->setopt(CURLOPT_HTTPHEADER, \#header);
#$curl->setopt(CURLOPT_INFILESIZE, $size);
$curl->setopt(CURLOPT_READFUNCTION, sub {
# which data to return here?
# $params (without file) + file content?
return 0;
});
Which data does CURLOPT_READFUNCTION callback have to return? $params + File(s) content? In which format?
Do I really have to create the data (returned by CURLOPT_READFUNCTION) by myself or is there a simple way to create it in the right format?
Thanks
Test 16formpost.t is relevant. As you can see, it's completely disabled. This fact and my fruitless experiments with various return values for the callback function lets me believe the CURLOPT_READFUNCTION feature is known broken in the Perl binding.
I have to return the whole content through this callback. Is that right?
No, you can feed it the request body piecewise, suitable for chunked encoding. The callback will be necessarily called several times, according to the limit set in CURLOPT_INFILESIZE.
Which data does CURLOPT_READFUNCTION callback have to return?
A HTTP request body. Since you do a file upload, this means Content-Type multipart/form-data. Following is an example using HTTP::Message. CURLOPT_HTTPPOST is another way to construct this format.
use HTTP::Request::Common qw(POST);
use WWW::Curl::Easy 4.14;
my $curl = WWW::Curl::Easy->new or die $!;
$curl->setopt(CURLOPT_POST, 1);
$curl->setopt(CURLOPT_URL, 'http://localhost:5000');
$curl->setopt(CURLOPT_HTTPHEADER, [
'Content-type: multipart/form-data', 'Transfer-Encoding: chunked'
]);
$curl->setopt(CURLOPT_READFUNCTION, sub {
return POST(undef, Content_Type => 'multipart/form-data', Content => [
name => 'upload',
action => 'keep',
backup1 => [ '/tmp/backup1.zip' ], # 1st file for upload
])->content;
});
my $r = $curl->perform;
The CURLOPT_READFUNCTION callback is only used for chunked tranfer encoding. It may work, but I haven't been able to get it to and found that doing so wasn't necessary anyway.
My use case was for upload of data to AWS, where it's not ok to upload the data as multi-part form data. Instead, it's a straight POST of the data. It does require that you know how much data you're sending the server, though. This seems to work for me:
my $infile = 'file-to-upload.json';
my $size = -s $infile;
open( IN, $infile ) or die("Cannot open file - $infile. $! \n");
my $curl = WWW::Curl::Easy->new;
$curl->setopt(CURLOPT_HEADER, 1);
$curl->setopt(CURLOPT_NOPROGRESS, 1);
$curl->setopt(CURLOPT_POST, 1);
$curl->setopt(CURLOPT_URL, $myPostUrl);
$curl->setopt(CURLOPT_HTTPHEADER,
['Content-Type: application/json']); #For my use case
$curl->setopt(CURLOPT_POSTFIELDSIZE_LARGE, $size);
$curl->setopt(CURLOPT_READDATA, \*IN);
my $retcode = $curl->perform;
if ($retcode == 0) {
print("File upload success\n");
}
else {
print("An error happened: $retcode ".$curl->strerror($retcode)."\n");
}
The key is providing an open filehandle reference to CURLOPT_READDATA. After that, the core curl library handles the reads from it without any need for callbacks.

Web::Scraper and Perl

I have the following script that scrapes my schools CS department to get a list of all the courses. I want to be able to extract the CRN (course number) and other important information to put into a database which I can let users browse through a web app.
Here is an example URL:
http://courses.illinois.edu/cis/2011/spring/schedule/CS/411.html
I would like to extract info from pages like this. The first level of the scraper just constructs the individual sites from a list of all of the courses. Once I'm at a course specific catalog page, I use the second scraper to attempt to get all of this info i want. For some reason, although the CRN's and Course Instructors are all 'td' elements. My scraper seems to be returning nothing when scraping. I tried to scrape specifically for 'div' instead and I get a bunch of info for each relevant page. So somehow I'm failing to get the 'td' element, but I'm scraping from the right page.
my $tweets = scraper {
# Parse all LIs with the class "status", store them into a resulting
# array 'tweets'. We embed another scraper for each tweet.
# process "h4.ws-ds-name.detail-title", "array[]" => 'TEXT';
process "div.ws-row", "array[]" => 'TEXT';
};
my $res = $tweets->scrape( URI- >new("http://courses.illinois.edu/cis/2011/spring/schedule/CS/index.html?skinId=2169") );
foreach my $elem (#{$res->{array}}){
my $coursenum = substr($elem,2,4);
my $secondLevel = scraper{
process "td.ws-row", "array2[]" => 'TEXT';
};
my $res2 = $secondLevel->scrape(URI- >new("http://courses.illinois.edu/cis/2011/spring/schedule/CS/$coursenum.html"));
my $num = #{$res2->{array2}};
print $num;
print "---------------------", "\n";
my #curr = #{$res2->{array2}};
foreach my $elem2 (#curr){
$num++;
print $elem2, " ", "\n";
}
print "---------------------", "\n";
}
Any ideas?
Thanks
Looks to me like
my $coursenum = substr($elem,2,4)
should be
my $coursenum = substr($elem,3,3)
The easiest way to go in this case is use
HTML::TableExtract
In case you are looking for data from the table only.
I played a bit with your problem. You can get course id, title and link to individual course page within initial scraper:
my $courses = scraper {
process 'div.ws-row',
'course[]' => scraper {
process 'div.ws-course-number', 'id' => 'TEXT';
process 'div.ws-course-title', 'title' => 'TEXT';
process 'div.ws-course-title a', 'link' => '#href';
};
result 'course';
};
The result of scraping is arrayref with hashrefs like this:
{ id => "CS 103",
title => "Introduction to Programming",
link => bless(do{\(my $o = "http://courses.illinois.edu/cis/2011/spring/schedule/CS/103.html?skinId=2169")}, "URI::http"),
},
....
Then you can do additional scraping for each course from their individual pages and add such information into original structure:
for my $course (#$res) {
my $crs_scraper = scraper {
process 'div.ws-description', 'desc' => 'TEXT';
# ... add more items here
};
my $additional_data = $crs_scraper->scrape(URI->new($course->{link}));
# slice assignment to add them into course definition
#{$course}{ keys %$additional_data } = values %$additional_data;
}
Source combined together is as follows:
use strict; use warnings;
use URI;
use Web::Scraper;
use Data::Dump qw(dump);
my $url = 'http://courses.illinois.edu/cis/2011/spring/schedule/CS/index.html?skinId=2169';
my $courses = scraper {
process 'div.ws-row',
'course[]' => scraper {
process 'div.ws-course-number', 'id' => 'TEXT';
process 'div.ws-course-title', 'title' => 'TEXT';
process 'div.ws-course-title a', 'link' => '#href';
};
result 'course';
};
my $res = $courses->scrape(URI->new($url));
for my $course (#$res) {
my $crs_scraper = scraper {
process 'div.ws-description', 'desc' => 'TEXT';
# ... add more items here
};
my $additional_data = $crs_scraper->scrape(URI->new($course->{link}));
# slice assignment to add them into course definition
#{$course}{ keys %$additional_data } = values %$additional_data;
}
dump $res;