LWP POST request not working - perl

The if statement shows that the request succeeded, but when I try to print the response I get nothing.
use LWP::UserAgent;
use strict;
use warnings;
use HTTP::Request::Common;
# use this {"process": "mobileGps","phone": "9565551236"}
my $url = "the url goes here";
my $json = '{data :{"process" : "mobileGps", "phone" : "9565551236"}}';
my $req = HTTP::Request->new( POST => $url );
$req->header( 'Content-Type' => 'application/json' );
$req->content( $json );
my $ua = LWP::UserAgent->new;
my $res = $ua->request( $req );
if ( $res->is_success ) {
print "It worked";
print $res->decoded_content;
}
else {
print $res->code;
}
I do have the URL: I just took it out for the purpose of this example.
What am I missing?

Try debugging your script like this:
use strict;
use warnings;
use HTTP::Request::Common;
use LWP::ConsoleLogger::Easy qw( debug_ua );
use LWP::UserAgent;
# use this {"process": "mobileGps","phone": "9565551236"}
my $url = "the url goes here";
my $json = '{data :{"process" : "mobileGps", "phone" : "9565551236"}}';
my $req = HTTP::Request->new(POST => $url);
$req->header('Content-Type' =>'application/json');
$req->content($json);
my $ua = LWP::UserAgent->new;
debug_ua( $ua );
my $res = $ua->request($req);
if ($res->is_success) {
print "It worked";
print $res->decoded_content;
} else {
print $res->code;
}
That will (hopefully) give you a better idea of what's going on.
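If LWP::ConsoleLogger is not installed, LWP::UserAgent's own request_send and response_done handlers can give a rougher view of the same thing. This is only a sketch: the canned 200 response with an empty body is a stand-in so the script runs without a network connection (and it happens to mimic the symptom in the question). In a real script the request_send handler would return nothing so the request actually goes out.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;

my @log;
my $ua = LWP::UserAgent->new;

# Called just before the request would be sent.
$ua->add_handler( request_send => sub {
    my ($request) = @_;
    push @log, 'SEND ' . $request->method . ' ' . $request->uri;
    # Stand-in for the real server: returning a response here stops LWP
    # from contacting the network. Return nothing in real use.
    return HTTP::Response->new( 200, 'OK', [], '' );
} );

# Called once the response is complete.
$ua->add_handler( response_done => sub {
    my ($response) = @_;
    push @log, 'DONE ' . $response->status_line
        . ' body=' . length( $response->content ) . ' bytes';
    return;
} );

my $req = HTTP::Request->new( POST => 'http://example.com/' );
$req->header( 'Content-Type' => 'application/json' );
$req->content('{}');

my $res = $ua->request($req);
print "$_\n" for @log;
```

A success status with a zero-byte body shows up immediately in the log, which is exactly the case where is_success is true but printing decoded_content shows nothing.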

Can you not use the debugger, or add some print statements to see how your program is progressing?
If not, then this is going to be another case of on-line turn-by-turn debugging, which benefits no one except the OP, and the ultimate diagnosis is that they should have learned the language first.
The internet can be wise, but it will make many more pretenders than craftsmen.
Please don't ever expect to make a half-hearted attempt at a sketch, and then rope in the rest of the world to finish your job. It takes a huge amount of experience, aptitude, and understanding to get even a "What's your name" .. "Hello" program working, and things only get harder thereafter
If you don't like being careful and thorough, and would rather ask for people to do your stuff for you than discover a solution by experimentation, then you are a manager, not a programmer. I hope you will never try to advance a software career by getting great at delegating, because that doesn't work with software
Here. Use this as you will. The world is full of managers; it is good programmers that we need
use strict;
use warnings 'all';
use feature 'say';
use constant URL => 'http://example.com/';
use LWP;
my $ua = LWP::UserAgent->new;
my $json = '{}';
my $req = HTTP::Request->new( POST => URL );
$req->header( content_type => 'application/json' );
$req->content( $json );
my $res = $ua->request( $req );
say $res->as_string;

The code is fine. If the server answers with status code 200 but an empty body, the problem is on the server side, so you should check there.
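One thing worth ruling out: the $json string in the question is not strict JSON, because the key data is not quoted. Whether that particular server cares is an assumption on my part, but building the body with a JSON encoder avoids the question entirely. A minimal sketch using the core JSON::PP module, with the payload taken from the comment in the question:

```perl
use strict;
use warnings;
use JSON::PP;

# Payload from the question, wrapped in a "data" key as in the original string.
my %payload = ( data => { process => 'mobileGps', phone => '9565551236' } );

# canonical() sorts the keys so the output is stable between runs.
my $json = JSON::PP->new->canonical->encode( \%payload );
print "$json\n";

# The hand-written string from the question does not parse as strict JSON.
my $handwritten = '{data :{"process" : "mobileGps", "phone" : "9565551236"}}';
my $parsed_ok = eval { JSON::PP->new->decode($handwritten); 1 };
print $parsed_ok
    ? "hand-written string parsed\n"
    : "hand-written string is not valid JSON\n";
```

The encoded string can then be passed to $req->content( $json ) exactly as in the original script.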

Related

HTML::TableExtract an HTTPS site

I've created a perl script to use HTML::TableExtract to scrape data from tables on a site.
It works great to dump out table data for unsecured sites (i.e. HTTP site), but when I try HTTPS sites, it doesn't work (the tables_report line just prints blank.. it should print a bunch of table data).
However, if I take the content of that HTTPS page, and save it to an html file and then post it on an unsecured HTTP site (and change my content to point to this HTTP page), this script works as expected.
Anyone know how I can get this to work over HTTPS?
#!/usr/bin/perl
use lib qw( ..);
use HTML::TableExtract;
use LWP::Simple;
use Data::Dumper;
# DOESN'T work:
my $content = get("https://datatables.net/");
# DOES work:
# my $content = get("http://www.w3schools.com/html/html_tables.asp");
my $te = HTML::TableExtract->new();
$te->parse($content);
print $te->tables_report(show_content=>1);
print "\n";
print "End\n";
The sites mentioned above for $content are just examples.. these aren't really the sites I'm extracting, but they work just like the site I'm really trying to scrape.
One option I guess is for me to use perl to download the page locally first and extract from there, but I'd rather not, if there's an easier way to do this (anyone that helps, please don't spend any crazy amount of time coming up with a complicated solution!).
The problem is the User-Agent string that LWP::Simple sends, which that site blocks. Use LWP::UserAgent and set a user agent the site accepts, like this:
use strict;
use warnings;
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
my $url = 'https://datatables.net/';
$ua->agent("Mozilla/5.0"); # set user agent
my $res = $ua->get($url); # send request
# check the outcome
if ($res->is_success) {
# ok -> I simply print the content in this example, you should parse it
print $res->decoded_content;
}
else {
# ko
print "Error: ", $res->status_line, "\n";
}
This is because datatables.net is blocking LWP::Simple requests. You can confirm this with the code below:
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
print is_success(getprint("https://datatables.net/"));
Output:
$ perl test.pl
403 Forbidden <URL:https://datatables.net/>
You could try using LWP::RobotUA. The code below works fine for me.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::RobotUA;
use HTML::TableExtract;
my $ua = LWP::RobotUA->new( 'bot_chankey/1.1', 'chankeypathak@stackoverflow.com' );
$ua->delay(5/60); # 5 second delay between requests
my $response = $ua->get('https://datatables.net/');
if ( $response->is_success ) {
my $te = HTML::TableExtract->new();
$te->parse($response->content);
print $te->tables_report(show_content=>1);
}
else {
die $response->status_line;
}
In the end, a combination of Miguel and Chankey's responses provided my solution. Miguel's made up most of my code, so I selected that as the answer, but here is my "final" code (got a lot more to do, but this is all I couldn't figure out.. the rest should be no problem).
I couldn't quite get either mentioned by Miguel/Chankey to work, but they got me 99% of the way.. then I just had to figure out how to get around the error "certificate verify failed". I found that answer with Miguel's method right away, so in the end, I mostly used his code, but both responses were great!
#!/usr/bin/perl
use lib qw( ..);
use strict;
use warnings;
use LWP::UserAgent;
use HTML::TableExtract;
use LWP::RobotUA;
use Data::Dumper;
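# Caution: this passes the string 'SSL_VERIFY_PEER', not the IO::Socket::SSL
# constant of that name. In numeric context the string is 0, so this appears
# to switch certificate verification off rather than on.
my $ua = LWP::UserAgent->new(
ssl_opts => { SSL_verify_mode => 'SSL_VERIFY_PEER' },
);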
my $url = 'https://WebsiteIUsedWasSomethingElse.com';
$ua->agent("Mozilla/5.0"); # set user agent
my $res = $ua->get($url); # send request
# check the outcome
if ($res->is_success)
{
my $te = HTML::TableExtract->new();
$te->parse($res->content);
print $te->tables_report(show_content=>1);
}
else {
# ko
print "Error: ", $res->status_line, "\n";
}
my $url = "https://ohsesfire01.summit.network/reports/slices";
my $user = 'xxxxxx';
my $pass = 'xxxxxx';
my $ua = LWP::UserAgent->new;
my $request = HTTP::Request->new( GET => $url );
# authenticate
$request->authorization_basic($user, $pass);
my $page = $ua->request($request);

How should I process HTML META tags with Mojo::UserAgent?

I have to play with some misconfigured web servers, so I started processing the HTML meta tags to feed information back into the web user-agent object. I tried a variety of ways of doing this in Mojolicious and settled on looking for a "finish" event on the response. My goal was to make this mostly invisible to the rest of the code so the process wasn't even aware this was happening.
Still, this just doesn't sit right with me for a reason I can't quite put my finger on. Aside from the particular code in process_meta_options, is there a more Mojolicious way to do this? For example, Mojo::UserAgent get() with userdefined callback uses the read event, but I tend to think that might interfere with things. Or I could just be over-thinking it.
use v5.20;
use feature qw(signatures);
no warnings qw(experimental::signatures);
use Data::Dumper;
use Mojo::UserAgent;
my $ua = Mojo::UserAgent->new;
my $tx = $ua->build_tx( GET => 'http://blogs.perl.org' );
$tx->res->on(
finish => \&process_meta_options
);
$tx = $ua->start( $tx );
say "At end, charset is ", $tx->res->content->charset;
sub process_meta_options ( $res ) {
$res
->dom
->find( 'head meta[charset]' ) # HTML 5
->map( sub {
my $content_type = $res->headers->header( 'Content-type' );
return unless my $meta_charset = $_->{charset};
$content_type =~ s/;.*//;
$res->headers->header( 'Content-type', "$content_type; charset=$_->{charset}" );
} );
}
I think the answer is just what I came up with. I haven't found anything that I liked better.

Perl print the redirected url

I want to print the redirected url in perl.
Input url : http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv
output url : http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia
use LWP::UserAgent qw();
use CGI qw(:all);
print header();
my ($url) = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new;
my $req = new HTTP::Request(GET => $url);
my $res = $ua->request($req);
print $res->request;
How to get this done in perl?
You need to examine the HTTP response to find the URL. The documentation of HTTP::Response gives full details of how to do this, but to summarise, you should do the following:
use strict;
use warnings;
use feature ':5.10'; # enables "say"
use LWP::UserAgent;
my $url = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new;
my $req = new HTTP::Request(GET => $url);
my $res = $ua->request($req);
# you should add a check to ensure the response was actually successful:
if (! $res->is_success) {
say "GET failed! " . $res->status_line;
}
# show the base URI for the response:
say "Base URI: " . $res->base;
You can view redirects using HTTP::Response's redirects method:
if ($res->redirects) { # are there any redirects?
my @redirects = $res->redirects;
say join(", ", @redirects);
}
else {
say "No redirects.";
}
In this case, the base URI is the same as $url, and if you examine the contents of the page, you can see why.
# print out the contents of the response:
say $res->decoded_content;
Right near the bottom of the page, there is the following code:
$(window).load(function() {
window.setTimeout(function() {
window.location = "http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia"
}, 300);
});
The redirect is handled by javascript, and so is not picked up by LWP::UserAgent. If you want to get this URL, you will need to extract it from the response contents (or use a different client that supports javascript).
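Extracting the URL from the response contents can be as simple as a regular expression. A minimal sketch, assuming the page keeps the window.location = "..." form quoted above; the heredoc stands in for $res->decoded_content so the example runs without fetching anything:

```perl
use strict;
use warnings;

# Stand-in for $res->decoded_content: the script fragment from the page.
my $html = <<'HTML';
$(window).load(function() {
window.setTimeout(function() {
window.location = "http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia"
}, 300);
});
HTML

# Capture the double-quoted string assigned to window.location.
my ($target) = $html =~ /window\.location\s*=\s*"([^"]+)"/;

print defined $target
    ? "Redirect target: $target\n"
    : "No JavaScript redirect found\n";
```

This is brittle by nature: if the site changes its quoting or builds the URL dynamically, the pattern has to change with it.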
On a different note, your script starts off like this:
use LWP::UserAgent qw();
The list following the module name, qw(), controls which subroutines are imported into your script so that you can call them by name (instead of having to refer to the module name and the subroutine name). An empty qw() tells Perl to import nothing. LWP::UserAgent is used through its object methods either way, so you can just omit it.
To have LWP::UserAgent follow redirects, just set the max_redirect option:
use strict;
use warnings;
use LWP::UserAgent qw();
my $url = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new( max_redirect => 5 );
my $res = $ua->get($url);
if ( $res->is_success ) {
print $res->decoded_content; # or whatever
} else {
die $res->status_line;
}
However, that website is using a JavaScript redirect.
$(window).load(function() {
window.setTimeout(function() {
window.location = "http://www.snapdeal.com/product/vox-2-in-1-camcorder/1154987704?utm_source=aff_prog&utm_campaign=afts&offer_id=17&aff_id=1298&source=pricecheckindia"
}, 300);
});
This will not work unless you use a framework that enables JavaScript, like WWW::Mechanize::Firefox.
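The last line, print $res->request, will not print what you want: $res->request returns the HTTP::Request object, so you get a stringified hash reference instead of the page. To print the content of the response, use $res->content. Here is the code: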
use LWP::UserAgent qw();
use CGI qw(:all);
print header();
my ($url) = "http://pricecheckindia.com/go/store/snapdeal/52517?ref=velusliv";
my $ua = LWP::UserAgent->new;
my $req = new HTTP::Request(GET => $url);
my $res = $ua->request($req);
print $res->content;

LWP::UserAgent and HTTP::Request for a POST request

In a certain script I tried to write this:
my $ua = LWP::UserAgent->new;
my $res = $ua->post($url, Content => $data);
and got "400 Bad Request".
After some reading I tried this:
my $ua = LWP::UserAgent->new;
my $req = HTTP::Request->new( 'POST', $url );
$req->content( $data );
my $res = $ua->request( $req );
and it worked, but I thought these two should do the same. What am I missing here?
Am I misunderstanding something in the documentation of HTTP::Request and LWP::UserAgent?
Is there a way to ask LWP::UserAgent to print what it is doing?
Here's one way to do it:
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
{
no strict "refs";
no warnings "redefine";
my $orig_sub = \&LWP::UserAgent::send_request;
*{"LWP::UserAgent::send_request"} = sub {
my ($self, $request) = @_;
print $request->as_string . "\n";
my $response = $orig_sub->(@_);
print $response->as_string . "\n";
return $response;
};
}
my $ua = LWP::UserAgent->new; # avoid $a, which sort() treats specially
my $response = $ua->get("http://google.com");
It will print out all the requests and responses that LWP::UserAgent does.
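As for why the two snippets in the question behave differently: $ua->post builds its request with HTTP::Request::Common's POST function, which adds a default Content-Type header, while a request built by hand with HTTP::Request->new has none. Whether that header is what caused the 400 is an assumption, since the thread never shows the server's expectations. The sketch below compares the two requests without sending anything; the URL and body are hypothetical placeholders.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Request::Common qw(POST);

my $url  = 'http://example.com/api';   # hypothetical endpoint
my $data = '{"key":"value"}';          # hypothetical request body

# The request that $ua->post( $url, Content => $data ) would build.
my $via_post = POST( $url, Content => $data );

# The hand-built request from the question.
my $by_hand = HTTP::Request->new( 'POST', $url );
$by_hand->content($data);

my $post_type = $via_post->header('Content-Type') // '(none)';
my $hand_type = $by_hand->header('Content-Type')  // '(none)';

print "post():  $post_type\n";
print "by hand: $hand_type\n";
```

If the server expects something other than form data, passing the header explicitly, as in $ua->post( $url, 'Content-Type' => 'application/json', Content => $data ), overrides the default.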

How do I send POST data with LWP?

I want to make a program that communicates with http://www.md5crack.com/crackmd5.php. My goal is to send the site a hash (md5) and hopefully the site will be able to crack it. After, I would like to display the plaintext of the hash. My problem is sending the data to the site. I looked up articles about using LWP however I am still lost. Right now, the hash is not sending, some other junk data is. How would I go about sending a particular string of data to the site?
use HTTP::Request::Common qw(POST);
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
my $req = POST 'http://www.md5crack.com/crackmd5.php', [
maxlength=> '2048',
name=> 'term',
size=>'55',
title=>'md5 hash to crack',
value=> '098f6bcd4621d373cade4e832627b4f6',
name=>'crackbtn',
type=>'submit',
value=>'Crack that hash baby!',
];
$content = $ua->request($req)->as_string;
print "Content-type: text/html\n\n";
print $content;
You are POSTing the wrong data because you are conflating the HTML attributes that describe the form widget with the data the form actually sends. The correct data is just the widget name and its value:
term: 098f6bcd4621d373cade4e832627b4f6
Instead, the data that is getting POSTed currently is:
maxlength: 2048
name: term
size: 55
title: md5 hash to crack
value: 098f6bcd4621d373cade4e832627b4f6
name: crackbtn
type: submit
value: Crack that hash baby!
Corrected program:
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw{ POST };
use CGI;
my $md5 = '098f6bcd4621d373cade4e832627b4f6';
my $url = 'http://www.md5crack.com/crackmd5.php';
my $ua = LWP::UserAgent->new();
my $request = POST( $url, [ 'term' => $md5 ] );
my $content = $ua->request($request)->as_string();
my $cgi = CGI->new();
print $cgi->header(), $content;
You can also use LWP::UserAgent's post() method:
use strict;
use warnings;
use LWP::UserAgent;
use CGI;
my $md5 = '098f6bcd4621d373cade4e832627b4f6';
my $url = 'http://www.md5crack.com/crackmd5.php';
my $ua = LWP::UserAgent->new();
my $response = $ua->post( $url, { 'term' => $md5 } );
my $content = $response->decoded_content();
my $cgi = CGI->new();
print $cgi->header(), $content;
Always remember to use strict and use warnings. It is considered good practice and will save you time.
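To check what will actually be sent before involving the server at all, you can build the request and print its body. A small sketch using the same URL and hash as above; nothing is transmitted:

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

my $md5 = '098f6bcd4621d373cade4e832627b4f6';
my $url = 'http://www.md5crack.com/crackmd5.php';

# Build the request object only; no network traffic happens here.
my $request = POST( $url, [ term => $md5 ] );

my $body = $request->content;                  # urlencoded form data
my $type = $request->header('Content-Type');   # set by POST()

print "Body: $body\n";
print "Type: $type\n";
```

Running the same check on the array from the question shows every attribute (maxlength, size, title and so on) ending up in the body, which is the junk data the site was receiving.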
It used to be that crackers would figure this sort of stuff out by reading. There are examples in HTTP::Request::Common, which LWP::UserAgent tells you to check out for sending POST data. You only need to send the form data, not the meta data that goes with it.
You might have an easier time using WWW::Mechanize since it has a much more human-centric interface.