use WWW::Mechanize;
my $mech = WWW::Mechanize->new;
$mech->get( $url );
say $mech->text;
How could I get the same result with Mojo::UserAgent?
I tried this, but it doesn't return the same result:
use Mojo::UserAgent;
my $ua = Mojo::UserAgent->new;
say $ua->get( $url )->res->dom->all_text;
Simply replicate what the text method does: see as_text in HTML::Element.
You can try
$ua->get( $url )->res->dom->all_text(0);
for untrimmed output. Or you may need some kind of traversal over child nodes.
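To illustrate the traversal idea, here is a minimal sketch that joins every descendant text node, which is roughly what as_text does. It runs on a made-up HTML string instead of a fetched page, and the variable names are my own; treat it as an approximation, not a drop-in replacement for WWW::Mechanize's text method.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Hypothetical sample markup standing in for $ua->get($url)->res->body
my $html = '<div><p>Hello <b>big</b> world</p><p>Bye</p></div>';
my $dom  = Mojo::DOM->new($html);

# Walk all descendant nodes, keep only plain text nodes, and join them
my $text = $dom->descendant_nodes
    ->grep( sub { $_->type eq 'text' } )
    ->map('content')
    ->join('');

say $text;    # Hello big worldBye
```

Script and style bodies are stored as raw nodes rather than text nodes, so they are skipped by the type check.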
The if statement shows me that there is a response, but when I try to print the response I get nothing.
use LWP::UserAgent;
use strict;
use warnings;
use HTTP::Request::Common;
# use this {"process": "mobileGps","phone": "9565551236"}
my $url = "the url goes here";
my $json = '{data :{"process" : "mobileGps", "phone" : "9565551236"}}';
my $req = HTTP::Request->new( POST => $url );
$req->header( 'Content-Type' => 'application/json' );
$req->content( $json );
my $ua = LWP::UserAgent->new;
my $res = $ua->request( $req );
if ( $res->is_success ) {
print "It worked";
print $res->decoded_content;
}
else {
print $res->code;
}
I do have the URL: I just took it out for the purpose of this example.
What am I missing?
Try debugging your script like this:
use strict;
use warnings;
use HTTP::Request::Common;
use LWP::ConsoleLogger::Easy qw( debug_ua );
use LWP::UserAgent;
# use this {"process": "mobileGps","phone": "9565551236"}
my $url = "the url goes here";
my $json = '{data :{"process" : "mobileGps", "phone" : "9565551236"}}';
my $req = HTTP::Request->new(POST => $url);
$req->header('Content-Type' =>'application/json');
$req->content($json);
my $ua = LWP::UserAgent->new;
debug_ua( $ua );
my $res = $ua->request($req);
if ($res->is_success) {
print "It worked";
print $res->decoded_content;
} else {
print $res->code;
}
That will (hopefully) give you a better idea of what's going on.
Can you not use the debugger, or add some print statements to see how your program is progressing?
If not, then this is going to be another case of online turn-by-turn debugging, which benefits no one except the OP, and where the ultimate diagnosis is that they should have learned the language first.
The internet can be wise, but it produces many more pretenders than craftsmen.
Please don't ever expect to make a half-hearted attempt at a sketch and then rope in the rest of the world to finish your job. It takes a huge amount of experience, aptitude, and understanding to get even a "What's your name" .. "Hello" program working, and things only get harder thereafter.
If you don't like being careful and thorough, and would rather ask people to do your stuff for you than discover a solution by experimentation, then you are a manager, not a programmer. I hope you will never try to advance a software career by getting great at delegating, because that doesn't work with software.
Here. Use this as you will. The world is full of managers; it is good programmers that we need.
use strict;
use warnings 'all';
use feature 'say';
use constant URL => 'http://example.com/';
use LWP;
my $ua = LWP::UserAgent->new;
my $json = '{}';
my $req = HTTP::Request->new( POST => URL );
$req->header( content_type => 'application/json' );
$req->content( $json );
my $res = $ua->request( $req );
say $res->as_string;
The code is fine. If the server answers with status code 200 but an empty body, the problem must be with the server handling the request. You should check at the server's end.
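One thing that may be worth ruling out first: the payload in the question leaves the key data unquoted, and strict JSON parsers reject unquoted keys. Whether that is the cause here is only a guess, but building the body from a Perl structure avoids the issue. The sketch below uses JSON::PP and a placeholder URL, and only builds the request without sending it.

```perl
use strict;
use warnings;
use JSON::PP;
use HTTP::Request;

# Same data as in the question, expressed as a Perl structure
my $payload = { data => { process => 'mobileGps', phone => '9565551236' } };

# canonical() sorts the keys so the output is stable
my $json = JSON::PP->new->canonical->encode($payload);

# Placeholder URL; substitute the real endpoint
my $req = HTTP::Request->new( POST => 'http://example.com/' );
$req->header( 'Content-Type' => 'application/json' );
$req->content($json);

print $req->as_string;
```

Printing the request with as_string before sending it is a cheap way to confirm what actually goes over the wire.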
I'm failing to get a node by its id.
The code is straightforward and should be self-explanatory.
#!/usr/bin/perl
use Encode;
use utf8;
use LWP::UserAgent;
use URI::URL;
use Data::Dumper;
use HTML::TreeBuilder::XPath;
my $url = 'https://www.airbnb.com/rooms/1976460';
my $browser = LWP::UserAgent->new;
my $resp = $browser->get( $url, 'User-Agent' => 'Mozilla\/5.0' );
if ($resp->is_success) {
my $base = $resp->base || '';
print "-> base URL: $base\n";
my $data = $resp->decoded_content;
my $tree= HTML::TreeBuilder::XPath->new;
$tree->parse_content( $resp->decoded_content() );
binmode STDOUT, ":encoding(UTF-8)";
my $price_day = $tree->find('.//*[@id="price_amount"]/');
print Dumper($price_day);
$tree->delete();
}
The code above prints:
-> base URL: https://www.airbnb.com/rooms/1976460
$VAR1 = undef;
How can I select a node by its ID?
Thanks in advance.
Take that / off the end of that XPath.
.//*[@id="price_amount"]
should do. As it is, it's not valid XPath.
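A related point to check: as far as I know, find is inherited from HTML::Element, where it looks elements up by tag name, so XPath expressions should go through findnodes or findvalue instead. Here is a minimal sketch on a made-up HTML fragment (the markup is an assumption, not the real page):

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical fragment standing in for the downloaded page
my $html = '<html><body><div id="price_amount">$285</div></body></html>';

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content($html);

# findvalue evaluates the XPath and returns the matched text
my $price = $tree->findvalue('//*[@id="price_amount"]');
print "$price\n";

$tree->delete;
```

If the element is missing, findvalue returns an empty string, so the result can be tested before use.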
There is a trailing slash in your XPath that you need to remove:
my $price_day = $tree->find('.//*[@id="price_amount"]');
However, from my own testing, I believe that HTML::TreeBuilder::XPath is also having trouble parsing that specific URL. Perhaps because of the conditional comments?
As an alternative approach, I would recommend using Mojo::UserAgent and Mojo::DOM instead.
The following uses the CSS selector div#price_amount to easily find your desired element and print it out.
use strict;
use warnings;
use Mojo::UserAgent;
my $url = 'https://www.airbnb.com/rooms/1976460';
my $dom = Mojo::UserAgent->new->get($url)->res->dom;
my $price_day = $dom->at(q{div#price_amount})->all_text;
print $price_day, "\n";
Outputs:
$285
Note: there is a helpful 8-minute introductory video to this set of modules at Mojocast Episode 5.
I am running
#!/usr/bin/perl -w
use strict;
use LWP::Simple;
The variable $site has the HTML code.
You can also use the getstore function to save the HTML to a file, like this:
my $http_code = getstore( 'http://www.google.com/', 'google.html' );
It would help you a lot if you could see the reason for the failure. I suggest you use the full LWP::UserAgent interface instead of LWP::Simple, like this:
#!/usr/bin/perl
use strict;
use warnings;
use LWP;
my $ua = LWP::UserAgent->new;
my $response = $ua->get('http://www.google.com/');
die "Couldn't get it: ", $response->status_line unless $response->is_success;
my $site = $response->decoded_content;
print 'Got it.';
use LWP::Simple;
use HTML::LinkExtor;
use Data::Dumper;
#my $url = shift @ARGV;
my $content = get('example.com?GET=whateverIwant');
my $parser = HTML::LinkExtor->new(); #create LinkExtor object with no callbacks
$parser->parse($content); #parse content
Now, if I want to send POST data and cookies as well with the HTTP request, how can I configure that with the get function? Or do I have to write my own method?
My main interest is cookies, then POST.
LWP::Simple is for very simple HTTP GET requests. If you need to do anything more complex (like cookies), you have to upgrade to a full LWP::UserAgent. The cookie_jar is an HTTP::Cookies object, and you can use its set_cookie method to add a cookie.
use LWP::UserAgent;
my $ua = LWP::UserAgent->new(cookie_jar => {}); # create an empty cookie jar
$ua->cookie_jar->set_cookie(...);
my $rsp = $ua->get('http://example.com/?GET=whateverIwant'); # LWP needs an absolute URL with a scheme
die $rsp->status_line unless $rsp->is_success;
my $content = $rsp->decoded_content;
...
The LWP::UserAgent also has a post method.
You might want to use WWW::Mechanize instead. It already glues together most of the stuff that you want:
use WWW::Mechanize;
my $mech = WWW::Mechanize->new;
$mech->cookie_jar->set_cookie(...);
$mech->get( ... );
my @links = $mech->links;
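Since the set_cookie arguments are elided above, here is a sketch of what a call can look like. The argument order follows the HTTP::Cookies documentation (version, key, value, path, domain, port, path_spec, secure, maxage, discard); the cookie name, value, and domain are made up for illustration. It attaches the cookie to a request without going over the network.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;

my $jar = HTTP::Cookies->new;

# Hypothetical cookie: name "session", value "abc123", valid for one hour
$jar->set_cookie(
    0,              # version
    'session',      # key
    'abc123',       # value
    '/',            # path
    'example.com',  # domain
    undef,          # port (any)
    1,              # path_spec
    0,              # secure
    3600,           # maxage in seconds
    0,              # discard
);

# Let the jar add a Cookie header to a matching request
my $req = HTTP::Request->new( GET => 'http://example.com/' );
$jar->add_cookie_header($req);

print $req->header('Cookie'), "\n";
```

With LWP::UserAgent or WWW::Mechanize the same jar can be passed as cookie_jar => $jar, and the header is then added automatically on each request.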
I want to make a program that communicates with http://www.md5crack.com/crackmd5.php. My goal is to send the site a hash (md5) and hopefully the site will be able to crack it. Afterwards, I would like to display the plaintext of the hash. My problem is sending the data to the site. I looked up articles about using LWP; however, I am still lost. Right now, the hash is not being sent; some other junk data is. How would I go about sending a particular string of data to the site?
use HTTP::Request::Common qw(POST);
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
my $req = POST 'http://www.md5crack.com/crackmd5.php', [
maxlength=> '2048',
name=> 'term',
size=>'55',
title=>'md5 hash to crack',
value=> '098f6bcd4621d373cade4e832627b4f6',
name=>'crackbtn',
type=>'submit',
value=>'Crack that hash baby!',
];
$content = $ua->request($req)->as_string;
print "Content-type: text/html\n\n";
print $content;
You are POSTing the wrong data because you're conflating the HTML attributes that define the widget with the data the form actually sends. The correct payload is just the widget's name and its value:
term: 098f6bcd4621d373cade4e832627b4f6
Instead, the data that is getting POSTed currently is:
maxlength: 2048
name: term
size: 55
title: md5 hash to crack
value: 098f6bcd4621d373cade4e832627b4f6
name: crackbtn
type: submit
value: Crack that hash baby!
Corrected program:
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw{ POST };
use CGI;
my $md5 = '098f6bcd4621d373cade4e832627b4f6';
my $url = 'http://www.md5crack.com/crackmd5.php';
my $ua = LWP::UserAgent->new();
my $request = POST( $url, [ 'term' => $md5 ] );
my $content = $ua->request($request)->as_string();
my $cgi = CGI->new();
print $cgi->header(), $content;
You can also use LWP::UserAgent's post() method:
use strict;
use warnings;
use LWP::UserAgent;
use CGI;
my $md5 = '098f6bcd4621d373cade4e832627b4f6';
my $url = 'http://www.md5crack.com/crackmd5.php';
my $ua = LWP::UserAgent->new();
my $response = $ua->post( $url, { 'term' => $md5 } );
my $content = $response->decoded_content();
my $cgi = CGI->new();
print $cgi->header(), $content;
Always remember to use strict and use warnings. It is considered good practice and will save you time.
It used to be that crackers would figure this sort of stuff out by reading. There are examples in HTTP::Request::Common, which LWP::UserAgent tells you to check out for sending POST data. You only need to send the form data, not the meta data that goes with it.
You might have an easier time using WWW::Mechanize since it has a much more human-centric interface.
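If you are unsure what a form POST will send, you can build the request with HTTP::Request::Common and inspect it before handing it to the user agent. A small sketch, using a placeholder URL instead of the real site:

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

my $md5 = '098f6bcd4621d373cade4e832627b4f6';

# Placeholder URL; nothing is sent, the request is only constructed
my $req = POST 'http://example.com/crackmd5.php', [ term => $md5 ];

print $req->header('Content-Type'), "\n";    # application/x-www-form-urlencoded
print $req->content, "\n";                   # term=098f6bcd4621d373cade4e832627b4f6
```

Only the field name and its value end up in the body, which is what the server-side form handler expects.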