I was trying to eliminate the logging in process to a website by reading the browser cookies (which I created by logging in using Firefox earlier). I exported it from Firefox using this Firefox addon. It gives a 200 OK response but returns the generic homepage instead of my custom 'logged in' home page. How do I make sure that cookie is passed to the server properly ?
#!/usr/bin/perl
use strict ;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies::Netscape;
my #GHeader = (
'User-Agent' => 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.19) Gecko/2010040200 Ubuntu/8.04 (hardy) Firefox/3.0.19',
'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language' => 'en-us,en;q=0.5',
'Accept-Charset' => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept-Encoding' => 'gzip,deflate',
'Keep-Alive' => '300',
'Connection' => 'keep-alive'
);
my $cookie_jar = HTTP::Cookies::Netscape->new(
file => "cookies.txt",
);
my $Browser = LWP::UserAgent->new;
$Browser->cookie_jar( $cookie_jar );
my ($OutLine,$response)=();
my $URL = 'http://www.hanggliding.org/';
printf("Get [%s]\n",$URL);
$response = $Browser->get($URL,#GHeader);
if($response->is_success)
{
if($response->status_line ne "200 OK")
{
printf("%s\n", $response->status_line);
}
else
{
printf("%s\n", $response->status_line);
$OutLine =$response->decoded_content;
open(HTML,">out.html");printf HTML ("%s",$OutLine);close(HTML);
}
}
else
{
printf("Failed to get url [%s]\n", $response->status_line);
}
You can inject a handler to access or modify request/response data during processing.
Quoting LWP::UserAgent's docs:
Handlers are code that injected at various phases during the processing of requests. The following methods are provided to manage the active handlers:
$ua->add_handler( $phase => \&cb, %matchspec )
Add handler to be invoked in the given processing phase. For how to specify %matchspec see "Matching" in HTTP::Config.
...
request_send => sub { my($request, $ua, $h) = #_; ... }
This handler gets a chance of handling requests before they're sent to the protocol handlers. It should return an HTTP::Response object if it wishes to terminate the processing; otherwise it should return nothing.
From there, you can inject a handler which will analyze the request object, but otherwise do nothing:
use LWP::UserAgent;
use Data::Dumper;
sub dump_request {
my ($request, $ua, $h) = #_;
print Dumper($request);
return undef;
}
my $browser = LWP::UserAgent->new;
$browser->add_handler(
request_send => \&dump_request,
m_method => 'GET'
);
$browser->get('http://www.google.com');
Related
I'm using atom and testing out HTTP::Request::Generator PERL module. Code , below works on most part but I'm unable to send cookies or headers, it only displays default headers even when I have set in my code.
use strict;
use warnings;
use HTTP::Request::Generator 'generate_requests';
use LWP::UserAgent;
my $ua = 'LWP::UserAgent'->new;
my $gen = generate_requests(
method => 'GET',
host => [ 'https://abc.ai/' ],
pattern => 'https://abc.ai',
headers => {
"User-Agent" => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64',
"Cookie" => '_abc',
},
wrap => sub {
my ( $req ) = #_;
# Fix up some values
$req->{'headers'}{'Content-Length'} = 666;
},
wrap => \&HTTP::Request::Generator::as_http_request,
);
while ( my $req = $gen->() ) {
my $response = $ua->request( $req );
# print $response->protocol, ' ', $response->status_line, "\n";
print $req->headers->as_string, "\n";
print $req->as_string();
# Do something with $response here?
if ($response->is_success) {
# print $response->decoded_content;
print $response ->header('title');
}
else {
die $response->status_line;
}
}
Output
User-Agent: libwww-perl/6.31
Login
The title page indicate me I'm not logged in this cookie is fine and i have tested it using curl i can manually login and retrieve required resource. Why its failing for perl, how can access my header options in code above. Thanks.
Solution
body_params => {
comment => ['Some comment', 'Another comment, A++'],
},
Got it solved adding above code.
You can't provide the same option (wrap) twice:
wrap => sub {
my ( $req ) = #_;
# Fix up some values
$req->{'headers'}{'Content-Length'} = 666;
},
wrap => \&HTTP::Request::Generator::as_http_request,
This may work though:
wrap => sub {
my ( $req ) = #_;
# Fix up some values
$req->{'headers'}{'Content-Length'} = 666;
return HTTP::Request::Generator::as_http_request( $req );
},
Also the headers option appears to take an arrayref of hashrefs, like this:
headers => [
{
"User-Agent" => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
"Cookie" => '_abc',
},
],
I guess the reason for that is so you can provide alternative sets of headers:
headers => [
{
"User-Agent" => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
"Cookie" => '_abc',
},
{
"User-Agent" => 'Mozilla/1.0 (Hoover Vacuum Cleaner)',
"Cookie" => '_def',
},
],
That way your request generator can generate two requests for each page, using different User-Agent strings, or different cookies (so logged in as different users), or different Accept headers, or whatever.
Using the concurrent asynchronous URL example for Net::Async::HTTP, the first encounter of bad URL (timeout, doesn't exist, etc) error causes the program to fail and exit completely, without continuing to the next URL in the array. Is the problem my code or the module?
I tried setting fail_on_error to 0, and even 1, but it had no obvious results.
#!/bin/perl
use IO::Async::Loop;
use Net::Async::HTTP;
use Future::Utils qw(fmap_void);
use strict;
use warnings;
use feature 'say';
my $ua_string = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.81 Safari/537.36";
my $timeout = 10;
my $max_redirects = 10;
my $max_in_flight = 10;
my $max_connections_per_host = 10;
my $stall_timeout = 10;
my $max_recurse = "10";
my $max_per_host = "10";
my #URLs = ( "http://cnn.com", "http://google.com", "http://sdfsdfsdf24.com", "http://msn.net" );
my $loop = IO::Async::Loop->new();
my $http = Net::Async::HTTP->new();
$loop->add($http);
my $future = fmap_void {
my ( $url ) = #_;
$http->configure(user_agent => $ua_string);
$http->configure(timeout => $timeout );
$http->configure(max_redirects => $max_redirects);
$http->configure(max_in_flight => $max_in_flight);
$http->configure(max_connections_per_host => $max_connections_per_host);
$http->configure(stall_timeout => $stall_timeout);
$http->configure(fail_on_error => '0' );
$http->GET($url)->on_done(
sub {
my $response = shift;
say "Response: $response->code";
}
)->on_fail(
sub {
my $fail = shift;
say "Failed: $fail";
}
);
}
foreach => \#URLs;
$loop->await($future);
Your example really works well without any proxy, I tested and did some changes:
Fetching URL: '. $url;
$http->GET($url)->on_done(
sub {
my $response = shift;
say "Response: ".$response->code();
}
)->on_fail(
sub {
my $fail = shift;
say "Failed: " . $fail;
}
);
Output:
Fetching URL: http://cnn.com
Response: 200
Fetching URL: http://google.com
Response: 302
Fetching URL: http://sdfsdfsdf24.com
Response: 403
Fetching URL: http://msn.net
Response: 200
As this example is not doing async call's the URL's are on a queue and being processed one by one.
Behind the scenes when you are doing a request to a target in your case to some URL's, at the low level the connection is made through a socket connection.
If you have a proxy which is not configured between your script and the intenet, there is no connection and it will raise an exception and your script will die like:
Fetching URL: http://cnn.com
Failed: Timed out
The variable $! is set and the error "Operation now in progress" appears, in fact your request didn't established any connection it just tried to establish one without success.
There are some points which you can check for example:
1 - Is the proxy working ?
2 - Do I have internet connection ?
3 - Is the URL I am testing working ?
If you are having problems with proxy, your script need a small adjust that you can get more info in the docs:
$http->configure( proxy_host => 'xx.xx.xx.xx');
$http->configure( proxy_port => 1234);
Supposing that your proxy is configured, you can check if you have fully access to the internet and aim some target like that URL's.
Trying to access the URLs it will provide you a response code and depending on the code you can do something.
As an alternative solution you could use LWP::UserAgent to make simple requests and check the response code.
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
$ua->timeout(10);
$ua->env_proxy;
my $response = $ua->get('http://search.cpan.org/');
if ($response->is_success) {
print $response->decoded_content; # or whatever
}
else {
die $response->status_line;
}
And even with some bad stats like 4XX for example Net::Async::HTTP won't be friendly to use this module for a simple purpose as it can't handle the exceptions like you want.
I have managed to login with the code below. Now I can do it ony once a day.
And then I cant login, but get the login page in the response.
But when i print $reqstr from the code below and paste it to browser(like firefox), I can log in.
Wget doesnt work neiter. Only normal browser.
Soemtimes it seems , that Im logged in, but only get such content:
"<html>\cJ<head>\cJ\cI<meta http-equiv=\"content-type\" content=\"text/html; charset=ISO-8859-1\"><meta http-equiv=\"expires\" content=\"0\"><meta http-equiv=\"pragma\" content=\"no-cache\">\cJ\cI<meta http-equiv=\"refresh\" content=\"0; URL='https://www.address.com/'\">\cJ</head>\cJ</html>\cJ"
I also noticed, that while I cant login, Im getting this part in a debugger:
_uri_canonical' => URI::https=SCALAR(0x17dad28)
-> REUSED_ADDRESS
'handlers' => HASH(0x22dc0c0)
'response_data' => ARRAY(0x22ee8b8)
0 HASH(0x22d9a48)
'callback' => CODE(0x22dba30)
-> &LWP::UserAgent::__ANON__[/usr/lib/perl5/vendor_perl/5.10.0/LWP/UserAgent.pm:682] in /usr/lib/perl5/vendor_perl/5.10.0/LWP/UserAgent.pm:679-682
1 HASH(0x22eea08)
'callback' => CODE(0x22d9cb8)
-> &LWP::Protocol::__ANON__[/usr/lib/perl5/vendor_perl/5.10.0/LWP/Protocol.pm:138] in /usr/lib/perl5/vendor_perl/5.10.0/LWP/Protocol.pm:135-138
Any clue?
Here the code:
my $b = LWP::UserAgent->new(agent => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.5) Gecko/20060719 Firefox/31.2.0',);
my $cookie_jar = HTTP::Cookies->new(
file => 'lwp_cookies.txt',
autosave => 1,
ignore_discard => 1,
);
$cookie_jar->clear;
$cookie_jar->clear_temporary_cookies;
$b->cookie_jar($cookie_jar);
my $url = "https://www.address.com";
my $r = $b->get($url);
$r->decoded_content =~ /FORM ACTION="(.*?)" METHOD/msgi;
my $a = "$url$1";
print $a."\n";
my $reqstr = $a."&LoginAction=Login&Number=55555&KPassword=passw&UserID=uid";
my $req = HTTP::Request->new(POST => $reqstr);
$req->header('Host', 'www.address.com');
$req->header('User-Agent', 'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0');
$req->header('Connection', 'keep-alive');
$req->header('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8');
my $c = $b->request($req);
You need to re-request that page with the referrer added via referer() for LWP::UserAgent (or see my second answer if you aren't wedded to that module)
sub login { # Code not tested and not really compilable, just a stub for you
my (#other_args, $url, $referrer_url) = #_;
# Add your login code from the question, up to calling $b->request()
$req->referer($referrer_url) if $referrer_url;
my $c = $b->request($req);
return $c; # Or return the response?
}
my $result1 = login($original_login_url); #first try
# Obtain the redirect_url from the response.
# If it was a 301 redirect, you can do it via
# my #redirects = $response->redirects();
my $referrer_url = $original_login_url;
my $result2 = login($redirect_url, $referrer_url);
References:
http://forums.devshed.com/perl-programming-6/lwp-meta-refresh-tag-handling-63484.html
http://www.herongyang.com/Perl/LWP-UserAgent-Follow-HTTP-Redirects.html
If you aren't dead set on using LWP::UserAgent, use WWW::Mechanize instead.
Best approach: use WWW::Mechanize::Plugin::FollowMetaRedirect. The SYNOPSIS is pretty short and to the point:
use WWW::Mechanize;
use WWW::Mechanize::Plugin::FollowMetaRedirect;
my $mech = WWW::Mechanize->new;
$mech->get( $url );
$mech->follow_meta_redirect;
# Optionally, skip emulating the waiting time
$mech->follow_meta_redirect( ignore_wait => 1 );
If you don't have access to that module, you can create your own, similar to this: http://www.perlmonks.org/?node_id=487286
(Basically, parse the returned content using the regex shudder to extract the refresh URL, and get that URL. As per my other answer, you might need to add the referrer header)
use LWP::UserAgent;
use Data::Dumper;
my $ua = new LWP::UserAgent;
$ua->agent("AgentName/0.1 " . $ua->agent);
my $req = new HTTP::Request POST => 'http://example.com';
$req->content('port=8', 'target=64'); #problem
my $res = $ua->request($req);
print Dumper($res->content);
How can I send multiple pieces of content using $req->content? What kind of data does $req->content expect?
It only sends the last one.
Edit:
Found out if i format it like 'port=8&target=64' it works. Is there a better way?
my $ua = LWP::UserAgent->new();
my $request = POST( $url, [ 'port' => 8, 'target' => 64 ] );
my $content = $ua->request($request)->as_string();
The answer given didn't work for me. I still had the same problem as OP.
The documentation for LWP::UserAgent wants a hash or array reference.
This works:
my $url = 'https://www.google.com/recaptcha/api/siteverify';
my $ua = LWP::UserAgent->new();
my %form;
$form{'secret'}='xxxxxxxxxxxxxxxxxxxxxxx';
$form{'response'}=$captchaResponse;
my $response = $ua->post( $url, \%form );
my $content = $response->as_string();
Using together LWP::UserAgent and HTTP::Request as it is also quite common if not even more frequent practice , I was little puzzled that the standard POST and GET / request were almost not discussed at SO aside from json as them are in vast majority used.
POST
my $ua = LWP::UserAgent->new();
my $req = new HTTP::Request(
'POST' => "http://url/path",
['Content-Type' => 'application/x-www-form-urlencoded; charset=UTF-8'],
'par1=par1value&par2=par2value'
);
$ua->request($req);
similarly for the GET
my $ua = LWP::UserAgent->new();
my $req = new HTTP::Request(
'GET' => "http://url/path",
['Content-Type' => 'application/x-www-form-urlencoded; charset=UTF-8'],
'par1=par1value&par2=par2value' # or I presume attaching the query string directly to the url
);
$ua->request($req);
another format form , where the first two parameters (method and url) are not fused into a one, not like the previous, but separately
my $request = HTTP::Request->new( 'POST', $url, [ parameter1 => 'parameter1Value' ] );
request->header( 'Content-Type' => 'application/json' )
There is a similar question, but just regards LWP and Json, but it could be probably accomplished only by using both LWP and HTTP::Request together as suggested by that question chosen answer, and the POST and GET were missing there but it might not have been obvious
How can I make a JSON POST request with LWP?
Note:
I post this specially also, since the concrete/concise usage for POST/GET is not mentioned even in the documentation
https://metacpan.org/pod/HTTP::Request
so i'm having a problem trying to automatically login to a internal website. I'm able to send a post request but in the response I always get the Header Connection: close. I've tried to pass is through the post request but it still seems to respond with Connection: close. I want to be able to navigate through the website so I need the Connection: keep-alive so that i can send more request. Could anyone tell me what I'm doing wrong? here's the code:
#usr/bin/perl
#NetTelnet.pl
use strict; use warnings;
#Sign into cfxint Unix something...
use Net::Telnet;
# Create a new instance of Net::Telnet,
my $telnetCon = new Net::Telnet (Timeout => 10,
Prompt => '/bash\$ $/') or die "Could not make connection.";
my $hostname = 'cfxint';
# Connect to the host of the users choice
$telnetCon->open(Host => $hostname,
Port => 23) or die "Could not connect to $hostname.";
use WWW::Mechanize;
my $mech = WWW::Mechanize->new(cookie_jar => {});
&login_alfresco;
sub login_cxfint {
#get username and password from user
my $CXusername = '';
my $CXpassword = '';
# Recreate the login
# Wait for the login: message and then enter the username
$telnetCon->waitfor(match => '/login:/i');
# this method adds a \n to the end of the username, it mimics hitting the enter key after entering your username
$telnetCon->print($CXusername);
# does the same as the previous command but for the password
$telnetCon->print($CXpassword);
#Wait for the login successful message
$telnetCon->waitfor();
}
sub login_alfresco{
my $ALusername = '';
my $ALpassword = '';
$mech->get('http://documents.ifds.group:8080/alfresco/faces/jsp/login.jsp');
my $res = $mech->res;
my $idfaces = '';
if($res->is_success){
my $ff = $res->content;
if($ff =~ /id="javax.faces.ViewState" value="(.*?)"/){
$idfaces = $1;
}
else {
print "javax.faces /Regex error?\n";
die;
}
}
print $idfaces, "\n";
#Send the get request for Alfresco
$mech->post('http://documents.ifds.group:8080/alfresco/faces/jsp/login.jsp',[
'loginForm:rediretURL' =>,
'loginForm:user-name' => $ALusername,
'loginForm:user-password' => $ALpassword,
'loginForm:submit' => 'Login',
'loginForm_SUBMIT' => '1',
'loginForm:_idcl' => ,
'loginForm:_link_hidden_' => ,
'javax.faces.ViewState' => $idfaces], **'Connection' =>'keep-alive'**);
$res = $mech->res;
open ALF, ">Alfresco.html";
print ALF $mech->response->as_string;
if($res->is_success){
my $ff = $res->content;
if($ff =~ /id="javax.faces.ViewState" value="(.*?)"/){
$idfaces = $1;
}
else {
print "javax.faces /Regex error?\n";
die;
}
}
print $idfaces, "\n";
#Logout
$mech->post('http://documents.ifds.group:8080/alfresco/faces/jsp/extension/browse/browse.jsp', [
'browse:serach:_option' => '0',
'browse:search' => ,
'browse:spaces-pages' => '20',
'browse:content-pages' => '50',
'browse_SUBMIT' => '1',
'id' => ,
'browse:modelist' => '',
'ref'=>'',
'browse:spacesList:sort' => ,
'browse:_idJsp7' => ,
'browse:sidebar-body:navigator' => ,
'browse:contentRichList:sort' => ,
'browse:act' => 'browse:logout',
'outcome' => 'logout',
'browse:panel' => ,
'javax.faces.ViewState' => $idfaces,])
}
You can enable keep-alive by using a connection cache:
use LWP::ConnCache;
...
$mech->conn_cache(LWP::ConnCache->new);
All that header means is that the connection will be closed upon completion of the request, instead of being kept open for possible further requests. This is perfectly normal and should not interfere with sending the request.
EDIT: If you're sending a Connection:Keep-Alive and the server is still responding with Connection:Close, then the server configuration needs to be changed. The default for HTTP/1.1 is persistent connections, so the server must explicitly be configured to send Connection:Close. See Section 8 of RFC2616.