Perl - How to request HTTPS website

I am trying to access an HTTPS website, but I get an error. I tried these scripts:
Script 1:
use strict;
use warnings;
use LWP::UserAgent;
my $B = new LWP::UserAgent (agent => 'Mozilla/5.0', cookie_jar =>{});
my $GET = $B->get('https://moz.com')->content;
print $GET;
Script 2:
use strict;
use warnings;
use LWP::UserAgent;
use Mozilla::CA;
my $B = new LWP::UserAgent (agent => 'Mozilla/5.0', cookie_jar =>{});
$B->ssl_opts( SSL_ca_file => Mozilla::CA::SSL_ca_file() );
$B->ssl_opts( verify_hostname => 1 );
my $GET = $B->get('https://moz.com')->content;
print $GET;
I get this error with both:
Can't connect to moz.com:443
LWP::Protocol::https::Socket: SSL connect attempt failed with unknown error error:00000000:lib(0):func(0):reason(0) at C:/Perl/site/lib/LWP/Protocol/http.pm line 47.
I'm using ActivePerl 5.16.1 Build 1601 on Windows 7 Ultimate.
Any idea how to access a HTTPS website using Perl?

https calls will fail if you do not have the LWP::Protocol::https module installed.
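One quick way to confirm this is to ask LWP itself whether it has a handler for the https scheme. The following is a minimal sketch, not part of the original answer; the helper name https_status is made up for illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: reports whether this LWP installation can handle https URLs.
sub https_status {
    my $ua = LWP::UserAgent->new;

    # is_protocol_supported() is true only if LWP can load a protocol
    # module for the scheme (LWP::Protocol::https in this case).
    return $ua->is_protocol_supported('https')
        ? 'https is supported'
        : 'https is NOT supported: install LWP::Protocol::https';
}

print https_status(), "\n";
```

If it reports that https is not supported, installing LWP::Protocol::https (for example with cpan or ppm, as appropriate for your Perl distribution) is the first thing to try.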

Related

Is it possible to add some debug output when LWP::UserAgent connected to proxy?

I have a very simple example Perl script
$ cat proxy.pl
use strict;
use warnings;
use LWP::UserAgent ();
my $ua = LWP::UserAgent->new;
$ua->timeout(10);
$ua->show_progress(1);
$ua->proxy([
[ 'http' ] => 'http://142.93.46.194:3128/',
]);
$ua->get('http://example.com');
which prints following output
$ perl proxy.pl
** GET http://example.com ==> 200 OK
Is there a ready-to-use module that adds one more line to the debug output showing the status of the connection to the proxy server?

Perl HTTPS over proxy using LWP::UserAgent

I wish to request a URL via an HTTPS proxy using Perl's LWP::UserAgent module. There are quite a few references on this, but none of them helped me get it to work.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use Data::Dumper;
BEGIN {
$ENV{HTTPS_PROXY} = 'https://<IP>:<PORT>';
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
$ENV{HTTPS_PROXY_USERNAME} = '<API_KEY>';
$ENV{HTTPS_PROXY_PASSWORD} = '';
$ENV{HTTPS_DEBUG} = 1; #Add debug output
}
my $ua = LWP::UserAgent->new(ssl_opts => {verify_hostname => 0}, SSL_version => 'SSLv3', allowed_protocols => ['https', 'http']);
$ua->proxy(['https', 'http'], 'https://<IP>:<PORT>');
my $req = HTTP::Request->new('GET','https://<DOMAIN_URL>');
print STDERR Dumper($ua);
my $response = $ua->request($req);
print $response->code ."\n";
print STDERR Dumper($response);
I get this error:
SSL connect attempt failed error:140770FC:SSL
routines:SSL23_GET_SERVER_HELLO:unknown protocol at
/home/user/project/local/lib/perl5/LWP/Protocol/http.pm line 51.
Although I specify allowed_protocols in LWP, use the https scheme for the proxy, and the URL scheme is https, the request still goes through LWP::Protocol::http (see above) and not through LWP::Protocol::https.
I also verified that the version of LWP::Protocol::https is 6.06, which is the same as LWP::UserAgent (this was mentioned in one of the forums).
This worked for me:
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://<USER>:<PSWD>@<IP>:<PORT>/');
$ua->get('https://www.somesslsite.com');
HTTPS Proxy and LWP::UserAgent
Note: The environment credentials (HTTPS_PROXY_USERNAME) didn't work for me. I had to put them in the URL as shown above.
The correct setup with recent versions of LWP (starting with version 6.06 which you have) is to just use the same syntax as found in other applications together with the proxy function:
my $ua = LWP::UserAgent->new;
$ua->proxy(https => 'http://user:pass@proxy');
$ua->get('https://server');
Alternatively you could set the environment variable https_proxy to the same value, i.e. http://user:pass@proxy.
Before 6.06 proxy support was broken at least when used together with IO::Socket::SSL (default since version 6.0). The syntax you have in your question is for the old backend Crypt::SSLeay which is no longer recommended because it does not check the certificates properly.
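To make the proxy() syntax concrete, here is a small sketch that configures the proxy and prints what was stored, without making any network request. The helper name ua_with_https_proxy and the host proxy.example:3128 with user:pass are placeholders of mine, not values from the question.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: builds a user agent whose https traffic goes through $proxy_url.
sub ua_with_https_proxy {
    my ($proxy_url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 10);

    # Note the http:// scheme: it describes how to talk to the proxy itself,
    # while the target URL stays https.
    $ua->proxy(https => $proxy_url);
    return $ua;
}

# Placeholder host and credentials: replace with your own proxy.
my $ua = ua_with_https_proxy('http://user:pass@proxy.example:3128');

# Called with only the scheme, proxy() returns the currently configured value.
print "https requests will use: ", $ua->proxy('https'), "\n";
```

Keep the URL in single quotes, otherwise Perl tries to interpolate @proxy as an array. An actual request would then be the usual $ua->get('https://server').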

LWP HTTPS proxy timing out

I am trying to connect via HTTPS to a site using LWP::Simple. I have set the environment variables both inside of the script and in the shell. I continue to get a 500 Connection Timed Out. I can connect to a HTTP site just fine.
The proxy is letting communication through. I can connect to the HTTPS site through the proxy using curl without a problem.
Any suggestions?
#!/usr/bin/perl
use warnings;
use Net::SSL;
use LWP::UserAgent;
use LWP::Debug qw(+);
use Data::Dumper;
$ENV{PERL_NET_HTTPS_SSL_SOCKET_CLASS} = "Net::SSL";
$ENV{HTTPS_DEBUG} = 1;
$ENV{HTTPS_VERSION} = 2;
my $ua = LWP::UserAgent->new (verify_hostname => 0);
$ua->ssl_opts(verify_hostname => 0,
SSL_verify_mode => 0x00);
$ua->proxy('https' => 'http://x.x.x.x:3128');
print $ua->proxy('https');
print Dumper($ua);
my $response = $ua->get('https://qualys.com/');
print Dumper ($response);
if ($response->is_success) {
print $response->decoded_content; # or whatever
exit(0);
}
else {
print "\nFail:\n";
print $response->status_line ."\n";
exit(1);
}
HTTPS proxy support for LWP is (or at least was until some days ago) broken, at least if you use IO::Socket::SSL, which is the default for LWP versions >=6.0 because it provides better security. It should be much better now with LWP 6.06 and LWP::Protocol::https 6.06, which were released a few days ago. If you cannot upgrade, you might try using Net::SSLGlue::LWP, which patches LWP for better https proxy support.
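To see whether an upgrade is needed, you can check the installed versions from Perl. This is only a sketch under my own assumptions: has_min_version is a made-up helper, and 6.06 is the threshold taken from the releases mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $module can be loaded and is at least $min_version.
sub has_min_version {
    my ($module, $min_version) = @_;

    # Only accept plain module names, since the name goes into a string eval.
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;
    return 0 unless eval "require $module; 1";

    # UNIVERSAL::VERSION dies if the installed version is lower than requested.
    return eval { $module->VERSION($min_version); 1 } ? 1 : 0;
}

for my $module (qw(LWP::UserAgent LWP::Protocol::https)) {
    printf "%-22s %s\n", $module,
        has_min_version($module, '6.06')
            ? 'is 6.06 or newer'
            : 'is missing or older than 6.06';
}
```

If either module is reported as missing or too old, upgrade it before spending time on other proxy workarounds.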

Why might Perl allow for http websites using TOR but not https?

I am having difficulty using perl to visit a website via TOR if it is an https site but not if it is an http site.
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;
my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http', 'https'], 'socks://localhost:9150');
$mech->get("https://www.google.com");
I am receiving the error message "Error GETing https://www.google.com: Status read failed: Bad file descriptor at line 10," where line 10 is the last line of the program.
In the TOR browser, I can successfully view: "https://www.google.com" with a port of 9150.
I am using ActivePerl 5.16.2, Vidalia 0.2.21 and Tor 0.2.3.25.
I have a Windows machine and my primary internet browser is Mozilla.
I have tried installing packages with the commands:
cpan LWP::UserAgent
ppm install LWP::Protocol::https
cpan LWP::Protocol::https
ppm install LWP::Protocol::socks
cpan LWP::Protocol::socks
ppm install Mozilla::CA
ppm install IO::Socket::SSL
ppm install Crypt::SSLeay
cpan Crypt::SSLeay
Thank you for any help! Please let me know whether there is any further information that I can provide.
Some time ago I found a way to reach https sites through Tor by fetching them with WWW::Curl::Easy, because I ran into the same problems with LWP.
I then save the HTML to files and parse them with WWW::Mechanize or HTML::TreeBuilder.
If you need more interaction with the site, such as posting forms, this solution may be more tedious because you will have to do all of that through curl.
package Curl;
use warnings;
use WWW::Curl::Easy;
use WWW::UserAgent::Random;
my $curl = WWW::Curl::Easy->new;
my $useragent = rand_ua("browsers");
my $host = 'localhost';
my $port = '9070';
my $timeout = '20';
my $connectTimeOut= '20';
&init;
sub get
{
my $url = shift;
$curl->setopt(CURLOPT_URL, $url);
my $response_body;
$curl->setopt(CURLOPT_WRITEDATA,\$response_body);
my $retcode = $curl->perform;
if ($retcode == 0) {
print("Transfer went ok Http::Code = ".$curl->strerror($retcode)."\n");
my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
# judge result and next action based on $response_code
return \$response_body;
} else {
# Error code, type of error, error message
print("An error happened: $retcode ".$curl->strerror($retcode)." ".$curl->errbuf."\n");
return 0;
}
}
sub init
{
# set the proxy
$curl->setopt(CURLOPT_PROXY,"$host:".$port);
$curl->setopt(CURLOPT_PROXYTYPE,CURLPROXY_SOCKS4);
# set the remaining options
$curl->setopt(CURLOPT_USERAGENT, $useragent);
$curl->setopt(CURLOPT_CONNECTTIMEOUT, $connectTimeOut);
$curl->setopt(CURLOPT_TIMEOUT, $timeout);
$curl->setopt(CURLOPT_SSL_VERIFYPEER,0);
$curl->setopt(CURLOPT_HEADER,0);
}
Hope this will help you!
Maybe the proxy that you are using is already an HTTPS proxy (ie. CONNECT proxy). In that case this should work (untested):
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;
my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http'], 'socks://localhost:9150');
$mech->proxy(['https'], 'https://localhost:9150'); ### <-- make https go over https-connect proxy
$mech->get("https://www.google.com");
I cannot find the original source, but I fought with this a long time ago. Basically, the problem I had was with the implementation that LWP::UserAgent used for https requests.
Possibly this question can help you: How do I force LWP to use Crypt::SSLeay for HTTPS requests?

Why does LWP::UserAgent GET request fail with HTTPS?

Here's my code
#!/path/to/perl
use strict;
use LWP::UserAgent;
use HTTP::Request::Common;
use Crypt::SSLeay;
$ENV{HTTPS_PROXY} = 'http://proxy:8080/';
$ENV{HTTPS_DEBUG} = 1;
my $myurl = "https://www.redhat.com";
my $ua = new LWP::UserAgent;
$ua->cookie_jar( {} );
$ua->protocols_allowed( [ 'http','https'] );
$ua->proxy(['http', 'https'], 'http://proxy:8080/');
my $page = $ua->get($myurl);
die "Error $myurl\n ", $page->status_line, "\n Aborting"
unless $page->is_success;
print "Success", $page->content_type, " document!\n";
It returns
Error at https://www.redhat.com
400 Bad Request
Aborting at test.pl line 30.
what's wrong?
Edit:
Apparently, it's a bug. But the workaround doesn't work for me.
Ha! I got the answer!
1) Remove the '/' after the port in $ENV{HTTPS_PROXY}.
2) Apparently, LWP's proxy system sends GET requests instead of CONNECT requests, so use Crypt::SSLeay's proxy support instead: set only the environment variable and remove the $ua->proxy call.
On some systems, e.g. Debian, you need to install the appropriate SSL library for this to work. The error messages on these systems can sometimes be a bit misleading. I think the Debian package would be libnet-ssleay-perl.
I just uploaded the LWP::Protocol::connect module to CPAN. This module adds the missing HTTP/CONNECT method support to LWP.
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://proxyhost.domain:3128/');
$ua->get('https://www.somesslsite.com');
With this module you can use the regular IO::Socket::SSL implementation for LWP >=6.00.
It looks like your proxy server does not accept HTTPS connections. Have you tried setting it up in your favorite browser and viewing the URL?