HTTPS Proxy and LWP::UserAgent - perl

I have read a number of threads on a number of sites and am still unable to make this work.
I have a client machine (OSX) with OpenSSL 0.9.8r running perl 5.12.4, with LWP 6.0.4, updated Crypt::SSLeay, Net::SSL etc. I am trying to connect to an HTTPS site (https://github.com in the example) via a WinGate proxy that I have running on a Windows VM. Note that my actual application is attaching to an SSL webservice that I have no control over.
From Firefox, pointed to the proxy everything is copacetic. The page loads successfully and I see the connections in the Proxy software Activity monitor. I'll be darned if I can make it work in Perl though. I've started with the code from this Stack Overflow Question : How do I force LWP to use Crypt::SSLeay for HTTPS requests? And added some debugging and additional output. Here's were I stand now:
#!/usr/bin/perl
use strict;
use warnings;
use Net::SSL (); # From Crypt-SSLeay
BEGIN {
$Net::HTTPS::HTTPS_SSL_SOCKET_CLASS = "Net::SSL"; # Force use of Net::SSL
$ENV{HTTPS_PROXY} = 'https://192.168.1.11:80';
# $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
$ENV{HTTPS_DEBUG} = 1; #Add debug output
}
use LWP::UserAgent;
my $ua = LWP::UserAgent->new();
my $req = HTTP::Request->new('GET','https://github.com/');
my $response = $ua->request($req);
print "--\n";
print "$_\n" for grep { $_ =~ /SSL/ } keys %INC;
print "--\n";
if ($response->is_success) {
print $response->decoded_content; # or whatever
exit(0);
}
else {
print "\nFail:\n";
print $response->status_line ."\n";
exit(1);
}
Here's the output from this code:
--
Crypt/SSLeay.pm
Crypt/SSLeay/X509.pm
Net/SSL.pm
--
Fail:
500 Can't connect to github.com:443 (Crypt-SSLeay can't verify hostnames)
If I then uncomment $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;, I do see a single connect to github.com:443 on the proxy and then nothing. (Note it works great from a web browser through the proxy). After much hanging I get the following output from the script:
SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
SSL_connect:before/connect initialization
SSL_connect:SSLv3 write client hello A
SSL_connect:failed in SSLv3 read server hello A
SSL_connect:before/connect initialization
SSL_connect:SSLv2 write client hello A
SSL_connect:failed in SSLv2 read server hello A
--
Crypt/SSLeay.pm
Crypt/SSLeay/X509.pm
Net/SSL.pm
Crypt/SSLeay/CTX.pm
Crypt/SSLeay/MainContext.pm
--
Fail:
500 SSL negotiation failed:
If anyone can provide some direction here I would greatly appreciate it!

I just uploaded the LWP::Protocol::connect module to CPAN.
This module adds the missing HTTP/CONNECT method support to LWP.
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://proxyhost.domain:3128/');
$ua->get('https://www.somesslsite.com');
With this module you can use the regular IO::Socket::SSL implementation for LWP >=6.00.

Why would you want "Force use of Net::SSL".
Try
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
BEGIN {
$ENV{HTTPS_PROXY} = 'https://192.168.1.11:80';
# $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
$ENV{HTTPS_DEBUG} = 1; #Add debug output
}
my $ua = LWP::UserAgent->new();
my $req = HTTP::Request->new('GET','https://github.com/');
my $response = $ua->request($req);
print $response->code ."\n";
An out put of 200 should mean that there were no errors.
A below sample code of mine works perfectly
#!/usr/bin/perl
use warnings;
use LWP::UserAgent;
BEGIN {
$ENV{HTTPS_PROXY} = 'https://176.9.209.113:8080'; #Valid HTTPS proxy taken from http://hidemyass.com/proxy-list/
$ENV{HTTPS_DEBUG} = 1;
}
my $ua = new LWP::UserAgent;
my $req = new HTTP::Request('GET', 'https://www.nodeworks.com');
my $res = $ua->request($req);
print $res->code, "\n";
Output-
200
SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
SSL_connect:SSLv3 read server hello A
SSL_connect:SSLv3 read server certificate A
SSL_connect:SSLv3 read server key exchange A
SSL_connect:SSLv3 read server done A
SSL_connect:SSLv3 write client key exchange A
SSL_connect:SSLv3 write change cipher spec A
SSL_connect:SSLv3 write finished A
SSL_connect:SSLv3 flush data
SSL_connect:SSLv3 read finished A
SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
SSL_connect:SSLv3 read server hello A
SSL_connect:SSLv3 read server certificate A
SSL_connect:SSLv3 read server key exchange A
SSL_connect:SSLv3 read server done A
SSL_connect:SSLv3 write client key exchange A
SSL_connect:SSLv3 write change cipher spec A
SSL_connect:SSLv3 write finished A
SSL_connect:SSLv3 flush data
SSL_connect:SSLv3 read finished A
Tool completed successfully
With https://github.com/ the output is-
200
SSL_connect:before/connect initialization
SSL_connect:SSLv2/v3 write client hello A
SSL_connect:SSLv3 read server hello A
SSL_connect:SSLv3 read server certificate A
SSL_connect:SSLv3 read server done A
SSL_connect:SSLv3 write client key exchange A
SSL_connect:SSLv3 write change cipher spec A
SSL_connect:SSLv3 write finished A
SSL_connect:SSLv3 flush data
SSL_connect:SSLv3 read finished A
Tool completed successfully
So having said all this. Your code version (below) should work fine-
use warnings;
use LWP::UserAgent;
BEGIN {
$ENV{HTTPS_PROXY} = 'https://176.9.209.113:8080';
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0; #works even with this
$ENV{HTTPS_DEBUG} = 1; #Add debug output
}
my $ua = new LWP::UserAgent;
my $req = new HTTP::Request('GET', 'https://github.com/');
my $res = $ua->request($req);
print $res->code, "\n";
if ($res->is_success) {
print $res->decoded_content; # or whatever
exit(0);
}
else {
print "\nFail:\n";
print $res->status_line ."\n";
exit(1);
}

I pretty much ran into the same problem. Here are the things that fixed it for me:
make sure you use Crypt::SSLeay 0.58 or lower. At least 0.64 blocked endlessly for me (as described at http://cpanforum.com/posts/13808 ), while 0.57/0.58 worked.
make sure Net::HTTP has been updated to at least 6.03 (to avoid https://rt.cpan.org/Public/Bug/Display.html?id=72790 )

Instead of using Net::SSL which does not provide much host verification (and no SNI) you can use Net::SSLGlue::LWP. This monkey-patches LWP so that https_proxy can be used with the default SSL backend IO::Socket::SSL:
use Net::SSLGlue::LWP; # do this first
use LWP::Simple;
... continue with normal LWP stuff..

I know this may be a dead question, but if anyone else hits it I've another angle... I can't promise any answers, but we've faced a long-standing problem at $work in this area, but with the Squid proxy, and maybe specific to use of X509 client certs.
The use of the Net::SSL override is part of the solution, but I would fear that WinGate could be the problem (and not something I can help with) although in our case we contact the proxy over http (not sure how LWP deals with proxy+https).
For the record, here's an example of the precise form of code we use:
use Net::SSL;
$ENV{PERL_NET_HTTPS_SSL_SOCKET_CLASS}="Net::SSL";
use LWP::UserAgent;
use LWP::Protocol::https;
my $ua = LWP::UserAgent->new;
$ENV{HTTPS_PROXY} = 'http://cache.local.employer.co.uk:80';
$ua->get("https://example.com/");
This is Perl 5.8.8 with recent CPAN installs (hence separation of L:P:https), so we have a fresh Net::HTTP.
I was going to mention some versions of Net::HTTP are crocked, but I just realised that was my CPAN bug in Martin's reply :)
Sorry if this doesn't add anything.

I've sent a pull-request on libwww-perl repository to fix (or may be workaround ...) the issue.
The comment of this PR shows a simple program that connects with https to github.com through a proxy. With this patch there's no need to mess around with %ENV in your program.
Another advantage is that you can re-use the usual https_proxy setup.

There has been a error in earl 5.8 and some further module where environment variable HTTP_PROXY does not appropriately sets the proxy connection.
Your case has a issue where the bug reported is as mentioned here https://bugzilla.redhat.com/show_bug.cgi?id=1094440
Better way to use this is without environment variable and use LWP UserAgent
`use LWP::UserAgent;
$ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://proxyhost.domain:3128/');`

#!/usr/bin/env perl
#
# mimvp.com
# 2017-03-28
use CGI;
use strict;
use LWP::UserAgent;
our %proxy_https = ("https", "connect://173.233.55.118:443");
our $mimvp_url = "https://proxy.mimvp.com/exist.php";
## https
## 1. download LWP-Protocol-connect (wget http://search.cpan.org/CPAN/authors/id/B/BE/BENNING/LWP-Protocol-connect-6.09.tar.gz)
## 2. tar zxvf LWP-Protocol-connect-6.09.tar.gz
## cd LWP-Protocol-connect-6.09
## perl Makefile.PL
## make
## sudo make install
sub test_connect {
my ($url, %proxy) = #_;
print "proxy : $proxy{'http'}\n";
print "https : $proxy{'https'}\n";
print "socks4 : $proxy{'socks4'}\n";
print "socks5 : $proxy{'socks5'}\n";
print "url : $url\n";
my $browser = LWP::UserAgent->new();
$browser->env_proxy();
# # 设置的代理格式
$browser->proxy(%proxy);
$browser->timeout(30);
$browser->agent('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36');
# my $req = new HTTP::Request('GET', $url);
# my $response = $browser->request($req);
my $response = $browser->get($url); # 爬取的网址
my $is_success = $response->is_success(); # 1
my $content_type = $response->content_type(); # text/html
my $content = $response->content(); # 网页正文
my $content_length = length($content); # 网页正文长度
print "$is_success\n";
print "$content_type\n";
print "$content_length\n";
print "$content\n";
}
test_connect($mimvp_url, %proxy_https); # https
## perl mimvp-proxy-perl.pl

Related

Perl HTTPS over proxy using LWP::UserAgent

I wish to request a URL via a HTTPS proxy using perl's LWP::UserAgent module. There is quite a few reference around this, but nothing could help me get it work.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use Data::Dumper;
BEGIN {
$ENV{HTTPS_PROXY} = 'https://<IP>:<PORT>';
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
$ENV{HTTPS_PROXY_USERNAME} = '<API_KEY>';
$ENV{HTTPS_PROXY_PASSWORD} = '';
$ENV{HTTPS_DEBUG} = 1; #Add debug output
}
my $ua = LWP::UserAgent->new(ssl_opts => {verify_hostname => 0}, SSL_version => 'SSLv3', allowed_protocols => ['https', 'http']);
$ua->proxy(['https', 'http'], 'https://<IP>:<PORT>');
my $req = HTTP::Request->new('GET','https://<DOMAIN_URL>');
print STDERR Dumper($ua);
my $response = $ua->request($req);
print $response->code ."\n";
print STDERR Dumper($response);
I get this error:
SSL connect attempt failed error:140770FC:SSL
routines:SSL23_GET_SERVER_HELLO:unknown protocol at
/home/user/project/local/lib/perl5/LWP/Protocol/http.pm line 51.
Despite me specifying allowed_protocols in LWP, https scheme in proxy and the url scheme being https, it still goes to LWP::Protocol::HTTP above and not to LWP::Protocol::HTTPS.
I also verified that the version of LWP::Protocol::HTTPS is 6.06 which is the same as LWP::UserAgent (which was mentioned in one of the forums)
This worked for me:
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://<USER>:<PSWD>#<IP>:<PORT>/');
$ua->get('https://www.somesslsite.com');
HTTPS Proxy and LWP::UserAgent
Note: The environment credentials (HTTPS_PROXY_USERNAME) didnt work for me. I had to enter it in the URL like above.
The correct setup with recent versions of LWP (starting with version 6.06 which you have) is to just use the same syntax as found in other applications together with the proxy function:
my $ua = LWP::UserAgent->new;
$ua->proxy(https => 'http://user:pass#proxy');
$ua->get('https://server');
Alternatively you could set the environment variable https_proxy to the same value, i.e. http://user:pass#proxy.
Before 6.06 proxy support was broken at least when used together with IO::Socket::SSL (default since version 6.0). The syntax you have in your question is for the old backend Crypt::SSLeay which is no longer recommended because it does not check the certificates properly.

LWP HTTPS proxy timing out

I am trying to connect via HTTPS to a site using LWP::Simple. I have set the environment variables both inside of the script and in the shell. I continue to get a 500 Connection Timed Out. I can connect to a HTTP site just fine.
The proxy is letting communication through. I can connect to the HTTPS site through the proxy using curl without a problem.
Any suggestions?
#!/usr/bin/perl
use warnings;
use Net::SSL;
use LWP::UserAgent;
use LWP::Debug qw(+);
use Data::Dumper;
$ENV{PERL_NET_HTTPS_SSL_SOCKET_CLASS} = "Net::SSL";
$ENV{HTTPS_DEBUG} = 1;
$ENV{HTTPS_VERSION} = 2;
my $ua = LWP::UserAgent->new (verify_hostname => 0);
$ua->ssl_opts(verify_hostname => 0,
SSL_verify_mode => 0x00);
$ua->proxy('https' => 'http://x.x.x.x:3128');
print $ua->proxy('https');
print Dumper($ua);
my $response = $ua->get('https://qualys.com/');
print Dumper ($response);
if ($response->is_success) {
print $response->decoded_content; # or whatever
exit(0);
}
else {
print "\nFail:\n";
print $response->status_line ."\n";
exit(1);
}
HTTPS proxy support for LWP is (or at least was until some days ago) broken, at least if you use IO::Socket::SSL, which is the default for LWP versions >=6.0 because it provides better security. It should be much better now with LWP 6.0.6 and LWP::Protocol::https 6.0.6 which were released few days ago. If you cannot upgrade you might try using Net::SSLGlue::LWP which patches LWP for better https proxy support.

Why might Perl allow for http websites using TOR but not https?

I am having difficulty using perl to visit a website via TOR if it is an https site but not if it is an http site.
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;
my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http', 'https'], 'socks://localhost:9150');
$mech->get("https://www.google.com");
I am receiving the error message: Error GETing https://www.google.com: Status read failed: Bad file descriptor at line 10," where line i10 is the last line of the program.
In the TOR browser, I can successfully view: "https://www.google.com" with a port of 9150.
I am using ActivePerl 5.16.2; Vadalia 0.2.21 and Tor 0.2.3.25.
I have a Windows machine and my primary internet browser is Mozilla.
I have tried installing packages with the commands:
cpan LWP::UserAgent
ppm install LWP::Protocol::https
cpan LWP::Protocol::https
ppm install LWP::Protocol::socks
cpan LWP::Protocol::socks
ppm install Mozilla::CA
ppm install IO::Socket::SSL
ppm install Crypt::SSLeay
cpan Crypt::SSLeay
Thank you for any help! Please let me know whether there is any further information that I can provide.
Time ago, i'd found the way to go throught https sites with Tor using WWW::Curl::Easy to fetch those kind of sites, because using LWP i found the same problems.
After that i save all html in files and parsing them using WWW::Mechanzie or HTML::TreeBuilder.
If you want more interactivity with site like post forms , etc. This solutions may be more tedious because you'll need to interact with curl.
package Curl;
use warnings;
use WWW::Curl::Easy;
use WWW::UserAgent::Random;
my $curl = WWW::Curl::Easy->new;
my $useragent = rand_ua("browsers");
my $host = 'localhost';
my $port = '9070';
my $timeout = '20';
my $connectTimeOut= '20';
&init;
sub get
{
my $url = shift;
$curl->setopt(CURLOPT_URL, $url);
my $response_body;
$curl->setopt(CURLOPT_WRITEDATA,\$response_body);
my $retcode = $curl->perform;
if ($retcode == 0) {
print("Transfer went ok Http::Code = ".$curl->strerror($retcode)."\n");
my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
# judge result and next action based on $response_code
return \$response_body;
} else {
# Error code, type of error, error message
print("An error happened: $retcode ".$curl->strerror($retcode)." ".$curl->errbuf."\n");
return 0;
}
}
sub init
{
#setejem el proxy
$curl->setopt(CURLOPT_PROXY,"$host:".$port);
$curl->setopt(CURLOPT_PROXYTYPE,CURLPROXY_SOCKS4);
#posem les altres dades
$curl->setopt(CURLOPT_USERAGENT, $useragent);
$curl->setopt(CURLOPT_CONNECTTIMEOUT, $connectTimeOut);
$curl->setopt(CURLOPT_TIMEOUT, $timeout);
$curl->setopt(CURLOPT_SSL_VERIFYPEER,0);
$curl->setopt(CURLOPT_HEADER,0);
}
Hope this will help you!
Maybe the proxy that you are using is already an HTTPS proxy (ie. CONNECT proxy). In that case this should work (untested):
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;
my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http'], 'socks://localhost:9150');
$mech->proxy(['https'], 'https://localhost:9150'); ### <-- make https go over https-connect proxy
$mech->get("https://www.google.com");
I cannot find the origin but I fought with that a long time ago. Basically the problem I had was with the imlpementation that LWP::UserAgent used for the https requests.
Possibly this question can help you: How do I force LWP to use Crypt::SSLeay for HTTPS requests?

Why does LWP::UserAgent GET request fail with HTTPS?

Here's my code
#!/path/to/perl
use strict;
use LWP::UserAgent;
use HTTP::Request::Common;
use Crypt::SSLeay;
$ENV{HTTPS_PROXY} = 'http://proxy:8080/';
$ENV{HTTPS_DEBUG} = 1;
my $myurl = "https://www.redhat.com";
my $ua = new LWP::UserAgent;
$ua->cookie_jar( {} );
$ua->protocols_allowed( [ 'http','https'] );
$ua->proxy(['http', 'https'], 'http://proxy:8080/');
my $page = $ua->get($myurl);
die "Error $myurl\n ", $page->status_line, "\n Aborting"
unless $page->is_success;
print "Success", $page1->content_type, " document!\n";
It returns
Error at https://www.redhat.com
400 Bad Request
Aborting at test.pl line 30.
what's wrong?
Edit:
Apparently, Its a bug. But the workaround doesn't work for me.
Ha! I got the answer!
1) remove the '/' after the port of ENV{HTTPS_PROXY}
2) Apparently, LWP's proxy system send 'GET' requests instead of CONNECT requests so use Crypt::SSLeay's proxy system by just setting the environment variable and remove the proxy command.
On some systems, e.g. Debian, you need to install the appropriate SSL library for this to work. The error messages on theses systems can sometimes be at bit missleading. I think the Debian package would be libnet-ssleay-perl.
I just uploaded the LWP::Protocol::connect module to CPAN. This module adds the missing HTTP/CONNECT method support to LWP.
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://proxyhost.domain:3128/');
$ua->get('https://www.somesslsite.com');
With this module you can use the regular IO::Socket::SSL implementation for LWP >=6.00.
It looks like your proxy server does not accept HTTPS connections. Have you tried setting it up in your favorite browser and viewing the URL?

Why can't I connect to my CAS server with Perl's AuthCAS?

I'm attempting to use an existing CAS server to authenticate login for a Perl CGI web script and am using the AuthCAS Perl module (v 1.3.1). I can connect to the CAS server to get the service ticket but when I try to connect to validate the ticket my script returns with the following error from the IO::Socket::SSL module:
500 Can't connect to [CAS Server]:443 (Bad hostname '[CAS Server]')
([CAS Server] substituted for real server name)
Symptoms/Tests:
If I type the generated URL for the authentication into the web browser's location bar it returns just fine with the expected XML snippet. So it is not a bad host name.
If I generate a script without using the AuthCAS module but using the IO::Socket::SSL module directly to query the CAS server for validation on the generated service ticket the Perl script will run fine from the command line but not in the browser.
If I add the AuthCAS module into the script in item 2, the script no longer works on the command line and still doesn't work in the browser.
Here is the bare-bones script that produces the error:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use AuthCAS;
use CGI::Carp qw( fatalsToBrowser );
my $id = $ENV{QUERY_STRING};
my $q = new CGI;
my $target = "http://localhost/cgi-bin/testCAS.cgi";
my $cas = new AuthCAS(casUrl => 'https://cas_server/cas');
if ($id eq ""){
my $login_url = $cas->getServerLoginURL($target);
printf "Location: $login_url\n\n";
exit 0;
} else {
print $q->header();
print "CAS TEST<br>\n";
## When coming back from the CAS server a ticket is provided in the QUERY_STRING
print "QUERY_STRING = " . $id . "</br>\n";
## $ST should contain the received Service Ticket
my $ST = $q->param('ticket');
my $user = $cas->validateST($target, $ST); #### This is what fails
printf "Error: %s\n", &AuthCAS::get_errors() unless (defined $user);
}
Any ideas on where the conflict might be?
The error is coming from the line directly above the snippet Cebjyre quoted namely
$ssl_socket = new IO::Socket::SSL(%ssl_options);
namely the socket creation. All of the input parameters are correct. I had edited the module to put in debug statements and print out all the parameters just before that call and they are all fine. Looks like I'm going to have to dive deeper into the IO::Socket::SSL module.
As usually happens when I post questions like this, I found the problem. It turns out the Crypt::SSLeay module was not installed or at least not up to date. Of course the error messages didn't give me any clues. Updating it and all the problems go away and things are working fine now.
Well, from the module source it looks like that IO::Socket error is coming from get_https2
[...]
unless ($ssl_socket) {
$errors = sprintf "error %s unable to connect https://%s:%s/\n",&IO::Socket::SSL::errstr,$host,$port;
return undef;
}
[...]
which is called by callCAS, which is called by validateST.
One option is to temporarily edit the module file to put some debug statements in if you can, but if I had to guess, I'd say the casUrl you are supplying isn't matching up to the _parse_url regex properly - maybe you have three slashes after the https?