Here's my code:
#!/path/to/perl
use strict;
use LWP::UserAgent;
use HTTP::Request::Common;
use Crypt::SSLeay;
$ENV{HTTPS_PROXY} = 'http://proxy:8080/';
$ENV{HTTPS_DEBUG} = 1;
my $myurl = "https://www.redhat.com";
my $ua = LWP::UserAgent->new;
$ua->cookie_jar( {} );
$ua->protocols_allowed( [ 'http','https'] );
$ua->proxy(['http', 'https'], 'http://proxy:8080/');
my $page = $ua->get($myurl);
die "Error at $myurl\n ", $page->status_line, "\n Aborting"
unless $page->is_success;
print "Success: ", $page->content_type, " document!\n";
It returns:
Error at https://www.redhat.com
400 Bad Request
Aborting at test.pl line 30.
What's wrong?
Edit:
Apparently, it's a bug. But the workaround doesn't work for me.
Ha! I got the answer!
1) Remove the trailing '/' after the port in $ENV{HTTPS_PROXY}.
2) Apparently, LWP's proxy system sends GET requests instead of CONNECT requests, so use Crypt::SSLeay's proxy system instead: just set the environment variable and remove the $ua->proxy call.
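A minimal sketch of the two steps above, assuming Crypt::SSLeay is the active HTTPS backend as in the question (it is the backend, not LWP, that then reads HTTPS_PROXY). The helper name strip_trailing_slash is hypothetical, and the request only runs when a URL is passed on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper for step 1: drop any trailing slash from a proxy URL.
sub strip_trailing_slash {
    my ($proxy) = @_;
    $proxy =~ s{/+\z}{};
    return $proxy;
}

# Step 2: hand the proxy to the SSL backend through the environment
# (assumption: Crypt::SSLeay is the backend) and do not call $ua->proxy().
$ENV{HTTPS_PROXY} = strip_trailing_slash('http://proxy:8080/');

my $url = shift @ARGV;    # pass a URL on the command line to try a request
if (defined $url) {
    my $ua = LWP::UserAgent->new;
    $ua->cookie_jar({});
    my $page = $ua->get($url);
    die "Error at $url\n ", $page->status_line, "\n Aborting"
        unless $page->is_success;
    print "Success: ", $page->content_type, " document!\n";
}
```

Run it as, for example, perl test.pl https://www.redhat.com to exercise the proxy path.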
On some systems, e.g. Debian, you need to install the appropriate SSL library for this to work. The error messages on these systems can sometimes be a bit misleading. I think the Debian package would be libnet-ssleay-perl.
I just uploaded the LWP::Protocol::connect module to CPAN. This module adds the missing HTTP/CONNECT method support to LWP.
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://proxyhost.domain:3128/');
$ua->get('https://www.somesslsite.com');
With this module you can use the regular IO::Socket::SSL implementation for LWP >=6.00.
It looks like your proxy server does not accept HTTPS connections. Have you tried setting it up in your favorite browser and viewing the URL?
Related
I have migrated my scripts from CentOS 7 to 8 and there's a new Perl version. I have the following snippet that uses head to check if a URL exists:
#!/bin/perl
use strict;
use warnings;
use LWP::Simple;
my $sitemapurl = "https://www.prosinger.net";
if (head($sitemapurl)) {
...
}
else {
print "The $sitemapurl doesn't exist\n";
exit(1);
}
It now always returns that the URL doesn't exist. I'm quite sure that this has something to do with HTTPS (I have perl-LWP-Protocol-https installed), but I'm not sure how to get any feedback from the head function to check what the error code is.
Any ideas?
You can use LWP::UserAgent instead of LWP::Simple, which allows you to get an error message:
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
my $sitemapurl = "https://www.prosinger.net";
my $req = $ua->head($sitemapurl);
if ($req->is_success) {
...
} else {
die "Could not head($sitemapurl): " . $req->status_line;
}
Running this code prints:
Could not head(https://www.prosinger.net): 403 Forbidden at head.pl line 15.
You can fix this (for this specific website; this will not work for all websites) by setting a User-Agent in your LWP::UserAgent object:
my $ua = LWP::UserAgent->new( agent => 'anything seems to work' );
Of interest is the decoded_content method of HTTP::Response, which allows you to get the content of the response (you don't need it in this case, but you might later):
my $req = $ua->get(...);
if ($req->is_success) {
my $content = $req->decoded_content;
...
}
Your code that uses LWP::Simple and Dada's version that switches to LWP::UserAgent are basically doing the same thing, except that you can get details of the error when using LWP::UserAgent.
Running the LWP::UserAgent version gives this error:
Could not head(https://www.prosinger.net): 500 Can't connect to www.prosinger.net:443 (SSL connect attempt failed error:2707307E:OCSP routines:OCSP_check_validity:status not yet valid)
And Googling that error message gives this SO answer as the first result. Is it possible that the clocks on your client machine and the server are out of sync?
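One rough way to check for clock skew is to compare the local time with the Date header of any HTTP response. The sketch below assumes HTTP::Date is installed (it is normally pulled in as an LWP dependency). clock_skew is a hypothetical helper, and the URL is whatever you pass on the command line, ideally a plain http:// one so that the check itself does not depend on TLS.

```perl
use strict;
use warnings;
use HTTP::Date qw(str2time);

# Hypothetical helper: seconds by which the local clock is ahead of
# (positive) or behind (negative) the time given in an HTTP Date header.
sub clock_skew {
    my ($date_header, $now) = @_;
    my $server_time = str2time($date_header);
    die "Unparseable Date header: $date_header\n" unless defined $server_time;
    return $now - $server_time;
}

my $url = shift @ARGV;    # only contact a server when a URL is given
if (defined $url) {
    require LWP::UserAgent;
    my $resp = LWP::UserAgent->new->head($url);
    my $date = $resp->header('Date')
        or die "No Date header in response: ", $resp->status_line, "\n";
    printf "Local clock differs from the server by %d seconds\n",
        clock_skew($date, time);
}
```

A difference of more than a few minutes would be consistent with the "status not yet valid" OCSP error, although it does not prove that the clock is the cause.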
I am using the LWP::Simple::get($url) function to access data from web pages. The problem is that get does not work for the URL below.
Below is the code snippet:
#!/usr/bin/perl
use LWP::Simple;
use JSON;
use Data::Dumper;
my $url = "https://www.cryptopia.co.nz/api/GetCurrencies";
my $json = get( $url);
die "Could not get $url!" unless defined $json;
my $decoded_json = decode_json($json);
print Dumper($decoded_json);
After running this code it gives the below error:
Could not get https://www.cryptopia.co.nz/api/GetCurrencies!
When I replace the URL with $url = "https://api.coinmarketcap.com/v1/ticker/" it works fine.
Can you please tell me what the root cause is and how I can fix it?
Also, the URL mentioned in the code snippet worked once and now suddenly does not work.
This is essentially the same question as this one, only for Perl LWP instead of NodeJS.
The problem is that the target site has a broken setup and is missing an intermediate certificate. This can be seen from the SSL Labs report:
This server's certificate chain is incomplete. Grade capped to B.
To work around this you have two secure options and one insecure option. The insecure but, unfortunately, often proposed option is to disable all certificate validation. Don't do this, because it makes your application vulnerable to man-in-the-middle attacks.
The next option is to have your own trust store which includes all necessary certificates, i.e. both the root CA and also the missing intermediate CA:
use strict;
use warnings;
use LWP::UserAgent;
use IO::Socket::SSL;
my $ua = LWP::UserAgent->new;
$ua->ssl_opts(
SSL_ca_file => 'myca.pem',
# disable OCSP stapling since it results in problems with this site
SSL_ocsp_mode => SSL_OCSP_NO_STAPLE
);
my $resp = $ua->get('https://www.cryptopia.co.nz/api/GetCurrencies');
print $resp->decoded_content;
myca.pem is in this case the concatenation of the PEM representation for the certificates of the intermediate "COMODO RSA Extended Validation Secure Server CA" and the root "COMODO RSA Certification Authority". I've provided it at Pastebin, here.
The third option is to trust this specific leaf certificate only. When used, this will trust a server presenting this certificate regardless of whether the certificate is expired or revoked, the hostname does not match the URL, or (as in this case) the trust chain cannot be built because of a missing intermediate certificate:
use strict;
use warnings;
use LWP::UserAgent;
use IO::Socket::SSL;
my $ua = LWP::UserAgent->new;
$ua->ssl_opts(
SSL_fingerprint => 'sha256$70bca153ac950b8fa92d20f04dceca929852c42dc1d51bdc3c290df256ae05d3',
SSL_ocsp_mode => SSL_OCSP_NO_STAPLE,
);
my $resp = $ua->get('https://www.cryptopia.co.nz/api/GetCurrencies');
print $resp->decoded_content;
The fingerprint you see here is the one you can also see in the browser when looking at the certificate.
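If you have the leaf certificate saved as a PEM file, the same value can be computed locally. The sketch below assumes the fingerprint is the SHA-256 digest of the certificate's DER encoding, which is how I understand the 'sha256$...' form of SSL_fingerprint. pem_to_fingerprint is a hypothetical helper, and the file name is whatever you pass on the command line.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: turn one PEM-encoded certificate into 'sha256$<hex>'.
# The digest is taken over the DER bytes, i.e. the base64-decoded PEM body.
sub pem_to_fingerprint {
    my ($pem) = @_;
    my ($body) = $pem =~ /-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----/s
        or die "No PEM certificate found\n";
    my $der = decode_base64($body);
    return 'sha256$' . sha256_hex($der);
}

my $file = shift @ARGV;    # path to a PEM file holding the leaf certificate
if (defined $file) {
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    local $/;              # slurp the whole file at once
    print pem_to_fingerprint(<$fh>), "\n";
}
```

The printed string can be passed as the SSL_fingerprint value. Note that it changes whenever the site renews its certificate.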
I wish to request a URL via an HTTPS proxy using Perl's LWP::UserAgent module. There are quite a few references on this, but nothing has helped me get it to work.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use Data::Dumper;
BEGIN {
$ENV{HTTPS_PROXY} = 'https://<IP>:<PORT>';
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
$ENV{HTTPS_PROXY_USERNAME} = '<API_KEY>';
$ENV{HTTPS_PROXY_PASSWORD} = '';
$ENV{HTTPS_DEBUG} = 1; #Add debug output
}
my $ua = LWP::UserAgent->new(ssl_opts => {verify_hostname => 0}, SSL_version => 'SSLv3', allowed_protocols => ['https', 'http']);
$ua->proxy(['https', 'http'], 'https://<IP>:<PORT>');
my $req = HTTP::Request->new('GET','https://<DOMAIN_URL>');
print STDERR Dumper($ua);
my $response = $ua->request($req);
print $response->code ."\n";
print STDERR Dumper($response);
I get this error:
SSL connect attempt failed error:140770FC:SSL
routines:SSL23_GET_SERVER_HELLO:unknown protocol at
/home/user/project/local/lib/perl5/LWP/Protocol/http.pm line 51.
Despite my specifying allowed_protocols in LWP, the https scheme in the proxy URL, and the URL scheme being https, it still goes to LWP::Protocol::http above and not to LWP::Protocol::https.
I also verified that the version of LWP::Protocol::https is 6.06, which is the same as LWP::UserAgent (this was mentioned in one of the forums).
This worked for me:
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://<USER>:<PSWD>@<IP>:<PORT>/');
$ua->get('https://www.somesslsite.com');
HTTPS Proxy and LWP::UserAgent
Note: The environment credentials (HTTPS_PROXY_USERNAME) didn't work for me. I had to put them in the URL as shown above.
The correct setup with recent versions of LWP (starting with version 6.06 which you have) is to just use the same syntax as found in other applications together with the proxy function:
my $ua = LWP::UserAgent->new;
$ua->proxy(https => 'http://user:pass@proxy');
$ua->get('https://server');
Alternatively you could set the environment variable https_proxy to the same value, i.e. http://user:pass@proxy.
Before 6.06 proxy support was broken at least when used together with IO::Socket::SSL (default since version 6.0). The syntax you have in your question is for the old backend Crypt::SSLeay which is no longer recommended because it does not check the certificates properly.
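For the environment-variable route, a small sketch follows. As far as I know, LWP::UserAgent does not read the *_proxy variables unless asked to, so it calls env_proxy explicitly. The proxy host, port and credentials are placeholders, and the https variable is only picked up when LWP::Protocol::https is installed.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Placeholder proxy URL with credentials; replace host, port, user and password.
$ENV{https_proxy} = 'http://user:pass@proxy.example:3128';

my $ua = LWP::UserAgent->new;
$ua->env_proxy;    # load proxy settings from the *_proxy environment variables

# With no second argument, proxy() returns the currently configured value.
my $proxy = $ua->proxy('https');
print defined $proxy
    ? "https proxy: $proxy\n"
    : "no https proxy configured\n";
```

Passing env_proxy => 1 to the constructor should have the same effect as calling the method.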
I am trying to connect via HTTPS to a site using LWP::Simple. I have set the environment variables both inside the script and in the shell. I continue to get a 500 Connection Timed Out. I can connect to an HTTP site just fine.
The proxy is letting communication through. I can connect to the HTTPS site through the proxy using curl without a problem.
Any suggestions?
#!/usr/bin/perl
use warnings;
use Net::SSL;
use LWP::UserAgent;
use LWP::Debug qw(+);
use Data::Dumper;
$ENV{PERL_NET_HTTPS_SSL_SOCKET_CLASS} = "Net::SSL";
$ENV{HTTPS_DEBUG} = 1;
$ENV{HTTPS_VERSION} = 2;
my $ua = LWP::UserAgent->new (verify_hostname => 0);
$ua->ssl_opts(verify_hostname => 0,
SSL_verify_mode => 0x00);
$ua->proxy('https' => 'http://x.x.x.x:3128');
print $ua->proxy('https');
print Dumper($ua);
my $response = $ua->get('https://qualys.com/');
print Dumper ($response);
if ($response->is_success) {
print $response->decoded_content; # or whatever
exit(0);
}
else {
print "\nFail:\n";
print $response->status_line ."\n";
exit(1);
}
HTTPS proxy support for LWP is (or at least was until some days ago) broken, at least if you use IO::Socket::SSL, which is the default for LWP versions >=6.0 because it provides better security. It should be much better now with LWP 6.06 and LWP::Protocol::https 6.06, which were released a few days ago. If you cannot upgrade, you might try Net::SSLGlue::LWP, which patches LWP for better HTTPS proxy support.
I am having difficulty using perl to visit a website via TOR if it is an https site but not if it is an http site.
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;
my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http', 'https'], 'socks://localhost:9150');
$mech->get("https://www.google.com");
I am receiving the error message: "Error GETing https://www.google.com: Status read failed: Bad file descriptor at line 10," where line 10 is the last line of the program.
In the TOR browser, I can successfully view: "https://www.google.com" with a port of 9150.
I am using ActivePerl 5.16.2, Vidalia 0.2.21 and Tor 0.2.3.25.
I have a Windows machine and my primary internet browser is Mozilla.
I have tried installing packages with the commands:
cpan LWP::UserAgent
ppm install LWP::Protocol::https
cpan LWP::Protocol::https
ppm install LWP::Protocol::socks
cpan LWP::Protocol::socks
ppm install Mozilla::CA
ppm install IO::Socket::SSL
ppm install Crypt::SSLeay
cpan Crypt::SSLeay
Thank you for any help! Please let me know whether there is any further information that I can provide.
Some time ago, I found a way to reach https sites through Tor by using WWW::Curl::Easy to fetch them, because I ran into the same problems with LWP.
After fetching, I save all the HTML to files and parse them using WWW::Mechanize or HTML::TreeBuilder.
If you want more interactivity with the site, such as posting forms, this solution may be more tedious because you'll need to interact with curl directly.
package Curl;
use warnings;
use WWW::Curl::Easy;
use WWW::UserAgent::Random;
my $curl = WWW::Curl::Easy->new;
my $useragent = rand_ua("browsers");
my $host = 'localhost';
my $port = '9070';
my $timeout = '20';
my $connectTimeOut= '20';
&init;
sub get
{
my $url = shift;
$curl->setopt(CURLOPT_URL, $url);
my $response_body;
$curl->setopt(CURLOPT_WRITEDATA,\$response_body);
my $retcode = $curl->perform;
if ($retcode == 0) {
my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
print("Transfer went ok, HTTP code = $response_code\n");
# judge result and next action based on $response_code
return \$response_body;
} else {
# Error code, type of error, error message
print("An error happened: $retcode ".$curl->strerror($retcode)." ".$curl->errbuf."\n");
return 0;
}
}
sub init
{
# set the proxy
$curl->setopt(CURLOPT_PROXY,"$host:".$port);
$curl->setopt(CURLOPT_PROXYTYPE,CURLPROXY_SOCKS4);
# set the remaining options
$curl->setopt(CURLOPT_USERAGENT, $useragent);
$curl->setopt(CURLOPT_CONNECTTIMEOUT, $connectTimeOut);
$curl->setopt(CURLOPT_TIMEOUT, $timeout);
$curl->setopt(CURLOPT_SSL_VERIFYPEER,0);
$curl->setopt(CURLOPT_HEADER,0);
}
Hope this will help you!
Maybe the proxy that you are using is already an HTTPS proxy (i.e. a CONNECT proxy). In that case this should work (untested):
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;
my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http'], 'socks://localhost:9150');
$mech->proxy(['https'], 'https://localhost:9150'); ### <-- make https go over https-connect proxy
$mech->get("https://www.google.com");
I cannot find the origin, but I fought with that a long time ago. Basically, the problem I had was with the implementation that LWP::UserAgent used for the https requests.
Possibly this question can help you: How do I force LWP to use Crypt::SSLeay for HTTPS requests?