Get reason for failed LWP::Simple head request - perl

I have migrated my scripts from CentOS 7 to CentOS 8, which comes with a newer Perl version. I have the following snippet that uses head to check whether a URL exists:
#!/bin/perl
use strict;
use warnings;
use LWP::Simple;
my $sitemapurl = "https://www.prosinger.net";
if (head($sitemapurl)) {
    ...
}
else {
    print "The $sitemapurl doesn't exist\n";
    exit(1);
}
It now always reports that the URL doesn't exist. I'm fairly sure this has something to do with https (I have perl-LWP-Protocol-https installed), but I don't know how to get any feedback from the head method to check what the error code is.
Any ideas?

You can use LWP::UserAgent instead of LWP::Simple, which allows you to get an error message:
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
my $sitemapurl = "https://www.prosinger.net";
my $req = $ua->head($sitemapurl);   # returns an HTTP::Response object
if ($req->is_success) {
    ...
} else {
    die "Could not head($sitemapurl): " . $req->status_line;
}
Running this code prints:
Could not head(https://www.prosinger.net): 403 Forbidden at head.pl line 15.
You can fix this (for this specific website; it will not work for all websites) by setting a User-Agent in your LWP::UserAgent object:
my $ua = LWP::UserAgent->new( agent => 'anything seems to work' );
Also of interest is the decoded_content method of HTTP::Response, which gives you the content of the response (you don't need it in this case, but you might later):
my $req = $ua->get(...);
if ($req->is_success) {
    my $content = $req->decoded_content;
    ...
}
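If you would rather keep LWP::Simple for everything else, note that it can export its underlying LWP::UserAgent object as $ua, so the same User-Agent workaround can be applied without switching modules (a minimal sketch; head() itself still only returns true or false, so you won't get the status line this way):

#!/bin/perl
use strict;
use warnings;
use LWP::Simple qw($ua head);

# LWP::Simple exposes its internal LWP::UserAgent as $ua when it is imported
# explicitly, so the agent string can be changed before calling head().
$ua->agent('anything seems to work');

my $sitemapurl = "https://www.prosinger.net";
unless (head($sitemapurl)) {
    print "The $sitemapurl doesn't exist\n";
    exit(1);
}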

Your code that uses LWP::Simple and Dada's version that switches to LWP::UserAgent are basically doing the same thing, except that you can get details of the error when using LWP::UserAgent.
Running the LWP::UserAgent version gives this error:
Could not head(https://www.prosinger.net): 500 Can't connect to www.prosinger.net:443 (SSL connect attempt failed error:2707307E:OCSP routines:OCSP_check_validity:status not yet valid)
And Googling that error message gives this SO answer as the first result. Is it possible that the clocks on your client machine and the server are out of sync?
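If you want to check that theory from the script itself, one quick diagnostic (a sketch only; the URL is just a well-known site fetched over plain http so the TLS problem does not get in the way) is to compare your local clock against the Date header of any HTTP response:

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Date qw(str2time);

# Rough clock-skew check: compare the server-supplied Date header with local time.
my $ua = LWP::UserAgent->new;
$ua->max_redirect(0);                       # the first response is enough
my $res = $ua->get('http://example.com/');  # any reachable site will do
if (my $date = $res->header('Date')) {
    my $skew = time() - str2time($date);
    printf "Local clock differs from the server by about %d seconds\n", $skew;
}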

Related

Certificates issue with a perl script

I need to use a URL to connect to OTRS to create a ticket via a web service, but I have a problem with certificates.
The error:
500 Can't connect to (certificate verify failed)
What I've tried
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
my $UA = LWP::UserAgent->new( ssl_opts => { verify_hostname => 0, } );
I've also tried to add a new certificate in C:\otrs\Kernel\cpan-lib\Mozilla\CA\cacert.pem.
Code: (posted only as a screenshot, not reproduced here)
Well, please try this code, and add or remove the line which sets the environment variable; for me this just works:
use strict;
use warnings;
use SOAP::Lite;
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
my $URL = 'https://self-signed.badssl.com/foo';
my $NameSpace = 'https://self-signed.badssl.com/';
my $Operation = 'TicketCreate';
my $XMLData = <<'XML';
<UserLogin>login</UserLogin>
<Password>pass</Password>
<Ticket>
<Title>Title</Title>
</Ticket>
XML
my $SOAPObject = SOAP::Lite->uri($NameSpace)->proxy($URL)->$Operation($XMLData);
Without the $ENV{} line it gives 500 Can't connect ... (certificate verify failed); with it, it gives 405 Not Allowed (since the test URL does not actually run the SOAP web service).
Also, please place actual code snippets on SO, in text and not in a screenshot, and please try and keep them short.
Oh, and by the way, I wrote a small script to create tickets in OTRS using the web service; maybe it's useful for you: https://github.com/mbeijen/App-OTRS-CreateTicket
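One more note: turning verification off works, but if the OTRS server simply uses an internal or self-signed CA, a safer variant (a sketch; the CA file path is a placeholder you would replace with your own) is to keep verification on and tell LWP which CA to trust:

use strict;
use warnings;
use LWP::UserAgent;

# Trust the specific CA that signed the server certificate instead of
# disabling verification. SOAP::Lite uses LWP underneath, so setting
# $ENV{PERL_LWP_SSL_CA_FILE} to the same path also works there.
my $ua = LWP::UserAgent->new(
    ssl_opts => {
        verify_hostname => 1,
        SSL_ca_file     => 'C:/otrs/our-internal-ca.pem',   # hypothetical path
    },
);
my $res = $ua->get('https://self-signed.badssl.com/');
print $res->status_line, "\n";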

Perl HTTPS over proxy using LWP::UserAgent

I wish to request a URL via an HTTPS proxy using Perl's LWP::UserAgent module. There are quite a few references around this, but none of them helped me get it to work.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use Data::Dumper;
BEGIN {
    $ENV{HTTPS_PROXY} = 'https://<IP>:<PORT>';
    $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
    $ENV{HTTPS_PROXY_USERNAME} = '<API_KEY>';
    $ENV{HTTPS_PROXY_PASSWORD} = '';
    $ENV{HTTPS_DEBUG} = 1; # add debug output
}
my $ua = LWP::UserAgent->new(ssl_opts => {verify_hostname => 0}, SSL_version => 'SSLv3', allowed_protocols => ['https', 'http']);
$ua->proxy(['https', 'http'], 'https://<IP>:<PORT>');
my $req = HTTP::Request->new('GET','https://<DOMAIN_URL>');
print STDERR Dumper($ua);
my $response = $ua->request($req);
print $response->code ."\n";
print STDERR Dumper($response);
I get this error:
SSL connect attempt failed error:140770FC:SSL
routines:SSL23_GET_SERVER_HELLO:unknown protocol at
/home/user/project/local/lib/perl5/LWP/Protocol/http.pm line 51.
Despite my specifying allowed_protocols in LWP, the https scheme in the proxy, and the URL scheme being https, it still goes through LWP::Protocol::http (as in the error above) and not LWP::Protocol::https.
I also verified that the version of LWP::Protocol::https is 6.06, which is the same as LWP::UserAgent (this was mentioned as relevant in one of the forums).
This worked for me (the connect:// proxy scheme requires the LWP::Protocol::connect module mentioned in another answer below):
use LWP::UserAgent;

my $ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://<USER>:<PSWD>@<IP>:<PORT>/');
$ua->get('https://www.somesslsite.com');
HTTPS Proxy and LWP::UserAgent
Note: the environment credentials (HTTPS_PROXY_USERNAME) didn't work for me. I had to put them in the URL as above.
The correct setup with recent versions of LWP (starting with version 6.06, which you have) is to just use the same proxy syntax as in other applications, together with the proxy method:
my $ua = LWP::UserAgent->new;
$ua->proxy(https => 'http://user:pass@proxy');
$ua->get('https://server');
Alternatively you could set the environment variable https_proxy to the same value, i.e. http://user:pass@proxy.
Before 6.06, proxy support was broken, at least when used together with IO::Socket::SSL (the default since version 6.0). The syntax you have in your question is for the old backend Crypt::SSLeay, which is no longer recommended because it does not check certificates properly.
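For completeness, the environment-variable variant as a complete sketch (user, password and proxy host are placeholders; env_proxy() tells LWP to read the *_proxy variables):

use strict;
use warnings;
use LWP::UserAgent;

# Same effect as $ua->proxy(https => ...), but configured via the environment.
$ENV{https_proxy} = 'http://user:pass@proxy';

my $ua = LWP::UserAgent->new;
$ua->env_proxy;                       # pick up https_proxy (and friends)
my $res = $ua->get('https://server');
print $res->status_line, "\n";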

Why might Perl allow for http websites using TOR but not https?

I am having difficulty using perl to visit a website via TOR if it is an https site but not if it is an http site.
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;
my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http', 'https'], 'socks://localhost:9150');
$mech->get("https://www.google.com");
I am receiving the error message "Error GETing https://www.google.com: Status read failed: Bad file descriptor at line 10", where line 10 is the last line of the program.
In the TOR browser, I can successfully view: "https://www.google.com" with a port of 9150.
I am using ActivePerl 5.16.2, Vidalia 0.2.21, and Tor 0.2.3.25.
I have a Windows machine and my primary internet browser is Mozilla.
I have tried installing packages with the commands:
cpan LWP::UserAgent
ppm install LWP::Protocol::https
cpan LWP::Protocol::https
ppm install LWP::Protocol::socks
cpan LWP::Protocol::socks
ppm install Mozilla::CA
ppm install IO::Socket::SSL
ppm install Crypt::SSLeay
cpan Crypt::SSLeay
Thank you for any help! Please let me know whether there is any further information that I can provide.
A while ago I found a way to get through https sites with Tor by using WWW::Curl::Easy to fetch them, because with LWP I ran into the same problems.
After that I save all the HTML to files and parse them with WWW::Mechanize or HTML::TreeBuilder.
If you want more interaction with the site, like posting forms, this solution may be more tedious because you'll need to do it through curl.
package Curl;
use strict;
use warnings;
use WWW::Curl::Easy;
use WWW::UserAgent::Random;

my $curl = WWW::Curl::Easy->new;
my $useragent = rand_ua("browsers");   # random browser User-Agent string
my $host = 'localhost';
my $port = '9070';
my $timeout = '20';
my $connectTimeOut = '20';

&init;

sub get
{
    my $url = shift;
    $curl->setopt(CURLOPT_URL, $url);
    my $response_body;
    $curl->setopt(CURLOPT_WRITEDATA, \$response_body);
    my $retcode = $curl->perform;
    if ($retcode == 0) {
        my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
        print("Transfer went ok, HTTP code = $response_code\n");
        # judge result and next action based on $response_code
        return \$response_body;
    } else {
        # Error code, type of error, error message
        print("An error happened: $retcode ".$curl->strerror($retcode)." ".$curl->errbuf."\n");
        return 0;
    }
}

sub init
{
    # set the proxy (Tor SOCKS listener)
    $curl->setopt(CURLOPT_PROXY, "$host:".$port);
    $curl->setopt(CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4);
    # set the remaining options
    $curl->setopt(CURLOPT_USERAGENT, $useragent);
    $curl->setopt(CURLOPT_CONNECTTIMEOUT, $connectTimeOut);
    $curl->setopt(CURLOPT_TIMEOUT, $timeout);
    $curl->setopt(CURLOPT_SSL_VERIFYPEER, 0);
    $curl->setopt(CURLOPT_HEADER, 0);
}

1;   # return a true value so the file can be loaded with "use"
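A minimal usage sketch, assuming the package above is saved as Curl.pm in a directory on @INC (get() returns a reference to the response body on success and 0 on failure, as written above):

#!/usr/bin/perl
use strict;
use warnings;
use Curl;   # the package defined above; init() runs when it is loaded

my $html_ref = Curl::get('https://www.google.com');
if ($html_ref) {
    print substr($$html_ref, 0, 200), "\n";   # first 200 bytes of the page
}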
Hope this will help you!
Maybe the proxy that you are using is already an HTTPS proxy (i.e. a CONNECT proxy). In that case this should work (untested):
#!/usr/bin/perl
use strict;
use WWW::Mechanize;
use LWP::Protocol::socks;
use LWP::Protocol::https;
use utf8;
my $mech = WWW::Mechanize->new(timeout => 60*5);
$mech->proxy(['http'], 'socks://localhost:9150');
$mech->proxy(['https'], 'https://localhost:9150'); ### <-- make https go over https-connect proxy
$mech->get("https://www.google.com");
I cannot find the source any more, but I fought with this a long time ago. Basically the problem I had was with the implementation that LWP::UserAgent used for https requests.
Possibly this question can help you: How do I force LWP to use Crypt::SSLeay for HTTPS requests?

Why does LWP::UserAgent GET request fail with HTTPS?

Here's my code
#!/path/to/perl
use strict;
use LWP::UserAgent;
use HTTP::Request::Common;
use Crypt::SSLeay;
$ENV{HTTPS_PROXY} = 'http://proxy:8080/';
$ENV{HTTPS_DEBUG} = 1;
my $myurl = "https://www.redhat.com";
my $ua = new LWP::UserAgent;
$ua->cookie_jar( {} );
$ua->protocols_allowed( [ 'http','https'] );
$ua->proxy(['http', 'https'], 'http://proxy:8080/');
my $page = $ua->get($myurl);
die "Error $myurl\n ", $page->status_line, "\n Aborting"
unless $page->is_success;
print "Success", $page1->content_type, " document!\n";
It returns
Error at https://www.redhat.com
400 Bad Request
Aborting at test.pl line 30.
What's wrong?
Edit:
Apparently, it's a bug. But the workaround doesn't work for me.
Ha! I got the answer!
1) Remove the '/' after the port in $ENV{HTTPS_PROXY}.
2) Apparently, LWP's proxy system sends GET requests instead of CONNECT requests, so use Crypt::SSLeay's proxy support instead: just set the environment variable and remove the $ua->proxy call.
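Applied to the code in the question, the fix looks roughly like this (a sketch assuming the old Crypt::SSLeay backend the question is using; the proxy host and port are the same placeholders as above):

#!/path/to/perl
use strict;
use LWP::UserAgent;
use Crypt::SSLeay;

# Let Crypt::SSLeay handle the proxy: set the environment variable only,
# without a trailing slash, and make no $ua->proxy() call for https.
$ENV{HTTPS_PROXY} = 'http://proxy:8080';

my $ua = LWP::UserAgent->new;
my $page = $ua->get('https://www.redhat.com');
die "Error: ", $page->status_line, "\n" unless $page->is_success;
print "Success ", $page->content_type, " document!\n";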
On some systems, e.g. Debian, you need to install the appropriate SSL library for this to work. The error messages on these systems can sometimes be a bit misleading. I think the Debian package would be libnet-ssleay-perl.
I just uploaded the LWP::Protocol::connect module to CPAN. This module adds the missing HTTP/CONNECT method support to LWP.
use LWP::UserAgent;
$ua = LWP::UserAgent->new();
$ua->proxy('https', 'connect://proxyhost.domain:3128/');
$ua->get('https://www.somesslsite.com');
With this module you can use the regular IO::Socket::SSL implementation for LWP >=6.00.
It looks like your proxy server does not accept HTTPS connections. Have you tried setting it up in your favorite browser and viewing the URL?

Why can't I connect to my CAS server with Perl's AuthCAS?

I'm attempting to use an existing CAS server to authenticate login for a Perl CGI web script and am using the AuthCAS Perl module (v 1.3.1). I can connect to the CAS server to get the service ticket but when I try to connect to validate the ticket my script returns with the following error from the IO::Socket::SSL module:
500 Can't connect to [CAS Server]:443 (Bad hostname '[CAS Server]')
([CAS Server] substituted for real server name)
Symptoms/Tests:
1. If I type the generated URL for the authentication into the web browser's location bar, it returns just fine with the expected XML snippet. So it is not a bad host name.
2. If I write a script that skips the AuthCAS module and uses the IO::Socket::SSL module directly to query the CAS server to validate the generated service ticket, the script runs fine from the command line but not in the browser.
3. If I add the AuthCAS module to the script from item 2, the script no longer works on the command line and still doesn't work in the browser.
Here is the bare-bones script that produces the error:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use AuthCAS;
use CGI::Carp qw( fatalsToBrowser );
my $id = $ENV{QUERY_STRING};
my $q = new CGI;
my $target = "http://localhost/cgi-bin/testCAS.cgi";
my $cas = new AuthCAS(casUrl => 'https://cas_server/cas');
if ($id eq "") {
    my $login_url = $cas->getServerLoginURL($target);
    printf "Location: $login_url\n\n";
    exit 0;
} else {
    print $q->header();
    print "CAS TEST<br>\n";
    ## When coming back from the CAS server a ticket is provided in the QUERY_STRING
    print "QUERY_STRING = " . $id . "</br>\n";
    ## $ST should contain the received Service Ticket
    my $ST = $q->param('ticket');
    my $user = $cas->validateST($target, $ST);    #### This is what fails
    printf "Error: %s\n", &AuthCAS::get_errors() unless (defined $user);
}
Any ideas on where the conflict might be?
The error is coming from the line directly above the snippet Cebjyre quoted, namely
$ssl_socket = new IO::Socket::SSL(%ssl_options);
that is, the socket creation. All of the input parameters are correct. I had edited the module to put in debug statements and print out all the parameters just before that call, and they are all fine. Looks like I'm going to have to dive deeper into the IO::Socket::SSL module.
As usually happens when I post questions like this, I found the problem. It turns out the Crypt::SSLeay module was not installed, or at least not up to date. Of course the error messages didn't give me any clues. Updating it made all the problems go away, and things are working fine now.
Well, from the module source it looks like that IO::Socket error is coming from get_https2
[...]
unless ($ssl_socket) {
    $errors = sprintf "error %s unable to connect https://%s:%s/\n",&IO::Socket::SSL::errstr,$host,$port;
    return undef;
}
[...]
which is called by callCAS, which is called by validateST.
One option is to temporarily edit the module file to put some debug statements in if you can, but if I had to guess, I'd say the casUrl you are supplying isn't matching up to the _parse_url regex properly - maybe you have three slashes after the https?
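If you want to take AuthCAS out of the picture while debugging, a small sketch along the lines of the poster's second test (the host name is a placeholder) is:

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::SSL;

# Try the raw SSL connection that AuthCAS's get_https2 makes internally.
# If this also fails under the web server, the problem is below AuthCAS.
my $sock = IO::Socket::SSL->new(
    PeerHost => 'cas_server',   # placeholder for the real CAS host
    PeerPort => 443,
) or die "SSL connect failed: ", IO::Socket::SSL::errstr(), "\n";

print "Connected OK\n";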