Perl Mechanize with NTLM authentication - perl

I haven't seen examples of NTLM support with Mechanize. Does anyone know how we can make use of NTLM libraries with Mechanize? If not, does LWP do the job?
my $mech = WWW::Mechanize->new( 'keepalive' => 1 );
my $url = "http://URL/I/NEED/TO/ACCESS"; # requires NTLM authorization
my $username = "username";
my $password = "password";
$mech->credentials($username, $password);
$mech->get($url);

According to this
Using WWW::Mechanize Credentials
you need to make sure you're using v1.5.2 or later of WWW::Mechanize; ultimately it will use Authen::NTLM.
If your real username contains a backslash, you will have to use single quotes or escape it:
my $username = "domain\\user";
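Putting that together, a minimal sketch (the host, path, and credentials are placeholders; this assumes Authen::NTLM is installed, which LWP picks up automatically when a server answers 401 with an NTLM challenge):

```perl
use strict;
use warnings;
use WWW::Mechanize;
# Authen::NTLM must also be installed for LWP to negotiate NTLM.

# NTLM authenticates the connection rather than each request,
# so a persistent (keep-alive) connection is required.
my $mech = WWW::Mechanize->new( keep_alive => 1 );

# In single quotes the backslash is literal; in double quotes it
# must be doubled, as in "domain\\user".
$mech->credentials( 'domain\user', 'password' );  # placeholder credentials

$mech->get('http://hostname/path/needing/ntlm');  # placeholder URL
print $mech->status, "\n";
```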

Related

Unable to get page via HTTPS with LWP::Simple in Perl

I try to download a page from an HTTPS URL with Perl:
use LWP::Simple;
my $url = 'https://www.ferc.gov/xml/whats-new.xml';
my $content = get $url or die "Unable to get $url\n";
print $content;
There seems to be a problem, but I just can't figure out the error; I can't get the page. Is the get request improperly coded? Do I need to use a user agent?
LWP::Protocol::https is needed to make HTTPS requests with LWP. It needs to be installed separately from the rest of LWP. It looks like you installed LWP, but not LWP::Protocol::https, so simply install it now.
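For example, from the command line (assuming the standard cpan client; cpanm works the same way):

```shell
cpan LWP::Protocol::https
# or, with cpanminus:
# cpanm LWP::Protocol::https
```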

Perl url encoding using Curl

I am having an issue where I am using cURL inside a Perl script to execute an HTTP request. I believe my issue is related to special characters in the URL string, but I cannot figure out how to make it work.
I can confirm that the URL is correct as I can run it from my browser.
My perl script is
#!/usr/bin/perl
use strict;
use warnings;
my $url = "http://machine/callResync?start=2017-02-01 00.00.00.000&end=2017-02-01 23.23.999";
system "curl $url";
It fails when it reaches the first whitespace. I tried to escape that using %20.
After that I put in %26 to escape the &, but then I got another issue. I have tried a number of different combinations but it keeps failing.
Any ideas?
Use the URI module to correctly build the URL, and rather than shelling out to cURL you should use a Perl library such as LWP::Simple to fetch the page.
The disadvantage of LWP::Simple is that it may be too simple, in that it provides no diagnostics if the transaction fails. If you find you need something more elaborate then you should look at
HTTP::Tiny,
LWP::UserAgent, or
Mojo::UserAgent.
If you need help with these then please ask.
use strict;
use warnings 'all';

use URI;
use LWP::Simple 'get';

my $url = URI->new('http://machine/callResync');
$url->query_form(
    start => '2017-02-01 00.00.00.000',
    end   => '2017-02-01 23.23.999',
);

my $content = get($url) or die "Failed to access URL";
Problem number 1: You used an invalid URL. Spaces can't appear in URLs.
my $url = "http://machine/callResync?start=2017-02-01%2000.00.00.000&end=2017-02-01%2023.23.999";
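The %20 encoding can be produced programmatically instead of by hand; a small sketch using URI::Escape (part of the URI distribution):

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Encode each value, then assemble the query string. The '&'
# between parameters stays literal; only the values are escaped.
my $start = uri_escape('2017-02-01 00.00.00.000');
my $end   = uri_escape('2017-02-01 23.23.999');
my $url   = "http://machine/callResync?start=$start&end=$end";

print "$url\n";
# http://machine/callResync?start=2017-02-01%2000.00.00.000&end=2017-02-01%2023.23.999
```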
Problem number 2: Shell injection error. You didn't correctly form your shell command.
system('curl', $url);
or
use String::ShellQuote qw( shell_quote );
my $cmd = shell_quote('curl', $url);
system($cmd);

Perl LWP::Simple won't "get" a webpage when running from remote server

I'm trying to use Perl to scrape a publications list as follows:
use XML::XPath;
use XML::XPath::XMLParser;
use LWP::Simple;
my $url = "https://connects.catalyst.harvard.edu/Profiles/profile/xxxxxxx/xxxxxx.rdf";
my $content = get($url);
die "Couldn't get publications!" unless defined $content;
When I run it on my local (Windows 7) machine it works fine. When I try to run it on the linux server where we are hosting some websites, it dies. I installed XML and LWP using cpan so those should be there. I'm wondering if the problem could be some sort of security or permissions on the server (keeping it from accessing an external website), but I don't even know where to start with that. Any ideas?
Turns out I didn't have LWP::Protocol::https installed. I found this out by switching
LWP::Simple
to
LWP::UserAgent
and adding the following:
my $ua = LWP::UserAgent->new;
my $resp = $ua->get('https://connects.catalyst.harvard.edu/Profiles/profile/xxxxxx/xxxxxxx.rdf' );
print $resp->status_line;
It then returned an error telling me it couldn't access HTTPS without LWP::Protocol::https, so I installed it with
cpan LWP::Protocol::https
and all was good.

Perl mechanize response is only "<HTML></HTML>" with https

I'm kind of new with perl, even newer to Mechanize. So far, when I tried to fetch a site via http, it's no problem.
Now I need to fetch a site with https. I've installed Crypt::SSLeay via PPM.
When I use $mech->get($url), though, this is the only response I get:
"<HTML></HTML>"
I checked the status and success, both were OK (200 and 1).
Here's my code:
use strict;
use warnings;
use WWW::Mechanize;
use Crypt::SSLeay;
$ENV{HTTPS_PROXY} = 'http://username:pw@host:port';
# I have the https_proxy env variable set globally too.
my $url = 'https://google.com';
# Every https site has the same response,
# so I don't think google would cause problems.
my $mech = WWW::Mechanize->new(noproxy => 0);
$mech->get($url) or die "Couldn't load page";
print "Content:\n".$mech->response()->content()."\n\n";
As you can see I'm behind a proxy. I tried setting
$mech->proxy($myproxy);
but to no avail. I even tried to fetch it into a file, but when I checked it, I got the same response content.
Any kind of advice would be appreciated, since I'm just a beginner and there is still a lot to learn of everything. Thanks!
I think the answer lies here: How do I force LWP to use Crypt::SSLeay for HTTPS requests?
use Net::SSL (); # from Crypt-SSLeay

BEGIN {
    $Net::HTTPS::SSL_SOCKET_CLASS = "Net::SSL"; # force use of Net::SSL
    $ENV{HTTPS_PROXY} = 'http://10.0.3.1:3128'; # your proxy!
    $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
}

Perl WWW::Mechanize and authenticated proxy

I have a proxy with IP, Port, Login and Password.
Is it possible to use this proxy with Perl's WWW::Mechanize?
Thanks.
I'm not sure if there's a native way but here's a workaround which is claimed to work:
http://www.perlmonks.org/?node_id=269358
It's based on setting $ENV{HTTP_PROXY} to "http://login:password@proxyserver", where both login and password must be URI-encoded.
Also, WWW::Mechanize is a subclass of LWP::UserAgent, and as such, what works in LWP should work on Mechanize (example from LWP Cookbook, PROXIES section)
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->proxy(['http', 'ftp'] => 'http://username:password@proxy.myorg.com');

my $req = HTTP::Request->new('GET', 'http://www.perl.com');
my $res = $ua->request($req);
print $res->decoded_content if $res->is_success;
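Since WWW::Mechanize inherits proxy() from LWP::UserAgent, the same approach can be written directly against Mechanize. A sketch (proxy address and credentials are placeholders, and the login and password must be URI-encoded if they contain reserved characters):

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new;

# Route HTTP and FTP traffic through the authenticated proxy
# (placeholder host, port, and credentials).
$mech->proxy(['http', 'ftp'] => 'http://username:password@proxy.example.com:8080');

$mech->get('http://www.perl.com');
print $mech->content if $mech->success;
```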