Error while using WWW::Mechanize in Perl to open a Webpage

I am trying to open a web page in Perl using the WWW::Mechanize module. My code is as follows:
use WWW::Mechanize;
my $m = WWW::Mechanize->new();
$url = 'http://www.google.com';
$m->get($url);
print "$m->content()";
When I run this code, I get the following error:
Error GETing http://www.google.com :Can't connect to www.google.com:80.
What could be causing this error, and how can I change my code so that it opens the web page specified in the URL?

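There are two problems:
The line print "$m->content()"; should be written print $m->content();. Inside a double-quoted string only $m is interpolated, not the method call, so otherwise you will get output like WWW::Mechanize=HASH(0xeca870)->content().
The "Can't connect" error points to a network or software problem on your side (for example a firewall, a required proxy, or no connectivity at all): the rest of your code works.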
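To see the interpolation problem in isolation, here is a minimal sketch that needs neither a network connection nor WWW::Mechanize. The Page class is a hypothetical stand-in for the Mechanize object and is not part of any library; it only exists so that the example runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a WWW::Mechanize object, so that the
# example runs without a network connection or CPAN modules.
package Page;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub content { my ($self) = @_; return $self->{content} }

package main;

my $m = Page->new(content => '<html>hello</html>');

# Only the scalar $m is interpolated; "->content()" stays literal text.
my $wrong = "$m->content()";

# Calling the method outside the string returns the page body.
my $right = $m->content();

# If the call must sit inside a string, @{[ ... ]} interpolates its result.
my $inline = "Body: @{[ $m->content() ]}";

print "$wrong\n";    # something like Page=HASH(0x...)->content()
print "$right\n";
print "$inline\n";
```

The same applies to the real $m from WWW::Mechanize: call the method outside the quotes, or wrap it in @{[ ... ]} if it has to appear inside a string.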

Related

Unable to get page via HTTPS with LWP::Simple in Perl

I am trying to download a page from an HTTPS URL with Perl:
use LWP::Simple;
my $url = 'https://www.ferc.gov/xml/whats-new.xml';
my $content = get $url or die "Unable to get $url\n";
print $content;
This does not work: I can't get the page, and I can't figure out what the error is. Is the get request coded incorrectly? Do I need to use a user agent?

LWP::Protocol::https is needed to make HTTPS requests with LWP. It needs to be installed separately from the rest of LWP. It looks like you installed LWP, but not LWP::Protocol::https, so simply install it now.
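A quick way to confirm this diagnosis is to check whether the module can be loaded at all. The following is a rough sketch; module_available is a helper name made up for this example, not something LWP provides.

```perl
use strict;
use warnings;

# Returns true if the named module can be loaded, false otherwise.
# (module_available is a hypothetical helper written for this example.)
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('LWP::UserAgent', 'LWP::Protocol::https') {
    printf "%-22s %s\n", $module,
        module_available($module) ? 'installed' : 'MISSING';
}
```

If LWP::UserAgent is reported as installed but LWP::Protocol::https is reported as MISSING, that matches the situation described above.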

Perl url encoding using Curl

I am having an issue where I am using cURL inside a Perl script to execute an HTTP request. I believe my issue is related to special characters in the URL string, but I cannot figure out how to make it work.
I can confirm that the URL is correct, as I can run it from my browser.
My Perl script is
#!/usr/bin/perl
use strict;
use warnings;
my $url = "http://machine/callResync?start=2017-02-01 00.00.00.000&end=2017-02-01 23.23.999";
system "curl $url";
It fails when it reaches the first whitespace. I tried to escape that using %20.
After that I put in %26 to escape the &, but then I got another issue. I have tried a number of different combinations, but it keeps failing.
Any ideas?

Use the URI module to build the URL correctly, and rather than shelling out to cURL, use a Perl library such as LWP::Simple to access the page.
The disadvantage of LWP::Simple is that it may be too simple: it provides no diagnostics if the transaction fails. If you find you need something more elaborate, look at
HTTP::Tiny,
LWP::UserAgent, or
Mojo::UserAgent.
If you need help with these, please ask.
use strict;
use warnings 'all';
use URI;
use LWP::Simple 'get';
my $url = URI->new('http://machine/callResync');
$url->query_form(
start => '2017-02-01 00.00.00.000',
end => '2017-02-01 23.23.999',
);
my $content = get($url) or die "Failed to access URL";
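If the missing diagnostics of LWP::Simple become a problem, a rough equivalent with HTTP::Tiny (which ships with Perl since 5.14) might look like the sketch below. The helper names build_url and fetch and the --fetch switch are assumptions made for this example; the host name machine is taken from the question.

```perl
use strict;
use warnings;
use HTTP::Tiny;

my $http = HTTP::Tiny->new(timeout => 10);

# Build the full URL; www_form_urlencode escapes spaces and other
# unsafe characters in the parameter names and values.
sub build_url {
    my ($base, %params) = @_;
    return $base . '?' . $http->www_form_urlencode(\%params);
}

# Fetch a URL and return (content, undef) on success or
# (undef, error message) on failure.
sub fetch {
    my ($url) = @_;
    my $res = $http->get($url);
    return ($res->{content}, undef) if $res->{success};

    # Status 599 marks an internal error (e.g. connection failure);
    # in that case the details are in the content field.
    my $detail = $res->{status} == 599 ? $res->{content} : $res->{reason};
    return (undef, "$res->{status} $detail");
}

my $url = build_url(
    'http://machine/callResync',
    start => '2017-02-01 00.00.00.000',
    end   => '2017-02-01 23.23.999',
);
print "Requesting $url\n";

# Only touch the network when asked to, e.g. "perl resync.pl --fetch".
if (@ARGV && $ARGV[0] eq '--fetch') {
    my ($content, $error) = fetch($url);
    die "Failed to access $url: $error\n" if defined $error;
    print $content;
}
```

Unlike LWP::Simple's get, the failure branch here reports the HTTP status or the connection error, which is usually enough to tell a bad URL from a network problem.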

Problem number 1: You used an invalid URL. Spaces can't appear in URLs.
my $url = "http://machine/callResync?start=2017-02-01%2000.00.00.000&end=2017-02-01%2023.23.999";
Problem number 2: Shell injection error. You didn't correctly form your shell command, so the shell treats the & in the URL as a command separator. Either bypass the shell entirely:
system('curl', $url);
or
use String::ShellQuote qw( shell_quote );
my $cmd = shell_quote('curl', $url);
system($cmd);

Perl LWP::Simple won't "get" a webpage when running from remote server

I'm trying to use Perl to scrape a publications list as follows:
use XML::XPath;
use XML::XPath::XMLParser;
use LWP::Simple;
my $url = "https://connects.catalyst.harvard.edu/Profiles/profile/xxxxxxx/xxxxxx.rdf";
my $content = get($url);
die "Couldn't get publications!" unless defined $content;
When I run it on my local (Windows 7) machine, it works fine. When I try to run it on the Linux server where we host some websites, it dies. I installed XML and LWP using cpan, so those should be there. I'm wondering if the problem could be some sort of security or permissions setting on the server (keeping it from accessing an external website), but I don't even know where to start with that. Any ideas?

It turns out I didn't have LWP::Protocol::https installed. I found this out by switching from LWP::Simple to LWP::UserAgent and adding the following:
my $ua = LWP::UserAgent->new;
my $resp = $ua->get('https://connects.catalyst.harvard.edu/Profiles/profile/xxxxxx/xxxxxxx.rdf' );
print $resp->status_line, "\n";
It then returned an error telling me that the https protocol is not supported without LWP::Protocol::https, so I installed it with
cpan LWP::Protocol::https
and all was good.

HTTP Error 502.2 - Bad Gateway

I am trying to learn Perl, so I need to be able to test it locally. I have IIS 7.5 with the .pl and .cgi extensions mapped, but I get this error when I test the script in Chrome.
It works fine from the command prompt, so I don't understand what I am missing or how to troubleshoot this. I've searched all over Google. Can someone help me?
My test code:
#!/usr/bin/perl
use warnings;
use strict;
use CGI;
my $test = new CGI();
print $test->header("text/html"),$test->start_html("PERL Test");
print $test->h1("PERL IS WORKING!");
print $test->end_html;
When run from the command console, the script prints a well-formed HTML page.

I could not download specific page via perl get, bash command GET and wget

I have a problem downloading a page:
my $url='http://www.ncbi.nlm.nih.gov/nuccore?linkname=pcassay_nucleotide&from_aid=504934,1806,1805,1674';
I can open this URL in a browser, but when I run the following command from Perl or from the Linux shell, nothing is written to the output file:
GET $url >OUTPUT1; # does not write anything to the file "OUTPUT1"
wget does download something, but not the right page: I get a page titled <title>Error - Nucleotide - NCBI</title>. I want the page with the items, but it returns a page without them.
my $html = qx{wget --quiet --output-document=OUTPUT1 $url};
Note: I noticed a few minutes ago that the URL works in Mozilla Firefox but cannot be opened in Google Chrome. That is odd, and my problem is probably related to it. Any ideas?
Code from link:
my $url='http://www.ncbi.nlm.nih.gov/nuccore?linkname=pcassay_nucleotide&from_aid=504934,1806,1805,1674';
my $html = qx{wget --quiet --output-document=OUTPUT11 $url};
# wget gets something, but not the items; it gets the same page I see in Google Chrome
`GET $url2 >OUTPUT11`; # does not write anything to the file

OK, given your code, the problem is almost certainly one of interpolation: the & in your URL is interpreted by the shell you're spawning as 'run this command in the background', so everything after it never reaches wget or GET as part of the URL.
That's almost certainly not what you want. Why not just use LWP natively?
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
my $url='http://www.ncbi.nlm.nih.gov/nuccore?linkname=pcassay_nucleotide&from_aid=504934,1806,1805,1674';
my $content = get $url;
die "Couldn't fetch $url\n" unless defined $content;
print $content;
open ( my $output_fh, '>', 'output.html' ) or die $!;
print {$output_fh} $content;
close ( $output_fh );
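If you do need an external program, passing the command as a list avoids the shell altogether, so the & loses its special meaning. The sketch below shows this with a child perl process standing in for wget or GET (an assumption made so that the example runs without either tool installed); the child simply prints back the argument it received.

```perl
use strict;
use warnings;

my $url = 'http://www.ncbi.nlm.nih.gov/nuccore?linkname=pcassay_nucleotide&from_aid=504934,1806,1805,1674';

# Stand-in for wget/GET: a child perl that prints its first argument.
# $^X is the path of the running perl interpreter.
my @cmd = ($^X, '-e', 'print $ARGV[0]', $url);

# List-form pipe open: no shell is involved, so "&" is not special.
open(my $pipe, '-|', @cmd) or die "Cannot run child process: $!";
my $received = do { local $/; <$pipe> };
close($pipe) or die "Child process failed: $?";

print $received eq $url
    ? "Child received the complete URL\n"
    : "Child received a truncated URL: $received\n";
```

With the real tool, the same idea would be system('wget', '--quiet', '--output-document=OUTPUT1', $url); instead of the qx{} string.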
