I am trying to open a web page in Perl using the WWW::Mechanize module. My code is as follows:
use WWW::Mechanize;
my $m = WWW::Mechanize->new();
$url = 'http://www.google.com';
$m->get($url);
print "$m->content()";
When I run this code I get an error like this:
Error GETing http://www.google.com :Can't connect to www.google.com:80.
What could be the reason for this error, and how can I change my code so that it opens the web page specified in the URL?
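There are two problems:
The line print "$m->content()"; should be written print $m->content();. Inside double quotes only $m is interpolated, not the method call, so the original prints something like WWW::Mechanize=HASH(0xeca870)->content().
The "Can't connect" error itself points to a network or software problem on your machine (for example a firewall or a required proxy), not to your code: the rest of your code works.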
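To illustrate the first point without needing a network connection, here is a minimal sketch. The Page class is a hypothetical stand-in for a WWW::Mechanize object (it is not part of any library); only the interpolation behaviour is what matters.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a WWW::Mechanize object, so the example runs offline.
package Page {
    sub new     { my ($class, $text) = @_; return bless { text => $text }, $class }
    sub content { my ($self) = @_; return $self->{text} }
}

my $m = Page->new('<html>hello</html>');

# Only $m is interpolated; "->content()" stays as literal text.
my $wrong = "$m->content()";

# A real method call, outside the quotes.
my $right = $m->content();

# If the call has to live inside a string, wrap it in @{[ ... ]}.
my $inline = "Body: @{[ $m->content() ]}";

print "$wrong\n";
print "$right\n";
print "$inline\n";
```

The first line printed is the stringified object reference followed by ->content(), which is the same symptom as in the question.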
Related
I try to download a page from an HTTPS URL with Perl:
use LWP::Simple;
my $url = 'https://www.ferc.gov/xml/whats-new.xml';
my $content = get $url or die "Unable to get $url\n";
print $content;
There seems to be a problem. Just can't figure out the error. I can't get the page. Is the get request improperly coded? Do I need to use a user agent?
LWP::Protocol::https is needed to make HTTPS requests with LWP. It needs to be installed separately from the rest of LWP. It looks like you installed LWP, but not LWP::Protocol::https, so simply install it now.
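One way to check this before making any request is to ask LWP::UserAgent which schemes it can handle. This is only a sketch and assumes LWP itself is installed; is_protocol_supported makes no network request, it only reports whether a handler for the scheme can be loaded.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Report, per scheme, whether LWP can load a protocol handler for it.
# 'https' is only supported once LWP::Protocol::https is installed.
for my $scheme (qw(http https)) {
    my $ok = $ua->is_protocol_supported($scheme);
    printf "%-5s %s\n", $scheme, $ok ? 'supported' : 'NOT supported';
}
```

If https is reported as not supported, installing LWP::Protocol::https from CPAN should fix it.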
I am having an issue where I am using cURL inside a Perl script to execute an HTTP request. I believe my issue is related to special characters in the URL string, but I cannot figure out how to make it work.
I can confirm that the URL is correct as I can run it from my browser.
My Perl script is:
#!/usr/bin/perl
use strict;
use warnings;
my $url = "http://machine/callResync?start=2017-02-01 00.00.00.000&end=2017-02-01 23.23.999";
system "curl $url";
It fails when it reaches the first whitespace. I tried to escape that using %20.
After that I put in %26 to escape the &, but then I get another issue. I have tried a number of different combinations but it keeps failing.
Any ideas?
Use the URI module to build the URL correctly, and rather than shelling out to cURL, use a Perl library like LWP::Simple to access the page.
The disadvantage of LWP::Simple is that it may be too simple in that it provides no diagnostics if the transaction fails. If you find you need something more elaborate then you should look at
HTTP::Tiny,
LWP::UserAgent, or
Mojo::UserAgent.
If you need help with these then please ask.
use strict;
use warnings 'all';
use URI;
use LWP::Simple 'get';
my $url = URI->new('http://machine/callResync');
$url->query_form(
start => '2017-02-01 00.00.00.000',
end => '2017-02-01 23.23.999',
);
my $content = get($url) or die "Failed to access URL";
Problem number 1: You used an invalid URL. Spaces can't appear in URLs.
my $url = "http://machine/callResync?start=2017-02-01%2000.00.00.000&end=2017-02-01%2023.23.999";
Problem number 2: Shell injection error. You didn't correctly form your shell command.
system('curl', $url);
or
use String::ShellQuote qw( shell_quote );
my $cmd = shell_quote('curl', $url);
system($cmd);
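Both fixes can be combined in one short sketch. It assumes the URI::Escape module (shipped with the URI distribution) for the percent-encoding, and it runs perl itself as a stand-in child process instead of curl so that it works on a machine without curl; with curl the list would simply be ('curl', $url).

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

my %param = (
    start => '2017-02-01 00.00.00.000',
    end   => '2017-02-01 23.23.999',
);

# Percent-encode each value, so a space becomes %20.
# The "&" between parameters is a separator and must stay unescaped.
my $url = 'http://machine/callResync?'
        . join '&', map { "$_=" . uri_escape($param{$_}) } qw(start end);

print "$url\n";

# List-form system: no shell is started, so "&" is passed through literally.
# Stand-in command: perl echoing its first argument.
my @cmd = ($^X, '-e', 'print "child got: $ARGV[0]\n"', $url);
system(@cmd) == 0 or die "Command failed with status $?";
```

Note that escaping the & as %26 would make it part of the start value rather than a separator between the two parameters.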
I'm trying to use Perl to scrape a publications list as follows:
use XML::XPath;
use XML::XPath::XMLParser;
use LWP::Simple;
my $url = "https://connects.catalyst.harvard.edu/Profiles/profile/xxxxxxx/xxxxxx.rdf";
my $content = get($url);
die "Couldn't get publications!" unless defined $content;
When I run it on my local (Windows 7) machine it works fine. When I try to run it on the Linux server where we host some websites, it dies. I installed XML and LWP using cpan, so those should be there. I'm wondering if the problem could be some sort of security or permissions setting on the server (keeping it from accessing an external website), but I don't even know where to start with that. Any ideas?
It turns out I didn't have LWP::Protocol::https installed. I found this out by switching
LWP::Simple
to
LWP::UserAgent
and adding the following:
my $ua = LWP::UserAgent->new;
my $resp = $ua->get('https://connects.catalyst.harvard.edu/Profiles/profile/xxxxxx/xxxxxxx.rdf' );
print $resp->status_line, "\n";   # printing $resp itself would only show the object reference
It then returned an error telling me that it could not handle the https protocol without LWP::Protocol::https, so I installed it with
cpan LWP::Protocol::https
and all was good.
I am trying to learn Perl, so I need to be able to test it locally. I have IIS 7.5 and I have the .pl and .cgi extensions mapped, but I get this error when I test it in Chrome.
It works fine when I use the command prompt, so I don't understand what I am missing or how to troubleshoot this. I've searched all over Google. Can someone help me?
My test code:
#!/usr/bin/perl
use warnings;
use strict;
use CGI;
my $test = CGI->new;
print $test->header("text/html"),$test->start_html("PERL Test");
print $test->h1("PERL IS WORKING!");
print $test->end_html;
And the command console spits out a perfectly programmed HTML page:
I have an issue with downloading a page,
my $url='http://www.ncbi.nlm.nih.gov/nuccore?linkname=pcassay_nucleotide&from_aid=504934,1806,1805,1674';
I can open this URL in a browser, but not when I run the following shell command from Perl or from a Linux shell:
GET $url >OUTPUT1; # it does not even write anything to the file "OUTPUT1"
When I try wget, it downloads a page, but not the correct one: the result contains <title>Error - Nucleotide - NCBI</title>. I want the page with items, but it returns a page without them.
my $html = qx{wget --quiet --output-document=OUTPUT1 $url};
Note: I noticed a few minutes ago that the URL works in Mozilla Firefox but cannot be opened in Google Chrome. That is odd, and my issue is probably related to it. Any ideas?
Code from link:
my $url='http://www.ncbi.nlm.nih.gov/nuccore?linkname=pcassay_nucleotide&from_aid=504934,1806,1805,1674';
my $html = qx{wget --quiet --output-document=OUTPUT11 $url};
# wget gets something, but not the items; it gets the same page I see in Google Chrome
`GET $url2 >OUTPUT11`; # it does not write anything to the file
Given your code, the problem is almost certainly one of shell interpolation: the & in your URL is interpreted by the shell you're spawning as "run this command in the background", so the URL is cut off at that point and the rest of the line is treated as a separate command.
That's almost certainly not what you want. Why not just use LWP natively?
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
my $url='http://www.ncbi.nlm.nih.gov/nuccore?linkname=pcassay_nucleotide&from_aid=504934,1806,1805,1674';
my $content = get $url;
die "Could not fetch $url\n" unless defined $content;
print $content;
open ( my $output_fh, '>', 'output.html' ) or die $!;
print {$output_fh} $content;
close ( $output_fh );
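If you do need to keep an external command such as wget, a list-form pipe open avoids the shell entirely, so the & is no longer special. The sketch below uses perl itself as a stand-in child process so that it runs without network access or wget installed; that substitution is an assumption made purely for illustration.

```perl
use strict;
use warnings;

my $url = 'http://www.ncbi.nlm.nih.gov/nuccore?linkname=pcassay_nucleotide&from_aid=504934,1806,1805,1674';

# Stand-in child command: perl printing its first argument back.
# With wget the list would be ('wget', '--quiet', '--output-document=-', $url).
my @cmd = ($^X, '-e', 'print $ARGV[0]', $url);

# List-form pipe open: the arguments go straight to the child process,
# no shell parses them, so "&" and "?" arrive unchanged.
open(my $pipe, '-|', @cmd) or die "Cannot run $cmd[0]: $!";
my $seen = do { local $/; <$pipe> };   # slurp everything the child printed
close($pipe) or die "Child exited with status $?";

print "child received: $seen\n";
```

The child receives the complete URL, including everything after the &, which is what the backtick and qx versions lose.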