I am trying to extract URLs from a webpage and follow them, and I am skipping the 4XX and 5XX responses. My question is: is it possible to print the URL of a request that returns 200, using xidel?
xidel -s --error-handling=4xx=skip,5xx=skip "URL" -e "PRINT URL"
From 'https://videlibri.sourceforge.net/xidel_readme.txt':
Xidel also defines the following global default variables:
$raw Unparsed input text
$url Url the input was retrieved from (past redirect)
$host, $path Respective part of the url
$json Parsed JSON input, if it was JSON
$headers All HTTP headers, including status code
So, you're looking for -e "$url".
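Applied to the command from the question (single quotes keep the shell from expanding $url before xidel sees it):
xidel -s --error-handling=4xx=skip,5xx=skip "URL" -e '$url'
This prints, for each page actually retrieved, the URL it was fetched from (after redirects), while the 4xx/5xx responses are skipped.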
I am having an issue where I am using cURL inside a Perl script to execute an HTTP request. I believe my issue is related to special characters in the URL string, but I cannot figure out how to make it work.
I can confirm that the URL is correct as I can run it from my browser.
My Perl script is:
#!/usr/bin/perl
use strict;
use warnings;
my $url = "http://machine/callResync?start=2017-02-01 00.00.00.000&end=2017-02-01 23.23.999";
system "curl $url";
It fails when it reaches the first whitespace. I tried to escape that using %20.
After that I put in %26 to escape the &, but then I got another issue. I have tried a number of different combinations but it keeps failing.
Any ideas?
Use the URI module to correctly build the URL, and rather than shelling out to cURL you should use a Perl library like LWP::Simple to access the page.
The disadvantage of LWP::Simple is that it may be too simple in that it provides no diagnostics if the transaction fails. If you find you need something more elaborate then you should look at
HTTP::Tiny,
LWP::UserAgent, or
Mojo::UserAgent.
If you need help with these then please ask.
use strict;
use warnings 'all';

use URI;
use LWP::Simple 'get';

my $url = URI->new('http://machine/callResync');

$url->query_form(
    start => '2017-02-01 00.00.00.000',
    end   => '2017-02-01 23.23.999',
);

my $content = get($url) or die "Failed to access URL";
Problem number 1: You used an invalid URL. Spaces can't appear in URLs.
my $url = "http://machine/callResync?start=2017-02-01%2000.00.00.000&end=2017-02-01%2023.23.999";
Problem number 2: Shell injection error. You didn't correctly form your shell command.
system('curl', $url);
or
use String::ShellQuote qw( shell_quote );
my $cmd = shell_quote('curl', $url);
system($cmd);
I am trying to convert this shell command into Perl. I am using WWW::Curl::Easy.
I want to write the verbose output to one file and the URL content to a different file. So far I have been able to curl the URL and add a header to it as well.
Now I want to write the output from 1> into one file and the output from 2> into another file, just as the shell command does.
use WWW::Curl::Easy;

my $curl = WWW::Curl::Easy->new();
$curl->setopt(CURLOPT_HEADER, 1);
$curl->pushopt(CURLOPT_HTTPHEADER, ['AUTH_USER:John']);
$curl->setopt(CURLOPT_URL, "www.abc.com");
$curl->setopt(CURLOPT_VERBOSE, 1);
How do I put the content of www.abc.com, that is:
<html>
<body>
<marquee>abc is up </marquee>
</body>
</html>
into one file,
and this into another file:
> GET / HTTP/1.1
Host: abc-01
Accept: */*
AUTH_USER:hojn
< HTTP/1.1 200 OK
......
A cursory reading of the documentation and libcurl docs reveals that you probably need:
# A filehandle, reference to a scalar or reference to a typeglob can be used here.
my $response_body;
$curl->setopt(CURLOPT_WRITEDATA,\$response_body);
my $response_header;
$curl->setopt(CURLOPT_WRITEHEADER,\$response_header);
When using libcurl, it's useful to have its documentation handy.
Browsing through the options, we find CURLOPT_HEADERFUNCTION & WRITEHEADER and CURLOPT_WRITEFUNCTION & WRITEDATA.
use strict;
use warnings;
use Net::Curl::Easy qw( /^CURLOPT_/ );
my $url = 'http://stackoverflow.com/';
my $easy = Net::Curl::Easy->new();
open(my $fh_header, '>:raw', 'header.out') or die $!;
open(my $fh_data, '>:raw', 'data.out' ) or die $!;
$easy->setopt(CURLOPT_URL, $url);
$easy->setopt(CURLOPT_WRITEHEADER, $fh_header);
$easy->setopt(CURLOPT_WRITEDATA, $fh_data);
$easy->perform();
Note: I used Net::Curl over WWW::Curl because I know and trust it. The above should also work with WWW::Curl as long as it provides a Perl-aware default for CURLOPT_WRITEFUNCTION and CURLOPT_HEADERFUNCTION. If it doesn't, you'll need to provide values for those options too.
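If you do need to supply them yourself, here is a sketch for WWW::Curl, assuming its documented callback signature (the data chunk first, then whatever was set as CURLOPT_WRITEDATA/CURLOPT_WRITEHEADER; the callback must return the number of bytes it handled):

my $write_to_fh = sub {
    my ($chunk, $fh) = @_;   # data chunk, then the WRITEDATA/WRITEHEADER value
    print {$fh} $chunk;
    return length $chunk;    # any other return value signals an error to libcurl
};
$curl->setopt(CURLOPT_WRITEFUNCTION,  $write_to_fh);
$curl->setopt(CURLOPT_HEADERFUNCTION, $write_to_fh);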
I am trying to check the status of some AWS servers using a line similar to the one below:
time curl -k https://blablabla.azazonaws.com/ratingsvc
If the connection fails, times out, etc., I would like to email the admin.
I am having trouble getting the error code into a variable.
To get further control of the HTTP GET (return codes, analyzing the message contents, etc.), I recommend using LWP::UserAgent:
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
my $response = $ua->get('https://blablabla.azazonaws.com/ratingsvc');
and $response is an instance of HTTP::Response, so you can get total control of the GET response.
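For example, a minimal sketch of the check-and-notify logic (notify_admin is a placeholder for whatever mail mechanism you use; it is not part of LWP):

use LWP::UserAgent;

my $ua = LWP::UserAgent->new(timeout => 30);
my $response = $ua->get('https://blablabla.azazonaws.com/ratingsvc');

unless ($response->is_success) {
    # status_line gives e.g. "500 Internal Server Error",
    # or "500 Can't connect to ..." for connection failures
    notify_admin("ratingsvc check failed: " . $response->status_line);
}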
You can do this to obtain the returned exit code:
use strict;
use warnings;

my $command = 'time curl -k https://blablabla.azazonaws.com/ratingsvc';
my $status = system($command);

if ($status == 0) {
    print "Everything went well...\n";
} else {
    # system() returns the wait status; curl's actual exit code is in the high byte
    my $exit_code = $status >> 8;
    print "Oops... curl exited with code $exit_code\n";
}
That will also print curl's STDOUT and STDERR to the screen, but it will not affect your program. If you don't want to see the output of the command, you can append something like >/dev/null, and you'll still have the exit code.
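For instance (a sketch; the 2>&1 also hides curl's progress meter, which goes to STDERR):

my $status = system("$command >/dev/null 2>&1");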
I'm trying to use the CouchDB::Client API for Perl. I want to access a database on some server that hosts CouchDB. To access local databases with no access controls, this is simple:
my $client = CouchDB::Client->new('localhost', '5984');
my $conn = $client->testConnection;
print "client: $client \n";
print "connx : $conn \n";
But how does one do this for a remote host with access controls? I don't see any option for a password in the CouchDB::Client file... I need to use Perl for this rather than any other language; maybe there's another API that is more suited which I haven't come across?
By default, you can include the username and password in the HTTP URL, like this:
CouchDB::Client->new(uri => 'http://YOUR_USERNAME:YOUR_PASSWORD@localhost:5984/');
You can do a simple test with curl:
curl -X HEAD http://USER:PASS@localhost/_active_tasks
# _active_tasks requires authentication
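Putting that together with the code from the question (a sketch; the host and credentials are placeholders):

use CouchDB::Client;

my $client = CouchDB::Client->new(
    uri => 'http://YOUR_USERNAME:YOUR_PASSWORD@remotehost:5984/',
);
my $conn = $client->testConnection;
print $conn ? "connected\n" : "connection failed\n";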
I'm trying to get the response of a curl call into a variable in Perl.
my $foo = `curl yadd yadda`;
print $foo;
does not work. When I run this at the command line the curl call prints all its output correctly in the terminal, but the variable is not filled with that data.
Is there a way to do this without installing and calling the Perl curl lib?
It probably sends its stuff to stderr. Try
my $foo = `curl yadd yadda 2>&1`;
You also might consider looking at LWP::UserAgent or even LWP::Simple.
What do you really want to do? Use curl at all costs, or grab the contents of a web page?
A more Perlish way of doing this (one which relies on no external programs that may or may not be installed on the next machine where you need to do it) would be:
use LWP::Simple;
my $content = get("http://stackoverflow.com/questions/1015438/")
or die "no such luck\n";
If you want to see why the GET failed, or grab multiple pages from the same site, you'll need to use a bit more machinery. perldoc lwpcook will get you started.
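A minimal sketch of that extra machinery, using LWP::UserAgent, which reports why a GET failed via the status line:

use LWP::UserAgent;

my $ua  = LWP::UserAgent->new;
my $res = $ua->get("http://stackoverflow.com/questions/1015438/");

die "GET failed: ", $res->status_line, "\n" unless $res->is_success;
my $content = $res->decoded_content;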
In the shell, 2> means redirect fileno 2. Fileno 2 is always what a program sees as stderr. Similarly, fileno 0 is stdin and fileno 1 is stdout. So when you say 2>&1 you are telling the shell to redirect stderr (fileno 2) into stdout (fileno 1). Since the backticks operator uses the shell to run the command you specify, you can use shell redirection, so
my $foo = `curl yadda yadda 2>&1`;
is telling curl to redirect its output into stdout, and since the backtick operator catches stdout, you get what you were looking for.
Very old post, but the proper way to use curl inside backticks is to use the appropriate curl switch.
That switch is -o, which defines where to send the output.
More from the curl man page:
Specifying the output as '-' (a single dash) will force the output to
be done to stdout.
This also prevents possible errors from ending up in $foo, which could happen if you redirected all of STDERR to STDOUT:
my $foo = `curl -o - yadd yadda`;
Try this:
$var = `curl "http://localhost" 2>/dev/null`;
print length($var)
curl displays progress information on stderr, redirecting that to /dev/null makes it easier to see what's going on.
This works on my system:
#!/usr/bin/perl
use strict;
use warnings;
my $output = `curl www.unur.com`;
print $output;
__END__
C:\> z1
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd"><html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
etc.
You can open a pipe as if it were a file.
my $url = qq{"http://download.finance.yahoo.com/d/quotes.csv?s=$symbol&f=sl1d1t1c1ohgvper&e=.csv"};

open my $curl, '-|', "curl -s $url"
    or die "single_stock_quote: Can't open curl: $!\n";
my $line = <$curl>;
close $curl;
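On systems that support it, the list form of a pipe open skips the shell entirely, so the embedded quoting becomes unnecessary; a sketch:

open my $curl, '-|', 'curl', '-s',
    "http://download.finance.yahoo.com/d/quotes.csv?s=$symbol&f=sl1d1t1c1ohgvper&e=.csv"
    or die "single_stock_quote: Can't open curl: $!\n";
my $line = <$curl>;
close $curl;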
It might be that some of the output you want to capture is on standard error, not standard out. Note, however, that system returns the command's exit status rather than its output, so the following prints curl's output straight to the terminal and stores only the status in $foo:
my $foo = system "curl http://www.stackoverflow.com";
print $foo;