I'm not too familiar with Perl, but I am using it for a simple script I am going to write. This script will interface with Qualys so while looking up information about the Qualys API I found this statement while looking through their sample code. I have put it on Pastebin.com (here) so you don't have to download it to view it. If for some reason you do want to download it yourself, here is a link to the page where I got it for those that want to be able to download the source (it's the "Get Map" one).
Anyways, here is the statement (line 261) that has me a little confused:
$request = new HTTP::Request GET => $url;
I'm confused about the new and GET => $url parts of the statement.
I think I mostly understand what's going on with the new part of the statement, but if someone could explain how the HTTP::Request works with creating a new LWP::UserAgent that would help clarify this line (I looked at LWP::UserAgent on CPAN, but the "KEY/DEFAULT" table they have under the new subroutine explanation made little sense to me).
I really have no idea what is happening in the GET => $url part of the statement. My guess is that it is assigning a value in either HTTP::Request or LWP::UserAgent but I can't find any information to back up that idea.
The given line is equivalent to
$request = HTTP::Request->new(GET => $url);
which could also be written as
$request = HTTP::Request->new('GET', $url);
The example used the indirect method syntax.
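To see that all three spellings are the same method call, here is a minimal sketch. My::Request is a hypothetical stand-in class (not HTTP::Request itself), used only so the example runs without LWP installed; its new method simply records its arguments.

```perl
use strict;
use warnings;

# Hypothetical stand-in for HTTP::Request: stores the arguments passed to new().
package My::Request;

sub new {
    my ($class, $method, $uri) = @_;
    return bless { method => $method, uri => $uri }, $class;
}
sub method { return $_[0]{method} }
sub uri    { return $_[0]{uri} }

package main;

my $url = 'http://www.example.com/';

# Indirect object syntax: the method name comes first, then the class.
my $r1 = new My::Request GET => $url;

# Arrow syntax; "=>" is a comma that also quotes the bareword on its left.
my $r2 = My::Request->new(GET => $url);

# The same call with an ordinary comma and an explicit string.
my $r3 = My::Request->new('GET', $url);

for my $r ($r1, $r2, $r3) {
    print ref($r), ': ', $r->method, ' ', $r->uri, "\n";
}
```

The arrow form is generally the safer choice, since the indirect form can be parsed ambiguously and the perlobj documentation discourages it.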
The connection between HTTP::Request and LWP::UserAgent is sketched in the CPAN documentation as follows:
require HTTP::Request;
$request = HTTP::Request->new(GET => 'http://www.example.com/');
$ua = LWP::UserAgent->new;
$response = $ua->request($request);
So HTTP::Request->new(...) creates a new request object, which a user agent can then execute.
my $jira = JIRA::REST->new({
url => 'https://something.com:8443',
username => 'username',
password => 'password',
session => 1,
});
The above code doesn't work and fails with the error below, probably because of the port number at the end of the URL:
Can't connect to something.com:8443 (Bad file descriptor)
Is there a way (or a variable) to specify the port number?
You need to add the setting ssl_verify_none => 1 into the hash used in your constructor.
The underlying LWP code allows you to not verify the certs of systems you connect to (which is not recommended for production), or it also allows you to specify a Certificate Authority (CA) cert that can be used to verify certs of systems you connect to. It looks like JIRA::REST has only supported the first option.
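Applied to the constructor from the question, that would look roughly like the sketch below. The URL and credentials are the question's placeholders, and the snippet needs JIRA::REST installed and a reachable Jira instance to actually run.

```perl
use strict;
use warnings;
use JIRA::REST;

my $jira = JIRA::REST->new({
    url             => 'https://something.com:8443',   # the port stays in the URL
    username        => 'username',
    password        => 'password',
    session         => 1,
    ssl_verify_none => 1,   # skips certificate checks; not recommended for production
});
```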
You might be better off just using the underlying LWP code, like this:
use LWP::UserAgent;
my $ua = LWP::UserAgent->new(
ssl_opts => { verify_hostname => 1, SSL_ca_file => 'myCA.cer' },
protocols_allowed => ['https'],
);
my $req = HTTP::Request->new(
GET => 'https://something.com:8443/rest/api/latest/issue/ABC-123',
);
$req->authorization_basic('username','password');
my $res = $ua->request($req);
It looks like JIRA::REST is just providing the raw JSON response to you anyway, so it's not really saving you all that much processing.
The main advantage of REST::Client is that it saves some stuff for you to add to each request implicitly. There's nothing particularly REST-y or helpful beyond sending a request and giving its response back to you.
The JIRA::Client has a few advantages, though, since it knows how to get a session cookie and properly attach files, and, more importantly, deal with paginated results. But often, when I'm doing things with Jira, I want more power.
Back in LWP's heyday, it was very frustrating to track transactions: a request-response pair. You could, but you had to manage it yourself. And, there weren't hooks in the process, so you had to create everything every time.
Then, LWP tried to work around some SSL issues (verifying host names, etc) and also split out LWP::Protocol::https. That's not a big deal when you understand that, but even though I do, it's something I have to remember every time I want to use LWP for something. There are reasons for everything that happened, but that doesn't make it any less annoying. That leads to the sort of work jimtut showed in his answer. Every time. But, it's a small speed bump on the way to insecurity.
I like Mojolicious much more because it represents complete transactions but also has hooks (well, events) that allow you to fiddle with things automatically while the process is chugging along.
Here's an example from Mojo Web Clients that shows me creating a user-agent for each service and setting some stuff for each transaction. I can adjust the request any way that I please before it does its work (and this is mostly what that REST::Client and JIRA::Client are doing for you):
my $travis_ua = Mojo::UserAgent->new();
$travis_ua->on( start => sub {
my( $ua, $tx ) = @_;
$tx->req->headers->authorization(
"token $ENV{TRAVIS_API_KEY}" );
$tx->req->headers->accept(
'application/vnd.travis-ci.2.1+json' );
} );
my $appveyor_ua = Mojo::UserAgent->new();
$appveyor_ua->on( start => sub {
my( $ua, $tx ) = @_;
$tx->req->headers->authorization(
"Bearer $ENV{APPVEYOR_API_KEY}" );
} );
In the Basic auth case, it's just a different value in that header:
use Mojo::Util qw(b64_encode);
my $jira_ua = Mojo::UserAgent->new();
$jira_ua->on( start => sub {
my( $ua, $tx ) = @_;
$tx->req->headers->authorization(
# the empty second argument stops b64_encode from appending a newline
'Basic ' . b64_encode( join( ':', $username, $password ), '' ) );
} );
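The value of a Basic Authorization header is just the word Basic followed by the Base64 encoding of user:password. Here is a small sketch of that; it uses the core MIME::Base64 module instead of Mojo::Util (a substitution made here so it runs without Mojolicious), and the credentials are placeholders.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

my ($username, $password) = ('username', 'password');   # placeholder credentials

# The second argument is the line ending; '' avoids a trailing newline,
# which must not appear inside a header value.
my $header_value = 'Basic ' . encode_base64("$username:$password", '');

print "Authorization: $header_value\n";
```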
Now, when I use those user-agents, the auth stuff is automatically added:
my $tx = $travis_ua->get( $url );
And, that $tx gives me access to the request and the response, so I don't need REST::Client to handle that for me either.
Since Mojolicious is handling all of this in one convenient package, I don't have to wrangle different objects. As such, there's not much left over that REST::Client can do for me.
I'm pretty new to Perl. I just created a simple script to retrieve a file with
getstore($url, $file);
But how do I know whether the task completed correctly, or whether the connection was interrupted in the middle, or authentication failed, or something else went wrong? I searched the web and found some material, like a list of response codes and some posts about user agents, which I can't understand at all, especially the operator $ua->.
What I'd like is an explanation of that operator (I don't even know what -> is used for), the meaning of the RC codes, and finally how to use them.
It's a lot of stuff, so I appreciate any answer given, even a partial one. And thanks in advance to whoever helps. =)
The LWP::Simple module is just that: quite simplistic. The documentation states that the getstore function returns the HTTP status code which we can save into a variable. There are also the is_success and is_error functions that tell us whether a certain return value is ok or not.
use LWP::Simple;
my $url = "http://www.example.com/";
my $filename = "some-file.html";
my $rc = getstore($url, $filename);
if (is_error($rc)) {
die "getstore of <$url> failed with $rc";
}
Of course, this doesn't catch errors with the file system.
The die throws a fatal exception that terminates the execution of your script and displays itself on the terminal. If you don't want to abort execution use warn.
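The return code is an ordinary HTTP status number, so is_success and is_error can be tried on literal values without touching the network. This is a minimal sketch, assuming LWP (which brings in HTTP::Status) is installed; the list of codes is arbitrary.

```perl
use strict;
use warnings;
use HTTP::Status qw(is_success is_error status_message);

my %kind;
for my $rc (200, 301, 404, 500) {
    # 2xx counts as success, 4xx and 5xx as errors; anything else is neither.
    $kind{$rc} = is_success($rc) ? 'success'
               : is_error($rc)   ? 'error'
               :                   'other';
    printf "%d %s => %s\n", $rc, status_message($rc), $kind{$rc};
}
```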
The LWP::Simple functions provide high-level controls for common tasks. If you need more control over the requests, you have to manually create an LWP::UserAgent. A user agent (abbreviated ua) is a browser-like object that can make requests to servers. We have very detailed control over these requests, and can even modify the exact header fields.
The -> operator is a general dereference operator, which you'll use a lot when you need complex data structures. It is also used for method calls in object-oriented programming:
$object->method(@args);
would call the method on $object with the @args. We can also call methods on class names. To create a new object, usually the new method is used on the class name:
my $object = The::Class->new();
Methods are just like functions, except that you leave it to the class of the object to figure out which function exactly will be called.
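Both uses of -> can be seen side by side in a short sketch. Counter is a hypothetical class invented for this illustration, and the data structure is made up as well.

```perl
use strict;
use warnings;

# A tiny hypothetical class, used only for illustration.
package Counter;

sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} // 0 };
    return bless $self, $class;
}

sub increment {
    my ($self, $by) = @_;
    $self->{count} += $by // 1;
    return $self->{count};
}

package main;

# Dereferencing: reach into a hash reference and a nested array reference.
my $config = { name => 'example', ports => [80, 443] };
my $name   = $config->{name};
my $https  = $config->{ports}->[1];

# Method calls: on a class name to build an object, then on the object.
my $counter = Counter->new(start => 5);
my $value   = $counter->increment(2);

print "$name uses port $https; counter is now $value\n";
```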
The normal workflow with LWP::UserAgent looks like this:
use LWP::UserAgent; # load the class
my $ua = LWP::UserAgent->new();
We can also provide named arguments to the new method. Because these UA objects are robots, it is considered good manners to tell everybody who sent this Bot. We can do so with the from field:
my $ua = LWP::UserAgent->new(
from => 'ss-tangerine@example.com',
);
We could also change the timeout from the default three minutes. These options can also be set after we constructed a new $ua, so we can do
$ua->timeout(30); # half a minute
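The same accessors also read the current values back, which makes it easy to check how a user agent is configured before sending anything. The sketch below makes no network request and assumes LWP is installed; the agent string is a made-up name.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Options given to the constructor ...
my $ua = LWP::UserAgent->new(
    from    => 'ss-tangerine@example.com',
    timeout => 30,
);

# ... or set afterwards through the accessor of the same name.
$ua->agent('my-fetcher/0.1');   # hypothetical bot name

# Called without arguments, the accessors return the current values.
printf "from=%s timeout=%d agent=%s\n", $ua->from, $ua->timeout, $ua->agent;
```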
The $ua has methods for all the HTTP requests like get and post. To duplicate the behaviour of getstore, we first have to get the URL we are interested in:
my $url = "http://www.example.com/";
my $response = $ua->get($url);
The $response is an object too, and we can ask it whether it is_success:
$response->is_success or die $response->status_line;
So if execution flows past this statement, everything went fine. We can now access the content of the response. NB: use the decoded_content method, as this undoes any content encoding (such as gzip) and decodes the character set of textual content for us:
my $content = $response->decoded_content;
We can now print that to a file:
use autodie; # automatic error handling
open my $fh, ">", "some-file.html";
print {$fh} $content;
(when handling binary files on Windows: binmode $fh after opening the file, or use the ">:raw" open mode)
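For the binary case, a minimal sketch of the ">:raw" variant follows. The byte string is a made-up stand-in for downloaded content, and a temporary file is used so the example does not leave anything behind.

```perl
use strict;
use warnings;
use autodie;                       # open/close die with a message on failure
use File::Temp qw(tempfile);

# Stand-in for downloaded bytes, including CR/LF, a NUL, and a high byte.
my $content = "GIF89a\r\n\x00\xFF";

# The handle returned by tempfile is discarded; only the name is kept.
my (undef, $filename) = tempfile(UNLINK => 1);

open my $fh, '>:raw', $filename;   # no newline translation, no encoding layer
print {$fh} $content;
close $fh;

printf "wrote %d bytes, file is %d bytes\n", length($content), (-s $filename);
```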
Done!
To learn about LWP::UserAgent, read the documentation. To learn about objects, read perlootut. You can also visit the perl tag on SO for some book suggestions.
How can I pass the variables from one perl webpage to the next, here is my example:
This is what I want passed from the first page, $data[0] and $data[2]
<a href="Month_entries.pl?month='$data[2]'&user='$data[0]'"
style="text-decoration:none"
onclick="return popitup('Month_entries')">$busitotal2</a>
With the link going to Month_entries.pl, how do I read these variables in the new webpage (Month_entries)? What is this process called?
First, you should make sure that you are constructing the URI you actually want.
You probably don't want ' characters in the data
You should probably be protecting against XSS and broken data with URI::Encode.
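One way to build such a link safely is sketched below. It uses the URI module's query_form method rather than URI::Encode (a substitution made for this example), and the @data values are invented to show characters that need escaping.

```perl
use strict;
use warnings;
use URI;

my @data = ('alice smith', 'ignored', 'March & April');   # hypothetical row

# Build the link without quote characters and let URI escape the values.
my $uri = URI->new('Month_entries.pl');
$uri->query_form(month => $data[2], user => $data[0]);

# Reading the form back returns the decoded key/value pairs.
my %params = $uri->query_form;

print "$uri\n";
```

The resulting string still has to be HTML-escaped when it is placed inside an href attribute, because the & that separates the parameters is special in HTML.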
Then it comes down to getting data from the query string.
How you do this depends on how your server and Perl are communicating.
If you are using Plack (which is generally a good idea for modern Perl), then see the code in the synopsis for Plack::Request:
my $app_or_middleware = sub {
my $env = shift;
my $req = Plack::Request->new($env);
my $path_info = $req->path_info;
# Change 'query' to whatever you called your key in the query string
my $query = $req->param('query');
my $res = $req->new_response(200);
$res->finalize;
};
If you are using a framework (such as Web::Simple, Catalyst or Dancer) then it will probably provide its own interface.
If you are using CGI, and using the CGI module, you would:
my $cgi = CGI->new();
my $query = $cgi->param('query');
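For the link in the question, the keys would be month and user. The sketch below can be run from the command line: passing a query string to CGI->new stands in for a real browser request, and the values are made up. It assumes the CGI module is installed (it is no longer bundled with recent Perl releases).

```perl
use strict;
use warnings;
use CGI;

# Passing a query string to new() stands in for a real request from the browser.
my $cgi = CGI->new('month=March&user=alice');

my $month = $cgi->param('month');
my $user  = $cgi->param('user');

print "month=$month user=$user\n";
```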
Do I gain something when I transform my $url like this: $url = URI->new( $url )?
#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use URI;
use XML::LibXML;
my $url = 'http://stackoverflow.com/';
$url = URI->new( $url );
my $doc = XML::LibXML->load_html( location => $url, recover => 2 );
my @nodes = $doc->getElementsByTagName( 'a' );
say scalar @nodes;
The URI module constructor would clean up the URI for you - for example correctly escape the characters invalid for URI construction (see URI::Escape).
The URI module has several benefits:
It normalizes the URL for you
It can resolve relative URLs
It can detect invalid URLs (although you need to turn off the schemeless bits)
You can easily filter the URLs that you want to process.
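A few of these benefits can be sketched in a handful of lines. The URLs are invented for illustration, and the filter rule (http or https on one host) is only an example of what a spider might choose.

```perl
use strict;
use warnings;
use URI;

# Normalization: lower-case scheme and host, drop the default port.
my $canonical = URI->new('HTTP://Example.COM:80/a')->canonical;

# Resolving a relative link against the page it was found on.
my $absolute = URI->new_abs('../b', 'http://example.com/a/c/d');

# Filtering: keep only http(s) links on one host.
my @links = map { URI->new($_) } (
    'http://example.com/page',
    'mailto:someone@example.com',
    'https://other.example.org/',
);
my @wanted = grep {
    # host is only called once the scheme is known to be http or https
    ($_->scheme // '') =~ /\Ahttps?\z/ && $_->host eq 'example.com'
} @links;

print "$canonical\n$absolute\n@wanted\n";
```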
The benefit that you get with the little bit of code that you show is minimal, but as you continue to work on the problem, perhaps spidering the site, URI becomes more handy as you select what to do next.
I'm surprised nobody has mentioned it yet, but $url = URI->new( $url ); doesn't clean up your $url and hand it back to you, it creates a new object of class URI (or, rather, of one of its subclasses) which can then be passed to other code which requires a URI object. That's not particularly important in this case, since XML::LibXML appears to be happy to accept locations as either strings or objects, but some other modules require you to give them a URI object and will reject URLs presented as plain strings.
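The difference between the string and the object is easy to inspect. This is a minimal sketch, assuming the URI module is installed:

```perl
use strict;
use warnings;
use URI;

my $url = 'http://stackoverflow.com/';
my $uri = URI->new($url);

print ref($uri), "\n";    # the object's class, a subclass chosen by scheme
print "$uri\n";           # objects turn back into text when interpolated
print $uri->host, "\n";   # and they offer accessor methods a string lacks
```

Because the object stringifies back to the URL, code that only expects a plain string usually keeps working when handed the object.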
I've run into an issue with mod_rewrite when submitting forms to our site's Perl scripts. If someone does a GET request on a page with a URL such as http://www.example.com/us/florida/page-title, I rewrite that using the following rewrite rule, which works correctly:
RewriteRule ^us/(.*)/(.*)$ /cgi-bin/script.pl?action=Display&state=$1&page=$2 [NC,L,QSA]
Now, if that page had a form on it I'd like to do a form post to the same url and have Mod Rewrite use the same rewrite rule to call the same script and invoke the same action. However, what's happening is that the rewrite rule is being triggered, the correct script is being called and all form POST variables are being posted, however, the rewritten parameters (action, state & page in this example) aren't being passed to the Perl script. I'm accessing these variables using the same Perl code for both the GET and POST requests:
use CGI;
$query = new CGI;
$action = $query->param('action');
$state = $query->param('state');
$page = $query->param('page');
I included the QSA flag since I figured that might resolve the issue but it didn't. If I do a POST directly to the script URL then everything works correctly. I'd appreciate any help in figuring out why this isn't currently working. Thanks in advance!
If you're doing a POST query, you need to use $query->url_param('action') etc. to get parameters from the query string. You don't need or benefit from the QSA modifier.
Change your script to:
use CGI;
use Data::Dumper;
my $query = CGI->new; # even though I'd rather call the object $cgi
print $query->header('text/plain'), Dumper($query);
and take a look at what is being passed to your script and update your question with that information.