How to get an image from a page in Perl

Using the following code:
use LWP::Simple;
my $url= "http://example.com";
my $html= get $url;
I am able to get the source into the $html variable, and when I print the variable it does contain the code. How would I save one of the images on that page to a folder called images, located in the same folder as this script? I am getting to that page through my $IE->Navigate($url); method.
I tried using the Image::Grab module, but it wouldn't install without force (and I have no idea how to do that).

First, parse the HTML using an appropriate module (I tend towards HTML::TreeBuilder::XPath).
Second, use its API to find the <img> element you care about and extract its URI.
Third, convert that URI to an absolute one with URI if needed.
Fourth, use the getstore method from LWP::Simple to save it.
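The four steps above can be sketched roughly as follows. This is a minimal sketch, not a tested solution for the asker's page: the helper names image_urls and save_image are made up for illustration, and it assumes HTML::TreeBuilder::XPath, URI and LWP::Simple are installed.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
use LWP::Simple qw(getstore is_success);
use URI;

# Steps 1-3: parse the HTML, collect every <img src="...">,
# and resolve each value against the page URL.
sub image_urls {
    my ($html, $base_url) = @_;
    my $tree = HTML::TreeBuilder::XPath->new_from_content($html);
    my @urls = map { URI->new_abs($_, $base_url)->as_string }
               $tree->findvalues('//img/@src');
    $tree->delete;    # free the parse tree
    return @urls;
}

# Step 4: save one image into $dir (hypothetical helper).
# Returns the saved path, or dies with the HTTP status on failure.
sub save_image {
    my ($image_url, $dir) = @_;
    mkdir $dir unless -d $dir;
    # Use the last path segment as the file name; fall back to 'image'.
    my $name = (URI->new($image_url)->path_segments)[-1] || 'image';
    my $file = "$dir/$name";
    my $status = getstore($image_url, $file);
    die "Could not fetch $image_url (HTTP $status)\n"
        unless is_success($status);
    return $file;
}
```

With the $html and $url from the question, something like my @urls = image_urls($html, $url); save_image($urls[0], 'images'); would save the first image on the page. Picking the right image (rather than the first one) would need a more specific XPath expression.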

Related

Login to web site using Perl library Selenium::Remote::Driver v1.28

The simple send_keys method has been removed from the v1.28 version of
Selenium::Remote::Driver
and replaced with send_keys_to_active_element. I'm now unable to log in to a web site that has username and password fields.
Below are the methods I used with the previous library.
How can I do the same using the v1.28 version?
$sel->wait_for_element_present("name=username");
$sel->type("name=username", $username);
$sel->type("name=password", $password);
$sel->submit("name=Login");
I think you're getting confused by the various CPAN modules. The code you show uses methods from
WWW::Selenium,
which has no send_keys method because it has type instead, which is what you use in your code.
Selenium::Remote::WebElement
has a send_keys method. If you want to use this module then you need to call one of the find_element methods
from
Selenium::Remote::Driver
to get a WebElement object, and call send_keys on that. You will also need the
Selenium::Waiter
module to wait for given elements to appear.
Something like this should work, but there is insufficient detail in your question for me to write a full demonstration, and I have no way of testing Perl code at present.
use Selenium::Remote::Driver;
use Selenium::Waiter qw/ wait_until /;
my $driver = Selenium::Remote::Driver->new(...);
$driver->get(...);
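# wait_until takes a block and returns the block's value once it is true
my $username_field = wait_until { $driver->find_element_by_name('username') };
$username_field->send_keys($username);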
$driver->find_element_by_name('password')->send_keys($password);
$driver->find_element_by_name('Login')->submit;

Perl Dancer how to manage form actions

I'm learning Perl Dancer and working on a to-do list that depends on a form selection between two dates (today and tomorrow). If you select today, a to-do list for today is generated; if you select tomorrow, a different list is created.
I've created a Dancer app called: Organizador and have the following in my Organizador.pm:
package Organizador;
use Dancer ':syntax';
use DBI;
our $VERSION = '0.1';
set session => "Simple";
get '/' => sub {
    template 'index';
};
get '/create_to_do_list' => sub {
    template 'create_to_do_list';
};
I've created a file called create_to_do_list.pl which contains the script that I would like to execute when the form is submitted.
<form action="create_to_do_list.pl">
    <legend>Create todo list</legend>
    <label for="todoList">Create a todo list</label>
    <select name='todoList' id='todoList'>
        <option value="today">today</option>
        <option value="tomorrow">tomorrow</option>
    </select>
    <button>Cancel</button>
    <button>Create</button>
</form>
How can I call create_to_do_list.pl as an action on template 'create_to_do_list'; after hitting the create button?
Thanks!
I wanted to move to Dancer, so I thought there was a faster way of calling my script instead of having to copy it... I'm working with thousands and thousands of [CGI] to-do lists...
Ideally, you should convert all of your CGI scripts to modules so that you can use them in non-CGI contexts (e.g. unit tests, web frameworks like Dancer and Mojolicious); however, if you really have thousands of them, that will take a long time.
As a stop-gap measure while you work on the conversion, you can use CGI::Compile and CGI::Emulate::PSGI to create a PSGI wrapper around each of your unconverted CGI scripts. You can easily integrate these with a Dancer2* app using Plack::Builder.
For example, to integrate the following CGI script with a Dancer2 app:
use strict;
use warnings 'all';
use CGI;
my $q = CGI->new;
print $q->header,
      $q->start_html,
      $q->h1('Hello, CGI!'),
      $q->end_html;
Modify bin/app.psgi to look like this:
use strict;
use warnings 'all';
use FindBin;
use lib "$FindBin::Bin/../lib";
use CGI::Compile;
use CGI::Emulate::PSGI;
use Plack::Builder;
use MyApp;
my $foo_cgi = CGI::Compile->compile('/path/to/foo.cgi');
builder {
    mount '/'    => MyApp->to_app;
    mount '/foo' => CGI::Emulate::PSGI->handler($foo_cgi);
};
Now, requests to / will call the / route in MyApp, while requests to /foo will call your CGI script.
In your form, change:
<form action="create_to_do_list.pl">
to:
<form action="/foo">
Make sure the names of your form fields all match what the CGI script is expecting, and voila! You can keep using your CGI script without modification.
(Note that you could skip all the PSGI wrapper business and just continue serving your CGI scripts with Apache or whatever you were using before, but this approach allows you to centralize your routes and simplifies deployment.)
Add a separate mount statement for each CGI script you want to integrate with your app. Note that this approach will probably have performance problems, so you should only use it as a temporary measure while you work on converting your CGI scripts to proper modules.
* For new development, you should really be using Dancer2. Dancer1 is in maintenance mode and although it's still officially supported, it won't be getting any new features. I know you've had trouble getting started with Dancer2, but you should resolve those issues instead of using an old version of the framework. (And it's still unclear what exactly you were having trouble with; you should edit that question if you still need help.)
Firstly, before you go too far down this path, switch from Dancer to Dancer2.
From your comments, it seems that create_to_do_list.pl is a CGI program. Is it running on the same web server? You could probably call it remotely using something from LWP or HTTP::Tiny, but I don't think that's a very good idea - you'll get HTML back which you'll need to parse in some way to extract the useful information.
It's a far better idea to move the code from create_to_do_list.pl into a module. If the CGI program needs to exist as well (for historical reasons, perhaps) then move the core code into a module which can be used from both the CGI program and the new Dancer app. But if you won't need the CGI program once the Dancer app is ready, I'd just copy the code into the correct place in Organizador.pm.
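As a rough illustration of moving the code into a module, here is a minimal sketch. The package name Organizador::TodoList, the build_todo_list function and the returned data structure are all hypothetical stand-ins, since the real contents of create_to_do_list.pl are not shown in the question.

```perl
use strict;
use warnings;

package Organizador::TodoList;    # hypothetical module name

# Hypothetical stand-in for the logic that now lives in
# create_to_do_list.pl. It validates the form value and returns
# a plain data structure instead of printing HTML.
sub build_todo_list {
    my ($day) = @_;
    $day = '' unless defined $day;
    die "Unknown day '$day'\n"
        unless $day eq 'today' || $day eq 'tomorrow';
    return { day => $day, items => [] };
}

package main;

# In Organizador.pm a route could then call the function, e.g.:
#
#   post '/create_to_do_list' => sub {
#       my $list = Organizador::TodoList::build_todo_list( params->{todoList} );
#       template 'to_do_list', { list => $list };   # template name assumed
#   };
```

The form would then use method="post" and action="/create_to_do_list", and the same function could be called from the old CGI program or from unit tests.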
Instead of using DBI directly, you might find it easier to switch to Dancer::Plugin::Database (or its Dancer2 equivalent), but for anything other than the simplest of database programs, I'd recommend DBIx::Class (and Dancer2::Plugin::DBIC).

Opening Web Site

I'm kind of new at Perl. A friend of mine asked me to write him a program that could search for specific ads on his favourite boats-for-sale Web site. It's a very convenient little program that allows a user to search multiple Web sites for specific ads.
Here is how it works. I load the Web page into a temporary file, search it for matching ads, and return the result. It works fine most of the time, but I noticed that some sites won't load and I don't know why.
Here is the script that loads the page and stores it in a temp file:
use LWP::UserAgent;
use HTTP::Response;
use URI::Heuristic;
unless (defined ($content = get ($URL) )) { print "could not get $URL <br>"; }
open (DATABASE, ">$web_page_file");
print DATABASE "$content";
close (DATABASE);
#
I've successfully run it on many sites and it works fine. But recently, two sites won't load. They are:
http://www.babord.ca
http://www.sailboatlistings.com
I have 2 Questions:
A) Can you tell me what is wrong with my script with these two sites?
B) More important, is there a diagnostic tool that can tell me what the problems are (for future problem site)?
The example you posted doesn't work at all for me, and you don't say exactly what isn't working with the two examples you give so it's tough to debug your sample. The below works and I think is a cleaner way of getting what you're looking for:
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
my $URL = 'http://www.yourboatsite.com';
my $mech = WWW::Mechanize->new(); #Autocheck defaults to ON to check for success.
$mech->get($URL); # Use :content_file option to auto-write to a file.
print $mech->content();
You also probably want to tag your entry as perl rather than mod-perl since it's not a mod_perl problem.
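For question B, one simple diagnostic is to look at the status line of the response instead of only testing whether content came back. The sketch below assumes LWP::UserAgent is installed; describe_response and diagnose are hypothetical helper names, and the 15-second timeout is an arbitrary choice. I can't say what these two particular sites return, but a status such as 403 or 500 usually narrows the problem down.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Summarise an HTTP::Response object in one line.
sub describe_response {
    my ($res) = @_;
    return 'OK: ' . $res->status_line if $res->is_success;
    return 'FAILED: ' . $res->status_line;
}

# Fetch $url and report what the server (or LWP itself) said.
sub diagnose {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 15);
    return describe_response($ua->get($url));
}
```

Calling print diagnose($URL), "\n"; for a failing site prints the HTTP status code and message, which is the same information WWW::Mechanize reports when its autocheck option makes a failed get die.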

How do I get the text-form verification code when doing auto site access in Perl?

I'm playing around with Win32::IE::Mechanize to try to access some authentication-required sites automatically. So far I've had moderate success; for example, I can automatically log in to my Yahoo mailbox. But I find many sites use some kind of image verification mechanism, possibly called CAPTCHA, and I can do nothing about those. However, one of the sites I'm trying to access automatically uses a plain-text verification code. It is composed of four digits, selectable and copyable. But the digits are not in the source, which can be fetched using
$mech->content;
I searched for the keyword that appears on the webpage but not in the source file through all the files in the Temporary Internet Files but still can't find it.
Any idea what's going on? I was suspecting that the verification code was somehow hidden in some cookie file but I can't seem to find it :(
The following is the code that completes all the fields requirements except for the verification code:
use warnings;
use Win32::IE::Mechanize;
my $url = "http://www.zjsmap.com/smap/smap_login.jsp";
my $eccode = "myeccode";
my $username = "myaccountname";
my $password = "mypassword";
my $verify = "I can't figure out how to let the script get the code yet";
my $mech = Win32::IE::Mechanize->new(visible=>1);
$mech->get($url);
sleep(1); #avoids undefined value error
$mech->form_name("BaseForm");
$mech->field(ECCODE => $eccode);
$mech->field(MEMBERACCOUNT => $username);
$mech->field(PASSWORD => $password);
$mech->field(verify => $verify);
$mech->click();
Like always any suggestions/comments would be greatly appreciated :)
UPDATE
I've figured out a not-so-smart way to solve this problem. Please comment on my own answer posted below. Thanks like always :)
This is the reason why they are there: to stop programs like yours from doing automated stuff ;-)
A CAPTCHA or Captcha is a type of challenge-response test used in computing to ensure that the response is not generated by a computer.
This appears to be an irrelevant number. The page uses it in three places: generating it; displaying it on the form next to its input field; and checking that the input value equals the random number chosen. That is, it is a client-only check. Still, if you disable JavaScript, my guess is that important cookies don't get set. If you can execute JavaScript in the context of the page (you should be able to with a get method call and a javascript: URI), you could change the value of random_number to, e.g., 42 and fill that in on the form.
The code is inserted by JavaScript – disable JS, reload the page and see it disappear. You have to hunt through the JS code to get an idea where it comes from and how to replicate it.
Thanks to james2vegas, zoul and Shoban.
I've finally figured out on my own a not-so-smart but at-least-workable way to solve the problem I described here, and I'd like to share it. I think the approach suggested by @james2vegas is probably much better... but anyway, I'm learning along the way.
My approach is this:
Although the verification code is not in the source file, it is still selectable and copyable, so I can let my script copy everything on the login page and then extract the verification code.
To do this, I use the SendKeys function in the Win32::GuiTest module to do "Select All" and "Copy" on the login page.
Then I use Win32::Clipboard to get the clipboard content and a regexp to extract the code. Something like this:
$verify = Win32::Clipboard::GetText();
$verify =~ s/.* (\d{4}).*/$1/msg;
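A slightly more defensive variant of the extraction step is sketched below. It captures the code with a match instead of a substitution, so a page without a four-digit number yields undef rather than the whole clipboard text. The function name extract_code is hypothetical, and the sketch assumes the first standalone four-digit number on the page is the verification code, which may not hold if the page also shows, say, a year.

```perl
use strict;
use warnings;

# Return the first run of exactly four digits in $text,
# or undef if there is none.
sub extract_code {
    my ($text) = @_;
    $text = '' unless defined $text;
    # The lookarounds reject four-digit slices of longer numbers.
    my ($code) = $text =~ /(?<!\d)(\d{4})(?!\d)/;
    return $code;
}
```

In the script above it would be called as $verify = extract_code(Win32::Clipboard::GetText());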
A few thoughts:
The random number is generated by something like this in Perl
my $random_number = int(rand(9000)) + 1000; #var random_number = rand(1000,10000);
And then it checks if $verify == $random_number. I don't know how to catch the value of the one-session-only $random_number. I think it is stored somewhere in memory. If I could capture the value directly, I wouldn't have to go to the trouble of using these extra modules.

How can I read the URL-Data send with POST in Perl?

I'm trying to read out the POST-Data that was sent from a form in a page to my Perl Script. I googled and found out that:
read(STDIN, $param_string, $ENV{'CONTENT_LENGTH'})
reads the whole data string and writes it to $param_string in the form of
Param1=Value1&Param2=Value2&Param3=Value3
By splitting it at the right places I get the necessary data.
But I wonder why my $param_string is empty.
When I try the whole thing with GET:
$param_string = $ENV{'QUERY_STRING'};
everything works fine. Does anybody have an idea?
There is absolutely no real reason for someone at your level to want to hand-parse CGI requests.
Please use CGI::Simple or CGI.pm.
CGI.pm has a lot of baggage (HTML generation, function oriented interface) which makes CGI::Simple preferable.
Using any CGI processing module on CPAN is better than trying to write CGI processing code from scratch.
See parse_query_string in CGI::Simple for a way of accessing parameters passed using the query string when processing a form that is POSTed to your script.
If you want to learn how to do it right, you can read the source code of either module. Reading through the CGI.pm CHANGES file is also instructive.
If you are able to retrieve GET data but not POST data, most likely you forgot to change the form method from get to post. You can check the submit method with a condition like this:
if ($ENV{'REQUEST_METHOD'} eq "POST") {
    read(STDIN, $param_string, $ENV{'CONTENT_LENGTH'});
} else {
    $param_string = $ENV{'QUERY_STRING'};
}
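To show what the read-and-split approach involves, here is a minimal sketch using only core Perl. The helper names raw_params, parse_params and url_decode are made up for this example, and it ignores multipart uploads and repeated parameter names (a later value overwrites an earlier one), which is part of why a CGI module is the safer choice for real scripts.

```perl
use strict;
use warnings;

# Decode one application/x-www-form-urlencoded component.
sub url_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;                               # '+' means space
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX escapes
    return $s;
}

# Return the raw parameter string: the request body for POST,
# the query string otherwise. $env is a hash ref like \%ENV and
# $in is the input handle (normally \*STDIN).
sub raw_params {
    my ($env, $in) = @_;
    if (($env->{REQUEST_METHOD} // '') eq 'POST') {
        my $len  = $env->{CONTENT_LENGTH} // 0;
        my $body = '';
        read($in, $body, $len) if $len > 0;
        return $body;
    }
    return $env->{QUERY_STRING} // '';
}

# Split "a=1&b=2" into a hash ref of decoded names and values.
sub parse_params {
    my ($string) = @_;
    my %param;
    for my $pair (split /[&;]/, $string) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $param{ url_decode($name) } = url_decode($value // '');
    }
    return \%param;
}
```

In a CGI script this would be used as my $params = parse_params(raw_params(\%ENV, \*STDIN)); Note that STDIN can only be read once, so if something else in the script has already consumed the request body, the string will come back empty.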
Under mod_perl 2, Apache2::Request works for me.