WWW::Mechanize::Firefox - How to open multiple tabs - perl

I want to open two separate tabs and keep them open in firefox. I am using WWW::Mechanize::Firefox.
Please find the code below:
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize::Firefox;
my @array = <DATA>;
foreach (@array)
{
my $mech = WWW::Mechanize::Firefox->new(
activate => 1, # bring the tab to the foreground
autoclose => 0 # to prevent autoclosing of the Tab
);
$mech->get($_);
}
__DATA__
www.google.com
www.yahoo.com
The above code works fine and creates two tabs. But I am not happy with my approach, as I am creating two objects. I am looking for some way to open multiple tabs. I could not do it with a single $mech object and hence created multiple objects using a for loop.
I have gone through below documentation:
WWW::Mechanize::Firefox
Firefox::Application
WWW::Mechanize::Plugin::Display
WWW::Mechanize::Firefox::Cookbook
However, none of them seem to answer this issue, or I may have missed some information in that documentation.
QUESTION:
Is it possible to open multiple tabs using WWW::Mechanize::Firefox
or some other relevant Firefox mechanize module? If so, how?
Personally, I do not like my code pasted above, which creates
multiple objects. Is it OK or bad to create multiple objects?
Thanks.

There's nothing inherently wrong with creating multiple objects. You may be concerned by the memory overhead. The objects may have extra stuff going on (localhost TCP sockets in this case) that you'd rather not duplicate. You may find it irritating to have to loop over your multiple objects every time you want to perform one logical action that must be shared across the objects.
But nobody's going to recoil in horror. "Multiple objects?! HOW DARE YOU."
If all that you want to do is open multiple tabs, I use something like this:
sub browse {
my $method = $_[1] ? 'new-window' : 'new-tab';
system firefox => -remote => "openURL($_[0],$method)"
}
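As a rough sketch of how that sub could be used for several URLs, the version below only builds and prints the command for each URL. The helper name remote_args is hypothetical, and the commented-out system call assumes a running Firefox build that still accepts the -remote option.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for one `firefox -remote` call.
sub remote_args {
    my ($url, $new_window) = @_;
    my $method = $new_window ? 'new-window' : 'new-tab';
    return ('firefox', '-remote', "openURL($url,$method)");
}

my @urls = ('http://www.google.com', 'http://www.yahoo.com');
for my $url (@urls) {
    my @cmd = remote_args($url);
    print "@cmd\n";
    # Uncomment to launch; assumes Firefox is running and supports -remote.
    # system(@cmd) == 0 or warn "firefox exited with status $?\n";
}
```

Passing the command to system as a list avoids shell quoting problems with URLs that contain characters such as & or ?.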

Related

Selenium WebDriver with Perl

I am trying to run the Selenium driver with Perl bindings, and due to the lack of examples and documentation, I am running into some roadblocks. I have figured out how to do some basic things, but I am having trouble with other simple things, like validating the text on a page using the Selenium::Remote::Driver package.
If I try to do something like this:
$sel->get("https://www.yahoo.com/" );
$ret = $sel->find_element("//div[contains( text(),'Thursday, April 26, 2012')]");
I get a message back that the element couldn't be found. I am using XPath because the driver package doesn't appear to have a sub specifically for finding text, at least not that I've found.
If my XPath setup is wrong or if someone knows a better way, that would be extremely helpful. I'm having problems with some button clicking too, but this seems like it should be easier and is bugging me.
Finding text on a web page and comparing that text to some "known good value" using Selenium::Remote::Driver can be implemented as follows:
File: SomeWebApp.pm
package SomeWebApp;
sub get_text_present {
my $self = shift;
my $target = shift;
my $locator = shift;
my $text = $self->{driver}->find_element($target, $locator)->get_text();
return $text;
}
Somewhere in your test script: test.pl
my $text = $some_web_app->get_text_present("MainContent_RequiredFieldValidator6", "id");
The above finds the element identified by $target using the locating scheme identified by $locator and stores that element's text in the variable $text. You can then use it to compare or validate as needed.
https is a tad slower loading than http. Although WebDriver is pretty good about waiting until it's figured out that the requested page is fully loaded, maybe you need to give it a little help here. Add a sleep(2); after the get() call and see if it works. If it does, try cutting down to 1 second. You can also do a get_title call to see if you've loaded the page you think you have.
The other possibility is that your text target isn't quite exactly the same as what's on the page. You could try looking first for one word, such as "April", and see if you get a hit, and then expand until you find the mismatch (e.g., does this string actually have a newline or break within it? How about an HTML entity such as a non-breaking space?). Also, you are looking for that bit of text anywhere under a div (all child text supposedly is concatenated, and then the search done). That would more likely cast too wide a net than not get anything at all, but it's worth knowing.
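One way to rule out the whitespace and non-breaking-space mismatches described above is to normalize the text before comparing it. This is only a sketch: normalize_text is a hypothetical helper, and $page_text stands in for whatever get_text() would return from the driver.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse whitespace so two strings can be compared loosely.
sub normalize_text {
    my ($text) = @_;
    $text =~ s/\x{00A0}/ /g;   # non-breaking space becomes a plain space
    $text =~ s/\s+/ /g;        # newlines, tabs and runs of spaces become one space
    $text =~ s/^ | $//g;       # trim the ends
    return $text;
}

my $expected  = 'Thursday, April 26, 2012';
# Stand-in for text read from the page; it contains an nbsp and a line break.
my $page_text = "Thursday,\x{00A0}April 26,\n 2012 ";

print normalize_text($page_text) eq $expected ? "match\n" : "mismatch\n";
```

If the normalized strings match but the XPath lookup still fails, the locator rather than the text is the more likely culprit.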

How can I dynamically process all .html files through the Catalyst Framework?

I want to use Catalyst to process all requests: both HTML files (mixed with Template Toolkit) and those normally intended to be processed by Catalyst. I am aware of Catalyst::Plugin::Static::Simple, but it doesn't seem to do what I am describing, since it simply serves files statically.
As an example, I want to show whether the user is logged in on index.html without using ajax or SSI. There are many other cases beside that one.
There is probably a simple answer to this...
Thanks for the pointer, RET. My solution turned out to be pretty simple.
I made the nginx config point to my Catalyst App before serving files directly and added the path to my html files to the TT config in myapp.pm.
Here are the basics of the snippet I added to my Root.pm controller in the default subroutine:
if($c->req->path =~ m{\.html$} || $c->req->path =~ m{\.htm$}) {
$c->stash->{template} = $c->req->path;
$c->detach;
}
elsif($c->req->path !~ m{[.]+}) {
$c->stash->{template} = $c->req->path . '/index.html';
$c->detach;
}
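The same path-to-template rule can be pulled out into a plain function, which makes it easy to check outside of Catalyst. This is a sketch under assumptions: template_for_path is a hypothetical name, and the paths are written without a leading slash, as $c->req->path is assumed to supply them.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the controller logic above.
sub template_for_path {
    my ($path) = @_;
    return $path              if $path =~ m{\.html?$};  # .html or .htm: render as-is
    return "$path/index.html" if $path !~ m{[.]};       # no dot: treat as a directory
    return;                                             # anything else: no template
}

for my $path ('docs/page.html', 'docs', 'css/site.css') {
    my $template = template_for_path($path);
    print "$path => ", (defined $template ? $template : '(not handled)'), "\n";
}
```

In the controller, a defined return value would be assigned to $c->stash->{template} before calling $c->detach.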

Erroring in my Perl script coming from CAM::PDF::Annot module. Don't know why

I believe this may be a bug in the module I am using, or I am just completely overlooking something.
My code is this:
#!/usr/bin/perl
use strict;
use warnings;
use CAM::PDF;
use CAM::PDF::Annot;
sub main()
{
my $pdf = CAM::PDF::Annot->new( 'b.pdf' );
my $otherDoc = CAM::PDF::Annot->new( 'b_an.pdf' );
my $page = 1;
my %refs;
my @list = @{$pdf->getAnnotations($page)};
for my $annotRef (@list){
$otherDoc->appendAnnotation( $page, $pdf, $annotRef, \%refs);
}
$otherDoc->output('pdf_merged.pdf');
}
exit main;
This code was taken almost directly from the synopsis found on the module's CPAN page: http://metacpan.org/pod/CAM::PDF::Annot
The problem comes when I run the script using two PDFs that both have annotations. Using two PDFs without annotations runs. Using one PDF with annotations and one without also runs. Only when both PDFs have annotations does it error.
The error is: "Can't use string ("46") as an ARRAY ref while "strict refs" in use at /usr/opt/perl5/lib/site_perl/5.10.1/CAM/PDF/Annot.pm line 195"
Line 195 of Annot.pm is:
push @{$annots->{value}}, $pupRef;
Annot.pm is inside the CAM::PDF::Annot module.
Any guidance in fixing this would be greatly appreciated!
P.S. In the error, "string ("x")", x is always a number, and seems to change depending on the pdf and the annotations within the pdf.
And I will try to add any other information that you need to help figure this out!
Whenever I have a problem with a CPAN module, I go to its webpage to try and assess its quality and see if any bugs have already been reported.
http://search.cpan.org/~donatoaz/CAM-PDF-Annot-0.06 shows the following suspicious results:
CPAN Testers PASS (2) FAIL (168) NA (49)
It is surprising that you were able to install the module. No one has reported bugs, but there is clearly a major problem with the code. It seems the author is either unaware of the tester reports (which have been sent to his CPAN email address for more than a year), or has stopped maintaining it.
You could submit a bug report, so at least others will be aware of your issue.
I realize this does not answer your question of how to fix the problem, but even if you do identify a fix, the author may not apply it (in which case, someone could start the process of becoming a co-maintainer).
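For anyone digging into the module, the error message itself is what Perl reports when a plain number or string is dereferenced as an array under strict refs. The sketch below reproduces that message with a made-up structure; the hash shown is an assumption for illustration and is not taken from Annot.pm.

```perl
use strict;
use warnings;

# Hypothetical stand-in: 'value' holds a plain number instead of an array ref.
my $annots = { value => 46 };

# Trap the exception so the message can be inspected.
my $ok    = eval { push @{ $annots->{value} }, 'new annotation'; 1 };
my $error = $ok ? '' : $@;
print "died with: $error";

# A defensive check that avoids the fatal dereference.
if (ref $annots->{value} eq 'ARRAY') {
    push @{ $annots->{value} }, 'new annotation';
}
else {
    print "value is not an array reference, skipping\n";
}
```

Printing ref($annots->{value}) just before the failing line in a local copy of the module would show what it actually holds for the problem PDFs.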

Opening Web Site

I'm kind of new at Perl. A friend of mine asked me to write him a program that could search for specific ads on his favourite boats-for-sale Web site. It's a very convenient little program that will allow a user to search multiple Web sites for specific ads.
Here is how it works. I load the Web page into a temporary file, search for matching ads, and return the result. It works fine most of the time, but I noticed that some sites won't load and I don't know why.
Here is the script that loads the pages and stores them in a temp file:
use LWP::UserAgent;
use HTTP::Response;
use URI::Heuristic;
unless (defined ($content = get ($URL) )) { print "could not get $URL <br>"; }
open (DATABASE, ">$web_page_file");
print DATABASE "$content";
close (DATABASE);
#
I've successfully run it on many sites and it works fine. But recently, two sites won't load. They are:
http://www.babord.ca
http://www.sailboatlistings.com
I have 2 Questions:
A) Can you tell me what is wrong with my script with these two sites?
B) More importantly, is there a diagnostic tool that can tell me what the problems are (for future problem sites)?
The example you posted doesn't work at all for me, and you don't say exactly what isn't working with the two examples you give so it's tough to debug your sample. The below works and I think is a cleaner way of getting what you're looking for:
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
my $URL = 'http://www.yourboatsite.com';
my $mech = WWW::Mechanize->new(); #Autocheck defaults to ON to check for success.
$mech->get($URL); # Use :content_file option to auto-write to a file.
print $mech->content();
You also probably want to tag your entry as perl rather than mod-perl since it's not a mod_perl problem.
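On the diagnostic question, one option is to look at the status line of each response rather than only checking whether content came back. The following is a sketch using LWP::UserAgent; describe_response is a hypothetical helper, the 15-second timeout is an arbitrary choice, and the URLs are taken from the command line.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: summarize an HTTP::Response in one line.
sub describe_response {
    my ($response) = @_;
    return $response->is_success
        ? 'OK: ' . $response->status_line
        : 'FAILED: ' . $response->status_line;
}

# Only fetch when URLs are supplied as arguments.
if (@ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 15);
    for my $url (@ARGV) {
        my $response = $ua->get($url);
        print "$url => ", describe_response($response), "\n";
    }
}
```

A status such as 403 or 500 on only some sites usually points at something the server dislikes about the request rather than at the script's file handling.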

How can I run a CGI::Application run mode from the command line?

I have a run mode in my CGI::Application web-app that I would like to be able to trigger from the command line so I can automate it. From the web-app's perspective it does some processing, then sends the results in an email.
When called from the web interface it passes in a set of parameters (email address, which query to run, date, etc) so these need to be passed in.
How can I construct a call to the CGI::Application app that will be the same as if I ran it from the web?
The original CGI specification makes it easy to run things from the command line and was fully intended not as a specific HTTP-only interface but something that could handle FTP and gopher as well as new top-level URL schemes. I know what I wanted when I helped specify it.
The spec I referenced should give you all you need, but for the most part it is just a collection of environment variables. If you see a request for:
http://some.server.com/some/path?a=b&c=d
The environment variables come out looking like this:
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
HTTP_HOST=some.server.com
SERVER_PORT=80
PATH_INFO=/some/path
QUERY_STRING=a=b&c=d
To reverse the polarity of that in Perl would go something like this:
$ENV{'SERVER_PROTOCOL'} = 'HTTP/1.1';
$ENV{'REQUEST_METHOD'} = 'GET';
$ENV{'SERVER_PORT'} = 80;
$ENV{'PATH_INFO'} = '/some/path';
$ENV{'QUERY_STRING'} = 'a=b&c=d';
system("perl your-CGI-script.pl");
Things get a bit more complicated in handling POST queries and there are more possible environment variables that may be required. Worst case you can enumerate them all with a quick CGI script something like:
print "Content-Type: text/plain\r\n\r\n";
foreach (keys(%ENV))
{
print "$_=$ENV{$_}\r\n";
}
Now put that on the web server in place of your CGI script and you'll see all the environment that gets passed in (and the original environment so you'll need to make a few judgement calls).
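For simple GET requests, the environment can also be derived from a URL instead of being typed out by hand. This is a rough sketch: cgi_env_for_url is a hypothetical helper, it only understands plain http URLs, it follows the simplified mapping used in this answer (the whole path goes into PATH_INFO), and it uses QUERY_STRING, the CGI specification's name for the query variable.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an http URL into a minimal set of CGI variables.
sub cgi_env_for_url {
    my ($url) = @_;
    my ($host, $port, $path, $query) =
        $url =~ m{^http://([^/:?]+)(?::(\d+))?([^?]*)(?:\?(.*))?$}
        or die "cannot parse URL: $url\n";
    return (
        REQUEST_METHOD => 'GET',
        HTTP_HOST      => $host,
        SERVER_PORT    => defined $port  ? $port  : 80,
        PATH_INFO      => $path,
        QUERY_STRING   => defined $query ? $query : '',
    );
}

my %env = cgi_env_for_url('http://some.server.com/some/path?a=b&c=d');
print "$_=$env{$_}\n" for sort keys %env;

# To run a script with this environment (the script name is a placeholder):
# { local @ENV{keys %env} = values %env; system($^X, 'your-CGI-script.pl'); }
```

Using local on the %ENV slice keeps the changes confined to the block, so the rest of the calling script sees its original environment.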
Upon further digging through the CGI::App and the CGI documentation, it appeared to be more straightforward than I thought. The simplest case (no real argument handling or dealing with the output from the webapp run call) is:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use WebApp;
my $cgi = CGI->new( { @ARGV } ); # treat the arguments as alternating name/value pairs
my $webapp = WebApp->new( QUERY => $cgi );
$webapp->run();
It just takes a series of space separated name value pairs to create the CGI. You need to pass in the run mode and all the arguments.
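If the arguments are given as name=value tokens instead, a small parser can build the parameter hash first. This is a sketch only: args_to_params is a hypothetical helper, and the resulting hash reference is assumed to be passed on to CGI->new, which accepts a hash reference of parameters.

```perl
use strict;
use warnings;

# Hypothetical helper: turn name=value tokens into a hash reference.
sub args_to_params {
    my @args = @_;
    my %params;
    for my $arg (@args) {
        # Split on the first '=' only, so values may contain '=' themselves.
        my ($name, $value) = split /=/, $arg, 2;
        next unless defined $name && length $name;
        $params{$name} = defined $value ? $value : '';
    }
    return \%params;
}

my $params = args_to_params(@ARGV);
print "$_=$params->{$_}\n" for sort keys %$params;
```

Called as perl script.pl rm=send_report date=2012-04-26, this prints one line per parameter, which makes it easy to confirm what the application would receive.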
I'm the maintainer of CGI::Application, and I do this all the time. I have dozens of cron scripts built with CGI::Application because it's convenient to share the infrastructure with the application.
The simplest approach is this:
# There is no browser to return results to.
$ENV{CGI_APP_RETURN_ONLY} = 1;
my $app = WebApp->new;
$app->direct_run_mode_method;
In that example, you bypass the normal flow and call a method directly. Be sure you don't need any of the "setup" or "teardown" actions to happen in that case.
If you just have one run mode you are calling, you can also just set the "start_mode", and call run(), so then the default run mode is called by default.
Another idea: you can use a module like Getopt::Long and pass in values through the PARAM hash to new(), or completely replace the run-mode selection process. Here's an example where command line flags are used to determine the run mode:
use Getopt::Long; # provides GetOptions
sub setup {
my $self = shift;
$self->start_mode('send_error_digests');
$self->run_modes([qw/
send_error_digests
help
/]);
my ($dry_run, $help);
GetOptions(
'dry-run' => \$dry_run,
'help' => \$help
);
$self->param('dry_run' => $dry_run);
$self->mode_param(sub {
return 'help' if $help;
return $self->start_mode();
});
}
Thusly:
$ perl yourscript.pl field1=value1 field2=value2
Perl's CGI library takes care of the magic for you, and it appears that CGI::Application relies on CGI (judging from their example code).
Instead of having to go through CGI::Application every time you want to get something done, enforce a proper separation of concerns, perhaps using an MVC setup. All of the functionality should exist outside of the CGI::Application stuff since that should only work as a controller. Once you separate out those bits, you can easily write other controllers for other input methods.
Don't write a web application; write an application that happens to have a web interface. When you have that, you can easily give your application other sorts of interfaces.
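A minimal sketch of that separation might look like the following. The package name DigestReport, its build function, and the sample data are all hypothetical; the point is only that the logic takes plain arguments and returns a string, so a run mode and a cron script can both call it.

```perl
use strict;
use warnings;

package DigestReport;   # hypothetical module holding the real work

# Build the report text from plain arguments; knows nothing about CGI.
sub build {
    my (%args) = @_;
    my $errors = $args{errors} || [];
    my $count  = scalar @$errors;
    return "To: $args{email}\n$count error(s):\n"
         . join('', map { "- $_\n" } @$errors);
}

package main;

# Command-line controller: a web run mode would call the same function
# with values taken from the query instead.
print DigestReport::build(
    email  => 'ops@example.com',
    errors => [ 'disk full', 'timeout' ],
);
```

With the logic in its own package, the CGI::Application run mode shrinks to collecting parameters and returning the result.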
You could automate by calling the web app using curl, wget, or an LWP GET-script with the appropriate parameters. I've used a similar system for cron-driven tasks with a Catalyst application.
That deals with all the environment variables for you.