I'm fairly new to Perl. A friend asked me to write him a program that searches for specific ads on his favourite boats-for-sale web site. It's a convenient little program that lets a user search multiple web sites for specific ads.
Here is how it works: I load the web page into a temporary file, search it for matching ads, and return the results. It works fine most of the time, but I noticed that some sites won't load and I don't know why.
Here is the script that loads a page and stores it in a temp file:
use LWP::UserAgent;
use HTTP::Response;
use URI::Heuristic;
unless (defined ($content = get ($URL) )) { print "could not get $URL <br>"; }
open (DATABASE, ">$web_page_file");
print DATABASE "$content";
close (DATABASE);
#
I've run it successfully on many sites and it works fine. But recently, two sites won't load:
http://www.babord.ca
http://www.sailboatlistings.com
I have two questions:
A) Can you tell me what is wrong with my script for these two sites?
B) More importantly, is there a diagnostic tool that can tell me what the problem is (for future problem sites)?
The example you posted doesn't work at all for me, and you don't say exactly what isn't working with the two sites you mention, so it's tough to debug your sample. The code below works, and I think it is a cleaner way of getting what you're looking for:
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
my $URL = 'http://www.yourboatsite.com';
my $mech = WWW::Mechanize->new(); #Autocheck defaults to ON to check for success.
$mech->get($URL); # Use :content_file option to auto-write to a file.
print $mech->content();
You also probably want to tag your entry as perl rather than mod-perl since it's not a mod_perl problem.
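On the second question (a diagnostic for future problem sites), here is a minimal sketch that reports why a fetch failed rather than just that it failed. It uses the core module HTTP::Tiny instead of LWP or WWW::Mechanize, purely so it runs without extra installs. The helper name describe_response and the agent string are made up for illustration, and I have not checked what the two sites in the question actually return.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Turn an HTTP::Tiny response hashref into a one-line diagnostic.
sub describe_response {
    my ($url, $res) = @_;
    return "OK $res->{status} for $url" if $res->{success};

    # Status 599 is HTTP::Tiny's marker for an internal error
    # (DNS failure, timeout, TLS problem); the text is in {content}.
    if ($res->{status} == 599) {
        (my $why = $res->{content} // '') =~ s/\s+\z//;
        return "FAILED before any HTTP reply for $url: $why";
    }
    return "FAILED $res->{status} $res->{reason} for $url";
}

# Hypothetical usage: fetch only when URLs are given on the command line.
if (@ARGV) {
    my $http = HTTP::Tiny->new(
        agent   => 'Mozilla/5.0 (boat-ad-checker)',   # assumed value
        timeout => 15,
    );
    print describe_response($_, $http->get($_)), "\n" for @ARGV;
}
```

Run it with one or more URLs as arguments. A 4xx or 5xx line means the server answered and refused or failed; the "before any HTTP reply" line means the request never got that far.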
I'm learning Perl Dancer and working on a to-do list that depends on a form selection between two dates (today and tomorrow). If you select today, a to-do list for today is generated; if you select tomorrow, a different list is created.
I've created a Dancer app called Organizador and have the following in my Organizador.pm:
package Organizador;
use Dancer ':syntax';
use DBI;
our $VERSION = '0.1';
set session => "Simple";
get '/' => sub{
template 'index';
};
get '/create_to_do_list'=>sub{
template 'create_to_do_list';
};
I've created a file called create_to_do_list.pl which contains the script that I would like to execute when the form is submitted.
<form action="create_to_do_list.pl">
<legend>Create todo list</legend>
<label for="todoList">Create a todo list</label>
<select name='todoList' id='todoList'>
<option value="today">today</option>
<option value="tomorrow">tomorrow</option>
</select>
<button>Cancel</button>
<button>Create</button>
</form>
How can I call create_to_do_list.pl as an action on template 'create_to_do_list'; after hitting the create button?
Thanks!
I wanted to move to Dancer, so I thought there was a faster way of calling my script instead of having to copy it... I'm working with thousands and thousands of [CGI] to-do lists...
Ideally, you should convert all of your CGI scripts to modules so that you can use them in non-CGI contexts (e.g. unit tests, web frameworks like Dancer and Mojolicious); however, if you really have thousands of them, that will take a long time.
As a stop-gap measure while you work on the conversion, you can use CGI::Compile and CGI::Emulate::PSGI to create a PSGI wrapper around each of your unconverted CGI scripts. You can easily integrate these with a Dancer2* app using Plack::Builder.
For example, to integrate the following CGI script with a Dancer2 app:
use strict;
use warnings 'all';
use CGI;
my $q = CGI->new;
print $q->header,
$q->start_html,
$q->h1('Hello, CGI!'),
$q->end_html;
Modify bin/app.psgi to look like this:
use strict;
use warnings 'all';
use FindBin;
use lib "$FindBin::Bin/../lib";
use CGI::Compile;
use CGI::Emulate::PSGI;
use Plack::Builder;
use MyApp;
my $foo_cgi = CGI::Compile->compile('/path/to/foo.cgi');
builder {
mount '/' => MyApp->to_app;
mount '/foo' => CGI::Emulate::PSGI->handler($foo_cgi);
};
Now, requests to / will call the / route in MyApp, while requests to /foo will call your CGI script.
In your form, change:
<form action="create_to_do_list.pl">
to:
<form action="/foo">
Make sure the names of your form fields all match what the CGI script is expecting, and voila! You can keep using your CGI script without modification.
(Note that you could skip all the PSGI wrapper business and just continue serving your CGI scripts with Apache or whatever you were using before, but this approach allows you to centralize your routes and simplifies deployment.)
Add a separate mount statement for each CGI script you want to integrate with your app. Note that this approach will probably have performance problems, so you should only use it as a temporary measure while you work on converting your CGI scripts to proper modules.
* For new development, you should really be using Dancer2. Dancer1 is in maintenance mode and although it's still officially supported, it won't be getting any new features. I know you've had trouble getting started with Dancer2, but you should resolve those issues instead of using an old version of the framework. (And it's still unclear what exactly you were having trouble with; you should edit that question if you still need help.)
Firstly, before you go too far down this path, switch from Dancer to Dancer2.
From your comments, it seems that create_to_do_list.pl is a CGI program. Is it running on the same web server? You could probably call it remotely using something from LWP or HTTP::Tiny, but I don't think that's a very good idea - you'll get HTML back which you'll need to parse in some way to extract the useful information.
It's a far better idea to move the code from create_to_do_list.pl into a module. If the CGI program needs to exist as well (for historical reasons, perhaps) then move the core code into a module which can be used from both the CGI program and the new Dancer app. But if you won't need the CGI program once the Dancer app is ready, I'd just copy the code into the correct place in Organizador.pm.
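As a rough sketch of that module split: the core logic goes into a package that knows nothing about CGI or Dancer, and each front end becomes a thin caller. The package name Organizador::TodoList, the functions date_for and build, and the shape of the returned structure are all hypothetical, since the real contents of create_to_do_list.pl aren't shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical module holding the logic that used to live in
# create_to_do_list.pl. In a real project this would be its own
# file, e.g. lib/Organizador/TodoList.pm.
package Organizador::TodoList {
    use POSIX qw(strftime);

    # Return the date (YYYY-MM-DD, UTC) for 'today' or 'tomorrow'.
    # $now is an optional epoch time, mainly useful for testing.
    sub date_for {
        my ($when, $now) = @_;
        $now //= time;
        my %offset = (today => 0, tomorrow => 1);
        die "unknown selection '$when'\n" unless exists $offset{$when};
        return strftime('%Y-%m-%d', gmtime($now + $offset{$when} * 86_400));
    }

    # Build the list structure; the items are left as a placeholder.
    sub build {
        my ($when, $now) = @_;
        return { when => $when, date => date_for($when, $now), items => [] };
    }
}

package main;

# Any front end (CGI script, Dancer route, test) makes the same call.
my $list = Organizador::TodoList::build('tomorrow');
print "$list->{when}: $list->{date}\n";
```

The CGI program would pass in the value of its todoList parameter, and the Dancer route handling the form would pass in the same form field, so both share one implementation.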
Instead of using DBI directly, you might find it easier to switch to Dancer::Plugin::Database (or its Dancer2 equivalent), but for anything other than the simplest of database programs, I'd recommend DBIx::Class (and Dancer2::Plugin::DBIC).
I want to open two separate tabs and keep them open in firefox. I am using WWW::Mechanize::Firefox.
Please find the code below:
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize::Firefox;
my @array = <DATA>;
foreach (@array)
{
my $mech = WWW::Mechanize::Firefox->new(
activate => 1, # bring the tab to the foreground
autoclose => 0 # to prevent autoclosing of the Tab
);
$mech->get($_);
}
__DATA__
www.google.com
www.yahoo.com
The above code works fine and creates two tabs. But I am not happy with my approach, as I am creating two objects. I am looking for some way to open multiple tabs. I could not do it with a single $mech object and hence created multiple objects using a for loop.
I have gone through below documentation:
WWW::Mechanize::Firefox
Firefox::Application
WWW::Mechanize::Plugin::Display
WWW::Mechanize::Firefox::Cookbook
However, none of them seems to address this issue, or I may have missed something in that documentation.
QUESTION:
Is it possible to open multiple tabs using WWW::Mechanize::Firefox or some other relevant Firefox mechanize module? And how?
Personally, I did not like my code pasted above, which creates multiple objects. Is it OK or bad to create multiple objects?
Thanks.
There's nothing inherently wrong with creating multiple objects. You may be concerned by the memory overhead. The objects may have extra stuff going on (localhost TCP sockets in this case) that you'd rather not duplicate. You may find it irritating to have to loop over your multiple objects every time you want to perform one logical action that must be shared across the objects.
But nobody's going to recoil in horror. "Multiple objects?! HOW DARE YOU."
If all that you want to do is open multiple tabs, I use something like this:
sub browse {
my $method = $_[1] ? 'new-window' : 'new-tab';
system firefox => -remote => "openURL($_[0],$method)"
}
Well, at the moment I have two goals.
A user doesn't have edit-bug rights in Bugzilla, but he/she should still be able to write/post comments on that bug. I think this could be possible with the following API, but I am not sure, since I am new to Bugzilla and Perl. http://www.bugzilla.org/docs/tip/en/html/api/Bugzilla/WebService/Bug.html#add_comment
I want to import bugs using importxml.pl, but I don't want new entries in the DB. I just want to modify some fields of existing Bugzilla bugs, based on a bug.xml file which contains the bug info.
i.e. perl -T C:\bugzilla\bugzilla\importxml.pl -v C:\bugzilla\bugzilla\mybugs\bug.xml
Maybe the following API could be helpful, but I am not sure.
http://www.bugzilla.org/docs/tip/en/html/api/Bugzilla/WebService/Bug.html#update
So, what are the possible ways to achieve these goals?
My thinking is that maybe I should use the methods of these APIs in the existing Bugzilla code, so that:
Comments would be enabled for users who don't have the bug-edit right.
I would run the importxml.pl script from the command line, passing some parameters, and modify some fields of existing bugs.
But I am not sure whether I am thinking along the right lines. I also don't know how to use the methods of these APIs.
The email_in.pl script can do the types of things you are asking. However, you will need to create a user that does have permissions to make the changes and you will need to transform the data into a form that email_in.pl understands.
http://www.bugzilla.org/docs/4.2/en/html/api/email_in.html
I can help with the first point:
Here's an excerpt from a svn_bz_append.pl script (http://www.telegraphics.com.au/svn/svn_bz/trunk/) that I've modified and use to update Bugzilla comments on svn commits. Note that I have this script running on the same machine as the Bugzilla install, as it uses modules from within the Bugzilla directory. I have this working for Bugzilla v 4.2.3.
I've omitted quite a bit of this script to pull out the excerpt below:
use strict;
use warnings;
use Bugzilla;
use Bugzilla::Config;
use Bugzilla::Bug;
use Data::Dumper;
... create/fetch the userid and some bug Ids to work on ...
eg:
my $userid = 1;
my @bugs = ( 1, 2, 3 );
my $message = 'Say something here';
... now loop through the bug ids and add the comment...
foreach my $bugId (@bugs) {
my $user = new Bugzilla::User({ id => $userid})
|| ThrowUserError('invalid_username', { id => $userid}); #get the user from bugzilla
print STDERR 'user: '. Dumper($user); #pretty prints the user object
Bugzilla->set_user($user); #this authenticates the user so that you may perform actions on bugs that the user has permissions to.
my $bug = Bugzilla::Bug->check($bugId); #gets the bug
print STDERR 'bug: '. Dumper($bug); #pretty prints the bug object
$bug->add_comment($message); #adds a comment to the bug
$bug->update(); #updated the bug - don't forget to do this!
}
Please note that the Dumper functions are just using the excellent Data::Dumper module: http://perldoc.perl.org/Data/Dumper.html - you don't need them except for debugging.
The log in info came from: How can I authenticate when using the Bugzilla Perl API in a script?
I believe this may be a bug in the module I am using, or I am just completely overlooking something.
My code is this:
#!/usr/bin/perl
use strict;
use warnings;
use CAM::PDF;
use CAM::PDF::Annot;
sub main()
{
my $pdf = CAM::PDF::Annot->new( 'b.pdf' );
my $otherDoc = CAM::PDF::Annot->new( 'b_an.pdf' );
my $page = 1;
my %refs;
my @list = @{$pdf->getAnnotations($page)};
for my $annotRef (@list){
$otherDoc->appendAnnotation( $page, $pdf, $annotRef, \%refs);
}
$otherDoc->output('pdf_merged.pdf');
}
exit main;
This code was taken almost directly from the synopsis found on the module's CPAN page: http://metacpan.org/pod/CAM::PDF::Annot
The problem comes when I run the script using TWO pdf's with annotations. Using two pdf's without annotations runs. Using one pdf with annotations, and one pdf without annotations, runs. Only when both pdf's have annotations does it error.
The error is: "Can't use string ("46") as an ARRAY ref while "strict refs" in use at /usr/opt/perl5/lib/site_perl/5.10.1/CAM/PDF/Annot.pm line 195"
Line 195 of Annot.pm is:
push @{$annots->{value}}, $pupRef;
Annot.pm is inside the CAM::PDF::Annot module.
Any guidance in fixing this would be greatly appreciated!
P.S. In the error, "string ("x")", x is always a number, and seems to change depending on the pdf and the annotations within the pdf.
And I will try to add any other information that you need to help figure this out!
Whenever I have a problem with a CPAN module, I go to its webpage to try and assess its quality and see if any bugs have already been reported.
http://search.cpan.org/~donatoaz/CAM-PDF-Annot-0.06 shows the following suspicious results:
CPAN Testers PASS (2) FAIL (168) NA (49)
It is surprising that you were able to install the module. No one has reported bugs, but there is clearly a major problem with the code. It seems the author is either unaware of the tester reports (which have been sent to his CPAN email address for more than a year), or has stopped maintaining it.
You could submit a bug report, so at least others will be aware of your issue.
I realize this does not answer your question of how to fix the problem, but even if you do identify a fix, the author may not apply it (in which case, someone could start the process of becoming a co-maintainer).
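As for what the error message itself means: the failing line dereferences $annots->{value} as an array, and Perl complains when that value turns out to be a plain scalar such as 46 instead of an array reference. The sketch below reproduces the same class of error outside the module. The names try_push and safe_push are made up for illustration, and I have not traced why CAM::PDF::Annot ends up with a number there.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Append to $node->{value}; return the error text, or '' on success.
sub try_push {
    my ($node, $item) = @_;
    my $ok = eval { push @{ $node->{value} }, $item; 1 };
    return $ok ? '' : $@;
}

# Defensive variant: only dereference when the value really is an array ref.
sub safe_push {
    my ($node, $item) = @_;
    return 0 unless ref($node->{value}) eq 'ARRAY';
    push @{ $node->{value} }, $item;
    return 1;
}

my $good = { value => [] };   # what the failing line expects
my $bad  = { value => 46 };   # a plain number where an array ref was expected

print "good: ", (try_push($good, 'x') || "ok\n");
print "bad:  ", try_push($bad, 'x');
print "safe_push on bad returns ", safe_push($bad, 'x'), "\n";
```

Dumping the annotation structure of each PDF (for example with Data::Dumper) just before the failing call should show which entry holds a number instead of an array reference, which is useful detail for a bug report.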
I have a run mode in my CGI::Application web-app that I would like to be able to trigger from the command line so I can automate it. From the web-app's perspective it does some processing then sends the results in an email.
When called from the web interface it passes in a set of parameters (email address, which query to run, date, etc) so these need to be passed in.
How can I construct a call to the CGI::Application app that will be the same as if I ran it from the web?
The original CGI specification makes it easy to run things from the command line and was fully intended not as a specific HTTP-only interface but something that could handle FTP and gopher as well as new top-level URL schemes. I know what I wanted when I helped specify it.
The spec I referenced should give you all you need, but for the most part it is just a collection of environment variables. If you see a request for:
http://some.server.com/some/path?a=b&c=d
The environment variables come out looking like this:
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
HTTP_HOST=some.server.com
SERVER_PORT=80
PATH_INFO=/some/path
QUERY_STRING=a=b&c=d
To reverse the polarity of that in Perl would go something like this:
$ENV{'SERVER_PROTOCOL'} = 'HTTP/1.1';
$ENV{'REQUEST_METHOD'} = 'GET';
$ENV{'SERVER_PORT'} = 80;
$ENV{'PATH_INFO'} = '/some/path';
$ENV{'QUERY_STRING'} = 'a=b&c=d';
system("perl your-CGI-script.pl");
Things get a bit more complicated in handling POST queries and there are more possible environment variables that may be required. Worst case you can enumerate them all with a quick CGI script something like:
print "Content-Type: text/plain\r\n\r\n";
foreach (keys(%ENV))
{
print "$_=$ENV{$_}\r\n";
}
Now put that on the web server in place of your CGI script and you'll see all the environment that gets passed in (and the original environment so you'll need to make a few judgement calls).
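To avoid typing those variables by hand, the same idea can be wrapped in a small helper that derives them from a URL. This is only a sketch: cgi_env_for and run_as_cgi are made-up names, only plain http GET URLs are handled, and the whole URL path is treated as PATH_INFO, which may not match how your server splits script name and path info. The query string goes into QUERY_STRING, the variable a CGI script reads it from.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a simple http URL into the CGI variables a GET request would set.
# Hypothetical helper; treats the whole URL path as PATH_INFO.
sub cgi_env_for {
    my ($url) = @_;
    my ($host, $port, $path, $query) =
        $url =~ m{^http://([^/:?#]+)(?::(\d+))?([^?#]*)(?:\?([^#]*))?}
        or die "cannot parse '$url'\n";
    return (
        GATEWAY_INTERFACE => 'CGI/1.1',
        REQUEST_METHOD    => 'GET',
        SERVER_NAME       => $host,
        HTTP_HOST         => $host,
        SERVER_PORT       => $port // 80,
        PATH_INFO         => $path,
        QUERY_STRING      => $query // '',
    );
}

# Run a command with those variables added; our own %ENV is
# restored when this sub returns.
sub run_as_cgi {
    my ($url, @command) = @_;
    local %ENV = (%ENV, cgi_env_for($url));
    return system(@command);
}

# Hypothetical usage:
#   perl this-script.pl 'http://some.server.com/some/path?a=b&c=d' perl your-CGI-script.pl
run_as_cgi(@ARGV) if @ARGV >= 2;
```

Because %ENV is localized, the calling script's environment is untouched after the child process exits, so several CGI scripts can be driven in a row from one cron job.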
Upon further digging through the CGI::App and the CGI documentation, it appeared to be more straightforward than I thought. The simplest case (no real argument handling or dealing with the output from the webapp run call) is:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use WebApp;
my $cgi = CGI->new( { @ARGV } ); # treat the arguments as name/value pairs
my $webapp = WebApp->new( QUERY => $cgi );
$webapp->run();
It just takes a series of space separated name value pairs to create the CGI. You need to pass in the run mode and all the arguments.
I'm the maintainer of CGI::Application, and I do this all the time. I have dozens of cron scripts built with CGI::Application because it's convenient to share the infrastructure with the application.
The simplest approach is this:
# There is no browser to return results to.
$ENV{CGI_APP_RETURN_ONLY} = 1;
my $app = WebApp->new;
$app->direct_run_mode_method;
In that example, you bypass the normal flow and call a method directly. Be sure you don't need any of the "setup" or "teardown" actions to happen in that case.
If you just have one run mode you are calling, you can also just set the "start_mode" and call run(), so that run mode is the one called by default.
Another idea: you can use a module like Getopt::Long and pass in values through the PARAM hash to new(), or completely replace the run-mode selection process. Here's an example where command line flags are used to determine the run mode:
use Getopt::Long;
sub setup {
my $self = shift;
$self->start_mode('send_error_digests');
$self->run_modes([qw/
send_error_digests
help
/]);
my ($dry_run, $help);
GetOptions(
'dry-run' => \$dry_run,
'help' => \$help
);
$self->param('dry_run' => $dry_run);
$self->mode_param(sub {
return 'help' if $help;
return $self->start_mode();
});
}
Thusly:
$ perl yourscript.pl field1=value1 field2=value2
Perl's CGI library takes care of the magic for you, and it appears that CGI::Application relies on CGI (judging from their example code).
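For anyone curious what that "magic" amounts to, or who wants the parameters without going through CGI.pm at all, here is a minimal sketch that turns name=value words into a parameter hash. The function name args_to_params is hypothetical; this is not CGI.pm's own parser, and unlike a real query string it does no URL-decoding.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse name=value command-line words into a hash ref.
# A repeated name collects its values into an array ref.
sub args_to_params {
    my %params;
    for my $arg (@_) {
        my ($name, $value) = split /=/, $arg, 2;
        die "expected name=value, got '$arg'\n" unless defined $value;
        if (exists $params{$name}) {
            $params{$name} = [ $params{$name} ] unless ref $params{$name};
            push @{ $params{$name} }, $value;
        }
        else {
            $params{$name} = $value;
        }
    }
    return \%params;
}

# Show what was parsed from the real command line.
my $params = args_to_params(@ARGV);
for my $name (sort keys %$params) {
    my $v = $params->{$name};
    print "$name=", (ref $v ? join(',', @$v) : $v), "\n";
}
```

The resulting hash ref is in the form CGI->new accepts as its initializer, so it could be handed to CGI->new and then on to the application through the QUERY argument.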
Instead of having to go through CGI::Application every time you want to get something done, enforce a proper separation of concerns, perhaps using an MVC setup. All of the functionality should exist outside of the CGI::Application stuff since that should only work as a controller. Once you separate out those bits, you can easily write other controllers for other input methods.
Don't write a web application; write an application that happens to have a web interface. When you have that, you can easily give your application other sorts of interfaces.
You could automate by calling the web app using curl, wget, or an LWP GET-script with the appropriate parameters. I've used a similar system for cron-driven tasks with a Catalyst application.
That deals with all the environment variables for you.