Thumbnail screenshots of web pages with Perl and WWW::Mechanize

I use WWW::Mechanize::Firefox to control a Firefox instance and dump the rendered page with $mech->content_as_png.
New update:
Thanks to user1126070 we have a new solution, which I want to try out later today (right now I am at the office, not at home in front of the machine with the program):
$mech->repl->repl->setup_client( { extra_client_args => { timeout => 5*60 } } );
I tried out the version that puts the links into @list and uses eval:
while (scalar(@list)) {
    my $link = pop(@list);
    print "trying $link\n";
    eval {
        $mech->get($link);
        sleep (5);
        my $png = $mech->content_as_png();
        my $name = $link;
        $name =~ s/^www\.//;
        $name .= ".png";
        open(OUTPUT, ">", $name) or die "Cannot write $name: $!";
        print OUTPUT $png;
        close(OUTPUT);
    };
    if ($@) {
        print "link: $link failed\n";
        push(@list, $link);    # put it back on the list for another attempt
        next;
    }
    print "$link is done!\n";
}
BTW: user1126070, what about trimming the images down to thumbnail size? Should I use Imager here? Can you suggest a solution? That would be great.
End of update.
The original problem outline, as written at the very beginning of this Q & A, continues here.
Problem outline: I have a list of 2500 websites and need to grab a thumbnail screenshot of each. How do I do that? I could try to parse the sites with Perl; Mechanize would be a good fit. Note: I only need the results as thumbnails that are at most 240 pixels in the long dimension. At the moment I have a solution that is slow and does not produce thumbnails. How can I make the script run faster, with less overhead, and have it output thumbnails?
I have to be aware that setting it up can pose quite a challenge, though.
If all works as expected, you can simply use a script like this to dump images of the desired websites, but you should start Firefox and resize it to the desired width manually (height doesn't matter, WWW::Mechanize::Firefox always dumps the whole page).
What I have done so far is a lot: I work with mozrepl. At the moment I struggle with timeouts: is there a way to specify the Net::Telnet timeout with WWW::Mechanize::Firefox?
At the moment my internet connection is very slow, and sometimes I get an error with $mech->get():
command timed-out at /usr/local/share/perl/5.12.3/MozRepl/ line 186
I tried this:
$mech->repl->repl->timeout(100000);
Unfortunately it does not work: Can't locate object method "timeout" via package "MozRepl"
The documentation says this should work:
$mech->repl->repl->setup_client( { extra_client_args => { timeout => 1 +80 } } );
What I have tried already:
use strict;
use warnings;
use WWW::Mechanize::Firefox;
my $mech = WWW::Mechanize::Firefox->new();
open(INPUT, "<", "urls.txt") or die $!;
while (<INPUT>) {
    chomp;
    print "$_\n";
    $mech->get($_);
    my $png = $mech->content_as_png();
    my $name = "$_";
    $name =~ s/^www\.//;
    $name .= ".png";
    open(OUTPUT, ">", $name) or die "Cannot write $name: $!";
    print OUTPUT $png;
    close(OUTPUT);
    sleep (5);
}
close(INPUT);
Well, this does not take care of the size. See the command-line output:
linux-vi17:/home/martin/perl # perl
command timed-out at /usr/lib/perl5/site_perl/5.12.3/MozRepl/ line 186
linux-vi17:/home/martin/perl #
And here is my source: a sample of the sites I have in the URL list.
urls.txt - the list of sources and so on...
BTW: With that many URLs we have to expect that some will fail, and handle that. For example, we put the failed ones in an array or hash and retry them X times.
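The "retry the failed ones X times" idea could be sketched roughly like this. It is only a sketch: process_with_retries and the fake fetcher are made-up names, and the fetcher merely stands in for the real $mech->get plus screenshot step.

```perl
use strict;
use warnings;

# Process every item in @$items with $action (a code ref that dies on
# failure). Failed items go to the back of the queue and are retried
# until they have been attempted $max_tries times in total.
sub process_with_retries {
    my ($items, $action, $max_tries) = @_;
    my @queue = @$items;
    my (%tries, @done, @failed);

    while (@queue) {
        my $item = shift @queue;
        $tries{$item}++;
        if (eval { $action->($item); 1 }) {
            push @done, $item;
        }
        elsif ($tries{$item} < $max_tries) {
            push @queue, $item;     # try again later
        }
        else {
            push @failed, $item;    # give up on this one
        }
    }
    return (\@done, \@failed);
}

# Hypothetical stand-in for the fetch step: 'b' fails once, 'c' always fails.
my %calls;
my $fake_fetch = sub {
    my ($url) = @_;
    $calls{$url}++;
    die "timeout\n" if $url eq 'c';
    die "timeout\n" if $url eq 'b' && $calls{$url} == 1;
    return;
};

my ($done, $failed) = process_with_retries([qw(a b c)], $fake_fetch, 3);
print "done: @$done\n";
print "failed: @$failed\n";
```

Capping the number of attempts per URL keeps a permanently dead site from blocking the whole run.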
Well, how about this one here...
sub content_as_png {
    my ($self, $tab, $rect) = @_;
    $tab ||= $self->tab;
    $rect ||= {};
    # Mostly taken from
    my $screenshot = $self->repl->declare(<<'JS');
    function (tab,rect) {
        var browser = tab.linkedBrowser;
        var browserWindow = Components.classes[';1']
        var win = browser.contentWindow;
        var body = win.document.body;
        if(!body) {
            return;
        };
        var canvas = browserWindow
            .createElementNS('', 'canvas');
        var left = rect.left || 0;
        var top = rect.top || 0;
        var width = rect.width || body.clientWidth;
        var height = rect.height || body.clientHeight;
        canvas.width = width;
        canvas.height = height;
        var ctx = canvas.getContext('2d');
        ctx.clearRect(0, 0, width, height);
        ctx.scale(1.0, 1.0);
        ctx.drawWindow(win, left, top, width, height, 'rgb(255,255,255)');
        //return atob(
        return canvas
            .toDataURL('image/png', '')
        // );
    }
JS
    my $scr = $screenshot->($tab, $rect);
    return $scr ? decode_base64($scr) : undef
}
Love to hear from you!
greetings zero

Have you tried this out? Does it work?
$mech->repl->repl->setup_client( { extra_client_args => { timeout => 5*60 } } );
Put the links into @list and use eval:
while (scalar(@list)) {
    my $link = pop(@list);
    print "trying $link\n";
    eval {
        $mech->get($link);
        sleep (5);
        my $png = $mech->content_as_png();
        my $name = $link;
        $name =~ s/^www\.//;
        $name .= ".png";
        open(OUTPUT, ">", $name) or die "Cannot write $name: $!";
        print OUTPUT $png;
        close(OUTPUT);
    };
    if ($@) {
        print "link: $link failed\n";
        push(@list, $link);    # put it back on the list for another attempt
        next;
    }
    print "$link is done!\n";
}
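On the thumbnail part of the question (at most 240 pixels in the long dimension): whichever image library ends up doing the resampling (Imager, GD, ...), the target size has to be worked out first. A minimal sketch of that arithmetic, with thumbnail_size being a made-up helper name:

```perl
use strict;
use warnings;

# Scale ($width, $height) so the longer side is at most $max pixels,
# keeping the aspect ratio. Images already small enough are left alone.
sub thumbnail_size {
    my ($width, $height, $max) = @_;
    my $long = $width > $height ? $width : $height;
    return ($width, $height) if $long <= $max;
    my $factor = $max / $long;
    my $w = int($width  * $factor + 0.5);   # round to nearest pixel
    my $h = int($height * $factor + 0.5);
    $w = 1 if $w < 1;                       # never collapse to zero
    $h = 1 if $h < 1;
    return ($w, $h);
}

# A full-page dump is typically much taller than it is wide.
my ($w, $h) = thumbnail_size(1024, 3000, 240);
print "${w}x${h}\n";
```

Since WWW::Mechanize::Firefox dumps the whole page, the height is usually the long side, so the resulting thumbnails come out quite narrow.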


How do I add variables to be set based on a numeric input in perl?

I am making a score-keeping script in Perl, and would like to have it ask how many players there are, then ask for a name and a score for each player. I have a good bit of this script done, but only for 3 players. The current script can be found on GitHub here: skore
(from link:)
#!/usr/bin/env perl
use strict;
my $version = "1.0";
my $arg = shift(@ARGV);
my $subname = $arg;
if (!defined($arg)){
    cmd_go();
}
$subname =~ s/-/_/g;
my $sub = main->can("cmd_$subname") || main->can("dev_$subname") || main->can("hid_$subname");
if (!defined($sub))
{
    print STDERR "Invalid command given.\nUse \e[1;32m./skore help\e[0m for a list of commands.\n";
    exit 1;
}
else
{
    $sub->(@ARGV);
    exit 0;
}
# Main command
sub cmd_go()
{
    print "\e[2J\e[0G\e[0d"; # J = Erase in Display, 2 = Entire Screen, (G, d) = Move cursor to (..,..)
    print "••••••••••••••••••••\n";
    print "• Welcome to \e[1;32mskore\e[0m •\n";
    print "••••••••••••••••••••\n\n";
    my @game = prompt("What game are we scoring?\n");
    print "••• Game name locked: @game\n\n";
    my @p1name = prompt("Player 1 name?\n");
    my @p2name = prompt("Player 2 name?\n");
    my @p3name = prompt("Player 3 name?\n");
    print "\n";
    print "••• Player names locked: @p1name @p2name @p3name\n\n";
    my @p1score = prompt_num("score for @p1name?\n");
    my @p2score = prompt_num("score for @p2name?\n");
    my @p3score = prompt_num("score for @p3name?\n");
    print "\n";
    print "••• Game: @game\n";
    print "••• @p1name\n";
    print "••••• \e[1;32m@p1score\e[0m\n";
    print "••• @p2name\n";
    print "••••• \e[1;32m@p2score\e[0m\n";
    print "••• @p3name\n";
    print "••••• \e[1;32m@p3score\e[0m\n";
    exit 1;
}
sub cmd_help()
{
    print "To get right into using skore, simply type ./skore\n";
    print "For details about skore, such as version, use ./skore pkg\n";
}
sub cmd_pkg()
{
    print "skore version: $version\n";
    print "Detected OS: ";
    exec "uname -r";
}
sub prompt {
    my ($query) = @_; # take a prompt string as argument
    local $| = 1; # activate autoflush to immediately show the prompt
    print $query;
    chomp(my $answer = <STDIN>); return $answer;
}
sub prompt_num {
    NSTART:
    my ($querynum) = @_;
    print $querynum;
    chomp(my $pnum = <STDIN>);
    if ($pnum eq $pnum+0) { return $pnum; }
    else { print "Error: That is not a number. Try again.\n"; goto NSTART; }
}
sub prompt_yn {
    my ($queryyn) = @_;
    my $answer = prompt("$queryyn (y/N): ");
    return lc($answer) eq 'y';
}
I'd like to also point out that I'm new to perl.
OK, wow. Stop for a moment, step back and put the code down. Think about what you're trying to accomplish here.
There are a bunch of things in your code that would really benefit from you taking a step back and understanding what's going on before proceeding.
First off:
my $arg = shift(@ARGV);
my $subname = $arg;
if (!defined($arg)){
What is this intended to do? You only use $arg 3 times here, and one of those is to copy it to $subname.
This could be quite simplified by:
my $subname = shift;
cmd_go() unless defined $subname;
Now this:
my $sub = main->can("cmd_$subname") || main->can("dev_$subname") || main->can("hid_$subname");
Where did that come from? Because I'm pretty sure that, as a beginner to Perl, you didn't write that yourself, not least because you don't have any subroutines prefixed with dev_ or hid_. And this sort of redirect is serious overkill for a program that basically does just one thing.
(And normally you'd take flags with something like Getopt::Long, rather than a command that falls back to a default when left blank.)
You are also massively overusing arrays, which suggests you're not really sure of the difference between @game and $game.
E.g. this:
my @game = prompt("What game are we scoring?\n");
prompt does this though:
chomp(my $answer = <STDIN>); return $answer;
It returns a scalar (a single line) and you're putting it into an array for - as far as I can tell - no particular reason.
Likewise this:
my @p1score = prompt_num("score for @p1name?\n");
my @p2score = prompt_num("score for @p2name?\n");
my @p3score = prompt_num("score for @p3name?\n");
First off - you're using a bunch of single element arrays. But then you're numbering them. When ... actually, the whole point of arrays is to have numbered values.
So how about instead:
print "Num players?:\n";
chomp ( my $num = <STDIN> );
my @players;
my %scores;
for ( 1..$num ) {
    print "Player name\n";
    chomp ( my $name = <STDIN> );
    push ( @players, $name );
}
foreach my $person ( @players ) {
    print "Score for $person:\n";
    chomp ( my $score = <STDIN> );
    while ( $score =~ /\D/ ) {
        print "Invalid - please enter numeric value\n";
        chomp ( $score = <STDIN> );
    }
    $scores{$person} = $score;
}
foreach my $person ( @players ) {
    print "$person => $scores{$person}\n";
}
There are a bunch of other things you're doing that are more complicated than they need to be.
What I would suggest you do:
go re-read the Perl basics, perldata in particular.
have a look at Getopt::Long, which is a good (and standard) way to take program 'flag' style input (e.g. showing the version, if that's what you really want).
it looks a lot like you've cargo-culted the code here. I would suggest you re-write from the ground up, and when you hit a problem - ask about it on Stack Overflow, if you can't figure it out from the perl docs.
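To make the getopt suggestion a bit more concrete, here is a minimal sketch using the core Getopt::Long module. The flag names (--players, --version, --help) and the parse_args helper are invented for the example; they are not part of skore.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse skore-style flags from a list of arguments.
# Returns a hash ref of options, or undef if the arguments were invalid.
sub parse_args {
    my (@args) = @_;
    my %opt = (players => 2, version => 0, help => 0);
    GetOptionsFromArray(
        \@args,
        'players=i' => \$opt{players},   # integer value required
        'version'   => \$opt{version},   # simple on/off flag
        'help'      => \$opt{help},
    ) or return;
    return \%opt;
}

my $opt = parse_args('--players', 4);
print "players: $opt->{players}\n";
```

In a real script you would pass @ARGV instead of a literal list, and running with no flags at all would simply start the game with the defaults.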
Try this. Hope this is what you wanted.
#!/usr/bin/env perl
use strict;
my $version = "1.0";
my $arg = shift(@ARGV);
my $subname = $arg;
if (!defined($arg)){
    cmd_go();
}
$subname =~ s/-/_/g;
my $sub = main->can("cmd_$subname") || main->can("dev_$subname") || main->can("hid_$subname");
if (!defined($sub))
{
    print STDERR "Invalid command given.\nUse \e[1;32m./skore help\e[0m for a list of commands.\n";
    exit 1;
}
else
{
    $sub->(@ARGV);
    exit 0;
}
# Main command
sub cmd_go()
{
    print "\e[2J\e[0G\e[0d"; # J = Erase in Display, 2 = Entire Screen, (G, d) = Move cursor to (..,..)
    print "••••••••••••••••••••\n";
    print "• Welcome to \e[1;32mskore\e[0m •\n";
    print "••••••••••••••••••••\n\n";
    my @game = prompt("What game are we scoring?\n");
    print "••• Game name locked: @game\n\n";
    my $players= prompt("Enter total number of players:\n");
    my @players_list;
    for(my $i=0;$i<$players;$i++){
        push(@players_list , prompt("Enter Player ".($i+1)." name\n"));
    }
    print "\n";
    print "••• Player names locked: ";
    for(my $i=0;$i<$players;$i++){
        print $players_list[$i]."\t";
    }
    print "\n\n";
    my @players_score;
    for(my $i=0;$i<$players;$i++){
        push(@players_score, prompt("score for $players_list[$i]?\n"));
    }
    print "\n";
    print "••• Game: @game\n";
    for(my $i=0;$i<$players;$i++){
        print "$players_list[$i]\n";
        print "••••• \e[1;32m$players_score[$i]\e[0m\n";
    }
    exit 1;
}
sub cmd_help()
{
    print "To get right into using skore, simply type ./skore\n";
    print "For details about skore, such as version, use ./skore pkg\n";
}
sub cmd_pkg()
{
    print "skore version: $version\n";
    print "Detected OS: ";
    exec "uname -r";
}
sub prompt {
    my ($query) = @_; # take a prompt string as argument
    local $| = 1; # activate autoflush to immediately show the prompt
    print $query;
    chomp(my $answer = <STDIN>); return $answer;
}
sub prompt_num {
    NSTART:
    my ($querynum) = @_;
    print $querynum;
    chomp(my $pnum = <STDIN>);
    if ($pnum eq $pnum+0) { return $pnum; }
    else { print "Error: That is not a number. Try again.\n"; goto NSTART; }
}
sub prompt_yn {
    my ($queryyn) = @_;
    my $answer = prompt("$queryyn (y/N): ");
    return lc($answer) eq 'y';
}

Flush INET Socket response data with BLOCKING enabled

I am making a program that interfaces with Teamspeak, and I have an issue where the responses received will not match the commands sent. I run the program multiple times and each time, I will get different results when they should be the same, due to responses being out of sync.
my $buf = '';
use IO::Socket;
my $sock = new IO::Socket::INET (
    PeerAddr => 'localhost'
    ,PeerPort => '10011'
    ,Proto => 'tcp'
    ,Autoflush => 1
    ,Blocking => 1
    ,Timeout => 10
);
sub ExecuteCommand{
    print $sock $_[0]."\n";$sock->sysread($buf,1024*10);
    return $buf;
}
ExecuteCommand("login ${username} ${password}");
ExecuteCommand("use sid=1");
ExecuteCommand("clientupdate client_nickname=Idle\\sTimer");
my $client_list = ExecuteCommand("clientlist");
Each command is executed properly; however, the server likes to return extra lines, so a single sysread is not enough and I have to execute another. The responses are at most 512 bytes, so they aren't being cut off. If I try to run sysread multiple times in an attempt to flush it, the program just hangs once there is nothing left to read.
Each response ends with "error id=0 msg=ok".
How would I be able to read all the data that comes out, even if it's multiple lines? Or just be able to flush it all out so I can move onto the next command without having to worry about old data?
So you want to read until you find a line starting with error. In addition to doing that, the following buffers anything extra read since it's part of the next response.
sub read_response {
    my ($conn) = @_;
    my $fh = $conn->{fh};
    our $buf; local *buf = \($conn->{buf}); # alias
    our $eof; local *eof = \($conn->{eof}); # alias
    $buf = '' if !defined($buf);
    return undef if $eof;
    while (1) {
        if ($buf =~ s/\A(.*?^error[^\n]*\n)//ms) {
            return $1;
        }
        my $rv = sysread($fh, $buf, 64*1024, length($buf));
        if (!$rv) {
            if (defined($rv)) {
                $eof = 1;
                return undef;
            } else {
                die "Can't read response: $!\n";
            }
        }
    }
}
my $conn = { fh => $sock };
... send command ...
my $response = read_response($conn);
... send command ...
$response = read_response($conn);
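To see the buffering idea in isolation, here is a small sketch without any socket. take_response is a made-up helper that uses the same regex as above, and the sample lines are invented, not real Teamspeak output.

```perl
use strict;
use warnings;

# Remove and return the first complete response from $$buf_ref, i.e.
# everything up to and including the first line that starts with
# "error". Returns undef (and leaves the buffer alone) if the
# terminating line has not fully arrived yet.
sub take_response {
    my ($buf_ref) = @_;
    if ($$buf_ref =~ s/\A(.*?^error[^\n]*\n)//ms) {
        return $1;
    }
    return;
}

# Simulate two chunks arriving from the socket: the first ends in the
# middle of a response, the second completes it and starts the next.
my $buf = "clid=1 client_nickname=Idle\nerror id=0 ms";
my $first = take_response(\$buf);           # incomplete, nothing taken
$buf .= "g=ok\nerror id=0 msg=ok\n";
my $second = take_response(\$buf);          # first full response
my $third  = take_response(\$buf);          # second full response
print defined $first ? "got a response\n" : "incomplete\n";
print $second, $third;
```

Whatever is left in the buffer after a match belongs to the next response, which is why it must be kept rather than thrown away.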
I changed my ExecuteCommand subroutine to include a check for "error id=[0-9]{1,}", which is what is always at the end of a response for Teamspeak 3 servers.
sub ExecuteCommand{
    print $sock $_[0]."\n";
    my $response = "";
    while (1){
        $sock->sysread($buf,1024*10);
        last if($buf =~ /error id=([0-9]{1,})/);
        $response .= $buf;
    }
    return $response;
}

Web crawler using perl

I want to develop a web crawler which starts from a seed URL and then crawls 100 html pages it finds belonging to the same domain as the seed URL as well as keeps a record of the traversed URLs avoiding duplicates. I have written the following but the $url_count value does not seem to be incremented and the retrieved URLs contain links even from other domains. How do I solve this? Here I have inserted as my starting URL.
use strict;
use warnings;
use LWP::Simple;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
##open file to store links
open my $file1,">>", ("extracted_links.txt");
##starting URL
my @urls = '';
my $browser = LWP::UserAgent->new('IE 6');
my %visited;
my $url_count = 0;
while (@urls)
{
    my $url = shift @urls;
    if (exists $visited{$url}) ##check if URL already exists
    {
        next;
    }
    else
    {
        $url_count++;
    }
    my $request = HTTP::Request->new(GET => $url);
    my $response = $browser->request($request);
    if ($response->is_error())
    {
        printf "%s\n", $response->status_line;
    }
    my $contents = $response->content();
    $visited{$url} = 1;
    @lines = split(/\n/,$contents);
    foreach $line(@lines)
    {
        $line =~ m#(((http\:\/\/)|(www\.))([a-z]|[A-Z]|[0-9]|[/.]|[~]|[-_]|[()])*[^'">])#g;
        print "$1\n";
        push @urls, $$line[2];
    }
    sleep 60;
    if ($visited{$url} == 100)
    {
        last;
    }
}
close $file1;
Several points: your URL parsing is fragile, and you certainly won't get relative links. Also, you don't test for 100 links but for 100 matches of the current URL, which almost certainly isn't what you mean. Finally, I'm not too familiar with LWP, so I'm going to show an example using the Mojolicious suite of tools.
This seems to work, perhaps it will give you some ideas.
#!/usr/bin/env perl
use strict;
use warnings;
use Mojo::UserAgent;
use Mojo::URL;
##open file to store links
open my $log, '>', 'extracted_links.txt' or die $!;
##starting URL
my $base = Mojo::URL->new('');
my @urls = $base;
my $ua = Mojo::UserAgent->new;
my %visited;
my $url_count = 0;
while (@urls) {
    my $url = shift @urls;
    next if exists $visited{$url};
    print "$url\n";
    print $log "$url\n";
    $visited{$url} = 1;
    $url_count++;
    # find all <a> tags and act on each
    $ua->get($url)->res->dom('a')->each(sub {
        my $url = Mojo::URL->new($_->{href});
        if ( $url->is_abs ) {
            return unless $url->host eq $base->host;
        }
        push @urls, $url;
    });
    last if $url_count == 100;
    sleep 1;
}
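The bookkeeping in that loop (queue, visited hash, same-host check, page limit) can be tried out without any network access. The sketch below runs it over an in-memory link table; crawl, the table and the a.example/b.example hosts are all made up for illustration.

```perl
use strict;
use warnings;

# Breadth-first crawl over an in-memory link table. %$links maps a page
# to the pages it links to; $same_site decides whether a link may be
# followed. Returns the pages visited, in order, at most $limit of them.
sub crawl {
    my ($start, $links, $same_site, $limit) = @_;
    my @queue = ($start);
    my (%visited, @order);

    while (@queue && @order < $limit) {
        my $page = shift @queue;
        next if $visited{$page}++;              # skip duplicates
        push @order, $page;
        for my $link (@{ $links->{$page} || [] }) {
            push @queue, $link if $same_site->($link);
        }
    }
    return @order;
}

# Hypothetical site: hosts and paths are invented for the example.
my %links = (
    'a.example/'  => ['a.example/x', 'b.example/', 'a.example/'],
    'a.example/x' => ['a.example/y', 'a.example/'],
    'a.example/y' => [],
);
my $on_a = sub { $_[0] =~ m{^a\.example/} };
my @seen = crawl('a.example/', \%links, $on_a, 100);
print "$_\n" for @seen;
```

Counting visited pages, rather than checking a single hash value, is what makes the limit of 100 actually take effect.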

Why does my Perl script using WWW-Mechanize fail intermittently?

I am trying to write a Perl script using WWW-Mechanize.
Here is my code:
use DBI;
use JSON;
use WWW::Mechanize;
sub fetch_companies_list
{
    my $url = shift;
    my $browser = WWW::Mechanize->new( stack_depth => 0 );
    my ($content, $json, $parsed_text, $company_name, $company_url);
    print "Getting the companies list...\n";
    $browser->get( $url );
    # die "Can't get the companies list.\n" unless( $browser->status );
    $content = $browser->content();
    # die "Can't get companies names.\n" unless( $browser->status );
    $json = new JSON;
    $parsed_text = $json->allow_nonref->utf8->relaxed->escape_slash->loose->allow_singlequote->allow_barekey->decode( $content );
    foreach (@$parsed_text)
    {
        $company_name = $_->{name};
        fetch_company_info( $company_name, $browser );
    }
}
fetch_companies_list( "" );
The problem is as follows:
I start the script and it finishes fine.
I restart the script. The script fails in "$browser->get()".
I have to wait some time (about 5 min), then it will start working again.
I am working on Linux and have WWW-Mechanize version 1.66.
Any idea what might be the problem? I don't have any firewall installed, either on my computer or on my router.
Moreover, uncommenting the "die ..." line does not help, as the script stops inside the get() call. I can try to upgrade to the latest version, which is 1.71, but I'd like to know if someone else has experienced this with this Perl module.
5 minutes (300 seconds) is the default timeout. Exactly what timed out will be returned in the response's status line.
my $response = $mech->res;
if (!$response->is_success()) {
    die $response->status_line;
}
This is an issue with the target site. It shows
503 Service Unavailable No server is available to handle this
right now.
Retry with a wait between attempts. Try this:
## set maximum no of tries
my $retries = 10;
## number of secs to sleep
my $sleep = 1;
do {
    eval {
        print "Getting the companies list...\n";
        $browser->get( $url );
        # die "Can't get the companies list.\n" unless( $browser->status );
        $content = $browser->content();
        # die "Can't get companies names.\n" unless( $browser->status );
        $json = new JSON;
        $parsed_text = $json->allow_nonref->utf8->relaxed->escape_slash->loose->allow_singlequote->allow_barekey->decode($content);
        foreach (@$parsed_text) {
            $company_name = $_->{name};
            fetch_company_info( $company_name, $browser );
        }
    };
    if ($@) {
        warn $@;
        ## rest for some time
        sleep($sleep);
        ## increase the value of $sleep exponentially
        $sleep *= 2;
    }
} while ( $@ && $retries-- );
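The same retry-and-double-the-wait pattern can be pulled out into a reusable helper. This is only a sketch: retry_with_backoff is a made-up name, the failing action stands in for $browser->get, and the sleeper is injectable so the example runs without actually waiting.

```perl
use strict;
use warnings;

# Call $action until it succeeds or $max_tries attempts have been made.
# After each failure wait $delay seconds, then double the delay.
# $sleeper is optional and defaults to the real sleep.
sub retry_with_backoff {
    my ($action, $max_tries, $delay, $sleeper) = @_;
    $sleeper ||= sub { sleep $_[0] };
    my $error;
    for my $attempt (1 .. $max_tries) {
        my $result;
        if (eval { $result = $action->(); 1 }) {
            return $result;
        }
        $error = $@ || 'unknown error';
        last if $attempt == $max_tries;     # no point waiting after the last try
        $sleeper->($delay);
        $delay *= 2;
    }
    die "giving up after $max_tries attempts: $error";
}

# Hypothetical stand-in for the HTTP request: fails twice, then succeeds.
my $calls = 0;
my @waits;
my $content = retry_with_backoff(
    sub { die "503 Service Unavailable\n" if ++$calls < 3; return 'ok' },
    5, 1, sub { push @waits, $_[0] },
);
print "$content after $calls calls, waits: @waits\n";
```

With 10 retries starting at 1 second, the doubling adds up to roughly 17 minutes of waiting in the worst case, so you may want to cap the delay.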

mib name printing from mib values in perl

This is the code that I used to walk through the table with Net::SNMP in Perl:
#! /usr/local/bin/perl
use strict;
use warnings;
use Net::SNMP qw(:snmp);
my $OID_hrSystem = '';
my $OID_ifPhysAddress = '';
my ($session, $error) = Net::SNMP->session(
    -hostname => shift || 'localhost',
    -community => shift || 'public',
    -nonblocking => 1,
    -translate => [-octetstring => 0],
    -version => 'snmpv2c',
);
if (!defined $session) {
    printf "ERROR: %s.\n", $error;
    exit 1;
}
my %table; # Hash to store the results
my $result = $session->get_bulk_request(
    -varbindlist => [ $OID_hrSystem ],
    -callback => [ \&table_callback, \%table ],
    -maxrepetitions => 10,
);
if (!defined $result) {
    printf "ERROR: %s\n", $session->error();
    exit 1;
}
# Now initiate the SNMP message exchange.
snmp_dispatcher();
# Print the results, specifically formatting ifPhysAddress.
for my $oid (oid_lex_sort(keys %table)) {
    if (!oid_base_match($OID_ifPhysAddress, $oid)) {
        printf "%s = %s\n", $oid, $table{$oid};
    } else {
        printf "%s = %s\n", $oid, unpack 'H*', $table{$oid};
    }
}
exit 0;
sub table_callback
{
    my ($session, $table) = @_;
    my $list = $session->var_bind_list();
    if (!defined $list) {
        printf "ERROR: %s\n", $session->error();
        return;
    }
    # Loop through each of the OIDs in the response and assign
    # the key/value pairs to the reference that was passed with
    # the callback. Make sure that we are still in the table
    # before assigning the key/values.
    my @names = $session->var_bind_names();
    my $next = undef;
    while (@names) {
        $next = shift @names;
        if (!oid_base_match($OID_hrSystem, $next)) {
            return; # Table is done.
        }
        $table->{$next} = $list->{$next};
    }
    # Table is not done, send another request, starting at the last
    # OBJECT IDENTIFIER in the response. No need to include the
    # callback argument, the same callback that was specified for the
    # original request will be used.
    my $result = $session->get_bulk_request(
        -varbindlist => [ $next ],
        -maxrepetitions => 10,
    );
    if (!defined $result) {
        printf "ERROR: %s.\n", $session->error();
    }
    return;
}
Output is:
= 1 hour, 12:00.77
= �+
= 1536
= BOOT_IMAGE=/boot/vmlinuz-3.0.0-14-generic root=UUID=5c4c8d22-3cea-4410-aaad-f297c75d217e ro quiet splash vt.handoff=7
= 1
= 133
= 0
But the required output for me is as follows:
hrSystemUptime.0 = 1:08:54.36
hrSystemDate.0 = 2011-12-14,16:0:2.0,+1:0
hrSystemInitialLoadDevice.0 = 1536
hrSystemInitialLoadParameters.0 = "BOOT_IMAGE=/boot/vmlinuz-3.0.0-14-generic root=UUID=5c4c8d22-3cea-4410-aaad-f297c75d217e ro quiet splash vt.handoff=7"
hrSystemNumUsers.0 = 1
hrSystemProcesses.0 = 133
hrSystemMaxProcesses.0 = 0
The main thing is that I want the MIB names to be printed in the output instead of the numeric OIDs.
You could use the SNMP module (available on Ubuntu as libsnmp-perl) which offers a tied hash to loaded MIBs, %SNMP::MIB. Here's some example code:
use SNMP;
print "$SNMP::MIB{''}{label} = \n";
#Should print "hrSystemUptime = "
Because %SNMP::MIB is a tied hash, you can't just do a lookup and assign to a lexical variable, i.e. my $oid = $SNMP::MIB{$oidstr}. You have to access it directly every time.
There is lots of other information that it loads from the MIB, including data type, which could help with the issue it looks like you have with hrSystemDate. Also, see the man page for mib_api if you need to load specific MIBs. The ones you used in your example loaded by default on my system, though.
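If loading MIBs is not an option, a hand-maintained lookup table is a crude fallback. The sketch below is not the SNMP module's API: oid_to_name is a made-up helper, and the three OIDs are what I believe HOST-RESOURCES-MIB assigns, so check them against your own MIB files.

```perl
use strict;
use warnings;

# Translate a numeric OID into "name.instance" using a lookup table,
# picking the longest table entry that is a prefix of the OID.
# Unknown OIDs are returned unchanged.
sub oid_to_name {
    my ($oid, $names) = @_;
    $oid =~ s/^\.//;                            # tolerate a leading dot
    my $best = '';
    for my $base (keys %$names) {
        next unless $oid eq $base || index($oid, "$base.") == 0;
        $best = $base if length($base) > length($best);
    }
    return $oid if $best eq '';
    my $rest = substr($oid, length($best));     # e.g. ".0"
    return $names->{$best} . $rest;
}

# Assumed entries from HOST-RESOURCES-MIB; verify before relying on them.
my %names = (
    '1.3.6.1.2.1.25.1'   => 'hrSystem',
    '1.3.6.1.2.1.25.1.1' => 'hrSystemUptime',
    '1.3.6.1.2.1.25.1.6' => 'hrSystemProcesses',
);
print oid_to_name('1.3.6.1.2.1.25.1.1.0', \%names), "\n";
```

In the printing loop you would then call oid_to_name($oid, \%names) instead of printing $oid directly.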
Have you tried the snmpget command on your server? When I run snmpget directly on the CLI, the result comes back with the name:
Ex: /usr/local/bin/snmpget -O Q -v 2c -c Community x.x.x.x .
IF-MIB::ifHCInOctets.100663301 = 152528664859348
If it works for you, you might want to execute the command from the Perl code instead of using the library. Then you just have to parse the output.
Also, you can use snmptranslate to translate your OIDs:
Ex: /usr/local/bin/snmptranslate
Why don't you do this:
my $pathSnmpTranslate = '/your/path/to/snmptranslate';
for my $oid (oid_lex_sort(keys %table)) {
    chomp(my $oidTrans = `$pathSnmpTranslate $oid`);
    if (!oid_base_match($OID_ifPhysAddress, $oid)) {
        printf "%s = %s\n", $oidTrans, $table{$oid};
    } else {
        printf "%s = %s\n", $oidTrans, unpack 'H*',$table{$oid};
    }
}
On my machine it worked:
> /xxx % /usr/local/bin/snmptranslate
> /xxx % /usr/local/bin/snmptranslate