WWW::Mechanize::Firefox looping though links - perl

I am using a foreach to loop through links. Do I need a $mech->back(); to continue the loop or is that implicit.
Furthermore do I need a separate $mech2 object for nested for each loops?
The code I currently have gets stuck (it does not complete) and ends on the first page where td#tabcolor3 is not found.
foreach my $sector ($mech->selector('a.link2'))
{
$mech->follow_link($sector);
foreach my $place ($mech->selector('td#tabcolor3'))
{
if (($mech->selector('td#tabcolor3', all=>1)) >= 1)
{
$mech->follow_link($place);
print $_->{innerHTML}, '\n'
for $mech->selector('td.dataCell');
$mech->back();
}
else
{
$mech->back();
}
}

You cannot access information from a page when it is no longer on display. However, the way foreach works is to build the list first before it is iterated through, so the code you have written should be fine.
There is no need for the call to back as the links are absolute. If you had used click then there must be a link in the page to click on, but with follow_link all you are doing is going to a new URL.
There is also no need to check the number of links to follow, as a for loop over an empty list will simply not be executed.
To make things clearer I suggest that you assign the results of selector to an array before the loop.
Like this
my #sectors = $mech->selector('a.link2');
for my $sector (#sectors) {
$mech->follow_link($sector);
my #places = $mech->selector('td#tabcolor3');
for my $place (#places) {
$mech->follow_link($place);
print $_->{innerHTML}, '\n' for $mech->selector('td.dataCell');
}
}
Update
My apologies. It seems that follow_link is finicky and needs to follow a link on the current page.
I suggest that you extract the href attribute from each link and use get instead of follow_link.
my #selectors = map $_->{href}, $mech->selector('a.link2');
for my $selector (#selectors) {
$mech->get($selector);
my #places = map $_->{href}, $mech->selector('td#tabcolor3');
for my $place (#places) {
$mech->get($place);
print $_->{innerHTML}, '\n' for $mech->selector('td.dataCell');
}
}
Please let me know whether this works on the site you are connecting to.

I recommend to use separate $mech object for this:
foreach my $sector ($mech->selector('a.link2'))
{
my $mech = $mech->clone();
$mech->follow_link($sector);
foreach my $place ($mech->selector('td#tabcolor3'))
{
if (($mech->selector('td#tabcolor3', all=>1)) >= 1)
{
my $mech = $mech->clone();
$mech->follow_link($place);
print $_->{innerHTML}, '\n'
for $mech->selector('td.dataCell');
#$mech->back();
}
# else
# {
# $mech->back();
# }
}

I am using WWW:Mechanize::Firefox to loop over a bunch of URLs with loads of Javascript. The page does not render immediately so need test if a particular page element is visible (similar to suggestion in Mechanize::Firefox documentation except 2 xpaths in the test) before deciding next action.
The page eventually renders a xpath to 'no info' or some wanted stuff after about 2-3 seconds. If no info we go to next URL. I think there is some sort of race condition with both xpaths not existing at once causing the MozRepl::RemoteObject: TypeError: can't access dead object error intermittently (at the sleep 1 in the loop oddly enough).
My solution that seems to work/improve reliability is to enclose all the $mech->getand$mech->is_visible in an eval{}; like this:
eval{
$mech->get("$url");
$retries = 15; #test to see if element visible = page complete
while ($retries-- and ! $mech->is_visible( xpath => $xpath_btn ) and ! $mech->is_visible( xpath => $xpath_no_info )){
sleep 1;
};
last if($mech->is_visible( xpath => $xpath_no_info) ); #skip rest if no info page
};
Others might suggest improvements on this.

Related

How would I match the correct hash pair based on a specific string?

I have a simple page hit tracking script that allows for the output to display friendly names instead of urls by using a hash.
UPDATE: I used php to generate the hash below, but used the wrong dynamic page name of item.html. When changed to the correct name, the script returns the desired results. Sorry for wasting anyone's time.
my %LocalAddressTitlePairs = (
'https://www.mywebsite.com/index.html' => 'HOME',
'https://www.mywebsite.com/art_gallery.html' => 'GALLERY',
'https://www.mywebsite.com/cart/item.html?itemID=83&cat=26' => 'Island Life',
'https://www.mywebsite.com/cart/item.html?itemID=11&cat=22' => 'Castaways',
'https://www.mywebsite.com/cart/item.html?itemID=13&cat=29' => 'Pelicans',
and so on..
);
The code for returning the page hits:
sub url_format {
local $_ = $_[0] || '';
if ((m!$PREF{'My_Web_Address'}!i) and (m!^https://(.*)!i) ) {
if ($UseLocalAddressTitlePairs == 1) {
foreach my $Address (keys %LocalAddressTitlePairs) {
return "<a title=\"$Address\" href=\"$_\">$LocalAddressTitlePairs{$Address}</A>" if (m!$_$! eq m!$Address$!);
}
}
my $stub =$1;
return $stub;
}
}
Displaying the log hits will show
HOME with the correct link, GALLERY with the correct url link, but https://www.mywebsite.com/cart/item.html?itemID=83&cat=26
will display a random name instead of what it should be, Island Life for this page.. it has the correct link,-- a different name displays every time the page is loaded.
And, the output for all pages with query strings will display the exact same name. I know the links are correct by clicking thru site pages and checking the log script for my own page visits.
I tried -
while (my($mykey, $Value) = each %LocalAddressTitlePairs) {
return "<a title=\"$mykey\" href=\"$_\">$Value</a>" if(m!$_$! eq m!$mykey$!);
but again, the link is correct but the mykey/Value associated is random too. Way too new to perl to figure this out but I'm doing a lot of online research.
m!$Address$! does not work as expected, because the expression contains special characters such as ?
You need to add escape sequences \Q and \E
m!\Q$Address\E$!
it’s even better to add a check at the beginning of the line, otherwise
my $url = "https://www.mywebsite.com/?foo=bar"
my $bad_url = "https://bad.com?u=https://www.mywebsite.com/?foo=bar"
$bad_url =~ m!\Q$url\E$! ? 1 : 0 # 1, pass
$bad_url =~ m!^\Q$url\E$! ? 1 : 0 # 0, fail

Perl Selenium::Remote::Driver test to check button element is formed?

I need to test web site where button is formed at the bottom of the page after
user scroll the page for two times.
I have written a small script to test if the required element is formed.
the condition tested always return false even though required element is formed in the page.
use Selenium::Remote::Driver;
use Scalar::Util qw/blessed reftype/;
my $driver= Selenium::Remote::Driver->new;
$driver->get('http://www.foo.com');
while ( 1 ) {
$query = $driver->find_element_by_xpath(q{//button[#class='button']});
#to test the if the element is present
if ( blessed($query) && $query->isa('Selenium::Remote::Driver') ) {
$query->click;
last;
}
else {
#always goes into else loop
#to go to the end of the webpage
my $script = q{window.scrollTo(0,document.body.scrollHeight);};
my $elem = $driver->execute_script($script);
}
}
Is there any way to test if the button element has been formed in the script?
Use the below to find the element
my $elem = $driver->find_element_by_xpath($locator);
If this doesn't return 0 you've found your element, then you could run below:
To check whether the element is displayed
$elem->is_displayed();
To check whether the element is hidden
$elem->is_hidden();
Check out more methods at: https://metacpan.org/pod/Selenium::Remote::WebElement
All these function return 0 when element is not found.
Read the doc carefully.
find_element_by_xpath
These functions all take a single STRING argument: the locator
search target of the element you want. If the element is found, we
will receive a WebElement. Otherwise, we will return 0. Note that
invoking methods on 0 will of course kill your script.

How do I add this value to an Array and it stays in the script

I want to add a command that adds numbers to the array.
This is what i have exactly:
my $ownerids = ('374867065');
Then later in the script i have this:
if($ownerids == $spl2[0]){
if (index($message, "!adduser") != -1) {
$msg = $spl[1];
$send = "<m t=\"User Added $msg\" u=\"$botid\" />\0";
$socket->send($send);
push (my $ownerids, "$msg");
}
}
I am on a chatbox and this is a chatbot, i want to make it when i say !adduser (thereid) it adds them to a list and they can use the bot commands, and also i want a Delete User, If you can help this will be MUCH appretiated.
If you want ownerids to be an array, then you must prefix it with a #
my #ownerids = ('374867065');
Then to add an element, you can push
push #ownerids, "$msg";
However, you're going to need to fix your other references to #ownerids so it's treated like an array. For example, your first if looks like it's intending to see if $spl2[0] is an owner. If that's the case, then you'll need to grep the array:
if(grep {$_ == $spl2[0]} #ownerids) {

Can't invoke submit queries in PERL CGI

Hey so I am trying to make a simple 'dating website' however I'm struggling with CGI aspect :( Mainly I'm having trouble with forms(I think I'm not too sure what I'm struggling with).
I have this statement
print header, start_html("EngCupid"), h2("EngCupid"), start_form;
if (!param() || param("home")) {
show_welcome();
} elsif (param("browse")) {
browse_page();
} elsif (param("search")) {
search_users();
} elsif (param("username")) {
search_results();
} else {
print "fail";
}
print end_form, end_html;
exit 0;
To Handle the general navigation of the website. However, I'm struggling when it comes to submit buttons etc inside these functions. So my browse_page() function is
sub browse_page {
print h2("Browse Page");
print p;
if (param("next")) {
$hidden_variable = param("x") + 1;
}
param('x', $hidden_variable);
$hidden_variable = 0;
print hidden('x');
print submit("next", "Next");
print submit("home", "Home"), " ", submit("search", "Search Users");
}
Which is supposed to increment a variable that I need to use for further functions every time I press the next key. However, whenever I press the next key it just prints fail as in the form isn't being passed?
Do I need a new form inside each function I am printing? I tried it but it still didn't work. Just a little lost in forms in general.
I am not sure what you are trying to achieve. Maybe the problem is that the x is not being sent back from client to server, maybe you wanted
print hidden('x', param('x'));
Also, why do you set $hidden_variable to 0? After submitting, the script will run again, the old value of the variable will not be accessible anymore.

How to cancel a file upload based on file size in Catalyst

I'm writing a file upload handler Catalyst. I'm trying to restrict the maximum file size. To do this I've made a Plugin (based on the answer here). Here is the code where I check for the file size:
before 'prepare_body' => sub {
my $c = shift;
my $req = $c->request;
my $length = $req->headers->{"content-length"};
if ($length > 10000)
{
$c->stash->{errors} = "File upload error";
# how do I abort the upload?
}
};
This correctly detects files that are too big, but I can't for the life of me figure out how to abort the upload. Ideally, it should also reach the controller/action. Can anyone give me a pointer? Thanks a lot.
Very simply, you probably shouldn't. Anything you do from plugin code to abort the handling is going to knock out the ability of user code to deal with the situation in a nice way (for example by giving a validation error or a nice error page, instead of a Catalyst exception page).
However, all is not lost. Why not try something like this?
around 'prepare_body' => sub {
my ($orig, $self) = (shift, shift);
my ($c) = #_;
my $max_length = $c->config->{'Plugin::WhateverMyNameIs'}->{max_request_size};
$max_length = 1_000_000 unless defined $max_length; # default
my $length = $c->engine->read_length;
if ($length <= $max_length) { # ok, go ahead
$self->$orig(#_);
} else {
$c->stash->{request_body_aborted} = 1;
}
};
This will stop the read if your request is over-size, but it will let dispatch proceed as normal -- which means you will want to write some code in your action, or in a begin action, or in a chain root, that checks for $c->stash->{request_body_aborted} and does something appropriate -- whether that's setting a form validation error, or calling $c->error("Request too large"); $c->detach or whatever. It's also configurable, as any plugin should be.
I think this needs to occur earlier in the chain. If you have the headers, then the packet is already created.
Perhaps you could try: $c->detach(); or possibly loop through the $c->stack array and remove actions that might have been added, related to your upload.