Perl MozRepl cleanup problem

I'm coding a web crawler and have been using WWW::Mechanize::Firefox to navigate some pages that keep loading content after the initial page load (for the others I use WWW::Mechanize), and I've never had an issue with that.
Yesterday I added DBI and DBD::mysql to the script, adding queries to export data to a database (this works perfectly), but suddenly MozRepl started giving this error:
(in cleanup) Can't call method "execute" on an undefined value at /Library/Perl/5.10.0/MozRepl.pm line 372 during global destruction.
(in cleanup) Can't call method "execute" on an undefined value at /Library/Perl/5.10.0/MozRepl.pm line 372 during global destruction.
and terminating the script after one cycle (it should run until it reaches the end of a specific text file, which it no longer does).
I haven't touched anything in this part of the script (I don't need the database for those pages), at least not knowingly. I checked with a file-compare app and couldn't find any relevant change.
Posting the code would be tricky: it's pretty long and I have no idea where the problem may lie.
EDIT
Sometimes it also gives this error instead of the previous one:
(in cleanup) Can't call method "cmd" on an undefined value at /Library/Perl/5.10.0/MozRepl/Client.pm line 186 during global destruction.

This has nothing to do with DBI or DBD::mysql. The messages are nothing to worry about, but I admit they are unsightly.
The messages appear as the remaining Perl/JavaScript objects get destroyed in an unordered way during Perl's global destruction. If you want to avoid them, destroy your $mech object before quitting your application:
undef $mech;
# end of program
If the $mech object is released before the program gets shut down, the Perl/Javascript bridge can also shut down in an orderly fashion.
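As a minimal sketch of an orderly shutdown (assuming the script also holds a DBI handle in $dbh and a statement handle in $sth; those names are placeholders for whatever the script actually uses):

    # End of the crawler script: release handles in a known order
    $sth->finish     if $sth;    # finish any open statement handle
    $dbh->disconnect if $dbh;    # close the database connection
    undef $mech;                 # tear down the Firefox bridge before global destruction
    # nothing is left for unordered destruction when the program exits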
Also note that the preferred forum for questions about WWW::Mechanize::Firefox is http://perlmonks.org :)


Cryptic Moo (Perl) Error "Attempt to bless into a reference at..."

Probably a long shot but I'm wondering if anyone has seen an error like this before, as I can not reproduce it outside of a production environment. Essentially the situation is as follows:
I have a module called My::Budget::Module (renamed for simplicity) which is responsible for updating the "budget" for a given object in the application
The My::Budget::Module uses a Moo object that I built called My::Bulk::Update::Module which does the following:
build up an array of database rows that need to be updated
build a MySQL update query string / statement which will update all rows at once
actually update all rows at once
The My::Bulk::Update::Module will then perform the update and mark the rows that have been updated as "stale" so that they will not be cached
The error always seems to occur somewhere after adding a row to be updated but before the code which actually applies the update returns.
If you look at the stack trace that I have included below you can see that the error takes the form
Attempt to bless into a reference at...
and the point at which this occurs is in the constructor of Moo/Object.pm, which is version 2.003002 of Moo from CPAN.
Attempt to bless into a reference at /path/to/module/from/cpan/Moo/Object.pm line 25 at /path/to/module/from/cpan/Moo/Object.pm line 25.
Moo::Object::new(My::Bulk::Update::Module=HASH(0xf784b50)) called at (eval 1808) line 28
MongoDB::Collection::new(My::Bulk::Update::Module=HASH(0xf784b50)) called at /path/to/my/bulk/update/module line XXXX
My::Bulk::Update::Module::apply_bulk_update(My::Bulk::Update::Module=HASH(0xf784b50)) called at /path/to/my/budget/module line XXXX
My::Budget::Module::update_budget(My::Budget::Module=HASH(0xf699a38)) called at /path/to/my/budget/module line XXXX
Moving backwards through the stack trace leads to MongoDB::Collection, and this is where things start to get very weird.
MongoDB::Collection is also a CPAN module, but the module which appears at this point varies and I can't see a pattern except that it is always a Moo object. Moreover, I'm unsure why this module is being instantiated at all, as there is no call to MongoDB::Collection::new at the line mentioned.
In addition, from the stack trace it looks like MongoDB::Collection and Moo::Object are instantiated with the first argument being My::Bulk::Update::Module=HASH(0xf784b50). Given the application logic I do not believe MongoDB::Collection should be instantiated here nor should My::Bulk::Update::Module be passed to MongoDB::Collection at all.
Other than the fact that it is a Moo object, My::Bulk::Update::Module does not extend any other module and is designed to be a stand alone "utility" module. It is only used at one place in the entire application.
Has anyone seen something similar before?
EDIT: Adding some more code - apply_bulk_update doesn't do much at all. There is no call to MongoDB::Collection here; MongoDB::Collection just "happens" to be the module included in the stack trace in this particular example. It is not always MongoDB::Collection - I've also seen MongoDB::Timestamp, MongoDB::Cursor, Search::Elasticsearch::Serializer::JSON, Search::Elasticsearch::Logger::LogAny, etc.
sub apply_bulk_update
{
    my $self   = shift;
    my ($db)   = @_;                       # wrapper around DBI module
    my $query  = $self->_generate_query(); # string UPDATE table SET...
    my $params = $self->_params;           # arrayref
    return undef unless $params && scalar @$params;
    $db->do($query, undef, @$params);
}
The code sometimes dies as soon as apply_bulk_update is called, sometimes on the call to _generate_query and sometimes after the query executes on the last line...
Just in case anyone was interested...
After a chunk of further debugging, the error was traced to the exact point where My::Bulk::Update::Module::apply_bulk_update or My::Bulk::Update::Module::_generate_query was called, but logging code inside these subroutines showed that they were not being executed as expected.
To determine what was going on, B::Deparse was used to rebuild the source code for the body of these subroutines (or at least the source code located at the memory address these subs were pointing to).
After using this module, e.g.
B::Deparse->new->coderef2text(\&My::Bulk::Update::_generate_query)
it became obvious that the error occurred because My::Bulk::Update::_generate_query was pointing at a memory location which contained something entirely different (e.g. MongoDB::Collection::new).
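For reference, a small sketch of how the deparsed body can be dumped for inspection (using the package name exactly as written above):

    use B::Deparse;

    # Print the source currently behind the coderef. If the sub's memory
    # has been reused, this shows the body of whatever now lives there.
    my $deparser = B::Deparse->new;
    print $deparser->coderef2text(\&My::Bulk::Update::_generate_query), "\n";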
This issue appears to have been solved upstream by the following commit in the Sub::Defer module (which is a dependency for Moo).
https://github.com/moose/Sub-Quote/commit/4a38f034366e79b76d29fec903d8e8d02ee01896
If you read the summary of the commit you can see the change that was made:
Prevent defer_info and undefer_sub from operating on expired subs. Validate that the arguments to defer_info and undefer_sub refer to actual live subs. Use the weak refs we are storing to the deferred and undeferred subs to make sure the original subs are still alive, and we aren't returning data related to a reused memory address.
Also make sure we don't expire data related to unnamed subs. Since the user can capture the undeferred sub via undefer_sub, we can't track the expiry without using a fieldhash. For now, avoid introducing that complexity, since the amount we leak should not be that great.
Upgrading the version of Sub::Defer appears to have solved the issue.
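For anyone checking their own installation: Sub::Defer ships in the Sub-Quote distribution, so a one-liner along these lines (upgrade tooling and versions will vary) shows what is currently installed:

    $ perl -MSub::Defer -e 'print $Sub::Defer::VERSION, "\n"'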

Where Does create_custom_level() Need to Be Declared (log4perl)?

I'm trying to create a custom message level 'alert' (between warn and error) in Perl, but consistently get the error message:
create_custom_level must be called before init or first get_logger() call at /usr/share/perl5/Log/Log4perl/Logger.pm line 705.
My declaration of the custom level looks like this:
use Log::Log4perl qw(get_logger);
use Log::Log4perl::Level;
Log::Log4perl::Logger::create_custom_level("ALERT", "ERROR");
As far as I can tell from the documentation, putting this at the top of any file which intends to use the custom level should be enough, so I can't tell what I'm doing wrong. Looking in Logger.pm, where the error is thrown from, shows that a logger is being initialized before the custom level is declared. Does anyone know how this could be happening?
P.S. I assure you creating a custom level is the right choice here, even if it's frowned upon.
EDIT: Question Answered! The top answer was more a guide to debugging, so I wanted to copy my solution from the comment section so that future readers would be more likely to see it.
I found that there were two steps to fixing my problem:
I needed to put create_custom_level in a BEGIN { ... } block so that it would run at compile time, since it was apparently being beaten to it by a logger initialization that was also happening at compile time (see the sketch after this list).
I realized that putting the same create_custom_level line in both the main script (.pl) and its modules (.pm) is redundant and caused part of my problems. Depending on the order of your compile-time statements (like use and BEGIN), calling create_custom_level in multiple files could produce the sequence 'create custom level', 'initialize logger', 'create custom level' across files. I never figured out where the logger was being initialized early, but I was able to work around it by creating my custom level as early as possible (for other inexperienced coders, the Perl debugger can be key to understanding the order in which lines and files are executed). Best to put create_custom_level in the original script or the first module it uses.
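For anyone else hitting this, a rough sketch of step 1, with the custom level created once in a BEGIN block as early as possible in the main script (the config file name and logger category below are just examples):

    use Log::Log4perl qw(get_logger);
    use Log::Log4perl::Level;

    # Runs at compile time, before any logger can be initialized
    BEGIN {
        Log::Log4perl::Logger::create_custom_level("ALERT", "ERROR");
    }

    Log::Log4perl->init("log4perl.conf");    # example config file
    my $logger = get_logger("My::App");      # example category
    $logger->alert("something needs attention");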
Hope this helps someone else!
The code you provided doesn't produce an error.
Perhaps you have some other code later in your script that is evaluated at compile time -- a module loaded in a use statement or some code in a BEGIN { ... } block -- that initializes a Logger.
If it's not obvious where this might be happening, you can use the Perl debugger to find out where the Logger call could be coming from. First, put this line in your file right after the use Log::Log4perl; and use Log::Log4perl::Level; statements:
BEGIN { $DB::single=1 }
This statement makes the debugger stop at this line during the compile phase and lets you hit breakpoints during the rest of compilation. Then fire up the debugger
$ perl -d the_script.pl
set breakpoints on the critical Log::Log4perl functions
DB<1> b Log::Log4perl::init
DB<2> b Log::Log4perl::get_logger
begin executing the code
DB<3> c
and when the code stops, get a stack trace
Log::Log4perl::get_logger(/usr/local/lib/perl5/site_perl/5.18.1/Log/Log4perl.pm:371):
371: my $category;
DB<4> T

Catching runtime errors in Perl and converting to exceptions

Perl currently implements $SIG{__DIE__} in such a way that it will catch any error that occurs, even inside eval blocks. This has a really useful property that you can halt the code at the exact point where the error occurs, collect a stack trace of the actual error, wrap this up in an object, and then call die manually with this object as the parameter.
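A minimal sketch of that pattern (My::Exception is a hypothetical exception class standing in for whatever object you wrap the error in):

    use Carp ();

    $SIG{__DIE__} = sub {
        my ($err) = @_;
        # Already an object (e.g. one we built earlier): let it propagate.
        die $err if ref $err;
        # Wrap the plain error string with a stack trace captured at the
        # exact point of failure, then rethrow the object.
        die My::Exception->new(
            message => $err,
            trace   => Carp::longmess($err),
        );
    };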
This abuse of $SIG{__DIE__} is deprecated. Officially, you are supposed to replace $SIG{__DIE__} with *CORE::GLOBAL::die. However, these two are NOT remotely equivalent. *CORE::GLOBAL::die is NOT called when a runtime error occurs! All it does is replace explicit calls to die().
I am not interested in replacing die.
I am specifically interested in catching runtime errors.
I need to ensure that any runtime error, in any function, at any depth, in any module, causes Perl to pass control to me so that I can collect the stack trace and rethrow. This needs to work inside an eval block -- one or more enclosing eval blocks may want to catch the exception, but the runtime error could be in a function without an enclosing eval, inside any module, from anywhere.
$SIG{__DIE__} supports this perfectly—and has served me faithfully for a couple of years or more—but the Powers that Be™ warn that this fantastic facility may be snatched away at any time, and I don't want a nasty surprise one day down the line.
Ideally, for Perl itself, they could create a new signal $SIG{__RTMERR__} for this purpose (switching signal is easy enough, for me anyway, as it's only hooked in one place). Unfortunately, my persuasive powers wouldn't lead an alcoholic to crack open a bottle, so assuming this will not happen, how exactly is one supposed to achieve this aim of catching runtime errors cleanly?
(For example, another answer here recommends Carp::Always, which … also hooks DIE!)
Just do it. I've done it. Probably everyone who's aware of this hook has done it.
It's Perl; it's still compatible going back decades. I interpret "deprecated" here to mean "please don't use this if you don't need it, ew, gross". But you do need it, and seem to understand the implications, so imo go for it. I seriously doubt an irreplaceable language feature is going away any time soon.
And release your work on CPAN so the next dev doesn't need to reinvent this yet again. :)

Is the only way to disconnect WWW::Mechanize::Firefox from mozrepl destruction of the objects?

I'm trying to make a Perl daemon which, being long-running, I want to be sane about resource usage.
None of the examples or documentation I've seen seems to mention a way to disconnect a session.
The best documentation on the topic I can find is WWW::Mechanize::Firefox::Troubleshooting, where it's suggested the object (and connection?) is kept alive until global destruction.
In short, I've seen no 'disconnect' function, and wonder if I'm missing something.
Disconnection seems to be handled via destructors. Perl uses special DESTROY methods for this. It is not advisable to call this method manually.
You need to decrease the refcount of your $mech object in order to get it destroyed automatically. This happens when the variable drops out of scope, during the global destruction phase at the end of the process, or (in the case of objects) when you assign something different to the variable, e.g.
$mech = undef;
To completely deallocate any variable, you can also
undef $mech; # which btw is the answer provided in the FAQ you linked
The differences are subtle, and irrelevant in this case.
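In a long-running daemon, one way to use this is to confine $mech to a lexical scope per unit of work, so the destructor releases the connection as soon as the block ends. A rough sketch (next_job() is a placeholder for however you obtain work):

    while (my $job = next_job()) {
        my $mech = WWW::Mechanize::Firefox->new();
        $mech->get($job->{url});
        # ... scrape the page ...
    }   # $mech goes out of scope here; DESTROY runs and the mozrepl
        # session for this iteration is torn down

Whether reconnecting per job is acceptable depends on how costly establishing the mozrepl connection is; the point is only that destruction, not an explicit disconnect call, closes the session.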

Skipping error in eval{} statement

I am trying to extract data from a website using a Perl API, working through a list of URIs.
Initially the problem was that if there was no data available for a URI, the script would die; I wanted it to skip that URI and go to the next one, and I used next unless ...; to get around that.
Now the problem is that I am extracting specific data by calling a particular method (identifiers()) from the API. The data is available for the URI, but the specific data I am looking for (the identifiers) is not, and the script dies.
I tried to use eval{} like this
eval {
    for $bar ($foo->identifiers()) {
        # do something
    }
};
When I use eval{} I think it skips the error and moves ahead, but I am not sure, because the error it gives is Invalid content type in response: text/plain.
When I check the URI manually, it doesn't have the identifiers but it does have the rest of the data. I want the script to skip such URIs and move to the next one. How can I do that?
OK, I think I understand your question, but a little more code would have helped, as would specifying which Perl API -- not that it seems to matter to the answer, but it is a big part of your question. Having said that, the problem seems very simple.
When Perl hits an error, like most languages, it runs out through the calling contexts in order until it finds a place where it can handle the error. Perl's most basic error handling is eval{} (but I'd use Try::Tiny if you can, as it is then clearer that you're doing error handling instead of some of the other strange things eval can do).
Anyway, when an error occurs inside eval{}, the whole eval{} exits immediately and $@ is set to the error. So, having the eval{} outside the loop means an error takes you out of the loop. If you put the eval{} inside the loop, the eval{} will still exit when an error occurs, but you will carry on to the next iteration. It's that simple.
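A rough sketch of that restructuring (assuming the URIs are iterated in @uris and that $foo is valid for each of them; adjust to the real API):

    for my $uri (@uris) {
        my $ok = eval {
            for my $bar ($foo->identifiers()) {
                # do something with $bar
            }
            1;                              # signal success
        };
        unless ($ok) {
            warn "Skipping $uri: $@";       # the eval caught the die
            next;                           # move on to the next URI
        }
    }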
I also detect signs that maybe you're not using use strict; and use warnings;. Please do, as they help you find many bugs quicker.