How to avoid error maximum open_cursor exceeded when using Class::DBI - perl

(Update to answer Jonathan Leffler's question below):
We're running Perl 5.8.7 and Oracle 11.1.0.7.0.
Due to company policy, developers have no direct control over software upgrades. A proposal to upper management takes months to be followed up on (if it is approved at all) - I guess this is not an unusual situation at other companies either.
I inherited the program from someone else who left the company, and found the warning about "issuing rollback() ..." in the application log file. The actual problem, "maximum open_cursor exceeded", was found after I ran DBI_TRACE=2=/tmp/trace.log program_name.pl.
Looking at the numbers in $dbh->{ActiveKids}, $dbh->{Kids}, and $dbh->{CachedKids}, I assume the maximum number of open cursors is 50, as the error happens after it reaches 50.
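(For reference, a small sketch - not from the original post - of how those counters can be logged from the shared handle; db_Main() is the Class::DBI/Ima::DBI connection and the STDERR logging is purely illustrative:)
# Sketch: dump DBI's child-handle counters to see how close we are to the limit.
my $dbh = __PACKAGE__->db_Main();
printf STDERR "Kids=%d ActiveKids=%d CachedKids=%d\n",
    $dbh->{Kids},
    $dbh->{ActiveKids},
    scalar keys %{ $dbh->{CachedKids} || {} };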
Our legacy production code uses these modules:
DBI - 1.48
Ima::DBI - 0.33
Class::DBI - 0.96
Class::DBI::Oracle - 0.51
DBD::Oracle - 1.16
For some odd policy reason, upgrading the modules to newer versions is not possible :(
The application relies on CDBI to handle relationships on a large number of tables. A simplified snippet of the code is as below:
JOB:
foreach my $job (@jobs) {
    my @records = $job->record;
    RECORD:
    foreach my $record (@records) {
        my @datas = $record->data;
        DATA:
        foreach my $data (@datas) {
            ....
        }
    }
}
where each $job, $record, and $data is an object for a table, and the innermost loop calls several other triggers.
Somewhere after several loops I'm getting an Oracle error, maximum open_cursor exceeded, and then I get the error from CDBI: issuing rollback() for database handle being DESTROY'd without explicit disconnect.
I can work around it by undef-ing the DBI CachedKids in the outermost loop, with:
# somewhere during initialization
$self->{_this_dbh} = __PACKAGE__->db_Main();
....
JOB:
foreach my $job (@jobs) {
    RECORD: ....
    DATA: ....
    $self->{_this_dbh}->{CachedKids} = undef;
}
Is that the proper way to do it?
Or does CDBI support a way to clear statement handles the same way as DBI's $sth->finish()?
Thanks.

At some point, you will have to explain why you cannot upgrade to more nearly current versions of the software. You didn't mention which version of Perl you are using, or which version of Oracle; somehow, I suspect that it is neither 5.10.1 nor 11gR2.
Current versions:
Class::DBI 3.0.17
Class::DBI::Oracle 0.51
DBI 1.609 (version 1.48 is from 2005)
DBD::Oracle 1.23 (version 1.16 is from 2004)
Ima::DBI 0.35
What changed recently? Why are you suddenly finding problems in a piece of software that was, presumably, very stable? Is this new code?
With plain DBI, when you undef a statement handle (by having it go out of scope, for example), then the resources associated with it are released - more or less noisily. However, there is enough infrastructure between Class::DBI and DBI that it is hard to tell how this might map.
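As a plain-DBI illustration of that point (a sketch only; the handle, table, and bind value here are hypothetical):
# When $sth goes out of scope it is destroyed and its database cursor is
# released; calling finish() first releases the cursor explicitly and avoids
# the "Active statement handle" noise on destruction.
{
    my $sth = $dbh->prepare('SELECT * FROM jobs WHERE status = ?');
    $sth->execute('PENDING');
    while (my $row = $sth->fetchrow_hashref) {
        # ... process $row ...
    }
    $sth->finish;    # explicit: the cursor is freed here
}                    # implicit: the handle is destroyed at end of scope anyway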
Have you worked out what the limit on open cursors actually is?
Have you worked out whether you've opened enough cursors to actually exceed that limit?
Have you tried running with DBI_TRACE set in the environment? A value such as 3 will tell you a fair amount about what is going on - maybe too much. It would show whether cursors are being released properly or not.
Have you tried reducing the number of tables manipulated in a single session?
Have you considered disconnecting and reconnecting between manipulating tables?
Is there a way to get to the statement handle corresponding to the Class::DBI abstractions, so that you can in fact execute $sth->finish()?
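If the handles Class::DBI holds on to are the cached ones hanging off db_Main(), one option (a sketch, not something the Class::DBI documentation promises) is to walk CachedKids and finish the active handles instead of throwing the whole cache away:
# Sketch: finish any still-active cached statement handles so their Oracle
# cursors are released, while keeping the prepare_cached() cache itself intact.
my $dbh = __PACKAGE__->db_Main();
for my $sth (values %{ $dbh->{CachedKids} || {} }) {
    $sth->finish if $sth->{Active};
}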

Related

What's the most reliable method for cross-platform alarm signal handling or execution timeouts in Perl?

I've added advisory locking to Sqitch, using Postgres advisory locks and MySQL GET_LOCK(). This feature prevents more than one instance of Sqitch from deploying to a database at one time. This works great, but I wanted to add a lock timeout, too, so that one never finds a CI/CD process hung for hours or days because something went amiss.
MySQL's GET_LOCK() supports a timeout argument, but Postgres advisory locks do not. Since I thought it likely that other database engines would also not have timeouts, I thought it best to implement the timeout in Perl. Following the DBI manual, I used Sys::SigAction to set and handle the timeout:
# Try waiting for the lock.
require App::Sqitch::SigAction;
return $self->_locked(1) unless App::Sqitch::SigAction::timeout_call($wait, sub {
    $self->wait_lock
});
I also added tests to confirm it works with both MySQL and Postgres. So far so good.
Alas, Sys::SigAction does not work on Windows. I took a stab at testing it on Windows, but since Windows Perl is not compiled with d_sigaction, which Sys::SigAction requires, I didn't get far. I tried implementing the standard Perl alarm/$SIG{ALRM} pattern (sketched below), but it failed to deliver the signal while waiting on the Postgres lock.
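For reference, the alarm pattern in question is roughly this (a sketch; $wait and wait_lock are from the surrounding code):
# The classic eval/alarm timeout pattern; whether the ALRM signal actually
# interrupts a blocking DBD::Pg call is exactly the problem described here.
my $timed_out = 0;
eval {
    local $SIG{ALRM} = sub { die "timeout\n" };  # "\n" so die doesn't append a location
    alarm $wait;                                 # seconds to wait for the lock
    $self->wait_lock;                            # the blocking call we want to bound
    alarm 0;
};
if ($@) {
    die $@ unless $@ eq "timeout\n";             # rethrow anything unexpected
    $timed_out = 1;
}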
All of which has led me here and to my question: what is the best cross-platform pattern for timing out some execution in Perl? Ideally it has a straightforward interface, works on *nix and Windows, and effectively handles breaking out of a database query.
I ended up ditching Sys::SigAction following discussion here and elsewhere, and instead switched to:
1. Letting the database handle the timeout, as MySQL's get_lock() does
2. Adding a simple interface for polling with exponential backoff and timeout that engines can use to poll for a lock instead of waiting (similar to Retry::Backoff)
3. Switching the Postgres implementation to use the async query support in DBD::Pg to send off the lock request, then using the backoff/timeout interface to check whether it has returned and to cancel the query if it times out
I was especially pleased to realize I could do #3, as I originally used the timeout/backoff interface to poll with pg_try_advisory_lock(key), which just feels heavy. Better to asynchronously call pg_advisory_lock(key) and poll for its response. It looks like this:
sub wait_lock {
    my $self = shift;
    # Asynchronously request a lock with an indefinite wait.
    my $dbh = $self->dbh;
    $dbh->do(
        'SELECT pg_advisory_lock(75474063)',
        { pg_async => DBD::Pg::PG_ASYNC() },
    );
    # Use _timeout to periodically check for the result.
    return 1 if $self->_timeout(sub { $dbh->pg_ready && $dbh->pg_result });
    # Timed out, cancel the query and return false.
    $dbh->pg_cancel;
    return 0;
}
Of course the MySQL implementation is simpler, since get_lock() does all the work:
sub wait_lock {
    my $self = shift;
    $self->dbh->selectcol_arrayref(
        q{SELECT get_lock('sqitch working', ?)},
        undef, $self->lock_timeout
    )->[0]
}
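The _timeout backoff poller referenced above isn't shown in the post; a minimal sketch of such a helper - with the interval, cap, and reuse of lock_timeout as assumptions rather than Sqitch's actual API - might look like:
use Time::HiRes qw(sleep time);

# Hypothetical exponential-backoff poller: call $check until it returns true
# or the timeout expires. The starting interval and cap are illustrative.
sub _timeout {
    my ($self, $check) = @_;
    my $deadline = time + $self->lock_timeout;
    my $interval = 0.1;
    while (time < $deadline) {
        return 1 if $check->();
        sleep $interval;
        $interval *= 2;
        $interval = 10 if $interval > 10;   # cap the backoff at 10 seconds
    }
    return 0;   # timed out
}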

Why does a program with Parallel::Loops exhaust my memory?

I've inherited some code at work that I'm trying to improve on. My Perl skills are somewhat lacking, so I would love some assistance!
Essentially this script SNMP-polls a network of thousands of nodes to update its local interface index cache. I've found it's hitting a problem where it exhausts its memory and fails. The code is as follows (heavily reduced, but I think you'll get the gist):
use strict;
use warnings;
use Parallel::Loops;
my %snmp_results;
my $maxProcs = 50;
my @exceptions;
my @devices;
my $pl = Parallel::Loops->new($maxProcs);
$pl->share(\%snmp_results, \@exceptions);
load_devices();
get_snmp_interfaces();
sub get_snmp_interfaces {
    $pl->foreach( \@devices, sub {
        my ($name, $community, $snmp_ver) = @$_;
        # Create the new ifindex cache, and return an array reference to the new entries
        my $result = getSNMPIFFull($name, $community, $snmp_ver);
        if (defined $result && $result ne "") {
            my %cache = %{$result};
            print "Got cache for $name\n";
            # Build hash of all the links polled through SNMP
            # [ifindex, ifdesc, ifalias, ifspeed, ip]
            for my $link (keys %cache) {
                $snmp_results{$name}{$cache{$link}[0]} = [$cache{$link}[0], $cache{$link}[1], $cache{$link}[2], $cache{$link}[3], $cache{$link}[4]];
            }
        }
        else {
            push(@exceptions, "Unable to poll $name - $community - $snmp_ver");
        }
    });
}
This particular VM has 3.1 GB of RAM allocatable and idles at about 83 MB of usage when this script is not running. If I drop maxProcs down to 25 it finishes fine, but this script can already take a long time given the sheer number of devices plus latency, so I would rather keep the parallelism high!
I have a feeling that the $pl->share() is sharing the ever-expanding %snmp_results with each forked process, which is definitely not necessary since each process isn't reading or modifying other entries: it's just adding new ones. Is there a better way I can be doing this?
I'm also slightly unsure about my %cache = %{$result};. If this just creates an alias to the hash then cool, but if it's doing a copy, that's also a bit wasteful!
Any help will be greatly appreciated!
Documentation of the module can be found on CPAN.
There's one part of it talking about performance:
"Also, if each loop sub returns a massive amount of data, this needs to be communicated back to the parent process, and again that could outweigh parallel performance gains unless the loop body does some heavy work too."
You are probably moving around complete copies of the variables in memory, pushing the machine to its limit if the MIB to poll and the number of machines are big enough.
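(On the %cache question specifically: my %cache = %{$result} does copy the whole hash. A sketch of the inner loop working on the returned reference directly, reusing the names from the question, would be:)
# Sketch: iterate the hashref itself instead of copying it into %cache.
if (defined $result && ref $result eq 'HASH') {
    print "Got cache for $name\n";
    for my $link (keys %$result) {
        # take the five fields [ifindex, ifdesc, ifalias, ifspeed, ip] in one slice
        $snmp_results{$name}{ $result->{$link}[0] } = [ @{ $result->{$link} }[0..4] ];
    }
}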
Since what you are doing is an I/O-intensive task, not a CPU-bound task that could benefit from parallel CPU processing, I would reconsider the approach of launching so many (50!) processes for polling.
Run the program with $maxProcs reduced to between 1 and 5 processes and see how it behaves. Do some profiling of your code, attaching Devel::NYTProf, to check where you are spending time and whether increasing the number of processes actually leads to better performance.
Reconsider using Parallel::Loops for this task. You may get better performance with use threads and a hash shared between the different threads (use threads::shared).
Apologies if this could have been a comment. Starting out on SO is difficult due to all the limitations that are in place :(
If you have already found a solution, it would be great if you could share your findings with us. I didn't know about Parallel::Loops before, and I think I can put it to some use.

EF6/Code First: Super slow during the 1st query, but only in Debug

I'm using EF6 RC1 with the Code First strategy, without precompiled views, and the problem is:
If I compile and run the exe application, it takes about 15 seconds to run the first query (that's okay, since I'm still working on the pre-generated views). But if I use Visual Studio 2013 Preview to debug the exact same application, it takes almost 2 minutes BEFORE running the first query:
Dim Context = New MyEntities()
Dim Query = From I in Context.Itens '' <--- The debug takes 2 minutes in here
Dim Item = Query.FirstOrDefault()
Is there a way to remove this extra time? Am I doing something wrong here?
P.S.: The context itself is not complicated, it's just full of 200+ tables.
Edit: I found out that the problem is that at debug time EF appears to be generating the views, ignoring the pre-generated ones.
Using the source code from EF I discovered that the property:
IQueryProvider IQueryable.Provider
{
    get
    {
        return _provider ?? (_provider = new DbQueryProvider(
            GetInternalQueryWithCheck("IQueryable.Provider").InternalContext,
            GetInternalQueryWithCheck("IQueryable.Provider").ObjectQueryProvider));
    }
}
is where the time is being consumed. But this is strange since it only takes time in debug. Am I missing something here?
Edit: Found more info related to the question:
Using Process Monitor (from Sysinternals) I found out that it's the 'devenv.exe' process that is consuming tons of time. To be more specific, it's spending the time on a 'Thread Exit'. It repeats the Thread Exit stack 36 times. I don't know if this info is very useful, but I saved a '.csv' with the stack; here is its body: [...] (edit: removed the '.csv' body; I can post it again in the comments if someone really thinks it's going to be useful, but it was confusing and too big.)
Edit: Installed VS2013 Ultimate and Entity Framework 6 RTM. Installed the Entity Framework Power Tools Beta 4 and used it to generate the Views. Nothing changed... If I run the exe it takes 20 seconds, if I 'Start' debugging it takes 120 seconds.
Edit: Created a small project to simulate the error: http://sdrv.ms/16pH9Vm
Just run the project inside the environment and directly through the .exe, click the button and compare the loading time.
This is a known performance issue in Lazy (which EF is using) when the debugger is attached. We are currently working on a fix (the current approach we are looking at is removing the use of Lazy). We hope to ship this fix in a patch release soon. You can track progress of this issue on our CodePlex site - http://entityframework.codeplex.com/workitem/1778.
More details on the coming 6.0.2 patch release that will include a fix are here - http://blogs.msdn.com/b/adonet/archive/2013/10/31/ef6-performance-issues.aspx
I don't know if you have found the solution, but in my case I had a similar issue which wasted close to a week of trying different suggestions. Finally, I found a solution by setting optimizeCompilations="true" in my web.config, and performance improved dramatically from 15-30 seconds to about 2 seconds.

Perl/Swig/Python/Postgresql/C++ Script just stops executing, only getting "Premature end of script headers"

This is hard to explain in a few sentences. I have spent the last 5 days trying to figure this out, so now I'm asking here as a last resort. I am trying to run a pool physics library with tournament server, built by the Stanford University Computational Billiards Group, available at http://www.stanford.edu/group/billiards/
It provides a tournament server using apache2, postgresql and perl. I have been able to enable functions like logging in or creating simple matches, but the communication with AI clients does not work. The client is a C++ application I'm running in a terminal; it's fetching commands from the server and it's printing this error:
XMLRPC error: Unable to transport XML to server and get XML response back. HTTP response code is 500, not 200
And the apache2 error.log is printing this:
[Thu May 17 21:30:17 2012] [error] [client 127.0.0.1] Premature end of script headers: api.pl
I tried running it with CGI::Debug but didn't get any more output. I have been able to identify the line it fails at with some debug output (one after every line :) ) and it's in this function, on the line between the print STDERRs:
package Pool::Rules::GameState;
sub addToDb {
    my $self=shift;
    my $timeleft=shift || $self->timeLeft();
    my $timeleft_opp=shift || $self->timeLeftOpp();
    #print STDERR "uh uh uh".$self." ".$self->timeLeft()." ".$self->timeLeftOpp()." ".$timeleft." ".$timeleft_opp."\n";
    my $playingSolids = ($self->isOpenTable() ? undef : ($self->playingSolids()?1:0));
    # print STDERR "addToDb: X".$self->isOpenTable()."X Y".$self->playingSolids()."Y Z".$playingSolids."Z\n";
    $Pool::dB::dbh->do('INSERT INTO states (turntype,cur_player_started,playing_solids,timeleft,timeleft_opp,gametype) VALUES (?,?,?,?,?,?)',{},
        $self->getTurnType(),$self->curPlayerStarted()?1:0,$playingSolids,$timeleft,$timeleft_opp,$self->gameType()) or STDERR $DBI::errstr;
    my $stateid=$Pool::dB::dbh->selectrow_array('SELECT lastval()');
    $self->tableState()->addToDb($stateid);
    return $stateid;
}
They're only simple getter methods, and still it just stops executing. The code for the getter methods was generated by SWIG. The first print line prints this:
uh uh uhPool::Rules::GameState=HASH(0x99fa99c) 0 0 00:10:00 00:10:00
Which seems to be okay, though it's curious that the $self-> getters only print 0. Since the library is from Stanford University and actual competitions have been held with it, I find it hard to believe that this is an error in the code. It just seems to be a bit too old to work (I had to make some adjustments in other places to make it work with more recent versions of libraries; for example, I added a <cstddef> include in another file unrelated to this Perl issue). I first tried making it run on current versions, which failed. Then I tried to install it on the versions the readme.txt says it has been tested with (Ubuntu 8.10, Perl 5.10 and so on), but at some point my Ubuntu 8.10 installation always died and I had to reinstall (I tried this ~4 times; the gnome-terminal always segfaulted). So now I'm back on Ubuntu 12 trying to make it work. I don't know much about Perl, only the little I have been able to pick up over the last few days trying to get this to run.
Does anyone have an idea what could be triggering this kind of behavior? Is anyone aware of any compatibility issues that may be related to this? If you need any additional information just ask and I'll provide it.
Thank you for your help!

Perl CGI gets parameters from a different request to the current URL

This is a weird one. :)
I have a script running under Apache 1.3, with Apache::PerlRun option of mod_perl. It uses the standard CGI.pm module. It's a regularly accessed script on a busy server, accessed over https.
The URL is typically something like...
/script.pl?action=edit&id=47049
Which is then brought into Perl the usual way...
my $action = $cgi->param("action");
my $id = $cgi->param("id");
This has been working successfully for a couple of years. However we started getting support requests this week from our customers who were accessing this script and getting blank pages. We already had a line like the following that put the current URL into a form we use for customers to report an issue about a page...
$cgi->url(-query => 1);
And when we view source of the page, the result of that command is the same URL, but with an entirely different query string.
/script.pl?action=login&user=foo&password=bar
A query string that we recognise as being from a totally different script elsewhere on our system.
However crazy it sounds, it seems that when users are accessing a URL with a query string, the query string that the script is seeing is one from a previous request on another script. Of course the script can't handle that action and outputs nothing.
We have some automated test scripts running to see how often this happens, and it's not every time. To throw some extra confusion into the mix, after an Apache restart, the problem seems to initially disappear completely only to come back later. So whatever is causing it is somehow relieved by a restart, but we can't see how Apache can possibly take the request from one user and mix it up with another.
This, it appears, is an interesting combination of Apache 1.3, mod_perl 1.31, CGI.pm and Apache::GTopLimit.
A bug was logged against CGI.pm in May last year: RT #57184
Which also references "CGI.pm params not being cleared?"
CGI.pm registers a cleanup handler in order to clean up all of its cache (line 360):
$r->register_cleanup(\&CGI::_reset_globals);
Apache::GTopLimit (like Apache::SizeLimit mentioned in the bug report) also has a handler like this:
$r->post_connection(\&exit_if_too_big) if $r->is_main;
In mod_perl before 1.31, post_connection and register_cleanup appear to push onto the handler stack, while in 1.31 it appears as if the GTopLimit handler clobbers the CGI.pm entry. So if your GTopLimit function fires because the Apache process has got too large, then CGI.pm won't be cleaned up, leaving it open to returning the same parameters the next time you use it.
The solution seems to be to change line 360 of CGI.pm to:
$r->push_handlers( 'PerlCleanupHandler', \&CGI::_reset_globals);
Which explicitly pushes the handler onto the list.
Our restart of Apache temporarily resolved the problem because it reduced the size of all the processes and gave GTopLimit no reason to fire.
And we assume it has appeared over the past few weeks because we have increased the size of the Apache processes through new developments that included something that wasn't there before.
All tests so far point to this being the issue, so fingers crossed it is!