How to hook after DBI reconnect - perl

I deal with a module which provides an abstraction over DBI and automatically reconnects. I need to perform some actions after `DBI->connect`.
Is there a way to add a hook there without modifying this module? I haven't had any luck finding one in the documentation. Did I miss something?

The DBI documentation has a chapter on subclassing which mentions a $dbh->connected method that does nothing by default. It seems to be exactly what you want.
When subclassing is in use, then after a successful new connection the DBI->connect method automatically calls:
$dbh->connected($dsn, $user, $pass, \%attr);
I have not tried that, but it might work by just monkey-patching this connected method into DBI directly without subclassing anything. In connect there is definitely a call to connected.
But I am not sure where to patch that in. Possibly into the driver. A quick grep of CPAN shows that only two drivers included in the DBI distribution define it: DBD::Gofer and DBD::Proxy (though the latter's is empty). In both of them it lives in the DBD::<drivername>::db package.
Let's assume you are using MySQL. Then you'd go and hook it into your driver, either by subclassing and using that driver, or by simply monkey-patching it in.
*DBD::mysql::db::connected = sub {
    my ($dbh, $dsn, $user, $pass, $attr, $old_driver) = @_;
    warn 'Connected!';
};
This should work the same with other drivers, unless they have their own connected. In that case you should probably wrap it manually, or use something like around from Class::Method::Modifiers, to make sure the original behavior stays intact.
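A minimal sketch of that manual wrap, using a dummy package in place of a real driver (DummyDriver::db and its messages are made up here; with a real driver you would wrap DBD::<drivername>::db::connected the same way):

```perl
use strict;
use warnings;

# DummyDriver::db stands in for a real driver package such as DBD::mysql::db
package DummyDriver::db;
sub connected { warn "driver's own connected ran\n" }

package main;

# Save the original and install a wrapper that calls it first,
# so the driver's existing behavior stays intact
my $orig = \&DummyDriver::db::connected;
{
    no warnings 'redefine';
    *DummyDriver::db::connected = sub {
        $orig->(@_);                       # original behavior first
        warn "our hook ran afterwards\n";  # then our extra work
    };
}

DummyDriver::db::connected('fake-dbh');
```

Class::Method::Modifiers' around does essentially this bookkeeping for you.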
You also have the actual connected $dbh at this point, so you can go rummaging around in the database in connected if you want.
Of course this would give you the callback after every connect. If you wanted to only get the reconnects, you could create a closure over a lexical variable that counts the connections and skip the very first one.
{
    my $connection_counter;
    *DBD::mysql::db::connected = sub {
        my ($dbh, $dsn, $user, $pass, $attr, $old_driver) = @_;
        return unless $connection_counter++; # skip the very first connection
        warn 'Connected!';
    };
}
Please note that I have not tested any of this.

While I must say I have not tried this, I found this article
http://justatheory.com/computers/databases/postgresql/execute-on-select.html
where @theory writes about an option to leverage DBI to run code whenever something happens, like connect. Hope it helps!

What happens if I assign two DBI connections in a row to a variable?

What happens to the first connection if I assign two DBI connections in a row to a variable?
It doesn't feel like there is an implicit disconnect (no lag).
my $dbh = DBI->connect( $dsn, $user, $pass );
$dbh = DBI->connect( $dsn, $user, $pass );
The destructors of DBDs do close the connection.
I mean, it's possible that there's a DBD that leaks database connections out there, but that's highly unlikely and it would be a bug.
One may connect multiple times with the exact same parameters, so I don't see a reason for connect to check whether there is any existing connection, and certainly not to touch (disconnect) it.
However, the object for this new connection then overwrites the Perl variable $dbh. At this point anything could happen, since DBI uses tied variables, but I don't see in the sources that this would trigger the destructor (DESTROY method), nor can I see that in the sources of the drivers I looked up (MySQL and PostgreSQL).
So it seems to me that in this way the handle on the first connection is just lost, and you effectively got a leak. If that is the case then it is also undefined what happens to possibly open transactions in the first connection.
So I'd add a check on $dbh before connecting, using (defined and) state and/or ping. Then, if previous work with it is indeed done and you want to overwrite it, first close it if needed. (Another option is to structure your code so that a new connection goes into a newly declared variable.)
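That check could be factored into a small helper, sketched below; ensure_dbh and its $connector argument are hypothetical names, and the mock handle in the usage comment only stands in for a real DBI $dbh:

```perl
use strict;
use warnings;

# Hypothetical helper: reuse the handle if it is alive, otherwise reconnect.
# $connector is any coderef that returns a fresh handle,
# e.g. sub { DBI->connect($dsn, $user, $pass) }
sub ensure_dbh {
    my ($dbh, $connector) = @_;
    return $dbh if defined $dbh && $dbh->ping;
    $dbh->disconnect if defined $dbh;   # explicitly close a stale handle first
    return $connector->();
}

# With DBI this would be used before each unit of work:
# $dbh = ensure_dbh($dbh, sub { DBI->connect($dsn, $user, $pass) });
```

This way the first handle is always either reused or explicitly disconnected, never silently leaked.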

Should I disconnect() if I'm using Apache::DBI's connect_cached()?

My mod_perl2-based intranet app uses DBI->connect_cached(), which is supposedly overridden by Apache::DBI's version of the same. It has normally worked quite well, but just recently we started having an issue on our testing server, which had only two users connected. Our app would sometimes, but not always, die when trying to reload a page with 'FATAL: sorry, too many clients already' when connecting to our Postgres 9.0 backend, despite all of the connections being <IDLE> when I look at the stats in pgadmin3.
The backend is separate from our development and production backends, but they're all configured with max_connections = 100. Likewise the httpd services are all separate, but configured with
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 99
MaxClients 99
MaxRequestsPerChild 4000
....
PerlModule Apache::DBI
I had been under the impression that I shouldn't call disconnect() on my database handles if I wanted them to actually benefit from caching. Was I wrong about that? If not, I guess I'll ask about the above error separately. Just wanted to make sure it wasn't this setup...
Apache::DBI's docs say:
When loading the DBI module (do not confuse this with the Apache::DBI
module) it checks if the environment variable 'MOD_PERL' has been set
and if the module Apache::DBI has been loaded. In this case every
connect request will be forwarded to the Apache::DBI module.
....
There is no need to remove the disconnect statements from your code.
They won't do anything because the Apache::DBI module overloads the
disconnect method.
If you are developing new code that is strictly for use in mod_perl,
you may choose to use DBI->connect_cached() instead, but consider
adding an automatic rollback after each request, as described above.
So I guess for my mod_perl2-only app, I don't need Apache::DBI because Apache::DBI's devs recommend using DBI->connect_cached. And I don't need disconnect statements.
But then DBI's docs say:
Note that the behaviour of [ connect_cached ] differs in several
respects from the behaviour of persistent connections implemented by
Apache::DBI. However, if Apache::DBI is loaded then connect_cached
will use it.
This makes it sound like Apache::DBI will actually affect connect_cached, in that instead of getting DBI->connect_cached behaviour when I call that, I'll get Apache::DBI->connect behaviour. And Apache::DBI's docs recommend against that.
UPDATE: I've set the first 5 parameters in the above config all to 1, and my app is still using up more and more connections as I hit its pages. This I don't understand at all--it should only have one process, and that one process should be re-using its connection.
Unless you plan on dropping Apache::DBI, the answer is a firm no, because Apache::DBI's override really does nothing:
# overload disconnect
{
    package Apache::DBI::db;
    no strict;
    @ISA = qw(DBI::db);
    use strict;
    sub disconnect {
        my $prefix = "$$ Apache::DBI ";
        Apache::DBI::debug(2, "$prefix disconnect (overloaded)");
        1;
    };
}

Why doesn't Catalyst die only once in a chained action?

Consider the following actions:
sub get_stuff :Chained('/') :PathPart('stuff') :CaptureArgs(1) {
    my ($self, $c, $stuff_id) = @_;
    die "ARRRRRRGGGG";
}
sub view_stuff :Chained('get_stuff') :PathPart('') :Args(0) {
    my ($self, $c) = @_;
    die "DO'H";
}
Now if you request '/stuff/314/', you'll get:
Error: ARRRRRRGGGG in get_stuff at ...
Error: DO'H in view_stuff at ...
Is there a reason why not just throw the error at the first failing chain link?
Why is catalyst trying to carry on the chain?
I'm not sure of the answer as to 'why', but I presume it was done that way to give flexibility.
You should probably catch the error with eval (or preferably something like Try::Tiny or TryCatch) and call $c->detach if you want to stop processing actions.
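The plain-Perl shape of that pattern is below; the guarded helper is just illustrative, and the Catalyst-specific part ($c->log->error($err) followed by $c->detach) is only sketched in a comment, since it needs a running Catalyst context:

```perl
use strict;
use warnings;

# Trap the die, then bail out deliberately instead of letting the chain continue
sub guarded {
    my ($risky) = @_;
    my $ok = eval { $risky->(); 1 };
    unless ($ok) {
        my $err = $@;
        # In a Catalyst action: $c->log->error($err); $c->detach;
        return "caught: $err";
    }
    return "no error";
}
```

Try::Tiny's try/catch blocks express the same thing while avoiding the classic $@ pitfalls.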
Catalyst::Plugin::MortalForward
The chosen answer may be outdated. Catalyst can die early when the application's config key abort_chain_on_error_fix is set:
__PACKAGE__->config(abort_chain_on_error_fix => 1);
See the documentation about Catalyst configuration. It also states that this behaviour may become the default in the future.
Cucabit is right, detach is the way to go. As to the why: normally in a Perl process, 'die' stops the process. In Catalyst you don't want that. If, for instance, you run your Catalyst app under FastCGI, you spawn one or more standalone processes that handle multiple requests. If the first request could kill the process itself, the web server would have to respawn the FastCGI process in order to handle the next call. I think that is why Catalyst catches 'die' (it's used a lot, as in the idiom 'do_something() or die $!') and turns it into an exception.
You could also end the process with 'exit', I guess, but you'd end up with the same problem as above: killing the process.
What you can of course do is create your own 'die' method that logs the error with the default log object and then calls detach or something. It should also be possible to redefine the Catalyst exception handling, as anything is possible with Catalyst :)

What is wrong with accessing DBI directly?

I'm currently reading Effective Perl Programming (2nd edition). I have come across a piece of code which was described as being poorly written, but I don't yet understand what's so bad about it, or how it should be improved. It would be great if someone could explain the matter to me.
Here's the code in question:
sub sum_values_per_key {
    my ( $class, $dsn, $user, $password, $parameters ) = @_;
    my %results;
    my $dbh = DBI->connect( $dsn, $user, $password, $parameters );
    my $sth = $dbh->prepare('select key, calculate(value) from my_table');
    $sth->execute();
    # ... fill %results ...
    $sth->finish();
    $dbh->disconnect();
    return \%results;
}
The example comes from the chapter on testing your code (p. 324/325). The sentence that has left me wondering about how to improve the code is the following:
Since the code was poorly written and accesses DBI directly, you'll have to create a fake DBI object to stand in for the real thing.
I have probably not understood a lot of what the book has so far been trying to teach me, or I have skipped the section relevant for understanding what's bad practice about the above code... Well, thanks in advance for your help!
Since the chapter is about testing, consider this:
When testing your function, you are also (implicitly) testing DBI. This is why it's bad.
A good test always checks only one piece of functionality. To guarantee this, you would not use DBI directly, but a mock object instead. That way, if your test fails, you know it's your function and not something else in another module (like DBI in your example).
I think what Brian was trying to say by "poorly written" is that you do not have a separation between business logic and data access code (and database connection mechanics, while at it).
A correct approach to writing functions is that a function (or method) should do one thing, not 3 things at once.
As a result of this big lump of functionality, when testing, you have to test ALL THREE at the same time, which is difficult (see discussion of using "test SQLite DB" in those paragraphs). Or, as an alternative, do what the chapter was devoted to, and mock the DBI object to test the business logic by pretending that the data access AND DB setup worked a certain way.
But mocking a complicated object like DBI is very, very hard to do right.
What if the database is not accessible? What if there's blocking? What if your query has a syntax error? What if the DB connection times out when executing the query? What if...
Good test code tests ALL those error situations and more.
A more correct approach (pattern) for the code would be:
my $dbh = set_up_dbh();
my $query = qq[select key, calculate(value) from my_table];
my $data = retrieve_data($dbh, $query);
# Now, we don't need to test setting up database connection AND data retrieval
my $calc_results = calculate_results($data);
This way, to test the logic in calculate_results (e.g. summing the data), you merely need to mock DATA passed to it, which is very easy (in many cases, you just store several sets of test data in some test config); as opposed to mocking the behavior of a complicated DBI object used to retrieve the data.
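For instance, once the pure logic is split out, it can be fed plain mocked rows with no DBI anywhere in sight (calculate_results here is a hypothetical sum-per-key implementation, not code from the book):

```perl
use strict;
use warnings;

# Pure-logic half: sums values per key from plain [key, value] rows
sub calculate_results {
    my ($rows) = @_;
    my %results;
    $results{ $_->[0] } += $_->[1] for @$rows;
    return \%results;
}

# Mocked data stands in for whatever retrieve_data($dbh, $query) would return
my $data = [ [ foo => 2 ], [ bar => 5 ], [ foo => 3 ] ];
my $sums = calculate_results($data);   # { foo => 5, bar => 5 }
```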
There is nothing wrong with using DBI by itself.
The clue is in the fact that this is the testing chapter. I assume the issue being pointed out is that the function opens and closes a database connection itself. It should instead expect a database handle as a parameter and just run queries on it, leaving any concerns about opening and closing a database connection to its caller. That narrows the function's job, which in turn makes it more flexible.
It also makes the function easier to test: just pass it a mock object as a database handle. As it is currently written, you need at least to redefine DBI::connect to test it, which isn't hard, but is definitely messy.
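A sketch of that narrower shape, plus the kind of mock handle it enables; both the reworked function and the MockDbh/MockSth classes are illustrative (they implement just enough of the DBI handle interface for this one function):

```perl
use strict;
use warnings;

# The function now only runs its query on a handle it is given
sub sum_values_per_key {
    my ($dbh) = @_;
    my %results;
    my $sth = $dbh->prepare('select key, calculate(value) from my_table');
    $sth->execute();
    while ( my ($key, $value) = $sth->fetchrow_array ) {
        $results{$key} += $value;
    }
    return \%results;
}

# Minimal mocks: just enough of the DBI interface for the function above
package MockSth;
sub new     { my ($class, @rows) = @_; bless { rows => [@rows] }, $class }
sub execute { 1 }
sub fetchrow_array {
    my ($self) = @_;
    my $row = shift @{ $self->{rows} } or return;
    return @$row;
}

package MockDbh;
sub new     { my ($class, @rows) = @_; bless { rows => [@rows] }, $class }
sub prepare { MockSth->new( @{ $_[0]{rows} } ) }

package main;

my $dbh  = MockDbh->new( [ a => 1 ], [ a => 2 ], [ b => 7 ] );
my $sums = sum_values_per_key($dbh);   # { a => 3, b => 7 }
```

No DBI::connect redefinition needed: the test fully controls what "the database" returns.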
A method called sum_values_per_key should be interested in summing the values of some keys, not fetching the data to be summed.
It does not meet the S (Single responsibility principle) of SOLID programming. http://en.wikipedia.org/wiki/Solid_%28object-oriented_design%29
This means that it is both:
Not reusable if you wish to use different source data.
Difficult to test in an environment without a database connection.
1) Suppose you have a dozen objects, each with a dozen methods like this, and twenty of those methods are called during the execution of the main program. You have now made 20 DB connections where you only needed one.
2) Suppose you are not happy with plain DBI and extend it with My::DBI. You now have to rewrite 144 functions in 12 files.
(Apache::DBI might be an example here.)
3) You have to carry 3 extra positional parameters in each call to those 144 functions. The human brain works well with about 7 objects at a time; you have just wasted almost half that space. This makes the code less maintainable.

Why would the rollback method not be available for a DBI handle?

For some reason I am having trouble with a DBI handle. Basically what happened was that I wrote a special connect function in a Perl module and switched from doing:
do 'foo.pl'
to
use Foo;
and then I do
$dbh = Foo->connect;
And now for some reason I keep getting the error:
Can't locate object method "rollback" via package "Foo" at ../Foo.pm line 171.
So the weird thing is that $dbh is definitely not a Foo; it's just defined in Foo. Anyway, I hadn't had any trouble with it up until now. Any ideas what's up?
Edit: @Axeman: connect did not exist in the original. Before, we just had a string that we used like this:
do 'foo.pl';
$dbh = DBI->connect($DBConnectString);
and so connect is something like this
sub connect {
    my $dbh = DBI->connect('blah');
    return $dbh;
}
We need to see the actual code in Foo to be able to answer this. You probably want to read Subclassing the DBI from the documentation to see how to do this properly.
Basically, you either need Foo to subclass DBI properly (again, you'll need to read the docs), or you need to declare a connect function that properly delegates to the DBI::connect method. Be careful about writing a procedural wrapper for OO code, though; it gets awfully hard to maintain state that way.
From perlfunc:
do 'stat.pl';
is just like
eval `cat stat.pl`;
So when you do 'foo.pl', you execute the code in the current context. Because I don't know what goes on in foo.pl or Foo.pm, I can't tell you what's changed. But I can tell you that it is always executed in the current context, and now it executes in the Foo:: namespace.
The way you're calling this, you are passing 'Foo' as the first parameter to Foo::connect (or to the sub returned by Foo->can('connect')). It seems that somehow that value ends up in code that thinks it's a database handle, and that code is telling the object to roll back.
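That first-argument behavior is easy to demonstrate in isolation (the Foo below is a toy package that just reports its arguments, not the asker's module):

```perl
use strict;
use warnings;

# Toy package: connect just returns whatever arguments it received
package Foo;
sub connect {
    my @args = @_;
    return \@args;
}

package main;

my $method_call   = Foo->connect('blah');   # first arg is the class name 'Foo'
my $function_call = Foo::connect('blah');   # first arg is 'blah' itself
```

So with Foo->connect, the DSN string is the second argument, not the first, which is exactly how 'Foo' can leak into code expecting a handle.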
I agree with Axeman. You should probably be calling your function using
use Foo;
...
$dbh = Foo::connect();
instead of Foo->connect();