What happens if I assign two DBI connections in a row to a variable? - perl

What happens to the first connection if I assign two DBI connections in a row to a variable?
It doesn't feel like there is an implicit disconnect (no lag).
my $dbh = DBI->connect( $dsn, $user, $pass );
$dbh = DBI->connect( $dsn, $user, $pass );

The destructors of DBDs do close the connection.
I mean, it's possible that there's a DBD that leaks database connections out there, but that's highly unlikely and it would be a bug.
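To see the reference-counting behaviour this answer relies on without involving a database, here is a minimal sketch. FakeHandle is a hypothetical stand-in class, not part of DBI; all it does is log when an object is created and when its DESTROY method runs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a database handle; this is NOT DBI.
package FakeHandle;

our @log;    # records the order of events

sub new {
    my ($class, $name) = @_;
    push @log, "connect $name";
    return bless { name => $name }, $class;
}

sub DESTROY {
    my ($self) = @_;
    push @log, "destroy $self->{name}";
}

package main;

my $h = FakeHandle->new('first');
$h = FakeHandle->new('second');    # the only reference to 'first' is dropped here
push @FakeHandle::log, 'after reassignment';

print "$_\n" for @FakeHandle::log;
```

The log shows 'destroy first' before 'after reassignment': Perl runs the destructor as soon as the last reference goes away. Whether a given DBD's destructor then closes the connection is up to that driver.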

One may connect multiple times with the exact same parameters. So I don't see a reason for connect to be checking whether there is any existing connection, and certainly not to touch (disconnect) it.
However, the object for this new connection then overwrites the Perl variable $dbh. At this point anything could happen since DBI uses tied variables, but I don't see in the DBI sources that this would trigger the destructor (DESTROY method), nor can I see it in the sources of the drivers I looked at (MySQL and PostgreSQL).
So it seems to me that in this way the handle on the first connection is just lost, and you effectively got a leak. If that is the case then it is also undefined what happens to possibly open transactions in the first connection.
So I'd add a check on $dbh before connecting, using (defined and) state and/or ping. Then, if previous work with it is indeed done and you want to overwrite it, first close it if needed. (Another option is to structure your code so that a new connection goes into a newly declared variable.)
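The "check, then close, then reconnect" idea could be sketched like this. replace_handle and FakeDbh are hypothetical names used only for illustration; with real DBI, the object would be a database handle (which does provide ping and disconnect) and the code ref would call DBI->connect.

```perl
use strict;
use warnings;

# Hypothetical helper: cleanly close $old if it is still usable, then
# return a new handle produced by the caller-supplied $connect code ref.
sub replace_handle {
    my ($old, $connect) = @_;
    $old->disconnect if defined $old && $old->ping;
    return $connect->();
}

# Stand-in for a handle so the sketch runs without a database.
package FakeDbh;
sub new        { return bless { open => 1 }, shift }
sub ping       { return $_[0]{open} }
sub disconnect { $_[0]{open} = 0; return 1 }

package main;

my $first = FakeDbh->new;
my $dbh   = $first;
$dbh = replace_handle($dbh, sub { FakeDbh->new });
```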

Related

How to check that a DB connection is valid [Perl]

In Perl, how can I check whether a DB connection object is still able to access the database?
When FCGI is running and the database gets disconnected or shut down, the FCGI process's DB connection object can no longer reach the database, but the error only shows up when binding or executing a query. How can I detect whether the connection object is still valid before binding or executing a query?
I assume you're talking about a DBI connection object. All DBI handles have a ping() method which will check whether or not the connection is still active.
The documentation says this:
ping
$rc = $dbh->ping;
Attempts to determine, in a reasonably efficient way, if the database server is still running and the connection to it is still working. Individual drivers should implement this function in the most suitable manner for their database engine.
The current default implementation always returns true without actually doing anything. Actually, it returns "0 but true" which is true but zero. That way you can tell if the return value is genuine or just the default. Drivers should override this method with one that does the right thing for their type of database.
Few applications would have direct use for this method. See the specialized Apache::DBI module for one example usage.
I think that's what you want.
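A minimal sketch of using ping before each unit of work might look like this. live_handle and FakePing are hypothetical names; with real DBI, $dbh would be a database handle and the code ref would call DBI->connect. Keep in mind the documentation's caveat that a driver which has not overridden ping always reports success.

```perl
use strict;
use warnings;

# Hypothetical helper: return $dbh if its ping succeeds, otherwise call
# $reconnect (a code ref supplied by the caller) to obtain a fresh handle.
sub live_handle {
    my ($dbh, $reconnect) = @_;
    return $dbh if defined $dbh && $dbh->ping;
    return $reconnect->();
}

# Stand-in for a handle so the sketch runs without a database.
package FakePing;
sub new  { my ($class, $alive) = @_; return bless { alive => $alive }, $class }
sub ping { return $_[0]{alive} }

package main;

my $alive = FakePing->new(1);
my $dead  = FakePing->new(0);
my $fresh = FakePing->new(1);

my $kept     = live_handle($alive, sub { $fresh });    # ping ok: same handle back
my $replaced = live_handle($dead,  sub { $fresh });    # ping failed: new handle
```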

How to hook after DBI reconnect

I deal with a module which provides an abstraction over DBI and reconnects automatically. I need to perform some actions after DBI->connect.
Is there a way to add a hook there without modifying this module? I don't have luck finding it in the documentation. Did I miss something?
The DBI doc has a chapter about subclassing which mentions a $dbh->connected method that does nothing. It seems to be exactly what you want.
When subclassing is being used then, after a successful new connect, the DBI->connect method automatically calls:
$dbh->connected($dsn, $user, $pass, \%attr);
I have not tried that, but it might work by just monkey-patching this connected method into DBI directly without subclassing anything. In connect there is definitely a call to connected.
But I am not sure where to patch that in. Possibly into the driver. A quick grep of CPAN shows that only two drivers bundled with the DBI distribution define it: DBD::Gofer and DBD::Proxy, and the latter's is empty. In both of them it's in the DBD::<drivername>::db package.
Let's assume you are doing MySQL, then you'd go and hook it into your driver. Either by subclassing and using that driver, or by simply monkey-patching it in.
*DBD::mysql::db::connected = sub {
    my ($dbh, $dsn, $user, $pass, $attr, $old_driver) = @_;
    warn 'Connected!';
};
This should work the same with other drivers, unless they have their own connected. In that case, you should probably wrap it manually or use something like Class::Method::Modifiers's around to wrap it to make sure the original behavior stays intact.
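Wrapping manually could look roughly like the following. Fake::Driver::db is a hypothetical package standing in for a driver that already defines connected, so the sketch runs without any DBD installed; the same pattern would apply to a real DBD::<drivername>::db package.

```perl
use strict;
use warnings;

# Hypothetical driver package that already has its own connected().
package Fake::Driver::db;
our @calls;
sub connected { push @calls, 'original'; return 1 }

package main;

{
    no warnings 'redefine';                       # we replace the sub on purpose
    my $orig = \&Fake::Driver::db::connected;     # keep the original
    *Fake::Driver::db::connected = sub {
        my @result = $orig->(@_);                 # run the original behaviour first
        push @Fake::Driver::db::calls, 'hook';    # then the extra action
        return @result;
    };
}

Fake::Driver::db->connected;
```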
You also have the actual connected $dbh at this point, so you can go rummaging around in the database in connected if you want.
Of course this would give you the callback after every connect. If you wanted to only get the reconnects, you could create a closure over a lexical variable that counts the connections and skip the very first one.
{
    my $connection_counter;
    *DBD::mysql::db::connected = sub {
        my ($dbh, $dsn, $user, $pass, $attr, $old_driver) = @_;
        return unless $connection_counter++; # skip first connection
        warn 'Connected!';
    };
}
Please note that I have not tested any of this.
I must say I have not tried this, but I found this article
http://justatheory.com/computers/databases/postgresql/execute-on-select.html
where @theory writes about an option to leverage DBI to run code whenever something happens, such as a connect. Hope it helps!

DBD::SQLite: database is locked: How to retry?

I use an SQLite database in parallel. Mostly for reading - which means everything works out great. But also for writing and dropping tables. And then suddenly I get this at random times (which indicates a race condition - which is expected running things in parallel):
Error: near line 1: database is locked
Now I know that in 10 ms the database will not be locked, so I would like to just wait 10 ms and try again, but I cannot find a way to catch that error.
How can I catch that error?
Update
Please take note of Georg Mavridis' comment above
It sounds like your child processes are sharing the same database connection and locking out one another
If you want true parallelism then you need to make multiple connections to your database. SQLite will queue requests from separate connections and resolve the conflict for you, unless that behaviour is disabled.
You need to design the error handling of your DBI application. There are three options that may be specified in the connect call
PrintError — on by default — this will cause a warning to be issued if an error arises
RaiseError — off by default — this will cause the process to die if an error arises
HandleError — unset by default — this option must be set to a subroutine reference, and that subroutine will be called if an error arises
If you expect no database errors then it is probably best to use
my $dbh = DBI->connect( ..., { PrintError => 0, RaiseError => 1 } )
Then you can enable error handling for parts of the code where something may go wrong and you want to attempt to fix it
The DBI documentation for the RaiseError option says this
If you want to temporarily turn RaiseError off (inside a library function that is likely to fail, for example), the recommended way is like this:
{
    local $h->{RaiseError}; # localize and turn off for this block
    ...
}
That way, the RaiseError option is implicitly turned back on at the closing brace, and within the block you can check the values returned by execute, which says whether the operation was successful, and errstr, which gives details of the type of error sustained. You can then write retry code in Perl to do whatever you want
The standard sleep call will suspend a process, but has a granularity of one second. If you cannot afford for your program to wait that long between retries, then take a look at the usleep function from the Time::HiRes module, which takes multiples of one microsecond
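The retry loop itself can be sketched independently of DBI. retry is a hypothetical helper, and the operation below only simulates a statement that fails twice before succeeding; usleep from Time::HiRes is the one real library call.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep);

# Hypothetical helper: call $action until it returns true, sleeping
# $delay_us microseconds between attempts, at most $max_tries times.
# Returns the number of the successful attempt, or 0 if all failed.
sub retry {
    my ($action, $max_tries, $delay_us) = @_;
    for my $try (1 .. $max_tries) {
        return $try if $action->($try);
        usleep($delay_us) if $try < $max_tries;
    }
    return 0;
}

# Simulated operation: fails twice (as a locked database might), then succeeds.
my $attempts     = 0;
my $tries_needed = retry(sub { ++$attempts >= 3 }, 5, 10_000);    # 10 ms between tries
print "succeeded after $tries_needed tries\n";
```

In real code the code ref would run the statement inside a block where RaiseError is localized, and return the value of execute so that a failure triggers another attempt.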

What is wrong with accessing DBI directly?

I'm currently reading Effective Perl Programming (2nd edition). I have come across a piece of code which was described as being poorly written, but I don't yet understand what's so bad about it, or how it should be improved. It would be great if someone could explain the matter to me.
Here's the code in question:
sub sum_values_per_key {
    my ( $class, $dsn, $user, $password, $parameters ) = @_;
    my %results;
    my $dbh =
        DBI->connect( $dsn, $user, $password, $parameters );
    my $sth = $dbh->prepare(
        'select key, calculate(value) from my_table');
    $sth->execute();
    # ... fill %results ...
    $sth->finish();
    $dbh->disconnect();
    return \%results;
}
The example comes from the chapter on testing your code (p. 324/325). The sentence that has left me wondering about how to improve the code is the following:
Since the code was poorly written and accesses DBI directly, you'll have to create a fake DBI object to stand in for the real thing.
I have probably not understood a lot of what the book has so far been trying to teach me, or I have skipped the section relevant for understanding what's bad practice about the above code... Well, thanks in advance for your help!
Since the chapter is about testing, consider this:
When testing your function, you are also (implicitly) testing DBI. This is why it's bad.
Good testing always only checks one functionality. To guarantee this, it would be required
to not use DBI directly, but use a mock object instead. This way, if your test fails, you
know it's your function and not something else in another module (like DBI in your example).
I think what Brian was trying to say by "poorly written" is that you do not have a separation between business logic and data access code (and database connection mechanics, while at it).
A correct approach to writing functions is that a function (or method) should do one thing, not 3 things at once.
As a result of this big lump of functionality, when testing, you have to test ALL THREE at the same time, which is difficult (see discussion of using "test SQLite DB" in those paragraphs). Or, as an alternative, do what the chapter was devoted to, and mock the DBI object to test the business logic by pretending that the data access AND DB setup worked a certain way.
But mocking an object with behaviour as complicated as DBI's is very hard to do right.
What if the database is not accessible? What if there's blocking? What if your query has a syntax error? What if the DB connection times out when executing the query? What if...
Good test code tests ALL those error situations and more.
A more correct approach (pattern) for the code would be:
my $dbh = set_up_dbh();
my $query = qq[select key, calculate(value) from my_table];
my $data = retrieve_data($dbh, $query);
# Now, we don't need to test setting up database connection AND data retrieval
my $calc_results = calculate_results($data);
This way, to test the logic in calculate_results (e.g. summing the data), you merely need to mock DATA passed to it, which is very easy (in many cases, you just store several sets of test data in some test config); as opposed to mocking the behavior of a complicated DBI object used to retrieve the data.
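For instance, a test of the calculation step needs nothing but plain data. The body of calculate_results below is an assumption (summing the second column per key); the original only names the function.

```perl
use strict;
use warnings;

# Assumed implementation: sum the second column per key in the first column.
sub calculate_results {
    my ($data) = @_;
    my %sum;
    $sum{ $_->[0] } += $_->[1] for @$data;
    return \%sum;
}

# Test data replaces the database entirely; no DBI mock is needed.
my $test_data    = [ [ apple => 2 ], [ pear => 5 ], [ apple => 3 ] ];
my $calc_results = calculate_results($test_data);
```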
There is nothing wrong with using DBI by itself.
The clue is in the fact that this is the testing chapter. I assume the issue being pointed out is that the function opens and closes a database connection itself. It should instead expect a database handle as a parameter and just run queries on it, leaving any concerns about opening and closing a database connection to its caller. That will make the job of the function more narrow, so it makes the function more flexible.
That in turn also makes the function easier to test: just pass it a mock object as a database handle. As it is currently written, you need at least to redefine DBI::connect to test it, which isn’t hard, but is definitely messy.
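A rough sketch of that approach follows. MockDbh and MockSth are hypothetical classes implementing only the methods the function calls, and the loop that fills %results is an assumption, since the book's code elides that part.

```perl
use strict;
use warnings;

# Variant that receives a handle instead of connecting itself.
sub sum_values_per_key {
    my ($dbh) = @_;
    my %results;
    my $sth = $dbh->prepare('select key, calculate(value) from my_table');
    $sth->execute();
    # Assumed way of filling %results: add up the values for each key.
    while ( my ($key, $value) = $sth->fetchrow_array ) {
        $results{$key} += $value;
    }
    return \%results;
}

# Hypothetical mocks: just enough behaviour for the function above.
package MockSth;
sub new            { my ($class, @rows) = @_; return bless { rows => [@rows] }, $class }
sub execute        { return 1 }
sub fetchrow_array { my $row = shift @{ $_[0]{rows} }; return $row ? @$row : () }

package MockDbh;
sub new     { my ($class, @rows) = @_; return bless { rows => [@rows] }, $class }
sub prepare { return MockSth->new( @{ $_[0]{rows} } ) }

package main;

my $mock   = MockDbh->new( [ a => 1 ], [ b => 2 ], [ a => 3 ] );
my $totals = sum_values_per_key($mock);
```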
A method called sum_values_per_key should be interested in summing the values of some keys, not fetching the data to be summed.
It does not meet the S (Single responsibility principle) of SOLID programming. http://en.wikipedia.org/wiki/Solid_%28object-oriented_design%29
This means that it is both:
Not reusable if you wish to use different source data.
Difficult to test in an environment without a database connection.
1) Suppose you have a dozen objects each with a dozen methods like this. Twenty of those methods will be called during the execution of the main program. You now have made 20 DB connections where you only need one.
2) Suppose you are not happy with original DBI and extended it with My::DBI. You now have to rewrite 144 functions in 12 files.
(Apache::DBI might be an example here).
3) You have to carry 3 positional parameters in each call to those 144 functions. The human brain works well with about 7 objects at a time; you have just wasted almost half that space. This makes the code less maintainable.

Why would the rollback method not be available for a DBI handle?

For some reason I am having troubles with a DBI handle. Basically what happened was that I made a special connect function in a perl module and switched from doing:
do 'foo.pl'
to
use Foo;
and then I do
$dbh = Foo->connect;
And now for some reason I keep getting the error:
Can't locate object method "rollback" via package "Foo" at ../Foo.pm line 171.
So the weird thing is that $dbh is definitely not a Foo, it's just defined in foo. Anyway, I haven't had any troubles with it up until now. Any ideas what's up?
Edit: @Axeman: connect did not exist in the original. Before we just had a string that we used like this:
do 'foo.pl';
$dbh = DBI->connect($DBConnectString);
and so connect is something like this
sub connect {
    my $dbh = DBI->connect('blah');
    return $dbh;
}
We need to see the actual code in Foo to be able to answer this. You probably want to read Subclassing the DBI from the documentation to see how to do this properly.
Basically, you either need Foo to subclass DBI properly (again, you'll need to read the docs), or you need to declare a connect function that properly delegates to the DBI::connect method. Be careful about writing a procedural wrapper for OO code, though. It gets awfully hard to maintain state that way.
From perlfunc:
do 'stat.pl';
is just like
eval `cat stat.pl`;
So when you do 'foo.pl', you execute the code in the current context. Because I don't know what goes on in foo.pl or Foo.pm, I can't tell you what's changed. But I can tell you that with do it was always executed in the current context, and now it executes in the Foo:: namespace.
The way you're calling this, you are passing 'Foo' as the first parameter to Foo::connect or the returned sub from Foo->can('connect'). It seems that somehow that's being passed to some code that thinks it's a database handle, and that's telling that object to rollback.
I agree with Axeman. You should probably be calling your function using
use Foo;
...
$dbh = Foo::connect();
instead of Foo->connect();
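The difference between the two call styles is easy to see with a toy version of the module. This Foo is a hypothetical stand-in whose connect simply returns its argument list instead of opening a connection.

```perl
use strict;
use warnings;

package Foo;
# Returns its arguments so the difference between the two calls is visible.
sub connect { return [@_] }

package main;

my $as_method   = Foo->connect;      # method call: the class name is prepended
my $as_function = Foo::connect();    # plain function call: no extra argument
print "method call args:   (@$as_method)\n";
print "function call args: (@$as_function)\n";
```

The method call receives 'Foo' as its first argument, while the function call receives nothing. Any code in connect that treats its first argument as a handle would therefore end up calling methods on the string 'Foo'.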