I am looking for a DBI (or similar) proxy that supports both SQL restrictions and transactions. The two I know about are:
DBD::Proxy
DBD::Gofer
DBD::Proxy
The problem I have found with DBD::Proxy is that its server, DBI::ProxyServer, doesn't just restrict queries coming in over the network (which I want); it also restricts queries generated internally by the database driver. So, for example, with DBD::Oracle, ping no longer works, and neither do many of the other queries the driver issues itself.
I can't just allow them, because:
That would require encoding quite a bit of internal knowledge of DBD::Oracle and would be quite fragile.
The whitelist is query_name => 'sql', where query_name is the first word of whatever is passed to prepare. DBD::Oracle has a lot of internal queries, and the first word of many of them is select (duh).
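For reference, here's roughly the shape of such a whitelist in a DBI::ProxyServer config file (the mask, query names and SQL are invented for illustration):

    # dbiproxy config file: a Perl file that returns a hashref.
    # The 'sql' map is the whitelist; the server takes the first word
    # of whatever the client passes to prepare() and looks it up here.
    {
        facility => 'daemon',
        clients  => [
            {
                mask   => '^192\.168\.1\.\d+$',  # which clients this applies to
                accept => 1,
                sql    => {
                    select_user => 'SELECT name, email FROM users WHERE id = ?',
                    insert_log  => 'INSERT INTO log (msg) VALUES (?)',
                },
            },
        ],
    }

Any internal DBD::Oracle query beginning with select would be matched against a single select key here, which is exactly the collision described above.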
So, it doesn't seem I can use DBD::Proxy.
DBD::Gofer
I haven't tried DBD::Gofer, because the docs seem to tell me that I can't use transactions through it:
CONSTRAINTS
...
You can’t use transactions
AutoCommit only. Transactions aren’t supported.
So, before I write my own application-specific proxy (using RPC::PlServer?), is there code out there that solves this problem?
This question would be best asked on the DBI Users mailing list, dbi-users@perl.org.
Sign up at http://dbi.perl.org/
I'm not sure what you mean about DBD::Proxy restricting queries. On the only occasion I've used it, it didn't modify the queries at all.
I've been scouring the MongoDB documentation, Google, Stack Overflow and YouTube... but I still can't seem to understand what a driver is used for in MongoDB.
I do know that different programming languages can have one or many different drivers - but why do I need one?
You don't strictly speaking need one, but the alternative is building network packets manually scattered around in your code base... The term 'driver' is a bit irritating, because most people expect some kernel-level program that talks to hardware.
The MongoDB driver is more like an SDK or a helper library that helps you with a number of tasks that you'll almost certainly need to solve when you want to use MongoDB.
In essence, the MongoDB driver does these things:
1. It implements the MongoDB wire protocol used to talk to the database, i.e. it knows what 'messages' the database expects, it knows the relevant constants, etc. 'It implements the MongoDB API', if you will.
2. It also comes with helpers to manage the actual TCP/IP sockets: creating them, resolving replica set addresses, implementing connection pooling, etc.
3. Next, the drivers contain helpers that make it easier to work with the BSON datatypes from your language, since there normally isn't a 1:1 mapping of types. A MongoDB array, for example, could be mapped to an array or some kind of list or set container in most languages; ObjectId and ISODate might need a wrapper, and so on.
4. Lastly, the driver implements a serializer, that is, a piece of software that can create a copy of an instance 'from the outside', i.e. without you having to implement a Serialize() method on each and every class (or whatever concept your language supports) you want to store. Together with 3), this writes the BSON representation of your data.
Serialization in itself isn't trivial, because one quickly has to cope with cyclical references, so a recursive algorithm on a set of unknown properties is required. If that doesn't sound complicated enough, the de-serialization (or hydration) of objects is even more painful, so it's not exactly the type of code that is super rewarding to write, unless it's highly reusable.
I'm sure I forgot something else the drivers do, but I think these are the key pain points they solve. As far as I know, their exact feature set varies from language to language and in some languages, the individual problems might be less or more pronounced, but they generally exist everywhere.
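To make that list concrete, here's roughly what those layers look like from the caller's side, sketched with the Perl driver (the connection string, database and field names are made up; other languages' drivers expose the same ideas under different APIs):

    use MongoDB;

    # The driver hides the wire protocol, socket handling and
    # replica-set resolution behind this single call (points 1 and 2).
    my $client = MongoDB->connect('mongodb://localhost:27017');
    my $coll   = $client->ns('mydb.users');

    # Type mapping and serialization (points 3 and 4): the plain
    # hashref below is turned into BSON for us, and BSON types come
    # back as wrapper objects rather than raw bytes.
    $coll->insert_one({ name => 'Ada', tags => [ 'admin', 'dev' ] });
    my $doc = $coll->find_one({ name => 'Ada' });
    print $doc->{_id}, "\n";    # an ObjectId wrapper, not a plain string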
I have just moved to PostgreSQL after having worked with Oracle for a few years.
I have been looking into some performance issues with prepared statements in the application (Java, JDBC) with the PostgreSQL database.
Oracle caches prepared statements in its SGA - the pool of prepared statements is shared across database connections.
The PostgreSQL documentation does not seem to indicate this. Here's the relevant snippet (https://www.postgresql.org/docs/current/static/sql-prepare.html):
Prepared statements only last for the duration of the current database session. When the session ends, the prepared statement is forgotten, so it must be recreated before being used again. This also means that a single prepared statement cannot be used by multiple simultaneous database clients; however, each client can create their own prepared statement to use.
I just want to make sure that I am understanding this right, because it seems so basic for a database to implement some sort of common pool of commonly executed prepared statements.
If PostgreSQL does not cache these, that would mean every application that expects a lot of database transactions needs to develop some sort of prepared statement pool that can be re-used across connections.
If you have worked with PostgreSQL before, I would appreciate any insight into this.
Yes, your understanding is correct. Typically, if you have a set of prepared queries that are that critical, you'd have the application call a custom function to set them up on connection.
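A minimal sketch of that, using Perl's DBI since that's the toolkit elsewhere on this page (a JDBC connection-init hook would play the same role; the query and credentials are invented):

    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=mydb', 'user', 'secret',
                           { RaiseError => 1 });

    # Statements are per-session in PostgreSQL, so they must be
    # prepared again on each new connection; prepare_cached() keeps a
    # cache on the handle so repeated calls reuse the same statement.
    sub fetch_user {
        my ($dbh, $id) = @_;
        my $sth = $dbh->prepare_cached(
            'SELECT name, email FROM users WHERE id = ?');
        $sth->execute($id);
        return $sth->fetchrow_hashref;
    }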
There are three key reasons for this afaik:
There's a long todo list, and items get done when a developer is interested in them or paid to tackle them. Presumably no one has yet thought this worth funding, or come up with an efficient way of doing it.
PostgreSQL runs in a much wider range of environments than Oracle. I would guess that 99% of installed systems wouldn't see much benefit from this. There are an awful lot of setups without high-transaction performance requirements, or for that matter a DBA to notice whether it's needed or not.
Planned queries don't always provide a win. There's been considerable work done on delaying planning/invalidating caches to provide as good a fit as possible to the actual data and query parameters.
I'd suspect the best place to add something like this would be in one of the connection pools (pgbouncer/pgpool) but last time I checked such a feature wasn't there.
HTH
Whenever I watch a demo regarding the Entity Framework, the demonstrator simply sets up some tables and performs Inserts, Updates and Deletes using automatically created code stubs, but never shows any use of stored procedures. It seems to me that this is executing SQL from the client.
In my experience this is not particularly good practice, so I am presuming that my understanding of the Entity Framework is wrong.
Similarly, the WCF RIA Services demos use the EF, and the demos are always the same. Can anyone shed any light on how you would use EF in a typical Business Layer/Data Access Layer/Stored Procedures setup?
I think I am confused and shouldn't be!!?
There's nothing wrong with executing SQL from the client. Most (if not all) of the problems that it might cause are in fact not there when using something like EF. For instance:
Client-generated SQL might cause runtime syntax errors. This is unlikely, since the description of your query is mostly checked at compile time (and the generator itself emitting invalid SQL is also unlikely).
Client-generated SQL might be inefficient. This is not true with modern database software, which has query caches. EF works in a way that's compatible with query caches, i.e. it generates the same SQL consistently (as long as you use the same code consistently) and uses parameters for varying data.
Client-generated SQL might be insecure (SQL injection and whatnot). This is all handled by the generator, which uses parameters for your values and does not interpolate user input into the query itself; the sketch after this list shows the shape of the result.
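The shape of what the generator emits, sketched in DBI terms for consistency with the rest of this page (EF's actual SQL differs; table, column and credentials are invented):

    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=shop', 'user', 'secret',
                           { RaiseError => 1 });
    my $customer_id = 42;

    # The statement text is byte-for-byte identical on every call, so
    # the database's query/plan cache gets a hit; the user's value
    # travels out-of-band as a bound parameter and can never change
    # the structure of the query.
    my $sth = $dbh->prepare('SELECT * FROM orders WHERE customer_id = ?');
    $sth->execute($customer_id);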
Back in the old Client / Server days, it used to be considered good practice to do all db updates using stored procedures.
But now it's perfectly acceptable to have an O/RM generate SQL and run it directly against the DB.
Well, part of the reason why putting SQL in stored procedures is considered a good idea is that it gives you a level of abstraction: when database changes inevitably occur, you make the change in a single place (the proc) rather than a dozen places (everywhere the client-side SQL was being issued). Entity Framework provides this layer of abstraction through its data model, and you get the same advantage.
There are some other reasons why you might want to look at procs, like security granularity (only allowing certain users the right to execute), and some minor performance differences. Ultimately, you have to decide for yourself what the right trade-off is. EF is an attempt to dramatically reduce the developer time spent creating a data layer, with the trade-offs listed above.
never shows any use of stored procedures
Take a look at this video: Using Your Own Stored Procedures to Insert, Update and Delete Entities in Entity Framework.
Note that there are a lot of other videos on that topic there that are certainly worth watching!
Legend has it that Scott Hanselman once said, "It's not a real demo unless someone drags a datagrid" (p. 478, Silverlight 4 in Action, Pete Brown).
You have to remember that demos are all about selling software, and not at all about communicating best practice. So your observations about the demos are absolutely correct: they cover the basics and leave it to the observer to fill in the blanks.
As to your comment about stored procedures, and the various answers to your question about the generator: the generator is good, and getting better. However, there are certain circumstances where it will generate completely unusable queries (see my SO question here; it is also discussed on the ADO.NET team blog).
Therefore there are occasions when hand-crafted queries are your only recourse (whether by way of stored procs, table-valued functions, views, etc.).
I want to write some queries that will work in almost all databases without any SQLExceptions. So, where can I get the ANSI standards to write the queries?
Not sure that'll help you.
Vendors are touch and go as far as standards implementation goes, and the standards themselves are often imprecise enough that you could never write a query that would work with all implementors.
For example, SQL-92 defines the concatenation operator as ||, but neither MySQL nor MSSQL uses this (Oracle does). Vendor-independent string concatenation is impossible.
Similarly, a standard escape character is not specified, so how you handle that might not work across all vendors.
Having said that:
SQL 92:
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
Wiki article with links to SQL 99 ISO documents:
http://en.wikipedia.org/wiki/SQL:1999
From Wikipedia:
The SQL standard is not freely available. The whole standard may be purchased from the ISO as ISO/IEC 9075(1-4,9-11,13,14):2008.
Nevertheless, I would not advise you to follow this strategy, because no database engine follows any SQL standard (SQL 99, 2003, etc.) to the letter. All of them take liberties in the way they handle instructions or define variables (for example, when comparing two strings, different engines handle case sensitivity differently). A method that is very efficient with one engine can be terribly inefficient for another.
A suggestion would be to define a standard group of queries and develop different classes containing the specific implementation of each query for a given target RDBMS (see the sketch below).
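A minimal Perl/DBI sketch of that idea, using the concatenation difference mentioned earlier as the vendor-specific part (driver names are as DBI reports them; the table is invented):

    use DBI;

    # One logical query, several vendor-specific renderings, keyed by
    # the name of the driver behind the active connection.
    my %full_name_sql = (
        Oracle => q{SELECT first_name || ' ' || last_name FROM people},
        Pg     => q{SELECT first_name || ' ' || last_name FROM people},
        mysql  => q{SELECT CONCAT(first_name, ' ', last_name) FROM people},
        ODBC   => q{SELECT first_name + ' ' + last_name FROM people}, # MSSQL
    );

    sub full_names {
        my ($dbh) = @_;
        my $sql = $full_name_sql{ $dbh->{Driver}{Name} }
            or die "no query variant for driver '$dbh->{Driver}{Name}'";
        return $dbh->selectcol_arrayref($sql);
    }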
Hope this helped
Check out the BNF of the core SQL grammars available at http://savage.net.au/SQL/
This is part of the answer - the rest, as pointed out by Kiranu and MattMitchell, is that different vendors implement the standard differently. No DBMS adheres perfectly to even SQL-92, though most are pretty close.
One observation: the SQL standard says nothing about indexes - so there is no standard syntax for creating an index. It also says nothing about how to create a database; each vendor has their own mechanisms for doing that.
The SQL-92 standard is probably the one you want to target. I believe it's supported by most of the major RDBMSs.
Here is a less terse link. Sample content:
PostgreSQL: has views. Breaks the standard by not allowing updates to views...
DB2: conforms to at least SQL-92.
MSSQL: conforms to at least SQL-92.
MySQL: conforms to at least SQL-92.
Oracle: conforms to at least SQL-92.
Informix: conforms to at least SQL-92.
Something else you might consider, if you're using .NET, is to use the factory pattern in System.Data.Common which does a good job of abstracting provider specifics for a number of RDBMSs.
If you are trying to make a product that will work against multiple databases, I think trying to use only standard SQL is not the way to go, as other answers have indicated, because of the different 'interpretations' of the standard. Instead, you should if possible have some kind of data access layer in your application with a separate implementation for each database. Depending on what you are trying to do, there are tools such as Hibernate which will do a lot of the heavy lifting in this regard for you.
Is it possible to access an Apache::DBI database handle from a Perl script that isn't running under mod_perl?
What I am looking for is database pooling for my Perl scripts; I have a fair number of database sources (Oracle/MySQL) and a growing number of scripts.
Some ideas, like SQLRelay, using Oracle 10g XE with database links and pooling, or converting all the scripts to SOAP calls, are becoming more and more viable. But if there were a mechanism for reusing Apache::DBI, I could fight this a bit.
I have no non-perl requirements, so we don't have a php/jdbc implementation or similar to deal with.
Thanks
First off, it helps to remember that DBI/DBD is not a wire protocol but an API over diverse data sources.
Since you want to connect to a pool of database connections from separate processes, DBIx::Connector is not appropriate for that, and Rose::DB seems an odd choice too (they are both wrappers over DBI). You are looking for something like DBD::Proxy or DBD::Gofer, which let you connect multiple processes to a shared database handle.
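In DSN terms the switch is confined to the connect string; something like this, where the host, port, credentials and inner Oracle DSN are illustrative:

    use DBI;

    # Each script connects to the shared proxy server rather than to
    # the database directly; the proxy end holds the real connections.
    my $dsn = 'dbi:Proxy:hostname=dbproxy.example.com;port=3334;'
            . 'dsn=dbi:Oracle:orcl';
    my $dbh = DBI->connect($dsn, 'scott', 'tiger', { RaiseError => 1 });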