Where can I get the ANSI or ISO standards for the RDBMS queries? - rdbms

I want to write some queries which can work in almost all the databases without any SQLExceptions. So, where can I get the ANSI standards to write the queries ?

Not sure that'll help you.
Vendors are touch and go as far as standards implementation and often the standards themselves are imprecise enough such that you could never write a query that would work with all implementors.
For example, SQL 92 defines the concatenation operator as || but neither MySQL nor MSSQL use this (Oracle does). Vendor independent string concatenation is impossible.
Similarly, a standard escape character is not specified so how you handled that might not work in all vendors.
Having said that:
SQL 92:
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
Wiki article with links to SQL 99 ISO documents:
http://en.wikipedia.org/wiki/SQL:1999

From wikipedia:
The SQL standard is not freely available. The whole standard may be purchased from the ISO as ISO/IEC 9075(1-4,9-11,13,14):2008.
Nevertheless I would not advise you to follow this strategy because no database engine follows any SQL standard (SQL 99, 2003, etc.) to the letter. All of them take liberties in the way they handle instructions or define variables (for example, when comparing two strings different engines handle case sensitivity differently). A method that is very efficient with one engine can be terrible inefficient for another.
A suggestion would be to develop a standard group of queries and develop different classes that contain the specific implementation of that query for a certain target RDBMS.
Hope this helped

Check out the BNF of the core SQL grammars available at http://savage.net.au/SQL/
This is part of the answer - the rest, as pointed out by Kiranu and MattMitchell, is that different vendors implement the standard differently. No DBMS adheres perfectly to even SQL-92, though most are pretty close.
One observation: the SQL standard says nothing about indexes - so there is no standard syntax for creating an index. It also says nothing about how to create a database; each vendor has their own mechanisms for doing that.

The Sql-92 standard is probably the one you want to target. I believe it's supported most of the major RDBMSs.
Here is a less terse link. Sample content:
PostgreSQL Has views. Breaks standard by not allowing updates to views...
DB2 Conforms to at least SQL-92.
MSSQL Conforms to at least SQL-92.
MySQL Conforms to at least SQL-92.
Oracle Conforms to at least SQL-92.
Informix Conforms to at least SQL-92.
Something else you might consider, if you're using .NET, is to use the factory pattern in System.Data.Common which does a good job of abstracting provider specifics for a number of RDBMSs.

If you are trying to make a product that will work against multiple databases I think trying to only use standard sql is not the way to go, as other answers have indicated, due to the different 'interpretations' of the standard. Instead you should if possible have some kind of data access layer in your application which has different implementations specific for each database. Depending on what you are trying to do, there are tools such as Hibernate which will so a lot of the heavy lifting in regards to this for you.

Related

Key value oriented database vs document oriented database

I have recently started learning NO SQL databases and I came across Key-Value oriented databases and Document oriented databases. Since they have a similar structure, aren't they saved and retrieved the exact same way? And if that is the case then why do we define them as separate types? Otherwise, how they are saved in the file system?
To get started it is better to pin point the least wrong vocabulary. What used to be called nosql is too broad in scope, and often there is no intersection feature-wise between two database that are dubbed nosql except for the fact that they somehow deal with "data". What program does not deal with data?! In the same spirit, I avoid the term Relational Database Management System (RDBMS). It is clear to most speakers and listeners that RDBMS is something among SQL Server, some kind of Oracle database, MySQL, PostgreSQL. It is fuzzy whether that includes SQLite, that is already an indicator, that "relational database" ain't the perfect word to describe the concept behind it. Even more so, what people usually call nosql never forbid relations. Even on top of "key-value" stores, one can build relations. In a Resource Description Framework database, the equivalent of SQL rows are called tuple, triple, quads and more generally and more simply: relations. Another example of relational database are database powered by datalog. So RDBMS and relational database is not a good word to describe the intended concepts, and when used by someone, only speak about the narrow view they have about the various paradigms that exists in the data(base) world.
In my opinion, it is better to speak of "SQL databases" that describe the databases that support a subset or superset of SQL programming language as defined by the ISO standard.
Then, the NoSQL wording makes sense: database that do not provide support for SQL programming language. In particular, that exclude Cassandra and Neo4J, that can be programmed with a language (respectivly CQL and Cypher / GQL) which surface syntax looks like SQL, but does not have the semantic of SQL (neither a superset, nor a subset of SQL). Remains Google BigQuery, which feels a lot like SQL, but I am not familiar enough with it to be able to draw a line.
Key-value store is also fuzzy. memcached, REDIS, foundationdb, wiredtiger, dbm, tokyo cabinet et. al are very different from each other and are used in verrrrrrrrrrry different use-cases.
Sorry, document-oriented database is not precise enough. Historically, they were two main databases, so called document database: ElasticSearch and MongoDB. And those yet-another-time, are very different software, and when used properly, do not solve the same problems.
You might have guessed it already, your question shows a lack of work, and as phrased, and even if I did not want to shave a yak regarding vocabulary related to databases, is too broad.
Since they have a similar structure,
No.
aren't they saved and retrieved the exact same way?
No.
And if that is the case then why do we define them as separate types?
Their programming interface, their deployment strategy and their internal structure, and intended use-cases are much different.
Otherwise, how they are saved in the file system?
That question alone is too broad, you need to ask a specific question at least explain your understanding of how one or more database work, and ask a question about where you want to go / what you want to understand. "How to go from point A-understanding (given), to point B-understanding (question)". In your question point A is absent, and point B is fuzzy or too-broad.
Moar:
First, make sure you have solid understanding of an SQL database, at the very least the SQL language (then dive into indices and at last fine-tuning). Without SQL knowledge, your are worthless on the job market. If you already have a good grasp of SQL, my recommendation is to forgo everything else but FoundationDB.
If you still want "benchmark" databases, first set a situation (real or imaginary) ie. a project that you know well, that requires a database. Try to fit several databases to solve the problems of that project.
Lastly, if you have a precise project in mind, try to answer the following questions, prior to asking another question on database-design:
What guarantees do you need. Question all the properties of ACID: Atomic, Consistent, Isolation, Durability. Look into BASE. You do not necessarily need ACID or BASE, but it is a good basis that is well documented to know where you want / need to go.
What is size of the data?
What is the shape of the data? Are they well defined types? Are they polymorphic types (heterogeneous shapes)?
Workload: Write-once then Read-only, mostly reads, mostly writes, a mix of both. Answer also the question how fast or slow can be writes or reads.
Querying: How queries look like: recursive / deep, columns or rows, or neighboor hood queries (like graphql and SQL without recursive queries do). Again what is the expected time to response.
Do not forgo to at least the review deployement and scaling strategies prior to commit to a particular solution.
On my side, I picked up foundationdb because it is the most versatile in those regards, even if at the moment it requires some code to be a drop-in replacement for all postgresql features.

What is a MongoDB driver in layman's terms?

I've been scouring the MongoDB documentation, Google, Stackoverflow and YouTube... but I still can't seem to understand what a driver is used for in MongoDB.
I do know that different programming language can have one or many different drivers - but why do I need one?
You don't strictly speaking need one, but the alternative is building network packets manually scattered around in your code base... The term 'driver' is a bit irritating, because most people expect some kernel-level program that talks to hardware.
The MongoDB driver is more like an SDK or a helper library that helps you with a number of tasks that you'll almost certainly need to solve when you want to use MongoDB.
In essence, the MongoDB driver does these things:
it implements the MongoDB wire protocol that is used to talk to the database, i.e. it knows what 'messages' the database expects, it knows relevant constants, etc. 'It implements the MongoDB API' if you will.
It also comes with helpers to manage the actual TCP/IP sockets, creating them, resolving replica set addresses, implementing connection pooling, etc.
Next, the drivers contain helpers that make it easier to work with the BSON datatypes from your language, since there normally isn't a 1:1 mapping of types. A mongodb array, for example, could be mapped to an array or some kind of list or set container in most languages; ObjectId and ISODate might need a wrapper, and so on.
Lastly, the driver implements a serializer, that is, a piece of software that can create a copy of an instance 'from the outside', that is, without you having to implement a Serialize() method on each and every class (or whatever concept your language supports) you want to store. Together with 3), this writes the BSON representation of your data.
Serialization in itself isn't trivial, because one quickly has to cope with cyclical references, so a recursive algorithm on a set of unknown properties is required. If that doesn't sound complicated enough, the de-serialization (or hydration) of objects is even more painful, so it's not exactly the type of code that is super rewarding to write, unless it's highly reusable.
I'm sure I forgot something else the drivers do, but I think these are the key pain points they solve. As far as I know, their exact feature set varies from language to language and in some languages, the individual problems might be less or more pronounced, but they generally exist everywhere.

Using MicroORM for read layer in CQRS

Folks,
Im considering using a microORM such as Dapper.net for the read access component of a CQRS application (Asp.Net MVC), with Entity Framework being used for manipulating the domain.
This is CQRS light, I am not using event sourcing etc. I have seen it mentioned several times that the read only model in CQRS should be light/simpleas possible querying the data layer, possible using something like ADO.net
That implies potentially hardcoding SQL Query strings in our code or in some XML file. How should I go about justifying this approach where we have to maintain the domain mappings on one side and SQL statements on another?
Has anyone used MicroORM's in a CQRS solution in this way?
Thanks
Mick
Yes, absolutely you can use Dapper, PetaPoco, Massive, Simple.Data, or any other micro ORM you would like. In the past we have used NHibernate to solve the problem but it was a 10,000 lbs. gorilla compared to what we needed.
One thing that we really liked about Simple.Data and Petapoco in our evaluation of those libraries was that they each could adapt your queries to different database engines (including Mongo) with minimal tweaking necessary, whereas Dapper was basically one big bunch of SQL strings--it was "stringly typed". Don't get me wrong, Dapper's great and is very, very fast and will absolutely work great. Just evaluate your functional and non-functional requirements before committing.
Here are the relative number of downloads using NuGet for each of the primary micro ORMs (as of about 1/1/2012). For us, having a good community with lots of downloads is always a must in order to help iron out issues when the arise:
5568 Simple.Data
4990 Petapoco
4913 Dapper
2203 Massive
1152 OrmLite
Lastly, one thing you may want to investigate is your reasoning behind SQL altogether for your read models. If your domain is publishing events (regardless of event sourcing), and you're writing to simple, flat/non-relational view models, you may be able to get away with something as simple as JSON files that are pushed to the browser which the browser then interprets and uses to populate your HTML templates. There's all kinds of options that are available, you just need to determine what works best in your scenario.

Smart way to evaluate what is the right NoSQL database for me?

There appears to be a myriad of NoSQL databases available these days:
CouchDB
MongoDB
Cassandra
Hadoop
There's also a boundary between these tools and tools such as Redis that work as a memcached replacement.
Without hand waving and throwing too many buzz words - my question is the following:
How does one intelligently decide which tool here makes the most sense for their project? Are the projects similar enough to where the answer to this is subjective, eg: Ruby is better than Python or Python is better than Ruby? Or are we talking Apples and oranges here in that they each of them solve different problems?
What's the best way to educate myself on this new trend?
Perhaps one way to think of it is, programming has recently evolved from using one general-purpose language for everything to using the general-purpose language for most things, plus domain-specific languages for the more appropriate parts. For example, you might use Lua to script artificial intelligence of a character in a game.
NoSQL databases might be similar. SQL is the general purpose database with the longest and broadest adoption. While it could be shoehorned to serve many tasks, programmers are beginning to use NoSQL as a domain-specific database when it is more appropriate.
I would argue, that the 4 major players you named do have quite different featuresets and try to solve different problems with different priority.
For instance, as far as i know Cassandra (and i assume Hadoop) central focus is on large scale installations.
MongoDb tries to be a better scaling alternative to classic SQL servers in providing comparably powerful query functions.
CouchDB's focus is comparably small scale (will not shard at all, "only" replicate), high durability and easy synchronization of data.
You might want to check out http://nosql-database.org/ for some more information.
I am facing pretty much the same problem as you, and i would say there is no real alternative to look at all solutions in detail.
Check out this site: http://cattell.net/datastores/ and in particular the PDF linked at the bottom (CACM Paper). The latter contains an excellent discussion of the relative merits of various data store solutions.
It's easy. NoSQL databases are ACID compliant databases minus some guarantees. So just decide which guarantees you can do without and find the database that fits. If you don't need durability for example, maybe redis is best. Or if you don't need multi-record transactions, then perhaps look into mongodb.

What is the benefit of using a "document-oriented DBMS"?

I must be missing something, because everything I've seen so far suggests that it isn't any more interesting than a single table for storing blobs and a second table for tags that apply to it.
Now I certainly can see some benefit to that from a design pattern, but why would I want to use a "document-oriented DBMS" instead of just building it using a traditional database like SQL Server, Oracle, or Postgres?
I enjoyed listening to the floss weekly episode about CouchDB. Lots of reasoning and ideas there.
Prior to listening, most of the stuff I read about this topic triggered not much insight (for me). Listening to people talking and reasoning about why&where you want to use document-oriented DBs helped me a lot to really get the concepts, reasoning, pros and cons. Now all the articles and statements (IMHO) suddenly make a lot more sense.
Your mileage may vary, but this helped me a lot.
I presume you are meaning products such as Couch DB or Tokyo Cabinet (rather than ECM products like Documentum). I think the attraction for many developers is familiarity.
Firstly, the conceptual model (in most cases) is key-value pairs, like a configuration file. As most frameworks seem to require a lot of configuration-wrangling, front-end/middle tier developers are comfortable with that way of working working. Secondly, these tools offer interfaces in developer-friendly languages like Java, Python, etc.
Whereas, traditional RDBMS products require thinking in a different fashion - relationally. And they require learning not just a weird language, SQL, but a new way of programming: set-based rather than procedural. If you rehearse the arguments for putting business logic in the middle tier rather stored procedures in the database, well a lot of them apply to No SQL as well.
why would I want to use a "document-oriented DBMS" instead of just
building it using a traditional database like SQL Server, Oracle, or
Postgres?
Historically document-oriented DBMS do not allow to span ACID transactions across documents. They have poor support for Foreign Keys. The trade ought to allow for more accessible operations, maintenance and scaling.