I need a document-oriented database for a project I am working on. I basically need two things: full ACID support and the ability to have references between documents. Scalability is not a major issue since the total number of users is at most 300.
I know MongoDB supports references between documents and CouchDB supports ACID but I have not found one that has both.
I am really trying to avoid implementing either (ACID, References) in the application layer. The obvious fallback is RDBMS with some tree structure implementation which I am also trying to avoid.
Any suggestions?
Thanks
You require ACID and full references, and CouchDB is not good for that.
You do not require scalability either. My guess is a database that is well-known wouldn't hurt either.
For those reasons, a relational database sounds appropriate.
Check out RavenDB - it has both ACID and transaction support, and it supports the notion of relations between documents via Includes and Live Projections. Denormalization can probably come in handy, too.
Don't use an RDBMS if your business logic says it doesn't like it.
You mentioned the constraints - you mentioned correctly what CouchDB/MongoDB give you.
So based on those facts: use your fallback.
I was reading this article https://www.infoq.com/articles/spring-data-intro to understand how a data service layer can be independent of the database (RDBMS/NoSQL). It looks like there's no way to design entities and repositories to be independent of the database. This article was written in 2012. Have any technologies appeared since then that implement this feature?
Before actually answering your question I have to ask: Why do you want to do that? Think twice because abstraction comes at a cost and just doing it in order to have a "clean" design is most certainly not worth it.
Now to your question:
There is no library or framework that does this out of the box.
Probably the thing that gets closest to it is Spring Data, which you are obviously aware of. If you stick to the storage-independent interfaces, your repositories will abstract over the persistence store in use, at least to some extent. BUT: you will have to provide different kinds of metadata (typically as annotations on the entities) to make it work. So in this sense, it is a really leaky abstraction.
Of course, you can roll your own: create an interface with the operations you need and provide implementations for the different data stores you want to use. Also, include a storage independent way of providing metadata.
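As a rough illustration of the "roll your own" option, here is a minimal sketch in Python (used only for brevity; the same shape works as Java interfaces). All names here (`UserStore`, `InMemoryUserStore`, `SqliteUserStore`, `register`) are made up for the example, and the interface deliberately offers only save and lookup by id, since that is about all that very different stores have in common.

```python
import json
import sqlite3
from typing import Dict, Optional, Protocol


class UserStore(Protocol):
    """Hypothetical storage-independent interface: only save and lookup by id."""

    def save(self, user_id: str, data: dict) -> None: ...

    def find_by_id(self, user_id: str) -> Optional[dict]: ...


class InMemoryUserStore:
    """Dictionary-backed implementation, handy for tests."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def save(self, user_id: str, data: dict) -> None:
        self._rows[user_id] = dict(data)

    def find_by_id(self, user_id: str) -> Optional[dict]:
        row = self._rows.get(user_id)
        return dict(row) if row is not None else None


class SqliteUserStore:
    """SQL-backed implementation that stores each user as a JSON string."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )

    def save(self, user_id: str, data: dict) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO users (id, body) VALUES (?, ?)",
            (user_id, json.dumps(data)),
        )
        self._conn.commit()

    def find_by_id(self, user_id: str) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT body FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


def register(store: UserStore, user_id: str, name: str) -> dict:
    """Business code sees only the interface, never the concrete store."""
    store.save(user_id, {"name": name})
    return store.find_by_id(user_id)
```

Calling `register(InMemoryUserStore(), "u1", "Ada")` and `register(SqliteUserStore(sqlite3.connect(":memory:")), "u1", "Ada")` gives the same result. Note what is missing, though: no joins, no ad hoc queries, no transactions. Adding any of them means every implementation has to support it.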
So the question becomes: Why hasn't anybody done that yet? And why should you probably not do it either?
It is hard: Just writing SQL in a way that all (relevant) SQL databases understand is difficult, see this for an example.
You lose a lot of the power of your stores. For example, RDBMSes are great at joining stuff. But joining is basically a no-go for many NoSQL databases. So your API probably shouldn't offer this feature. This basically dumbs everything down to the common denominator, which is going to be really small when you have very different data stores.
It is not worth it. This leads back to my opening question: Why would you want to do it anyway? I certainly see the use of switching different RDBMSes. Some companies only want certain vendors in their datacenter, it is nice to have an in memory variant for testing and so on. But switching a store from for example Oracle to Hazelcast and then to MongoDB to CSV? Why would you want to do that? What is the business value of that?
I'm developing a brand new project in Scala. It's just an application for a bunch of CRUD operations; however, because of some eccentric requirements, neither Play2 nor Lift fits the bill, so I'm going to develop the application from the ground up. This means that Anorm or ScalaQuery become less obvious choices for database integration, and leaves me with the question: is it time to try something new?
My past technology stacks mostly included Java and PostgreSQL and I have experience with both ORM and plain SQL. Are NoSQL database management systems like MongoDB a good replacement for a typical RDBMS or are they special case application data stores? Also, how does the choice of database affect the greater Scala system design (if at all)? For example, the fact that you are using a JSON-like interface to talk to the database, and JSON between the web and a REST service, does not mean that much if everything in the middle becomes Scala objects, or does it?
I'm basically asking for someone's experience on moving from relational to object/document type databases, using Scala in particular. I know that good RDBMS integration is promised in the upcoming release of SLICK. So, if a company like Typesafe decides to make an RDBMS integration part of the Typesafe stack, then will I be swimming upstream by integrating with MongoDB using Casbah, for example?
Apologies if this question appears a bit vague. I do hope that someone with the right insights or experience will be able to help though.
Update:
Apologies for not adding links to SLICK (it being fairly new). Here goes:
Quick overview
Project home
Update 2:
My personal first win for a technology is usually developer productivity - this translates to lightweight and simple: quick to learn, easy to maintain, no magic
I am currently in a similar situation, and since I have some experience with web development and SQL databases, I took it as an opportunity to work with MongoDB, Casbah (and Scalatra). My experience is still very limited and the project and the amount of data I am working with are pretty small, but here are a few observations I've made.
For the few sets of data I have, performance does not seem to motivate either SQL or NoSQL. However, performance in the presence of huge amounts of data is often listed as a reason for using NoSQL, e.g., by Wikipedia
My documents (entries in the database) arise from benchmarking test suites, and mainly have a static structure, and I am optimistic that I could store them in a fixed-schema SQL database. However, a few substructures are not static, e.g., new test cases are added, new statistics are tracked, others are removed. This was my main motivation for trying a schema-free NoSQL database. I also had the feeling that the document approach of MongoDB makes it much more obvious which data belongs together (i.e., to a document), in contrast to entries in a relational database, where the data would be distributed over various tables and rows, and where a full "document" would need to be reconstructed by joins.
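To make the contrast concrete, here is a small sketch in Python with the standard-library `sqlite3` module standing in for the SQL database. The table layout, the field names (`suite`, `cases`, `millis`, `gc_pauses`) and the helper functions are invented for illustration and are not my actual schema.

```python
import sqlite3

# Document style: everything about one benchmark run lives in one record.
run_document = {
    "_id": "run-1",
    "suite": "parser",
    "cases": [
        {"name": "small_input", "millis": 12},
        # A newly tracked statistic is just one more key, no schema change.
        {"name": "large_input", "millis": 340, "gc_pauses": 3},
    ],
}


def make_relational_db():
    """Fixed-schema equivalent: one table for runs, one for test cases."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE runs (id TEXT PRIMARY KEY, suite TEXT)")
    conn.execute(
        "CREATE TABLE cases (run_id TEXT REFERENCES runs(id), name TEXT, millis INTEGER)"
    )
    conn.execute("INSERT INTO runs VALUES ('run-1', 'parser')")
    conn.executemany(
        "INSERT INTO cases VALUES (?, ?, ?)",
        [("run-1", "small_input", 12), ("run-1", "large_input", 340)],
    )
    return conn


def load_run_relational(conn, run_id):
    """Rebuild the document shape from the two tables with a join."""
    rows = conn.execute(
        "SELECT r.id, r.suite, c.name, c.millis "
        "FROM runs r JOIN cases c ON c.run_id = r.id "
        "WHERE r.id = ? ORDER BY c.rowid",
        (run_id,),
    ).fetchall()
    if not rows:
        return None
    return {
        "_id": rows[0][0],
        "suite": rows[0][1],
        "cases": [{"name": name, "millis": millis} for _, _, name, millis in rows],
    }
```

The rebuilt run has the same overall shape as `run_document`, but the `gc_pauses` statistic has nowhere to go until the `cases` table gets a new column, which is exactly the kind of change I wanted to avoid.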
Tools such as Lift-Json or Rogue allow you to work with regular Scala objects in a type-safe way, although the data is regularly (de-)serialised to (from) JSON. However, this naturally works best if the structure of your data is mainly static; otherwise, you are left with using strings to access your data (e.g., for expanding the results of a query using Casbah).
If you are mainly concerned about a coherent representation of data on server and client side, languages such as Opa or Haxe might be of interest, since they compile to code that can be executed on both sides. See this page for "multitarget" or "tierless" languages.
Got too long for a comment. Was just trying to relate my short experience with Scala (about 6 months now, since about when Play2 came out--it's quickly become my go-to language).
I've enjoyed using Salat/Casbah with MongoDB in my last few projects; most have been in Play2, but the latest was without a webapp framework. It definitely hasn't felt like swimming upstream.
I would say that there are particular use cases for which I wouldn't use mongo, but it works nicely as a general purpose object data store, especially if you expect to query by id or index and don't need transactions (and will need minimal ad-hoc aggregation type stuff).
Expect to require a separate set of servers dedicated to mongodb (or to use a service dedicated to mongodb), but I guess that's normal for most serious database apps.
I've also used Play2/Anorm, which was surprisingly enjoyable to use for some ad-hoc query dashboard-style report pages. I started trying to go the Squeryl route, but Anorm seemed easier to use for one-off aggregation queries. Haven't looked at SLICK, but it sounds interesting.
It's really hard to say without knowing what problems you would like the app to solve.
I've personally found my productivity increased using NoSQL DBs via REST/JSON. Though bear in mind most NoSQL DBs offer REST interfaces which preclude the need for much middleware, Scala or otherwise, unless you intend to write a webapp with a UI.
If this is a learning exercise, I recommend you try multiple things out, as each NoSQL DB has something different to offer to your toolkit. I have personally found that CouchDB, Riak, Neo4j, and MongoDB all have various pluses and drawbacks and are good for different purposes.
Hope this helps, good luck.
I'm new to MongoDB. Can anyone explain how it could be used efficiently in enterprise applications, so as to give good performance (using joins, indexing etc.)?
And perhaps also point me to any MongoDB production applications on the web.
For a good introduction to MongoDB, check out The Little MongoDB Book. Here's a list of sites currently using MongoDB in production.
You talk about joins and indexes. It seems your head is still in the RDBMS world. NoSQL and Mongo are not just different relational databases; they're a completely different way of managing and thinking about data. You need to think of your data schema in terms of structured objects rather than rows.
Sounds like you need to start from the beginning. MongoDB.org has a lot of the info you're asking for already available. Specifically, read their page on use cases, and the page on production deployments.
A more specific question would receive more comprehensive answers and fewer downvotes.
What are the major differences between MongoDB and CouchDB, and are there any other major NO-SQL database-servers out there worth mentioning?
I know that CERN uses CouchDB somewhere in their LHC back-end; huge stamp of approval. What are MongoDB - and any other major servers' - references?
Update
One of the major selling points of CouchDB, to me, is the REST-based API and seamless JavaScript integration using JSON as a data-wrapper. Is this possible with any of the other NO-SQL databases mentioned?
There are many more differences, but some quick points:
CouchDB has MVCC (Multi Version Concurrency Control) - each time a document is updated, a NEW version of it is created. Whereas MongoDB is update-in-place.
CouchDB has support for multi-master, so you can write to any server. MongoDB only has one server active for writes (master-slave). However, I think this may have changed in the latest release (1.6), so MongoDB may now support multiple servers for writes.
To see who's using MongoDB see here (e.g. foursquare, bit.ly, sourceforge....)
To see who's using CouchDB see here.
The most notable other NoSQL database is Cassandra (facebook, twitter)
Then you have HBase, HyperTable, RavenDB, SimpleDB, and more still...
Welcome to some new ground. @AdaTheDev covered most of the major ones. There's also Project Voldemort, Tokyo Cabinet/Tyrant, and a whole bunch of wrappers around all of these things. People are also building MemcacheDB (memcache with a persistence layer).
MongoDB has several hooks to support "REST" APIs (check out "Sleepy Mongoose" and Node.js support). MongoDB and CouchDB have different ways of handling map-reduces (though they are somewhat similar). MongoDB does not have MVCC, but the two systems really have different ways of storing data each with their own set of trade-offs.
MongoDB uses language-specific drivers where CouchDB uses REST (performance trade-off).
For more detailed comparison look here.
MongoDB is probably a little easier for a relational developer to grasp since it uses drivers and has better support for ad hoc queries. CouchDB has very little in common with the old relational ways of doing things.
Both deal with sharding and replication differently.
Having said that, I believe both are conceptually similar enough that it often boils down to personal preference. They are all fun to code with. In fact, we evaluated both for an internal project and went back and forth with our decision.
From MongoDB's webpage I understand that they do not fully support transactions, if at all.
I wonder if they are ever going to support them in the future so that I can store financial information in MongoDB, instead of using an RDBMS for it.
And how is it with CouchDB, do they support transactions?
Neither of these supports transactions in the sense of the more traditional RDBMS - and it's unlikely they will. It's a tradeoff: supporting transactions in a distributed system is non-trivial and expensive.
MongoDB does not have ACID properties, and likely never will. CouchDB does give you ACID (I'm not sure if it does by default).
Both allow you to perform simple atomic operations on data, though, such as simple add/subtract on values.
See also
Can I do transactions and locks in CouchDB?
MongoDB transactions?
On that note, this podcast with one of the MongoDB guys should give you a brief overview of the problems many NoSQL systems try to solve, and the tradeoffs they make.
Yes, MongoDB doesn't support transactions out of the box, but you can implement optimistic transactions on your own. I wrote an example and some explanation on a GitHub page. I hope you'll find it useful.
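For readers who just want the general idea, here is a minimal sketch of the optimistic approach: a version check on a single document plus a retry loop. It is written in Python against an in-memory stand-in, not against MongoDB, and it is not the code from the GitHub page. `VersionedCollection`, `replace_if_version` and `add_to_balance` are hypothetical names. With a real database, the check and the write would have to be a single atomic conditional update on the server.

```python
import copy


class VersionConflict(Exception):
    """Raised when a document changed since it was read."""


class VersionedCollection:
    """In-memory stand-in for a document collection (hypothetical, not a MongoDB API)."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc_id, fields):
        self._docs[doc_id] = {"version": 1, **fields}

    def read(self, doc_id):
        # Return a copy so callers cannot modify stored state directly.
        return copy.deepcopy(self._docs[doc_id])

    def replace_if_version(self, doc_id, expected_version, fields):
        # Write only if nobody else has written since our read.
        if self._docs[doc_id]["version"] != expected_version:
            raise VersionConflict(doc_id)
        self._docs[doc_id] = {**fields, "version": expected_version + 1}


def add_to_balance(coll, doc_id, amount, max_retries=3):
    """Read, modify, and conditionally write back; re-read and retry on conflict."""
    for _ in range(max_retries):
        doc = coll.read(doc_id)
        version = doc.pop("version")
        doc["balance"] += amount
        try:
            coll.replace_if_version(doc_id, version, doc)
            return
        except VersionConflict:
            continue
    raise VersionConflict(doc_id)
```

A writer holding a stale version gets a `VersionConflict` instead of silently overwriting newer data. This only protects a single document. Changing several documents atomically needs more machinery on top (or real transactions).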
Major development: starting with the next version (4.0), MongoDB supports multi-document ACID transactions with snapshot isolation and all-or-nothing guarantees.
See more in the announcement made by Eliot Horowitz, MongoDB's CTO and co-founder.