I think its fair to say I'm pretty new to MongoDB having just installed the database this afternoon.
I'm managing to get to grips with storing and retrieving objects but am struggling to find the best way of storing objects that have a many-to-many relationship.
I've already come accross the DBRef object and have got that working but this seems to only support a lazy loaded approach.
Is there a way to encourage MongoDB to eagerly load DBRefs?
Is there a better/different way to store the many-to-many relationship?
Many thanks
Rob
So first, I think you need to look at this question over here, which talks about many-to-many relationships.
The other thing to understand is the nature of "DBRefs". The MongoDB database does not provide any join functionality.
The DBRef is just a standard that several library / driver implementers agreed upon a couple of years ago. The DBRef is just a JSON object in a specific notation that provides a pointer to some other document in some collection. So the implementation of "lazy vs. eager" loading is completely specific to the driver / wrapper library that you are using.
That stated, the concept of "eager loading" is rather pointless with MongoDB. In SQL you can potentially save on total queries by using some form of eager loading and doing a join "in advance". Again, the DB does not support joins, so "eager loading" takes the same number of queries as "lazy loading".
Related
First of all, I have extensive experience on Relational DBs but very beginner level knowledge of Document DB. I'm exploring MongoDB but my question is in general to Document DB.
AFA I know (I may be wrong), A Document DB is consisting of containers and containers contain same of different object structures. These object structures are defined such a way that filters and information can be applied in most optimum way. For ex. A is written by Authors. So object of Book will contain list of authors also. This way searching can be made faster and performance can be gained.
What is my problem?
I'm creating an application (yet haven't started as I'm confused here). It's relational DB is something like this....
The problem is I'm not able to design the Document DB structure for this requirement.
Please somebody help my in designing such database or can give me idea on "What approach one should select while designing such database?"
This comes down to answering the following questions:
What are your most common access patterns? It is helpful to think of your API methods, or top 5-10 queries to decide how to organize.
What are your transactional needs? Which of these entity types occur together in transactions and queries?
How often do they change? Should you embed or reference?
If you could include these details, we can help with more targeted suggestions.
http://azure.microsoft.com/documentation/articles/documentdb-modeling-data/ is also worth a read if you haven't already.
The main difference between DocumentDB and relational databases/MongoDB is that collections are more like shards/partitions and not tables.
I just started to use MongoDB and I'm confused to build object models with list property.
I have a User model related to Followers and Following object which are list of User IDs.
So I can think of some object model structures to represent the relation.
Embedded Document. Followers and Following are embedded to User model. In this way, a "current_user" object is generated in many web frameworks in every request, and it's an extra overhead to serialize/deserialize the Follower and Following list property since we seldom use these properties in most requests. We can exclude these properties when "current_user" is generated. However, we need to fetch full "current_user" object again before we do any updates to it.
Use Reference Property in User model. We can have Followers and Following object models themselves, not embedded, but save references to the User object.
Use Reference Property in Followers and Following models. We can save User ID in Follower and Following property for later queries.
There might be some other ways to do it, easier to use or better performance. And my question is:
What's the suggested way to design a model with some related list properties?
For folks coming from the SQL world (such as myself) one of the hardest things to learn about MongoDB is the new style of schema design. In the SQL world, everything goes into third normal form. Folks come to think that there is a single right way to design their schema, because there typically is one.
In the MongoDB world, there is no one best schema design. More accurately, in MongoDB schema design depends on how the application is going to access the data.
Here are the key questions that you need to have answered in order to design a good schema for MongoDB:
How much data do you have?
What are your most common operations? Will you be mostly inserting new data, updating existing data, or doing queries?
What are your most common queries?
What are your most common updates?
How many I/O operations do you expect per second?
Here's how these questions might play out if you are considering one-to-many object relationships.
In SQL you simply create a pair of master/detail tables with a primary key/foreign key relationship. In MongoDB, you have a number of choices: you can embed the data, you can create a linked relationship, you can duplicate and denormalize the data, or you can use a hybrid approach.
The correct approach would depend on a lot of details about the use case of your application.
Here are some good general references on MongoDB schema design.
MongoDB presentations:
http://www.10gen.com/presentations/mongosf2011/schemabasics
http://www.10gen.com/presentations/mongosv-2011/schema-design-by-example
http://www.10gen.com/presentations/mongosf2011/schemascale
http://www.10gen.com/presentations/MongoNYC-2012/Building-a-MongoDB-Power-Chat-Server
Here are a couple of books about MongoDB schema design that I think you would find useful:
http://www.manning.com/banker/ (MongoDB in Action)
http://shop.oreilly.com/product/0636920018391.do (Document Design for MongoDB)
Here are some sample schema designs:
http://docs.mongodb.org/manual/use-cases/
https://openshift.redhat.com/community/blogs/designing-mongodb-schemas-with-embedded-non-embedded-and-bucket-structures
When it comes to NoSQL, there are bewildering number of choices to select a specific NoSQL database as is clear in the NoSQL wiki.
In my application I want to replace mysql with NOSQL alternative. In my application I have user table which has one to many relation with large number of other tables. Some of these tables are in turn related to yet other tables. Also I have a user connected to another user if they are friends.
I do not have documents to store, so this eliminates document oriented NoSQL databases.
I want very high performance.
The NOSQL database should work very well with Play Framework and scala language.
It should be open source and free.
So given above, what NoSQL database I should use?
I think you may be misunderstanding the nature of "document databases". As such, I would recommend MongoDB, which is a document database, but I think you'll like it.
MongoDB stores "documents" which are basically JSON records. The cool part is it understands the internals of the documents it stores. So given a document like this:
{
"name": "Gregg",
"fave-lang": "Scala",
"fave-colors": ["red", "blue"]
}
You can query on "fave-lang" or "fave-colors". You can even index on either of those fields, even the array "fave-colors", which would necessitate a many-to-many in relational land.
Play offers a MongoDB plugin which I have not used. You can also use the Casbah driver for MongoDB, which I have used a great deal and is excellent. The Rogue query DSL for MongoDB, written by FourSquare is also worth looking at if you like MongoDB.
MongoDB is extremely fast. In addition you will save yourself the hassle of writing schemas because any record can have any fields you want, and they are still searchable and indexable. Your data model will probably look much like it does now, with a users "collection" (like a table) and other collections with records referencing a user ID as needed. But if you need to add a field to one of your collections, you can do so at any time without worrying about the older records or data migration. There is technically no schema to MongoDB records, but you do end up organizing similar records into collections.
MongoDB is one of the most fun technologies I have happened to come across in the past few years. In that one happy Saturday I decided to check it out and within 15 minutes was productive and felt like I "got it". I routinely give a demo at work where I show people how to get started with MongoDB and Scala in 15 minutes and that includes installing MongoDB. Shameless plug if you're into web services, here's my blog post on getting started with MongoDB and Scalatra using Casbah: http://janxspirit.blogspot.com/2011/01/quick-webb-app-with-scala-mongodb.html
You should at the very least go to http://try.mongodb.org
That's what got me started.
Good luck!
At this point the answer is none, I'm afraid.
You can't just convert your relational model with joins to a key-value store design and expect it to be a 1:1 mapping. From what you said it seems that you do have joins, some of them recursive, i.e. referencing another row from the same table.
You might start by denormalizing your existing relational schema to move it closer to a design you wish to achieve. Then, you could see more easily if what you are trying to do can be done in a practical way, and which technology to choose. You may even choose to continue using MySQL. Just because you can have joins doesn't mean that you have to, which makes it possible to have a non-relational design in a relational DBMS like MySQL.
Also, keep in mind - non-relational databases were designed for scalability - not performance! If you don't have thousands of users and a server farm a traditional relational database may actually work better for you.
Hmm, You want very high performance of traversal and you use the word "friends". The first thing that comes to mind is Graph Databases. They are specifically made for this exact case.
Try Neo4j http://neo4j.org/
It's is free, open source, but also has commercial support and commercial licensing, has excellent documentation and can be accessed from many languages (REST interface).
It is written in java, so you have native libraries or you can embedd it into your java/scala app.
Regarding MongoDB or Cassendra, you now (Dec. 2016, 5 years late) try longevityframework.org.
Build your domain model using standard Scala idioms such as case classes, companion objects, options, and immutable collections. Tell us about the types in your model, and we provide the persistence.
See "More Longevity Awesomeness with Macro Annotations! " from John Sullivan.
He provides an example on GitHub.
If you've looked at longevity before, you will be amazed at how easy it has become to start persisting your domain objects. And the best part is that everything persistence related is tucked away in the annotations. Your domain classes are completely free of persistence concerns, expressing your domain model perfectly, and ready for use in all portions of your application.
What would be the equivalent of ERD for a NoSQL database such as MongoDB?
It looks like you asked a similar question on Quora.
As mentioned there, the ERD is simply a mapping of the data you intend to store and the relations amongst that data.
You can still make an ERD with MongoDB as you still want to track the data and the relations. The big difference is that MongoDB has no joins, so when you translate the ERD into an actual schema you'll have to make some specific decisions about implement the relationships.
In particular, you'll need to make the "embed vs. reference" decision when deciding how this data will actually be stored. Relations are still allowed, just not enforced. Many of the wrappers for MongoDB actually provide lookups across collections to abstract some of this complexity.
Even though MongoDB does not enforce a schema, it's not recommended to proceed completely at random. Modeling the data you expect to have in the system is still a really good idea and that's what the ERD provides you.
So I guess the equivalent to the ERD is the ERD?
You could just use a UML class diagram instead too.
Moon Modeler supports schema design for MongoDB. It allows users to define diagrams with nested structures.
I know of no standard means of diagramming document-oriented "schema".
I'm sure you could use an ERD to map out your schemata but since document databases do not truly support--or more importantly enforce--relationships between data, it would only be as useful as your code was disciplined to internally enforce such relationships.
I have been thinking about the same issue for quite some time.
And I came to the following conclusion:
If NoSQL databases are generally schemaless, you don't actually have a 'schema' to illustrate in a diagram.
Thus, I think you should take a "by example" approach.
You could draw some mindmaps exemplifying how your data would look like when stored in a NoSQL DB such as MongoDB.
And since these databases are very dynamic you could also create some derived mindmaps to show how the data from today could evolve in time.
Take a look at this topic too.
Confusion about NoSQL Design
MongoDB does support 'joins', just not in the SQL sense of INNER JOIN (the default SQL join). While the concept of 'join' is typically associated with SQL, MongoDB does have the aggregation framework with its data processing pipeline stages. The $lookup pipeline stage is used to create the equivalent of a LEFT JOIN in SQL. That is, all documents on the left of a relationship will be pass through the pipeline, as well as any relating documents on the right side of the relationship. The documents are modified to include the relationship as part of the new documents.
Consequently, I postulate that Entity Relationship Diagrams do have a role in MongoDB. Documents are certainly related to each other in the db, and we should have a visualization of these relationships, including the cardinality relationship, e.g. full participation, partial participation, weak/strong entities, etc.
Of course, MongoDB also introduces the concept of embedded documents and referenced documents, and so I argue it adds additional flavor to the model of the ERD. And I certainly would want to see embedded and referenced relationships mapped out in a visual diagram.
The remaining question is so what is out there? What is out there for Mongoose for NodeJS? Mongoid for Ruby? etc. If you check the respective repositories for their corresponding ORMs (Object Relational Mappers), then you will see there are ERDs for them. But in terms of their completeness, perhaps there is a lot to be desired and the open source community is welcome to make contributions.
https://www.npmjs.com/package/mongoose-erd
https://rubygems.org/gems/railroady
I have an MSSQL database which I am considering porting to CouchDB or MongoDB. I have a many-to-many relationship within the SQL db which has hundreds of thousands rows in the xref table, corresponding to tens of thousands of rows in the tables on each side of the relationship. Will CouchDB and/or MongoDB be able to handle this data, and what would be the best way of formatting the relevant documents for performant querying? Thanks very much.
For CouchDB, I would highly recommend reading this article about Entity Relationships.
One thing I would note in CouchDB is to be careful of attempting to "normalize" a non-relational data model. The document-based storage offers you a great deal of flexibility, and it's seldom the best idea to abstract everything into as many "document types" as you can think of. Many times, it's best to leave much of your data within the same document unless you have clear cases where separate entities exist.
One common use-case of many-to-many relationships is implementing tagging. There are articles about different methods you can use to accomplish this in CouchDB. It may apply to your requirements, it may not, but it's probably worth a read.
Since the 'collection' model of MongoDB is similar to tables you can of course maintain
the m:n relationship inside a dedicated mapping collection (using the _id of the related documents of the referenced documents from other collections).
If you can: consider redesign your application using embedded documents.
http://www.mongodb.org/display/DOCS/Schema+Design
In general: try to turn off your memories to a RDBMS when working with MongoDB.
Blindly copying the database design from RDBMS to MongoDB is neither helpful nor adviceable nor will it work in general.