Is my MongoDB data model the right choice?

I'm going to build my first project (a genealogy database) with MongoDB and Node.js, and I am wondering whether my data model is the right choice:
people document (simplified):
{
"_id": ObjectId("123"),
"modified": ISODate("2015-02-04T16:52:32.601Z"),
"birth": ISODate("1995-02-04T16:52:32.601Z"),
"name": "peter"
}, {
"_id": ObjectId("456"),
"modified": ISODate("2015-02-04T16:52:32.601Z"),
"birth": ISODate("1999-02-04T16:52:32.601Z"),
"name": "uschi"
}
relations document (simplified):
{
"sourceID": ObjectId("123"),
"targetID": ObjectId("456"),
"type": "Married",
"modified": ISODate("2015-02-04T16:52:32.599Z"),
"startrelation": ISODate("2001-02-04T16:52:32.601Z"),
"endrelation": ISODate("2007-02-04T16:52:32.601Z"),
"_id": ObjectId("54d24e5033bfc203aaaad590")
}
Yesterday I tried to retrieve a list of all people together with their related people, and I started to worry about my data model because I needed a lot of code to generate the following result:
items: [
{
"_id": ObjectId("123"),
"modified": ISODate("2015-02-04T16:52:32.601Z"),
"birth": ISODate("1995-02-04T16:52:32.601Z"),
"name": "peter",
"married": [{
"_id": ObjectId("456"),
"modified": ISODate("2015-02-04T16:52:32.601Z"),
"birth": ISODate("1999-02-04T16:52:32.601Z"),
"name": "uschi"
}, ...]
}, ...]
Is there a problem with that solution?
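A minimal sketch of the in-application assembly described above, assuming both collections have already been loaded into plain arrays and that IDs are plain strings rather than ObjectId values. The helper name attachRelations is hypothetical, and treating every relation as symmetric is an assumption, not something the data model states.

```javascript
// Hypothetical helper: embeds related people under a key derived from the relation type.
function attachRelations(people, relations) {
  // Index the original people by id so each lookup is O(1).
  const byId = new Map(people.map((p) => [String(p._id), p]));
  // Build results from shallow copies so the input objects are not mutated.
  const result = new Map(people.map((p) => [String(p._id), { ...p }]));

  for (const rel of relations) {
    const key = rel.type.toLowerCase(); // "Married" -> "married"
    const source = String(rel.sourceID);
    const target = String(rel.targetID);
    // Assumption: relations are symmetric, so each person is listed under the other.
    for (const [from, to] of [[source, target], [target, source]]) {
      if (!result.has(from) || !byId.has(to)) continue; // skip dangling references
      const item = result.get(from);
      (item[key] = item[key] || []).push(byId.get(to));
    }
  }
  return [...result.values()];
}

const people = [
  { _id: "123", name: "peter" },
  { _id: "456", name: "uschi" },
];
const relations = [{ sourceID: "123", targetID: "456", type: "Married" }];
const items = attachRelations(people, relations);
```

With one relation per pair this needs a single pass over the relations, so the amount of code stays small; the cost is that every page showing relations has to run a step like this.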

The main problem I see with this solution is that you are using MongoDB to store relational data. I have done this in the past and regretted it. Consider using Postgres. It's a relational database, but its json and jsonb column types let you store and query arbitrarily structured JSON (hstore covers flat key/value pairs) if your schema has some areas that are not well defined.

A graph database seems like a perfect match for your problem domain.
This way you won't have to implement all the logic related to "relations" in your application; graph databases understand them natively.
One example is Neo4j.
Graph databases allow for:
easy handling of complex relations
very quick traversal of relations
fast searching for relationships of the type "friends of a friend" or "who is Jim in relation to Janet"
In general, if you plan to query your data in various ways based on relations, a graph database is the way to go.

Related

How to filter through possibly infinitely nested data NoSQL?

I'm new to NoSQL so I might be wrong in my thinking process, but I am trying to figure out how to filter through possibly infinitely nested object (comment, replies to comment, replies to replies to comment). I am using MongoDB, but it probably applies to other NoSQL databases too.
This is the structure I wanted to use:
Post
{
"name": "name",
"comments": [
{
"id": "someid",
"author": "author",
"replies": [
{
"id": "someid",
"author": "author",
"replies": [
{
...
}
]
},
{
"id": "someid",
"author": "author",
"replies": null
}
]
}
]
}
As you can see, replies can be nested infinitely (well, unless I set a limit, which doesn't sound that stupid).
But now, if a user wants to edit or delete a comment, I have to search through them to find the right one. I can't find any better way than to loop through all of them, but that would be very slow with a lot of comments.
I was thinking of giving each comment an ID that would help in finding it (something inspired by a hash map, but not exactly). It could include the depth (how deeply nested the comment is), so that I only search comments with at least that depth, but that would improve performance only slightly and only in specific cases; in the worst case I would still have to loop through all of them. The ID could also include the indexes of the comments and replies, but that would be limited, since an ID can't be infinite and replies can.
I couldn't find any MongoDB query for that.
Is there any solution / algorithm to do it more efficiently?
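The brute-force search described above can be sketched as a recursive walk. This only illustrates the naive approach and is not a recommendation: it visits every comment in the worst case. The helper name findComment and the ids c1 to c4 are made up for the example.

```javascript
// Hypothetical helper: depth-first search through nested replies.
// Returns the matching comment object, or null if no comment has that id.
function findComment(comments, id) {
  for (const comment of comments || []) { // "replies" may be null
    if (comment.id === id) return comment;
    const found = findComment(comment.replies, id);
    if (found) return found;
  }
  return null;
}

const post = {
  name: "name",
  comments: [
    {
      id: "c1",
      author: "author",
      replies: [
        { id: "c2", author: "author", replies: [{ id: "c3", author: "author", replies: null }] },
        { id: "c4", author: "author", replies: null },
      ],
    },
  ],
};
const hit = findComment(post.comments, "c3");
```

Because the function returns a reference into the nested structure, an edit can be applied to the returned object directly; a delete would additionally need the parent array.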

Best way to represent multilingual database on mongodb

I have a MySQL database to support a multilingual website where the data is represented as the following:
table1
id
is_active
created
table1_lang
table1_id
name
surname
address
What's the best way to achieve the same in MongoDB?
You can design a schema that either references or embeds documents. Let's look at the first option, embedded documents. With your application above, you might store the information in a document as follows:
// db.table1 schema
{
"_id": 3, // table1_id
"is_active": true,
"created": ISODate("2015-04-07T16:00:30.798Z"),
"lang": [
{
"name": "foo",
"surname": "bar",
"address": "xxx"
},
{
"name": "abc",
"surname": "def",
"address": "xyz"
}
]
}
In the example schema above, you have essentially embedded the table1_lang information within the main table1 document. This design has its merits, one of them being data locality. Since MongoDB stores data contiguously on disk, putting all the data you need in one document means that spinning disks take less time to seek to a particular location. If your application frequently accesses table1 information along with the table1_lang data, then you'll almost certainly want to go the embedded route. The other advantage of embedded documents is atomicity and isolation when writing data. To illustrate this, say you want to remove a document that has a lang entry whose "name" is "foo"; this can be done with a single (atomic) operation:
db.table1.remove({"lang.name": "foo"});
For more details on data modelling in MongoDB, please read the docs Data Modeling Introduction, specifically Model One-to-Many Relationships with Embedded Documents
The other design option is referencing documents where you follow a normalized schema. For example:
// db.table1 schema
{
"_id": 3,
"is_active": true,
"created": ISODate("2015-04-07T16:00:30.798Z")
}
// db.table1_lang schema
/* 1 */
{
"_id": 1,
"table1_id": 3,
"name": "foo",
"surname": "bar",
"address": "xxx"
}
/* 2 */
{
"_id": 2,
"table1_id": 3,
"name": "abc",
"surname": "def",
"address": "xyz"
}
The above approach gives increased flexibility in performing queries. For instance, retrieving all child table1_lang documents for the parent table1 entity with id 3 is straightforward; simply query the table1_lang collection:
db.table1_lang.find({"table1_id": 3});
The normalized schema with document references also has an advantage when you have one-to-many relationships with very unpredictable arity. If you have hundreds or thousands of table1_lang documents per table1 entity, embedding runs into space constraints: the larger the document, the more RAM it uses, and MongoDB documents have a hard size limit of 16MB.
The general rule of thumb is that if your application's query pattern is well known and data tends to be accessed in only one way, an embedded approach works well. If your application queries data in many ways, or you are unable to anticipate the query patterns, a more normalized document-referencing model is more appropriate.
Ref:
MongoDB Applied Design Patterns: Practical Use Cases with the Leading NoSQL Database By Rick Copeland

MongoDB: how to set collection version?

I'm currently using MongoDB and I have a collection called Product. I have a requirement in the system that asks to increment the collection version whenever any change happens to the collection (e.g. add a new product, remove, change price, etc...).
Question: Is there a recommended approach to set versions for collections in MongoDB?
I was expecting to find something like that:
db.collection.Product.setVersion("1.0.0");
and the corresponding get method:
db.collection.Product.getVersion();
I'm not sure if it makes sense. Personally, I would love to have collection metadata provided as a native implementation from MongoDB. Is there any document database that does so?
MongoDB itself is completely "schemaless" and as such does not have any of its own concepts of document "metadata" or the general "version management" that you seem to be looking for. The implementation is therefore entirely up to you, and documents store whatever you supply them with.
You could implement such a scheme, generally by wrapping methods to include such things as version management in updates. So on document creation you would do this:
db.collection.myinsert({ "field": 1, "other": 2 })
Which wraps a normal insert to do this:
db.collection.insert({ "field": 1, "other": 2, "__v": 0 })
Having that data any "updates" would need to provide a similar wrapper. So this:
db.collection.myupdate({ "field": 1 },{ "$set": { "other": 4 } })
Actually does a check for the same version as held and "increments" the version at the same time via $inc:
db.collection.update(
{ "field": 1, "__v": 0 },
{
"$set": { "other": 4 },
"$inc": { "__v": 1 }
}
)
That means the document to be modified in the database needs to match the same "version" as what is in memory in order to update. Changing the version number means subsequent updates with stale data would not succeed.
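The check-and-increment behaviour can be illustrated without a database. The sketch below uses a plain array as a stand-in for a collection, so it is not the MongoDB driver API; versionedUpdate is a hypothetical name, and it takes the expected version as an explicit argument, which is an assumption about how the wrapper would receive it.

```javascript
// In-memory stand-in for a collection; not the MongoDB driver API.
const docs = [{ field: 1, other: 2, __v: 0 }];

// Hypothetical wrapper: applies `changes` only if the stored version still
// equals `expectedVersion`, then bumps the version (the $set + $inc pair).
function versionedUpdate(collection, query, expectedVersion, changes) {
  const doc = collection.find(
    (d) =>
      d.__v === expectedVersion &&
      Object.keys(query).every((k) => d[k] === query[k])
  );
  if (!doc) return 0; // no match: the caller was holding stale data
  Object.assign(doc, changes);
  doc.__v += 1;
  return 1; // number of documents modified
}

// The first writer holds version 0 and succeeds.
const first = versionedUpdate(docs, { field: 1 }, 0, { other: 4 });
// A second writer still holding version 0 matches nothing.
const stale = versionedUpdate(docs, { field: 1 }, 0, { other: 9 });
```

A caller that gets 0 back would typically re-read the document and retry with the current version.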
Generally though, there are several Object Document Mapper or ODM implementations available for various languages that have the sort of functionality built in. You would probably be best off looking at the Drivers section of the documentation to find something suitable for your language implementation. Also a little extra reading up on MongoDB would help as well.

Emulating LEFT JOIN on MongoDB using MapReduce/Aggregation

I have a MongoDB database with a few collections, such as the users in the system (id, name, email) and a list of projects (id, name, list of users who have access).
User
{
"_id": 1,
"name": "John",
"email": "john@domain.com"
}
{
"_id": 2,
"name": "Sam",
"email": "sam@domain.com"
}
Project
{
"_id": 1,
"name": "My Project1",
"users": [1,2]
}
{
"_id": 2,
"name": "My Project2",
"users": [2]
}
In my dashboard, I display a list of projects and the names of its users. To support names - I've changed the "users" field to now also include the name:
{
"_id": 2,
"name": "My Project2",
"users": [{"_id":2,"name":"Sam"}]
}
But on several pages, I now need to also print their email address and later on - maybe also display their image.
Since I don't want to start embedding the entire User document in each project, I'm looking for a way to do a LEFT JOIN and pick the values I need from the User collection.
Performance is NOT that important on those pages, and I would rather have an easy way to manage my data. So basically I'm looking for a way to query for a list of all projects and their associated users, with different fields taken from the original User document.
I've read about MongoDB's map-reduce and aggregation options and, to be honest, I'm not sure which to use or how to achieve what I'm looking for.
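At the time this was written, MongoDB did not support joins in any form, not even through MapReduce or the Aggregation Framework, so the only way to join collections was in your own code. (MongoDB 3.2 and later add a $lookup aggregation stage that performs a left outer join.) On older versions, just implement the LEFT JOIN logic in your code.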
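An in-code join of this kind can be sketched as follows, assuming both collections have already been fetched into arrays. The helper name joinProjectUsers and the unmatched user id 3 are made up for the example; the latter is there only to show the LEFT JOIN behaviour.

```javascript
// Hypothetical helper: replaces each project's user ids with selected user fields.
function joinProjectUsers(projects, users, fields) {
  const byId = new Map(users.map((u) => [u._id, u]));
  return projects.map((project) => ({
    ...project,
    users: project.users.map((id) => {
      const user = byId.get(id);
      // LEFT JOIN semantics: keep the id even when no user document matches.
      if (!user) return { _id: id };
      const picked = { _id: id };
      for (const f of fields) picked[f] = user[f];
      return picked;
    }),
  }));
}

const users = [
  { _id: 1, name: "John", email: "john@domain.com" },
  { _id: 2, name: "Sam", email: "sam@domain.com" },
];
const projects = [
  { _id: 1, name: "My Project1", users: [1, 2] },
  { _id: 2, name: "My Project2", users: [2, 3] }, // 3 has no User document
];
const joined = joinProjectUsers(projects, users, ["name", "email"]);
```

Passing a different fields array per page lets each page pick only the user values it needs, without embedding the whole User document in every project.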

How to define Spring Security User Schema in MongoDB?

I want to implement Spring Security with MongoDB.
How can I define custom User schema?
One of the great things about MongoDB is that it is schemaless, i.e. you are not forced to use a predefined set of columns. Another characteristic of MongoDB is its lack of JOINs.
Together, these mean that you may construct any schema you want, but should try to keep all required information in one collection. For example, I used a schema like this in one of my applications:
{
"_id": "student_001",
"password": "65c20e5a89d6b13df450b50576e2edfb",
"firstName": "A",
"lastName": "B",
"secondName": "C",
"email": "a@b.c",
"role": "STUDENT",
"active": true,
"paid": 0.6
}
You can use any unique value of any type for _id (not only ObjectId); I use it for logins. You just need to implement the basic org.springframework.security.core.userdetails.UserDetails getters with data from this schema, and you can also add extra fields to the implementing class.