How can we ensure data integrity in MongoDB? - mongodb

I am trying to migrate data from a relational database (MySQL) to a NoSQL one (MongoDB), but how can I ensure data integrity in MongoDB? From what I have found, it cannot be done on the server side. What should I use on the application side to handle data integrity?
For example: I have two tables, user and task, that share a userId field. If I add a new entry to the task table, it should check that the userId is present in the user table.
This is one of the requirements; others include adding constraints, updating values, etc.

Ultimately, you're screwed. There's no way (in MongoDB) to guarantee data integrity in such a scenario, since it lacks relations in general and foreign keys in particular. And there's little point in building application-level checks: no matter how elaborate they are, they can still fail (hence "no guarantee").
So it's either embedding (so that related data is always there, right in the document) or abandoning the hope of consistent data.

MongoDB is NoSQL and has no joins in the relational sense (the $lookup aggregation stage can combine collections, but it enforces nothing).
Data is stored as BSON documents, so there are no foreign key constraints.
Steps to ensure data integrity:
Before adding the task document, check in the application that it refers to a valid user.
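A minimal sketch of that application-side check, using in-memory Maps as stand-ins for the user and task collections. The names `users`, `tasks`, and `addTask` are hypothetical, and a real application would query MongoDB through its driver instead. Note that a check followed by an insert is not atomic, so a user deleted between the two steps can still leave an orphaned task.

```javascript
// In-memory stand-ins for the two collections (assumption: real code
// would call the MongoDB driver instead of reading these Maps).
const users = new Map([[1, { _id: 1, name: "alice" }]]);
const tasks = new Map();

// Insert a task only if its userId points at an existing user.
function addTask(task) {
  if (!users.has(task.userId)) {
    throw new Error(`unknown userId: ${task.userId}`);
  }
  tasks.set(task._id, task);
  return task;
}

addTask({ _id: 100, userId: 1, title: "write report" }); // accepted

try {
  addTask({ _id: 101, userId: 42, title: "orphan" }); // rejected
} catch (err) {
  console.log(err.message); // prints: unknown userId: 42
}
```

Keeping the check inside one function means every code path that creates a task goes through the same validation.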

MongoDB doesn't support foreign keys; its document model is meant to avoid joins.
MongoDB doesn't support server-side foreign key relationships, but sometimes documents still need to be related. MongoDB applications use one of two methods for relating documents:
Manual references where you save the _id field of one document in another document as a reference. Then your application can run a second query to return the related data. These references are simple and sufficient for most use cases.
DBRefs are references from one document to another using the value of the first document’s _id field, its collection name, and, optionally, its database name. By including these names, DBRefs allow documents located in multiple collections to be more easily linked with documents from a single collection.
This may be slower, because additional queries are needed to read the referenced objects, but it gives you something like a foreign key reference. You still have to handle your references manually: only when you look up a DBRef will you see whether its target exists. The database will not go through all the documents looking for references and remove them when the target of a reference no longer exists. But I think removing all the references after deleting the target document would take a single query per collection, no more, so it is not that difficult really.
Refer to documentation for more info: Database References.
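The manual-reference pattern and the cleanup after a delete can be sketched with in-memory arrays standing in for collections. The names `userDocs`, `taskDocs`, `resolveUser`, and `deleteUserAndTasks` are hypothetical. Against a real database, the sweep would typically be one delete-by-filter call per referencing collection (for example the driver's `deleteMany`), which is an assumption about how you would wire it up rather than something shown here.

```javascript
// Each array stands in for a collection (assumption: illustration only).
const userDocs = [
  { _id: 1, name: "alice" },
  { _id: 2, name: "bob" },
];
const taskDocs = [
  { _id: 10, userId: 1, title: "a" },
  { _id: 11, userId: 2, title: "b" },
  { _id: 12, userId: 1, title: "c" },
];

// Resolve a manual reference with a second lookup. Returns null when the
// target no longer exists, i.e. the reference is dangling.
function resolveUser(task) {
  return userDocs.find((u) => u._id === task.userId) || null;
}

// Delete the target document, then sweep the referencing collection once.
// Returns how many referencing documents were removed.
function deleteUserAndTasks(userId) {
  const at = userDocs.findIndex((u) => u._id === userId);
  if (at !== -1) userDocs.splice(at, 1);

  let removed = 0;
  for (let i = taskDocs.length - 1; i >= 0; i--) {
    if (taskDocs[i].userId === userId) {
      taskDocs.splice(i, 1);
      removed++;
    }
  }
  return removed;
}
```

Nothing stops other code from deleting a user without the sweep, which is exactly the gap a server-side foreign key would close.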
How can I solve this task?
To be clear, MongoDB is not relational. There is no standard "normal form". You should model your database appropriate to the data you store and the queries you intend to run.
For example:
student
{
_id: ObjectId(...),
name: 'Jane',
courses: [
{ course: 'bio101', mark: 85 },
{ course: 'chem101', mark: 89 }
]
}
course
{
_id: 'bio101',
name: 'Biology 101',
description: 'Introduction to biology'
}
Try to restructure it like this:
student
{
_id: ObjectId(...),
name: 'Jane',
courses: [
{
name: 'Biology 101',
mark: 85,
id: 'bio101'
}
]
}
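A small sketch of how the embedded form could be produced from the referenced one. The helper `embedCourses` and the `courseCatalog` object are hypothetical, and the name "Chemistry 101" is made up, since only the bio101 course document is shown above.

```javascript
// Reference data keyed by course _id. "Chemistry 101" is an invented name
// used only for illustration.
const courseCatalog = {
  bio101: { _id: "bio101", name: "Biology 101" },
  chem101: { _id: "chem101", name: "Chemistry 101" },
};

// Copy each course's id and name into the student's courses array so the
// student document can be displayed without a second lookup.
function embedCourses(student, catalog) {
  return {
    ...student,
    courses: student.courses.map(({ course, mark }) => {
      const found = catalog[course];
      if (!found) throw new Error(`unknown course: ${course}`);
      return { id: found._id, name: found.name, mark };
    }),
  };
}

const jane = embedCourses(
  {
    name: "Jane",
    courses: [
      { course: "bio101", mark: 85 },
      { course: "chem101", mark: 89 },
    ],
  },
  courseCatalog
);
```

The trade-off is that a renamed course has to be updated in every student document that embeds a copy of its name.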

Related

Advantage of using connect over updating foreign keys directly

Why use connect?
data:{
'userId': 1
}
Isn't the above enough?
Why use
user:{
connect:{
id: 1
}
}
Isn't the result the same?
To answer your question, why connect exists at all:
connect (and disconnect) provide an alternative interface to relations that you can also achieve by updating the respective fields directly. However, there are many cases where this API is much more convenient.
E.g.:
updating the object that does not store the attribute that represents the relation
updating a many-to-many relation
more advanced interfaces like connectOrCreate
connecting entities on unique attributes other than the primary key

Can we and should we create MongoID for nested objects inside document?

We have a collection of documents, each document has an array of objects
{
"_id":_MONGO_ID_,
"property":"value",
"list":[{...}, {...}, ...]
}
But each object of the list also needs a unique id for the needs of our app.
{"id":213456789, "somestuff":"somevalue" ...}
We do not wish to create a collection for these objects because they are small and would rather store them straight into the document.
Now the question. Right now we generate a unique id based on time which looks like the MongoID. We need an id to make it easier to target each object. Would it be a good idea to generate a MongoID for each object of the list instead? Any pros and cons?
In general, it is wise to separate DB-specific resources from business/data domain resources. You always want to be able to manipulate the data completely independent of the host database and the drivers associated therewith. ObjectId() is relatively lightweight and in fact a BSON type, separate from the MongoDB core objects, but for true arms-length separation and an easier physical implementation, I would recommend a simple string instead. If you don't have extreme space/scale issues, UUIDv4 is a good way to get a unique string.
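As a sketch of the UUIDv4 suggestion, Node.js ships `crypto.randomUUID()`, which returns a version 4 UUID string. The helpers `makeListItem` and `findItem` and the sample field values are hypothetical.

```javascript
const { randomUUID } = require("crypto");

// Attach a database-independent string id to each embedded object.
function makeListItem(fields) {
  return { id: randomUUID(), ...fields };
}

const sampleDoc = {
  property: "value",
  list: [
    makeListItem({ somestuff: "first" }),
    makeListItem({ somestuff: "second" }),
  ],
};

// Target one embedded object by its id.
function findItem(document, id) {
  return document.list.find((item) => item.id === id);
}
```

Because the id is a plain string, the embedded objects stay usable outside MongoDB and its drivers.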

MongoDB, relation between documents or collections

I'm pretty new to NoSQL and MongoDB. How should I deal with a many-to-many relation between two or more collections/documents? Should we use DBRefs or embed? I've already read the MongoDB manual, but I didn't find anything about many-to-many relations. Did I miss something, or is there no such relation in MongoDB? Thanks!
Embed versus reference
This is the problem of embedding versus referencing, and it’s a common source of confusion for new users of MongoDB. There’s a simple rule of thumb that works for most schema design scenarios: Embed when the child objects always appear in the context of their parent. Otherwise, store the child objects in a separate collection.
Whether to embed or reference depends on the application. Suppose you’re building a simple application in MongoDB that stores blog posts and comments. If the comments always appear within a blog post, and if they don’t need to be ordered in arbitrary ways (by post date, comment rank, and so on), then embedding is fine. But if, say, you want to be able to display the most recent comments, regardless of which post they appear on, then you’ll want to reference. Embedding may provide a slight performance advantage, but referencing is far more flexible.
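To make the blog example concrete, here is a sketch with in-memory arrays in place of collections. All names and sample values are hypothetical, and `at` is an invented numeric timestamp. With embedded comments, "most recent comments across all posts" means unpacking every post first; with a separate comments collection it is a plain sort.

```javascript
// Embedded: comments live inside each post document.
const embeddedPosts = [
  { _id: 1, title: "First", comments: [{ text: "a", at: 3 }, { text: "b", at: 1 }] },
  { _id: 2, title: "Second", comments: [{ text: "c", at: 2 }] },
];

// Referenced: comments live in their own collection and point at a post.
const commentDocs = [
  { post_id: 1, text: "a", at: 3 },
  { post_id: 1, text: "b", at: 1 },
  { post_id: 2, text: "c", at: 2 },
];

// Embedded case: flatten all posts, then sort newest first.
function recentFromEmbedded(posts, n) {
  return posts
    .flatMap((p) => p.comments)
    .sort((x, y) => y.at - x.at)
    .slice(0, n);
}

// Referenced case: sort the comments collection directly (on a copy).
function recentFromReferenced(comments, n) {
  return [...comments].sort((x, y) => y.at - x.at).slice(0, n);
}
```

Both functions return the same comments here; the difference is how much data has to be touched to get them.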
Many-to-many
In RDBMSs, you use a join table to represent many-to-many relationships; in MongoDB, you use array keys. For example, each product contains an array of category IDs, and both products and categories get their own collections. If you have two simple category documents:
{ _id: ObjectId("4d6574baa6b804ea563c132a"),
title: "Epiphytes"
}
{ _id: ObjectId("4d6574baa6b804ea563c459d"),
title: "Greenhouse flowers"
}
then a product belonging to both categories will look like this:
{ _id: ObjectId("4d6574baa6b804ea563ca982"),
name: "Dragon Orchid",
category_ids: [ ObjectId("4d6574baa6b804ea563c132a"),
ObjectId("4d6574baa6b804ea563c459d") ]
}
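Both directions of the relationship can be walked from that one array. The sketch below uses in-memory arrays with plain string ids instead of ObjectId values; the second product is a hypothetical addition for illustration. In the mongo shell the equivalent queries would be along the lines of `db.products.find({ category_ids: id })` and `db.categories.find({ _id: { $in: product.category_ids } })`.

```javascript
// String ids stand in for ObjectId values (assumption for a runnable sketch).
const categoryDocs = [
  { _id: "4d6574baa6b804ea563c132a", title: "Epiphytes" },
  { _id: "4d6574baa6b804ea563c459d", title: "Greenhouse flowers" },
];
const productDocs = [
  {
    _id: "4d6574baa6b804ea563ca982",
    name: "Dragon Orchid",
    category_ids: ["4d6574baa6b804ea563c132a", "4d6574baa6b804ea563c459d"],
  },
  {
    _id: "hypothetical-product",
    name: "Potted Fern", // made-up product, not from the original example
    category_ids: ["4d6574baa6b804ea563c459d"],
  },
];

// All products that belong to a given category.
function productsInCategory(categoryId) {
  return productDocs.filter((p) => p.category_ids.includes(categoryId));
}

// All categories a given product belongs to.
function categoriesOfProduct(product) {
  return categoryDocs.filter((c) => product.category_ids.includes(c._id));
}
```

Only the product side stores the link, so no join table or second array on the category documents is needed.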

How to handle relations in document stores

I do understand that relations are not really needed in document stores, but for some things they can still be useful. Or am I wrong (snowed in on RDBMS)?
For instance:
Let's say that I have a bunch of files and their revision history:
File
    Name
    Path
    CreatedBy
    .. etc ..
    Revision
        Date
        Info
        CreatedBy
Should I add the User object to CreatedBy for the file and all revisions, or should it be an ID referencing the User document? What's the common practice?
Both MongoDB and CouchDB have articles on this topic, and I would say it depends on your scenario, your data, and the DB system you are using. If the data you are considering embedding or referencing is big, you should reference it, because CouchDB, for example, doesn't support (as far as I know) returning only part of a document when it is large and you want to retrieve only a basic/selected structure. On the other hand, embedding can help you during querying, since you don't have to look up the referenced documents, but this really depends on the system you are using.

Correct approach of using embedded and reference in mongoid

I'm building associations as follows:
person embeds one address
address references one country
address references one province
country embeds many provinces
Are the above associations good? I'm quite confused about how to build them, and I don't know exactly how MongoDB and Mongoid should be used for building associations.
My main concern is when to use embedded associations and when to use referenced ones.
Schema design in MongoDB depends on how you will query the data and how you will update the data. There is no general hard rule to determine whether an association should be embedded or referenced. I suggest you have a look at this excellent article.
Concerning your suggested schema, you could also make the country an attribute/field on a province document and do less normalization than you would in a relational database. It all depends on how you access your documents.
collection provinces:
{
name : 'Alabama',
country : 'United States'
}