Mongoose: How to handle versioning? - mongodb

Today I learned about the versioning concept in MongooseJS (see "Mongoose v3 part 1 :: Versioning"). I like the versioning feature of Mongoose, but what should I do when my schema changes?
For example, initially my schema looked like this:
{
"_id": String,
"title": String,
"description": String
}
Since I didn't know about versioning, I didn't set a versionKey option and just used the default versionKey, __v.
I created a few documents with the above schema. Later I modified the schema to:
{
"_id": String,
"title": String,
"description": String,
"comments": Array
}
Here comes the problem: if I create a new document after this schema change, I am able to add/push comments to it.
But if I try to add/push comments to a document that was created with the initial schema, it fails and throws a versioning error: No matching document found.
Is there any way to overcome this problem without disabling or skipping versioning?

Related

Good DB-design to reference different collections in MongoDB

I regularly face a similar problem: how to reference several different collections from the same property in MongoDB (or any other NoSQL database). I usually use Meteor.js for my projects.
Let's take the example of a notes collection that includes some tagIds:
{
_id: "XXXXXXXXXXXXXXXXXXXXXXXX",
message: "This is an important message",
dateTime: "2018-03-01T00:00:00.000Z",
tagIds: [
"123456789012345678901234",
"abcdefabcdefabcdefabcdef"
]
}
So a given id referenced in tagIds might be a person, a product, or even another note.
Of course, the most obvious solution, in my opinion, is to save the type as well:
...
tagIds: [
{
type: "note",
id: "123456789012345678901234",
},
{
type: "person",
id: "abcdefabcdefabcdefabcdef",
}
]
...
Another solution I'm also thinking about is to use a separate field for each collection, but I'm not sure whether this has any other benefits (apart from the clear separation):
...
tagIdsNotes: ["123456789012345678901234"],
tagIdsPersons: ["abcdefabcdefabcdefabcdef"],
...
But somehow both solutions feel strange to me, as they need a lot of extra information (it would be nice to have this information implicit), so I wanted to ask whether this is the way to go, or whether you know any other solution.
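As a rough illustration of the typed-reference variant, the references can be grouped by type so that each target collection is queried only once. This is a minimal sketch, not a recommendation from the thread: groupTagIdsByType is a hypothetical helper name, and the { type, id } shape is taken from the example above.

```javascript
// Group typed tag references so each target collection needs only one lookup.
// groupTagIdsByType is a hypothetical helper; the { type, id } shape follows
// the example in the question.
function groupTagIdsByType(tagIds) {
  const grouped = {};
  for (const { type, id } of tagIds) {
    if (!grouped[type]) grouped[type] = [];
    grouped[type].push(id);
  }
  return grouped;
}

const note = {
  message: "This is an important message",
  tagIds: [
    { type: "note", id: "123456789012345678901234" },
    { type: "person", id: "abcdefabcdefabcdefabcdef" },
  ],
};

// Each array could then feed a single { _id: { $in: [...] } } query
// against the matching collection.
const byType = groupTagIdsByType(note.tagIds);
console.log(byType);
```

The separate-fields variant (tagIdsNotes, tagIdsPersons) stores the same grouping directly in the document, so this step would not be needed there.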
If you use Meteor Methods to pull this data, you have a chance to run some code, get from DB, run some mappings, pull again from DB etc and return a result. However, if you use pub/sub, things are different, you need to keep it really simple and light.
So, first question: method or pub/sub?
Your question is really more like: should I embed (and how much), should I not embed and instead build relations (keep only the id of a tag in the message object) and later use aggregations, or should I denormalize (duplicate data)? See http://highscalability.com/building-scalable-databases-denormalization-nosql-movement-and-digg
All these are ok in Mongo depending on your case: https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-3
The way I do this is to keep a Tags collection indexed by messageId and, optionally, date (for sorting). When you have a message, you get all its tags by querying the Tags collection, rather than mapping over the tags in your message object and sending 3 different queries to 3 different collections (person, product, note).
If you embed the tag data in the message object, think about the UX: say you want to show that there are 3 tags and load those 3 tags on click. You can either pull the tags when you pull the message (and possibly never need that data) or pull them on an action such as a click. So consider what data your view needs and pull only that. You could keep an integer count of tags on the message object and store the tags either in a Tags collection or embedded in the message object.
Following the principles of NoSQL, it is OK and often advisable to store some data multiple times in different collections to make your queries fast.
So in a Tags collection you could also store things related to your original objects. Let's say:
// Tags
{
...
messageId: 'xxx',
createdAt: Date,
person: {
firstName: 'John',
lastName: 'Smith',
userId: 'yyyy',
...etc
}
},
{
...
messageId: 'xxy',
createdAt: Date,
product: {
name: 'product_name',
productId: 'yyzz',
...etc
}
}
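To make the "query the Tags collection by messageId" idea concrete, here is a small in-memory sketch. The array stands in for the Tags collection, tagsForMessage is a hypothetical helper, and the dates are made up for illustration. With a real database the filter and sort would be done by an indexed query instead.

```javascript
// In-memory stand-in for a Tags collection; field names follow the example
// above, the createdAt values are invented for illustration.
const tags = [
  { messageId: "xxx", createdAt: new Date("2018-03-02"), person: { firstName: "John", lastName: "Smith", userId: "yyyy" } },
  { messageId: "xxy", createdAt: new Date("2018-03-01"), product: { name: "product_name", productId: "yyzz" } },
  { messageId: "xxx", createdAt: new Date("2018-03-01"), product: { name: "product_name", productId: "yyzz" } },
];

// Hypothetical helper: all tags of one message, oldest first.
// filter() returns a new array, so the sort does not reorder `tags`.
function tagsForMessage(allTags, messageId) {
  return allTags
    .filter((tag) => tag.messageId === messageId)
    .sort((a, b) => a.createdAt - b.createdAt);
}

const tagsOfMessage = tagsForMessage(tags, "xxx");
console.log(tagsOfMessage.length);
```

Because each tag already carries the denormalized person or product data, one lookup is enough to render the tags of a message.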

nosql inconsistent data structure

I'm new to nosql (MongoDB) so go easy on me.
I'm scraping JSON-LD from various web pages and want to store and retrieve the data. However, the value types keep changing. For instance, sometimes the "author" field uses an "organization" type, other times a "person" type, sometimes it's simply a string, and sometimes it's missing entirely.
Should I convert the data to some type of standard?
Should each object be put into its own collection and referenced?
How do you deal with the display being different for each type?
Looking for words of experience or links to good articles on how to deal with inconsistent data structure.
The whole point of a NoSQL database is that it is schemaless and the structure can vary from one document to another, so I see no issue here.
I think you are asking how you should deal with it in your application's business logic, so here is my suggestion:
You can save the author as an embedded sub-document that always has a field called "type" (an enum of values: String, Person, Organization, etc.) and act accordingly when you fetch the data.
For example, if the author is simply a string, the document would look something like:
{
…,
"author": {
"type": "String",
"text": <text>
}
}
If it's a Person type, then:
{
…,
"author": {
"type": "Person",
"first_name": <first name>,
"last_name": <last name>
}
}
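A normalizing step at scrape time could produce these sub-documents. The following is only a sketch under assumptions: normalizeAuthor is a hypothetical function, the "Missing" and "Unknown" markers and the Organization "name" field are my own additions, and it assumes the scraped Person objects carry the schema.org givenName and familyName properties, which real pages may not provide.

```javascript
// Hypothetical normalizer: maps whatever the scraper found into the
// { type, ... } sub-document shape described above.
function normalizeAuthor(raw) {
  if (raw === undefined || raw === null) {
    return { type: "Missing" }; // assumption: keep an explicit marker
  }
  if (typeof raw === "string") {
    return { type: "String", text: raw };
  }
  if (raw["@type"] === "Person") {
    // Assumes givenName/familyName are present in the scraped JSON-LD.
    return { type: "Person", first_name: raw.givenName, last_name: raw.familyName };
  }
  if (raw["@type"] === "Organization") {
    return { type: "Organization", name: raw.name }; // "name" field is an assumption
  }
  return { type: "Unknown", raw }; // keep the original value for later inspection
}

console.log(normalizeAuthor("Jane Doe"));
console.log(normalizeAuthor({ "@type": "Person", givenName: "Jane", familyName: "Doe" }));
```

Display code can then switch on author.type instead of probing the shape of the value each time.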

Inserting multiple key value pair data under single _id in cloudant db at various timings?

My requirement is to store JSON key-value pairs received from an MQTT subscriber at different times under a single _id in Cloudant. However, when I try to insert a new JSON pair into an existing _id, it simply replaces the old one. I need at least 10 JSON pairs under one _id, inserted at different times.
First, you should make sure about your architectural decision to update a particular document multiple times. In general, this is discouraged, though it depends on your application. Instead, you could consider a way to insert each new piece of information as a separate document and then use a map-reduce view to reflect the state of your application.
For example (I'm going to assume that you have multiple "devices", each with some kind of unique identifier, that need to add data to a cloudant DB)
PUT
{
"info_a":"data a",
"device_id":123
}
{
"info_b":"data b",
"device_id":123
}
{
"info_a":"message a",
"device_id":1234
}
Then you'll need a map function like
_design/device/_view/state
function (doc) {
emit(doc.device_id, 1);
}
Then you can GET the results of that view to see all of the "info_X" data that is associated with the particular device.
GET account.cloudant.com/databasename/_design/device/_view/state
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1},
{"id":"eaa710a5fa1ff4ba6156c997ddf6099b","key":1234,"value":1}
]}
Then you can use the query parameters to control the output, for example
GET account.cloudant.com/databasename/_design/device/_view/state?key=123&include_docs=true
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1,"doc":
{"_id":"28324b34907981ba972937f53113ac3f",
"_rev":"1-bac5dd92a502cb984ea4db65eb41feec",
"info_b":"data b",
"device_id":123}
},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1,"doc":
{"_id":"d50553d206d722b960fb176f11841974",
"_rev":"1-a2a6fea8704dfc0a0d26c3a7500ccc10",
"info_a":"data a",
"device_id":123}}
]}
And now you have the complete state for device_id:123.
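On the client side, the rows returned with include_docs=true can be folded into one state object per device. This is a minimal sketch: deviceState is a hypothetical helper, and it assumes that when two documents carry the same field, the later row should win, which may not match your application's rules.

```javascript
// Rows shaped like the include_docs=true response above.
const rows = [
  { id: "28324b34907981ba972937f53113ac3f", key: 123, value: 1,
    doc: { _id: "28324b34907981ba972937f53113ac3f", _rev: "1-bac5dd92a502cb984ea4db65eb41feec", info_b: "data b", device_id: 123 } },
  { id: "d50553d206d722b960fb176f11841974", key: 123, value: 1,
    doc: { _id: "d50553d206d722b960fb176f11841974", _rev: "1-a2a6fea8704dfc0a0d26c3a7500ccc10", info_a: "data a", device_id: 123 } },
];

// Hypothetical helper: merge all fields except _id/_rev into one object.
// Assumption: on duplicate field names the later row overwrites the earlier one.
function deviceState(viewRows) {
  const state = {};
  for (const { doc } of viewRows) {
    for (const [field, value] of Object.entries(doc)) {
      if (field === "_id" || field === "_rev") continue;
      state[field] = value;
    }
  }
  return state;
}

const state = deviceState(rows);
console.log(state);
```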
Timing
Another issue is the rate at which you're updating your documents.
The bottom-line recommendation is that if you are only updating the document about once per minute or less frequently, then it could be reasonable for your application to update a single document. That is, you'd add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database. You must make sure that you are providing the most recent _rev of that document, and you should also check for conflicts that could occur if the document is being updated by multiple devices.
If you are acquiring new data for a particular device at a high rate, you'll likely run into conflicts very frequently -- because cloudant is a distributed document store. In this case, you should follow something like the example I gave above.
Example flow for the second approach outlined by @gadamcox for use cases where document updates are not required very frequently:
[...] you'd add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database.
Your application first fetches the existing document by id: (https://docs.cloudant.com/document.html#read)
GET /$DATABASE/100
{
"_id": "100",
"_rev": "1-2902191555...",
"No": ["1"]
}
Then your application updates the document in memory
{
"_id": "100",
"_rev": "1-2902191555...",
"No": ["1","2"]
}
and saves it in the database by specifying the _id and _rev (https://docs.cloudant.com/document.html#update)
PUT /$DATABASE/100
{
"_id": "100",
"_rev": "1-2902191555...",
"No":["1","2"]
}
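The fetch, modify, save cycle, including a retry when the _rev has gone stale, can be sketched as follows. Everything here is an in-memory stand-in: FakeDb and appendValue are hypothetical names, and the revision strings are simplified (real revisions are generated by the server). The only behaviour borrowed from the real service is that a write with an outdated _rev is rejected with HTTP status 409.

```javascript
// In-memory stand-in for a database that rejects writes carrying a stale _rev.
class FakeDb {
  constructor() {
    this.docs = new Map();
  }
  get(id) {
    const doc = this.docs.get(id);
    return doc ? JSON.parse(JSON.stringify(doc)) : null; // return a copy
  }
  put(doc) {
    const current = this.docs.get(doc._id);
    if (current && current._rev !== doc._rev) {
      const err = new Error("conflict");
      err.statusCode = 409;
      throw err;
    }
    // Simplified revision: a generation counter plus a fixed suffix.
    const generation = current ? parseInt(current._rev, 10) + 1 : 1;
    const saved = { ...doc, _rev: `${generation}-fake` };
    this.docs.set(doc._id, saved);
    return saved;
  }
}

// Hypothetical helper: GET the doc, append a value, PUT it back, retry on 409.
function appendValue(db, id, value, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = db.get(id);  // read the latest version, including _rev
    doc.No.push(value);      // modify in memory
    try {
      return db.put(doc);    // write back with the _rev we just read
    } catch (err) {
      if (err.statusCode !== 409 || attempt === maxAttempts) throw err;
    }
  }
}

const db = new FakeDb();
db.put({ _id: "100", No: ["1"] });
const updated = appendValue(db, "100", "2");
console.log(updated);
```

Re-reading the document on each attempt is what makes the retry safe: the new value is always appended to the latest stored state rather than to a stale copy.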

How to overwrite object Id's in Mongo db while creating an App in Sails

I am new to Sails and MongoDB. Currently I am trying to implement CRUD functionality using Sails, where I want to save user details in MongoDB. In the model I have the following attributes:
"id":{
type:'Integer',
min:100,
autoincrement:true
},
attributes: {
name:{
type:'String',
required:true,
unique:true
},
email_id:{
type:'EMAIL',
required:false,
unique:false
},
age:{
type:'Integer',
required:false,
unique:false
}
}
I want to ensure that the _id is overridden with my own values, starting from 100 and auto-incremented with each new entry. I am using the Waterline model, and when I call the API in DHC, I get the following output:
"name": "abc"
"age": 30
"email_id": "abc@gmail.com"
"id": "5587bb76ce83508409db1e57"
Here the id given is the ObjectId. Can somebody tell me how to override the ObjectId with an integer that starts from 100 and is auto-incremented with every new entry?
Attention: a Mongo id should be as unique as possible in order to scale well. The default ObjectId consists of a timestamp, machine ID, process ID and a random incrementing value. Leaving it with only the latter would make it collision-prone.
However, sometimes you badly want to prettify the never-ending ObjectID value (i.e. to be shown in the URL after encoding). Then, you should consider using an appropriate atomic increment strategy.
Overriding the _id example:
db.testSOF.insert({_id:"myUniqueValue", a:1, b:1})
Making an Auto-Incrementing Sequence:
Use Counters Collection: basically a separate collection which keeps track of the last number of the sequence. Personally, I have found it more cohesive to store the findAndModify function in the system.js collection, although that means it is not under version control.
Optimistic Loop
Edit:
I've found an issue in which the owner of sails-mongo said:
MongoDb doesn't have an auto incrementing attribute because it doesn't
support it without doing some kind of manual sequence increment on a
separate collection or document. We don't currently do this in the
adapter but it could be added in the future or if someone wants to
submit a PR. We do something similar for sails-disk and sails-redis to
get support for autoIncremeting fields.
He mentions the first technique I added in this answer:
Use Counters Collection. In the same issue, lewins shows a workaround.
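The counters technique can be sketched in memory as follows. This is not the sails-mongo workaround from the issue: counters, getNextSequence and createUser are hypothetical names, and a Map stands in for the counters collection. Against a real server the read-and-increment has to be one atomic operation (for example findAndModify with $inc), otherwise two writers can receive the same number.

```javascript
// In-memory stand-in for a counters collection: one entry per sequence name.
const counters = new Map();

// Hypothetical helper: return the next number of a named sequence.
// The first call returns startAt (100 here, as asked in the question).
function getNextSequence(name, startAt = 100) {
  const current = counters.has(name) ? counters.get(name) : startAt - 1;
  const next = current + 1;
  counters.set(name, next);
  return next;
}

// Assign the generated integer as the id when creating a record.
const users = [];
function createUser(attrs) {
  const user = { id: getNextSequence("userid"), ...attrs };
  users.push(user);
  return user;
}

const firstUser = createUser({ name: "abc", age: 30 });
const secondUser = createUser({ name: "def", age: 25 });
console.log(firstUser.id, secondUser.id);
```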

MongoDB: how to set collection version?

I'm currently using MongoDB and I have a collection called Product. I have a requirement in the system that asks to increment the collection version whenever any change happens to the collection (e.g. add a new product, remove, change price, etc...).
Question: Is there a recommended approach to set versions for collections in MongoDB?
I was expecting to find something like that:
db.collection.Product.setVersion("1.0.0");
and the corresponding get method:
db.collection.Product.getVersion();
I'm not sure if it makes sense. Personally, I would love to have collection metadata provided as a native implementation from MongoDB. Is there any document database that does so?
MongoDB itself is completely "schemaless" and as such does not have any of its own concepts of document "metadata" or the general "version management" that you seem to be looking for. As such, the general implementation is all up to you, and documents store whatever you supply them with.
You could implement such a scheme, generally by wrapping methods to include such things as version management in updates. So on document creation you would do this:
db.collection.myinsert({ "field": 1, "other": 2 })
Which wraps a normal insert to do this:
db.collection.insert({ "field": 1, "other": 2, "__v": 0 })
Having that data any "updates" would need to provide a similar wrapper. So this:
db.collection.myupdate({ "field": 1 },{ "$set": { "other": 4 } })
Actually does a check for the same version as held and "increments" the version at the same time via $inc:
db.collection.update(
{ "field": 1, "__v": 0 },
{
"$set": { "other": 4 },
"$inc": { "__v": 1 }
}
)
That means the document to be modified in the database needs to match the same "version" as what is in memory in order to update. Changing the version number means subsequent updates with stale data would not succeed.
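The effect of the version check can be seen in a small in-memory simulation. The array stands in for the collection, and myInsert and myUpdate are hypothetical wrappers modelled on the ones above. The returned nModified field only mimics the idea of "how many documents were changed" and is not meant to reproduce the exact result format of any driver.

```javascript
// In-memory stand-in for a collection.
const collection = [];

// Wraps an insert so that every new document starts at version 0.
function myInsert(doc) {
  const stored = { ...doc, __v: 0 };
  collection.push(stored);
  return { ...stored }; // the caller's in-memory copy
}

// Applies the changes only if the stored version equals expectedVersion,
// then increments the version (the $set + $inc pair from the example above).
function myUpdate(query, expectedVersion, set) {
  const target = collection.find(
    (doc) =>
      doc.__v === expectedVersion &&
      Object.entries(query).every(([key, value]) => doc[key] === value)
  );
  if (!target) return { nModified: 0 }; // stale version or no match
  Object.assign(target, set);
  target.__v += 1;
  return { nModified: 1 };
}

const inMemory = myInsert({ field: 1, other: 2 });
const firstWrite = myUpdate({ field: 1 }, inMemory.__v, { other: 4 });
// inMemory.__v is still 0, so this second write carries a stale version.
const staleWrite = myUpdate({ field: 1 }, inMemory.__v, { other: 5 });
console.log(firstWrite, staleWrite, collection[0]);
```

A caller that sees nothing modified would typically re-read the document and decide whether to retry with the fresh version.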
Generally though, there are several Object Document Mapper or ODM implementations available for various languages that have the sort of functionality built in. You would probably be best off looking at the Drivers section of the documentation to find something suitable for your language implementation. Also a little extra reading up on MongoDB would help as well.