How to overwrite object IDs in MongoDB while creating an app in Sails

I am new to Sails and MongoDB. Currently I am trying to implement CRUD functionality using Sails, where I want to save user details in MongoDB. In the model I have the following attributes:
"id":{
type:'Integer',
min:100,
autoincrement:true
},
attributes: {
name:{
type:'String',
required:true,
unique:true
},
email_id:{
type:'EMAIL',
required:false,
unique:false
},
age:{
type:'Integer',
required:false,
unique:false
}
}
I want to ensure that the _id is overridden with my own values, starting from 100 and auto-incremented with each new entry. I am using the Waterline model, and when I call the API in DHC, I get the following output:
"name": "abc"
"age": 30
"email_id": "abc#gmail.com"
"id": "5587bb76ce83508409db1e57"
Here the id given is the ObjectId. Can somebody tell me how to override the ObjectId with an integer starting from 100 that is auto-incremented with every new value?

Attention: a Mongo _id should be as unique as possible in order to scale well. The default ObjectId consists of a timestamp, a machine ID, a process ID, and an incrementing counter seeded with a random value. Leaving it with only the counter would make it collision-prone.
However, sometimes you badly want to prettify the never-ending ObjectId value (e.g. to be shown in a URL after encoding). Then you should consider using an appropriate atomic increment strategy.
Overriding the _id example:
db.testSOF.insert({_id:"myUniqueValue", a:1, b:1})
Making an Auto-Incrementing Sequence:
Use a Counters Collection: basically a separate collection that keeps track of the last number in the sequence. Personally, I have found it more cohesive to store the findAndModify function in the system.js collection, although it lacks version control's capabilities.
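A minimal shell sketch of the counters-collection pattern (kept inline here rather than in system.js; the collection and sequence names are illustrative, and the counter is seeded so the first generated id is 100):

db.counters.insert({ _id: "userid", seq: 99 })

function getNextSequence(name) {
  // atomically increment and return the next value of the sequence
  var ret = db.counters.findAndModify({
    query: { _id: name },
    update: { $inc: { seq: 1 } },
    new: true
  });
  return ret.seq;
}

db.users.insert({ _id: getNextSequence("userid"), name: "abc", age: 30 })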
Optimistic Loop: calculate a candidate _id from the current maximum, attempt the insert, and retry when another writer wins the race.
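A rough sketch of that loop, adapted from the pattern in the MongoDB docs (collection name and starting value are illustrative):

function insertDocument(doc, targetCollection) {
  while (true) {
    // find the current highest _id and try to claim the next value
    var cursor = targetCollection.find({}, { _id: 1 }).sort({ _id: -1 }).limit(1);
    doc._id = cursor.hasNext() ? cursor.next()._id + 1 : 100;
    var res = targetCollection.insert(doc);
    if (!res.hasWriteError()) break;                         // success
    if (res.writeError.code !== 11000) throw res.writeError; // unexpected error
    // duplicate key: another writer claimed this _id, so loop and retry
  }
}

insertDocument({ name: "abc", age: 30 }, db.users)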
Edit:
I've found an issue in which the owner of sails-mongo said:
MongoDB doesn't have an auto-incrementing attribute because it doesn't support it without doing some kind of manual sequence increment on a separate collection or document. We don't currently do this in the adapter but it could be added in the future or if someone wants to submit a PR. We do something similar for sails-disk and sails-redis to get support for autoIncrementing fields.
He mentions the first technique I added in this answer: using a counters collection. In the same issue, lewins shows a workaround.

Related

When doing an upsert to MongoDB, is it possible to set a field with a timestamp only if other data in the record has changed?

We need to cache records for a service with a terrible API.
This service provides us with an API to query for data about our employees, but it does not tell us whether employees are new or have been updated, nor can we filter our queries for this information.
Our proposed solution to the problems this creates for us is to periodically (e.g. every 15 minutes) query all our employee data and upsert it into a Mongo database. Then, when we write to the MongoDb, we would like to include an additional property which indicates whether the record is new or whether the record has any changes since the last time it was upserted (obviously not including the field we are using for the timestamp).
The idea is, instead of querying the source directly, which we can't filter by such timestamps, we would instead query our cache which would include said timestamp and use it for a filter.
(Ideally, we'd like to write this in C# using the MongoDb driver, but more important right now is whether we can do this in an upsert call or whether we'd need to load all the records into memory, do comparisons, and then add the timestamps before upserting them....)
There might be a way of doing that, though how efficient it is remains to be seen. The update command in MongoDB can take an aggregation pipeline to perform the update operation. We can use the $addFields stage to add a new field denoting the update status, and $function to compute its value. A short example is:
db.collection.update(
  { key: 1 },
  [
    {
      "$addFields": {
        changed: {
          "$function": {
            lang: "js",
            "args": [
              "$$ROOT",
              { "key": 1, data: "somedata" }
            ],
            "body": "function(originalDoc, newDoc) { return JSON.stringify(originalDoc) !== JSON.stringify(newDoc) }"
          }
        }
      }
    }
  ],
  { upsert: true }
)
Some points to consider here are:
If the order of fields in the old and new versions of the doc is not the same, the JSON.stringify comparison will report a difference even when the data is identical; see the order-insensitive sketch after these points.
The function specified in $function will run on the server side, so ideally it needs to be lightweight. If a large number of documents gets upserted, it may or may not act as a bottleneck.
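One way to make the comparison robust to field order (a hypothetical variant, not from the original answer; arrays are still compared as-is) is to recursively sort object keys before stringifying, swapping in this body for the $function above:

"body": "function(a, b) { function norm(o) { if (o === null || typeof o !== 'object' || Array.isArray(o)) return o; return Object.keys(o).sort().reduce(function(r, k) { r[k] = norm(o[k]); return r; }, {}); } return JSON.stringify(norm(a)) !== JSON.stringify(norm(b)); }"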

Inserting multiple key-value pairs under a single _id in Cloudant DB at various times?

My requirement is to store JSON pairs arriving from an MQTT subscriber at different times under a single _id in Cloudant, but I'm facing an error when trying to insert a new JSON pair into an existing _id: it simply replaces the old one. I need at least 10 JSON pairs under one _id, injected at different times.
First, you should make sure about your architectural decision to update a particular document multiple times. In general this is discouraged, though it depends on your application. Instead, you could insert each new piece of information as a separate document and then use a map-reduce view to reflect the state of your application.
For example (I'm going to assume that you have multiple "devices", each with some kind of unique identifier, that need to add data to a Cloudant DB):
PUT
{
  "info_a": "data a",
  "device_id": 123
}
{
  "info_b": "data b",
  "device_id": 123
}
{
  "info_a": "message a",
  "device_id": 1234
}
Then you'll need a map function like
_design/device/_view/state

function (doc) {
  emit(doc.device_id, 1);
}
Then you can GET the results of that view to see all of the "info_X" data that is associated with the particular device.
GET account.cloudant.com/databasename/_design/device/_view/state
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1},
{"id":"eaa710a5fa1ff4ba6156c997ddf6099b","key":1234,"value":1}
]}
Then you can use the query parameters to control the output, for example
GET account.cloudant.com/databasename/_design/device/_view/state?key=123&include_docs=true
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1,"doc":
{"_id":"28324b34907981ba972937f53113ac3f",
"_rev":"1-bac5dd92a502cb984ea4db65eb41feec",
"info_b":"data b",
"device_id":123}
},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1,"doc":
{"_id":"d50553d206d722b960fb176f11841974",
"_rev":"1-a2a6fea8704dfc0a0d26c3a7500ccc10",
"info_a":"data a",
"device_id":123}}
]}
And now you have the complete state for device_id:123.
Timing
Another issue is the rate at which you're updating your documents.
Bottom line recommendation: if you are updating the document only about once per minute or less frequently, it could be reasonable for your application to update a single document, i.e. add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database. You must make sure that you are providing the most recent _rev of that document, and you should also check for conflicts that could occur if the document is being updated by multiple devices.
If you are acquiring new data for a particular device at a high rate, you'll likely run into conflicts very frequently, because Cloudant is a distributed document store. In this case, you should follow something like the example I gave above.
Example flow for the second approach outlined by @gadamcox, for use cases where document updates are not required very frequently:
[...] you'd add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database.
Your application first fetches the existing document by id: (https://docs.cloudant.com/document.html#read)
GET /$DATABASE/100
{
  "_id": "100",
  "_rev": "1-2902191555...",
  "No": ["1"]
}
Then your application updates the document in memory
{
  "_id": "100",
  "_rev": "1-2902191555...",
  "No": ["1", "2"]
}
and saves it in the database by specifying the _id and _rev (https://docs.cloudant.com/document.html#update)
PUT /$DATABASE/100
{
  "_id": "100",
  "_rev": "1-2902191555...",
  "No": ["1", "2"]
}

mongodb upsert with conditional field update

I have a script that populates a mongo db from daily server log files. Log files come from a number of servers so the chronological order of the data is not guaranteed. To make this simple, let's say that the document schema is this:
{
  _id: <username>,
  first_seen: <date>,
  last_seen: <date>,
  most_recent_ip: <string>
}
that is, documents are indexed by the name of the user who accessed the server. For each user, we keep track of the first time the user was seen and the ip from the last visit.
Right now I handle this very inefficiently: first I try an insert. If it fails, I retrieve the record by _id, calculate the updated values (e.g. first_seen and most_recent_ip), and finally update the record. This is 3 db calls per log entry, which makes the script's running time prohibitively long given the very high volume of data.
I'm wondering if I can replace this with an upsert instead. I can see how to handle first_seen/last_seen: probably something like {$min: {'first_seen': <log_entry_date>}} (I hope this works correctly when inserting a new doc). But how do I set most_recent_ip to the new value only when <log_entry_date> > last_seen?
Is there generally a preferred pattern for my use case?
You can just use $set to set the most_recent_ip, together with $min and $max for the dates (on the insert half of an upsert, $min and $max simply seed the fields with the given value), e.g.
db.logs.update(
  { _id: "user1" },
  {
    $set: { most_recent_ip: "2.2.2.2" },
    $min: { first_seen: new Date() },
    $max: { last_seen: new Date() }
  },
  { upsert: true }
)
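Note that the plain $set above updates most_recent_ip unconditionally, so out-of-order log entries would overwrite a newer IP. If you need most_recent_ip to change only when the incoming entry is newer than the stored last_seen, one option is the aggregation-pipeline form of update (MongoDB 4.2+). This is a sketch, not part of the answer above; logDate stands for the date parsed from the log line being processed:

var logDate = ISODate("2012-12-28T00:00:00Z"); // date parsed from the current log entry
db.logs.update(
  { _id: "user1" },
  [
    {
      $set: {
        first_seen: { $min: ["$first_seen", logDate] }, // $min/$max ignore a missing field
        last_seen: { $max: ["$last_seen", logDate] },
        most_recent_ip: {
          $cond: [
            { $gte: [logDate, { $ifNull: ["$last_seen", logDate] }] },
            "2.2.2.2",         // this entry is the newest seen so far: take its IP
            "$most_recent_ip"  // out-of-order entry: keep the stored IP
          ]
        }
      }
    }
  ],
  { upsert: true }
)

All expressions in a single $set stage read the pre-update document, so the $cond still compares against the old last_seen.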

Generation of _id vs. ObjectId autogeneration in MongoDB

I'm developing an application that creates permalinks. I'm not sure how to save the documents in MongoDB. Two strategies:
ObjectId autogeneration
MongoDB autogenerates the _id. I need to create an index on the permalink field because I look the information up by permalink. I can also access the creation time of the ObjectId using the getTimestamp() method, so a datetime field seems redundant; but if I delete this field I need two calls to MongoDB: one to get the information and another to get the timestamp.
{
  "_id": ObjectId("5210a64f846cb004b5000001"),
  "permalink": "ca8W7mc0ZUx43bxTuSGN",
  "data": "a lot of stuff",
  "datetime": ISODate("2013-08-18T11:47:43.460-01:00")
}
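For reference, the creation time can be read straight off the ObjectId in the shell (the collection name here is hypothetical):

db.permalinks.findOne({ permalink: "ca8W7mc0ZUx43bxTuSGN" })._id.getTimestamp()
// returns an ISODate with the creation time embedded in the ObjectId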
Generate _id
I generate the _id with the permalink.
{
  "_id": "ca8W7mc0ZUx43bxTuSGN",
  "data": "a lot of stuff",
  "datetime": ISODate("2013-08-18T11:47:43.460-01:00")
}
I don't see any advantage to using ObjectIds. Am I missing something?
ObjectIds are there for situations where you don't have a unique key for every document in a collection. They're unique, so you don't have to worry about conflicts, and they shard reasonably well in large deployments without too much worry (they have their pros and cons; read more here).
The ObjectId also contains the timestamp of the client where the ObjectId was generated (unless the DB server is configured to generate all keys). With that, as you noticed, you can use the timestamp to perform some date operations. However, if you plan on using the Aggregation Framework, you'll find that you currently can't use an ObjectId in any date operations (issue). If you want to use the AF, you'll need a second field that contains the date, unfortunately storing it twice: once on its own and once inside the ObjectId's internal value.
If you can be assured that the _id you're generating is unique, then there's not much reason to use an ObjectId in your data structure.

Doing an upsert in mongo, can I specify a custom query for the "insert" case? [duplicate]

I am trying to use upsert in MongoDB to update a single field in a document if found, OR insert a whole new document with lots of fields. The problem is that MongoDB appears to either replace every field or insert only a subset of fields in its upsert operation, i.e. it cannot insert more fields than it actually wants to update.
What I want to do is the following:
I query for a single unique value
If a document already exists, only a timestamp value (let's call it 'lastseen') is updated to a new value
If a document does not exist, I will add it with a long list of different key/value pairs that should remain static for the remainder of its lifespan.
Let's illustrate:
From my understanding, this example would update the 'lastseen' date if 'name' is found, but if 'name' is not found it would insert only 'name' + 'lastseen'.
db.somecollection.update({name: "some name"},{ $set: {"lastseen": "2012-12-28"}}, {upsert:true})
If I added more fields (key/value pairs) to the second argument and dropped the $set, then every field would be replaced on update, but it would have the desired effect on insert. Is there anything like $insert or similar to perform operations only when inserting?
So it seems to me that I can only get one of the following:
The correct update behavior, but it would insert a document with only a subset of the desired fields if the document does not exist
The correct insert behavior, but it would then overwrite all existing fields if the document already exists
Is my understanding correct? If so, is this possible to solve with a single operation?
MongoDB 2.4 has $setOnInsert
db.somecollection.update(
  { name: "some name" },
  {
    $set: {
      "lastseen": "2012-12-28"
    },
    $setOnInsert: {
      "firstseen": <TIMESTAMP> // set on insert, not on update
    }
  },
  { upsert: true }
)
There is a feature request for this (https://jira.mongodb.org/browse/SERVER-340), which is resolved in 2.3. Odd releases are actually dev releases, so this will be in the 2.4 stable.
So there is no real way to do this in the current stable versions yet. I am afraid the only method at the moment is to do conditional queries: one to check for the document, then an if to either insert or update; see the sketch below.
I suppose if you had real problems with locking here you could do this with server-side JS, but that's evil; however, it would confine the update to a single thread.
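A sketch of that pre-2.4 fallback, using the field names from the question (note it is not atomic: two clients can race between the find and the insert):

var existing = db.somecollection.findOne({ name: "some name" });
if (existing) {
  // document found: only bump the timestamp
  db.somecollection.update({ name: "some name" }, { $set: { lastseen: "2012-12-28" } });
} else {
  // not found: insert the full document with all its static fields
  db.somecollection.insert({
    name: "some name",
    firstseen: "2012-12-28",
    lastseen: "2012-12-28"
    // ...plus the rest of the static key/value pairs
  });
}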