Sometimes findOneAndUpdate inserts duplicates - mongodb

I am facing a problem where MongoDB inserts the same document twice, although it happens only rarely.
I am building a live stats website that covers only soccer stats. To insert a document into the database, I first check whether the same match already exists; if it does, I update it with the new values. That's it. I cannot share all of the code, but I am using the findOneAndUpdate method with the upsert option. In the findOneAndUpdate callback I modify the returned document and then persist it with doc.save(). The script is driven by a cron job (the node-cron library) that runs every minute. Most of the time I get the results I want, updated with the new values when they exist, but sometimes results are inserted twice, as you can see in the image.
// in an async forEach
await MatchObject.findOneAndUpdate(
    { matchid: id },  // find a document with that filter
    modelDoc,         // document to insert when nothing was found
    { upsert: true, new: true, runValidators: true, useFindAndModify: false }, // options
    function (err, doc) { }
)
// Inside the findOneAndUpdate callback I also call save() on the returned document after modifying it:
doc.save().then(res => console.log(res));
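Putting those pieces together, one iteration of the cron job looks roughly like this (a simplified sketch; id, modelDoc and newStats stand in for the real values in my code):
const doc = await MatchObject.findOneAndUpdate(
    { matchid: id },  // find the match by its external id
    modelDoc,         // values to write, or the document to insert when nothing matches
    { upsert: true, new: true, runValidators: true, useFindAndModify: false }
);
doc.stats = newStats; // placeholder for the modification applied to the returned document
await doc.save();     // persist the modification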
node v11.15.0
MongoDB shell version v4.0.10
mongoDB version v4.0.10
mongoose 5.6.2

Related

How to avoid mongo returning duplicated documents when iterating a cursor over a constantly updated big collection?

Context
I have a big collection with millions of documents which is constantly updated by the production workload. When performing a query, I have noticed that a document can be returned multiple times; my workload migrates the documents to a SQL system that enforces unique row ids, so it crashes.
Problem
Because the collection is so big and lots of users are updating it after the query is started, iterating over the cursor's results may give me documents with the same id (the old and the updated version).
What I've tried
// in the Node.js driver this needs to run inside an async function,
// since hasNext() and next() return promises
const cursor = db.collection.find(query, { snapshot: true });
while (await cursor.hasNext()) {
    const doc = await cursor.next();
    // do some stuff
}
Based on old documentation for the mongo driver (I'm using Node.js, but this is applicable to any official MongoDB driver), there is an option called snapshot which is said to avoid exactly what is happening to me. Sadly, the driver returns an error indicating that this option does not exist (it was deprecated).
Question
Is there a way to iterate through the documents of a collection in a safe fashion, so that I don't get the same document twice?
I only see a viable option with aggregation pipeline, but I want to explore other options with standard queries.
Finally I got the answer from a mongo changelog page:
MongoDB 3.6.1 deprecates the snapshot query option.
For MMAPv1, use hint() on the { _id: 1} index instead to prevent a cursor from returning a document more than once if an intervening write operation results in a move of the document.
For other storage engines, use hint() with { $natural : 1 } instead.
So, from my code example:
const cursor = db.collection.find(query).hint({ $natural: 1 });
while (await cursor.hasNext()) {
    const doc = await cursor.next();
    // do some stuff
}

Can we use callbacks in the mongo shell?

I want to insert one document into collection1 and after successfully inserting the document, I want to insert another document into collection2. One of the fields for the document in collection2 will be the _id of the document just inserted into collection1.
I am using a callback:
db.collection1.insert(<document>, function(err, doc) {
    db.collection2.insert({ collection1_id: doc[0]._id, <field>: <value> })
})
However, it seems that callbacks are not available outside Node.js.
Is there any workaround?
Callbacks are part of the Node.js async API and are not supported in the mongo shell (as at MongoDB 4.0). However, you can always write the equivalent without callbacks.
The mongo shell's insertOne() method will return an insertedId field with the _id value of the inserted document, so you can either save or reference this value.
For example:
db.collection2.insertOne({
    collection1_id: db.collection1.insertOne({}).insertedId,
    field: 'value'
})
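Equivalently, you can keep the inserted _id in a shell variable first if you need to reference it more than once (the document contents here are just placeholders):
var result = db.collection1.insertOne({ name: 'example' });                        // placeholder document
db.collection2.insertOne({ collection1_id: result.insertedId, field: 'value' });   // reference the saved _id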

Watch mongo update events using oplogs

I have to listen for changes to a particular field in MongoDB documents and send data to the client accordingly.
I have tried looking for solutions, but the only thing I could find is to query the oplogs
(operation logs) of MongoDB.
db.collection("oplog.rs", function(err, oplog) {})
Questions:
How are we actually going to make decisions based on the oplogs (capped collections), and how frequently are we going to query the oplogs to see the changes?
Do we have any alternative solution to this problem, maybe using Mongoose?
MongoDB keeps track of all database changes in a collection called the oplog (operations log). The oplog is what you use when you need to keep track of each and every collection in the database. However, if you only need to keep track of a single collection, you can create a tailable cursor on that particular collection using the code below. (Note: only capped collections can have a tailable cursor.)
If you do want to use the oplog: on a replica set it is enabled by default, but if you are working with a single-instance MongoDB server you need to enable it by starting the server as a one-member replica set (and then initiating it once with rs.initiate() in the shell):
mongod --replSet rs0
Then, to read the records from the oplog.rs collection using Mongoose, you can try the following code.
var mongoose = require('mongoose');
var db = mongoose.connect('mongodb://localhost/local'); // the local db holds the oplog.rs collection
mongoose.connection.on('open', function callback() {
    var collection = mongoose.connection.db.collection('oplog.rs'); // or any capped collection
    var stream = collection.find({}, {
        tailable: true,
        awaitdata: true,
        numberOfRetries: Number.MAX_VALUE
    }).stream();
    stream.on('data', function(val) {
        console.log('Doc: %j', val);
    });
    stream.on('error', function(val) {
        console.log('Error: %j', val);
    });
    stream.on('end', function() {
        console.log('End of stream');
    });
});
I hope this helps; the code above shows how to implement a tailable cursor with Mongoose.
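If you would rather tail one of your own collections instead of the oplog, remember it has to be capped. A capped collection can be created like this (the collection name and sizes are just examples):
mongoose.connection.db.createCollection('liveScores', {
    capped: true,        // required for a tailable cursor
    size: 1024 * 1024,   // fixed size in bytes
    max: 5000            // optional cap on the number of documents
});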

MongoDB: Is it possible to make sure to add a column to a document before inserting into a collection

Let's say I have a collection named products. I want to make sure that whenever a document in this collection is inserted or updated, I check whether a viewCount field is present. If it is, I let the create/update operation complete; otherwise, I want to add the field and set its value to zero.
The challenge is that there are a lot of such operations in the application code, so I am looking for a way to accomplish this at the DB level. Is this possible?
Use findAndModify:
db.products.findAndModify({
    query: { yourQuery },
    update: { fieldsToCreate, $inc: { viewCount: 1 } },
    new: true,
    upsert: true
})
where fieldsToCreate is a partial document of the values you want to create if the document does not exist. The new document will be returned with viewCount set to 1, which is correct, since it was viewed 1 time when returned.
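As a concrete illustration (the lookup and field names are assumed), the "fields to create only when the document does not exist" part can also be written explicitly with $setOnInsert, so it never clashes with the $inc:
db.products.findAndModify({
    query: { sku: 'abc-123' },                     // assumed lookup field
    update: {
        $setOnInsert: { name: 'Sample product' },  // written only when the upsert inserts a new document
        $inc: { viewCount: 1 }                     // starts at 1 on insert, increments otherwise
    },
    new: true,
    upsert: true
})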

Which is the best way to insert data in mongodb

While writing data to MongoDB, we check whether the data is already present: if it is, we get the _id and update the document using save(); otherwise we add the data using insert(). I have read that save() is the best way if you provide an _id, since it will update or insert depending on whether that _id is already present in the db. Is save() the best method, or is there another way?
If you have all data available to save, just run update() each time but use the upsert functionality. Only one query required:
db.collection.update(
    { _id: id },      // match on the known _id
    data,             // the document to write
    { upsert: true }  // insert it if no document matches
);
If your _id is generated by MongoDB, you already know there is a record in the database, so update is the one to use, though you could also use save().
If you generate your own ids (and thus don't know whether the document already exists in the collection), this upsert will always work without having to run an extra query.
From the documentation
db.collection.save()
Updates an existing document or inserts a new document, depending on its document parameter.
db.collection.insert()
Inserts a document or documents into a collection.
If you use db.collection.insert() in your case you will get a duplicate key error, since it will try to insert a new document with the same _id as an existing document. So instead of save(), you should use the update() method.
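To illustrate the difference with a made-up document (the values are placeholders):
db.collection.insert({ _id: 1, name: 'a' });                         // first insert succeeds
db.collection.insert({ _id: 1, name: 'b' });                         // fails with a duplicate key error
db.collection.update({ _id: 1 }, { name: 'b' }, { upsert: true });   // updates the existing document instead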