Bulk.getOperations() in MongoDB Node driver - mongodb

I'd like to view the results of a bulk operation, specifically to know the IDs of the documents that were updated. I understand that this information is made available through the Bulk.getOperations() method. However, it doesn't appear that this method is available through the MongoDB NodeJS library (at least, the one I'm using).
Could you please let me know if there's something I'm doing wrong here:
const bulk = db.collection('companies').initializeOrderedBulkOp()
const results = getLatestFinancialResults() // from remote API
results.forEach(result =>
  bulk.find({
    name: result.companyName,
    report: { $ne: result.report }
  }).updateOne([
    { $unset: 'prevReport' },
    { $set: { prevReport: '$report' } },
    { $unset: 'report' },
    { $set: { report: result.report } }
  ]))
await bulk.execute()
await bulk.getOperations() // <-- fails, undefined in TypeScript library
I get a static IDE error:
Uncaught TypeError: bulk.getOperations is not a function

I'd like to view the results of a bulk operation, specifically to know the IDs of the documents that were updated
As of MongoDB server v6.x, there is no method that returns the IDs of updated documents from a bulk operation (only for insert and upsert operations). However, there may be a workaround depending on your use case.
The manual page you linked for Bulk.getOperations() is for mongosh, the MongoDB Shell application. If you look into the source code for getOperations() in mongosh, it's just a convenience wrapper around batches. The batches property returns the list of operations sent for the bulk execution.
As you are utilising ordered bulk operations, MongoDB executes the operations serially. If an error occurs during the processing of one of the write operations, MongoDB will return without processing any remaining write operations in the list.
Depending on your use case, you could modify the bulk.find() part to filter on _id, for example:
bulk.find({"_id": result._id}).updateOne({$set:{prevReport:"$report"}});
You should be able to see the _id value of each operation in the batches, i.e.
await bulk.execute();
console.log(JSON.stringify(bulk.batches));
Example output:
{
  "originalZeroIndex": 0,
  "currentIndex": 0,
  "originalIndexes": [0],
  "batchType": 2,
  "operations": [{
    "q": { "_id": "634354787d080d3a1e3da51f" },
    "u": { "$set": { "prevReport": "$report" } }
  }],
  "size": 0,
  "sizeBytes": 0
}
For additional information, you could also inspect the BulkWriteResult; for example, use getLastOp() to retrieve the last operation (in case of a failure).
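Putting the pieces together, here is a minimal sketch (mirroring the answer's _id-based filter; note that batches records what was sent, so whether an operation actually matched still has to be checked against the BulkWriteResult counts):
const bulk = db.collection('companies').initializeOrderedBulkOp()
// Filter on _id so the id is recorded in the operation's query ("q")
results.forEach(result =>
  bulk.find({ _id: result._id }).updateOne({ $set: { prevReport: '$report' } }))
const writeResult = await bulk.execute()
// bulk.batches holds the operations that were sent; pull the _id
// out of each recorded update filter
const targetedIds = bulk.batches.flatMap(batch =>
  batch.operations.map(op => op.q._id))
console.log(targetedIds, writeResult.modifiedCount)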

Related

MongoDB collection find returns from the first entry, not the last, on the client side

I am using MongoDB and querying the database with some conditions, which works fine, but the results come ordered from the first entry to the last, whereas I want to query starting from the most recently added entry in the collection.
TaggedMessages.find({taggedList:{$elemMatch:{tagName:tagObj.tagValue}}}).fetch()
Meteor uses a custom wrapped version of Mongo.Collection and Mongo.Cursor in order to support reactivity out of the box. It also abstracts the Mongo query API to make it easier to work with.
This is why the native way of accessing elements from the end is not working here.
On the server
In order to use $natural correctly with Meteor, you need to use the hint property as an option (see the last property in the documentation) on the server:
const selector = {
  taggedList: { $elemMatch: { tagName: tagObj.tagValue } }
}
const options = {
  hint: { $natural: -1 }
}
TaggedMessages.find(selector, options).fetch()
Sidenote: If you ever need to access the "native" Mongo driver, you need to use rawCollection
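As a quick sketch of that sidenote (server-side only; rawCollection() hands you the underlying Node driver collection, so driver-level options like hint work directly; selector is the one defined above):
// The native driver returns a cursor; toArray() resolves to the documents
const docs = await TaggedMessages.rawCollection()
  .find(selector, { hint: { $natural: -1 } })
  .toArray()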
On the client
On the client you have no real access to the Mongo driver but to a seemingly similar API (called the minimongo package). There you won't have $natural available (maybe in the future), so you need to use sort with a descending order:
const selector = {
  taggedList: { $elemMatch: { tagName: tagObj.tagValue } }
}
const options = {
  sort: { createdAt: -1 }
}
TaggedMessages.find(selector, options).fetch()

How to simply perform a update/replace many documents operation in MongoDb

This is perhaps one of the most common and basic database operations you can perform. However, how you perform this operation in MongoDb, is elusive.
There's an UpdateManyAsync method in the C# driver. But all I can find for it are examples like:
The following operation updates all documents where violations are greater than 4 and $set a flag for review:
try {
  db.restaurant.updateMany(
    { violations: { $gt: 4 } },
    { $set: { "Review": true } }
  );
} catch (e) {
  print(e);
}
Frankly, I don't know how you build your applications, but our organization would never embed domain logic, like how many violations constitute a review, in our database querying logic, the way all the examples I can find for this operation do.
It seems that at best our option is to use some sort of bulk API for this, which seems unnecessary for such a simple operation. Making the bulk API perform a thousand separate replaces seems inefficient to me, but I might be wrong here?
In other storage technologies and more mature API's all the code necessary to perform this operation is (example Entity framework);
db.UpdateRange(stocks);
await db.SaveChangesAsync();
Is this tracked as a feature request for the Mongo project somewhere?
The bulk API can handle it; see the code below:
var updates = new List<WriteModel<CosmosDoc>>();
foreach (var doc in docs)
{
    // ReplaceOneModel takes a filter and the replacement document
    updates.Add(new ReplaceOneModel<CosmosDoc>(
        Builders<CosmosDoc>.Filter.Eq(d => d.Id, doc.Id), doc));
}
await MongoDb.DocsCollection.BulkWriteAsync(updates, new BulkWriteOptions() { IsOrdered = false });
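For comparison, the Node.js driver exposes the same bulk machinery through collection.bulkWrite; a rough sketch (collection and docs are placeholders for your own collection handle and documents):
// Each element describes one write; replaceOne mirrors ReplaceOneModel
const updates = docs.map(doc => ({
  replaceOne: { filter: { _id: doc._id }, replacement: doc }
}))
await collection.bulkWrite(updates, { ordered: false })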

Is it possible to perform multiple DB operations in a single transaction in MongoDB?

Suppose I have two collections A and B
I want to perform an operation
db.A.remove({_id:1});
db.B.insert({_id:"1","name":"dev"})
I know MongoDB maintains atomicity at the document level. Is it possible to perform the above set of operation in a single transaction?
Yes, now you can!
MongoDB has had atomic write operations at the level of a single document for a long time, but it did not support such atomicity for multi-document operations until v4.0. Multi-document operations are now atomic in nature thanks to the release of MongoDB transactions.
But remember that transactions are only supported in replica sets using the WiredTiger storage engine, not on standalone servers (though support for standalone servers may come in the future!).
Here is a mongo shell example also provided in the official docs:
// Start a session.
session = db.getMongo().startSession( { readPreference: { mode: "primary" } } );
employeesCollection = session.getDatabase("hr").employees;
eventsCollection = session.getDatabase("reporting").events;
// Start a transaction
session.startTransaction( { readConcern: { level: "snapshot" }, writeConcern: { w: "majority" } } );
// As many operations as you want inside this transaction
try {
  employeesCollection.updateOne( { employee: 3 }, { $set: { status: "Inactive" } } );
  eventsCollection.insertOne( { employee: 3, status: { new: "Inactive", old: "Active" } } );
} catch (error) {
  // Abort transaction on error
  session.abortTransaction();
  throw error;
}
// Commit the transaction using write concern set at transaction start
session.commitTransaction();
session.endSession();
I recommend reading this and this to better understand how to use transactions!
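If you are on the Node.js driver rather than the shell, the same idea is usually written with withTransaction, which also retries on transient errors; a minimal sketch (client is your hypothetical MongoClient instance, collection names follow the shell example above):
const session = client.startSession()
try {
  // withTransaction starts, commits, and retries the transaction as needed
  await session.withTransaction(async () => {
    await client.db('hr').collection('employees')
      .updateOne({ employee: 3 }, { $set: { status: 'Inactive' } }, { session })
    await client.db('reporting').collection('events')
      .insertOne({ employee: 3, status: { new: 'Inactive', old: 'Active' } }, { session })
  })
} finally {
  await session.endSession()
}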
MongoDB can not guarantee atomicity when more than one document is involved.
Also, MongoDB does not offer any single operations which affect more than one collection.
When you want to do whatever you actually want to do in an atomic manner, you need to merge collections A and B into one collection. Remember that MongoDB is a schemaless database. You can store documents of different types in one collection and you can perform single atomic update operations which perform multiple changes to a document. That means that a single update can transform a document of type A into a document of type B.
To tell different types in the same collection apart, you could have a type field and add this to all of your queries, or you could use duck-typing and identify types by checking if a certain field $exists.
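As a rough sketch of that idea (collection and field names are hypothetical), a single atomic update can retire the type-A fields and set the type-B shape in one document:
// One updateOne is atomic on the single document it touches
db.AB.updateOne(
  { _id: 1, type: 'A' },
  {
    $set: { type: 'B', name: 'dev' },  // the type-B shape
    $unset: { aOnlyField: '' }         // drop fields that only type A had
  }
)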

Use allowDiskUse in criteria query with Grails and the MongoDB plugin?

In order to iterate over all the documents in a MongoDB (2.6.9) collection using Grails (2.5.0) and the MongoDB Plugin (3.0.2) I created a forEach like this:
class MyObjectService {
    def forEach(Closure func) {
        def criteria = MyObject.createCriteria()
        def ids = criteria.list { projections { id() } }
        ids.each { func(MyObject.get(it)) }
    }
}
Then I do this:
class AnalysisService {
    def myObjectService

    @Async
    def analyze() {
        MyObject.withStatelessSession {
            myObjectService.forEach { myObject ->
                doSomethingAwesome(myObject)
            }
        }
    }
}
This works great...until I hit a collection that is large (>500K documents) at which point a CommandFailureException is thrown because the size of the aggregation result is greater than 16MB.
Caused by CommandFailureException: { "serverUsed" : "foo.bar.com:27017" , "errmsg" : "exception: aggregation result exceeds maximum document size (16MB)" , "code" : 16389 , "ok" : 0.0}
In reading about this, I think that one way to handle this situation is to use the option allowDiskUse in the aggregation function that runs on the MongoDB side so that the 16MB memory limit won't apply and I can get a larger aggregation result.
How can I pass this option to my criteria query? I've been reading the docs and the Javadoc for the Grails MongoDB plugin, but I can't seem to find it. Is there another way to approach the generic problem (iterate over all members of a large collection of domain objects)?
This is not possible with the current implementation of the MongoDB Grails plugin. https://github.com/grails/grails-data-mapping/blob/master/grails-datastore-gorm-mongodb/src/main/groovy/org/grails/datastore/mapping/mongo/query/MongoQuery.java#L957
If you look at the above line, you will see that the default options are used to build the AggregationOptions instance, so there is no way to provide the option.
But there is a hackish way to do it using Groovy's metaclass. Let's do it! :-)
Store the original method reference of the builder() method before writing the criteria in your service:
MetaMethod originalMethod = AggregationOptions.metaClass.static.getMetaMethod("builder", [] as Class[])
Then, replace the builder method to provide your implementation.
AggregationOptions.metaClass.static.builder = { ->
    def builderInstance = new AggregationOptions.Builder()
    builderInstance.allowDiskUse(true) // solution to your problem
    return builderInstance
}
Now, when your service method is called with the criteria query, it should no longer result in the aggregation error you were getting, since we have now set the allowDiskUse property to true.
Now, reset the original method back so that it should not affect any other call (optional).
AggregationOptions.metaClass.static.addMetaMethod(originalMethod)
Hope this helps!
Apart from this, why are you pulling all the IDs in the forEach method and then re-fetching each instance using the get() method? You are wasting database queries, which will impact performance. Also, if you follow the approach below, you don't have to make the above changes.
An example of the same (UPDATED):
class MyObjectService {
    void forEach(Closure func) {
        List<MyObject> instanceList = MyObject.createCriteria().list {
            // Your criteria code
            eq("status", "ACTIVE") // an example
        }
        // Don't do any of this
        // println(instanceList)
        // println(instanceList.size())
        // *** explained below
        instanceList.each { myObjectInstance ->
            func(myObjectInstance)
        }
    }
}
(I'm not adding the code of AnalysisService since there is no change)
*** The main point is here. Whenever you write a criteria query on a domain class (without a projection, and on Mongo), after executing the criteria code, Grails/gmongo will not immediately fetch the records from the database unless you call a method like toString(), size() or dump() on the result.
Now, when you apply each to that instance list, you are not actually loading all the instances into memory; instead you are iterating over a Mongo cursor behind the scenes, and in MongoDB a cursor pulls records from the database in batches, which is extremely memory safe. So you can safely call each directly on your criteria result; it will not blow up the JVM unless you call one of the methods that triggers loading all the records from the database.
You can confirm this behaviour even in the code: https://github.com/grails/grails-data-mapping/blob/master/grails-datastore-gorm-mongodb/src/main/groovy/org/grails/datastore/mapping/mongo/query/MongoQuery.java#L1775
Whenever you write a criteria query without a projection, you get an instance of MongoResultList, which has a method named initializeFully() that is called by toString() and the other methods. But you can see that MongoResultList implements an iterator which in turn uses the MongoDB cursor to iterate over the large collection, which is, again, memory safe.
Hope this helps!

Any way in Mongo to do check-and-set as an atomic operation?

Is there any way in Mongo to do check-and-set as an atomic operation? I am implementing bookings for hotels: if there is a free room you can reserve it, but what if two or more people want to reserve at the same time? Is there anything similar to a transaction in Mongo, or any other way to solve this problem?
Yes, that's the classic use case for MongoDB's findAndModify command.
Specifically for pymongo: find_and_modify.
All updates are atomic operations over a single document. find_and_modify locks that document and returns it back in the same operation.
This allows you to combine a lock over the document during the find with the update operation.
You can find more about atomic operations:
http://www.mongodb.org/display/DOCS/Atomic+Operations
Best,
Norberto
The answers reference the findAndModify documentation, but a practical example given the OP's requirements will do it justice:
const current = new ISODate();
const timeAgoBy30Minutes = new Date(current.getTime() - 1000 * 60 * 30).toISOString();
db.runCommand(
  {
    findAndModify: "rooms",
    query: {
      "availability": true,
      "lastChecked": {
        "$lt": timeAgoBy30Minutes
      }
    },
    update: { $set: { availability: false, lastChecked: current.toISOString() } }
  }
)
In the above example, my decision to use db.runCommand versus db.rooms.findAndModify was strategic. db.runCommand returns a status indicating whether the document was updated, which allows me to perform additional work if the update succeeded. findAndModify simply returns the old document, unless the new flag is passed in the argument list, in which case it returns the updated document.
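For completeness, a minimal sketch of the collection helper with the new flag (same hypothetical rooms collection as above):
// With new: true, the modified document is returned instead of the old one
const reservedRoom = db.rooms.findAndModify({
  query: { availability: true },
  update: { $set: { availability: false } },
  new: true
})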