Updating an element in all documents in a MongoDB collection

I am running the following query to update a single element in all the existing documents in the collection; essentially, I am trying to reset their value to "0".
Here is the code:
MongoCollection collection = db.GetCollection(DataAccessConfiguration.Settings.CollectionName);
var query = Query.Exists("ElementName", true);
var update = Update.Set("ElementName", "0");
collection.Update(query, update);
It only updates a single document.
How can I update all elements at once?

Updates in MongoDB affect 0 or 1 documents by default (0 only if the query specifier doesn't match anything). To update all matching documents, you need to pass UpdateFlags.Multi as the third argument to Update, i.e. collection.Update(query, update, UpdateFlags.Multi). There is also a four-argument overload of Update which accepts the "safe mode" flag as the fourth argument.
(Safe mode bundles a getLastError command with the update, and causes the driver to wait until the server acknowledges that the write has succeeded. There are various options to safe mode that will wait for acknowledgement from multiple servers if you are using a replica set, that will wait only for a certain period of time and then return with an error, etc).
Also be sure to see the C# driver documentation for details on the API.
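For comparison, the modern MongoDB drivers drop the flag entirely and make the single/multi distinction explicit in the method name (UpdateMany in the 2.x C# driver). A minimal sketch of the same reset with the current MongoDB Java driver, where db is assumed to be a com.mongodb.client.MongoDatabase and the collection name is illustrative:
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;

MongoCollection<Document> collection = db.getCollection("myCollection");
// updateMany touches every matching document; updateOne would stop at the first match
collection.updateMany(
        Filters.exists("ElementName"),    // same predicate as Query.Exists above
        Updates.set("ElementName", "0")); // same assignment as Update.Set above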

Related

Atomically query for all collection documents + watch for further changes

Our Java app saves its configurations in a MongoDB collection. When the app starts, it reads all the configurations from MongoDB and caches them in Maps. We would like to use the change stream API to also watch for updates to the configuration collection.
So, upon app startup, we would first like to get all configurations, and from then on watch for any further change.
Is there an easy way to execute the following atomically:
1. A find() that retrieves all configurations (documents)
2. Start a watch() that will send all further updates
By atomically I mean without potentially missing any update (between steps 1 and 2, someone could update the collection with a new configuration).
To make sure I lose no update notifications, I found that I can use watch().startAtOperationTime(serverTime) (available in MongoDB 4.0 and later), as follows.
1. Query the MongoDB server for its current time, using a command such as Document hostInfoDoc = mongoTemplate.executeCommand(new Document("hostInfo", 1))
2. Query for all interesting documents: List<C> configList = mongoTemplate.findAll(clazz);
3. Extract the server time from hostInfoDoc: BsonTimestamp serverTime = (BsonTimestamp) hostInfoDoc.get("operationTime");
4. Start the change stream configured with the saved server time: ChangeStreamIterable<Document> changes = eventCollection.watch().startAtOperationTime(serverTime);
Since step 1 completes before step 2 starts, we know that the documents returned by step 2 are at least as fresh as the collection was at that server time, and any update that happens at or after that server time will be sent to us by the change stream. (I don't mind receiving redundant updates: I use a map as a cache, so an extra add/remove makes no difference, as long as the last action arrives.)
I think I could also use watch().resumeAfter(_idOfLastAddedDoc) (I didn't try it). I did not use that approach because of the following scenario: the collection is empty, and the first document is added after getting all (zero) documents but before starting the watch(). In that scenario I would have no previous document _id to use as a resume token.
Update
Instead of using "hostInfo" to get the server time, which couldn't be used in our production environment, I ended up using "dbStats" like this:
Document dbStats = mongoOperations.executeCommand(new Document("dbStats", 1));
BsonTimestamp serverTime = (BsonTimestamp) dbStats.get("operationTime");
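Putting the final approach together, here is a minimal sketch, assuming Spring Data MongoDB 2.x (where getCollection returns the driver's MongoCollection), MongoDB Java driver 3.8+, and a MongoDB 4.0+ replica set; the "configurations" collection name and the Config class are illustrative:
import com.mongodb.client.ChangeStreamIterable;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import java.util.List;
import org.bson.BsonTimestamp;
import org.bson.Document;

// Record the server time first, then read the current state (steps 1-3).
Document dbStats = mongoTemplate.executeCommand(new Document("dbStats", 1));
BsonTimestamp serverTime = (BsonTimestamp) dbStats.get("operationTime");
List<Config> configList = mongoTemplate.findAll(Config.class);

// Step 4: stream every change that happened at or after the recorded time.
MongoCollection<Document> configCollection = mongoTemplate.getCollection("configurations");
ChangeStreamIterable<Document> changes = configCollection.watch().startAtOperationTime(serverTime);
try (MongoCursor<ChangeStreamDocument<Document>> cursor = changes.iterator()) {
    while (cursor.hasNext()) {
        ChangeStreamDocument<Document> change = cursor.next();
        // Apply the change to the in-memory map; replayed events are harmless
        // because the cache converges on the latest state.
    }
}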

In Couchbase, expired documents are included in view query results with NULL contents

Recently we started to use Couchbase. We are using Java spring-data-couchbase with Jersey to access Couchbase. Using the low-level Java SDK API, we set an expiry time (TTL) on a particular document by its key (id). That works fine. The code is as follows.
// define couchbaseTemplate for lower-level access to the Java SDK
@Autowired
CouchbaseTemplate couchbaseTemplate;

// setExpiry updates the expiry of the document with the given ID
@Override
public void setExpiry(String key, int expN) throws RepositoryException {
    couchbaseTemplate.getCouchbaseClient().touch(key, expN);
}
The problem we face is that when we query for a list of documents, the list contains the expired ones, and when we try to access those documents from the list we find them to be null.
But if we execute the query after a while, the expired documents are no longer included in the list.
Example: with expN = 10 seconds, if we execute the query around 10 seconds after setting the TTL, the expired documents are included.
If we execute the query around 20 seconds after setting the TTL, the expired documents are no longer included.
In the stale options we set:
Query.setStale(Stale.FALSE)
We have also tried to manipulate:
Query.setIncludeDocs
But no luck. Any help?
Couchbase Server does expiries lazily. There are three ways an item can be expired:
When a document is accessed (get operation) the expiration value is checked
When the expiry pager runs
When disk compaction process runs (Only in Couchbase Server 3 and onwards)
As a result, views will not be updated until one of these three processes has happened.
For this use case you could simply do a range query against the view using the current time, so that it only returns documents that have not expired. This assumes the time is the same on the cluster as well as on the client, and that the view being used is this one:
function (doc, meta) {
    emit(meta.expiration, null);
}
The meta.expiration is an epoch timestamp, so the following query could be used:
String currentEpoch = String.valueOf((System.currentTimeMillis()/1000));
bucket.query(ViewQuery.from("designdoc", "myview").startkey(currentEpoch));
Please note that this will return all alive documents that have an expiration set.
If you want to do something more interesting with date formats, have a look at the "Date and time selection" section halfway down the "View and query examples" chapter of the Couchbase Server manual.
As the official Couchbase documentation says:
Detecting Expired Documents in Result Sets: If you are using views for indexing items from Couchbase Server, items that have not yet been removed as part of the expiry pager maintenance process will be part of a result set returned by querying the view. To exclude these items from a result set you should use the query parameter include_docs set to true.
For expired documents, if you set include_docs=true, Couchbase Server returns a result set indicating the document does not exist anymore. Specifically, the key that had expired but had not yet been removed by the cleanup process will appear in the result set as a row where "doc":null.
So, this is how Couchbase works with expired documents.
For your case, just filter out the rows where the doc is null; the rest will be your expected result.
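For example, a sketch of that filtering with the Couchbase Java SDK 2.x view API; the design document and view names follow the earlier example, and exactly how an expired-but-unpurged document surfaces (null document versus missing row) can vary by SDK version:
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.view.Stale;
import com.couchbase.client.java.view.ViewQuery;
import com.couchbase.client.java.view.ViewResult;
import com.couchbase.client.java.view.ViewRow;
import java.util.ArrayList;
import java.util.List;

ViewResult result = bucket.query(
        ViewQuery.from("designdoc", "myview").includeDocs().stale(Stale.FALSE));
List<JsonDocument> alive = new ArrayList<>();
for (ViewRow row : result) {
    JsonDocument doc = row.document(); // null when the key expired but has not been purged yet
    if (doc != null) {
        alive.add(doc);
    }
}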

Conditional update for MongoDB (Meteor)

TLDR: Is there a way I can conditionally update a Meteor Mongo record inside a collection, so that if I use the id as a selector, the update happens only if the revision number is greater than what already exists, and an upsert is performed if there is no id match?
I am having an issue with updates to server side Meteor Mongo collections, whereby it seems the added() function callback in the Observers is being triggered on an upsert.
Here is what I am trying to do in a nutshell.
My meteor js app boots and then connects to an endpoint, fetching data and then upserting it into the collection.
collection.update({'sys.id': item.sys.id}, item, {upsert: true});
The 'sys.id' selector checks to see if the item exists, and then updates if it does or adds if it does not.
I have an observer monitoring the above collection, which then acts when an item has been added/updated to the collection.
collection.find({}).observeChanges({
    added: this.itemAdded.bind(this),
    changed: this.itemChanged.bind(this),
    removed: this.itemRemoved.bind(this)
});
The first thing that puzzles me is that when the app is closed and then booted again, the 'added()' callback is fired when the collection is observed. What I would hope to happen is that the changed() callback is fired.
Going back to my original update - is it possible in Mongo to conditionally update something, so you have the selector, then the item, but only perform the update when another condition is met?
// Incoming item
var item = {
    sys: {
        id: 1,
        revision: 5
    }
};
collection.update({'sys.id': item.sys.id, 'sys.revision': {$gt: item.sys.revision}}, item, {upsert: true});
If you look at the above code, it will try to match the sys.id, which is fine, but the revisions will of course be different, which means the update will treat it as a different document and perform a new insert, thus creating duplicate data.
How do I fix this?
To your main question:
What you want is called findAndModify. First, look for the document meeting the spec, then update it accordingly. This is a really powerful idea because if you did it in two queries, the document you found could be deleted/updated before you got to update it. Luckily for you, someone made a package (I really wish this had existed a year ago!): https://github.com/fongandrew/meteor-find-and-modify
If you were to do this without using findAndModify, you'd have to use JavaScript to find the doc, see if it matches your criteria, and then update it. In your use case this would probably work, but there will always be that "what if" in the back of your mind.
Regarding observeChanges, added is called each time the local minimongo receives a document (it's just reading what DDP is telling it). Since a refresh deletes your local collection, those docs have to be added back one by one. What you could do is wait until all the added callbacks have fired and then run your server method; you'll get a ton of adds, and then a couple more changed events will trickle in afterwards.
As Matt K said, you want findAndModify. There are some gotchas to be aware of:
findAndModify is about 100x slower than a find followed by an update. A separate find plus update is, obviously, not atomic and so won't do what you need, but be aware of the speed hit. (This is based on experience with MongoDB v2.4, so run some benchmarks to confirm under your own version.)
If your query matches multiple items, findAndModify will only act on the first one. In this case, you're querying on a unique id, but be aware of the issue for future use.
findAndModify will return the document after doing its thing, but by default it returns the pre-modification version. If you want the modified one, you need to pass 'new: true' in your query options.
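To illustrate the last two points outside of Meteor, here is a sketch with the MongoDB Java driver, where findAndModify is exposed as findOneAndUpdate; the sys.id/sys.revision fields follow the question and the revision guard is illustrative:
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.FindOneAndUpdateOptions;
import com.mongodb.client.model.ReturnDocument;
import com.mongodb.client.model.Updates;
import org.bson.Document;

// Atomically: match on id, but only if the stored revision is older.
Document updated = collection.findOneAndUpdate(
        Filters.and(Filters.eq("sys.id", 1), Filters.lt("sys.revision", 5)),
        Updates.set("sys.revision", 5),
        new FindOneAndUpdateOptions().returnDocument(ReturnDocument.AFTER));
// ReturnDocument.AFTER is the driver's equivalent of 'new: true'; without it
// you get the pre-modification document. updated is null when nothing matched.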

Paginated data with the help of the Mongo inbound adapter in Spring Integration

I am using the Mongo inbound adapter to retrieve data from Mongo. Currently I am using the configuration below.
<int-mongo:inbound-channel-adapter
    id="mongoInboundAdapter"
    collection-name="updates_IPMS_PRICING"
    mongo-template="mongoTemplatePublisher"
    channel="ipmsPricingUpdateChannelSplitter"
    query="{'flagged' : false}"
    entity-class="com.snapdeal.coms.publisher.bean.PublisherVendorProductUpdate">
    <poller max-messages-per-poll="2" fixed-rate="10000"/>
</int-mongo:inbound-channel-adapter>
I have around 20 records in my database that qualify for the mentioned query, but as I am giving max-messages-per-poll a value of 2, I was expecting to get at most 2 records per poll.
Instead I am getting all the records that qualify for the query. Not sure what I am doing wrong.
Actually, I'd suggest raising a New Feature JIRA ticket for a query-expression attribute that allows you to specify an org.springframework.data.mongodb.core.query.Query builder, which has skip() and limit() options; with that, your issue could be fixed like this:
<int-mongo:inbound-channel-adapter
    query-expression="new BasicQuery('{\'flagged\' : false}').limit(2)"/>
The mongo adapter is designed to return a single message containing a collection of query results per poll. So max-messages-per-poll makes no difference here.
max-messages-per-poll is used to short-circuit the poller and, in your case, the second poll is done immediately rather than waiting 10 seconds again. After 2 polls, we wait again.
In order to implement paging, you will need to use a query-expression instead of query and maintain some state somewhere that can be included in the query on each poll.
For example, if the documents have some value that increments, you can store off that value in a bean and use it in the next poll's query to get the next batch, as sketched below.
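A hypothetical sketch of that approach, keeping the last-seen value in a pagingState bean that a downstream handler updates after each batch; the bean, its lastSeq property, and the seq field are all illustrative:
<bean id="pagingState" class="com.example.PagingState"/>

<int-mongo:inbound-channel-adapter
    collection-name="updates_IPMS_PRICING"
    mongo-template="mongoTemplatePublisher"
    channel="ipmsPricingUpdateChannelSplitter"
    query-expression="new BasicQuery('{flagged: false, seq: {$gt: ' + @pagingState.lastSeq + '}}').limit(2)">
    <poller fixed-rate="10000"/>
</int-mongo:inbound-channel-adapter>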

Multi-collection, multi-document 'transactions' in MongoDB

I realise that MongoDB, by its very nature, doesn't and probably never will support these kinds of transactions. However, I have found that I do need to use them in a somewhat limited fashion, so I've come up with the following solution, and I'm wondering: is this the best way of doing it, and can it be improved upon? (Before I go and implement it in my app!)
Obviously the transaction is controlled via the application (in my case, a Python web app). For each document in this transaction (in any collection), the following fields are added:
'lock_status': bool (true = locked, false = unlocked),
'data_old': dict (of any old values - current values really - that are being changed),
'data_new': dict (of values replacing the old (current) values - should be an identical list to data_old),
'change_complete': bool (true = the update to this specific document has occurred and was successful),
'transaction_id': ObjectId of the parent transaction
In addition, there is a transaction collection which stores documents detailing each transaction in progress. They look like:
{
    '_id': ObjectId,
    'date_added': datetime,
    'status': bool (true = all changes successful, false = in progress),
    'collections': array of collection names involved in the transaction
}
And here's the logic of the process. Hopefully it works in such a way that if it's interrupted, or fails in some other way, it can be rolled back properly.
1: Set up a transaction document
2: For each document that is affected by this transaction:
Set lock_status to true (to 'lock' the document from being modified)
Set data_old and data_new to their old and new values
Set change_complete to false
Set transaction_id to the ObjectId of the transaction document we just made
3: Perform the update. For each document affected:
Replace any affected fields in that document with the data_new values
Set change_complete to true
4: Set the transaction document's status to true (as all data has been modified successfully)
5: For each document affected by the transaction, do some clean up:
remove the data_old and data_new, as they're no longer needed
set lock_status to false (to unlock the document)
6: Remove the transaction document set up in step 1 (or as suggested, mark it as complete)
I think that logically works in such a way that if it fails at any point, all data can be either rolled back or the transaction can be continued (depending on what you want to do). Obviously all rollback/recovery/etc. is performed by the application and not the database, by using the transaction documents and the documents in the other collections with that transaction_id.
Is there any glaring error in this logic that I've missed or overlooked? Is there a more efficient way of going about it (e.g. less writing/reading from the database)?
As a generic response, multi-document commits in MongoDB can be performed as two-phase commits, which are documented fairly extensively in the manual (see: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/).
The pattern suggested by the manual is, briefly, the following:
Set up a separate transactions collection, that includes target document, source document, value and state (of the transaction)
Create new transaction object with initial as the state
Start making a transaction and update state to pending
Apply transactions to both documents (target, source)
Update transaction state to committed
Use find to determine whether documents reflect the transaction state, if ok, update transaction state to done
In addition:
You need to manually handle failure scenarios (when something doesn't happen as planned)
You need to manually implement rollback, basically by introducing a state value named canceling (a condensed sketch follows this list)
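A condensed sketch of that flow with the MongoDB Java driver, following the manual's account-transfer example; the transactions/accounts collections and field names come from the manual, and failure handling plus the canceling rollback path are omitted:
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;
import org.bson.types.ObjectId;

// transactions and accounts are MongoCollection<Document> handles
ObjectId txnId = new ObjectId();
transactions.insertOne(new Document("_id", txnId)
        .append("source", "A").append("destination", "B")
        .append("value", 100).append("state", "initial"));

// initial -> pending
transactions.updateOne(Filters.eq("_id", txnId), Updates.set("state", "pending"));

// Apply the change to both documents, tagging each with the transaction id
// so an interrupted run can later be found and resumed or rolled back.
accounts.updateOne(
        Filters.and(Filters.eq("_id", "A"), Filters.ne("pendingTransactions", txnId)),
        Updates.combine(Updates.inc("balance", -100),
                Updates.push("pendingTransactions", txnId)));
accounts.updateOne(
        Filters.and(Filters.eq("_id", "B"), Filters.ne("pendingTransactions", txnId)),
        Updates.combine(Updates.inc("balance", 100),
                Updates.push("pendingTransactions", txnId)));

// pending -> committed, then clean up and mark done
transactions.updateOne(Filters.eq("_id", txnId), Updates.set("state", "committed"));
accounts.updateMany(Filters.eq("pendingTransactions", txnId),
        Updates.pull("pendingTransactions", txnId));
transactions.updateOne(Filters.eq("_id", txnId), Updates.set("state", "done"));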
Some specific notes for your implementation:
I would discourage you from adding fields like lock_status, data_old, and data_new to the source/target documents. These should be properties of the transactions, not of the documents themselves.
To generalize the concept of target/source documents, I think you could use DBRefs: http://www.mongodb.org/display/DOCS/Database+References
I don't like the idea of deleting transaction documents when they are done. Setting state to done seems like a better idea since this allows you to later debug and find out what kind of transactions have been performed. I'm pretty sure you won't run out of disk space either (and for this there are solutions as well).
In your model how do you guarantee that everything has been changed as expected? Do you inspect the changes somehow?
MongoDB 4.0 adds support for multi-document ACID transactions.
Java Example:
try (ClientSession clientSession = client.startSession()) {
    clientSession.startTransaction();
    collection.insertOne(clientSession, docOne);
    collection.insertOne(clientSession, docTwo);
    clientSession.commitTransaction();
}
Note that it works for replica sets. You can still have a replica set with one node and run it on a local machine.
https://stackoverflow.com/a/51396785/4587961
https://docs.mongodb.com/manual/tutorial/deploy-replica-set-for-testing/
MongoDB 4.0 is adding (multi-collection) multi-document transactions.