Atomically query for all collection documents + watching for further changes - mongodb

Our Java app saves its configuration in a MongoDB collection. When the app starts, it reads all the configurations from MongoDB and caches them in Maps. We would like to use the change stream API to also watch for updates to the configuration collection.
So, upon app startup, we would first like to get all configurations and, from then on, watch for any further changes.
Is there an easy way to execute the following atomically:
1. A find() that retrieves all configurations (documents)
2. Start a watch() that will send all further updates
By atomically I mean without potentially missing any update (between 1 and 2 someone could update the collection with a new configuration).

To make sure I lose no update notifications, I found that I can use watch().startAtOperationTime(serverTime) (MongoDB 4.0 or later), as follows:
1. Query the MongoDB server for its current time, using a command such as: Document hostInfoDoc = mongoTemplate.executeCommand(new Document("hostInfo", 1));
2. Query for all interesting documents: List<C> configList = mongoTemplate.findAll(clazz);
3. Extract the server time from hostInfoDoc: BsonTimestamp serverTime = (BsonTimestamp) hostInfoDoc.get("operationTime");
4. Start the change stream configured with the saved server time: ChangeStreamIterable<Document> changes = eventCollection.watch().startAtOperationTime(serverTime);
Since 1 ends before 2 starts, we know that the documents returned by 2 are at least as fresh as the state at that server time. Any update that happened at or after this server time will be sent to us by the change stream. (I don't mind re-applying redundant updates: I use a map as the cache, so an extra add/remove makes no difference, as long as the last action arrives.)
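The argument that redundant updates are harmless can be checked with a small model. This is only a sketch using plain Java collections, not the MongoDB driver: `ChangeEvent`, `ConfigCacheSketch` and the use of plain `long` timestamps are hypothetical stand-ins for the change stream types.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a change stream event; not a driver type.
final class ChangeEvent {
    final long operationTime; // server time at which the change happened
    final String key;         // configuration key
    final String value;       // new value, or null for a delete

    ChangeEvent(long operationTime, String key, String value) {
        this.operationTime = operationTime;
        this.key = key;
        this.value = value;
    }
}

final class ConfigCacheSketch {
    // Loads the snapshot, then replays every event at or after startTime, in order.
    static Map<String, String> build(Map<String, String> snapshot,
                                     List<ChangeEvent> orderedEvents,
                                     long startTime) {
        Map<String, String> cache = new HashMap<>(snapshot);
        for (ChangeEvent e : orderedEvents) {
            if (e.operationTime < startTime) {
                continue; // events before startTime are not delivered
            }
            if (e.value == null) {
                cache.remove(e.key);       // delete: removing twice is harmless
            } else {
                cache.put(e.key, e.value); // insert/update: putting twice is harmless
            }
        }
        return cache;
    }
}
```

Whether the snapshot was read before or after some of the replayed events, the resulting map is the same, because each key ends up with the value of its last event.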
I think I could also use watch().resumeAfter(_idOfLastAddedDoc) (I didn't try it). I did not use this approach because of the following scenario: the collection is empty, and the first document is added after getting all (none) documents and before starting the watch(). In that scenario I don't have a previous document _id to use as a resume token.
Update
Instead of using "hostInfo" for getting the server time, which couldn't be used in our production environment, I ended up using "dbStats" like this:
Document dbStats = mongoOperations.executeCommand(new Document("dbStats", 1));
BsonTimestamp serverTime = (BsonTimestamp) dbStats.get("operationTime");

Related

In Couchbase, expired document included to query view list with NULL contents

Recently we started to use Couchbase. We are using Java spring-data-couchbase with Jersey to access Couchbase. Using the low-level Java SDK API, we set an expiry time (TTL) on a particular document by its key (id). That works fine. The code is as follows.
// define couchbaseTemplate for lower-level access to the Java SDK
@Autowired
CouchbaseTemplate couchbaseTemplate;

// setExpiry updates the expiry of the document with the given ID
@Override
public void setExpiry(String key, int expN) throws RepositoryException {
    couchbaseTemplate.getCouchbaseClient().touch(key, expN);
}
The problem we face is that when we get a list of documents using a query, the list contains the expired documents. When we try to access those documents from the list, they are null.
But if we execute the query after a while, the expired documents are no longer included in the list.
Example: when expN = 10 seconds and we execute the query around 10 seconds after setting the TTL, the expired documents are included.
If we execute the query around 20 seconds after setting the TTL, the expired documents are no longer included.
In the stale options we set
Query.setStale(Stale.FALSE)
We have also tried changing
Query.setIncludeDocs
But no luck. Any help would be appreciated.
Couchbase Server does expiries lazily. There are three ways an item can be expired:
When a document is accessed (get operation) the expiration value is checked
When the expiry pager runs
When disk compaction process runs (Only in Couchbase Server 3 and onwards)
As a result, views will not be updated until one of these three processes has happened.
For this use case you could simply do a range query against the view using the current time, so that it only returns documents that have not expired. This assumes the time is the same on the cluster as well as on the client, and that the view being used is this one:
function (doc, meta) {
emit(meta.expiration, null);
}
The meta.expiration is an epoch timestamp, so the following query could be used:
long currentEpoch = System.currentTimeMillis() / 1000; // epoch seconds, numeric like meta.expiration
bucket.query(ViewQuery.from("designdoc", "myview").startKey(currentEpoch));
Please note that this will return all alive documents that have an expiration set.
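The effect of the start key can be pictured with a sorted map. This is a sketch using only the Java standard library, not the Couchbase API: `ExpiryRangeSketch` and `aliveIds` are hypothetical names, the map stands in for the view index sorted by the emitted expiration, and key 0 is assumed here to represent a document with no expiration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ExpiryRangeSketch {
    // index: expiration epoch (seconds) -> document ids, like the rows emitted by the view
    static List<String> aliveIds(TreeMap<Long, List<String>> index, long nowEpochSeconds) {
        List<String> alive = new ArrayList<>();
        // tailMap(now, true) keeps keys >= now, the same range a start key selects
        for (Map.Entry<Long, List<String>> row : index.tailMap(nowEpochSeconds, true).entrySet()) {
            alive.addAll(row.getValue());
        }
        return alive;
    }
}
```

Everything sorted before the current time, including the key 0 entries, falls outside the range.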
If you want to do something more interesting with date formats have a look at the Date and time selection half way down in the View and query examples chapter in the Couchbase Server manual.
As the official Couchbase documentation says:
Detecting Expired Documents in Result Sets: If you are using views for indexing items from Couchbase Server, items that have not yet been removed as part of the expiry pager maintenance process will be part of a result set returned by querying the view. To exclude these items from a result set you should use query parameter include_docs set to true.
For expired documents, if you set include_docs=true, Couchbase Server returns a result set indicating the document does not exist anymore. Specifically, the key that had expired but had not yet been removed by the cleanup process will appear in the result set as a row where "doc":null.
So, this is how Couchbase works with expired documents.
For your case, just filter out the items where the doc is null, the rest will be your expected result.
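A minimal sketch of that filtering step, using plain Java objects rather than the Couchbase SDK row types: `ViewRowSketch` and `withoutExpired` are hypothetical names, and `doc` is modelled as a nullable string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a view row fetched with documents included.
final class ViewRowSketch {
    final String id;
    final String doc; // null when the document has expired but is not yet purged

    ViewRowSketch(String id, String doc) {
        this.id = id;
        this.doc = doc;
    }

    // Keeps only the rows whose document is still present.
    static List<ViewRowSketch> withoutExpired(List<ViewRowSketch> rows) {
        List<ViewRowSketch> live = new ArrayList<>();
        for (ViewRowSketch row : rows) {
            if (row.doc != null) {
                live.add(row);
            }
        }
        return live;
    }
}
```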

Conditional update for MongoDB (Meteor)

TLDR: Is there a way to conditionally update a Meteor Mongo record in a collection, using the id as a selector, so that the update happens only if the id matches and the incoming revision number is greater than the stored one, and an upsert is performed if there is no id match?
I am having an issue with updates to server side Meteor Mongo collections, whereby it seems the added() function callback in the Observers is being triggered on an upsert.
Here is what I am trying to do in a nutshell.
My meteor js app boots and then connects to an endpoint, fetching data and then upserting it into the collection.
collection.update({'sys.id': item.sys.id}, item, {upsert: true});
The 'sys.id' selector checks to see if the item exists, and then updates if it does or adds if it does not.
I have an observer monitoring the above collection, which then acts when an item has been added/updated to the collection.
collection.find({}).observeChanges({
added: this.itemAdded.bind(this),
changed: this.itemChanged.bind(this),
removed: this.itemRemoved.bind(this)
});
The first thing that puzzles me is that when the app is closed and then booted again, the 'added()' callback is fired when the collection is observed. What I would hope to happen is that the changed() callback is fired.
Going back to my original update - is it possible in Mongo to conditionally update something, so you have the selector, then the item, but only perform the update when another condition is met?
// Incoming item
var item = {
sys: {
id: 1,
revision: 5
}
};
collection.update({'sys.id': item.sys.id, 'sys.revision': {$lt: item.sys.revision}}, item, {upsert: true});
If you look at the above code, it tries to match sys.id, which is fine, but whenever the revision condition does not match, update treats the item as a different document and performs a new insert, thus creating duplicate data.
How do I fix this?
To your main question:
What you want is called findAndModify. It first looks for the document meeting the criteria and then updates it accordingly, as a single operation. This is a really powerful idea, because if you did it in 2 queries, the document you found could be deleted/updated before you got to update it. Luckily for you, someone made a package (I really wish this existed a year ago!) https://github.com/fongandrew/meteor-find-and-modify
If you were to do this without using findAndModify you'd have to use javascript to find the doc, see if it matches your criteria, and then update it. In your use case, this would probably work, but there will always be that "what if" in the back of your mind.
Regarding observeChanges, the added is called each time the local minimongo receives a document (it's just reading what the DDP is telling it). Since a refresh will delete your local collection, you have to add those docs one by one. What you could do is wait until all added callbacks have fired, and then run your server method. In doing so, you get a ton of adds, and then a couple more changes will trickle in afterwards.
As Matt K said, you want findAndModify. There are some gotchas to be aware of:
findAndModify is about 100x slower than a find followed by an update. A separate find followed by an update is, obviously, not atomic and so won't do what you need, but be aware of the speed hit. (This is based on experience with MongoDB v2.4, so run some benchmarks to confirm under your own version.)
If your query matches multiple items, findAndModify will only act on the first one. In this case, you're querying on a unique id, but be aware of the issue for future use.
findAndModify returns the document it acted on, but by default it returns the pre-modification version. If you want the modified one, you need to pass 'new: true' in your call.
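The "check and write as one atomic step" idea behind findAndModify can be modelled in memory. This is only a sketch with a `ConcurrentHashMap`, not Meteor or MongoDB code: `RevisionStoreSketch` and `upsertIfNewer` are hypothetical names, and only the revision number is stored to keep it short.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class RevisionStoreSketch {
    // id -> highest revision stored so far
    private final ConcurrentMap<Integer, Integer> revisions = new ConcurrentHashMap<>();

    // Inserts when the id is unknown, replaces when the incoming revision is greater.
    // Returns true if something was written, false if the incoming item was stale.
    boolean upsertIfNewer(int id, int revision) {
        boolean[] written = {false};
        // compute() runs the check and the write atomically for this key
        revisions.compute(id, (key, current) -> {
            if (current == null || revision > current) {
                written[0] = true;
                return revision;
            }
            return current; // keep the stored value; no duplicate is created
        });
        return written[0];
    }

    Integer revisionOf(int id) {
        return revisions.get(id);
    }
}
```

Because the comparison and the write happen inside one atomic call, a stale item can neither overwrite a newer one nor be inserted as a second copy.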

Import "normal" MongoDB collections into DerbyJS 0.6

Same situation as in this question, but with the current DerbyJS (version 0.6):
Using imported docs from MongoDB in DerbyJS
I have a MongoDB collection with data that was not saved through my
Derby app. I want to query against that and pull it into my Derby app.
Is this still possible?
The accepted answer there points to a dead link. The newest working link would be this: https://github.com/derbyjs/racer/blob/0.3/lib/descriptor/query/README.md
Which refers to the 0.3 branch for Racer (current master version is 0.6).
What I tried
Searching the internets
The naïve way:
var query = model.query('projects-legacy', { public: true });
model.fetch(query, function() {
query.ref('_page.projects');
})
(doesn't work)
A utility was written for this purpose: https://github.com/share/igor
You may need to modify it to only run against a single collection instead of the whole database, but it essentially goes through every document in the database and modifies it with the necessary livedb metadata and creates a default operation for it as well.
In livedb every collection has a corresponding operations collection, for example profiles will have a profiles_ops collection which holds all the operations for the profiles.
You will have to convert the collection to use it with Racer/livedb because of the metadata on the document itself.
An alternative, if you don't want to convert, is to use traditional AJAX/REST to get the data from your Mongo database and then just put it in your local model. This will not be real-time or synced to the server, but it will allow you to drive your templates from data that you don't want to convert for some reason.

How to guard against repeated request?

We have a button in a web game that lets users collect a reward. It should only be clicked once; upon receiving the request, we mark the reward as collected in the DB.
We've already blocked repeated clicking of the button in the client. But that won't help if people resend the request multiple times to our server within a short period of time.
What I want is a way to block this on the server side.
We're using Play Framework 2 (2.0.3-RC2) on the server side, and so far it's stateless. I'm tempted to use a Set as a guard, like this:
if processingSet has userId then BadRequest
else put userId in processingSet and handle request
after that remove userId from that Set
But then I'd have to face problems like Updating Scala collections thread-safely, and it would still fail to block the user once we have more than one server behind a load balancer.
One possibility I'm thinking about is to have a table in the DB in place of the processingSet above, but that would incur at least one extra DB operation per request. Is there a better solution?
Thanks!
An additional DB operation is a relatively 'cheap' solution in this case. You should use it if you're planning to save the button's state permanently.
If the button is disabled only for some period of time (for example, until the game is over), you can also consider using the cache API. Keep in mind, however, that it is not meant for data that must be stored for a long time (it should not be considered a DB alternative).
Given that you're using Mongo and so don't have transactions spanning separate collections, I think you can probably implement this guard using an atomic operation - namely "Update if current", which is effectively CompareAndSwap.
Assuming you've got a collection like "rewards" which has a "collected" attribute, you can update the collected flag to true only if it is currently false. If that operation succeeds, you can proceed to apply the reward, knowing that the same operation will fail for any other request.
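The compare-and-swap behaviour can be sketched in memory as follows. This is not MongoDB or Play code: `RewardGuardSketch`, `grantReward` and `tryCollect` are hypothetical names, and a `ConcurrentHashMap` stands in for the "rewards" collection. With several servers behind a load balancer, the same conditional update would have to be executed by the shared database rather than by a per-process map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class RewardGuardSketch {
    // userId -> collected flag; stands in for the "collected" attribute
    private final ConcurrentMap<String, Boolean> collected = new ConcurrentHashMap<>();

    // Registers a reward that has not been collected yet.
    void grantReward(String userId) {
        collected.putIfAbsent(userId, Boolean.FALSE);
    }

    // Atomically flips false -> true; only the first caller per user gets true.
    boolean tryCollect(String userId) {
        return collected.replace(userId, Boolean.FALSE, Boolean.TRUE);
    }
}
```

A request handler would apply the reward only when `tryCollect` returns true and reject the request otherwise.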

Updating an element in all documents in a MongoDB collection

I am running the following query with the purpose of updating a single element in all the existing documents in the collection. I am basically trying to clear their value to "0".
Here is the code:
MongoCollection collection = db.GetCollection(DataAccessConfiguration.Settings.CollectionName);
var query = Query.Exists("ElementName", true);
var update = Update.Set("ElementName", "0");
collection.Update(query, update);
It only updates a single document.
How can I update all elements at once?
Updates in MongoDB affect 0 or 1 documents by default (0 only if the query specifier doesn't match anything). To update all matching documents, you need to pass UpdateFlags.Multi as the third argument to Update. There is also a 4-argument version of Update which accepts the "safe mode" flag as the fourth argument.
(Safe mode bundles a getLastError command with the update, and causes the driver to wait until the server acknowledges that the write has succeeded. There are various options to safe mode that will wait for acknowledgement from multiple servers if you are using a replica set, that will wait only for a certain period of time and then return with an error, etc).
Also be sure to see the C# driver documentation for details on the API.