This is the sample of my code:
var changesDB = new mongoose.Schema({
eventId: String,
date: Date
})
changesDB.index({ title: 1 }, { expireAfterSeconds : 60*60*24*30 });
It works fine, but I need to delete all the files connected to this collection, so I have to catch this event with nodejs.
How can I make it?
As of MongoDB 3.0, there's no callback mechanism of any sort in MongoDB; in particular, there is no such mechanism for TTL indexes. The TTL enforcement is just a background thread that queries every minute for documents that have expired, then deletes them. If you have related data that you need to expire, I'd suggest just mimicking the TTL index's operation in your application, where you can perform whatever extra logic is necessary to clean up related data.
Alternatively, you could make all related documents expire at the same time, so they will all be deleted at approximately the same time (within the same TTL pass).
Related
I have a firestore DB where I'm storing polls in one collection and responses to polls in another collection. I want to get a document from the poll collection that isn't referenced in the responses collection for a particular user.
The naive approach would be to get all of the poll documents and all of the responses filtered by user ID then filter the polls on the client side. The problem is that there may be quite a few polls and responses so those queries would have to pull down a lot of data.
So my question is, is there a way to structure my data so that I can query for polls that haven't been completed by a user without having to pull down the collections in their entirety? Or more generally, is there some pattern to use when you need to query for documents in one collection that aren't referenced by another?
The documents in each of the collections look something like this:
Polls:
{
question: string;
answers: Answer[];
}
Responses:
{
userId: string;
pollId: string;
answerId: string;
}
Anyhelp would be much appreciated!
Queries in Firestore can only return documents from one collection (or from all collections with the same name) and can only contain conditions on the data that they actually return.
Since there's no way to filter based on a condition in some other documents, you'll need to include the information that you want to filter on in the polls documents.
For example, you could include a completionCount field in each poll document, that you initially set to 0, and then update only every poll completion. With that in place, the query becomes a simple query on the completionCount field of the polls collection.
For a specific user I'd actually add all polls to their profile document, and remove them from there. Duplicating data is usually the easiest (and sometimes only) way to implement use-cases such as this.
If you're worried about having to add each new poll to each new user profile when it is created, you can also query all polls on their creation timestamp when you next load a user profile and perform that sync at that moment.
load user profile,
check when they were last active,
query for new polls,
add them to user profile.
Our Java app saves its configurations in a MongoDB collections. When the app starts it reads all the configurations from MongoDB and caches them in Maps. We would like to use the change stream API to be able also to watch for updates of the configurations collections.
So, upon app startup, first we would like to get all configurations, and from now on - watch for any further change.
Is there an easy way to execute the following atomically:
A find() that retrieves all configurations (documents)
Start a watch() that will send all further updates
By atomically I mean - without potentially missing any update (between 1 and 2 someone could update the collection with new configuration).
To make sure I lose no update notifications, I found that I can use watch().startAtOperationTime(serverTime) (for MongoDB of 4.0 or later), as follows.
Query the MongoDB server for its current time, using command such as Document hostInfoDoc = mongoTemplate.executeCommand(new Document("hostInfo", 1))
Query for all interesting documents: List<C> configList = mongoTemplate.findAll(clazz);
Extract the server time from hostInfoDoc: BsonTimestamp serverTime = (BsonTimestamp) hostInfoDoc.get("operationTime");
Start the change stream configured with the saved server time ChangeStreamIterable<Document> changes = eventCollection.watch().startAtOperationTime(serverTime);
Since 1 ends before 2 starts, we know that the documents that were returned by 2 were at least same or fresher than the ones on that server time. And any updates that happened on or after this server time will be sent to us by the change stream (I don't care to run again redundant updates, because I use map as cache, so extra add/remove won't make a difference, as long as the last action arrives).
I think I could also use watch().resumeAfter(_idOfLastAddedDoc) (didn't try). I did not use this approach because of the following scenario: the collection is empty, and the first document is added after getting all (none) documents, and before starting the watch(). In that scenario I don't have previous document _id to use as resume token.
Update
Instead of using "hostInfo" for getting the server time, which couldn't be used in our production, I ended using "dbStats" like that:
Document dbStats= mongoOperations.executeCommand(new Document("dbStats", 1));
BsonTimestamp serverTime = (BsonTimestamp) dbStats.get("operationTime");
I have a collection containing around 100k documents. I want to add an auto incrementing "custom_id" field to my documents, and keep adding my documents by incrementing that field from now on.
What's the best approach for this? I've seen some examples in the official document (http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/) however they're only for adding new documents, not for updating an existing collection.
Example code I created based on the link above to increment my counter:
function incrementAndGetNext(counter, callback) {
counters.findAndModify({
name: counter
}, [["_id", 1]], {
$inc: {
"count": 1
}
}, {
"new": true
}, function (err, doc) {
if (err) return console.log(err);
callback(doc.value);
})
}
On the above code counters is db.counters collection and I have this document there:
{_id:"...",name:"post",count:"0"}
Would love to know.
Thank you.
P.S. I'm using native mongojs driver for js
Well, using the link you mentionned, I'd rather use the counters collection approach.
The counters collections approach has some drawbacks including :
It always generates multiples request (two): one to get the sequence number, another to do the insertion using the id you got via the sequence,
If you are using sharding features of mongodb, a document responsible for storing a counter state may be used a lot, and each time it will reach the same server.
However it should be appropriate for most uses.
The approach you mentionned ("the optimistic loop") should not break IMO, and I don't guess why you have a problem with it. However I'd not recommend it. What happens if you execute the code on multiple mongo clients, if one has a lot of latency and others keep taking IDs? I'd not like to encounter this kind of problem... Furthermore, there are at least two request per successful operation, but no maximum of retries before a success...
TLDR: Is there a way I can conditionally update a Meteor Mongo record inside a collection, so that if I use the id as a selector, I want to update if that matches and only if the revision number is greater than what already exists, or perform an upsert if there is no id match?
I am having an issue with updates to server side Meteor Mongo collections, whereby it seems the added() function callback in the Observers is being triggered on an upsert.
Here is what I am trying to do in a nutshell.
My meteor js app boots and then connects to an endpoint, fetching data and then upserting it into the collection.
collection.update({'sys.id': item.sys.id}, item, {upsert: true});
The 'sys.id' selector checks to see if the item exists, and then updates if it does or adds if it does not.
I have an observer monitoring the above collection, which then acts when an item has been added/updated to the collection.
collection.find({}).observeChanges({
added: this.itemAdded.bind(this),
changed: this.itemChanged.bind(this),
removed: this.itemRemoved.bind(this)
});
The first thing that puzzles me is that when the app is closed and then booted again, the 'added()' callback is fired when the collection is observed. What I would hope to happen is that the changed() callback is fired.
Going back to my original update - is it possible in Mongo to conditionally update something, so you have the selector, then the item, but only perform the update when another condition is met?
// Incoming item
var item = {
sys: {
id: 1,
revision: 5
}
};
collection.update({'sys.id': item.sys.id, 'sys.revision': {$gt: item.sys.revision}, item, {upsert: true});
If you look at the above code, what this is going to do is try to match the sys.id which is fine, but then the revisions will of course be different which means the update function will see it as a different document and then perform a new insert, thus creating duplicate data.
How do I fix this?
To your main question:
What you want is called findAndModify. First, look for the the document meeting the specs, and then update accordingly. This is a really powerful idea because if you did it in 2 queries, the document you found could be deleted/updated before you got to update it. Luckily for you, someone made a package (I really wish this existed a year ago!) https://github.com/fongandrew/meteor-find-and-modify
If you were to do this without using findAndModify you'd have to use javascript to find the doc, see if it matches your criteria, and then update it. In your use case, this would probably work, but there will always be that "what if" in the back of your mind.
Regarding observeChanges, the added is called each time the local minimongo receives a document (it's just reading what the DDP is telling it). Since a refresh will delete your local collection, you have to add those docs one by one. What you could do is wait until all added callbacks have fired, and then run your server method. In doing so, you get a ton of adds, and then a couple more changes will trickle in afterwards.
As Matt K said, you want findAndModify. There are some gotchas to be aware of:
findAndModify is about 100x slower than a find followed by an update. Find+modify is, obviously, not atomic and so won't do what you need, but be aware of the speed hit. (This is based off experience with MongoDB v2.4, so run some benchmarks to confirm under your own version.)
If your query matches multiple items, findAndModify will only act on the first one. In this case, you're querying on a unique id, but be aware of the issue for future use.
findAndModify will return the document after doing its thing, but by default it returns the pre-modification version. If you want the modified one, you need to pass the 'new: true' in your query.
my MongoDB collection is used as a job queue, and there are 3 C++ machines that read from this collection. The problem is that those three can't perform the same job. All jobs need to be made only once.
I fetch all un-done jobs by searching the collection for all records with 'isDone:False' and then update this document, 'isDone:True'. But if 2 machines find the same document at the same time, they would to the same job both. How can I avoid this?
Edit: My question is - do findAndModify really solves that problem?
(After reading A way to ensure exclusive reads in MongoDb's findAndModify?)
Yes, findAndModify solve it.
Ref: MongoDB findAndModify from multiple clients
"...
Note: This command obtains a write lock on the affected database and will block other operations until it has completed; however, typically the write lock is short lived and equivalent to other similar update() operations.
..."
Ref: http://docs.mongodb.org/manual/reference/method/db.collection.update/#db.collection.update
"...
For unsharded collections, you can override this behavior with the $isolated isolation operator, which isolates the update operation and blocks other write operations during the update. See the isolation operator.
..."
Ref: http://docs.mongodb.org/manual/reference/operator/isolated/
Regards,
Moacy
Yes, find-and-modify will solve your problem:
db.collection.findAndModify( {
query: { isDone: false },
update: { $set: { isDone: true } },
new: true,
upsert: false # never create new docs
} );
This will return a single document that it just updated from false to true.
But you have a serious problem if your C++ clients ever have a hiccup (the box dies, they are killed, the code has an error, etc.) Imagine if your TCP connection drops just after the update on the server, but before the C++ code gets the job. It's generally better to have multi-phase approach:
change "isDone" to "isInProgress", then when it's done, delete the document. (Now, you can see the stack of "todo" and "being done". If something is "being done" for a long time, the client probably died.
change "isDone" to "phase" and atomically set it from "new" to "started" (and later set it to "finished"). Now you can see if something is "started" for a long time, the client may have died.
If you're really sophisticated, you can make a partial index. For example, "Only index documents with "phase:{ $ne: 'finished'}". Now you don't need to waste space indexing the millions of finished documents. The index only holds the handful of new/in-progress documents, so it's smaller/faster.