I want to count the documents in my collection that satisfy a query, without polling the collection. How can I do this with MongoDB change streams?
For example, the database contains documents that all have a property such as {"destination": "Target1"}, and I want to know how many documents satisfy that condition.
I don't want to run a query on every change to the collection, because the documents change very often.
I am looking for something similar to Oracle's CQN (Continuous Query Notification).
You can use a change stream and watch for changes as follows:
watchCursor = db.getSiblingDB("mydatabase").mycollection.watch()
while (!watchCursor.isExhausted()) {
    if (watchCursor.hasNext()) {
        printjson(watchCursor.next());
    }
}
changeStream docs
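To turn the raw change events into a running count, one option is to track the set of matching _id values and update it per event. The following is only a minimal sketch in Python, not part of the original answer. It assumes the stream is opened with full-document lookup (in PyMongo, `collection.watch(full_document="updateLookup")`) so that update events carry `fullDocument`, and that the set is seeded once with an initial query. The names `TARGET` and `apply_change` are hypothetical.

```python
from typing import Any, Dict, Set

TARGET = "Target1"  # the destination value from the question


def apply_change(matching_ids: Set[Any], event: Dict[str, Any]) -> None:
    """Update the set of matching _ids in place from one change event.

    Assumes update events include "fullDocument" (updateLookup enabled).
    """
    op = event.get("operationType")
    doc_id = event.get("documentKey", {}).get("_id")
    if op == "delete":
        # Delete events carry only the document key, not the document.
        matching_ids.discard(doc_id)
    elif op in ("insert", "replace", "update"):
        doc = event.get("fullDocument")
        if doc is not None and doc.get("destination") == TARGET:
            matching_ids.add(doc_id)
        else:
            matching_ids.discard(doc_id)


# Example with hand-written events shaped like change stream documents.
events = [
    {"operationType": "insert", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "destination": "Target1"}},
    {"operationType": "insert", "documentKey": {"_id": 2},
     "fullDocument": {"_id": 2, "destination": "Other"}},
    {"operationType": "update", "documentKey": {"_id": 2},
     "fullDocument": {"_id": 2, "destination": "Target1"}},
    {"operationType": "delete", "documentKey": {"_id": 1}},
]
ids: Set[Any] = set()
for event in events:
    apply_change(ids, event)
current_count = len(ids)
```

Tracking ids rather than a bare counter is a deliberate choice here: a delete event does not say whether the removed document used to match, so a plain counter could not be decremented reliably.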
That said, a plain query backed by a good index may be all you need.
It seems you can just execute:
db.collection.count({destination:"Target1"})
and if you have an index on the "destination" field it will be pretty quick (current shells and drivers call this method countDocuments).
I need to skip a number of documents (offset) from a query, and only return limit number of documents that go after. I know the following naive approach:
collection.find(BSONDocument())
.cursor[T].collect[List](offset+limit).map(_.drop(offset))
but it is not really desired because it will load offset+limit number of documents in JVM memory, whereas I'd like to filter them on the "database" side.
Solution: use QueryOpts. Example:
collection.find(BSONDocument())
.options(QueryOpts(skipN = offset))
.cursor[T].collect[List](limit)
Note that skip is not very efficient: MongoDB does not jump straight to the offset, it walks past the skipped documents one by one, so the cost grows with the offset.
VasyaNovikov's answer is certainly correct. ReactiveMongo also offers a more intuitive API:
collection.find(BSONDocument())
.skip(offset)
.cursor[T]
.collect[List](limit, Cursor.FailOnError[List[T]]())
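Since skip gets slower as the offset grows, a commonly used alternative is range-based ("keyset") paging: remember the last _id of the previous page and ask for documents after it. This is a different technique from the answers above, shown in Python rather than Scala as a rough sketch only. It assumes results are sorted by _id ascending and that the base filter does not already constrain _id. The helper name `next_page_filter` is hypothetical.

```python
from typing import Any, Dict, Optional


def next_page_filter(base_filter: Dict[str, Any],
                     last_id: Optional[Any]) -> Dict[str, Any]:
    """Build the filter for the page that follows last_id.

    Assumes the query is sorted by _id ascending and that base_filter
    has no _id condition of its own.
    """
    page_filter = dict(base_filter)  # copy so the caller's filter is untouched
    if last_id is not None:
        page_filter["_id"] = {"$gt": last_id}
    return page_filter


first_page = next_page_filter({"destination": "Target1"}, None)
second_page = next_page_filter({"destination": "Target1"}, 1042)
```

The resulting filter would be passed to find together with a sort on _id and a limit. The trade-off is that this only supports moving page by page, not jumping to an arbitrary offset.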
In my MongoDB 3.2 based application I want to process documents. To avoid processing the same document twice, I want to set a flag on it and save the updated document back to the database.
The possible approach is:
Query the data: FindIterable<Document> documents = db.collection.find(query);.
Perform some business logic on these documents.
Iterate over the documents, update each document and store it in a new collection.
Push the new collection to the database with db.collection.updateMany();.
Theoretically, this approach should work but I'm not sure that it is the optimal scenario.
Is there any way in the MongoDB Java API to perform the following two operations:
query documents (get them from the DB and pass them to a separate method);
update them and then store the updated versions in the DB;
in a more elegant way than the approach proposed above?
You can update documents in place using update:
db.collection.update(
    {query},
    {update},
    {multi:true}
);
It will iterate over all documents in the collection that match the query and update the fields specified in the update document.
EDIT:
To apply some business logic to individual documents, you can iterate over the matching documents as follows:
db.collection.find({query}).forEach(
    function (doc) {
        // your business logic
        if (doc.question == "Great Question of Life") {
            doc.answer = 42;
        }
        db.collection.save(doc);
    }
)
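The same find-then-update loop, combined with the "already processed" flag from the question, could look roughly like this. It is a sketch in Python rather than Java and makes several assumptions: the flag field is called `processed` (a hypothetical name), `collection` is any object with PyMongo-style `find(filter)` and `update_one(filter, update)` methods, and `handle` is a hypothetical callback that returns the fields to change.

```python
from typing import Any, Callable, Dict


def process_unflagged(collection,
                      handle: Callable[[Dict[str, Any]], Dict[str, Any]]) -> int:
    """Run handle() on every unprocessed document and flag it afterwards.

    handle(doc) returns a dict of fields to set on that document.
    Returns the number of documents processed.
    """
    processed = 0
    # Only documents whose flag is not yet True are fetched.
    for doc in collection.find({"processed": {"$ne": True}}):
        changes = dict(handle(doc))
        changes["processed"] = True  # mark so the next run skips it
        # Send only the changed fields instead of re-saving the whole document.
        collection.update_one({"_id": doc["_id"]}, {"$set": changes})
        processed += 1
    return processed
```

Using `$set` with just the changed fields avoids overwriting concurrent changes to other fields, which a whole-document save could do.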
I have some unused collections in a MongoDB database, and I need to find out when CRUD operations were last performed against them. We use our own _id field instead of MongoDB's default ObjectId, and the collections have no time field that records the modification time. Is there any way to find out the modification time of a collection from metadata? Is there anything like Oracle's data dictionary for this? Any ideas or workarounds would be appreciated.
To make a long story short: MongoDB has a flexible schema. Simply add a date field. Since older entries don't have it, they can not be the last entry.
Let's call that field mtime.
So after adding a date field to your schema definition, we create a descending index on the new field:
db.yourCollection.createIndex({mtime:-1})
Finding the last mtime for a collection now is easy:
db.yourCollection.find({"mtime":{"$exists":true}}).sort({"mtime":-1}).limit(1)
Do this for every collection. When the above query does not return a value within the timeframe you defined for purging a collection, simply drop it, since it has not been modified since you introduced the mtime field.
After your collections are cleaned up, you may remove the mtime field from your schema definition. To remove it from the documents, you can run a simple query:
db.yourCollection.update(
    { "mtime": { $exists: true } },
    { "$unset": { "mtime": "" } },
    { multi: true }
)
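The approach above only works if every write actually sets mtime. One way to make that hard to forget is to route updates through a small helper that adds a `$currentDate` clause, which asks the server to stamp the field with its own clock. The sketch below is in Python, and the helper names `with_mtime` and `is_stale` are hypothetical; it shows the idea rather than a complete solution.

```python
import copy
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def with_mtime(update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an update document that also stamps mtime."""
    stamped = copy.deepcopy(update)
    # $currentDate: {field: True} sets the field to the server's current date.
    stamped.setdefault("$currentDate", {})["mtime"] = True
    return stamped


def is_stale(last_mtime: Optional[datetime], now: datetime,
             max_age: timedelta) -> bool:
    """True if the collection had no stamped write within max_age."""
    if last_mtime is None:  # the "latest mtime" query returned nothing
        return True
    return now - last_mtime > max_age


update_doc = with_mtime({"$set": {"status": "shipped"}})
```

`update_doc` would then be passed as the update argument of an update call, and `is_stale` applied to the result of the "latest mtime" query to decide whether a collection is a candidate for dropping.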
There is no "data dictionary" to get this information in MongoDB.
If you've enabled the profiling level in advance to log all operations (db.setProfilingLevel(2)) and you haven't had many operations to log, so that the system.profile capped collection hasn't overwritten whatever logs you are interested in, you can get the information you need there—but otherwise it's gone.
I was wondering if the justOne keyword helps the speed of a remove query even if you are querying by a unique field (i.e. there is only one instance of the document).
For instance using pymongo:
for id in list_of_ids:
    db.remove({"_id":id})
Does it still speed up the query if I use the justOne argument?
for id in list_of_ids:
    db.remove({"_id":id},justOne=True)
It wouldn't make sense, but I don't know if mongo is smart enough to know that this is the unique id so of course there will only be one.
No, this will not speed up the query. MongoDB first finds the documents that match your condition and then deletes them. Because _id is unique, the lookup matches at most one document whether or not justOne is set, so there is no speedup.
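If the goal is to make the loop itself faster, the bigger cost is usually one round trip per id rather than the delete itself. A possible alternative, sketched here under the assumption of a PyMongo-style collection object, is a single `delete_many` with `$in`. In current PyMongo, `remove` no longer exists and `delete_one` / `delete_many` take its place. The helper name `delete_by_ids` is hypothetical.

```python
from typing import Any, Iterable


def delete_by_ids(collection, ids: Iterable[Any]) -> int:
    """Delete all documents whose _id is in ids with one request.

    collection is expected to offer a PyMongo-style delete_many() whose
    result exposes deleted_count. Returns the number of deleted documents.
    """
    result = collection.delete_many({"_id": {"$in": list(ids)}})
    return result.deleted_count
```

For very long id lists it may make sense to send the ids in batches so that each request stays well under the maximum document size.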
Is there any way to find the last updated document in a collection? In other words, can I sort a collection by update time?
Something like this:
people = Person.objects.order_by_update()
Or must I add an update time to each document?
I use MongoDB, mongoengine, and Flask.
You must add a field such as last_updated_time if you want to be able to sort in this way. Also, since you're sorting on it, you should probably index it.
The only thing MongoDB stores by default is _id. When it is an auto-generated ObjectId, it embeds its creation time, so it can serve as a rough created_time timestamp (but it says nothing about updates).
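To illustrate the created_time point: the first four bytes of an ObjectId are a Unix timestamp in seconds, so the creation time can be read back from its hex form. The function below is a small stand-alone sketch with a hypothetical name; with PyMongo installed, `bson.ObjectId(...).generation_time` gives the same information.

```python
from datetime import datetime, timezone


def objectid_created_at(oid_hex: str) -> datetime:
    """Extract the creation time embedded in a 24-character ObjectId hex string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId has exactly 24 hex characters")
    # The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = objectid_created_at("507f1f77bcf86cd799439011")
```

This only works for auto-generated ObjectIds. With custom _id values there is no embedded time, and an explicit field such as last_updated_time is needed in either case for update times.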