Mongo aggregate is not updating the actual document - mongodb

As the example below shows, the aggregation produces the required
result, but the stored document is not replaced.
Could someone tell me how to persist the aggregation output?
> db.demo95.find();
{ "_id" : ObjectId("5eed924ae3fc5c755e1198a2"), "Id" : "5ab9cbe531c2ab715d42129a" }
> db.demo95.aggregate([ { "$addFields": { "Id" : { "$toObjectId": "$Id" } }} ])
{ "_id" : ObjectId("5eed924ae3fc5c755e1198a2"), "Id" : ObjectId("5ab9cbe531c2ab715d42129a") }
> db.demo95.find();
{ "_id" : ObjectId("5eed924ae3fc5c755e1198a2"), "Id" : "5ab9cbe531c2ab715d42129a" }

aggregate() only reads data from a collection; it does not modify the source documents. You can write the output to another collection by using a $out or $merge stage.
Only from v4.4 (not generally available yet as of June 20th, 2020), you can use a $merge stage to output to the same collection.
However, starting from version 4.2, you can use "updates with aggregation pipeline". The syntax for the pipeline is the same, but you can use only selected stages.
Your query can be translated to:
db.demo95.updateMany({}, [ { "$addFields": { "Id" : { "$toObjectId": "$Id" } }} ])
Refer to updateMany with aggregation pipeline for more information.
If you have an issue with updateMany, you can refer to another answer by @whoami on a different question:
As of now, aggregation-pipeline in .updateMany() is not supported by
many clients even few mongo shell versions - back then my ticket to
them got resolved by using .update(), if it doesn't work then try to
use update + { multi : true }.

Related

find() return the latest value only on MongoDB

I have this collection in MongoDB that contains the following entries. I'm using Robo3T to run the query.
{
"_id" : ObjectId("xxx1"),
"Evaluation Date" : "2021-09-09",
"Results" : [
{
"Name" : "ABCD",
"Version" : "3.2.x"
}
]
}
{
"_id" : ObjectId("xxx2"),
"Evaluation Date" : "2022-09-09",
"Results" : [
{
"Name" : "ABxD",
"Version" : "5.2.x"
}
]
}
The collection contains multiple documents in a similar format. I need to extract the latest value of "Version".
Expected output:
5.2.x
Measures I've taken so far:
(1) I tried findOne(), and while I was able to extract the value of "Version" with db.getCollection('TestCollectionName').findOne().Results[0].Version
...only the oldest entry was returned:
3.2.x
(2) Using find().sort().limit() as below returns the entire document for the latest entry, not just the value I want: db.getCollection('TestCollectionName').find({}).sort({"Results.Version":-1}).limit(1)
Results below:
"_id" : ObjectId("xxx2"),
"Evaluation Date" : "2022-09-09",
"Results" : [
{
"Name" : "ABxD",
"Version" : "5.2.x"
}
]
(3) I tried to use sort() and limit() together with findOne(), but I've read that findOne may be deprecated and is not compatible with sort, so this results in an error.
(4) Finally, if I use sort and limit on find like this: db.getCollection('LD_exit_Evaluation_Result_MFC525').find({"Results.New"}).sort({_id:-1}).limit(1) I get an unexpected token error.
What would be a good approach here?
Did I simply misplace or omit a bracket, or do I need to reorder the syntax?
Thanks in advance.
I'm not sure I understood correctly, but this may be what you are looking for:
db.collection.aggregate([
{
"$project": {
lastResult: {
"$last": "$Results"
},
},
},
{
"$project": {
version: "$lastResult.Version",
_id: 0
}
}
])
It uses aggregate with two stages: the first $project computes a new field called lastResult that holds the last element of each Results array, using the $last operator. The second $project just cleans up the output. If you need the _id reference, remove _id: 0 or change its value to 1.
You can check how it works here: https://mongoplayground.net/p/jwqulFtCh6b
Hope I helped
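The pipeline above returns the last Results entry of every document. If "latest" means the document with the greatest "Evaluation Date" (an assumption; the question does not say so explicitly), one option is to sort and limit before projecting. The sketch below mirrors that idea in plain JavaScript on the sample data; the `pipeline` constant only shows the stages it is meant to mirror and has not been run against a server, and `latestVersion` is a hypothetical helper name.

```javascript
// Sample documents from the question (ObjectIds replaced by plain strings).
const docs = [
  { _id: "xxx1", "Evaluation Date": "2021-09-09", Results: [{ Name: "ABCD", Version: "3.2.x" }] },
  { _id: "xxx2", "Evaluation Date": "2022-09-09", Results: [{ Name: "ABxD", Version: "5.2.x" }] },
];

// Stages this sketch mirrors; an assumption, not tested against a server.
const pipeline = [
  { $sort: { "Evaluation Date": -1 } },
  { $limit: 1 },
  { $project: { _id: 0, version: { $last: "$Results.Version" } } },
];

// Plain-JavaScript equivalent: newest document first, then the last Results entry.
function latestVersion(documents) {
  if (documents.length === 0) return undefined;
  // ISO dates (YYYY-MM-DD) compare correctly as strings.
  const sorted = [...documents].sort((a, b) =>
    a["Evaluation Date"] < b["Evaluation Date"] ? 1 :
    a["Evaluation Date"] > b["Evaluation Date"] ? -1 : 0);
  const results = sorted[0].Results;
  return results[results.length - 1].Version;
}

console.log(latestVersion(docs)); // 5.2.x
```

Sorting on the date rather than on "Results.Version" avoids comparing version strings, which do not order reliably as text.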

MongoDB Sorting: Equivalent Aggregation Query

I have following students collection
{ "_id" : ObjectId("5f282eb2c5891296d8824130"), "name" : "Rajib", "mark" : "1000" }
{ "_id" : ObjectId("5f282eb2c5891296d8824131"), "name" : "Rahul", "mark" : "1200" }
{ "_id" : ObjectId("5f282eb2c5891296d8824132"), "name" : "Manoj", "mark" : "1000" }
{ "_id" : ObjectId("5f282eb2c5891296d8824133"), "name" : "Saroj", "mark" : "1400" }
My requirement is to sort the collection on the 'mark' field in descending order, without displaying the 'mark' field in the final result. The result should be:
{ "name" : "Saroj" }
{ "name" : "Rahul" }
{ "name" : "Rajib" }
{ "name" : "Manoj" }
I tried the following query and it works fine.
db.students.find({},{"_id":0,"name":1}).sort({"mark":-1})
My MongoDB version is v4.2.8. What is the equivalent aggregation query? I tried the following two queries, but neither gave me the desired result.
db.students.aggregate([{"$project":{"name":1,"_id":0}},{"$sort":{"mark":-1}}])
db.students.aggregate([{"$project":{"name":1,"_id":0,"mark":1}},{"$sort":{"mark":-1}}])
Why does it work in find()?
As per cursor.sort(), when a set of results is both sorted and projected, the MongoDB query engine always applies the sort first.
Why doesn't it work in aggregate()?
As per Aggregation Pipeline, the MongoDB aggregation pipeline consists of stages. Each stage transforms the documents as they pass through the pipeline. Pipeline stages do not need to produce one output document for every input document; e.g., some stages may generate new documents or filter out documents.
What you need to correct:
Change the pipeline order: if the mark field is not selected in $project, it is no longer available to later stages, so a subsequent $sort on mark has nothing to sort on.
db.students.aggregate([
{ "$sort": { "mark": -1 } },
{ "$project": { "name": 1, "_id": 0 } }
])
Playground: https://mongoplayground.net/p/xtgGl8AReeH
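The effect of stage order can be sketched in plain JavaScript, with each stage modeled as a function from documents to documents. The helper names `projectNameOnly` and `sortByMarkDesc` are made up for illustration, and this is only a toy model of the pipeline, not how the server implements it.

```javascript
// Sample data from the question (without _id).
const students = [
  { name: "Rajib", mark: "1000" },
  { name: "Rahul", mark: "1200" },
  { name: "Manoj", mark: "1000" },
  { name: "Saroj", mark: "1400" },
];

// Toy stand-ins for { $project: { name: 1, _id: 0 } } and { $sort: { mark: -1 } }.
const projectNameOnly = (docs) => docs.map((d) => ({ name: d.name }));
const sortByMarkDesc = (docs) =>
  [...docs].sort((a, b) => (a.mark < b.mark ? 1 : a.mark > b.mark ? -1 : 0));

// $project first: mark is already gone, so the sort sees only undefined values.
const projectThenSort = sortByMarkDesc(projectNameOnly(students));
// $sort first: mark is still present when sorting, then it is dropped.
const sortThenProject = projectNameOnly(sortByMarkDesc(students));

console.log(projectThenSort.map((d) => d.name)); // original order, unsorted
console.log(sortThenProject.map((d) => d.name)); // Saroj, Rahul, Rajib, Manoj
```

Note that mark is stored as a string in the sample data, so the comparison is textual; it happens to match numeric order here only because all values have the same number of digits.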

Sorting on index of array mongodb

I have a collection with documents like:
{
"_id" : ObjectId("5ab212249a639865c58b744e"),
"levels" : [
{
"levelId" : 0,
"siteId" : "5a0ff11dc7bd083ea6a706b1",
"title" : "Hospital Services"
},
{
"levelId" : 1,
"siteId" : "5a0ff220c7bd083ea6a706d0",
"title" : "Reference Testing"
},
{
"levelId" : 2,
"siteId" : "5a0ff24fc7bd083ea6a706da",
"title" : "Des Moines(Reference Testing)"
}
]
}
I want to sort on the title field of the element at index 2 of the levels array, i.e. levels.2.title
Currently my query looks like:
db.getCollection('5aaf63a69a639865c58b2ab9').aggregate([
{$sort : {'levels.2.title':1}}
])
But it does not give the desired results.
Please help.
You can try the query below in 3.6.
db.col.aggregate({$sort:{"levels.2.title":1}});
The aggregation and find semantics for this differ in 3.4. More on this in the JIRA ticket.
So in 3.4
db.col.find().sort({"levels.2.title":1})
works as expected, while the aggregation $sort does not.
Use the aggregation below in 3.4.
Use $arrayElemAt in an $addFields stage to extract the element at index 2 and keep it as an extra field in the document, then $sort on that field.
Finally, use $project with exclusion to drop the sort field and get the expected output.
db.col.aggregate([
{"$addFields":{ "sort_element":{"$arrayElemAt":["$levels", 2]}}},
{"$sort":{"sort_element.title":-1}},
{"$project":{"sort_element":0}}
])
Alternatively, you can use a $let expression to output the title field directly in the $addFields stage.
db.col.aggregate([
{"$addFields":{ "sort_field":{"$let":{"vars":{"ele":{"$arrayElemAt":["$levels", 2]}}, "in":"$$ele.title"}}}},
{"$sort":{"sort_field":-1}},
{"$project":{"sort_field":0}}
])
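The add-a-field, sort, drop-the-field pattern used in both pipelines can be sketched in plain JavaScript. The documents and the function name `sortByIndex2TitleDesc` below are hypothetical, and placing documents without an element at index 2 last in descending order is this sketch's reading of how missing values compare.

```javascript
// Hypothetical documents; only the title at index 2 matters for the sort.
const docs = [
  { _id: 1, levels: [{ title: "A" }, { title: "B" }, { title: "Des Moines(Reference Testing)" }] },
  { _id: 2, levels: [{ title: "A" }] }, // no element at index 2
  { _id: 3, levels: [{ title: "A" }, { title: "B" }, { title: "Reference Lab" }] },
];

function sortByIndex2TitleDesc(documents) {
  return documents
    // like $addFields: compute the sort key next to each document
    .map((d) => ({ doc: d, sortField: d.levels[2] ? d.levels[2].title : null }))
    // like $sort with -1: descending, missing keys last
    .sort((a, b) => {
      if (a.sortField === b.sortField) return 0;
      if (a.sortField === null) return 1;
      if (b.sortField === null) return -1;
      return a.sortField < b.sortField ? 1 : -1;
    })
    // like $project with exclusion: drop the temporary key again
    .map((x) => x.doc);
}

console.log(sortByIndex2TitleDesc(docs).map((d) => d._id)); // [ 3, 1, 2 ]
```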

group in aggregate framework stopped working properly

I hate this kind of question, but maybe you can point me to something obvious. I'm using Mongo 2.2.2.
I have a collection (in a replica set) with 6M documents that has a string field called username, on which I have an index. The index was non-unique, but I recently made it unique. Suddenly, the following query gives me false alarms that I have duplicates.
db.users.aggregate(
{ $group : {_id : "$username", total : { $sum : 1 } } },
{ $match : { total : { $gte : 2 } } },
{ $sort : {total : -1} } );
which returns
{
"result" : [
{
"_id" : "davidbeges",
"total" : 2
},
{
"_id" : "jesusantonio",
"total" : 2
},
{
"_id" : "elesitasweet",
"total" : 2
},
{
"_id" : "theschoolofbmx",
"total" : 2
},
{
"_id" : "longflight",
"total" : 2
},
{
"_id" : "thenotoriouscma",
"total" : 2
}
],
"ok" : 1
}
I tested this query on sample collection with few documents and it works as expected.
Someone from 10gen responded in their JIRA:
Are there any updates on this collection? If so, I'd try adding {$sort: {username:1}} to the front of the pipeline. That will ensure that you only see each username once if it is unique.
If there are updates going on, it is possible that aggregation would see a document twice if it moves due to growth. Another possibility is that a document was deleted after being seen by the aggregation and a new one was inserted with the same username.
So sorting by username before grouping helped.
I think the answer may lie in the fact that your $group is not using an index; it's just doing a scan over the entire collection. These operators can currently use an index in the aggregation framework:
$match, $sort, $limit, $skip
And they will do so if placed before:
$project, $unwind, $group
However, $group by itself will not use an index. When you do your find() test I am betting you are using the index, possibly as a covered index (you can verify by looking at an explain() for that query), rather than scanning the collection. Basically my theory is that your index has no dupes, but your collection does.
Edit: This likely happens because a document is updated/moved during the aggregation operation and hence is seen twice, not because of dupes in the collection as originally thought.
If you add an operator earlier in the pipeline that can use the index but not alter the results fed into $group, then you can avoid the issue.
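How a moved document can be counted twice during a collection scan can be illustrated with a toy model in plain JavaScript. The storage layout (an array of slots scanned in order), the function names, and the usernames are all invented for this sketch; it does not reflect the actual storage engine.

```javascript
// Toy storage: an array of slots scanned in order; null marks a freed slot.
function scanWithMove(slots, moveAtStep, slotToMove) {
  const seen = [];
  // slots.length is re-read on every iteration, so appended slots are scanned too.
  for (let i = 0; i < slots.length; i++) {
    if (i === moveAtStep) {
      // Simulate an update that grows a document: it is relocated to the end.
      slots.push(slots[slotToMove]);
      slots[slotToMove] = null;
    }
    if (slots[i] !== null) seen.push(slots[i].username);
  }
  return seen;
}

// Equivalent of { $group: { _id: "$username", total: { $sum: 1 } } }.
function countByUsername(usernames) {
  const totals = {};
  for (const u of usernames) totals[u] = (totals[u] || 0) + 1;
  return totals;
}

const slots = [{ username: "alice" }, { username: "bob" }, { username: "carol" }];
const totals = countByUsername(scanWithMove(slots, 1, 0));
console.log(totals); // { alice: 2, bob: 1, carol: 1 }
```

In this model "alice" is read at slot 0, moved to the end while the scan is at slot 1, and read again, even though storage never holds two copies of the document.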

Save Subset of MongoDB Collection to Another Collection

I have a set like so
{date: 20120101}
{date: 20120103}
{date: 20120104}
{date: 20120005}
{date: 20120105}
How do I save a subset of those documents with the date '20120105' to another collection?
i.e db.subset.save(db.full_set.find({date: "20120105"}));
I would advise using the aggregation framework:
db.full_set.aggregate([ { $match: { date: "20120105" } }, { $out: "subset" } ])
It works about 100 times faster than forEach at least in my case. This is because the entire aggregation pipeline runs in the mongod process, whereas a solution based on find() and insert() has to send all of the documents from the server to the client and then back. This has a performance penalty, even if the server and client are on the same machine.
Here's the shell version:
db.full_set.find({date:"20120105"}).forEach(function(doc){
db.subset.insert(doc);
});
Note: As of MongoDB 2.6, the aggregation framework makes it possible to do this faster; see melan's answer for details.
Actually, there is an equivalent of SQL's insert into ... select from in MongoDB. First, you convert multiple documents into an array of documents; then you insert the array into the target collection
db.subset.insert(db.full_set.find({date:"20120105"}).toArray())
The most general solution is this:
Make use of the aggregation (answer given by @melan):
db.full_set.aggregate({$match:{your query here...}},{$out:"sample"})
db.sample.copyTo("subset")
This works even when there are documents in "subset" before the operation and you want to preserve those "old" documents and just insert a new subset into it.
Care must be taken, because the copyTo() command replaces the documents with the same _id.
There's no direct equivalent of SQL's insert into ... select from ....
You have to take care of it yourself. Fetch documents of interest and save them to another collection.
You can do it in the shell, but I'd use a small external script in Ruby. Something like this:
require 'mongo'
db = Mongo::Connection.new.db('mydb')
source = db.collection('source_collection')
target = db.collection('target_collection')
source.find(date: "20120105").each do |doc|
target.insert doc
end
MongoDB's aggregate has the $out stage, which lets you save a subset into a new collection. The details follow:
$out Takes the documents returned by the aggregation pipeline and writes them to a specified collection.
The $out operation creates a new collection in the current database if one does not already exist.
The collection is not visible until the aggregation completes.
If the aggregation fails, MongoDB does not create the collection.
Syntax :
{ $out: "<output-collection>" }
Example
A collection books contains the following documents:
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
The following aggregation operation pivots the data in the books collection to have titles grouped by authors and then writes the results to the authors collection.
db.books.aggregate( [
{ $group : { _id : "$author", books: { $push: "$title" } } },
{ $out : "authors" }
] )
After the operation, the authors collection contains the following documents:
{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }
For the question asked, use the following query and you will get a new collection named 'col_20120105' in your database:
db.full_set.aggregate([
{ $match : { date : "20120105" } },
{ $out : "col_20120105" }
]);
You can also use $merge aggregation pipeline stage.
db.full_set.aggregate([
{$match: {...}},
{ $merge: {
into: { db: 'your_db', coll: 'your_another_collection' },
on: '_id',
whenMatched: 'keepExisting',
whenNotMatched: 'insert'
}}
])
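The effect of whenMatched: 'keepExisting' together with whenNotMatched: 'insert' can be sketched in plain JavaScript as a merge keyed on _id. The function name `mergeKeepExisting` and the sample documents are assumptions made for illustration; this models the semantics only, not the server's behavior in detail.

```javascript
// Merge incoming documents into target, keyed on _id.
function mergeKeepExisting(target, incoming) {
  const byId = new Map(target.map((d) => [d._id, d]));
  for (const doc of incoming) {
    // whenNotMatched: 'insert' -> add documents whose _id is new
    if (!byId.has(doc._id)) byId.set(doc._id, doc);
    // whenMatched: 'keepExisting' -> leave the target's document untouched
  }
  return [...byId.values()];
}

// Hypothetical contents of the source and target collections.
const fullSet = [
  { _id: 1, date: "20120101" },
  { _id: 2, date: "20120105" },
  { _id: 3, date: "20120105" },
];
const subset = [{ _id: 2, date: "20120105", note: "already here" }];

const matched = fullSet.filter((d) => d.date === "20120105"); // the $match stage
const merged = mergeKeepExisting(subset, matched);
console.log(merged.map((d) => d._id)); // [ 2, 3 ]
```

The existing document with _id 2 keeps its note field, which is the difference from an approach that overwrites documents with the same _id.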