Return whole document from aggregation - mongodb

I'm using the following query to fetch the most recent comment for every post in the database:
db.comments.aggregate([
{
"$match": {
"post_id": {
"$in": [ObjectId("52c5ce24dca32d32740c1435"), ObjectId("52c5ce24dca32d32740c15ad")]
}
}
},
{
"$sort": {"_id": -1}
},
{
"$group": {
"_id": "$post_id",
"lastComment": {
"$first": "$_id"
}
}
}
])
I expected it to return each comment's whole document, but it only returns the _id field of each document. What is the proper way to get the most recent comments as whole documents (or at least to include some other fields)?

Currently you cannot get the whole comment document with a single $first operator. But you can include the other fields you need (the same way as the _id field) in the $group stage:
{
"$group": {
_id: "$post_id",
lastComment: { "$first": "$_id" },
field_1: { "$first": "$field_1" },
field_2: { "$first": "$field_2" },
// ...
field_N: { "$first": "$field_N" }
}
}
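Writing one $first line per field gets repetitive, so the stage can also be generated. A minimal sketch in plain JavaScript: the helper name buildFirstGroupStage and the field names author and text are hypothetical and not part of the original question.

```javascript
// Hypothetical helper: builds a $group stage that keeps the first value of
// each listed field per group. Do not pass "_id" in fields, because it would
// overwrite the group key.
function buildFirstGroupStage(groupField, fields) {
  const stage = { _id: "$" + groupField };
  for (const field of fields) {
    stage[field] = { $first: "$" + field };
  }
  return { $group: stage };
}

// "author" and "text" are illustrative field names only.
const groupStage = buildFirstGroupStage("post_id", ["author", "text"]);
console.log(JSON.stringify(groupStage));
```

The returned object can be placed in the pipeline after the $sort stage, in the same position as the hand-written $group stage.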
According to this JIRA ticket: https://jira.mongodb.org/browse/SERVER-5916, aggregation operations will be able to return the whole document starting from version 2.5.3. This will be possible using the new variables $$ROOT or $$CURRENT:
{
"$group": {
_id: "$post_id",
lastComment: { "$first": "$$CURRENT" }
}
}

As suggested, we can do:
{
"$group": {
_id: "$post_id",
lastComment: { "$first": "$$CURRENT" }
}
}
Then, on any MongoDB server 3.4 or above, add { '$replaceRoot': { 'newRoot': '$lastComment' } } to unwrap the result from {lastComment:{actualEntireObj}},{lastComment:{actualEntireObj}} to {actualEntireObj},{actualEntireObj}. This promotes the embedded document to the top level and replaces all other fields, such as the _id produced by the $group stage.
db.collection.aggregate([
{
"$match": {
"post_id": {
"$in": [ObjectId("52c5ce24dca32d32740c1435"), ObjectId("52c5ce24dca32d32740c15ad")]
}
}
},
{
"$sort": { "_id": -1 }
},
{
"$group": {
_id: "$post_id",
lastComment: { "$first": "$$CURRENT" }
}
},
{ '$replaceRoot': { 'newRoot': '$lastComment' } }
])
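To make the shape of the result concrete, here is a rough in-memory sketch of what the $sort, $group/$first and $replaceRoot stages do. It runs in plain JavaScript without a server; the sample documents are made up, and numeric _id values stand in for ObjectIds.

```javascript
// Made-up sample data; numeric _id values stand in for ObjectIds.
const comments = [
  { _id: 1, post_id: "a", text: "first on a" },
  { _id: 2, post_id: "b", text: "first on b" },
  { _id: 3, post_id: "a", text: "latest on a" },
];

function latestPerPost(docs) {
  // $sort: { _id: -1 }
  const sorted = [...docs].sort((x, y) => y._id - x._id);
  // $group with $first: "$$CURRENT" keeps the first document seen per key
  const firstByPost = new Map();
  for (const doc of sorted) {
    if (!firstByPost.has(doc.post_id)) {
      firstByPost.set(doc.post_id, doc);
    }
  }
  // $replaceRoot: the stored document itself becomes the output document
  return [...firstByPost.values()];
}

const latest = latestPerPost(comments);
console.log(latest);
```

Each output element is a whole comment, with no wrapper field. Unlike this sketch, MongoDB does not guarantee the order of documents coming out of $group.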

Related

MongoDB document merge without a-priori knowledge of fields

I would like to merge several documents. Most of the fields have the same values but there might be one or two fields that have different values. These fields are unknown beforehand. Ideally I would like to merge all the documents keeping the fields that are the same as is but creating an array of values only for those fields that have some variation.
For my first approach I grouped by a field common to my documents and kept the first document; this, however, discards information that varies in other fields.
group_documents = {
"$group": {
"_id": "$0020000E.Value",
"doc": {
"$first": "$$ROOT"
}
}
}
merge_documents = {
"$replaceRoot": {
"newRoot": "$doc"
}
}
write_collection = { "$out": { "db": "database", "coll": "records_nd" } }
pipeline = [group_documents, merge_documents, write_collection]
objects = coll.aggregate(pipeline)
If the fields that have different values were known, I would have done something like this:
merge_sol1
or
merge_sol2
or
merge_sol3
The third solution is actually very close to my desired output and I could tweak it a bit. But these answers assume a-priori knowledge of the fields to be merged.
You can first convert $$ROOT to an array of k-v tuples with $objectToArray. Then $group by field name and use $addToSet to collect the distinct values of each field into an array. Next, check the size of each result array and conditionally pick the first item if the size is 1 (i.e. the value is the same in every document); otherwise, keep the array. Finally, convert back to the original document form with $arrayToObject.
db.collection.aggregate([
{
$project: {
_id: "$key",
arr: {
"$objectToArray": "$$ROOT"
}
}
},
{
"$unwind": "$arr"
},
{
$match: {
"arr.k": {
$nin: [
"key",
"_id"
]
}
}
},
{
$group: {
_id: {
id: "$_id",
k: "$arr.k"
},
v: {
"$addToSet": "$arr.v"
}
}
},
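{
// regroup per document key so that all k-v pairs of one merged document end up in a single array
$group: {
_id: "$_id.id",
arr: {
$push: {
k: "$_id.k",
v: {
"$cond": {
"if": {
$gt: [
{
$size: "$v"
},
1
]
},
"then": "$v",
"else": {
$first: "$v"
}
}
}
}
}
}
},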
{
"$project": {
doc: {
"$arrayToObject": "$arr"
}
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
{
_id: "$_id"
},
"$doc"
]
}
}
}
])
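The intended merge logic can be checked without a server. The following is a plain JavaScript sketch of the same idea (collect distinct values per field, keep a scalar when there is only one); the sample documents and the helper name mergeByKey are made up for illustration.

```javascript
// Made-up sample documents; "key" plays the role of the grouping field.
const docs = [
  { _id: 1, key: "k1", name: "scan", modality: "CT" },
  { _id: 2, key: "k1", name: "scan", modality: "MR" },
];

function mergeByKey(documents, keyField) {
  // key -> field name -> distinct values (mirrors $group with $addToSet)
  const groups = new Map();
  for (const doc of documents) {
    const id = doc[keyField];
    if (!groups.has(id)) groups.set(id, new Map());
    const fields = groups.get(id);
    for (const [k, v] of Object.entries(doc)) {
      if (k === keyField || k === "_id") continue; // same as the $nin filter
      if (!fields.has(k)) fields.set(k, []);
      const values = fields.get(k);
      // Only primitive values are compared here; nested values would need
      // a deep-equality check.
      if (!values.includes(v)) values.push(v);
    }
  }
  // One distinct value stays a scalar, several become an array ($cond step)
  return [...groups].map(([id, fields]) => {
    const merged = { _id: id };
    for (const [k, values] of fields) {
      merged[k] = values.length > 1 ? values : values[0];
    }
    return merged;
  });
}

const mergedDocs = mergeByKey(docs, "key");
console.log(JSON.stringify(mergedDocs));
```

Here name stays a single value because both documents agree, while modality becomes an array. In MongoDB the order of elements produced by $addToSet is not specified.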

MongoDB: How to merge all documents into a single document in an aggregation pipeline

I have the current aggregation output as follows:
[
{
"courseCount": 14
},
{
"registeredStudentsCount": 1
}
]
The array has two documents. I would like to combine all the documents into a single document that has all the fields, in MongoDB.
db.collection.aggregate([
{
$group: {
_id: 0,
merged: {
$push: "$$ROOT"
}
}
},
{
$replaceRoot: {
newRoot: {
"$mergeObjects": "$merged"
}
}
}
])
Explained:
Group the output documents in one field with push
Replace the document root with the merged objects
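As a quick illustration of the merge step, Object.assign in plain JavaScript behaves like $mergeObjects in the respect that matters here: when two documents share a field, the later value wins. This is an in-memory sketch only; the helper name mergeAll and the total field are hypothetical.

```javascript
// The two documents from the question's aggregation output.
const results = [{ courseCount: 14 }, { registeredStudentsCount: 1 }];

// Hypothetical helper: folds an array of documents into one object,
// with later documents overriding earlier ones on shared field names.
function mergeAll(documents) {
  return Object.assign({}, ...documents);
}

const single = mergeAll(results);
console.log(single);

// Overlapping field names: the value from the last document is kept.
const overlap = mergeAll([{ total: 1 }, { total: 2 }]);
console.log(overlap);
```

So the approach fits outputs whose documents carry different field names, as in the question; with repeated names only the last value survives.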
{
$group: {
"_id": "null",
data: {
$push: "$$ROOT"
}
}
}
When you add this as the last pipeline stage, it puts all the docs under data, but data will be an array of objects.
In your case it would be
{ "data":[
{
"courseCount": 14
},
{
"registeredStudentsCount": 1
}
] }
Another approach would be,
db.collection.aggregate([
{
$group: {
"_id": "null",
f: {
$first: "$$ROOT",
},
l: {
$last: "$$ROOT"
}
}
},
{
"$project": {
"output": {
"courseCount": "$f.courseCount",
"registeredStudentsCount": "$l.registeredStudentsCount"
},
"_id": 0
}
}
])
It's not as dynamic as the first one, but since you have two docs, you can use this approach. It outputs
[
{
"output": {
"courseCount": 14,
"registeredStudentsCount": 1
}
}
]
With an extra stage in the second approach
{
"$replaceRoot": {
"newRoot": "$output"
}
}
You will get the output as
[
{
"courseCount": 14,
"registeredStudentsCount": 1
}
]

Return original documents only from mongoose group/aggregation operation

I have a filter + group operation on a bunch of documents (books). The grouping is to return only latest versions of books that share the same book_id (name). The below code works, but it's untidy since it returns redundant information:
return Book.aggregate([
{ $match: generateMLabQuery(rawQuery) },
{
$sort: {
"published_date": -1
}
},
{
$group: {
_id: "$book_id",
books: {
$first: "$$ROOT"
}
}
}
])
I end up with an array of objects that looks like this:
[{ _id: "aedrtgt6854earg864", books: { singleBookObject } }, {...}, {...}]
Essentially I only need the singleBookObject part, which is the original document (and what I'd be getting if I had done only the $match operation). Is there a way to get rid of the redundant _id and books parts within the aggregation pipeline?
You can use $replaceRoot:
Book.aggregate([
{ "$match": generateMLabQuery(rawQuery) },
{ "$sort": { "published_date": -1 }},
{ "$group": {
"_id": "$book_id",
"books": { "$first": "$$ROOT" }
}},
{ "$replaceRoot": { "newRoot": "$books" } }
])
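If the same "latest per group" pattern is needed for more than one model, the pipeline can be produced by a small function. This is only a sketch: the function name latestPerGroupPipeline and the language filter are hypothetical, and the match argument is assumed to be a plain query object like the one generateMLabQuery returns.

```javascript
// Hypothetical helper: returns a pipeline that keeps, per group, the document
// that sorts first on sortField (descending) and emits it unwrapped.
function latestPerGroupPipeline(matchQuery, groupField, sortField) {
  return [
    { $match: matchQuery },
    { $sort: { [sortField]: -1 } },
    { $group: { _id: "$" + groupField, doc: { $first: "$$ROOT" } } },
    { $replaceRoot: { newRoot: "$doc" } },
  ];
}

// { language: "en" } is a made-up filter used only for this example.
const pipeline = latestPerGroupPipeline({ language: "en" }, "book_id", "published_date");
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array could then be passed to Book.aggregate(pipeline) in place of the inline stages.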

$push with $group in mongo aggregation

I wrote this query, but the result is not exactly what I want. How can I use the $push operator to get the expected result?
Is it not possible to use $push in the $addFields and $project stages?
db.getCollection("event").aggregate([ {$match:{"name":"Add to Cart"}}, {$group:{"_id":"$browser",count:{$sum:1}}}]);
output:
{_id:chrome,count:3}
{_id:firefox,count:1}
{_id:edge,count:1}
expect output:
{
browser:[
{name:"chrome",count:3},
{name:"firefox",count:1},
{name:"edge",count:1}
]
}
my collection:
{
_id:1,
name:"Add to Cart",
"browser":"chrome"
}
{
_id:2,
name:"Searched",
"browser":"chrome"
}
{
_id:3,
name:"Add To Cart",
"browser":"edge"
}
{
_id:4,
name:"Item View",
"browser":"chrome"
}
{
_id:5,
name:"Add To Cart",
"browser":"Firefox"
}
You need to use one more $group stage here:
db.collection.aggregate([
{ "$group": {
"_id": "$browser",
"count": { "$sum": 1 }
}},
{ "$group": {
"_id": null,
"browser": {
"$push": {
"name": "$_id",
"count": "$count"
}
}
}},
{ "$project": { "_id": 0 }}
])
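The effect of the two $group stages can be traced in memory. The sketch below is plain JavaScript, not a MongoDB call; it uses the five documents from the question and, like the pipeline above, applies no $match filter. The function name countByBrowser is made up.

```javascript
// The sample collection from the question.
const events = [
  { _id: 1, name: "Add to Cart", browser: "chrome" },
  { _id: 2, name: "Searched", browser: "chrome" },
  { _id: 3, name: "Add To Cart", browser: "edge" },
  { _id: 4, name: "Item View", browser: "chrome" },
  { _id: 5, name: "Add To Cart", browser: "Firefox" },
];

function countByBrowser(docs) {
  // First $group: one counter per distinct browser value
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.browser, (counts.get(doc.browser) || 0) + 1);
  }
  // Second $group with _id: null: push every { name, count } pair into one array
  const browser = [...counts].map(([name, count]) => ({ name, count }));
  // $project { _id: 0 } leaves only the array field
  return { browser };
}

const summary = countByBrowser(events);
console.log(JSON.stringify(summary));
```

The first pass produces one entry per browser and the second collapses those entries into a single document. In MongoDB the order of the pushed elements follows the unspecified output order of the first $group.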

How to get top level elements as well as one level down array elements aggregate in one mongo query?

I have 2 mongo aggregate queries that work well separately -
db.transfer_orders.aggregate([
{
$match: {
"request_timestamp": { $gte: ISODate("2017-10-00T00:00:00.000Z"), $lt: ISODate("2017-10-12T00:00:00.000Z") },
"purpose": "POSITIONING"
}
},
{
$group: {
_id: null,
to_count: { $sum: 1 },
qty: { $sum: "$quantity" }
}
},
{
$project: {
_id: 0,
"to_count": "$to_count",
"qty": "$qty"
}
}
])
and
db.transfer_orders.aggregate([
{
$match: {
"request_timestamp": { $gte: ISODate("2017-10-00T00:00:00.000Z"), $lt: ISODate("2017-10-12T00:00:00.000Z") },
"purpose": "POSITIONING"
}
},
{
$unwind: "$adjustments"
},
{
$group: {
_id: null,
totalChangeQty: { $sum: "$adjustments.change_in_quantity"}
}
},
{
$project: {
_id: 0,
"adjusted_quantity": "$totalChangeQty"
}
}
])
The first query returns aggregate data of elements at the top level of the document, { "to_count" : 7810, "qty" : 19470 }
The second query returns aggregate data of elements at one level below the top level for the "adjustments" array - { "adjusted_quantity" : -960 }
Is there a way to write this as one query that will return both sets of data since the match criteria is the same for both?
The following aggregate operation should suffice. After the $match step it adds a stage that introduces the new field adjusted_quantity, computed with $sum, which here returns the sum of the change_in_quantity values in each document's adjustments array.
In the $group stage, you then total that per-document value with the $sum accumulator.
db.transfer_orders.aggregate([
{
"$match": {
// both bounds go in one object; a repeated key would keep only the last condition
"request_timestamp": { "$gte": ISODate("2017-10-00T00:00:00.000Z"), "$lt": ISODate("2017-10-12T00:00:00.000Z") },
"purpose": "POSITIONING"
}
},
{
"$addFields": {
"adjusted_quantity": {
"$sum": "$adjustments.change_in_quantity"
}
}
},
{
"$group": {
"_id": null,
"to_count": { "$sum": 1 },
"qty": { "$sum": "$quantity" },
"adjusted_quantity": { "$sum": "$adjusted_quantity" }
}
}
])
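To see why a per-document sum followed by a group total gives all three numbers at once, here is a small in-memory sketch in plain JavaScript. The two transfer orders are made up; the second has no adjustments array, which $sum inside $addFields would treat as 0.

```javascript
// Made-up transfer orders for illustration.
const orders = [
  { quantity: 10, adjustments: [{ change_in_quantity: -2 }, { change_in_quantity: -3 }] },
  { quantity: 5 },
];

function summarize(docs) {
  const totals = { to_count: 0, qty: 0, adjusted_quantity: 0 };
  for (const doc of docs) {
    // $addFields: per-document sum over the adjustments array (0 if absent)
    const adjusted = (doc.adjustments || []).reduce(
      (sum, a) => sum + a.change_in_quantity,
      0
    );
    // $group with _id: null: accumulate across all matched documents
    totals.to_count += 1;
    totals.qty += doc.quantity;
    totals.adjusted_quantity += adjusted;
  }
  return totals;
}

const report = summarize(orders);
console.log(report);
```

Because nothing is unwound, every matched order is counted exactly once in to_count and qty, regardless of how many adjustments it has.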