Aggregate objects' nested array occurrences count - mongodb

I'm trying to aggregate logs so that I can count how many times each keyword was favorited by a particular user. What I came up with is the following query:
db.a.aggregate([
{$unwind: "$keywords"},
{$group : {_id : {word : "$keywords", user : "$favorited_by"}, count : {$sum : 1}}}
]);
But it produces this output:
{ "_id" : { "word" : "anotherone", "user" : "too_creepy" }, "count" : 1 }
{ "_id" : { "word" : "test", "user" : "too_creepy" }, "count" : 2 }
Whilst I want to get something like this:
INPUT
{
_id: ObjectId("5475cf117ccee624583ba94a"),
favorited_by: "too_creepy",
keywords: [
"test"
]
},
{
_id: ObjectId("5475cf117ccee624583ba949"),
favorited_by: "too_creepy",
keywords: [
"test"
]
},
{
_id: ObjectId("5475cf117ccee624583ba949"),
favorited_by: "too_creepy",
keywords: [
"anotherone"
]
},
{
_id: ObjectId("5475cf117ccee624583ba09a"),
favorited_by: "hello_world",
keywords: [
"test"
]
}
OUTPUT
{
favorited_by: "too_creepy",
keywords: [
{keyword: "test", count: 2},
{keyword: "anotherone", count: 1}
]
},
{
favorited_by: "hello_world",
keywords: [
{keyword: "test", count: 1}
]
}
Any ideas on how to write this query, if it's even possible?

You can do that by adding a second $group to your pipeline followed up with a final $project to reshape the output a bit:
db.a.aggregate([
{$unwind: "$keywords"},
{$group: {_id: {word: "$keywords", user: "$favorited_by"}, count: {$sum: 1}}},
// Group again on just user, and use $push to assemble an array of their keywords
{$group: {
_id: '$_id.user',
keywords: {$push: {keyword: '$_id.word', count: '$count'}}
}},
// Reshape the output
{$project: {favorited_by: '$_id', keywords: 1, _id: 0}}
]);
Output:
{
"keywords" : [
{
"keyword" : "anotherone",
"count" : 1
},
{
"keyword" : "test",
"count" : 2
}
],
"favorited_by" : "too_creepy"
},
{
"keywords" : [
{
"keyword" : "test",
"count" : 1
}
],
"favorited_by" : "hello_world"
}
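For readers who want to check the logic without a database, here is a minimal plain-JavaScript sketch of what the pipeline above computes. It only emulates the stages in memory; the helper name keywordCountsByUser and the sample array are assumptions made for illustration, not part of MongoDB.

```javascript
// Sample documents mirroring the INPUT from the question (ObjectIds omitted).
const docs = [
  { favorited_by: "too_creepy", keywords: ["test"] },
  { favorited_by: "too_creepy", keywords: ["test"] },
  { favorited_by: "too_creepy", keywords: ["anotherone"] },
  { favorited_by: "hello_world", keywords: ["test"] },
];

// Hypothetical helper: emulates $unwind plus the two $group stages in memory.
function keywordCountsByUser(documents) {
  // Stage 1: count each (user, keyword) pair, like $unwind followed by the first $group.
  const pairCounts = new Map();
  for (const doc of documents) {
    for (const word of doc.keywords) {
      const key = JSON.stringify([doc.favorited_by, word]);
      pairCounts.set(key, (pairCounts.get(key) || 0) + 1);
    }
  }
  // Stage 2: regroup by user and push {keyword, count}, like the second $group.
  const byUser = new Map();
  for (const [key, count] of pairCounts) {
    const [user, keyword] = JSON.parse(key);
    if (!byUser.has(user)) byUser.set(user, []);
    byUser.get(user).push({ keyword, count });
  }
  // Final reshape, like the $project stage.
  return [...byUser].map(([favorited_by, keywords]) => ({ favorited_by, keywords }));
}

const keywordResult = keywordCountsByUser(docs);
console.log(JSON.stringify(keywordResult, null, 2));
```

The key point is the same as in the pipeline: count at the finest granularity first, then regroup on the coarser key and collect the counts into an array.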

Related

Maintain order as original document while using addToSet in MongoDb

I read the documentation and found that $addToSet doesn't guarantee order.
Is there any way I can preserve the order of the original document?
My query is:
aggregate([
{$match: {$or: [{"Name.No": "119"}, {"Name.No": "120"}]}},
{$project: {x: {$objectToArray: "$Results"}}},
{$unwind: "$x"},
{$group: {_id: "$x.k", distinctVals: {$addToSet: "$x.v.TCR"}}}
])
Sample Data:
{"Name" : {"No." : "119","Time" : "t"},
"Results":{"K1" : {"Counters" : x, "TCR" : [{"Name" : "K11", "Result" : "PASSED"},
{"Name" : "K12","Result" : "FAILED"},
{"Name" : "K13","Result" : "PASSED"}]
},
"K2" : {"Counters": y, "TCR" : [{"Name" : "K21","Result" : "PASSED"},
{"Name" : "K22","Result" : "PASSED"}]
}
}
}
Job2:
{"Name" : {"No." : "120","Time" : "t1"},
"Results":{"K1" : {"Counters" : x, "TCR" : [{"Name" : "K11", "Result" : "PASSED"},
{"Name" : "K12","Result" : "PASSED"},
{"Name" : "K13","Result" : "FAILED"}]
},
"K3" : {"Counters": y, "TCR" : [{"Name" : "K31","Result" : "PASSED"},
{"Name" : "K32","Result" : "PASSED"}]
}
}
}
Expected:
{"Name" : {"No." : "119-120","Time" : "lowest(t,t1)"},
"Results":{"K1" : {"Counters" : x, "TCR" : [{"Name" : "K11", "Result" : "PASSED"},
{"Name" : "K12","Result" : "PASSED"},
{"Name" : "K13","Result" : "PASSED"}]
},
"K2" : {"Counters": y, "TCR" : [{"Name" : "K21","Result" : "PASSED"},
{"Name" : "K22","Result" : "PASSED"}]
},
"K3" : {"Counters": y, "TCR" : [{"Name" : "K31","Result" : "PASSED"},
{"Name" : "K32","Result" : "PASSED"}]
}
}
}
I want to keep the same order as in the original document. The documents change every time, so I can't sort on any particular field.
$addFields converts the Results object to an array using $objectToArray
$unwind deconstructs the Results array
$unwind deconstructs the Results.v.TCR array
$match keeps only the entries whose Result is PASSED
$group by Results.k takes the first Name and the first Counters, and builds an array of Results.v.TCR
$group by null takes the minimum Time, builds a unique array of No, and builds the Results array as key-value pairs; $reduce iterates over TCR and removes duplicate documents
$project shows the required fields, converts the Results array back to an object using $arrayToObject, and joins the No array into a string with "-"
db.collection.aggregate([
{ $addFields: { Results: { $objectToArray: "$Results" } } },
{ $unwind: "$Results" },
{ $unwind: "$Results.v.TCR" },
{ $match: { "Results.v.TCR.Result": "PASSED" } },
{
$group: {
_id: "$Results.k",
Name: { $first: "$Name" },
Counters: { $first: "$Results.v.Counters" },
TCR: { $push: "$Results.v.TCR" }
}
},
{
$group: {
_id: null,
Time: { $min: "$Name.Time" },
No: { $addToSet: "$Name.No" },
Results: {
$push: {
k: "$_id",
v: {
Counters: "$Counters",
TCR: {
$reduce: {
input: "$TCR",
initialValue: [],
in: {
$cond: [
{
$in: [
{
Name: "$$this.Name",
Result: "$$this.Result"
},
"$$value"
]
},
"$$value",
{
$concatArrays: [
"$$value",
[
{
Name: "$$this.Name",
Result: "$$this.Result"
}
]
]
}
]
}
}
}
}
}
}
}
},
{
$project: {
_id: 0,
Results: { $arrayToObject: "$Results" },
Name: {
Time: "$Time",
No: {
$reduce: {
input: "$No",
initialValue: "",
in: {
$concat: [
"$$value",
{ $cond: [{ $eq: ["$$value", ""]}, "", "-"] },
"$$this"
]
}
}
}
}
}
}
])
The "." (dot) in the "No." field name is problematic and may cause issues in MongoDB query operations; I would suggest not using "." (dot) in field names.
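The order-preserving part of the answer is the $reduce with $in and $concatArrays: an entry is appended only if it is not already in the accumulated array, so first-seen order is kept. A minimal plain-JavaScript sketch of that idea follows; the helper name dedupePreservingOrder and the sample array are assumptions for illustration.

```javascript
// Hypothetical helper: remove duplicate {Name, Result} entries, keeping first-seen order.
// This mirrors the $reduce / $in / $concatArrays expression in the pipeline.
function dedupePreservingOrder(entries) {
  return entries.reduce((acc, entry) => {
    // Like $in: is an equal {Name, Result} pair already accumulated?
    const seen = acc.some((e) => e.Name === entry.Name && e.Result === entry.Result);
    // Like $cond: keep the accumulator unchanged, or append with $concatArrays.
    return seen ? acc : acc.concat([{ Name: entry.Name, Result: entry.Result }]);
  }, []);
}

const tcr = [
  { Name: "K11", Result: "PASSED" },
  { Name: "K12", Result: "PASSED" },
  { Name: "K11", Result: "PASSED" }, // duplicate coming from the second job
  { Name: "K13", Result: "PASSED" },
];

const uniqueTcr = dedupePreservingOrder(tcr);
console.log(uniqueTcr.map((e) => e.Name)); // [ 'K11', 'K12', 'K13' ]
```

Unlike a set-based accumulator, this never reorders elements; it only skips ones that were already seen.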

Aggregate multiple arrays by field value [duplicate]

With a document as shown below, I am trying to aggregate the data so that my final output is the sum of each user's received and sent values.
Document
{
"_id" : 1,
"received" : [
{ "name" : "david", "value" : 15 },
{ "name" : "sarah", "value" : 10 },
{ "name" : "sarah", "value" : 15 }
],
"sent" : [
{ "name" : "david", "value" : 10 },
{ "name" : "sarah", "value" : 20 },
{ "name" : "david", "value" : 15 }
]
}
Desired Result (or similar)
{
"name": "david",
"received": 15,
"sent": 25
},
{
"name": "sarah",
"received": 25,
"sent": 20
}
I have tried to unwind received and sent, but I end up with a lot of duplicates, and honestly I have no idea whether this sort of output can even be created without bringing the dataset into my client first.
Further searching of StackOverflow led me to "mongodb aggregate multiple arrays", which provides a suitable answer. I have flagged this as a duplicate.
My final solution, created by following the above post, is as follows:
[
{
'$addFields': {
'received.type': 'received',
'sent.type': 'sent'
}
}, {
'$project': {
'movements': {
'$concatArrays': [
'$received', '$sent'
]
}
}
}, {
'$unwind': {
'path': '$movements'
}
}, {
'$project': {
'name': '$movements.name',
'type': '$movements.type',
'value': '$movements.value'
}
}, {
'$group': {
'_id': '$name',
'sent': {
'$sum': {
'$cond': {
'if': {
'$eq': [
'$type', 'sent'
]
},
'then': '$value',
'else': 0
}
}
},
'received': {
'$sum': {
'$cond': {
'if': {
'$eq': [
'$type', 'received'
]
},
'then': '$value',
'else': 0
}
}
}
}
}
]
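As a rough check of what this pipeline does, here is a plain-JavaScript sketch of the same three ideas: tag each entry with its source array, merge the two arrays, then do a conditional sum per name. The helper name totalsByName is hypothetical, and the code runs in memory rather than against MongoDB.

```javascript
// The sample document from the question.
const movementDoc = {
  _id: 1,
  received: [
    { name: "david", value: 15 },
    { name: "sarah", value: 10 },
    { name: "sarah", value: 15 },
  ],
  sent: [
    { name: "david", value: 10 },
    { name: "sarah", value: 20 },
    { name: "david", value: 15 },
  ],
};

// Hypothetical helper emulating the pipeline stages in memory.
function totalsByName(doc) {
  // Tag each entry with its source array, then merge (like $addFields + $concatArrays).
  const movements = [
    ...doc.received.map((m) => ({ ...m, type: "received" })),
    ...doc.sent.map((m) => ({ ...m, type: "sent" })),
  ];
  // Conditional sum per name (like $group with $sum over $cond).
  const perName = new Map();
  for (const { name, type, value } of movements) {
    if (!perName.has(name)) perName.set(name, { name, received: 0, sent: 0 });
    perName.get(name)[type] += value;
  }
  return [...perName.values()];
}

const nameTotals = totalsByName(movementDoc);
console.log(nameTotals);
```

Because the two arrays are merged before grouping, each entry is counted exactly once, which avoids the duplicates produced by unwinding both arrays on the same document.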
Add a $match stage at the top to filter the documents
$facet computes two separate results, by sent and by received, over the same documents
$group merges the bySent and byReceived fields of the previous stage
$unwind twice to unwind the merged array of arrays
$replaceRoot replaces the root with byBoth
$group merges the results back together per name
$project keeps only the required fields
aggregation pipeline
db.ttt.aggregate([
{$facet : {
"byReceived" :[
{$unwind : "$received"},
{$group: {_id : "$received.name", received : {$sum : "$received.value"}}}
],
"bySent" :[
{$unwind : "$sent"},
{$group: {_id : "$sent.name", sent : {$sum : "$sent.value"}}}
]
}},
{$group: {_id:null, byBoth : {$push :{$concatArrays : ["$bySent", "$byReceived"]}}}},
{$unwind : "$byBoth"},
{$unwind : "$byBoth"},
{$replaceRoot: { newRoot: "$byBoth" }},
{$group : {_id : "$_id", sent : {$sum : "$sent"}, received : {$sum : "$received"}}},
{$project : {_id:0, name:"$_id", sent:"$sent", received:"$received"}}
])
result
{ "name" : "david", "sent" : 25, "received" : 15 }
{ "name" : "sarah", "sent" : 20, "received" : 25 }

How to join two Aggregation results in MongoDB?

I have a data set that looks like this:
{"BrandId":"a","SessionId":100,"Method": "POST"}
{"BrandId":"a","SessionId":200,"Method": "PUT"}
{"BrandId":"a","SessionId":200,"Method": "GET"}
{"BrandId":"b","SessionId":300,"Method": "GET"}
I wrote an aggregation to count distinct session IDs by BrandId:
db.collection.aggregate([
{$group: {
"_id": {
brand: "$BrandId",
session: "$SessionId"
},
count: {$sum: 1}
}},
{$group: {
_id: "$_id.brand",
countSession:{$sum:1}
}}
])
The expected result of the query is :
{ "_id" : "a", "countSession" : 2 }
{ "_id" : "b", "countSession" : 1 }
Another query is to count where the Method is POST by brand:
db.collection.aggregate([
{$match: {Method:"POST"}},
{$group: {
_id: '$BrandId',
countPOST:{$sum:1}
}}
])
The expected result:
{ "_id" : "a", "countPOST" : 1 }
{ "_id" : "b", "countPOST" : 0 }
Now I want to combine these two queries and get the following result:
{"BrandId":"a","countSession":2,"countPOST":1}
{"BrandId":"b","countSession":1,"countPOST":0}
I don't know how to combine the results of these two aggregations. Can anyone help?
You can use the $cond operator as follows.
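db.collection.aggregate([
// Group by (brand, session) and count the POST requests within each session
{
'$group': {
'_id': {'BrandId':'$BrandId','Session': '$SessionId'},
'countPOST':{
'$sum':{
'$cond': [{'$eq':['$Method','POST']},1,0]
}
}
}
},
// Group by brand: each incoming document is one distinct session
{
'$group': {
'_id': '$_id.BrandId',
'countSession': {'$sum':1},
'countPOST': {'$sum': '$countPOST'}
}
}
])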
Output:
{
"result" : [
{
"_id" : "a",
"countSession" : 2,
"countPOST" : 1
},
{
"_id" : "b",
"countSession" : 1,
"countPOST" : 0
}
],
"ok" : 1
}
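To see why one pass is enough for both counts, here is a small plain-JavaScript sketch of the same computation: distinct sessions per brand plus a conditional POST count. It is an in-memory emulation under the question's sample data; the helper name brandStats is hypothetical.

```javascript
// The sample data set from the question.
const requests = [
  { BrandId: "a", SessionId: 100, Method: "POST" },
  { BrandId: "a", SessionId: 200, Method: "PUT" },
  { BrandId: "a", SessionId: 200, Method: "GET" },
  { BrandId: "b", SessionId: 300, Method: "GET" },
];

// Hypothetical helper: one pass that tracks distinct sessions and POST counts per brand.
function brandStats(rows) {
  const stats = new Map();
  for (const { BrandId, SessionId, Method } of rows) {
    if (!stats.has(BrandId)) stats.set(BrandId, { sessions: new Set(), countPOST: 0 });
    const entry = stats.get(BrandId);
    entry.sessions.add(SessionId);               // distinct sessions, like grouping on (brand, session)
    if (Method === "POST") entry.countPOST += 1; // conditional count, like $sum over $cond
  }
  return [...stats].map(([BrandId, { sessions, countPOST }]) => ({
    BrandId,
    countSession: sessions.size,
    countPOST,
  }));
}

const brandResult = brandStats(requests);
console.log(brandResult);
```

Brand "b" still appears with countPOST 0 because nothing is filtered out up front; a $match on Method would have dropped it.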

MongoDB Aggregation using nested element

I have a collection with documents like this:
{
"_id" : "15",
"name" : "empty",
"location" : "5th Ave",
"owner" : "machine",
"visitors" : [
{
"type" : "M",
"color" : "blue",
"owner" : "Steve Cooper"
},
{
"type" : "K",
"color" : "red",
"owner" : "Luis Martinez"
},
// A lot more of these
]
}
I want to group by visitors.owner to find which owner has the most visits. I tried this:
db.mycol.aggregate(
[
{$group: {
_id: {owner: "$visitors.owner"},
visits: {$addToSet: "$visits"},
count: {$sum: "comments"}
}},
{$sort: {count: -1}},
{$limit: 1}
]
)
But I always get count = 0, and visits does not correspond to a single owner :/
Please help
Try the following aggregation pipeline:
db.mycol.aggregate([
{
"$unwind": "$visitors"
},
{
"$group": {
"_id": "$visitors.owner",
"count": { "$sum": 1}
}
},
{
"$project": {
"_id": 0,
"owner": "$_id",
"visits": "$count"
}
}
]);
Using the sample document you provided in your question, the result is:
/* 0 */
{
"result" : [
{
"owner" : "Luis Martinez",
"visits" : 1
},
{
"owner" : "Steve Cooper",
"visits" : 1
}
],
"ok" : 1
}
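The pipeline above returns a count per owner; to answer "which owner has the most visits" the counts still need to be ranked, which in a pipeline would presumably mean keeping the $sort and $limit stages from the question. Here is a minimal plain-JavaScript sketch of unwind, group, and sort. The helper name visitsByOwner is hypothetical, and the third visitor entry is made up so that the ranking has a clear winner.

```javascript
// One document shaped like the question's sample; the third visitor is a hypothetical addition.
const place = {
  _id: "15",
  visitors: [
    { type: "M", color: "blue", owner: "Steve Cooper" },
    { type: "K", color: "red", owner: "Luis Martinez" },
    { type: "M", color: "green", owner: "Steve Cooper" },
  ],
};

// Hypothetical helper: emulates $unwind + $group + $sort in memory.
function visitsByOwner(documents) {
  const counts = new Map();
  for (const doc of documents) {
    for (const visitor of doc.visitors) {          // like $unwind: one item per visitor
      counts.set(visitor.owner, (counts.get(visitor.owner) || 0) + 1); // like $sum: 1
    }
  }
  return [...counts]
    .map(([owner, visits]) => ({ owner, visits }))
    .sort((a, b) => b.visits - a.visits);          // like $sort: { visits: -1 }
}

const ranked = visitsByOwner([place]);
console.log(ranked[0]); // { owner: 'Steve Cooper', visits: 2 }
```

Taking ranked[0] plays the role of a final limit to one result.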

Mongodb creating alias in a query

What is MongoDB's equivalent of this query:
SELECT "foo" as bar, id as "spec" from tablename
It is possible to create a new field with a given name and a value taken from another field with $project. Given a document such as:
{
"_id" : 1,
title: "abc123",
isbn: "0001122223334",
author: { last: "zzz", first: "aaa" },
copies: 5
}
The following $project stage adds the new fields isbn, lastName, and copiesSold:
db.books.aggregate(
[
{
$project: {
title: 1,
isbn: {
prefix: { $substr: [ "$isbn", 0, 3 ] },
group: { $substr: [ "$isbn", 3, 2 ] },
publisher: { $substr: [ "$isbn", 5, 4 ] },
title: { $substr: [ "$isbn", 9, 3 ] },
checkDigit: { $substr: [ "$isbn", 12, 1] }
},
lastName: "$author.last",
copiesSold: "$copies"
}
}
]
)
http://docs.mongodb.org/manual/reference/operator/aggregation/project/#pipe._S_project
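Applied to the SQL in the question, the same idea might look like the stage below. This is a sketch under assumptions: it assumes the documents have an id field, it uses $literal so that "foo" is treated as a constant string, and the variable names aliasStage, rows, and aliased are made up for illustration. The plain-JavaScript map underneath shows the output shape the stage is meant to produce.

```javascript
// Assumed translation of: SELECT "foo" AS bar, id AS "spec" FROM tablename
// (the stage is only defined here as a plain object; it is not run against a database)
const aliasStage = {
  $project: {
    _id: 0,
    bar: { $literal: "foo" }, // constant value under the alias "bar"
    spec: "$id",              // value of the assumed "id" field under the alias "spec"
  },
};

// Plain-JavaScript equivalent on hypothetical rows, to show the intended output shape.
const rows = [
  { id: 1, other: "x" },
  { id: 2, other: "y" },
];
const aliased = rows.map((row) => ({ bar: "foo", spec: row.id }));

console.log(JSON.stringify(aliasStage));
console.log(aliased); // [ { bar: 'foo', spec: 1 }, { bar: 'foo', spec: 2 } ]
```

The stage would be passed as one element of the array given to aggregate, in the same way as the $project stage above.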
You can use any operator, such as $toUpper, $toLower, or $concat, to compute a value and assign it to an alias.
Example:
In the following example created_time is a field in the collection.
(I am not good with syntax so you can correct it, but this is the approach)
{$project: {
// $concat takes an array of string expressions
"ALIAS_one" : {"$concat" : ["$created_time"]},
"ALIAS_two" : {"$concat" : ["$created_time"]},
"ALIAS_three" : {"$concat" : ["$created_time"]}
}}
So, using an operator in that fashion, you can create as many aliases as you like.
You can use this; it may help.
database data
{ "_id" : "5ab0f445edf197158835be63", "userid" : "5aaf15c28264ee17fe869ad8", "lastmodified" : ISODate("2018-03-21T07:04:41.735Z") }
{ "_id" : "5ab0f445edf197158835be64", "userid" : "5aaf15c28264ee17fe869ad8", "lastmodified" : ISODate("2018-02-20T12:31:08.896Z") }
{ "_id" : "5ab0f445edf197158835be65", "userid" : "5aaf15c28264ee17fe869ad7", "lastmodified" : ISODate("2018-02-20T02:31:08.896Z") }
mongo command
db.zhb_test.aggregate(
[{
$group: {
_id: {
$dateToString: {
format: "%Y-%m",
date: "$lastmodified"
}
},
count: {
$sum: 1
}
}
},
{
$project: {
"month": "$_id",
count: 1,
"_id": 0
}
}])
result
{ "count" : 2, "month" : "2018-02" }
{ "count" : 1, "month" : "2018-03" }
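For comparison, here is a plain-JavaScript sketch of the same month grouping, with the month string exposed under the alias month. It relies on toISOString, which formats in UTC; the helper name countByMonth is hypothetical. The output order follows first appearance in the input, so it may differ from the order shown above.

```javascript
// The three sample records from above (only the date field matters here).
const records = [
  { lastmodified: new Date("2018-03-21T07:04:41.735Z") },
  { lastmodified: new Date("2018-02-20T12:31:08.896Z") },
  { lastmodified: new Date("2018-02-20T02:31:08.896Z") },
];

// Hypothetical helper: group by "YYYY-MM" and expose the key under the alias "month".
function countByMonth(items) {
  const counts = new Map();
  for (const { lastmodified } of items) {
    // First 7 characters of the ISO string, e.g. "2018-03" (UTC).
    const month = lastmodified.toISOString().slice(0, 7);
    counts.set(month, (counts.get(month) || 0) + 1);
  }
  // Like the $project stage: rename the grouping key to "month".
  return [...counts].map(([month, count]) => ({ count, month }));
}

const monthCounts = countByMonth(records);
console.log(monthCounts);
```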