MongoDB grouping aggregate query with relationship - mongodb

Let's say that I have the following documents in the association collection:
{
"id" : 1,
"parentId" : 1,
"position" : {
"x" : 1,
"y" : 1
},
"tag" : "Beta"
},
{
"id" : 2,
"parentId" : 2,
"position" : {
"x" : 2,
"y" : 2
},
"tag" : "Alpha"
},
{
"id" : 3,
"parentId" : 1,
"position" : {
"x" : 3,
"y" : 3
},
"tag" : "Delta"
},
{
"id" : 4,
"parentId" : 1,
"position" : {
"x" : 4,
"y" : 4
},
"tag" : "Gamma"
},
{
"id" : 5,
"parentId" : 2,
"position" : {
"x" : 5,
"y" : 6
},
"tag" : "Epsilon"
}
I would like to create an aggregate query to produce the following result:
{
"_id" : 2,
"position" : {
"x" : 2,
"y" : 2
},
"tag" : "Alpha",
"children" : [
{
"position" : {
"x" : 5,
"y" : 6
},
"tag" : "Epsilon"
}
]
},
{
"_id" : 1,
"position" : {
"x" : 1,
"y" : 1
},
"tag" : "Beta",
"children" : [
{
"position" : {
"x" : 3,
"y" : 3
},
"tag" : "Delta"
},
{
"position" : {
"x" : 4,
"y" : 4
},
"tag" : "Gamma"
}
]
}
However, I was only able to create the following grouping query, which puts all related documents (the parent included) into the children array:
db.association.aggregate([{
$group : {
_id : "$parentId",
children : {
$push : {
position : "$position",
tag :"$tag"
}
}
}
}])
I don't know how to filter out "position" and "tag" specific to "parent" points and put them at the top level.

Although Valijon's answer works, it requires the documents to be sorted first.
Here's a solution that doesn't need sorting and uses the $graphLookup stage instead, which fits this parent/child lookup well:
db.collection.aggregate([
{
$graphLookup: {
from: "collection",
startWith: "$id",
connectFromField: "id",
connectToField: "parentId",
as: "children",
}
},
{
$match: {
$expr: {
$gt: [
{
$size: "$children"
},
0
]
}
}
},
{
$addFields: {
children: {
$filter: {
input: "$children",
as: "child",
cond: {
$ne: [
"$id",
"$$child.id"
]
}
}
}
}
}
])
The first stage does the actual work: for each document, it collects (recursively) the documents whose parentId matches its id.
The second stage drops the documents that don't have any children.
The third stage only removes the parent from its own children array. If you can remove the self-reference in the parent (parentId equal to its own id), this last stage is no longer needed.
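To make the three stages easier to follow, here is a minimal plain-JavaScript sketch of the same idea on an in-memory copy of the question's documents. The helper name attachChildren is hypothetical, and the sketch only follows one level of parent/child links (which is all the sample data has), whereas $graphLookup recurses.

```javascript
// In-memory stand-in for the documents from the question.
const assocDocs = [
  { id: 1, parentId: 1, position: { x: 1, y: 1 }, tag: "Beta" },
  { id: 2, parentId: 2, position: { x: 2, y: 2 }, tag: "Alpha" },
  { id: 3, parentId: 1, position: { x: 3, y: 3 }, tag: "Delta" },
  { id: 4, parentId: 1, position: { x: 4, y: 4 }, tag: "Gamma" },
  { id: 5, parentId: 2, position: { x: 5, y: 6 }, tag: "Epsilon" },
];

// Hypothetical helper mirroring the three stages; input order does not matter.
function attachChildren(all) {
  return all
    // Stage 1: find every document whose parentId points at this document.
    .map((doc) => ({ ...doc, children: all.filter((d) => d.parentId === doc.id) }))
    // Stage 2: keep only the documents that matched something.
    .filter((doc) => doc.children.length > 0)
    // Stage 3: drop the self-referencing parent from its own children.
    .map((doc) => ({ ...doc, children: doc.children.filter((c) => c.id !== doc.id) }));
}

const parentsWithChildren = attachChildren(assocDocs);
console.log(JSON.stringify(parentsWithChildren, null, 2));
```

As in the pipeline, each child here still carries its id and parentId; a final projection would be needed to trim the children down to position and tag only.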

If we make sure the documents are sorted (parent, child 1, child 2, ..., child n), we can merge each grouped document with its first child, which is the parent. This relies on the parent having the smallest id in its group, as in the sample data. In the last step, we remove the parent from the children array.
Try this one:
db.association.aggregate([
{
$sort: {
parentId: 1,
id: 1
}
},
{
$group: {
_id: "$parentId",
children: {
$push: {
position: "$position",
tag: "$tag"
}
}
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
"$$ROOT",
{
$arrayElemAt: [
"$children",
0
]
}
]
}
}
},
{
$addFields: {
children: {
$slice: [
"$children",
1,
{
$size: "$children"
}
]
}
}
}
])
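The same sort, group, and split steps can be sketched in plain JavaScript. This is only an illustration on in-memory data: the function name groupByParent is hypothetical, and it assumes, like the pipeline, that the parent has the smallest id within its parentId group.

```javascript
// The question's documents, deliberately out of order.
const shuffledDocs = [
  { id: 4, parentId: 1, position: { x: 4, y: 4 }, tag: "Gamma" },
  { id: 5, parentId: 2, position: { x: 5, y: 6 }, tag: "Epsilon" },
  { id: 1, parentId: 1, position: { x: 1, y: 1 }, tag: "Beta" },
  { id: 3, parentId: 1, position: { x: 3, y: 3 }, tag: "Delta" },
  { id: 2, parentId: 2, position: { x: 2, y: 2 }, tag: "Alpha" },
];

function groupByParent(all) {
  // $sort: parentId first, then id, so the parent leads its group.
  const sorted = [...all].sort((a, b) => a.parentId - b.parentId || a.id - b.id);

  // $group with $push: collect { position, tag } per parentId, in sorted order.
  const groups = new Map();
  for (const { parentId, position, tag } of sorted) {
    if (!groups.has(parentId)) groups.set(parentId, []);
    groups.get(parentId).push({ position, tag });
  }

  // $replaceRoot/$mergeObjects + $slice: the first entry becomes the top
  // level, the remaining entries stay in children.
  return [...groups].map(([_id, [first, ...rest]]) => ({ _id, ...first, children: rest }));
}

const groupedByParent = groupByParent(shuffledDocs);
console.log(JSON.stringify(groupedByParent, null, 2));
```

If a child could ever have a smaller id than its parent, the first element would no longer be the parent, and an explicit check such as id === parentId would be the safer way to pick it.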

Related

MongoDB - Group by and count value, but treat per record as one

I want to group by and count follow_user.tags.tag_id per record, so no matter how many times the same tag_id shows up in the same record, it only counts as 1.
My database structure looks like this:
{
"external_userid" : "EXID1",
"follow_user" : [
{
"userid" : "USERID1",
"tags" : [
{
"tag_id" : "TAG1"
}
]
},
{
"userid" : "USERID2",
"tags" : [
{
"tag_id" : "TAG1"
},
{
"tag_id" : "TAG2"
}
]
}
]
},
{
"external_userid" : "EXID2",
"follow_user" : [
{
"userid" : "USERID1",
"tags" : [
{
"tag_id" : "TAG2"
}
]
}
]
}
Here's my query:
[
{ "$unwind": "$follow_user" }, { "$unwind": "$follow_user.tags" },
{ "$group" : { "_id" : { "follow_user᎐tags᎐tag_id" : "$follow_user.tags.tag_id" }, "COUNT(_id)" : { "$sum" : 1 } } },
{ "$project" : { "total" : "$COUNT(_id)", "tagId" : "$_id.follow_user᎐tags᎐tag_id", "_id" : 0 } }
]
What I expected:
{
"total" : 1,
"tagId" : "TAG1"
},
{
"total" : 2,
"tagId" : "TAG2"
}
What I get:
{
"total" : 2,
"tagId" : "TAG1"
},
{
"total" : 2,
"tagId" : "TAG2"
}
1. $set - Create a new field follow_user_tags.
1.1. $setUnion - Get the distinct values from the result of 1.1.1.
1.1.1. $reduce - Add the values of follow_user.tags.tag_id into an array.
2. $unwind - Deconstruct the follow_user_tags array field into multiple documents.
3. $group - Group by follow_user_tags and compute the total count via $sum.
4. $project - Decorate the output document.
db.collection.aggregate([
{
$set: {
follow_user_tags: {
$setUnion: {
"$reduce": {
"input": "$follow_user.tags",
"initialValue": [],
"in": {
"$concatArrays": [
"$$value",
"$$this.tag_id"
]
}
}
}
}
}
},
{
$unwind: "$follow_user_tags"
},
{
$group: {
_id: "$follow_user_tags",
total: {
$sum: 1
}
}
},
{
$project: {
_id: 0,
tagId: "$_id",
total: 1
}
}
])
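For readers who want to check the logic without a database, the steps above can be sketched in plain JavaScript on the question's two records. The function name countTagsPerRecord is hypothetical; a Set plays the role of $setUnion, so each tag is counted at most once per record.

```javascript
// The two sample records from the question.
const followRecords = [
  {
    external_userid: "EXID1",
    follow_user: [
      { userid: "USERID1", tags: [{ tag_id: "TAG1" }] },
      { userid: "USERID2", tags: [{ tag_id: "TAG1" }, { tag_id: "TAG2" }] },
    ],
  },
  {
    external_userid: "EXID2",
    follow_user: [{ userid: "USERID1", tags: [{ tag_id: "TAG2" }] }],
  },
];

function countTagsPerRecord(records) {
  const totals = new Map();
  for (const record of records) {
    // $reduce + $concatArrays flatten the tag ids; the Set removes duplicates
    // within this record, like $setUnion.
    const distinct = new Set(
      record.follow_user.flatMap((user) => user.tags.map((t) => t.tag_id))
    );
    // $unwind + $group: add 1 per record for every distinct tag.
    for (const tagId of distinct) {
      totals.set(tagId, (totals.get(tagId) || 0) + 1);
    }
  }
  // $project: shape the output like the expected result.
  return [...totals].map(([tagId, total]) => ({ total, tagId }));
}

const tagTotals = countTagsPerRecord(followRecords);
console.log(tagTotals);
```

TAG1 appears twice in the first record but contributes only once, which gives the totals the question expects (1 for TAG1, 2 for TAG2).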

MongoDB: find the student with the maximum score in an embedded array

I am new to MongoDB, please help me out.
I have details for more than 500 students, like this:
{
"_id" : 7,
"name" : "Salena Olmos",
"scores" : [
{
"score" : 90.37826509157176,
"type" : "exam"
},
{
"score" : 42.48780666956811,
"type" : "quiz"
},
{
"score" : 96.52986171633331,
"type" : "homework"
}
]
},
/* 2 */
{
"_id" : 8,
"name" : "Daphne Zheng",
"scores" : [
{
"score" : 22.13583712862635,
"type" : "exam"
},
{
"score" : 14.63969941335069,
"type" : "quiz"
},
{
"score" : 75.94123677556644,
"type" : "homework"
}
]
}
I need to find the one student who got the highest marks for "type" exam.
The output should be as follows:
{
"_id" : 7,
"name" : "Salena Olmos",
"scores" : [
{
"score" : 90.37826509157176,
"type" : "exam"
},
{
"score" : 42.48780666956811,
"type" : "quiz"
},
{
"score" : 96.52986171633331,
"type" : "homework"
}
]
}
I need the details of one student from the whole collection. The problem I am facing is that I need to search on both "score" and "type" inside the embedded "scores" array.
Could someone please help me?
Try this
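db.collection.aggregate([
  // Extract each student's exam score first. Sorting on "scores.score"
  // directly would compare the highest score of any type, not the exam.
  {
    $addFields: {
      examScore: {
        $max: {
          $map: {
            input: {
              $filter: {
                input: "$scores",
                as: "s",
                cond: { $eq: ["$$s.type", "exam"] }
              }
            },
            as: "s",
            in: "$$s.score"
          }
        }
      }
    }
  },
  // Highest exam score first, then keep only that student.
  { $sort: { examScore: -1 } },
  { $limit: 1 },
  // Drop the helper field so the document keeps its original shape.
  { $project: { examScore: 0 } }
])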
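The selection being asked for can be sketched in plain JavaScript: take each student's exam score and keep the student with the highest one. The helper names and the third student (_id 9) are hypothetical; that student is added only to show why ranking by the highest score of any type is not the same as ranking by exam score.

```javascript
const students = [
  { _id: 7, name: "Salena Olmos", scores: [
    { score: 90.37826509157176, type: "exam" },
    { score: 42.48780666956811, type: "quiz" },
    { score: 96.52986171633331, type: "homework" },
  ] },
  { _id: 8, name: "Daphne Zheng", scores: [
    { score: 22.13583712862635, type: "exam" },
    { score: 14.63969941335069, type: "quiz" },
    { score: 75.94123677556644, type: "homework" },
  ] },
  // Hypothetical student, not in the question: weak exam, strong homework.
  { _id: 9, name: "Hypothetical Student", scores: [
    { score: 10, type: "exam" },
    { score: 99, type: "homework" },
  ] },
];

// Highest score among entries of the given type (-Infinity if there is none).
function bestScore(student, type) {
  const values = student.scores
    .filter((s) => type === undefined || s.type === type)
    .map((s) => s.score);
  return values.length > 0 ? Math.max(...values) : -Infinity;
}

// Student with the highest score of the given type (any type if omitted).
function topStudent(all, type) {
  return all.reduce(
    (best, s) => (best === null || bestScore(s, type) > bestScore(best, type) ? s : best),
    null
  );
}

const topByExam = topStudent(students, "exam");
const topByAnyScore = topStudent(students);
console.log(topByExam.name, "/", topByAnyScore.name);
```

Here topByExam is Salena Olmos, while topByAnyScore is the hypothetical student because of the 99 homework score, which is the trap when a query ranks on the whole scores array instead of the exam entry.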
While this doesn't answer the question directly, it is related. This one keeps only the subdocuments that match both conditions: score greater than or equal to 90 and type "exam".
db.collection.aggregate([
{
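$match: {
// $elemMatch requires both conditions to hold for the same array element;
// separate "scores.score" and "scores.type" conditions could be satisfied by different elements.
scores: {
$elemMatch: {
score: { $gte: 90 },
type: "exam"
}
}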
}
},
{
$project: {
name: true,
list: {
$filter: {
input: "$scores",
as: "list",
cond: {
$and: [
{
$gte: [
"$$list.score",
90
]
},
{
$eq: [
"$$list.type",
"exam"
]
}
]
}
}
}
}
}
])
which returns
[
{
"_id": 7,
"list": [
{
"score": 90.37826509157176,
"type": "exam"
}
],
"name": "Salena Olmos"
}
]
https://mongoplayground.net/p/hYnVzZbuNFI
If you want the entire document, then add doc: "$$ROOT", to the projection.

Partition data around a match query during aggregation

I have been trying to work out how to perform some kind of partitioning (split by predicate) in a Mongo query. My current query looks like:
db.posts.aggregate([
{"$match": { $and:[ {$or:[{"toggled":false},{"toggled":true, "status":"INACTIVE"}]} , {"updatedAt":{$gte:1549786260000}} ] }},
{"$unwind" :"$interests"},
{"$group" : {"_id": {"iid": "$interests", "pid":"$publisher"}, "count": {"$sum" : 1}}},
{"$project":{ _id: 0, "iid": "$_id.iid", "pid": "$_id.pid", "count": 1 }}
])
This results in the following output:
{
"count" : 3.0,
"iid" : "INT456",
"pid" : "P789"
}
{
"count" : 2.0,
"iid" : "INT789",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P123"
}
All good so far, but I then realized that for the documents matching the specific filter {"toggled":true, "status":"INACTIVE"}, I would rather decrement the count (-1), so the eventual value can be negative as well.
Is there a way to partition the data after the match so that a different grouping operation is applied to each set of documents?
Something that sounds similar to what I am looking for is $mergeObjects, or maybe $reduce, but I could not relate much from the documentation examples.
Note: one straightforward way to deal with this would be to run two queries, but I am looking for a single query to perform the operation.
Sample documents for the above output would be:
/* 1 */
{
"_id" : ObjectId("5d1f7******"),
"id" : "CON123",
"title" : "Game",
"content" : {},
"status" : "ACTIVE",
"toggle":false,
"publisher" : "P789",
"interests" : [
"INT456"
],
"updatedAt" : NumberLong(1582078628264)
}
/* 2 */
{
"_id" : ObjectId("5d1f8******"),
"id" : "CON456",
"title" : "Home",
"content" : {},
"status" : "INACTIVE",
"toggle":true,
"publisher" : "P789",
"interests" : [
"INT456",
"INT789"
],
"updatedAt" : NumberLong(1582078628264)
}
/* 3 */
{
"_id" : ObjectId("5d0e9******"),
"id" : "CON654",
"title" : "School",
"content" : {},
"status" : "ACTIVE",
"toggle":false,
"publisher" : "P789",
"interests" : [
"INT123",
"INT456",
"INT789"
],
"updatedAt" : NumberLong(1582078628264)
}
/* 4 */
{
"_id" : ObjectId("5d207*******"),
"id" : "CON789",
"title":"Stack",
"content" : { },
"status" : "ACTIVE",
"toggle":false,
"publisher" : "P123",
"interests" : [
"INT123"
],
"updatedAt" : NumberLong(1582078628264)
}
The result I am looking for, though, is:
{
"count" : 1.0, (2-1)
"iid" : "INT456",
"pid" : "P789"
}
{
"count" : 0.0, (1-1)
"iid" : "INT789",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P123"
}
This aggregation gives the desired result.
db.posts.aggregate( [
{ $match: { updatedAt: { $gte: 1549786260000 } } },
{ $facet: {
FALSE: [
{ $match: { toggle: false } },
{ $unwind : "$interests" },
{ $group : { _id : { iid: "$interests", pid: "$publisher" }, count: { $sum : 1 } } },
],
TRUE: [
{ $match: { toggle: true, status: "INACTIVE" } },
{ $unwind : "$interests" },
{ $group : { _id : { iid: "$interests", pid: "$publisher" }, count: { $sum : -1 } } },
]
} },
{ $project: { result: { $concatArrays: [ "$FALSE", "$TRUE" ] } } },
{ $unwind: "$result" },
{ $replaceRoot: { newRoot: "$result" } },
{ $group : { _id : "$_id", count: { $sum : "$count" } } },
{ $project:{ _id: 0, iid: "$_id.iid", pid: "$_id.pid", count: 1 } }
] )
[ EDIT ADD ]
The output from the query using the input data from the question post:
{ "count" : 1, "iid" : "INT123", "pid" : "P789" }
{ "count" : 1, "iid" : "INT123", "pid" : "P123" }
{ "count" : 0, "iid" : "INT789", "pid" : "P789" }
{ "count" : 1, "iid" : "INT456", "pid" : "P789" }
[ EDIT ADD 2 ]
This query gets the same result with a different approach:
db.posts.aggregate( [
{
$match: { updatedAt: { $gte: 1549786260000 } }
},
{
$unwind : "$interests"
},
{
$group : {
_id : {
iid: "$interests",
pid: "$publisher"
},
count: {
$sum: {
$switch: {
branches: [
{ case: { $eq: [ "$toggle", false ] },
then: 1 },
{ case: { $and: [ { $eq: [ "$toggle", true] }, { $eq: [ "$status", "INACTIVE" ] } ] },
then: -1 }
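],
// Posts matching neither case contribute 0; without a default, $switch raises an error for them.
default: 0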
}
}
}
}
},
{
$project:{
_id: 0,
iid: "$_id.iid",
pid: "$_id.pid",
count: 1
}
}
] )
[ EDIT ADD 3 ]
NOTE:
The facet query runs the two facets (TRUE and FALSE) on the same set of documents; it is like running two queries in parallel. However, it duplicates some code and needs additional stages further down the pipeline to shape the documents into the desired output.
The second query avoids the code duplication and has far fewer stages in the aggregation pipeline. This makes a difference in performance when the input dataset has a large number of documents to process. In general, fewer stages mean fewer passes over the documents, since each stage has to process the documents output by the previous stage.
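The conditional-sum idea behind the second query can be sketched in plain JavaScript on the four sample documents. The function name signedInterestCounts is hypothetical, only the fields the grouping needs are kept, and the updatedAt filter is left out because every sample document passes it.

```javascript
// Trimmed copies of the four sample documents from the question.
const samplePosts = [
  { id: "CON123", status: "ACTIVE", toggle: false, publisher: "P789", interests: ["INT456"] },
  { id: "CON456", status: "INACTIVE", toggle: true, publisher: "P789", interests: ["INT456", "INT789"] },
  { id: "CON654", status: "ACTIVE", toggle: false, publisher: "P789", interests: ["INT123", "INT456", "INT789"] },
  { id: "CON789", status: "ACTIVE", toggle: false, publisher: "P123", interests: ["INT123"] },
];

function signedInterestCounts(posts) {
  const counts = new Map();
  for (const post of posts) {
    // Same decision as the $switch: +1, -1, or 0 for anything else.
    let weight = 0;
    if (post.toggle === false) weight = 1;
    else if (post.toggle === true && post.status === "INACTIVE") weight = -1;

    // $unwind on interests, then $group on the (iid, pid) pair.
    for (const iid of post.interests) {
      const key = JSON.stringify({ iid, pid: post.publisher });
      counts.set(key, (counts.get(key) || 0) + weight);
    }
  }
  // $project: flatten the group key back into iid and pid.
  return [...counts].map(([key, count]) => ({ count, ...JSON.parse(key) }));
}

const interestCounts = signedInterestCounts(samplePosts);
console.log(interestCounts);
```

Each document is visited once and its sign is decided on the spot, which is why the single-group version needs no merging step afterwards.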

How to $setDifference in array & Object using Mongo DB

UserDetails
{
"_id" : "5c23536f807caa1bec00e79b",
"UID" : "1",
"name" : "A",
},
{
"_id" : "5c23536f807caa1bec00e78b",
"UID" : "2",
"name" : "B",
},
{
"_id" : "5c23536f807caa1bec00e90",
"UID" : "3",
"name" : "C"
}
UserProducts
{
"_id" : "5c23536f807caa1bec00e79c",
"UPID" : "100",
"UID" : "1",
"status" : "A"
},
{
"_id" : "5c23536f807caa1bec00e79c",
"UPID" : "200",
"UID" : "2",
"status" : "A"
},
{
"_id" : "5c23536f807caa1bec00e52c",
"UPID" : "300",
"UID" : "3",
"status" : "A"
}
Groups
{
"_id" : "5bb20d7556db6915846da55f",
"members" : {
"regularStudent" : [
"200" // UPID
],
}
},
{
"_id" : "5bb20d7556db69158468878",
"members" : {
"regularStudent" : {
"0" : "100" // UPID
}
}
}
Step 1
Take the UID from UserDetails, match it against UserProducts, and take the UPID from UserProducts.
Step 2
Check whether this UPID is mapped in the Groups collection. The UPIDs are stored in members.regularStudent.
Step 3
If a UPID is not mapped, I want to print that UPID from UserProducts.
I have tried but couldn't complete this, so any help would be appreciated.
Expected Output:
["300"]
Note: Expected Output is ["300"], because UserProducts has UPIDs 100, 200 and 300, but the Groups collection only maps 100 and 200.
My Code
var queryResult = db.UserDetails.aggregate(
{
$lookup: {
from: "UserProducts",
localField: "UID",
foreignField: "UID",
as: "userProduct"
}
},
{ $unwind: "$userProduct" },
{ "$match": { "userProduct.status": "A" } },
{
"$project": { "_id" : 0, "userProduct.UPID" : 1 }
},
{
$group: {
_id: null,
userProductUPIDs: { $addToSet: "$userProduct.UPID" }
}
});
let userProductUPIDs = queryResult.toArray()[0].userProductUPIDs;
db.Groups.aggregate([
{
$unwind: "$members.regularStudent"
},
{
$group: {
_id: null,
UPIDs: { $addToSet: "$members.regularStudent" }
}
},
{
$project: {
members: {
$setDifference: [ userProductUPIDs , "$UPIDs" ]
},
_id : 0
}
}
])
My Output
{
"members" : [
"300",
"100"
]
}
You need to fix the second aggregation so that it gets all UPIDs as an array, because members.regularStudent is an array in one document and an object in the other. To achieve that, you can use $cond with $type: either return the array as it is, or use $objectToArray to convert the object. Try:
db.Groups.aggregate([
{
$project: {
students: {
$cond: [
{ $eq: [ { $type: "$members.regularStudent" }, "array" ] },
"$members.regularStudent",
{ $map: { input: { "$objectToArray": "$members.regularStudent" }, as: "x", in: "$$x.v" } }
]
}
}
},
{
$unwind: "$students"
},
{
$group: {
_id: null,
UPIDs: { $addToSet: "$students" }
}
},
{
$project: {
members: {
$setDifference: [ userProductUPIDs , "$UPIDs" ]
},
_id : 0
}
}
])
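A plain-JavaScript sketch of the same normalisation, using the sample Groups documents and the three UPIDs from UserProducts. The helper name collectMappedUPIDs is hypothetical; Array.isArray stands in for the $type check and Object.values for $objectToArray plus $map.

```javascript
// UPIDs collected from UserProducts by the first aggregation.
const productUPIDs = ["100", "200", "300"];

// The two Groups documents: one stores an array, the other an object.
const sampleGroups = [
  { _id: "5bb20d7556db6915846da55f", members: { regularStudent: ["200"] } },
  { _id: "5bb20d7556db69158468878", members: { regularStudent: { 0: "100" } } },
];

function collectMappedUPIDs(groups) {
  const mapped = new Set(); // plays the role of $addToSet
  for (const group of groups) {
    const students = group.members.regularStudent;
    // Arrays are used as they are; objects are reduced to their values.
    const values = Array.isArray(students) ? students : Object.values(students);
    for (const upid of values) mapped.add(upid);
  }
  return mapped;
}

// $setDifference: UPIDs in UserProducts that no group maps.
const mappedUPIDs = collectMappedUPIDs(sampleGroups);
const unmappedUPIDs = productUPIDs.filter((upid) => !mappedUPIDs.has(upid));
console.log(unmappedUPIDs);
```

Without the array-or-object handling, "100" would be missed in the second document and would wrongly show up as unmapped, which matches the output reported in the question.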

MongoDB aggregate array of objects together by object id and count occurences

I'm trying to figure out what I'm doing wrong. Below are a subset of my data, my desired output, my current query and its current output.
This is how my data objects look:
[{
"survey_answers": [
{
"id": "9ca01568e8dbb247", // As they are, this is the key to groupBy
"option_answer": 5, // Represents the index of the chosen option
"type": "OPINION_SCALE" // Opinion scales are 0-10 (meaning eleven options)
},
{
"id": "ba37125ec32b2a99",
"option_answer": 3,
"type": "LABELED_QUESTIONS" // Labeled questions are 0-x (they can change it from survey to survey)
}
],
"survey_id": "test"
},
{
"survey_answers": [
{
"id": "9ca01568e8dbb247",
"option_answer": 0,
"type": "OPINION_SCALE"
},
{
"id": "ba37125ec32b2a99",
"option_answer": 3,
"type": "LABELED_QUESTIONS"
}
],
"survey_id": "test"
}]
My desired output is:
[
{
id: '9ca01568e8dbb247',
results: [
{ _id: 5, count: 1 },
{ _id: 0, count: 1 }
]
},
{
id: 'ba37125ec32b2a99',
results: [
{ _id: 3, count: 2 }
]
}
]
Active query
Model.aggregate([
{
$match: {
'survey_id': survey_id
}
},
{
$unwind: "$survey_answers"
},
{
$group: {
_id: "$survey_answers.option_answer",
count: {
$sum: 1
}
}
}
])
Current output
[
{
"_id": 0,
"count": 1
},
{
"_id": 3,
"count": 2
},
{
"_id": 5,
"count": 1
}
]
I added your records to my db and then tried your commands one by one.
$unwind gives you results similar to -
> db.survey.aggregate({$unwind: "$survey_answers"})
{ "_id" : ObjectId("5c3859e459875873b5e6ee3c"), "survey_answers" : { "id" : "9ca01568e8dbb247", "option_answer" : 5, "type" : "OPINION_SCALE" }, "survey_id" : "test" }
{ "_id" : ObjectId("5c3859e459875873b5e6ee3c"), "survey_answers" : { "id" : "ba37125ec32b2a99", "option_answer" : 3, "type" : "LABELED_QUESTIONS" }, "survey_id" : "test" }
{ "_id" : ObjectId("5c3859e459875873b5e6ee3d"), "survey_answers" : { "id" : "9ca01568e8dbb247", "option_answer" : 0, "type" : "OPINION_SCALE" }, "survey_id" : "test" }
{ "_id" : ObjectId("5c3859e459875873b5e6ee3d"), "survey_answers" : { "id" : "ba37125ec32b2a99", "option_answer" : 3, "type" : "LABELED_QUESTIONS" }, "survey_id" : "test" }
I am not adding code for match since that is okay in your query as well
The grouping would be -
> db.survey.aggregate({$unwind: "$survey_answers"},{$group: { _id: { 'optionAnswer': "$survey_answers.option_answer", 'id':"$survey_answers.id"}, count: { $sum: 1}}})
{ "_id" : { "optionAnswer" : 0, "id" : "9ca01568e8dbb247" }, "count" : 1 }
{ "_id" : { "optionAnswer" : 3, "id" : "ba37125ec32b2a99" }, "count" : 2 }
{ "_id" : { "optionAnswer" : 5, "id" : "9ca01568e8dbb247" }, "count" : 1 }
Grouping on $survey_answers.id as well makes it available to the projection.
The projection is what you're missing in your query -
> db.survey.aggregate({$unwind: "$survey_answers"},{$group: { _id: { 'optionAnswer': "$survey_answers.option_answer", 'id':'$survey_answers.id'}, count: { $sum: 1}}}, {$project : {answer: '$_id.optionAnswer', id: '$_id.id', count: '$count', _id:0}})
{ "answer" : 0, "id" : "9ca01568e8dbb247", "count" : 1 }
{ "answer" : 3, "id" : "ba37125ec32b2a99", "count" : 2 }
{ "answer" : 5, "id" : "9ca01568e8dbb247", "count" : 1 }
Further, you can add a second group on id and add the results to a set. Your final query would be -
db.survey.aggregate(
{$unwind: "$survey_answers"},
{$group: {
_id: { 'optionAnswer': "$survey_answers.option_answer", 'id':'$survey_answers.id'},
count: { $sum: 1}
}},
{$project : {
answer: '$_id.optionAnswer',
id: '$_id.id',
count: '$count',
_id:0
}},
{$group: {
_id:{id:"$id"},
results: { $addToSet: {_id: "$answer", count: '$count'} }
}},
{$project : {
id: '$_id.id',
results: '$results',
_id:0
}})
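The two grouping levels can also be sketched in plain JavaScript on the question's sample data: first count per (id, option_answer) pair, then collect those counts per id. The function name groupSurveyAnswers is hypothetical, and the output uses the field names from the desired output in the question.

```javascript
// The two sample survey documents from the question.
const surveyDocs = [
  { survey_id: "test", survey_answers: [
    { id: "9ca01568e8dbb247", option_answer: 5, type: "OPINION_SCALE" },
    { id: "ba37125ec32b2a99", option_answer: 3, type: "LABELED_QUESTIONS" },
  ] },
  { survey_id: "test", survey_answers: [
    { id: "9ca01568e8dbb247", option_answer: 0, type: "OPINION_SCALE" },
    { id: "ba37125ec32b2a99", option_answer: 3, type: "LABELED_QUESTIONS" },
  ] },
];

function groupSurveyAnswers(docs) {
  // Outer key: question id. Inner key: option_answer. Value: occurrences.
  const byQuestion = new Map();
  for (const doc of docs) {
    for (const answer of doc.survey_answers) { // $unwind
      if (!byQuestion.has(answer.id)) byQuestion.set(answer.id, new Map());
      const counts = byQuestion.get(answer.id);
      counts.set(answer.option_answer, (counts.get(answer.option_answer) || 0) + 1);
    }
  }
  // Second $group + $project: one entry per question id with its counts.
  return [...byQuestion].map(([id, counts]) => ({
    id,
    results: [...counts].map(([_id, count]) => ({ _id, count })),
  }));
}

const surveyResults = groupSurveyAnswers(surveyDocs);
console.log(JSON.stringify(surveyResults, null, 2));
```

The original query grouped on option_answer alone, so answers to different questions with the same index were merged; keeping the question id in the key is what separates them.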
Hope this helps.