Nested grouping with MongoDB

Given a collection of the form
[
{ gender: "m", age: 1, name: "A" },
{ gender: "f", age: 2, name: "B" },
{ gender: "m", age: 3, name: "C" },
{ gender: "f", age: 1, name: "D" },
{ gender: "m", age: 2, name: "E" },
{ gender: "f", age: 3, name: "F" },
{ gender: "m", age: 1, name: "G" },
{ gender: "f", age: 2, name: "H" },
{ gender: "m", age: 3, name: "I" },
{ gender: "f", age: 1, name: "J" }
]
I want to first group by age and secondly group by gender so that I get a nested result looking something like
[{
_id: "1",
children: [
{ _id: "f" },
{ _id: "m" }
]
}, {
_id: "2",
children: [
{ _id: "f" },
{ _id: "m" }
]
}, {
_id: "3",
children: [
{ _id: "f" },
{ _id: "m" }
]
}]
Here is what I tried so far:
db.example.aggregate(
{ $group: { _id: "$age", children: { $addToSet: {
age: "$age", gender: "$gender", name: "$name"
}}}},
{ $group: { _id: "$children.gender"}}
)
But this returns {_id: null} as its result. Is this kind of nested grouping possible, and if so, how?

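Something like this should do it:
db.example.aggregate([
  // First group on the (age, gender) pair and collect the names.
  {
    $group: {
      _id: { age: "$age", gender: "$gender" },
      names: { $addToSet: "$name" }
    }
  },
  // Then regroup on age alone, nesting each gender bucket under children.
  {
    $group: {
      _id: { age: "$_id.age" },
      children: { $addToSet: { gender: "$_id.gender", names: "$names" } }
    }
  }
])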
...which gives the result:
{
"_id" : {
"age" : 1
},
"children" : [
{ "gender" : "m", "names" : [ "G", "A" ] },
{ "gender" : "f", "names" : [ "J", "D" ] }
]
},
...
If you want the age itself as _id, as in your example, just replace the second grouping's _id with:
_id: "$_id.age",
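To see what the two $group stages compute, here is a minimal plain-JavaScript sketch that does the same two-level grouping in memory on the question's sample data. It does not talk to MongoDB; nestByAgeThenGender is a hypothetical helper name, and the sorting is only there to make the output order predictable (the aggregation itself does not guarantee any order).

```javascript
// In-memory sketch of "group by age, then by gender" (no MongoDB involved).
// nestByAgeThenGender is a hypothetical helper, not a MongoDB API.
const people = [
  { gender: "m", age: 1, name: "A" },
  { gender: "f", age: 2, name: "B" },
  { gender: "m", age: 3, name: "C" },
  { gender: "f", age: 1, name: "D" },
  { gender: "m", age: 2, name: "E" },
  { gender: "f", age: 3, name: "F" },
  { gender: "m", age: 1, name: "G" },
  { gender: "f", age: 2, name: "H" },
  { gender: "m", age: 3, name: "I" },
  { gender: "f", age: 1, name: "J" },
];

function nestByAgeThenGender(docs) {
  // Step 1: one bucket per (age, gender) pair; a Set mimics $addToSet.
  const byAge = new Map(); // age -> Map(gender -> Set of names)
  for (const { age, gender, name } of docs) {
    if (!byAge.has(age)) byAge.set(age, new Map());
    const byGender = byAge.get(age);
    if (!byGender.has(gender)) byGender.set(gender, new Set());
    byGender.get(gender).add(name);
  }
  // Step 2: fold the gender buckets into a children array per age.
  return [...byAge.entries()]
    .sort(([a], [b]) => a - b)
    .map(([age, byGender]) => ({
      _id: age,
      children: [...byGender.entries()]
        .sort(([a], [b]) => a.localeCompare(b))
        .map(([gender, names]) => ({ gender, names: [...names] })),
    }));
}

const nested = nestByAgeThenGender(people);
console.log(JSON.stringify(nested, null, 2));
```

The shape matches the variant of the pipeline that uses the plain age as _id.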

Related

Field combination in an array where another field is the same in MongoDB?

I want to find documents with the same gender and collect their names into a new array field called names, but I have not been able to find a solution using MongoDB or Mongoose.
Input example:
db.students.insertMany([
{ id: 1, name: "Ryan", gender: "M" },
{ id: 2, name: "Joanna", gender: "F" },
{ id: 3, name: "Andy", gender: "M" },
{ id: 4, name: "Irina", gender: "F" }
]);
Desired output:
[
{ gender: "M", names: ["Ryan","Andy"]},
{ gender: "F", names: ["Joanna","Irina"]}
]
Note: the collection has many records, and I do not know the gender/name pairs in advance.
I tried this, but it returns no results. I don't know how I should write this query.
db.students.aggregate([
{
$group:{
names : {$push:"$name"},
}
},
{ "$match": { "gender": "$gender" } }
])
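Your $group stage does not specify how to group: it needs an _id expression that names the grouping key. Try this one:
db.students.aggregate([
  {
    // Group on gender and collect every name in the group.
    $group: {
      _id: "$gender",
      names: { $push: "$name" }
    }
  },
  {
    // Copy the group key into gender and drop _id from the output.
    $set: {
      gender: "$_id",
      _id: "$$REMOVE"
    }
  }
])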

Count nested and outer data

I have the following mongo data structure:
[
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b1',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b3',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a2",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
]
Assume this is the Session collection. Each session document has an _id, a libraryName key, and an array of stage documents. Each stage document has its own _id and a type. The _id values are irrelevant here. I want to count two things.
First, I want to count how many session documents each libraryName has.
The query for this would be:
const services = await Session.aggregate(
[
{
$group: {
_id: "$libraryName",
count: { $sum: 1 },
},
}
]
);
Second, per libraryName, I want to count how many nested stage documents there are for each stage type.
So the final result I wish to retrieve is:
[
{
libraryName: 'a1',
count: 456,
stages: [
{
type: 'b1',
count: 43,
},
{
type: 'b2',
count: 44,
}
],
},
{
libraryName: 'a2',
count: 4546,
stages: [
{
type: 'b1',
count: 43
},
{
type: 'b3',
count: 44
}
]
}
]
The expected result was later changed to:
[
{
"_id": "a1",
"count": 2,
"stages": [
{
"count": 1,
"type": "b3"
},
{
"count": 3,
"type": "b1"
},
{
"count": 4,
"type": "b2"
}
]
},
{
"_id": "a2",
"count": 1,
"stages": [
{
"count": 1,
"type": "b1"
},
{
"count": 3,
"type": "b2"
}
]
}
]
Using the sample data in the question post and the aggregation query:
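db.collection.aggregate([
  // Emit one document per nested stage.
  {
    $unwind: "$stages"
  },
  // Count stages per (libraryName, type) pair.
  {
    $group: {
      _id: { libraryName: "$libraryName", type: "$stages.type" },
      type_count: { "$sum": 1 }
    }
  },
  // Regroup per library; count here is the total number of stages, not sessions.
  {
    $group: {
      _id: { libraryName: "$_id.libraryName" },
      count: { "$sum": "$type_count" },
      stages: { $push: { type: "$_id.type", count: "$type_count" } }
    }
  },
  // Rename the group key to libraryName and drop _id.
  {
    $project: {
      libraryName: "$_id.libraryName",
      count: 1,
      stages: 1,
      _id: 0
    }
  }
])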
I get the following results:
{
"libraryName" : "a2",
"count" : 4,
"stages" : [
{
"type" : "b1",
"count" : 1
},
{
"type" : "b2",
"count" : 3
}
]
}
{
"libraryName" : "a1",
"count" : 8,
"stages" : [
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
},
{
"type" : "b2",
"count" : 4
}
]
}
[EDIT - ADD]: This answer was added after the question's expected result was modified. The query uses the question's sample documents as input.
db.collection.aggregate([
  // Count sessions per library before unwinding, keeping each stages array.
  {
    $group: {
      _id: { libraryName: "$libraryName" },
      count: { "$sum": 1 },
      stages: { $push: "$stages" }
    }
  },
  // stages is now an array of arrays, so unwind twice to reach single stages.
  {
    $unwind: "$stages"
  },
  {
    $unwind: "$stages"
  },
  // Count stages per (libraryName, type), carrying the session count along.
  {
    $group: {
      _id: { libraryName: "$_id.libraryName", type: "$stages.type" },
      type_count: { "$sum": 1 },
      count: { $first: "$count" }
    }
  },
  // Fold the per-type counts back into one document per library.
  {
    $group: {
      _id: "$_id.libraryName",
      count: { $first: "$count" },
      stages: { $push: { type: "$_id.type", count: "$type_count" } }
    }
  }
])
The result:
{
"_id" : "a2",
"count" : 1,
"stages" : [
{
"type" : "b2",
"count" : 3
},
{
"type" : "b1",
"count" : 1
}
]
}
{
"_id" : "a1",
"count" : 2,
"stages" : [
{
"type" : "b2",
"count" : 4
},
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
}
]
}
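The difference between the two answers (count of 8 versus count of 2 for a1) comes down to whether you count before or after unwinding. Here is a rough plain-JavaScript sketch of the second approach, run in memory on the question's sample data with the irrelevant _id fields left out. countByLibrary is a hypothetical helper name, not something MongoDB or Mongoose provides.

```javascript
// In-memory sketch: sessions per library plus stages per type (no MongoDB involved).
// countByLibrary is a hypothetical helper, not a MongoDB or Mongoose API.
const sessions = [
  { libraryName: "a1", stages: [{ type: "b1" }, { type: "b2" }, { type: "b3" }, { type: "b1" }] },
  { libraryName: "a1", stages: [{ type: "b2" }, { type: "b2" }, { type: "b2" }, { type: "b1" }] },
  { libraryName: "a2", stages: [{ type: "b2" }, { type: "b2" }, { type: "b2" }, { type: "b1" }] },
];

function countByLibrary(docs) {
  const libs = new Map(); // libraryName -> { count, types: Map(type -> count) }
  for (const doc of docs) {
    if (!libs.has(doc.libraryName)) {
      libs.set(doc.libraryName, { count: 0, types: new Map() });
    }
    const lib = libs.get(doc.libraryName);
    // Count the session once, before looking at its stages.
    lib.count += 1;
    // Walking the array plays the role of $unwind.
    for (const stage of doc.stages || []) {
      lib.types.set(stage.type, (lib.types.get(stage.type) || 0) + 1);
    }
  }
  return [...libs].map(([libraryName, { count, types }]) => ({
    libraryName,
    count,
    stages: [...types].map(([type, n]) => ({ type, count: n })),
  }));
}

const libraryCounts = countByLibrary(sessions);
console.log(JSON.stringify(libraryCounts, null, 2));
```

Moving the `lib.count += 1` line inside the inner loop would reproduce the first answer's totals (8 and 4), since it would then count stages instead of sessions.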

MongoDB Aggregate how to pair relevant records for processing

I've got some event data captured in a MongoDB database, and some of these events occur in pairs.
For example, DOOR_OPEN and DOOR_CLOSE are two events that occur in pairs.
Events collection:
{ _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: t }
{ _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: t+5 }
{ _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp:t+10 }
{ _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp:t+30 }
{ _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp:t+35 }
{ _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp:t+40 }
...
Assuming the records are sorted by timestamp, _id: 1 and _id: 3 are a "pair" for "user1", and _id: 2 and _id: 6 are a pair for "user2".
I'd like to take all these DOOR_OPEN and DOOR_CLOSE pairs per user and calculate statistics such as the average duration the door was held open by each user.
Can this be achieved using the aggregate framework?
You can use $lookup and $group to achieve this:
db.getCollection('TestColl').aggregate([
  // Keep only the door events.
  { $match: { "name": { $in: [ "DOOR_OPEN", "DOOR_CLOSE" ] } } },
  // For each DOOR_OPEN, look up the same user's later DOOR_CLOSE events.
  {
    $lookup: {
      from: "TestColl",
      let: { userID_lu: "$userID", name_lu: "$name", timestamp_lu: "$timestamp" },
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                { $eq: [ "$userID", "$$userID_lu" ] },
                { $eq: [ "$$name_lu", "DOOR_OPEN" ] },
                { $eq: [ "$name", "DOOR_CLOSE" ] },
                { $gt: [ "$timestamp", "$$timestamp_lu" ] }
              ]
            }
          }
        }
      ],
      as: "close_dates"
    }
  },
  // Take the first matching close event as the partner of this open event.
  { $addFields: { "close_time": { $arrayElemAt: [ "$close_dates.timestamp", 0 ] } } },
  // Duration in minutes (subtracting two dates yields milliseconds).
  { $addFields: { "time_diff": { $divide: [ { $subtract: [ "$close_time", "$timestamp" ] }, 1000 * 60 ] } } },
  // Average the durations per user; $avg ignores documents without a numeric time_diff.
  {
    $group: {
      _id: "$userID",
      events: { $push: { "eventId": "$_id", "name": "$name", "timestamp": "$timestamp" } },
      averageTimestamp: { $avg: "$time_diff" }
    }
  }
])
Sample Data:
[
{ _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: ISODate("2019-10-24T08:00:00Z") },
{ _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: ISODate("2019-10-24T08:05:00Z") },
{ _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp:ISODate("2019-10-24T08:10:00Z") },
{ _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp:ISODate("2019-10-24T08:30:00Z") },
{ _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp:ISODate("2019-10-24T08:35:00Z") },
{ _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp:ISODate("2019-10-24T08:40:00Z") },
{ _id: 7, name: "DOOR_CLOSE", userID: "user1", timestamp:ISODate("2019-10-24T08:50:00Z") },
{ _id: 8, name: "DOOR_OPEN", userID: "user2", timestamp:ISODate("2019-10-24T08:55:00Z") }
]
Result:
/* 1 */
{
"_id" : "user2",
"events" : [
{
"eventId" : 2.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:05:00.000Z")
},
{
"eventId" : 6.0,
"name" : "DOOR_CLOSE",
"timestamp" : ISODate("2019-10-24T08:40:00.000Z")
},
{
"eventId" : 8.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:55:00.000Z")
}
],
"averageTimestamp" : 35.0
}
/* 2 */
{
"_id" : "user1",
"events" : [
{
"eventId" : 1.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:00:00.000Z")
},
{
"eventId" : 3.0,
"name" : "DOOR_CLOSE",
"timestamp" : ISODate("2019-10-24T08:10:00.000Z")
},
{
"eventId" : 4.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:30:00.000Z")
},
{
"eventId" : 7.0,
"name" : "DOOR_CLOSE",
"timestamp" : ISODate("2019-10-24T08:50:00.000Z")
}
],
"averageTimestamp" : 15.0
}
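If it helps to check the numbers, here is a small plain-JavaScript sketch of the pairing logic, run in memory on the sample data above. It is not the aggregation itself: averageOpenMinutes is a hypothetical helper name, and it assumes that a DOOR_CLOSE pairs with the same user's most recent unmatched DOOR_OPEN, and that an open with no later close is simply ignored.

```javascript
// In-memory sketch of pairing DOOR_OPEN with the next DOOR_CLOSE per user.
// averageOpenMinutes is a hypothetical helper, not a MongoDB API.
const events = [
  { _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: new Date("2019-10-24T08:00:00Z") },
  { _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: new Date("2019-10-24T08:05:00Z") },
  { _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp: new Date("2019-10-24T08:10:00Z") },
  { _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp: new Date("2019-10-24T08:30:00Z") },
  { _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp: new Date("2019-10-24T08:35:00Z") },
  { _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp: new Date("2019-10-24T08:40:00Z") },
  { _id: 7, name: "DOOR_CLOSE", userID: "user1", timestamp: new Date("2019-10-24T08:50:00Z") },
  { _id: 8, name: "DOOR_OPEN", userID: "user2", timestamp: new Date("2019-10-24T08:55:00Z") },
];

function averageOpenMinutes(evts) {
  // Process events in time order so each close sees the open that preceded it.
  const sorted = [...evts].sort((a, b) => a.timestamp - b.timestamp);
  const openSince = new Map(); // userID -> timestamp of the unmatched DOOR_OPEN
  const totals = new Map(); // userID -> { sumMs, pairs }
  for (const e of sorted) {
    if (e.name === "DOOR_OPEN") {
      openSince.set(e.userID, e.timestamp);
    } else if (e.name === "DOOR_CLOSE" && openSince.has(e.userID)) {
      const t = totals.get(e.userID) || { sumMs: 0, pairs: 0 };
      t.sumMs += e.timestamp - openSince.get(e.userID); // milliseconds
      t.pairs += 1;
      totals.set(e.userID, t);
      openSince.delete(e.userID); // this open is now matched
    }
  }
  const averages = {};
  for (const [userID, { sumMs, pairs }] of totals) {
    averages[userID] = sumMs / pairs / (1000 * 60); // minutes
  }
  return averages;
}

const doorAverages = averageOpenMinutes(events);
console.log(doorAverages); // { user1: 15, user2: 35 }
```

These are the same averages as in the result above: 15 minutes for user1 and 35 minutes for user2.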
You could use the $group operator of the aggregate framework to group by userID and calculate the averages:
db.events.aggregate([{
$group: {
_id: "$userID",
averageTimestamp: {$avg: "$timestamp"}
}
}]);
If you also want to discard every event other than DOOR_OPEN or DOOR_CLOSE, you can add a $match stage at the start of the pipeline:
db.events.aggregate([{
$match: {
$or: [{name: "DOOR_OPEN"},{name: "DOOR_CLOSE"}]
}
}, {
$group: {
_id: "$userID",
averageTimestamp: {$avg: "$timestamp"}
}
}]);

Grouping the results of a group query in mongodb

I have the following sample data:
{
"name": "Bob",
"mi": "K",
"martialStatus": "M",
"age": 30,
"city": "Paris",
"job": "Engineer"
}
{
"name": "Chad",
"mi": "M",
"martialStatus": "W",
"age": 31,
"city": "Paris",
"job": "Doctor"
}
{
"name": "Mel",
"mi": "A",
"martialStatus": "D",
"age": 31,
"city": "London",
"job": "Doctor"
}
{
"name": "Frank",
"mi": "F",
"martialStatus": "S",
"age": 30,
"city": "London",
"job": "Engineer"
}
I am trying to write a mongo query that would return results in the following format:
"peopleCount": 4,
"jobsList": {
"job": "Doctor",
"ageList": [
{
"age": 31,
"cityList": [
{
"city": "London",
"people": [
{
"name": "Mel",
"martialStatus": "D"
}
]
},
{
"city": "Paris",
"people": [
{
"name": "Chad",
"martialStatus": "W"
}
]
},
{
"city": "Berlin",
...
...
]
}
]
}
To try the first two levels (jobsList and ageList), I am using the following:
db.colName.aggregate([
{
$group: {
_id: { job: "$job" },
jobsList: {
$push: {
age: "$age",
city: "$city",
name: "$name",
martialStatus: "$martialStatus"
}
}
}
},
{
$group: {
_id: { age: "$age" },
ageList: {
$push: {
city: "$city",
name: "$name",
martialStatus: "$martialStatus"
}
}
}
}
]);
The above does not work, however, although the first group/push part works. Any hints on how to get that output format and grouping?
db.colName.aggregate([
  // Innermost level: one bucket per (job, age, city) with its people.
  {
    $group: {
      _id: { job: "$job", age: "$age", city: "$city" },
      people: { $push: { name: "$name", martialStatus: "$martialStatus" } }
    }
  },
  // Drop city from the key and nest the city buckets under cityList.
  {
    $group: {
      _id: { job: "$_id.job", age: "$_id.age" },
      peopleCount: { $sum: { $size: "$people" } },
      cityList: { $push: { city: "$_id.city", people: "$people" } }
    }
  },
  // Drop age from the key and nest the age buckets under agesList.
  {
    $group: {
      _id: { job: "$_id.job" },
      peopleCount: { $sum: "$peopleCount" },
      agesList: { $push: { age: "$_id.age", cityList: "$cityList" } }
    }
  },
  // Collapse everything into a single document with the overall count.
  {
    $group: {
      _id: null,
      peopleCount: { $sum: "$peopleCount" },
      jobsList: { $push: { job: "$_id.job", agesList: "$agesList" } }
    }
  },
  {
    $project: { _id: 0, peopleCount: 1, jobsList: 1 }
  }
]);
Running this on the collection you provided gives me the result
{
"peopleCount" : 4,
"jobsList" :
[
{
"job" : "Engineer",
"agesList" :
[
{
"age" : 30,
"cityList" :
[
{
"city" : "London",
"people" :
[
{ "name" : "Frank", "martialStatus" : "S" }
]
},
{
"city" : "Paris",
"people" :
[
{ "name" : "Bob", "martialStatus" : "M" }
]
}
]
}
]
},
{
"job" : "Doctor",
"agesList" :
[
{
"age" : 31,
"cityList" :
[
{
"city" : "London",
"people" :
[
{ "name" : "Mel", "martialStatus" : "D" }
]
},
{
"city" : "Paris",
"people" :
[
{ "name" : "Chad", "martialStatus" : "W" }
]
}
]
}
]
}
]
}
which seems to be correct, though I am not sure it is the best solution. I am new to the aggregation framework.
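The pattern of grouping on the full key and then peeling off one key per level can also be written as a short recursion. Below is a rough plain-JavaScript sketch that builds the same nested shape in memory from the four sample documents. nestBy and its levels/leaf parameters are hypothetical names made up for this illustration, and the field name martialStatus is kept exactly as it appears in the data.

```javascript
// In-memory sketch of multi-level nested grouping (no MongoDB involved).
// nestBy, levels and leaf are hypothetical names used only for illustration.
const staff = [
  { name: "Bob", mi: "K", martialStatus: "M", age: 30, city: "Paris", job: "Engineer" },
  { name: "Chad", mi: "M", martialStatus: "W", age: 31, city: "Paris", job: "Doctor" },
  { name: "Mel", mi: "A", martialStatus: "D", age: 31, city: "London", job: "Doctor" },
  { name: "Frank", mi: "F", martialStatus: "S", age: 30, city: "London", job: "Engineer" },
];

function nestBy(docs, levels, leaf) {
  // No grouping keys left: emit the leaf projection of each document.
  if (levels.length === 0) return docs.map(leaf);
  const [{ key, listName }, ...rest] = levels;
  // Bucket the documents by the current key, keeping first-seen order.
  const buckets = new Map();
  for (const doc of docs) {
    if (!buckets.has(doc[key])) buckets.set(doc[key], []);
    buckets.get(doc[key]).push(doc);
  }
  // Recurse into each bucket with the remaining keys.
  return [...buckets].map(([value, group]) => ({
    [key]: value,
    [listName]: nestBy(group, rest, leaf),
  }));
}

const report = {
  peopleCount: staff.length,
  jobsList: nestBy(
    staff,
    [
      { key: "job", listName: "agesList" },
      { key: "age", listName: "cityList" },
      { key: "city", listName: "people" },
    ],
    ({ name, martialStatus }) => ({ name, martialStatus })
  ),
};

console.log(JSON.stringify(report, null, 2));
```

Each entry in levels corresponds to one of the $group stages above, read from the outermost level inwards.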

Mongodb : elemMatch

My problem can be described as follows.
Given the following data (copied from the MongoDB manual), how can I find the document that has zipcode 63109 and has two students named "John" and "Jeff"? I tried:
db.schools.find( { zipcode: 63109 },
{ students: { $elemMatch: { name : {"John", "Jeff"} } } } )
But it doesn't work. Could you help me, please?
Thank you in advance.
{
_id: 1,
zipcode: 63109,
students: [
{ name: "john", school: 102, age: 10 },
{ name: "jess", school: 102, age: 11 },
{ name: "jeff", school: 108, age: 15 }
]
}
{
_id: 2,
zipcode: 63110,
students: [
{ name: "ajax", school: 100, age: 7 },
{ name: "achilles", school: 100, age: 8 },
]
}
{
_id: 3,
zipcode: 63109,
students: [
{ name: "ajax", school: 100, age: 7 },
{ name: "achilles", school: 100, age: 8 },
]
}
{
_id: 4,
zipcode: 63109,
students: [
{ name: "barney", school: 102, age: 7 },
]
}
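I don't think a single $elemMatch can express this, because it tests one array element at a time and no single student is named both "john" and "jeff".
One solution is to require both names with $and: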
db.schools.find( { zipcode: 63109, $and: [{"students.name": "john"}, {"students.name": "jeff"}]} )
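As far as I know, the same condition can also be written more compactly with the $all operator, as { zipcode: 63109, "students.name": { $all: ["john", "jeff"] } }. To make the matching rule concrete, here is a plain-JavaScript sketch that applies it in memory to the sample documents. hasAllStudents is a hypothetical helper name, and the names are compared in lowercase because that is how they are stored in the data.

```javascript
// In-memory sketch of "zipcode matches and every wanted name appears in students".
// hasAllStudents is a hypothetical helper, not a MongoDB API.
const schools = [
  {
    _id: 1,
    zipcode: 63109,
    students: [
      { name: "john", school: 102, age: 10 },
      { name: "jess", school: 102, age: 11 },
      { name: "jeff", school: 108, age: 15 },
    ],
  },
  {
    _id: 2,
    zipcode: 63110,
    students: [
      { name: "ajax", school: 100, age: 7 },
      { name: "achilles", school: 100, age: 8 },
    ],
  },
  {
    _id: 3,
    zipcode: 63109,
    students: [
      { name: "ajax", school: 100, age: 7 },
      { name: "achilles", school: 100, age: 8 },
    ],
  },
  {
    _id: 4,
    zipcode: 63109,
    students: [{ name: "barney", school: 102, age: 7 }],
  },
];

function hasAllStudents(school, names) {
  // Each wanted name may be satisfied by a different array element.
  return names.every((wanted) =>
    school.students.some((student) => student.name === wanted)
  );
}

const matchingSchools = schools.filter(
  (school) => school.zipcode === 63109 && hasAllStudents(school, ["john", "jeff"])
);

console.log(matchingSchools.map((school) => school._id)); // [ 1 ]
```

Only the document with _id 1 qualifies: the others either have a different zipcode or lack one of the two names.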