Nested grouping with MongoDB

Given a collection of the form
[
{ gender: "m", age: 1, name: "A" },
{ gender: "f", age: 2, name: "B" },
{ gender: "m", age: 3, name: "C" },
{ gender: "f", age: 1, name: "D" },
{ gender: "m", age: 2, name: "E" },
{ gender: "f", age: 3, name: "F" },
{ gender: "m", age: 1, name: "G" },
{ gender: "f", age: 2, name: "H" },
{ gender: "m", age: 3, name: "I" },
{ gender: "f", age: 1, name: "J" }
]
I want to group first by age and then by gender, so that I get a nested result looking something like:
[{
_id: "1",
children: [
{ _id: "f" },
{ _id: "m" }
]
}, {
_id: "2",
children: [
{ _id: "f" },
{ _id: "m" }
]
}, {
_id: "3",
children: [
{ _id: "f" },
{ _id: "m" }
]
}]
Here is what I tried so far:
db.example.aggregate(
{ $group: { _id: "$age", children: { $addToSet: {
age: "$age", gender: "$gender", name: "$name"
}}}},
{ $group: { _id: "$children.gender"}}
)
But this returns {_id: null} as its result. Is this possible, and if so, how?

Something like this should do it:
db.example.aggregate([
  // First group by the (age, gender) pair and collect the names.
  {
    $group: {
      _id: { age: "$age", gender: "$gender" },
      names: { $addToSet: "$name" }
    }
  },
  // Then group by age alone, nesting one entry per gender.
  {
    $group: {
      _id: { age: "$_id.age" },
      children: { $addToSet: { gender: "$_id.gender", names: "$names" } }
    }
  }
])
...which gives the result:
{
"_id" : {
"age" : 1
},
"children" : [
{ "gender" : "m", "names" : [ "G", "A" ] },
{ "gender" : "f", "names" : [ "J", "D" ] }
]
},
...
If you want the age itself as _id, as in your example, just replace the second grouping's _id with:
_id: "$_id.age",
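To see what the two $group stages do, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB API code; the helper name groupByAgeThenGender is made up for illustration, and unlike $addToSet it keeps duplicates and sorts the output so the result is deterministic.

```javascript
// Sample documents copied from the question.
const people = [
  { gender: "m", age: 1, name: "A" },
  { gender: "f", age: 2, name: "B" },
  { gender: "m", age: 3, name: "C" },
  { gender: "f", age: 1, name: "D" },
  { gender: "m", age: 2, name: "E" },
  { gender: "f", age: 3, name: "F" },
  { gender: "m", age: 1, name: "G" },
  { gender: "f", age: 2, name: "H" },
  { gender: "m", age: 3, name: "I" },
  { gender: "f", age: 1, name: "J" },
];

// Hypothetical helper: mirrors "group by (age, gender), then by age".
function groupByAgeThenGender(docs) {
  // Step 1: bucket names under age, then under gender.
  const byAge = new Map();
  for (const { age, gender, name } of docs) {
    if (!byAge.has(age)) byAge.set(age, new Map());
    const byGender = byAge.get(age);
    if (!byGender.has(gender)) byGender.set(gender, []);
    byGender.get(gender).push(name);
  }
  // Step 2: emit one document per age with its gender buckets nested.
  return [...byAge.entries()]
    .sort(([a], [b]) => a - b)
    .map(([age, byGender]) => ({
      _id: age,
      children: [...byGender.entries()]
        .sort(([a], [b]) => a.localeCompare(b))
        .map(([gender, names]) => ({ gender, names })),
    }));
}

const nested = groupByAgeThenGender(people);
console.log(JSON.stringify(nested, null, 2));
```

For age 1 this yields the "f" bucket with D and J and the "m" bucket with A and G, the same grouping as the pipeline output above (order aside).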

Related

Field combination in an array where another field is the same in MongoDB?

I want to find documents with the same gender and collect their names into a new array field, names, but I cannot work out how to do this with MongoDB or Mongoose.
Input example:
db.students.insertMany([
{ id: 1, name: "Ryan", gender: "M" },
{ id: 2, name: "Joanna", gender: "F" },
{ id: 3, name: "Andy", gender: "M" },
{ id: 4, name: "Irina", gender: "F" }
]);
Desired output:
[
{ gender: "M", names: ["Ryan","Andy"]},
{ gender: "F", names: ["Joanna","Irina"]}
]
Note: the collection has many records and I do not know the gender/name pairs in advance.
I tried this but got no results. I don't know how I should write this query.
db.students.aggregate([
{
$group:{
names : {$push:"$name"},
}
},
{ "$match": { "gender": "$gender" } }
])

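Your $group stage does not specify what to group by (its _id). Try this one:
db.students.aggregate([
  {
    $group: {
      _id: "$gender",            // one group per distinct gender
      names: { $push: "$name" }  // collect the names in that group
    }
  },
  {
    $set: {
      gender: "$_id",            // expose the group key as "gender"
      _id: "$$REMOVE"            // and drop _id from the output
    }
  }
])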

Count nested and outer data

I have the following mongo data structure:
[
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b1',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b3',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a2",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
]
Assume this is the Session collection. Each session document has an _id (irrelevant here) and a libraryName key, plus an array of stage documents. Each stage document has an _id (also irrelevant) and a type. I want to count two things.
First, for each libraryName, I want to count how many session documents it has.
The query for this would be:
const services = await Session.aggregate(
[
{
$group: {
_id: "$libraryName",
count: { $sum: 1 },
},
}
]
);
Second, per libraryName, I want to count how many nested stage documents there are of each stage type.
So the final result I wish to retrieve is:
[
{
libraryName: 'a1',
count: 456,
stages: [
{
type: 'b1',
count: 43,
},
{
type: 'b2',
count: 44,
}
],
},
{
libraryName: 'a2',
count: 4546,
stages: [
{
type: 'b1',
count: 43
},
{
type: 'b3',
count: 44
}
]
}
]
Edit: with the sample data above, the expected result is:
[
{
"_id": "a1",
"count": 2,
"stages": [
{
"count": 1,
"type": "b3"
},
{
"count": 3,
"type": "b1"
},
{
"count": 4,
"type": "b2"
}
]
},
{
"_id": "a2",
"count": 1,
"stages": [
{
"count": 1,
"type": "b1"
},
{
"count": 3,
"type": "b2"
}
]
}
]

Using the sample data in the question post and the aggregation query:
db.collection.aggregate([
  {
    $unwind: "$stages"
  },
  {
    $group: {
      _id: { libraryName: "$libraryName", type: "$stages.type" },
      type_count: { "$sum": 1 }
    }
  },
  {
    $group: {
      _id: { libraryName: "$_id.libraryName" },
      count: { "$sum": "$type_count" },
      stages: { $push: { type: "$_id.type", count: "$type_count" } }
    }
  },
  {
    $project: {
      libraryName: "$_id.libraryName",
      count: 1,
      stages: 1,
      _id: 0
    }
  }
])
I get the following results:
{
"libraryName" : "a2",
"count" : 4,
"stages" : [
{
"type" : "b1",
"count" : 1
},
{
"type" : "b2",
"count" : 3
}
]
}
{
"libraryName" : "a1",
"count" : 8,
"stages" : [
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
},
{
"type" : "b2",
"count" : 4
}
]
}
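[Edit] This answer was added after the expected result in the question was modified; here count is the number of session documents per library. The query uses the question's sample documents as input.
db.collection.aggregate([
  // Count sessions per library and keep each session's stages array.
  {
    $group: {
      _id: { libraryName: "$libraryName" },
      count: { "$sum": 1 },
      stages: { $push: "$stages" }
    }
  },
  // stages is now an array of arrays, so unwind it twice.
  {
    $unwind: "$stages"
  },
  {
    $unwind: "$stages"
  },
  // Count stages per (library, type), carrying the session count along.
  {
    $group: {
      _id: { libraryName: "$_id.libraryName", type: "$stages.type" },
      type_count: { "$sum": 1 },
      count: { $first: "$count" }
    }
  },
  // Fold the per-type counts back into one document per library.
  {
    $group: {
      _id: "$_id.libraryName",
      count: { $first: "$count" },
      stages: { $push: { type: "$_id.type", count: "$type_count" } }
    }
  },
])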
The result:
{
"_id" : "a2",
"count" : 1,
"stages" : [
{
"type" : "b2",
"count" : 3
},
{
"type" : "b1",
"count" : 1
}
]
}
{
"_id" : "a1",
"count" : 2,
"stages" : [
{
"type" : "b2",
"count" : 4
},
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
}
]
}
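For comparison, the same two counts can be worked out in a single pass in application code. This is only a sketch in plain JavaScript, not MongoDB API code; countByLibrary is a made-up name, and the _id fields of the sample documents are left out because they play no role.

```javascript
// Sample sessions from the question, without the irrelevant _id fields.
const sessions = [
  { libraryName: "a1", stages: [{ type: "b1" }, { type: "b2" }, { type: "b3" }, { type: "b1" }] },
  { libraryName: "a1", stages: [{ type: "b2" }, { type: "b2" }, { type: "b2" }, { type: "b1" }] },
  { libraryName: "a2", stages: [{ type: "b2" }, { type: "b2" }, { type: "b2" }, { type: "b1" }] },
];

// Hypothetical helper: sessions per library plus stages per type.
function countByLibrary(docs) {
  const libs = new Map();
  for (const { libraryName, stages } of docs) {
    if (!libs.has(libraryName)) {
      libs.set(libraryName, { count: 0, types: new Map() });
    }
    const lib = libs.get(libraryName);
    lib.count += 1; // one more session document for this library
    for (const { type } of stages) {
      lib.types.set(type, (lib.types.get(type) || 0) + 1);
    }
  }
  // Shape the output like the second aggregation result.
  return [...libs.entries()].map(([libraryName, { count, types }]) => ({
    _id: libraryName,
    count,
    stages: [...types.entries()].map(([type, n]) => ({ type, count: n })),
  }));
}

const libraryCounts = countByLibrary(sessions);
console.log(JSON.stringify(libraryCounts, null, 2));
```

On the sample data this gives count 2 for a1 (b1: 3, b2: 4, b3: 1) and count 1 for a2 (b2: 3, b1: 1), matching the second result above apart from ordering.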

MongoDB Aggregate how to pair relevant records for processing

I've got some event data captured in a MongoDB database, and some of these events occur in pairs.
E.g. DOOR_OPEN and DOOR_CLOSE are two events that occur in pairs.
Events collection:
{ _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: t }
{ _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: t+5 }
{ _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp:t+10 }
{ _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp:t+30 }
{ _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp:t+35 }
{ _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp:t+40 }
...
Assuming the records are sorted by timestamp, _id: 1 and _id: 3 are a "pair" for "user1", and _id: 2 and _id: 6 are a pair for "user2".
I'd like to take all these DOOR_OPEN and DOOR_CLOSE pairs per user and calculate statistics such as the average duration the door has been open for each user.
Can this be achieved using the aggregate framework?

You can use $lookup and $group to achieve this.
db.getCollection('TestColl').aggregate([
  // Keep only the door events.
  { $match: { "name": { $in: [ "DOOR_OPEN", "DOOR_CLOSE" ] } } },
  // For each DOOR_OPEN, look up the same user's later DOOR_CLOSE events.
  { $lookup:
    {
      from: "TestColl",
      let: { userID_lu: "$userID", name_lu: "$name", timestamp_lu: "$timestamp" },
      pipeline: [
        { $match:
          { $expr:
            { $and:
              [
                { $eq: [ "$userID", "$$userID_lu" ] },
                { $eq: [ "$$name_lu", "DOOR_OPEN" ] },
                { $eq: [ "$name", "DOOR_CLOSE" ] },
                { $gt: [ "$timestamp", "$$timestamp_lu" ] }
              ]
            }
          }
        },
        // Sort so that element 0 below is the earliest matching close.
        { $sort: { timestamp: 1 } }
      ],
      as: "close_dates"
    }
  },
  // Pair each open with its first later close, if there is one.
  { $addFields: { "close_time": { $arrayElemAt: [ "$close_dates.timestamp", 0 ] } } },
  { $addFields: { "time_diff": { $divide: [ { $subtract: [ "$close_time", "$timestamp" ] }, 1000 * 60 ] } } }, // Minutes
  { $group: { _id: "$userID",
      events: { $push: { "eventId": "$_id", "name": "$name", "timestamp": "$timestamp" } },
      averageTimestamp: { $avg: "$time_diff" }
    }
  }
])
Sample Data:
[
{ _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: ISODate("2019-10-24T08:00:00Z") },
{ _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: ISODate("2019-10-24T08:05:00Z") },
{ _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp:ISODate("2019-10-24T08:10:00Z") },
{ _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp:ISODate("2019-10-24T08:30:00Z") },
{ _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp:ISODate("2019-10-24T08:35:00Z") },
{ _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp:ISODate("2019-10-24T08:40:00Z") },
{ _id: 7, name: "DOOR_CLOSE", userID: "user1", timestamp:ISODate("2019-10-24T08:50:00Z") },
{ _id: 8, name: "DOOR_OPEN", userID: "user2", timestamp:ISODate("2019-10-24T08:55:00Z") }
]
Result:
/* 1 */
{
"_id" : "user2",
"events" : [
{
"eventId" : 2.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:05:00.000Z")
},
{
"eventId" : 6.0,
"name" : "DOOR_CLOSE",
"timestamp" : ISODate("2019-10-24T08:40:00.000Z")
},
{
"eventId" : 8.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:55:00.000Z")
}
],
"averageTimestamp" : 35.0
}
/* 2 */
{
"_id" : "user1",
"events" : [
{
"eventId" : 1.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:00:00.000Z")
},
{
"eventId" : 3.0,
"name" : "DOOR_CLOSE",
"timestamp" : ISODate("2019-10-24T08:10:00.000Z")
},
{
"eventId" : 4.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:30:00.000Z")
},
{
"eventId" : 7.0,
"name" : "DOOR_CLOSE",
"timestamp" : ISODate("2019-10-24T08:50:00.000Z")
}
],
"averageTimestamp" : 15.0
}
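The pairing idea can also be sketched outside the database. The plain JavaScript below is an illustration only, not MongoDB API code: the helper name averageOpenMinutes is made up, and it assumes each DOOR_CLOSE belongs to the same user's most recent unmatched DOOR_OPEN, which holds for the sample data but is not stated in the question.

```javascript
// Sample events from the answer, with ISODate replaced by Date.
const events = [
  { _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: new Date("2019-10-24T08:00:00Z") },
  { _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: new Date("2019-10-24T08:05:00Z") },
  { _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp: new Date("2019-10-24T08:10:00Z") },
  { _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp: new Date("2019-10-24T08:30:00Z") },
  { _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp: new Date("2019-10-24T08:35:00Z") },
  { _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp: new Date("2019-10-24T08:40:00Z") },
  { _id: 7, name: "DOOR_CLOSE", userID: "user1", timestamp: new Date("2019-10-24T08:50:00Z") },
  { _id: 8, name: "DOOR_OPEN", userID: "user2", timestamp: new Date("2019-10-24T08:55:00Z") },
];

// Hypothetical helper: average open duration in minutes per user.
function averageOpenMinutes(docs) {
  const sorted = [...docs].sort((a, b) => a.timestamp - b.timestamp);
  const openSince = new Map(); // userID -> timestamp of the unmatched open
  const durations = new Map(); // userID -> list of durations in minutes
  for (const { name, userID, timestamp } of sorted) {
    if (name === "DOOR_OPEN") {
      openSince.set(userID, timestamp);
    } else if (name === "DOOR_CLOSE" && openSince.has(userID)) {
      const minutes = (timestamp - openSince.get(userID)) / 60000;
      if (!durations.has(userID)) durations.set(userID, []);
      durations.get(userID).push(minutes);
      openSince.delete(userID); // this open is now paired
    }
  }
  const averages = {};
  for (const [userID, list] of durations) {
    averages[userID] = list.reduce((sum, m) => sum + m, 0) / list.length;
  }
  return averages;
}

const averages = averageOpenMinutes(events);
console.log(averages);
```

On the sample data user1 averages 15 minutes and user2 averages 35 minutes, the same figures as the pipeline result; the unmatched DOOR_OPEN with _id 8 is ignored.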

You could use the $group stage of the aggregation framework to group by userID and calculate the averages. Note that this averages the timestamp values themselves for each user; it does not pair events or compute durations.
db.events.aggregate([{
  $group: {
    _id: "$userID",
    averageTimestamp: { $avg: "$timestamp" }
  }
}]);
If you also want to discard every event other than DOOR_OPEN or DOOR_CLOSE, add a $match stage at the start of the pipeline:
db.events.aggregate([{
  $match: {
    $or: [{ name: "DOOR_OPEN" }, { name: "DOOR_CLOSE" }]
  }
}, {
  $group: {
    _id: "$userID",
    averageTimestamp: { $avg: "$timestamp" }
  }
}]);

Grouping the results of a group query in mongodb

I have the following sample data:
{
"name": "Bob",
"mi": "K",
"martialStatus": "M",
"age": 30,
"city": "Paris",
"job": "Engineer"
}
{
"name": "Chad",
"mi": "M",
"martialStatus": "W",
"age": 31,
"city": "Paris",
"job": "Doctor"
}
{
"name": "Mel",
"mi": "A",
"martialStatus": "D",
"age": 31,
"city": "London",
"job": "Doctor"
}
{
"name": "Frank",
"mi": "F",
"martialStatus": "S",
"age": 30,
"city": "London",
"job": "Engineer"
}
I am trying to write a mongo query that would return results in the following format:
"peopleCount": 4,
"jobsList": {
"job": "Doctor",
"ageList": [
{
"age": 31,
"cityList": [
{
"city": "London",
"people": [
{
"name": "Mel",
"martialStatus": "D"
}
]
},
{
"city": "Paris",
"people": [
{
"name": "Chad",
"martialStatus": "W"
}
]
},
{
"city": "Berlin",
...
...
]
}
]
}
To try the first two levels (jobsList and ageList), I am using the pipeline below:
db.colName.aggregate([
{
$group: {
_id: { job: "$job" },
jobsList: {
$push: {
age: "$age",
city: "$city",
name: "$name",
martialStatus: "$martialStatus"
}
}
}
},
{
$group: {
_id: { age: "$age" },
ageList: {
$push: {
city: "$city",
name: "$name",
martialStatus: "$martialStatus"
}
}
}
}
]);
The above does not work, although the first group/push part does. Any hints on how to get that output format/grouping?

db.colName.aggregate([
  // Innermost level: one group per (job, age, city) with its people.
  {
    $group: {
      _id: { job: "$job", age: "$age", city: "$city" },
      people: { $push: { name: "$name", martialStatus: "$martialStatus" } }
    }
  },
  // Roll the cities up into each (job, age) pair.
  {
    $group: {
      _id: { job: "$_id.job", age: "$_id.age" },
      peopleCount: { $sum: { $size: "$people" } },
      cityList: { $push: { city: "$_id.city", people: "$people" } },
    }
  },
  // Roll the ages up into each job.
  {
    $group: {
      _id: { job: "$_id.job" },
      peopleCount: { $sum: "$peopleCount" },
      agesList: { $push: { age: "$_id.age", cityList: "$cityList" } }
    }
  },
  // Roll all jobs up into a single document.
  {
    $group: {
      _id: null,
      peopleCount: { $sum: "$peopleCount" },
      jobsList: { $push: { job: "$_id.job", agesList: "$agesList" } }
    }
  },
  {
    $project: { _id: 0, peopleCount: 1, jobsList: 1 }
  }
]);
Running this on the collection you provided gives me the result:
{
"peopleCount" : 4,
"jobsList" :
[
{
"job" : "Engineer",
"agesList" :
[
{
"age" : 30,
"cityList" :
[
{
"city" : "London",
"people" :
[
{ "name" : "Frank", "martialStatus" : "S" }
]
},
{
"city" : "Paris",
"people" :
[
{ "name" : "Bob", "martialStatus" : "M" }
]
}
]
}
]
},
{
"job" : "Doctor",
"agesList" :
[
{
"age" : 31,
"cityList" :
[
{
"city" : "London",
"people" :
[
{ "name" : "Mel", "martialStatus" : "D" }
]
},
{
"city" : "Paris",
"people" :
[
{ "name" : "Chad", "martialStatus" : "W" }
]
}
]
}
]
}
]
}
That seems to be correct, though I am not sure it's the best solution; I am new to the aggregation framework.
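The pipeline builds the tree from the innermost level outward. As a rough in-memory counterpart, the sketch below builds the same kind of tree from the top down with a small recursive function. It is plain JavaScript, not MongoDB API code; the function name nest and the generic children key are made up for illustration, so the output uses children where the pipeline uses agesList, cityList and people.

```javascript
// Sample documents from the question (only the fields that matter here).
const staff = [
  { name: "Bob", martialStatus: "M", age: 30, city: "Paris", job: "Engineer" },
  { name: "Chad", martialStatus: "W", age: 31, city: "Paris", job: "Doctor" },
  { name: "Mel", martialStatus: "D", age: 31, city: "London", job: "Doctor" },
  { name: "Frank", martialStatus: "S", age: 30, city: "London", job: "Engineer" },
];

// Hypothetical helper: group docs by keys[0], then recurse on the rest.
// When no keys remain, each document is mapped through leaf().
function nest(docs, keys, leaf) {
  if (keys.length === 0) return docs.map(leaf);
  const [key, ...rest] = keys;
  const buckets = new Map();
  for (const doc of docs) {
    if (!buckets.has(doc[key])) buckets.set(doc[key], []);
    buckets.get(doc[key]).push(doc);
  }
  return [...buckets.entries()].map(([value, members]) => ({
    [key]: value,
    children: nest(members, rest, leaf),
  }));
}

const tree = {
  peopleCount: staff.length,
  jobsList: nest(staff, ["job", "age", "city"], (p) => ({
    name: p.name,
    martialStatus: p.martialStatus,
  })),
};
console.log(JSON.stringify(tree, null, 2));
```

Adding or removing a grouping level then only means changing the list of keys passed to nest.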

Mongodb : elemMatch

My problem can be described as follows.
Given the following data (copied from the MongoDB manual), how can I find the documents that have zipcode 63109 and contain two students named "John" and "Jeff"? I tried:
db.schools.find( { zipcode: 63109 },
{ students: { $elemMatch: { name : {"John", "Jeff"} } } } )
But it doesn't work. Could you help me, please?
Thank you in advance.
{
_id: 1,
zipcode: 63109,
students: [
{ name: "john", school: 102, age: 10 },
{ name: "jess", school: 102, age: 11 },
{ name: "jeff", school: 108, age: 15 }
]
}
{
_id: 2,
zipcode: 63110,
students: [
{ name: "ajax", school: 100, age: 7 },
{ name: "achilles", school: 100, age: 8 },
]
}
{
_id: 3,
zipcode: 63109,
students: [
{ name: "ajax", school: 100, age: 7 },
{ name: "achilles", school: 100, age: 8 },
]
}
{
_id: 4,
zipcode: 63109,
students: [
{ name: "barney", school: 102, age: 7 },
]
}

I don't think it's possible to use $elemMatch here.
The only solution I see is:
db.schools.find( { zipcode: 63109, $and: [{"students.name": "john"}, {"students.name": "jeff"}]} )
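To make the condition explicit, the same filter can be written over plain arrays: a document matches when its zipcode is right and every required name appears somewhere in students. This is an in-memory sketch in plain JavaScript, not MongoDB API code; findSchools is a made-up name. In MongoDB itself, { "students.name": { $all: ["john", "jeff"] } } should express the same requirement as the $and above.

```javascript
// Sample documents from the question.
const schools = [
  { _id: 1, zipcode: 63109, students: [
    { name: "john", school: 102, age: 10 },
    { name: "jess", school: 102, age: 11 },
    { name: "jeff", school: 108, age: 15 },
  ] },
  { _id: 2, zipcode: 63110, students: [
    { name: "ajax", school: 100, age: 7 },
    { name: "achilles", school: 100, age: 8 },
  ] },
  { _id: 3, zipcode: 63109, students: [
    { name: "ajax", school: 100, age: 7 },
    { name: "achilles", school: 100, age: 8 },
  ] },
  { _id: 4, zipcode: 63109, students: [
    { name: "barney", school: 102, age: 7 },
  ] },
];

// Hypothetical helper: zipcode must match and every name must be present.
function findSchools(docs, zipcode, requiredNames) {
  return docs.filter(
    (doc) =>
      doc.zipcode === zipcode &&
      requiredNames.every((wanted) =>
        doc.students.some((student) => student.name === wanted)
      )
  );
}

const matches = findSchools(schools, 63109, ["john", "jeff"]);
console.log(matches.map((doc) => doc._id));
```

Only the document with _id 1 satisfies both conditions in the sample data. Note that the names are matched in lowercase, as they are stored.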
