Count nested and outer data - mongodb

I have the following mongo data structure:
[
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b1',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b3',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a1",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
{
_id: "......",
libraryName: "a2",
stages: [
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b2',
},
{
_id: '....',
type: 'b1',
},
],
},
]
Assume this is the Session collection. Each session document has an _id (irrelevant here) and a libraryName key. Each document also has an array of stage documents, and each stage document has an _id (irrelevant here) and a type. I want to count two things.
First, for each libraryName, I want to count how many session documents it has.
The solution for this query would be:
const services = await Session.aggregate(
[
{
$group: {
_id: "$libraryName",
count: { $sum: 1 },
},
}
]
);
Second, for each libraryName, I want to count how many nested stage documents there are of each stage type.
So the final result I wish to retrieve is:
[
{
libraryName: 'a1',
count: 456,
stages: [
{
type: 'b1',
count: 43,
},
{
type: 'b2',
count: 44,
}
],
},
{
libraryName: 'a2',
count: 4546,
stages: [
{
type: 'b1',
count: 43
},
{
type: 'b3',
count: 44
}
]
}
]
Edit: using the sample data above, the expected result is:
[
{
"_id": "a1",
"count": 2,
"stages": [
{
"count": 1,
"type": "b3"
},
{
"count": 3,
"type": "b1"
},
{
"count": 4,
"type": "b2"
}
]
},
{
"_id": "a2",
"count": 1,
"stages": [
{
"count": 1,
"type": "b1"
},
{
"count": 3,
"type": "b2"
}
]
}
]

Using the sample data in the question post and the aggregation query:
db.collection.aggregate([
{
$unwind: "$stages"
},
{
$group: {
_id: { libraryName: "$libraryName", type: "$stages.type" },
type_count: { "$sum": 1 }
}
},
{
$group: {
_id: { libraryName: "$_id.libraryName" },
count: { "$sum": "$type_count" },
stages: { $push: { type: "$_id.type", count: "$type_count" } }
}
},
{
$project: {
libraryName: "$_id.libraryName",
count: 1,
stages: 1,
_id: 0
}
}
])
I get the following results:
{
"libraryName" : "a2",
"count" : 4,
"stages" : [
{
"type" : "b1",
"count" : 1
},
{
"type" : "b2",
"count" : 3
}
]
}
{
"libraryName" : "a1",
"count" : 8,
"stages" : [
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
},
{
"type" : "b2",
"count" : 4
}
]
}
[EDIT - ADD]: This answer was added after the expected result in the question post was modified. The query uses the question post's sample documents as input.
db.collection.aggregate([
{
$group: {
_id: { libraryName: "$libraryName" },
count: { "$sum": 1 },
stages: { $push: "$stages" }
}
},
{
$unwind: "$stages"
},
{
$unwind: "$stages"
},
{
$group: {
_id: { libraryName: "$_id.libraryName", type: "$stages.type" },
type_count: { "$sum": 1 },
count: { $first: "$count" }
}
},
{
$group: {
_id: "$_id.libraryName",
count: { $first: "$count" },
stages: { $push: { type: "$_id.type", count: "$type_count" } }
}
},
])
The result:
{
"_id" : "a2",
"count" : 1,
"stages" : [
{
"type" : "b2",
"count" : 3
},
{
"type" : "b1",
"count" : 1
}
]
}
{
"_id" : "a1",
"count" : 2,
"stages" : [
{
"type" : "b2",
"count" : 4
},
{
"type" : "b3",
"count" : 1
},
{
"type" : "b1",
"count" : 3
}
]
}
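As a cross-check on the pipeline above, here is a minimal plain-JavaScript sketch of the same two counts, run in memory on the question's sample documents (with the _id fields omitted). The helper name countByLibrary is made up for illustration and is not part of MongoDB or Mongoose; it only shows what the aggregation is expected to compute.

```javascript
// Sample documents from the question, with the irrelevant _id fields omitted.
const sessions = [
  { libraryName: "a1", stages: [{ type: "b1" }, { type: "b2" }, { type: "b3" }, { type: "b1" }] },
  { libraryName: "a1", stages: [{ type: "b2" }, { type: "b2" }, { type: "b2" }, { type: "b1" }] },
  { libraryName: "a2", stages: [{ type: "b2" }, { type: "b2" }, { type: "b2" }, { type: "b1" }] },
];

// countByLibrary is a hypothetical helper, not a MongoDB or Mongoose API.
function countByLibrary(docs) {
  const byLibrary = new Map();
  for (const doc of docs) {
    if (!byLibrary.has(doc.libraryName)) {
      byLibrary.set(doc.libraryName, { _id: doc.libraryName, count: 0, types: new Map() });
    }
    const entry = byLibrary.get(doc.libraryName);
    entry.count += 1; // one more session document for this library
    for (const stage of doc.stages || []) {
      // one more nested stage of this type for this library
      entry.types.set(stage.type, (entry.types.get(stage.type) || 0) + 1);
    }
  }
  return [...byLibrary.values()].map(({ _id, count, types }) => ({
    _id,
    count,
    stages: [...types].map(([type, n]) => ({ type, count: n })),
  }));
}

const libraryCounts = countByLibrary(sessions);
console.log(JSON.stringify(libraryCounts, null, 2));
```

With the sample data this gives a count of 2 for a1 (b1: 3, b2: 4, b3: 1) and a count of 1 for a2 (b2: 3, b1: 1), which matches the aggregation result above apart from element order.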

Related

Grouping into array in MongoDB

I have MongoDB collection with below documents:
[
{
"productType":"Bike",
"company":"yamaha",
"model":"y1"
},
{
"productType":"Bike",
"company":"bajaj",
"model":"b1"
},
{
"productType":"Bike",
"company":"yamaha",
"model":"y1"
},
{
"productType":"Car",
"company":"Maruti",
"model":"m1"
},
{
"productType":"Bike",
"company":"yamaha",
"model":"y2"
},
{
"productType":"Car",
"company":"Suzuki",
"model":"s1"
}
]
I want my output to look like this:
{
"productType": [
{
"name": "Bike",
"count": 4,
"companies": [
{
"name": "Yamaha",
"count": 3,
"models": [
{
"name": "y1",
"count": 2
},
{
"name": "y2",
"count": 1
}
]
},
{
"name": "Bajaj",
"count": 1,
"models": [
{
"name": "b1",
"count": 1
}
]
}
]
},
{
"name": "Car",
"count": 2,
"companies": [
{
"name": "Maruti",
"count": 1,
"models": [
{
"name": "m1",
"count": 1
}
]
},
{
"name": "Suzuki",
"count": 1,
"models": [
{
"name": "s1",
"count": 1
}
]
}
]
}
]
}
I don't understand how to create arrays inside an existing array using $push. I know we can create an array using $push, but how do I create an array of arrays with it?
In the future, I might also want to add a "metaData" field along with name and count.
You have to run multiple $group stages, one for each level:
db.collection.aggregate([
{
$group: {
_id: { company: "$company", productType: "$productType", model: "$model" },
count: { $sum: 1 }
}
},
{
$group: {
_id: { productType: "$_id.productType", company: "$_id.company" },
models: { $push: { name: "$_id.model", count: "$count" } },
count: { $sum: "$count" }
}
},
{
$group: {
_id: "$_id.productType",
companies: { $push: { name: "$_id.company", models: "$models", count: "$count" } },
count: { $sum: "$count" }
}
},
{ $set: { name: "$_id", _id: "$$REMOVE" } },
{
$group: {
_id: null,
productType: { $push: "$$ROOT" }
}
}
])
Try this:
db.testCollection.aggregate([
{
$group: {
_id: {
name: "$productType",
company: "$company",
model: "$model"
},
count: { $sum: 1 }
}
},
{
$group: {
_id: {
name: "$_id.name",
company: "$_id.company"
},
count: { $sum: "$count" },
models: {
$push: {
name: "$_id.model",
count: "$count"
}
}
}
},
{
$group: {
_id: { name: "$_id.name" },
count: { $sum: "$count" },
companies: {
$push: {
name: "$_id.company",
count: "$count",
models: "$models"
}
}
}
},
{
$group: {
_id: null,
productType: {
$push: {
name: "$_id.name",
count: "$count",
companies: "$companies"
}
}
}
},
{
$project: { _id: 0 }
}
]);
Output:
{
"productType" : [
{
"name" : "Car",
"count" : 2,
"companies" : [
{
"name" : "Suzuki",
"count" : 1,
"models" : [
{
"name" : "s1",
"count" : 1
}
]
},
{
"name" : "Maruti",
"count" : 1,
"models" : [
{
"name" : "m1",
"count" : 1
}
]
}
]
},
{
"name" : "Bike",
"count" : 4,
"companies" : [
{
"name" : "yamaha",
"count" : 3,
"models" : [
{
"name" : "y2",
"count" : 1
},
{
"name" : "y1",
"count" : 2
}
]
},
{
"name" : "bajaj",
"count" : 1,
"models" : [
{
"name" : "b1",
"count" : 1
}
]
}
]
}
]
}
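The "one $group per level" idea used in both answers can also be sketched in plain JavaScript, which may make the nesting easier to follow. This is only an in-memory illustration on the question's sample documents; groupCount is a hypothetical helper name, not a MongoDB operator.

```javascript
// Sample documents from the question.
const products = [
  { productType: "Bike", company: "yamaha", model: "y1" },
  { productType: "Bike", company: "bajaj", model: "b1" },
  { productType: "Bike", company: "yamaha", model: "y1" },
  { productType: "Car", company: "Maruti", model: "m1" },
  { productType: "Bike", company: "yamaha", model: "y2" },
  { productType: "Car", company: "Suzuki", model: "s1" },
];

// groupCount (hypothetical helper) groups `items` by keys[0], counts each
// group, and recurses into the remaining keys, one level per key.
function groupCount(items, keys, childNames) {
  const [key, ...restKeys] = keys;
  const [childName, ...restNames] = childNames;
  const groups = new Map();
  for (const item of items) {
    if (!groups.has(item[key])) groups.set(item[key], []);
    groups.get(item[key]).push(item);
  }
  return [...groups].map(([name, members]) => {
    const node = { name, count: members.length };
    if (restKeys.length > 0) {
      // the nested array plays the role of $push in the next $group stage
      node[childName] = groupCount(members, restKeys, restNames);
    }
    return node;
  });
}

const nested = {
  productType: groupCount(
    products,
    ["productType", "company", "model"],
    ["companies", "models"]
  ),
};
console.log(JSON.stringify(nested, null, 2));
```

An extra per-level field such as the "metaData" mentioned in the question would be added to `node` here, just as it would be added inside each $push object in the pipeline.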

Apply multistage grouping in MongoDb Aggregation Framework

Let's assume I have the following data:
[
{ name: "Clint", hairColor: "brown", shoeSize: 8, income: 20000 },
{ name: "Clint", hairColor: "blond", shoeSize: 9, income: 30000 },
{ name: "George", hairColor: "brown", shoeSize: 7, income: 30000 },
{ name: "George", hairColor: "blond", shoeSize: 8, income: 10000 },
{ name: "George", hairColor: "blond", shoeSize: 9, income: 20000 }
]
I want to have the following output:
[
{
name: "Clint",
counts: 2,
avgShoesize: 8.5,
shoeSizeByHairColor: [
{ _id: "brown", counts: 1, avgShoesize: 8 },
{ _id: "blond", counts: 1, avgShoesize: 9 },
],
incomeByHairColor: [
{ _id: "brown", counts: 1, avgIncome: 20000 },
{ _id: "blond", counts: 1, avgIncome: 30000 },
]
},
{
name: "George",
counts: 3,
avgShoesize: 8,
shoeSizeByHairColor: [
{ _id: "brown", counts: 1, avgShoesize: 7 },
{ _id: "blond", counts: 2, avgShoesize: 8.5 },
],
incomeByHairColor: [
{ _id: "brown", counts: 1, avgIncome: 30000 },
{ _id: "blond", counts: 2, avgIncome: 15000 },
],
}
]
Basically I want to group my dataset by some key and then I want to have multiple groups of the subset.
First I thought of applying a $group with the key name and then using $facet in order to have various aggregations. I guess this will not work, since $facet does not use the subset from the previous $group. If I use $facet first, I would need to split the result into multiple documents.
Any ideas how to properly solve my problem?
You need two $group stages. The first one should aggregate by name and hairColor, and the second one can build the nested arrays:
db.collection.aggregate([
{
$group: {
_id: { name: "$name", hairColor: "$hairColor" },
count: { $sum: 1 },
sumShoeSize: { $sum: "$shoeSize" },
avgShoeSize: { $avg: "$shoeSize" },
avgIncome: { $avg: "$income" },
docs: { $push: "$$ROOT" }
}
},
{
$group: {
_id: "$_id.name",
count: { $sum: "$count" },
sumShoeSize: { $sum: "$sumShoeSize" },
shoeSizeByHairColor: {
$push: {
_id: "$_id.hairColor", counts: "$count", avgShoeSize: "$avgShoeSize"
}
},
incomeByHairColor: {
$push: {
_id: "$_id.hairColor", counts: "$count", avgIncome: "$avgIncome"
}
}
}
},
{
$project: {
_id: 1,
count: 1,
avgShoeSize: { $divide: [ "$sumShoeSize", "$count" ] },
shoeSizeByHairColor: 1,
incomeByHairColor: 1
}
}
])
Phase 1: Group by name and hairColor and accumulate count, avgShoeSize, avgIncome and hairColors.
Phase 2: Use the $map operator to build the incomeByHairColor and shoeSizeByHairColor arrays from the accumulated values.
Phase 3: Finally, group by name and accumulate incomeByHairColor, shoeSizeByHairColor and count.
Pipeline:
db.users.aggregate([
{
$group :{
_id: {
name : "$name",
hairColor: "$hairColor"
},
count : {"$sum": 1},
avgShoeSize: {$avg: "$shoeSize"},
avgIncome : {$avg: "$income"},
hairColors : {$addToSet:"$hairColor" }
}
},
{
$project: {
_id:0,
name : "$_id.name",
hairColor: "$_id.hairColor",
count : "$count",
incomeByHairColor : {
$map: {
input: "$hairColors",
as: "key",
in: {
_id: "$$key",
counts: "$count",
avgIncome: "$avgIncome"
}
}
},
shoeSizeByHairColor:{
$map: {
input: "$hairColors",
as: "key",
in: {
_id: "$$key",
counts: "$count",
avgShoeSize: "$avgShoeSize"
}
}
}
}
},
{
$group: {
_id : "$name",
count : {$sum: "$count"},
incomeByHairColor: {$push : "$incomeByHairColor"},
shoeSizeByHairColor : {$push : "$shoeSizeByHairColor"}
}
}
]
)
Output:
/* 1 */
{
"_id" : "Clint",
"count" : 2,
"incomeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 1,
"avgIncome" : 30000
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgIncome" : 20000
}
]
],
"shoeSizeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 1,
"avgShoeSize" : 9
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgShoeSize" : 8
}
]
]
},
/* 2 */
{
"_id" : "George",
"count" : 3,
"incomeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 2,
"avgIncome" : 15000
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgIncome" : 30000
}
]
],
"shoeSizeByHairColor" : [
[
{
"_id" : "blond",
"counts" : 2,
"avgShoeSize" : 8.5
}
],
[
{
"_id" : "brown",
"counts" : 1,
"avgShoeSize" : 7
}
]
]
}
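To check what the two-level grouping should produce for the sample data, here is a small plain-JavaScript sketch that groups by name and then by hairColor in memory. The helpers groupBy and avg are illustrative names of my own, and the output field names simply follow the question's desired shape (with avgShoeSize spelled as in the answers).

```javascript
// Sample documents from the question.
const people = [
  { name: "Clint", hairColor: "brown", shoeSize: 8, income: 20000 },
  { name: "Clint", hairColor: "blond", shoeSize: 9, income: 30000 },
  { name: "George", hairColor: "brown", shoeSize: 7, income: 30000 },
  { name: "George", hairColor: "blond", shoeSize: 8, income: 10000 },
  { name: "George", hairColor: "blond", shoeSize: 9, income: 20000 },
];

// Hypothetical helpers for illustration only.
const avg = (nums) => nums.reduce((a, b) => a + b, 0) / nums.length;

function groupBy(items, key) {
  const groups = new Map();
  for (const item of items) {
    if (!groups.has(item[key])) groups.set(item[key], []);
    groups.get(item[key]).push(item);
  }
  return groups;
}

// Outer group: name. Inner group: hairColor within each name.
const summary = [...groupBy(people, "name")].map(([name, docs]) => {
  const byHair = [...groupBy(docs, "hairColor")];
  return {
    name,
    counts: docs.length,
    avgShoeSize: avg(docs.map((d) => d.shoeSize)),
    shoeSizeByHairColor: byHair.map(([color, sub]) => ({
      _id: color,
      counts: sub.length,
      avgShoeSize: avg(sub.map((d) => d.shoeSize)),
    })),
    incomeByHairColor: byHair.map(([color, sub]) => ({
      _id: color,
      counts: sub.length,
      avgIncome: avg(sub.map((d) => d.income)),
    })),
  };
});
console.log(JSON.stringify(summary, null, 2));
```

Note that this sketch produces flat arrays of per-colour objects, whereas the $map/$push pipeline output shown above nests each colour in its own single-element array.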

MongoDB/Mongoose: append related records to each aggregation result

Given the following Mongo collection called "members"
{
{name: "Joe", hobby: "Food"}, {name: "Lyn", hobby: "Food"},
{name: "Rex", hobby: "Play"}, {name: "Rex", hobby: "Shop"},...
}
I have an aggregation query that returns a paged set of records along with metadata for the total records found:
db.members.aggregate([
{
$facet: {
pipe1: [{ $count: 'count' }],
pipe2: [{ $skip: 0 }, { $limit: 4 }],
},
},
{
$unwind: '$pipe1',
},
{
$project: {
count: '$pipe1.count',
results: '$pipe2',
},
},
])
This gives me:
{count: 454, results: [<First 4 records here>]}
I am now trying to add to each record an array of all member names that have the same hobby. So for the collection above, something like:
{
count: 454,
results: [
{name: "Joe", hobby: "Food", fanClub: ["Joe", "Lyn", "Alfred"]},
{name: "Lyn", hobby: "Food", fanClub: ["Joe", "Lyn", "Alfred"]},
{name: "Rex", hobby: "Play", fanClub: ["Rex"]},
{name: "Rex", hobby: "Shop", fanClub: ["Rex", "Rita"]}
]
}
I can't figure out how to run the follow up query within the aggregate. I've tried:
db.members.aggregate([
{
$facet: {
pipe1: [{ $count: 'count' }],
pipe2: [
{ $skip: 0 },
{ $limit: 2 },
{
$lookup: {
from: 'members',
pipeline: [{ $match: { hobby: '$hobby' } }],
as: 'fanClub',
},
},
],
},
},
{
$unwind: '$pipe1',
},
{
$project: {
count: '$pipe1.count',
results: '$pipe2',
},
},
])
Alas, the fanClub array is always empty.
Update 1
If I hardcode the hobby, for instance replace
{ $match: { hobby: '$hobby' } }
with
{ $match: { hobby: 'Food' } }
then I do get results, and all the fanClub arrays contain the results for Joe, Lyn and Alfred. So I must not be referring to the value within the pipeline correctly.
Please try this:
db.membersHobby.aggregate([
{
$facet: {
pipe1: [{ $count: 'count' }],
pipe2: [{
$lookup:
{
from: "membersHobby",
let: { hobby: "$hobby" },
pipeline: [
{
$match:
{ $expr: { $eq: ["$hobby", "$$hobby"] } }
},
{ $project: { name: 1, _id: 0 } }
],
as: "fanClub"
}
}, { $skip: 0 }, { $limit: 4 }]
}
},
{
$unwind: '$pipe1'
},
{
$project: {
count: '$pipe1.count',
results: '$pipe2'
}
}
])
Result :
/* 1 */
{
"count" : 4,
"results" : [
{
"_id" : ObjectId("5e20a63ed3c98f2a7100fd4a"),
"name" : "Joe",
"hobby" : "Food",
"fanClub" : [
{
"name" : "Joe"
},
{
"name" : "Lyn"
}
]
},
{
"_id" : ObjectId("5e20a63ed3c98f2a7100fd4b"),
"name" : "Lyn",
"hobby" : "Food",
"fanClub" : [
{
"name" : "Joe"
},
{
"name" : "Lyn"
}
]
},
{
"_id" : ObjectId("5e20a63ed3c98f2a7100fd4c"),
"name" : "Rex",
"hobby" : "Play",
"fanClub" : [
{
"name" : "Rex"
}
]
},
{
"_id" : ObjectId("5e20a63ed3c98f2a7100fd4d"),
"name" : "Rex",
"hobby" : "Shop",
"fanClub" : [
{
"name" : "Rex"
}
]
}
]
}
If @srinivasy's answer meets your requirements, please award the points to him :)
If you want to get such structure:
{
count: 454,
results: [
{name: "Joe", hobby: "Food", fanClub: ["Joe", "Lyn", "Alfred"]},
{name: "Lyn", hobby: "Food", fanClub: ["Joe", "Lyn", "Alfred"]},
{name: "Rex", hobby: "Play", fanClub: ["Rex"]},
{name: "Rex", hobby: "Shop", fanClub: ["Rex", "Rita"]}
]
}
Use this query ($reduce is used to return a single value, in your case fanClub as an array):
db.members.aggregate([
{
$facet: {
pipe1: [
{
$count: "count"
}
],
pipe2: [
{
$skip: 0
},
{
$limit: 4
},
{
$lookup: {
from: "members",
let: {
hobby: "$hobby"
},
pipeline: [
{
$match: {
$expr: {
$eq: [
"$hobby",
"$$hobby"
]
}
}
}
],
as: "fanClub"
}
}
]
}
},
{
$unwind: "$pipe1"
},
{
$project: {
count: "$pipe1.count",
results: {
$map: {
input: "$pipe2",
as: "pipe2",
in: {
_id: "$$pipe2._id",
hobby: "$$pipe2.hobby",
name: "$$pipe2.name",
fanClub: {
$reduce: {
input: "$$pipe2.fanClub",
initialValue: [],
in: {
$concatArrays: [
"$$value",
[
"$$this.name"
]
]
}
}
}
}
}
}
}
}
])
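For comparison, the shape this query is aiming for can be sketched in plain JavaScript. The sketch below uses only the four members actually shown in the question (the rest are elided there), and pageWithFanClub is a hypothetical helper name; it mirrors the $facet count, the $skip/$limit page and the $lookup on hobby, but it is not how the server evaluates the pipeline.

```javascript
// Only the members visible in the question; the real collection has more.
const members = [
  { name: "Joe", hobby: "Food" },
  { name: "Lyn", hobby: "Food" },
  { name: "Rex", hobby: "Play" },
  { name: "Rex", hobby: "Shop" },
];

// pageWithFanClub is a hypothetical helper for illustration.
function pageWithFanClub(docs, skip, limit) {
  // Index names by hobby once, like a self-join on the hobby field.
  const namesByHobby = new Map();
  for (const { name, hobby } of docs) {
    if (!namesByHobby.has(hobby)) namesByHobby.set(hobby, []);
    namesByHobby.get(hobby).push(name);
  }
  return {
    count: docs.length, // total records, as in pipe1
    results: docs.slice(skip, skip + limit).map((doc) => ({
      ...doc,
      fanClub: namesByHobby.get(doc.hobby), // plain array of names
    })),
  };
}

const page = pageWithFanClub(members, 0, 4);
console.log(JSON.stringify(page, null, 2));
```

With these four documents, Joe and Lyn both get the fanClub ["Joe", "Lyn"], and each Rex record gets ["Rex"].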

MongoDB Aggregate how to pair relevant records for processing

I've got some event data captured in a MongoDB database, and some of these events occur in pairs.
Eg: DOOR_OPEN and DOOR_CLOSE are two events that occur in pairs
Events collection:
{ _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: t }
{ _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: t+5 }
{ _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp:t+10 }
{ _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp:t+30 }
{ _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp:t+35 }
{ _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp:t+40 }
...
Assuming the records are sorted on the timestamp, _id: 1 and _id: 3 are a "pair" for "user1", and _id: 2 and _id: 6 are a pair for "user2".
I'd like to take all these DOOR_OPEN & DOOR_CLOSE pairs per user and calculate, for example, the average duration the door was open for each user.
Can this be achieved using the aggregate framework?
You can use $lookup and $group to achieve this.
db.getCollection('TestColl').aggregate([
{ $match: {"name": { $in: [ "DOOR_OPEN", "DOOR_CLOSE" ] } }},
{ $lookup:
{
from: "TestColl",
let: { userID_lu: "$userID", name_lu: "$name", timestamp_lu :"$timestamp" },
pipeline: [
{ $match:
{ $expr:
{ $and:
[
{ $eq: [ "$userID", "$$userID_lu" ] },
{ $eq: [ "$$name_lu", "DOOR_OPEN" ]},
{ $eq: [ "$name", "DOOR_CLOSE" ]},
{ $gt: [ "$timestamp", "$$timestamp_lu" ] }
]
}
}
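},
// Keep only the earliest DOOR_CLOSE that follows this DOOR_OPEN
{ $sort: { timestamp: 1 } },
{ $limit: 1 }
],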
as: "close_dates"
}
},
{ $addFields: { "close_time": { $arrayElemAt: [ "$close_dates.timestamp", 0 ] } } },
{ $addFields: { "time_diff": { $divide: [ { $subtract: [ "$close_time", "$timestamp" ] }, 1000 * 60 ]} } }, // Minutes
{ $group: { _id: "$userID" ,
events: { $push: { "eventId": "$_id", "name": "$name", "timestamp": "$timestamp" } },
averageTimestamp: {$avg: "$time_diff"}
}
}
])
Sample Data:
[
{ _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: ISODate("2019-10-24T08:00:00Z") },
{ _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: ISODate("2019-10-24T08:05:00Z") },
{ _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp:ISODate("2019-10-24T08:10:00Z") },
{ _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp:ISODate("2019-10-24T08:30:00Z") },
{ _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp:ISODate("2019-10-24T08:35:00Z") },
{ _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp:ISODate("2019-10-24T08:40:00Z") },
{ _id: 7, name: "DOOR_CLOSE", userID: "user1", timestamp:ISODate("2019-10-24T08:50:00Z") },
{ _id: 8, name: "DOOR_OPEN", userID: "user2", timestamp:ISODate("2019-10-24T08:55:00Z") }
]
Result:
/* 1 */
{
"_id" : "user2",
"events" : [
{
"eventId" : 2.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:05:00.000Z")
},
{
"eventId" : 6.0,
"name" : "DOOR_CLOSE",
"timestamp" : ISODate("2019-10-24T08:40:00.000Z")
},
{
"eventId" : 8.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:55:00.000Z")
}
],
"averageTimestamp" : 35.0
}
/* 2 */
{
"_id" : "user1",
"events" : [
{
"eventId" : 1.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:00:00.000Z")
},
{
"eventId" : 3.0,
"name" : "DOOR_CLOSE",
"timestamp" : ISODate("2019-10-24T08:10:00.000Z")
},
{
"eventId" : 4.0,
"name" : "DOOR_OPEN",
"timestamp" : ISODate("2019-10-24T08:30:00.000Z")
},
{
"eventId" : 7.0,
"name" : "DOOR_CLOSE",
"timestamp" : ISODate("2019-10-24T08:50:00.000Z")
}
],
"averageTimestamp" : 15.0
}
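The pairing rule behind this result (each DOOR_OPEN is matched with the earliest later DOOR_CLOSE of the same user) can be checked with a short plain-JavaScript sketch over the sample data. averageOpenMinutes is a hypothetical helper name, and JavaScript Date objects stand in for ISODate values.

```javascript
// Sample data from the answer, with Date standing in for ISODate.
const events = [
  { _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: new Date("2019-10-24T08:00:00Z") },
  { _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: new Date("2019-10-24T08:05:00Z") },
  { _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp: new Date("2019-10-24T08:10:00Z") },
  { _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp: new Date("2019-10-24T08:30:00Z") },
  { _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp: new Date("2019-10-24T08:35:00Z") },
  { _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp: new Date("2019-10-24T08:40:00Z") },
  { _id: 7, name: "DOOR_CLOSE", userID: "user1", timestamp: new Date("2019-10-24T08:50:00Z") },
  { _id: 8, name: "DOOR_OPEN", userID: "user2", timestamp: new Date("2019-10-24T08:55:00Z") },
];

// averageOpenMinutes is a hypothetical helper for illustration.
function averageOpenMinutes(docs) {
  const sorted = [...docs].sort((a, b) => a.timestamp - b.timestamp);
  const durations = new Map(); // userID -> list of open durations in minutes
  for (const open of sorted.filter((e) => e.name === "DOOR_OPEN")) {
    // earliest later DOOR_CLOSE for the same user
    const close = sorted.find(
      (e) =>
        e.name === "DOOR_CLOSE" &&
        e.userID === open.userID &&
        e.timestamp > open.timestamp
    );
    if (!close) continue; // unmatched open events are ignored
    const minutes = (close.timestamp - open.timestamp) / (1000 * 60);
    if (!durations.has(open.userID)) durations.set(open.userID, []);
    durations.get(open.userID).push(minutes);
  }
  const result = {};
  for (const [userID, list] of durations) {
    result[userID] = list.reduce((a, b) => a + b, 0) / list.length;
  }
  return result;
}

const averages = averageOpenMinutes(events);
console.log(averages);
```

For the sample data this gives 15 minutes for user1 and 35 minutes for user2, the same values as averageTimestamp above. Like the $lookup approach, this simple rule would pair two consecutive DOOR_OPEN events of one user with the same DOOR_CLOSE, so it assumes opens and closes alternate per user.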
You could use the $group operator of the aggregate framework to group by userID and calculate the averages:
db.events.aggregate([{
$group: {
_id: "$userID",
averageTimestamp: {$avg: "$timestamp"}
}
}]);
If you also want to discard any event other than DOOR_OPEN or DOOR_CLOSE, you can filter by adding a $match stage to the aggregate pipeline:
db.events.aggregate([{
$match: {
$or: [{name: "DOOR_OPEN"},{name: "DOOR_CLOSE"}]
}
}, {
$group: {
_id: "$userID",
averageTimestamp: {$avg: "$timestamp"}
}
}]);

Use $size with $sort in array and sub array

Here's part of the structure of my collection:
_id: ObjectId("W"),
names: [
{
number: 1,
subnames: [ { id: "X", day: 1 }, { id: "Y", day: 10 }, { id: "Z", day: 2 } ],
list: ["A","B","C"],
day: 1
},
{
number: 2,
day: 5
},
{
number: 3,
subnames: [ { id: "X", day: 8 }, { id: "Z", day: 5 } ],
list: ["A","C"],
day: 2
},
...
],
...
I use this request:
db.publication.aggregate([
{ $match: { _id: ObjectId("W") } },
{
$group: {
_id: "$_id",
SizeName: { $first: { $size: { $ifNull: [ "$names", [] ] } } },
names: { $first: "$names" }
}
},
{ $unwind: "$names" },
{ $sort: { "names.day": 1 } },
{
$group: {
_id: "$_id",
SzNames: { $sum: 1 },
names: {
$push: {
number: "$names.number",
subnames: "$names.subnames",
list: "$names.list",
SizeList: { $size: { $ifNull: [ "$names.list", [] ] } }
}
}
}
}
]);
but I would now like to use $sort on both my names array AND my subnames array to obtain this result (subnames may not exist):
_id: ObjectId("W"),
names: [
{
number: 2,
SizeList: 0,
day: 5
},
{
number: 3,
subnames: [ { id: "Z", day: 5 }, { id: "X", day: 8 } ],
list: ["A","C"],
SizeList: 2,
day: 2
},
{
number: 1,
subnames: [ { id: "X", day: 1 }, { id: "Z", day: 2 }, { id: "Y", day: 10 } ],
list: ["A","B","C"],
SizeList: 3,
day: 1
}
...
],
...
Can you help me?
You can do this, but with great difficulty. I for one would gladly vote for an inline version of $sort along the lines of the $map operator. That would make things so much easier.
For now, though, you need to deconstruct and rebuild the arrays after sorting, and you have to be very careful about this. Hence, create placeholder arrays with a single entry before processing $unwind:
db.publication.aggregate([
{ "$project": {
"SizeNames": {
"$size": {
"$ifNull": [ "$names", [] ]
}
},
"names": { "$ifNull": [{ "$map": {
"input": "$names",
"as": "el",
"in": {
"SizeList": {
"$size": {
"$ifNull": [ "$$el.list", [] ]
}
},
"SizeSubnames": {
"$size": {
"$ifNull": [ "$$el.subnames", [] ]
}
},
"number": "$$el.number",
"day": "$$el.day",
"subnames": { "$ifNull": [ "$$el.subnames", [0] ] },
"list": "$$el.list"
}
}}, [0] ] }
}},
{ "$unwind": "$names" },
{ "$unwind": "$names.subnames" },
{ "$sort": { "_id": 1, "names.subnames.day": 1 } },
{ "$group": {
"_id": {
"_id": "$_id",
"SizeNames": "$SizeNames",
"names": {
"SizeList": "$names.SizeList",
"SizeSubnames": "$names.SizeSubnames",
"number": "$names.number",
"list": "$names.list",
"day": "$names.day"
}
},
"subnames": { "$push": "$names.subnames" }
}},
{ "$sort": { "_id._id": 1, "_id.names.day": 1 } },
{ "$group": {
"_id": "$_id._id",
"SizeNames": { "$first": "$_id.SizeNames" },
"names": {
"$push": { "$cond": [
{ "$ne": [ "$_id.names.SizeSubnames", 0 ] },
{
"number": "$_id.names.number",
"subnames": "$subnames",
"list": "$_id.names.list",
"SizeList": "$_id.names.SizeList",
"day": "$_id.names.day"
},
{
"number": "$_id.names.number",
"list": "$_id.names.list",
"SizeList": "$_id.names.SizeList",
"day": "$_id.names.day"
}
]}
}
}},
{ "$project": {
"SizeNames": 1,
"names": {
"$cond": [
{ "$ne": [ "$SizeNames", 0 ] },
"$names",
[]
]
}
}}
])
You can kind of "hide away" the original empty array from the inner document as shown, but it's really difficult to remove all presence of the outer "names" array without pulling a similar conditional array "push" technique, and that really isn't a practical approach.
If all of this is just about sorting array elements in individual documents though, the aggregation framework should not be the tool to do this. It can be done as shown, but per document this is much easier to do in client side code.
Output:
{
"_id" : ObjectId("54b5cff8102f292553ce9bb5"),
"SizeNames" : 3,
"names" : [
{
"number" : 1,
"subnames" : [
{
"id" : "X",
"day" : 1
},
{
"id" : "Z",
"day" : 2
},
{
"id" : "Y",
"day" : 10
}
],
"list" : [
"A",
"B",
"C"
],
"SizeList" : 3,
"day" : 1
},
{
"number" : 3,
"subnames" : [
{
"id" : "Z",
"day" : 5
},
{
"id" : "X",
"day" : 8
}
],
"list" : [
"A",
"C"
],
"SizeList" : 2,
"day" : 2
},
{
"number" : 2,
"SizeList" : 0,
"day" : 5
}
]
}
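The client-side alternative mentioned in the answer could look roughly like the following plain-JavaScript sketch. It assumes the document has already been fetched, uses a placeholder string for the _id, and sortNames is a hypothetical helper name; it sorts names and subnames by day in ascending order, as the pipeline above does.

```javascript
// Document shaped like the question's sample; "W" is a placeholder _id.
const doc = {
  _id: "W",
  names: [
    {
      number: 1,
      subnames: [{ id: "X", day: 1 }, { id: "Y", day: 10 }, { id: "Z", day: 2 }],
      list: ["A", "B", "C"],
      day: 1,
    },
    { number: 2, day: 5 },
    {
      number: 3,
      subnames: [{ id: "X", day: 8 }, { id: "Z", day: 5 }],
      list: ["A", "C"],
      day: 2,
    },
  ],
};

// sortNames is a hypothetical helper; it returns a sorted copy and
// leaves the input document untouched.
function sortNames(document) {
  const byDay = (a, b) => a.day - b.day;
  const names = (document.names || [])
    .map((entry) => {
      const copy = { ...entry, SizeList: (entry.list || []).length };
      if (Array.isArray(entry.subnames)) {
        copy.subnames = [...entry.subnames].sort(byDay); // inner sort
      }
      return copy;
    })
    .sort(byDay); // outer sort
  return { ...document, SizeNames: names.length, names };
}

const sortedDoc = sortNames(doc);
console.log(JSON.stringify(sortedDoc, null, 2));
```

Entries without subnames or list are kept as they are, with SizeList set to 0, so no placeholder arrays are needed in this version.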