I use the following collection which represents sports > categories > tournaments.
{
"_id" : ObjectId("597846358bbbc4440895f2e8"),
"Name" : [
{ "k" : "en-US", "v" : "Soccer" },
{ "k" : "fr-FR", "v" : "Football" }
],
"Categories" : [
{
"Name" : [
{ "k" : "en-US", "v" : "France" },
{ "k" : "fr-FR", "v" : "France" }
],
"Tournaments" : [
{
"Name" : [
{ "k" : "en-US", "v" : "Ligue 1" },
{ "k" : "fr-FR", "v" : "Ligue 1" }
],
},
{
"Name" : [
{ "k" : "en-US", "v" : "Ligue 2" },
{ "k" : "fr-FR", "v" : "Ligue 2" }
],
}
]
},
{
"Name" : [
{ "k" : "en-US", "v" : "England" },
{ "k" : "fr-FR", "v" : "Angleterre" }
],
"Tournaments" : [
{
"Name" : [
{ "k" : "en-US", "v" : "Premier League" },
{ "k" : "fr-FR", "v" : "Premier League" }
],
},
{
"Name" : [
{ "k" : "en-US", "v" : "Championship" },
{ "k" : "fr-FR", "v" : "Championnat" }
],
}
]
},
]
}
I want to query the collection using the category’s name and the tournament’s name. I’ve successfully used “$elemMatch” with the following code:
db.getCollection('Sport').find({
Categories: {
$elemMatch: {
Name: {
$elemMatch: { v: "France" }
},
Tournaments: {
$elemMatch: {
Name: {
$elemMatch: { v: "Ligue 1" }
}
}
}
}
} },
{ "Categories.$": 1, Name: 1 })
However, this returns the whole matching category; I cannot get only the matching tournament inside the category object.
Using the answer in this question: MongoDB Projection of Nested Arrays, I’ve built an aggregation:
db.getCollection('Sport').aggregate([{
"$match": {
"Categories": {
"$elemMatch": {
"Name": {
"$elemMatch": {
"v": "France"
}
},
"Tournaments": {
"$elemMatch": {
"Name": {
"$elemMatch": {
"v": "Ligue 1"
}
}
}
}
}
}
}
}, {
"$addFields": {
"Categories": {
"$filter": {
"input": {
"$map": {
"input": "$Categories",
"as": "category",
"in": {
"Tournaments": {
"$filter": {
"input": "$$category.Tournaments",
"as": "tournament",
"cond": {
// stuck here
}
}
}
}
}
},
"as": "category",
"cond": {
// stuck here
}
}
}
}
}
])
I tried to use a condition with $anyElementTrue and then $map on the “Name” property, but MongoDB doesn’t recognize $$KEEP and $$PRUNE (from $redact) there and reports "Use of undefined variable".
My question: how can I check that the collection of names contains my string?
I'm more surprised that in the answer you reference I did not "strongly recommend you do not nest arrays" like this. Arrays nested in this way are impossible to update atomically until the next release of MongoDB, and they are notoriously difficult to query.
For this particular case you would do:
db.getCollection('Sport').aggregate([
{ "$match": {
"Categories": {
"$elemMatch": {
"Name.v": "France",
"Tournaments.Name.v": "Ligue 1"
}
}
}},
{ "$addFields": {
"Categories": {
"$filter": {
"input": {
"$map": {
"input": "$Categories",
"as": "c",
"in": {
"Name": {
"$filter": {
"input": "$$c.Name",
"as": "n",
"cond": { "$eq": [ "$$n.v", "France" ] }
}
},
"Tournaments": {
"$filter": {
"input": {
"$map": {
"input": "$$c.Tournaments",
"as": "t",
"in": {
"Name": {
"$filter": {
"input": "$$t.Name",
"as": "n",
"cond": {
"$eq": [ "$$n.v", "Ligue 1" ]
}
}
}
}
}
},
"as": "t",
"cond": {
"$ne": [{ "$size": "$$t.Name" }, 0]
}
}
}
}
}
},
"as": "c",
"cond": {
"$and": [
{ "$ne": [{ "$size": "$$c.Name" },0] },
{ "$ne": [{ "$size": "$$c.Tournaments" },0] }
]
}
}
}
}}
])
Which returns the result:
/* 1 */
{
"_id" : ObjectId("597846358bbbc4440895f2e8"),
"Name" : [
{
"k" : "en-US",
"v" : "Soccer"
},
{
"k" : "fr-FR",
"v" : "Football"
}
],
"Categories" : [
{
"Name" : [
{
"k" : "en-US",
"v" : "France"
},
{
"k" : "fr-FR",
"v" : "France"
}
],
"Tournaments" : [
{
"Name" : [
{
"k" : "en-US",
"v" : "Ligue 1"
},
{
"k" : "fr-FR",
"v" : "Ligue 1"
}
]
}
]
}
]
}
The whole point is that each array needs a $filter, and at the outer levels you are looking for $size not being 0 as a result of "inner" $filter operations on contained arrays.
Since the "inner" arrays can change in content as a result, the "outer" arrays need a $map in order to return the "changed" elements.
So in terms of the structure, "Categories" needs a $map because it has inner elements, and the "inner" "Tournaments" needs a $map for the same reason. Every array all the way down to the final properties needs a $filter, and each wrapping array with a $map has a $filter with a $size condition.
That's the general logic pattern, and it works by repeating that pattern for each nested level. As stated though, it's pretty horrible. Which is why you really should avoid "nesting" like this at all costs. The increased complexity just about always outweighs any perceived gains.
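If it helps to see the shape of that pattern outside the pipeline syntax, here is a rough plain JavaScript analogue run against an in-memory copy of the sample document. It is only a sketch: pruneCategories and filterNames are made-up helper names, and Array.prototype.map / filter merely stand in for $map / $filter.

```javascript
// Plain JavaScript analogue of the nested $map + $filter pattern.
// Keep only the name entries whose value equals `wanted`.
const filterNames = (names, wanted) => names.filter(n => n.v === wanted);

// Hypothetical helper, not part of any MongoDB API.
function pruneCategories(doc, categoryName, tournamentName) {
  const categories = doc.Categories
    // "$map": rebuild each category from its filtered inner arrays.
    .map(c => ({
      Name: filterNames(c.Name, categoryName),
      Tournaments: c.Tournaments
        .map(t => ({ Name: filterNames(t.Name, tournamentName) }))
        // "$filter" with a "$size" test: drop tournaments left without names.
        .filter(t => t.Name.length !== 0)
    }))
    // Outer "$filter": drop categories whose inner arrays came back empty.
    .filter(c => c.Name.length !== 0 && c.Tournaments.length !== 0);
  return { ...doc, Categories: categories };
}

const sport = {
  Name: [{ k: "en-US", v: "Soccer" }, { k: "fr-FR", v: "Football" }],
  Categories: [
    {
      Name: [{ k: "en-US", v: "France" }, { k: "fr-FR", v: "France" }],
      Tournaments: [
        { Name: [{ k: "en-US", v: "Ligue 1" }, { k: "fr-FR", v: "Ligue 1" }] },
        { Name: [{ k: "en-US", v: "Ligue 2" }, { k: "fr-FR", v: "Ligue 2" }] }
      ]
    },
    {
      Name: [{ k: "en-US", v: "England" }, { k: "fr-FR", v: "Angleterre" }],
      Tournaments: [
        { Name: [{ k: "en-US", v: "Premier League" }, { k: "fr-FR", v: "Premier League" }] },
        { Name: [{ k: "en-US", v: "Championship" }, { k: "fr-FR", v: "Championnat" }] }
      ]
    }
  ]
};

const pruned = pruneCategories(sport, "France", "Ligue 1");
console.log(JSON.stringify(pruned.Categories, null, 2));
```

Each extra level of nesting adds one more map-then-filter pair, which is exactly why the pipeline version grows so quickly.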
I should also note you went a little overboard with $elemMatch. You really only need it at the "Categories" array level, since that's the only array with multiple conditions to be met by the same element.
The sub-elements can use plain "Dot Notation" since they each have only a "singular" condition within their respective arrays. That cuts down on the syntax somewhat and still matches exactly the same documents.
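To make the difference concrete, here is a small plain JavaScript model of the two matching semantics. It is not MongoDB code, just a sketch with made-up function names: one function behaves like two dotted conditions applied directly to "Categories", the other like $elemMatch on "Categories".

```javascript
// Plain JavaScript model of the two matching semantics (not MongoDB code).
const hasName = (names, wanted) => names.some(n => n.v === wanted);

// Dot notation on the top-level array: each condition may be met by a
// different element of Categories.
function matchesWithDotNotation(doc, category, tournament) {
  const anyCategory = doc.Categories.some(c => hasName(c.Name, category));
  const anyTournament = doc.Categories.some(c =>
    c.Tournaments.some(t => hasName(t.Name, tournament)));
  return anyCategory && anyTournament;
}

// $elemMatch on Categories: both conditions must hold for the same element.
function matchesWithElemMatch(doc, category, tournament) {
  return doc.Categories.some(c =>
    hasName(c.Name, category) &&
    c.Tournaments.some(t => hasName(t.Name, tournament)));
}

const sport = {
  Categories: [
    { Name: [{ k: "en-US", v: "France" }],
      Tournaments: [{ Name: [{ k: "en-US", v: "Ligue 1" }] }] },
    { Name: [{ k: "en-US", v: "England" }],
      Tournaments: [{ Name: [{ k: "en-US", v: "Premier League" }] }] }
  ]
};

console.log(matchesWithDotNotation(sport, "France", "Premier League")); // true
console.log(matchesWithElemMatch(sport, "France", "Premier League"));   // false
console.log(matchesWithElemMatch(sport, "France", "Ligue 1"));          // true
```

Inside a single category there is only one condition per inner array, so the dotted form and a nested $elemMatch select the same elements there.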
I have a collection with child reference. Each document can have multiple parents.
How can I query it with $graphLookup in order to make a result prepared for a treeview?
Example:
{
"_id" : ObjectId("6143450cc0318c23d8f18424"),
"id" : "3",
"name" : "prod03",
"children" : [
{
"_id" : "6143440ac0318c23d8f1841f",
"qty" : 10
},
{
"_id" : "614344b1c0318c23d8f18422",
"qty" : 100
}
],
"totalQty" : 110
},
{
"_id" : ObjectId("614344b1c0318c23d8f18422"),
"id" : "2",
"name" : "prod02",
"children" : [ ],
"totalQty" : 100
},
{
"_id" : ObjectId("6143440ac0318c23d8f1841f"),
"id" : "1",
"name" : "prod01",
"children" : [ ],
"totalQty" : 10
}
Prod03 is formed from prod01 and prod02
The desired result would be like:
{
id: '3',
name: 'prod03',
totalQty: 110,
children: [
{
id: '1',
name: 'prod01',
qty: 10
},
{
id: '2',
name: 'prod02',
qty: 100
},
],
}
The query must go multiple levels down until it finds no more children.
The final result would be a tree with the whole history of the product's manufacturing components.
You are actually on the right track with $graphLookup. You just need to convert children._id from a string back to an ObjectId for the lookup.
db.collection.aggregate([
{
"$match": {
"id": "3"
}
},
{
"$addFields": {
"children": {
"$map": {
"input": "$children",
"as": "c",
"in": {
"_id": {
"$toObjectId": "$$c._id"
},
"qty": "$$c.qty"
}
}
}
}
},
{
"$graphLookup": {
"from": "collection",
"startWith": "$children._id",
"connectFromField": "children._id",
"connectToField": "_id",
"as": "children"
}
},
{
"$addFields": {
"children": {
"$map": {
"input": "$children",
"as": "c",
"in": {
"id": "$$c.id",
"name": "$$c.name",
"qty": "$$c.totalQty"
}
}
}
}
}
])
Here is the Mongo playground for your reference.
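For intuition about what the $graphLookup stage is doing, here is a plain JavaScript sketch of the same traversal over an in-memory copy of the three sample documents. The helper name collectDescendants is made up, and the _id values are kept as plain strings so no ObjectId conversion is needed here. Like $graphLookup, it returns the descendants as one flat list rather than a nested tree.

```javascript
// In-memory stand-in for the collection; _id values are plain strings here.
const products = [
  { _id: "6143450cc0318c23d8f18424", id: "3", name: "prod03",
    children: [{ _id: "6143440ac0318c23d8f1841f", qty: 10 },
               { _id: "614344b1c0318c23d8f18422", qty: 100 }],
    totalQty: 110 },
  { _id: "614344b1c0318c23d8f18422", id: "2", name: "prod02", children: [], totalQty: 100 },
  { _id: "6143440ac0318c23d8f1841f", id: "1", name: "prod01", children: [], totalQty: 10 }
];

// Hypothetical helper: collect every descendant reachable through
// children._id, following the same link that connectFromField and
// connectToField describe in the pipeline.
function collectDescendants(root, collection) {
  const byId = new Map(collection.map(p => [p._id, p]));
  const seen = new Set();
  const queue = root.children.map(c => c._id);
  const found = [];
  while (queue.length > 0) {
    const id = queue.shift();
    if (seen.has(id) || !byId.has(id)) continue; // skip repeats and dangling refs
    seen.add(id);
    const doc = byId.get(id);
    found.push({ id: doc.id, name: doc.name, qty: doc.totalQty });
    queue.push(...doc.children.map(c => c._id));
  }
  return found;
}

const root = products.find(p => p.id === "3");
const result = {
  id: root.id,
  name: root.name,
  totalQty: root.totalQty,
  children: collectDescendants(root, products)
};
console.log(result);
```

If you need a real nested tree for the treeview, you would still have to rebuild it from this flat list in application code.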
{
"no" : "2020921008981",
"date" : ISODate("2020-04-01T05:19:02.263+0000"),
"sale" : {
"soldItems" : [
{
"itemId" : "5b55ac7f0550de00210a3b24",
"qty" : NumberInt(1),
},
{
"itemId" : "5b55ac7f0550de00210a3b25",
"qty" : NumberInt(2),
}
],
"items" : [
{
"_id" : ObjectId("5b55ac7f0550de00210a3b24"),
unit :"KG"
},
{
"_id" : ObjectId("5b55ac7f0550de00210a3b25"),
unit :"ML"
}
]
}
}
Desired output :
{
"no" : "2020921008981",
"sale" : {},
"qtyList" : "1 KG \n 2 ML"
}
To build the itemQtyList output field, two fields from different arrays (a string and an int) must be combined. I couldn't find any reference for doing that. Any ideas would be appreciated.
You can use the aggregation below:
db.collection.aggregate([
{ "$project": {
"itemQtyList": {
"$reduce": {
"input": { "$range": [0, { "$size": "$sale.soldItems" }] },
"initialValue": "",
"in": {
"$concat": [
"$$value",
{ "$cond": [{ "$eq": ["$$this", 0] }, "", " \n "] },
{ "$toString": {
"$arrayElemAt": [
"$sale.soldItems.qty",
"$$this"
]
}},
" ",
{ "$arrayElemAt": ["$sale.items.unit", "$$this"] }
]
}
}
}
}}
])
MongoPlayground
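The pipeline above walks both arrays by index with $range and $arrayElemAt. As a rough plain JavaScript equivalent of that idea (a sketch only; buildItemQtyList is a made-up name, and it assumes, as the pipeline does, that soldItems and items line up by position):

```javascript
// Pair soldItems[i].qty with items[i].unit by position and join with " \n ".
function buildItemQtyList(sale) {
  return sale.soldItems
    .map((sold, i) => `${sold.qty} ${sale.items[i].unit}`)
    .join(" \n ");
}

const sale = {
  soldItems: [
    { itemId: "5b55ac7f0550de00210a3b24", qty: 1 },
    { itemId: "5b55ac7f0550de00210a3b25", qty: 2 }
  ],
  items: [
    { _id: "5b55ac7f0550de00210a3b24", unit: "KG" },
    { _id: "5b55ac7f0550de00210a3b25", unit: "ML" }
  ]
};

console.log(JSON.stringify(buildItemQtyList(sale))); // "1 KG \n 2 ML"
```

If the two arrays are not guaranteed to be in the same order, matching on itemId instead of position would be the safer choice.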
Using the aggregation framework, I want to create a new array by filtering another array on a specific field of the original array's elements.
In this case, I want to filter by the field "fieldName".
I always want to filter out the last occurrence.
Example:
I have one document:
{
"fullyQualifiedName" : "MongoDB",
"items" : [
{
"fieldName" : "_id",
"fieldCount" : 7,
"confidence_level" : 1,
"fieldClassifications" : [
"LineageGuid"
],
},
{
"fieldName" : "_id",
"fieldCount" : 7,
"fieldClassifications" : [
{
"classificationName" : "LineageGuid",
}
]
},
{
"fieldName" : "details",
"fieldCount" : 7,
},
{
"fieldName" : "state",
"fieldCount" : 7,
}
]
}
I want to create a new array like:
"items" : [
{
"fieldName" : "_id",
"fieldCount" : 7,
"confidence_level" : 1,
"fieldClassifications" : [
"LineageGuid"
],
},
{
"fieldName" : "details",
"fieldCount" : 7,
},
{
"fieldName" : "state",
"fieldCount" : 7,
}
]
The simple solution is to $unwind and then $group again, but I can't do that because of performance issues.
I am using MongoDB 3.4.
You can use the aggregation below:
db.collection.aggregate([
{ "$addFields": {
"items": {
"$map": {
"input": {
"$setUnion": [
{ "$map": {
"input": "$items",
"in": { "$indexOfArray": ["$items.fieldName", "$$this.fieldName"] }
}}
]
},
"as": "i",
"in": {
"fieldName": { "$arrayElemAt": ["$items.fieldName", "$$i"] },
"fieldCount": { "$arrayElemAt": ["$items.fieldCount", "$$i"] },
"confidence_level": { "$arrayElemAt": ["$items.confidence_level", "$$i"] },
"fieldClassifications": { "$arrayElemAt": ["$items.fieldClassifications", "$$i"] }
}
}
}
}}
])
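The trick in that pipeline is that $indexOfArray always returns the position of the first match, so every duplicate maps to the same index and $setUnion collapses them. A minimal plain JavaScript sketch of the same idea (firstPerFieldName is a made-up name, and the sample array is trimmed from the question):

```javascript
// Keep the first item for each fieldName, mirroring $indexOfArray + $setUnion.
function firstPerFieldName(items) {
  const names = items.map(item => item.fieldName);
  // indexOf returns the position of the first occurrence of each name,
  // and the Set removes the repeated positions.
  const firstIndexes = [...new Set(names.map(name => names.indexOf(name)))];
  return firstIndexes.map(i => items[i]);
}

const items = [
  { fieldName: "_id", fieldCount: 7, confidence_level: 1, fieldClassifications: ["LineageGuid"] },
  { fieldName: "_id", fieldCount: 7, fieldClassifications: [{ classificationName: "LineageGuid" }] },
  { fieldName: "details", fieldCount: 7 },
  { fieldName: "state", fieldCount: 7 }
];

console.log(firstPerFieldName(items).map(item => item.fieldName));
// [ '_id', 'details', 'state' ]
```

Unlike this sketch, the pipeline rebuilds each element field by field, so any property you want to keep has to be listed explicitly there.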
EDIT:
Our use case:
We get continuous reports from servers about visitors. We pre-aggregate the data on the servers for a few seconds and then insert these "reports" into MongoDB.
In our dashboard we would like to query the different browsers, OSes, geolocation (country etc.) based on time ranges.
So like: Within the last 7 days, there were 1000 visitors using Chrome, 500 from Germany, 200 from England and so on.
I'm pretty stuck with a MongoDB query we need for our dashboard.
We have the following report entries:
{
"_id" : ObjectId("59b9d08e402025326e1a0f30"),
"channel_perm_id" : "c361049fb4144b0e81b71c0b6cfdc296",
"source_id" : "insomnia",
"start_timestamp" : ISODate("2017-09-14T00:42:54.510Z"),
"end_timestamp" : ISODate("2017-09-14T00:42:54.510Z"),
"timestamp" : ISODate("2017-09-14T00:42:54.510Z"),
"resource_uri" : "b755d62a-8c0a-4e8a-945f-41782c13535b",
"sources_info" : {
"browsers" : [
{
"name" : "Chrome",
"count" : NumberLong(2)
}
],
"operating_systems" : [
{
"name" : "Mac OS X",
"count" : NumberLong(2)
}
],
"continent_ids" : [
{
"name" : "EU",
"count" : NumberLong(1)
}
],
"country_ids" : [
{
"name" : "DE",
"count" : NumberLong(1)
}
],
"city_ids" : [
{
"name" : "Solingen",
"count" : NumberLong(1)
}
]
},
"unique_sources" : NumberLong(1),
"requests" : NumberLong(1),
"cache_hits" : NumberLong(0),
"cache_misses" : NumberLong(1),
"cache_hit_size" : NumberLong(0),
"cache_refill_size" : NumberLong("170000000000")
}
Now, we need to aggregate these reports based on timestamp.
So far, so easy:
db.channel_report.aggregate([{
$group: {
_id: {
$dateToString: {
format: "%Y",
date: "$timestamp"
}
},
sources_info: {
$push: "$sources_info"
}
},
}]);
But now it gets difficult for me. As you might have already noticed, the sources_info object is the problem.
Instead of just "pushing" all the sources info into an array per group, we need to actually accumulate it.
So, if we have something like this:
{
sources_info: [
{
browsers: [
{
name: "Chrome",
count: 1
}
]
},
{
browsers: [
{
name: "Chrome",
count: 1
}
]
}
]
}
The array should be reduced to this:
{
sources_info:
{
browsers: [
{
name: "Chrome",
count: 2
}
]
}
}
We migrated from MySQL to MongoDB for analytics, but I have no clue how to model this behaviour in Mongo. Going by the docs, I almost think it is not possible, at least not with the current data structure.
Is there a nice solution for this? Or maybe even a different kind of data structure?
Cheers,
Chris from StriveCDN
The basic problem is that you are using "named keys" where you should probably be using values on a consistent attribute path instead. That is, instead of keys like "browsers", each entry should simply carry "type": "browser", and so on.
The reasoning for this should become apparent from the general approaches to aggregating the data. It also really helps with querying in general. Both approaches basically coerce your initial data format into this kind of structure first, in order to aggregate it.
With most recent releases ( MongoDB 3.4.4 and greater ), we can work with your named keys via $objectToArray and manipulate as follows:
db.channel_report.aggregate([
{ "$project": {
"timestamp": 1,
"sources": {
"$reduce": {
"input": {
"$map": {
"input": { "$objectToArray": "$sources_info" },
"as": "s",
"in": {
"$map": {
"input": "$$s.v",
"as": "v",
"in": {
"type": "$$s.k",
"name": "$$v.name",
"count": "$$v.count"
}
}
}
}
},
"initialValue": [],
"in": { "$concatArrays": ["$$value", "$$this"] }
}
}
}},
{ "$unwind": "$sources" },
{ "$group": {
"_id": {
"year": { "$year": "$timestamp" },
"type": "$sources.type",
"name": "$sources.name"
},
"count": { "$sum": "$sources.count" }
}},
{ "$group": {
"_id": { "year": "$_id.year", "type": "$_id.type" },
"v": { "$push": { "name": "$_id.name", "count": "$count" } }
}},
{ "$group": {
"_id": "$_id.year",
"sources_info": {
"$push": { "k": "$_id.type", "v": "$v" }
}
}},
{ "$addFields": {
"sources_info": { "$arrayToObject": "$sources_info" }
}}
])
Taking that back a notch to MongoDB 3.4 ( which should be default on most hosted services by now ) you could alternately declare each key name manually:
db.channel_report.aggregate([
{ "$project": {
"timestamp": 1,
"sources": {
"$concatArrays": [
{ "$map": {
"input": "$sources_info.browsers",
"in": {
"type": "browsers",
"name": "$$this.name",
"count": "$$this.count"
}
}},
{ "$map": {
"input": "$sources_info.operating_systems",
"in": {
"type": "operating_systems",
"name": "$$this.name",
"count": "$$this.count"
}
}},
{ "$map": {
"input": "$sources_info.continent_ids",
"in": {
"type": "continent_ids",
"name": "$$this.name",
"count": "$$this.count"
}
}},
{ "$map": {
"input": "$sources_info.country_ids",
"in": {
"type": "country_ids",
"name": "$$this.name",
"count": "$$this.count"
}
}},
{ "$map": {
"input": "$sources_info.city_ids",
"in": {
"type": "city_ids",
"name": "$$this.name",
"count": "$$this.count"
}
}}
]
}
}},
{ "$unwind": "$sources" },
{ "$group": {
"_id": {
"year": { "$year": "$timestamp" },
"type": "$sources.type",
"name": "$sources.name"
},
"count": { "$sum": "$sources.count" }
}},
{ "$group": {
"_id": { "year": "$_id.year", "type": "$_id.type" },
"v": { "$push": { "name": "$_id.name", "count": "$count" } }
}},
{ "$group": {
"_id": "$_id.year",
"sources": {
"$push": { "k": "$_id.type", "v": "$v" }
}
}},
{ "$project": {
"sources_info": {
"browsers": {
"$arrayElemAt": [
"$sources.v",
{ "$indexOfArray": [ "$sources.k", "browsers" ] }
]
},
"operating_systems": {
"$arrayElemAt": [
"$sources.v",
{ "$indexOfArray": [ "$sources.k", "operating_systems" ] }
]
},
"continent_ids": {
"$arrayElemAt": [
"$sources.v",
{ "$indexOfArray": [ "$sources.k", "continent_ids" ] }
]
},
"country_ids": {
"$arrayElemAt": [
"$sources.v",
{ "$indexOfArray": [ "$sources.k", "country_ids" ] }
]
},
"city_ids": {
"$arrayElemAt": [
"$sources.v",
{ "$indexOfArray": [ "$sources.k", "city_ids" ] }
]
}
}
}}
])
We can even wind that back to MongoDB 3.2 by using $map and $filter in place of $indexOfArray, but the general approach is the main thing to explain.
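As an untested sketch of what that substitution could look like, the helper below builds the final $project stage using only $filter, $map and $arrayElemAt, which as far as I know are all available in MongoDB 3.2. The helper name valueForKey is made up, and "$sources" is assumed to be the array of { k, v } entries produced by the last $group stage above.

```javascript
// Hypothetical helper: build an expression that picks the "v" array for one
// key out of [{ k, v }, ...] using only $filter, $map and $arrayElemAt.
function valueForKey(arrayPath, key) {
  return {
    $arrayElemAt: [
      {
        $map: {
          input: {
            $filter: {
              input: arrayPath,
              as: "s",
              cond: { $eq: ["$$s.k", key] }
            }
          },
          as: "s",
          in: "$$s.v"
        }
      },
      0
    ]
  };
}

const keys = ["browsers", "operating_systems", "continent_ids", "country_ids", "city_ids"];

const sourcesInfo = {};
for (const key of keys) {
  sourcesInfo[key] = valueForKey("$sources", key);
}

// Final $project stage to use in place of the $indexOfArray version.
const projectStage = { $project: { sources_info: sourcesInfo } };

console.log(JSON.stringify(projectStage, null, 2));
```

The $filter keeps the single entry whose "k" matches, the $map pulls out its "v", and $arrayElemAt unwraps the one-element result.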
Concatenate arrays
The main thing that needs to happen is to take the data from the many different arrays with named keys and make a "single array" with a "type" property representing each key name. This is arguably how the data should be stored in the first place, and the first aggregation stage of either approach comes out like this:
/* 1 */
{
"_id" : ObjectId("59b9d08e402025326e1a0f30"),
"timestamp" : ISODate("2017-09-14T00:42:54.510Z"),
"sources" : [
{
"type" : "browsers",
"name" : "Chrome",
"count" : NumberLong(2)
},
{
"type" : "operating_systems",
"name" : "Mac OS X",
"count" : NumberLong(2)
},
{
"type" : "continent_ids",
"name" : "EU",
"count" : NumberLong(1)
},
{
"type" : "country_ids",
"name" : "DE",
"count" : NumberLong(1)
},
{
"type" : "city_ids",
"name" : "Solingen",
"count" : NumberLong(1)
}
]
}
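The same reshaping can be written in a few lines of plain JavaScript, which may make it easier to see what the first stage is doing. This is only an illustration on the sample sources_info; flattenSources is a made-up name.

```javascript
// Turn { browsers: [...], operating_systems: [...], ... } into one array of
// { type, name, count } entries, as $objectToArray + $map + $concatArrays do.
function flattenSources(sourcesInfo) {
  return Object.entries(sourcesInfo).flatMap(([type, entries]) =>
    entries.map(({ name, count }) => ({ type, name, count })));
}

const sourcesInfo = {
  browsers: [{ name: "Chrome", count: 2 }],
  operating_systems: [{ name: "Mac OS X", count: 2 }],
  continent_ids: [{ name: "EU", count: 1 }],
  country_ids: [{ name: "DE", count: 1 }],
  city_ids: [{ name: "Solingen", count: 1 }]
};

const sources = flattenSources(sourcesInfo);
console.log(sources);
```

Because the key name now travels with each entry as "type", nothing downstream needs to know the list of keys in advance.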
Unwind and Group
Part of the data you want to accumulate on actually includes those "type" and "name" properties from "within" the array. Whenever you need to accumulate across documents from "within an array", the process you use is $unwind in order to be able to access those values as part of the grouping key.
What this means is that after using $unwind on the combined array, you then want to $group on both of those keys and the reduced "timestamp" detail in order to $sum the "count" values.
Since you then have "sub-levels" of detail ( i.e each name of browser within browsers ) then you use additional $group pipeline stages, gradually decreasing the granularity of the grouping keys and using $push to accumulate the details into arrays.
In either case, omitting the very last stage of output, the accumulated structure comes out as:
/* 1 */
{
"_id" : 2017,
"sources_info" : [
{
"k" : "continent_ids",
"v" : [
{
"name" : "EU",
"count" : NumberLong(1)
}
]
},
{
"k" : "city_ids",
"v" : [
{
"name" : "Solingen",
"count" : NumberLong(1)
}
]
},
{
"k" : "country_ids",
"v" : [
{
"name" : "DE",
"count" : NumberLong(1)
}
]
},
{
"k" : "browsers",
"v" : [
{
"name" : "Chrome",
"count" : NumberLong(2)
}
]
},
{
"k" : "operating_systems",
"v" : [
{
"name" : "Mac OS X",
"count" : NumberLong(2)
}
]
}
]
}
This really is the final state of the data, though not represented in the same form as it was originally found. It is arguably complete at this point as any further processing is merely cosmetic to output as named keys again.
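To illustrate the unwind-and-group steps, here is a plain JavaScript sketch that sums the counts per year, type and name and then nests the totals. The rows are invented sample data in the shape the documents have after $unwind, accumulate is a made-up name, and for brevity the sketch nests straight into named keys instead of producing the intermediate "k"/"v" arrays.

```javascript
// Rows as they look after $unwind: one { year, type, name, count } per entry.
// These values are invented for illustration.
const rows = [
  { year: 2017, type: "browsers", name: "Chrome", count: 1 },
  { year: 2017, type: "browsers", name: "Chrome", count: 1 },
  { year: 2017, type: "browsers", name: "Firefox", count: 3 },
  { year: 2017, type: "country_ids", name: "DE", count: 1 }
];

// Hypothetical helper: sum counts per (year, type, name), then nest the
// totals as year -> type -> [{ name, count }], like the three $group stages.
function accumulate(rows) {
  const totals = new Map();
  for (const { year, type, name, count } of rows) {
    const key = JSON.stringify([year, type, name]);
    const entry = totals.get(key) || { year, type, name, count: 0 };
    entry.count += count;
    totals.set(key, entry);
  }
  const byYear = {};
  for (const { year, type, name, count } of totals.values()) {
    byYear[year] = byYear[year] || {};
    byYear[year][type] = byYear[year][type] || [];
    byYear[year][type].push({ name, count });
  }
  return byYear;
}

const accumulated = accumulate(rows);
console.log(JSON.stringify(accumulated, null, 2));
```

The first loop plays the role of the finest-grained $group with $sum, and the second loop corresponds to the coarser $group stages that $push the details back into arrays.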
Output to named keys
As shown, the two approaches either look up the array entries by the matching key name or use $arrayToObject to transform the array content back into an object with named keys.
An alternative is to simply do that very last manipulation in code, as shown by this .map() example of manipulating the cursor result in the shell:
db.channel_report.aggregate([
{ "$project": {
"timestamp": 1,
"sources": {
"$reduce": {
"input": {
"$map": {
"input": { "$objectToArray": "$sources_info" },
"as": "s",
"in": {
"$map": {
"input": "$$s.v",
"as": "v",
"in": {
"type": "$$s.k",
"name": "$$v.name",
"count": "$$v.count"
}
}
}
}
},
"initialValue": [],
"in": { "$concatArrays": ["$$value", "$$this"] }
}
}
}},
{ "$unwind": "$sources" },
{ "$group": {
"_id": {
"year": { "$year": "$timestamp" },
"type": "$sources.type",
"name": "$sources.name"
},
"count": { "$sum": "$sources.count" }
}},
{ "$group": {
"_id": { "year": "$_id.year", "type": "$_id.type" },
"v": { "$push": { "name": "$_id.name", "count": "$count" } }
}},
{ "$group": {
"_id": "$_id.year",
"sources_info": {
"$push": { "k": "$_id.type", "v": "$v" }
}
}},
/*
{ "$addFields": {
"sources_info": { "$arrayToObject": "$sources_info" }
}}
*/
]).map( d => Object.assign(d,{
"sources_info": d.sources_info.reduce((acc,curr) =>
Object.assign(acc,{ [curr.k]: curr.v }),{})
}))
Which of course applies to either aggregation pipeline approach.
And of course even $concatArrays can be replaced with $setUnion, as long as all the entries have a unique identifying combination of "name" and "type" (as they appear to). Combined with modifying the final output by processing the cursor instead, that means you can apply the technique even as far back as MongoDB 2.6.
Final Output
And the final output (actually aggregated of course, but the question only samples one document) accumulates all the sub-keys and is reconstructed from the last sample output as follows:
{
"_id" : 2017,
"sources_info" : {
"continent_ids" : [
{
"name" : "EU",
"count" : NumberLong(1)
}
],
"city_ids" : [
{
"name" : "Solingen",
"count" : NumberLong(1)
}
],
"country_ids" : [
{
"name" : "DE",
"count" : NumberLong(1)
}
],
"browsers" : [
{
"name" : "Chrome",
"count" : NumberLong(2)
}
],
"operating_systems" : [
{
"name" : "Mac OS X",
"count" : NumberLong(2)
}
]
}
}
Where every array entry under each key of sources_info is reduced down to its cumulative count for every other entry sharing the same "name".
Essentially I'm trying to filter OUT subdocuments and sub-subdocuments that have been "trashed". Here's a stripped-down version of my schema:
permitSchema = {
_id,
name,
...
feeClassifications: [
new Schema({
_id,
_trashed,
name,
fees: [
new Schema({
_id,
_trashed,
name,
amount
})
]
})
],
...
}
So I'm able to get the effect I want with feeClassifications. But I'm struggling to find a way to have the same effect for feeClassifications.fees as well.
So, this works as desired:
Permit.aggregate([
{ $match: { _id: mongoose.Types.ObjectId(req.params.id) }},
{ $project: {
_id: 1,
_name: 1,
feeClassifications: {
$filter: {
input: '$feeClassifications',
as: 'item',
cond: { $not: {$gt: ['$$item._trashed', null] } }
}
}
}}
])
But I also want to filter the nested array fees. I've tried a few things including:
Permit.aggregate([
{ $match: { _id: mongoose.Types.ObjectId(req.params.id) }},
{ $project: {
_id: 1,
_name: 1,
feeClassifications: {
$filter: {
input: '$feeClassifications',
as: 'item',
cond: { $not: {$gt: ['$$item._trashed', null] } }
},
fees: {
$filter: {
input: '$fees',
as: 'fee',
cond: { $not: {$gt: ['$$fee._trashed', null] } }
}
}
}
}}
])
Which seems to follow the mongodb docs the closest. But I get the error:
this object is already an operator expression, and can't be used as a document expression (at 'fees')
Update: -----------
As requested, here's a sample document:
{
"_id" : ObjectId("57803fcd982971e403e3e879"),
"_updated" : ISODate("2016-07-11T19:24:27.204Z"),
"_created" : ISODate("2016-07-09T00:05:33.274Z"),
"name" : "Single Event",
"feeClassifications" : [
{
"_updated" : ISODate("2016-07-11T19:05:52.418Z"),
"_created" : ISODate("2016-07-11T17:49:12.247Z"),
"name" : "Event Type 1",
"_id" : ObjectId("5783dc18e09be99840fad29f"),
"fees" : [
{
"_updated" : ISODate("2016-07-11T18:51:10.259Z"),
"_created" : ISODate("2016-07-11T18:41:16.110Z"),
"name" : "Basic Fee",
"amount" : 156.5,
"_id" : ObjectId("5783e84cc46a883349bb2339")
},
{
"_updated" : ISODate("2016-07-11T19:05:52.419Z"),
"_created" : ISODate("2016-07-11T19:05:47.340Z"),
"name" : "Secondary Fee",
"amount" : 50,
"_id" : ObjectId("5783ee0bad7bf8774f6f9b5f"),
"_trashed" : ISODate("2016-07-11T19:05:52.410Z")
}
]
},
{
"_updated" : ISODate("2016-07-11T18:22:21.567Z"),
"_created" : ISODate("2016-07-11T18:22:21.567Z"),
"name" : "Event Type 2",
"_id" : ObjectId("5783e3dd540078de45bbbfaf"),
"_trashed" : ISODate("2016-07-11T19:24:27.203Z")
}
]
}
And here's the desired output ("trashed" subdocuments are excluded from BOTH feeClassifications AND fees):
{
"_id" : ObjectId("57803fcd982971e403e3e879"),
"_updated" : ISODate("2016-07-11T19:24:27.204Z"),
"_created" : ISODate("2016-07-09T00:05:33.274Z"),
"name" : "Single Event",
"feeClassifications" : [
{
"_updated" : ISODate("2016-07-11T19:05:52.418Z"),
"_created" : ISODate("2016-07-11T17:49:12.247Z"),
"name" : "Event Type 1",
"_id" : ObjectId("5783dc18e09be99840fad29f"),
"fees" : [
{
"_updated" : ISODate("2016-07-11T18:51:10.259Z"),
"_created" : ISODate("2016-07-11T18:41:16.110Z"),
"name" : "Basic Fee",
"amount" : 156.5,
"_id" : ObjectId("5783e84cc46a883349bb2339")
}
]
}
]
}
Since we want to filter both the outer and inner array fields, we can use the $map operator, which returns an array with the "values" we want.
In the $map expression, we use a $cond expression with a nested $filter to remove the non-matching subdocuments from both the outer array and the inner "fees" array.
The conditions use $lt, which returns true when the "_trashed" field is absent from the subdocument, because a missing field compares lower than a number while a date compares higher.
Note that in the $cond expression we return false for the <false case>, so we then need to apply $filter to the $map result to remove every false value.
Permit.aggregate(
[
{ "$match": { "_id": mongoose.Types.ObjectId(req.params.id) } },
{ "$project": {
"_updated": 1,
"_created": 1,
"name": 1,
"feeClassifications": {
"$filter": {
"input": {
"$map": {
"input": "$feeClassifications",
"as": "fclass",
"in": {
"$cond": [
{ "$lt": [ "$$fclass._trashed", 0 ] },
{
"_updated": "$$fclass._updated",
"_created": "$$fclass._created",
"name": "$$fclass.name",
"_id": "$$fclass._id",
"fees": {
"$filter": {
"input": "$$fclass.fees",
"as": "fees",
"cond": { "$lt": [ "$$fees._trashed", 0 ] }
}
}
},
false
]
}
}
},
"as": "cls",
"cond": "$$cls"
}
}
}}
]
)
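For comparison, the same two-level filtering looks like this in plain JavaScript. It is a sketch on a trimmed copy of the sample document; withoutTrashed and isLive are made-up names, and unlike the pipeline it turns a missing "fees" array into an empty one.

```javascript
// A subdocument counts as trashed when it carries a _trashed value.
const isLive = sub => sub._trashed == null;

// Hypothetical helper: drop trashed classifications, then drop trashed fees
// inside the classifications that remain.
function withoutTrashed(permit) {
  return {
    ...permit,
    feeClassifications: (permit.feeClassifications || [])
      .filter(isLive)
      // A classification without a fees array ends up with an empty one here.
      .map(cls => ({ ...cls, fees: (cls.fees || []).filter(isLive) }))
  };
}

const permit = {
  name: "Single Event",
  feeClassifications: [
    {
      name: "Event Type 1",
      fees: [
        { name: "Basic Fee", amount: 156.5 },
        { name: "Secondary Fee", amount: 50, _trashed: "2016-07-11T19:05:52.410Z" }
      ]
    },
    { name: "Event Type 2", _trashed: "2016-07-11T19:24:27.203Z" }
  ]
};

const cleaned = withoutTrashed(permit);
console.log(JSON.stringify(cleaned, null, 2));
```

Filtering the outer array first and mapping afterwards avoids the false placeholder that the $cond version has to strip out in a second pass.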
In the upcoming MongoDB release (as of this writing, and since MongoDB 3.3.5), you can replace the $cond expression in the $map expression with a $switch expression:
Permit.aggregate(
[
{ "$match": { "_id": mongoose.Types.ObjectId(req.params.id) } },
{ "$project": {
"_updated": 1,
"_created": 1,
"name": 1,
"feeClassifications": {
"$filter": {
"input": {
"$map": {
"input": "$feeClassifications",
"as": "fclass",
"in": {
"$switch": {
"branches": [
{
"case": { "$lt": [ "$$fclass._trashed", 0 ] },
"then": {
"_updated": "$$fclass._updated",
"_created": "$$fclass._created",
"name": "$$fclass.name",
"_id": "$$fclass._id",
"fees": {
"$filter": {
"input": "$$fclass.fees",
"as": "fees",
"cond": { "$lt": [ "$$fees._trashed", 0 ] }
}
}
}
}
],
"default": false
}
}
}
},
"as": "cls",
"cond": "$$cls"
}
}
}}
]
)
For larger, more complicated documents, this would be unnecessarily difficult.
Just change the $filter input to a dotted field path. Dot notation lets you reach into the document at any depth without further complicated $filter mapping.
"$filter":{
"input": "$feeClassifications._trashed",
"as": "trashed",
"cond": { "$lt": [ "$$trashed._trashed", 0 ] }
}