I have a blog post document like this:
{
"_id": "5d8051cdf0b1017da7bff23c",
"description": "<p>will update soon</p>",
"topic": "How to setup Kafka Cluster on CentOS",
"comments": [
{
"created_at": "2019-10-12T02:13:01.859Z",
"updated_at": "2019-10-12T02:13:01.859Z",
"edited": false,
"_id": "5da182aa013d12567e340a2d",
"message": "test",
"replies": [
{
"created_at": "2019-10-12T02:13:01.859Z",
"updated_at": "2019-10-12T02:13:01.859Z",
"edited": false,
"_id": "5da182aa013d12567e340a2a",
"message": "I am replying to first comment",
"commentator": "5daae8b8af029ec4533fe317"
},
{
"created_at": "2019-10-12T02:13:01.859Z",
"updated_at": "2019-10-12T02:13:01.859Z",
"edited": false,
"_id": "5da182aa013d12567e340a2c",
"message": "Helpful second Comment",
"commentator": "5d7f936544dac213e3f650ec"
}
]
}
]
}
I want to do a nested aggregation using mongodb.
My query so far is
{ $unwind: { path: '$comments', preserveNullAndEmptyArrays: true } },
{ $unwind: { path: '$comments.replies', preserveNullAndEmptyArrays: true } },
{
$lookup: {
from: 'users',
let: { thread_reply_commentator: '$comments.replies.commentator' },
pipeline: [
{ $match: { $expr: { $eq: ['$_id', '$$thread_reply_commentator'] } } },
{ $project: AUTHOR_PROJECTION },
],
as: 'comments.replies.commentator'
}
},
{ $unwind: { path: '$comments.replies.commentator', preserveNullAndEmptyArrays: true } },
{
$group: {
_id: { _id: '$_id', comment: "$comments._id" },
root: { $mergeObjects: '$$ROOT' },
replies: { $push: '$comments.replies' }
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: ['$$ROOT.replies']
}
}
}
And now my above query results in
{
"_id": "5d8051cdf0b1017da7bff23c",
"description": "<p>will update soon</p>",
"topic": "How to setup Kafka Cluster on CentOS",
"comments": [
{
"_id": "5da182aa013d12567e340a2d",
"message": "test",
"replies": {
"created_at": "2019-10-12T02:13:01.859Z",
"updated_at": "2019-10-12T02:13:01.859Z",
"edited": false,
"_id": "5da182aa013d12567e340a2a",
"message": "I am replying to first comment",
"commentator": {
"first_name":"test",
"last_name":"test"
}
}
},
{
"_id": "5da182aa013d12567e340a2d",
"message": "test",
"replies": {
"created_at": "2019-10-12T02:13:01.859Z",
"updated_at": "2019-10-12T02:13:01.859Z",
"edited": false,
"_id": "5da182aa013d12567e340a2a",
"message": "Helpful second Comment",
"commentator": {
"first_name":"test",
"last_name":"test"
}
}
}
]
}
But my desired result is:
{
"_id": "5d8051cdf0b1017da7bff23c",
"description": "<p>will update soon</p>",
"topic": "How to setup Kafka Cluster on CentOS",
"comments": [
{
"_id": "5da182aa013d12567e340a2d",
"message": "test",
"replies": [
{
"created_at": "2019-10-12T02:13:01.859Z",
"updated_at": "2019-10-12T02:13:01.859Z",
"edited": false,
"_id": "5da182aa013d12567e340a2a",
"message": "I am replying to first comment",
"commentator": {
"first_name":"test",
"last_name":"test"
}
},
{
"created_at": "2019-10-12T02:13:01.859Z",
"updated_at": "2019-10-12T02:13:01.859Z",
"edited": false,
"_id": "5da182aa013d12567e340a2a",
"message": "Helpful second Comment",
"commentator": {
"first_name":"test",
"last_name":"test"
}
}
]
}
]
}
Please help me achieve this. I know this is easy with mongoose, but I do not have access to the model schema, so I can only use MongoDB and cannot use mongoose.
Alternate approach: don't make the DB do anything that you can do just as easily and performantly on the client, taking into account the cost of transferring the data over the network.
Ultimately, we want to replace the commentator ID in each reply with info about the commentator, such as first and last name. The commentator ID is a unique ID into a users collection. At worst, every single reply will have a different commentator ID, which means that a whole bunch of first and last names will have to be looked up. This takes time in the DB engine, but there is no getting around that. The engine then marries the reply info with the user info and sends the doc across the wire. Some commentators, however, will post more than one reply. Depending on the sophistication of the DB engine, such a commentator's first and last name may only need to be looked up and transmitted once for all replies in all docs, but that is probably not the case, and in a typical use case the doc-to-doc overlap of commentators is likely low. In addition, there is no getting around the fact that the first and last name, identical though they may be in many docs, are sent over the wire again and again, so this approach gains no network/data performance. Given this setup, arguably the ideal query is simply this:
c = db.repliesColl.aggregate([
{$lookup: {
from: "users",
localField: "comments.replies.commentator",
foreignField: "commentator",
as: "z"
}
}
]);
That's it. What will this do? The "double dive" through two arrays (comments and replies) will in each doc create an array z containing the unique lookups for that doc. As stated before, if commentator C1 showed up over and over again, then yes that information is passed over the wire again and again (doc by doc) but make no mistake: there is no getting around the initial lookup to find it in the first place on the server side. And we assert that the "repeat rate" of commentators across docs is likely low.
So in practice, data like this (some extra fields eliminated for clarity):
{
"_id": "5d8051cdf0b1017da7bff23c",
"description": "<p>will update soon</p>",
"topic": "How to setup Kafka Cluster on CentOS",
"comments": [
{
"_id": "5da182aa013d12567e340a2d",
"message": "original msg",
"replies": [
{"commentator": "C1", "message": "I am replying to first comment"},
{"commentator": "C1", "message": "Forgot something"},
{"commentator": "C2", "message": "Second comment"},
{"commentator": "C2", "message": "Third comment, same guy"}
]
}
]
}
will yield this output:
(everything in the doc above plus):
"z" : [
{
"commentator" : "C1",
"fname" : "Steve",
"lname" : "Jones"
},
{
"commentator" : "C2",
"fname" : "Dan",
"lname" : "Dare"
}
]
So now, in the client side code, we can do this pseudocode:
Map cidmap = new HashMap();
while(cursor.hasNext()) {
Document doc = cursor.next();
// Capture id->name mappings:
for(Map m : (List<Map>)doc.get("z")) {
String cid = (String)m.get("commentator");
if(!cidmap.containsKey(cid)) {
cidmap.put(cid, m);
}
}
// Process comments and, where necessary, substitute the value for cid.
}
The attraction here is that you are eliminating much of the work from the central resource by doing it yourself. The more "unique" the set of commentators, the more efficient this scheme becomes. In summary we are balancing load on the DB engine to manipulate the data, expected cardinality of unique commentators, network transfer speed, and complexity of client-side "post processing" of the query.
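The client-side step sketched in the pseudocode above could look roughly like this in JavaScript. This is only an illustration under stated assumptions: the helper name substituteCommentators and the sample data are made up, and each entry of z is assumed to carry the same commentator key that the replies use.

```javascript
// Hypothetical helper: replaces each reply's commentator ID with the
// matching entry from the "z" array produced by the $lookup.
function substituteCommentators(docs) {
  const cidmap = new Map(); // commentator ID -> user info, shared across docs
  return docs.map((doc) => {
    // Capture id -> name mappings from this document's lookup results.
    for (const user of doc.z || []) {
      if (!cidmap.has(user.commentator)) {
        cidmap.set(user.commentator, user);
      }
    }
    // Rebuild comments and replies, substituting user info where known.
    const comments = (doc.comments || []).map((comment) => ({
      ...comment,
      replies: (comment.replies || []).map((reply) => ({
        ...reply,
        commentator: cidmap.get(reply.commentator) || reply.commentator,
      })),
    }));
    const { z, ...rest } = doc; // drop the helper array from the result
    return { ...rest, comments };
  });
}

// Sample input shaped like the example above (assumed data).
const sampleDocs = [
  {
    _id: "5d8051cdf0b1017da7bff23c",
    comments: [
      {
        _id: "5da182aa013d12567e340a2d",
        replies: [
          { commentator: "C1", message: "I am replying to first comment" },
          { commentator: "C2", message: "Second comment" },
        ],
      },
    ],
    z: [
      { commentator: "C1", fname: "Steve", lname: "Jones" },
      { commentator: "C2", fname: "Dan", lname: "Dare" },
    ],
  },
];

const substituted = substituteCommentators(sampleDocs);
console.log(JSON.stringify(substituted, null, 2));
```

Because the map lives outside the per-document loop, a commentator seen in an earlier document is still resolved if a later document happens to lack that entry.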
Try this one:
db.collection.aggregate([
{
$lookup: {
from: "users",
let: {
thread_reply_commentator: {
$reduce: {
input: "$comments.replies.commentator",
initialValue: [],
in: {
$concatArrays: [
"$$value",
"$$this"
]
}
}
}
},
pipeline: [
{
$match: {
$expr: {
$in: [
"$_id",
"$$thread_reply_commentator"
]
}
}
},
{ $project: AUTHOR_PROJECTION }
],
as: "comentators"
}
},
{
$addFields: {
comments: {
$map: {
input: "$comments",
as: "comments",
in: {
$mergeObjects: [
"$$comments",
{
replies: {
$map: {
input: "$$comments.replies",
as: "replies",
in: {
$mergeObjects: [
"$$replies",
{
commentator: {
$arrayElemAt: [
{
$filter: {
input: "$comentators",
cond: {
$eq: [
"$$this._id",
"$$replies.commentator"
]
}
}
},
0
]
}
}
]
}
}
}
}
]
}
}
}
}
},
{
$project: {
comentators: 0,
"comments.replies.commentator._id": 0
}
}
])
MongoPlayground
We have three nested arrays:
principalCredits with 2 objects
credits with 2 objects each
awardNominations.edges with variable totals from 0 to 3
The task is to add a field to the third array of objects awardNominations.edges based on a lookup from eventsCollection.
Here's the data I have (simplified, can copy and paste into MongoDB Compass):
[{
"principalCredits": [
{
"category": {
"id": "director",
"text": "Directors"
},
"totalCredits": 2,
"credits": [
{
"name": {
"id": "nm11813828",
"nameText": {
"text": "Pippa Ehrlich"
},
"awardNominations": {
"total": 2,
"edges": [
{
"node": {
"id": "an1393007",
"isWinner": true,
"award": {
"id": "an1393007",
"year": 2020,
"text": "Green Warsaw Award",
"event": {
"id": "ev0003786",
"text": "Millennium Docs Against Gravity"
},
"category": {
"text": null
}
}
}
},
{
"node": {
"id": "an1428940",
"isWinner": false,
"award": {
"id": "an1428940",
"year": 2021,
"text": "IDA Award",
"event": {
"id": "ev0000351",
"text": "International Documentary Association"
},
"category": {
"text": "Best Writing"
}
}
}
}
]
}
},
"category": {
"id": "director",
"text": "Director"
}
},
{
"name": {
"id": "nm1624755",
"nameText": {
"text": "James Reed"
},
"awardNominations": {
"total": 3,
"edges": [
{
"node": {
"id": "an0694012",
"isWinner": true,
"award": {
"id": "an0694012",
"year": 2015,
"text": "Best of Festival",
"event": {
"id": "ev0001486",
"text": "Jackson Wild Media Awards"
},
"category": {
"text": "Best of Festival"
}
}
}
},
{
"node": {
"id": "an0975779",
"isWinner": true,
"award": {
"id": "an0975779",
"year": 2017,
"text": "RTS West Television Award",
"event": {
"id": "ev0000571",
"text": "Royal Television Society, UK"
},
"category": {
"text": "Documentary"
}
}
}
},
{
"node": {
"id": "an0975781",
"isWinner": true,
"award": {
"id": "an0975781",
"year": 2015,
"text": "Grand Teton Prize",
"event": {
"id": "ev0001356",
"text": "Jackson Hole Film Festival"
},
"category": {
"text": "Best in Festival"
}
}
}
}
]
}
},
"category": {
"id": "director",
"text": "Director"
}
}
]
},
{
"category": {
"id": "writer",
"text": "Writers"
},
"totalCredits": 2,
"credits": [
{
"name": {
"id": "nm11813828",
"nameText": {
"text": "Pippa Ehrlich"
},
"awardNominations": {
"total": 2,
"edges": [
{
"node": {
"id": "an1393007",
"isWinner": true,
"award": {
"id": "an1393007",
"year": 2020,
"text": "Green Warsaw Award",
"event": {
"id": "ev0003786",
"text": "Millennium Docs Against Gravity"
},
"category": {
"text": null
}
}
}
},
{
"node": {
"id": "an1428940",
"isWinner": false,
"award": {
"id": "an1428940",
"year": 2021,
"text": "IDA Award",
"event": {
"id": "ev0000351",
"text": "International Documentary Association"
},
"category": {
"text": "Best Writing"
}
}
}
}
]
}
},
"category": {
"id": "writer",
"text": "Writer"
}
},
{
"name": {
"id": "nm1624755",
"nameText": {
"text": "James Reed"
},
"awardNominations": {
"total": 0,
"edges": []
}
},
"category": {
"id": "writer",
"text": "Writer"
}
}
]
}
]
}]
An example scored award should look like this:
{
"id": "an0975781",
"isWinner": true,
"award": { ... },
"score": 1.5
}
Once all the manipulation is done, the data needs to be in exactly the same shape as it was initially, with no null values. So in the case of the last awardNominations.edges array, it should stay [] as it was, and not become { node: { score: null }} or anything else.
To achieve this I have created an aggregation pipeline:
[
{
'$unwind': {
'path': '$principalCredits',
'preserveNullAndEmptyArrays': true
}
}, {
'$unwind': {
'path': '$principalCredits.credits',
'preserveNullAndEmptyArrays': true
}
}, {
'$unwind': {
'path': '$principalCredits.credits.name.awardNominations.edges',
'preserveNullAndEmptyArrays': true
}
}, {
'$lookup': {
'from': 'eventsCollection',
'localField': 'principalCredits.credits.name.awardNominations.edges.node.award.event.id',
'foreignField': 'id',
'as': 'matchingEvent'
}
}, {
'$unwind': {
'path': '$matchingEvent',
'preserveNullAndEmptyArrays': true
}
}, {
'$addFields': {
'principalCredits.credits.name.awardNominations.edges.node.score': {
'$multiply': [
'$matchingEvent.importance', {
'$cond': {
'if': '$principalCredits.credits.name.awardNominations.edges.node.isWinner',
'then': 1.5,
'else': 1.2
}
}
]
}
}
}
]
The above pipeline assigns the score to each award. However, the null values are still there and I have absolutely no idea how to group it back together. I have tried to group with:
{
'$group': {
'_id': '$id',
'titleDoc': {
'$first': '$$ROOT'
},
'allPrincipalCredits': {
'$push': '$principalCredits'
}
}
}
I used this to keep the root and then tried to somehow sort all the records back into shape, but I could not get back to the original object structure.
Any help in putting it all together will be much appreciated!
I'm fairly good with simple aggregations, but this seems to be too much for me currently and would love to learn how to $group things back properly.
I've tried and put together all the knowledge I have so far from different sources and similar answers but can't seem to get it to work.
Lookup collection eventsCollection contains objects like this:
{
"_id": { "$oid": "62c57125d6943d92f83f6fff" },
"id": "ev0030197",
"text": "#AmLatino Film Festival",
"importance": 1
}
So the "rule" for restoring the original structure is that for each $unwind you did to "deconstruct" the document, you now have to do a $group to restore it.
As you can imagine, in such a pipeline this could be VERY cumbersome, but it is definitely doable.
However, let me propose a different approach that is still very messy but much easier than the alternative; it is also more efficient from a performance perspective.
(Just a minor side note: the reason your score is still null is that you have a syntax error in your $multiply function.)
Anyway, the idea is to first gather all the unique event ids that exist in the nested documents,
then execute one lookup to fetch all the relevant events,
and finally add the score field using $map and $mergeObjects instead of $unwinding and $grouping, like so:
Mongo Playground
db.collection.aggregate([
{
$addFields: {
allEvents: {
$reduce: {
input: {
$map: {
input: "$principalCredits",
in: {
$map: {
input: "$$this.credits",
as: "credit",
in: {
$map: {
input: "$$credit.name.awardNominations.edges",
as: "edge",
in: "$$edge.node.award.event.id"
}
}
}
}
}
},
initialValue: [],
in: {
"$concatArrays": [
{
"$reduce": {
input: "$$this",
initialValue: [],
in: {
"$concatArrays": [
"$$this",
"$$value"
]
}
}
},
"$$value"
]
}
}
}
}
},
{
"$lookup": {
"from": "eventsCollection",
"localField": "allEvents",
"foreignField": "id",
"as": "matchingEvents"
}
},
{
$addFields: {
principalCredits: {
$map: {
input: "$principalCredits",
in: {
$mergeObjects: [
"$$this",
{
credits: {
$map: {
input: "$$this.credits",
as: "credit",
in: {
$mergeObjects: [
"$$credit",
{
name: {
"$mergeObjects": [
"$$credit.name",
{
"awardNominations": {
"$mergeObjects": [
"$$credit.name.awardNominations",
{
edges: {
$map: {
input: "$$credit.name.awardNominations.edges",
as: "edge",
in: {
node: {
$mergeObjects: [
"$$edge.node",
{
score: {
"$multiply": [
{
$cond: [
"$$edge.node.isWinner",
1.5,
1.2
]
},
{
$first: {
$map: {
input: {
$filter: {
input: "$matchingEvents",
as: "matchedEvent",
cond: {
$eq: [
"$$matchedEvent.id",
"$$edge.node.award.event.id"
]
}
}
},
as: "matched",
in: "$$matched.importance"
}
}
}
]
}
}
]
}
}
}
}
}
]
}
}
]
}
}
]
}
}
}
}
]
}
}
}
}
},
{
$unset: [
"allEvents",
"matchingEvents"
]
}
])
Mongo Playground
I will just mention that you can make this much, much cleaner by involving some code while keeping the same approach: first get the unique event ids with distinct, then fetch the matching importance for each event, and finally execute a single query using arrayFilters that you construct from this information.
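As a rough sketch of that last step, the code below builds update operations with arrayFilters from an event-id-to-importance map. Everything here is an assumption for illustration: the helper name buildScoreUpdates is made up, the importance values are invented, and the operations are grouped as one update per event and winner flag so they can be sent together in one bulkWrite call. Note that, unlike the aggregation above, this writes the score into the stored documents.

```javascript
// Hypothetical helper: turns { eventId: importance } into bulkWrite operations.
// The array paths assume every principalCredits entry has a credits array and
// every credit has name.awardNominations.edges.
const SCORE_PATH =
  "principalCredits.$[].credits.$[].name.awardNominations.edges.$[edge].node.score";
const EVENT_ID_PATH =
  "principalCredits.credits.name.awardNominations.edges.node.award.event.id";

function buildScoreUpdates(importanceByEventId) {
  const ops = [];
  for (const [eventId, importance] of Object.entries(importanceByEventId)) {
    // One update for winners (x1.5) and one for all other nominations (x1.2).
    for (const [isWinner, factor] of [[true, 1.5], [false, 1.2]]) {
      ops.push({
        updateMany: {
          // Only touch documents that mention this event at all.
          filter: { [EVENT_ID_PATH]: eventId },
          update: { $set: { [SCORE_PATH]: importance * factor } },
          arrayFilters: [
            {
              "edge.node.award.event.id": eventId,
              "edge.node.isWinner": isWinner ? true : { $ne: true },
            },
          ],
        },
      });
    }
  }
  return ops;
}

// Importance values here are made up for illustration.
const scoreOps = buildScoreUpdates({ ev0003786: 1, ev0000351: 2 });
console.log(JSON.stringify(scoreOps, null, 2));
```

In mongosh the resulting array could then be passed to db.collection.bulkWrite(scoreOps). Empty edges arrays are left as they are, because no element matches the edge filter.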
A final side note: the provided pipeline does not deal with null or missing values. So if an array is missing, an error will be thrown, as $map expects its input to be a valid array.
This can easily be solved by wrapping each of these expressions with $ifNull, like so:
{
$map: {
input: {$ifNull: ["$$this.credits",[]]}
}
}
This will also replace null values with an empty []
The deeply buried keys (...award.event.id) in arrays confound any easy approach: you either 1) mess up the structure, as the OP has noted, or 2) incur potentially very expensive multiple $unwind calls.
Recommendation: a two-pass approach. Get the necessary importance values for the principalCredits objects in question, then go back and manually iterate over the collection, diving into the structure and applying the logic score = importance * (isWinner ? 1.5 : 1.2)
PASS 1: Get the ev data
c=db.foo.aggregate([
{$project: {
XX: {$reduce: {
// Rapidly get to things we need to lookup:
input: '$principalCredits.credits.name.awardNominations.edges.node.award.event.id',
// We end up with a mess incl. empty arrays...
// [ [[ev1,ev2], [ev3,ev4]], [], [[ev1,...], [] ... ] ]
// Need to collapse all those arrays of arrays of arrays into
// a single list of ev values, hence a reduce within a reduce:
initialValue: [],
in: {$concatArrays: [
'$$value',
{$reduce: {
input: '$$this',
initialValue: [],
in: {$concatArrays: [ '$$value', '$$this' ] }
}} ]}
}}
}}
// XX is now [ ev1,ev2,ev3,ev4,ev1 ... ]
// The empty arrays are ignored. Don't worry about dupes.
,{$lookup: {
from: "Xev",
let: { evids: "$XX" },
pipeline: [
{$match: {$expr: {$in: ["$id","$$evids"]} } }
],
as: 'XX' // overwrite XX...
}}
]);
evdict = {}
c.forEach(function(d) {
d['XX'].forEach(function(ww) {
evdict[ww['id']] = ww;
});
});
{
"ev0003786" : {
"_id" : ObjectId("62cd7f8138d0fbc0eacfb17f"),
"id" : "ev0003786",
"text" : "Millennium Docs Against Gravity",
"importance" : 1
},
"ev0000351" : {
"_id" : ObjectId("62cd7f8138d0fbc0eacfb180"),
"id" : "ev0000351",
"text" : "International Documentary Association",
"importance" : 2
},
"ev0000571" : {
"_id" : ObjectId("62cd7f8138d0fbc0eacfb181"),
"id" : "ev0000571",
"text" : "Royal Television Society, UK",
"importance" : 3
}
}
PASS 2: Iterate main collection
Left as exercise to reader.
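For readers who want a starting point, pass 2 might look something like the following. It is a minimal sketch, not a definitive implementation: the helper name applyScores and the sample data are made up, evdict is assumed to have the shape built in pass 1, and edges whose event is not in evdict are simply skipped so that no null score is written.

```javascript
// Hypothetical helper for pass 2: adds node.score to every award edge in place,
// leaving empty edges arrays untouched.
function applyScores(doc, evdict) {
  for (const principal of doc.principalCredits || []) {
    for (const credit of principal.credits || []) {
      const edges = credit.name?.awardNominations?.edges || [];
      for (const edge of edges) {
        const event = evdict[edge.node.award.event.id];
        if (event === undefined) continue; // unknown event: leave the node as it is
        edge.node.score = event.importance * (edge.node.isWinner ? 1.5 : 1.2);
      }
    }
  }
  return doc;
}

// Cut-down sample data in the shape of the question (assumed values).
const sampleEvdict = { ev0000571: { id: "ev0000571", importance: 3 } };
const sampleTitle = {
  principalCredits: [
    {
      credits: [
        {
          name: {
            id: "nm1624755",
            awardNominations: {
              total: 1,
              edges: [
                {
                  node: {
                    id: "an0975779",
                    isWinner: true,
                    award: { event: { id: "ev0000571" } },
                  },
                },
              ],
            },
          },
        },
        { name: { id: "nm11813828", awardNominations: { total: 0, edges: [] } } },
      ],
    },
  ],
};

applyScores(sampleTitle, sampleEvdict);
console.log(JSON.stringify(sampleTitle, null, 2));
```

In mongosh this could be driven by db.foo.find().forEach(function(d) { applyScores(d, evdict); ... }); what to do with each scored document is up to the caller.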
Note that if
The number of events is small.
There is no need or value in performing $match on the initial principalCredits collection (i.e. before the fancy $project/$reduce) to significantly reduce the lookup set into events
then this whole thing is unnecessary. Simply slurp all events into evdict with a quick find and proceed to pass 2.
There is potentially a very cool solution that can do this in one pass
UPDATED
See Tom's answer below.
Note to MongoDB 5.0 users: the new $getField operator allows you to pluck out fields by name instead of having to use the standard trick of using dot notation in the in clause of $map to access the field. This might be clearer to some:
{$getField: {
"field": "importance",
"input": {
$first: {
$filter: {
input: "$matchingEvents",
as: "matchedEvent",
cond: {
$eq: [
"$$matchedEvent.id",
"$$edge.node.award.event.id"
]
}
}
}
}
}
}
So I have a number of documents in my collection. Each document is a user object which contains thoughts, and thoughts have replies. What I want is that when a reply has anonymous set to true, its username value should say anonymous instead of the actual username.
Document
[
{
"_id": {
"$oid": "6276eb2195b181d38eee0b43"
},
"username": "abvd",
"password": "efgh",
"thoughts": [
{
"_id": {
"$oid": "62778ff975e2c8725b9276f5"
},
"text": "last thought",
"anonymous": true,
"replies": [
{
"_id": {
"$oid": "62778fff75e2c8725b9276f5"
},
"text": "new reply",
"anonymous": true,
"username": "cdf"
},
{
"_id": {
"$oid": "62778fff75e2c8725b9276f5"
},
"text": "new reply",
"anonymous": false,
"username": "cdf"
}
]
}
]
}
]
Required output. Note that the username value says anonymous even though the existing document has "cdf" as the value:
[
{
"_id": {
"$oid": "6276eb2195b181d38eee0b43"
},
"username": "abvd",
"password": "efgh",
"thoughts": [
{
"_id": {
"$oid": "62778ff975e2c8725b9276f5"
},
"text": "last thought",
"anonymous": true,
"replies": [
{
"_id": {
"$oid": "62778fff75e2c8725b9276f5"
},
"text": "new reply",
"anonymous": true,
"username": "anonymous"
},
{
"_id": {
"$oid": "62778fff75e2c8725b9276f5"
},
"text": "new reply",
"anonymous": false,
"username": "cdf"
}
]
}
]
}
]
Let me know if you know how to help.
Here's a MongoDB Playground URL containing the existing document:
https://mongoplayground.net/p/WoP-3z-DMuf
A slightly complex query.
1. $set - Update the thoughts field.
1.1. $map - Iterate over each thought document and return the new document from 1.1.1.
1.1.1. $mergeObjects - Merge the thought document with the replies array from 1.1.1.1.
1.1.1.1. $map - Iterate over each reply document and return a new document by merging the reply document with a username field whose value is set based on the anonymous field via $cond.
db.collection.aggregate([
{
$set: {
thoughts: {
$map: {
input: "$thoughts",
as: "thought",
in: {
$mergeObjects: [
"$$thought",
{
replies: {
$map: {
input: "$$thought.replies",
as: "reply",
in: {
$mergeObjects: [
"$$reply",
{
username: {
"$cond": {
"if": "$$reply.anonymous",
"then": "anonymous",
"else": "$$reply.username"
}
}
}
]
}
}
}
}
]
}
}
}
}
}
])
Sample Mongo Playground
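To make the nesting easier to follow, here is the same transformation written as plain JavaScript over an in-memory document. It is only a sketch for checking the expected output on the client side; the function name anonymizeReplies is made up and the sample data is cut down, and the plain-JS version is not what the server runs.

```javascript
// Hypothetical plain-JS mirror of the $set / $map / $mergeObjects / $cond pipeline.
function anonymizeReplies(user) {
  return {
    ...user,
    // Outer $map: one new thought per existing thought.
    thoughts: (user.thoughts || []).map((thought) => ({
      ...thought, // $mergeObjects with the original thought
      // Inner $map: one new reply per existing reply.
      replies: (thought.replies || []).map((reply) => ({
        ...reply, // $mergeObjects with the original reply
        // $cond on the anonymous flag.
        username: reply.anonymous ? "anonymous" : reply.username,
      })),
    })),
  };
}

const sampleUser = {
  username: "abvd",
  thoughts: [
    {
      text: "last thought",
      anonymous: true,
      replies: [
        { text: "new reply", anonymous: true, username: "cdf" },
        { text: "new reply", anonymous: false, username: "cdf" },
      ],
    },
  ],
};

const anonymized = anonymizeReplies(sampleUser);
console.log(JSON.stringify(anonymized, null, 2));
```

The original object is not modified; like the pipeline, the function returns a new document with only the username of anonymous replies replaced.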
Have a look at my collections below,
db={
"replies": [
{
"_id": {
"$oid": "60c4814d09488145b72beda9"
},
"post": [
{
"$oid": "5fc67eb5111f570dc3eb7087"
}
],
"likes": [],
"text": "Reply not reported",
"comment": {
"$oid": "60c4813f09488145b72beda8"
},
"__v": 0
},
{
"_id": {
"$oid": "60c4815609488145b72bedaa"
},
"post": [
{
"$oid": "5fc67eb5111f570dc3eb7087"
}
],
"likes": [],
"text": "Reply reported",
"comment": {
"$oid": "60c4813f09488145b72beda8"
},
"__v": 0,
"reportCount": 1,
"reportedUsers": [
{
"$oid": "6252fe50a5cbd65064d4aab8"
}
]
}
],
"comments": [
{
"_id": {
"$oid": "5fca1877111f570dc3eb7088"
},
"replies": [],
"text": "Reported comment",
"post": {
"$oid": "5fc67eb5111f570dc3eb7087"
},
"reportCount": 1,
"reportedUsers": [
{
"$oid": "6252fe50a5cbd65064d4aab8"
}
],
"__v": 0
},
{
"_id": {
"$oid": "60c4813f09488145b72beda8"
},
"replies": [
{
"$oid": "60c4814d09488145b72beda9"
},
{
"$oid": "60c4815609488145b72bedaa"
}
],
"text": "Comment not reported",
"post": {
"$oid": "5fc67eb5111f570dc3eb7087"
},
"__v": 0
}
]
}
I am trying to get the comments and replies for a post (ObjectId 5fc67eb5111f570dc3eb7087). Assume I blocked the user with id "6252fe50a5cbd65064d4aab8". In this case the results should not contain the first comment, with id "5fca1877111f570dc3eb7088", because its reportedUsers contains the above-mentioned id, and the reported reply also needs to be filtered out since it is reported by the same user.
Both comments belong to post "5fc67eb5111f570dc3eb7087". Can somebody help with an aggregate query that filters out the comments and replies under post "5fc67eb5111f570dc3eb7087" whose reportedUsers include user id "6252fe50a5cbd65064d4aab8"?
Mongodb playground url
Please let me know if any more details are needed.
The expected output is a list of comments with an expanded list of replies, like below:
[
{
"_id": "60c4813f09488145b72beda8",
"replies": [
{
"_id": "60c4814d09488145b72beda9",
"post": "5fc67eb5111f570dc3eb7087",
"likes": [],
"text": "Reply not reported",
"comment": "60c4813f09488145b72beda8"
}
],
"text": "Comment not reported",
"post": "5fc67eb5111f570dc3eb7087"
}
]
Edit:
According to your comment that $oid represents ObjectId, and assuming a post can have several reported users, you can do something like this:
db.comments.aggregate([
{
$match: { "post": ObjectId("5fc67eb5111f570dc3eb7087")}
},
{
$facet: {
reportedUsers: [
{
$match: {reportedUsers: { $ne: []} }
},
{
$project: {
_id: 0,
reportedUsers: 1
}
},
{ $unwind: "$reportedUsers" },
{
$group: {
_id: 0,
rep: { $addToSet: "$reportedUsers" }
}
}
],
ok: [ {$match: {reportedUsers: null}} ]
}
},
{$unwind: "$ok"},
{
$lookup: {
from: "replies",
localField: "ok.replies",
foreignField: "_id",
let: {
reportedUsers: "$reportedUsers"
},
pipeline: [
{
$addFields: {
reportedUsers: {
"$ifNull": [
"$reportedUsers",
[]
]
}
}
}
],
as: "ok.replies"
}
},
{
$project: {
ok: 1,
reportedUsers: {
"$arrayElemAt": [
"$reportedUsers",
0
]
}
}
},
{
$addFields: {
"replies": {
$filter: {
input: "$ok.replies",
as: "item",
cond: {
$eq: [
{
$size: {
"$setIntersection": [
"$$item.reportedUsers",
"$reportedUsers.rep"
]
}
},
0
]
}
}
},
reportedUsers: 0
}
},
{
$project: {
_id: "$ok._id",
post: "$ok.post",
replies: 1,
text: "$ok.text"
}
}
])
As you can see here
This query $matches all the comments for the required post, then, via a $facet step, splits them into comments without reportedUsers (ok) and a grouped list of all the reportedUsers. Then it collects all replies from the replies collection via $lookup, adding an empty reportedUsers array if it does not exist. The next step is to filter the replies, removing every reply that has a reported user in the list of reportedUsers grouped before.
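The steps described above can be mirrored in plain JavaScript to sanity-check the expected output. This is a sketch under assumptions, not the pipeline itself: the function name visibleComments is made up, ids are plain strings rather than ObjectId values, and the sample data is a cut-down copy of the collections in the question.

```javascript
// Hypothetical plain-JS mirror of the $match / $facet / $lookup / $filter steps.
function visibleComments(comments, replies, postId) {
  // $match: only comments of the requested post.
  const forPost = comments.filter((c) => c.post === postId);
  // "reportedUsers" facet: every user who reported any of these comments.
  const reported = new Set(forPost.flatMap((c) => c.reportedUsers || []));
  // "ok" facet: comments that have no reportedUsers field.
  const ok = forPost.filter((c) => c.reportedUsers == null);
  const repliesById = new Map(replies.map((r) => [r._id, r]));
  return ok.map((c) => ({
    _id: c._id,
    post: c.post,
    text: c.text,
    // $lookup, then $filter: drop replies reported by any collected user.
    replies: (c.replies || [])
      .map((id) => repliesById.get(id))
      .filter((r) => r !== undefined)
      .filter((r) => !(r.reportedUsers || []).some((u) => reported.has(u))),
  }));
}

const POST = "5fc67eb5111f570dc3eb7087";
const BLOCKED = "6252fe50a5cbd65064d4aab8";
const sampleReplies = [
  { _id: "60c4814d09488145b72beda9", text: "Reply not reported" },
  { _id: "60c4815609488145b72bedaa", text: "Reply reported", reportedUsers: [BLOCKED] },
];
const sampleComments = [
  { _id: "5fca1877111f570dc3eb7088", post: POST, text: "Reported comment", replies: [], reportedUsers: [BLOCKED] },
  { _id: "60c4813f09488145b72beda8", post: POST, text: "Comment not reported",
    replies: ["60c4814d09488145b72beda9", "60c4815609488145b72bedaa"] },
];

const visible = visibleComments(sampleComments, sampleReplies, POST);
console.log(JSON.stringify(visible, null, 2));
```

Note that this mirrors the query as written: it filters on every user who reported a comment on the post, not only on one specific blocked user.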
I'm trying to write a MongoDB query. I want to get an array containing [location, status] for every document.
This is what my collection looks like:
{
"_id": 1,
"status": "OPEN",
"location": "Costa Rica",
"type": "virtual store"
},
{
"_id": 2,
"status": "CLOSED",
"location": "El Salvador",
"type": "virtual store"
},
{
"_id": 3,
"status": "OPEN",
"location": "Mexico",
"type": "physical store"
},
{
"_id": 4,
"status": "CLOSED",
"location": "Nicaragua",
"type": "physical store"
}
I made a query, using the aggregate framework, trying to get all documents that match that specific type of store.
[
{'$match': {
'type': { '$eq': "physical store"}
}
}
]
What I want is something like this:
{
'stores': [
["Mexico", "OPEN"],
["Nicaragua", "CLOSED"]
]
}
I tried with $push but couldn't make it work.
Could someone please guide me on how to do it?
Since { $push: ["$location", "$status"] } would give you the error "The $push accumulator is a unary operator", you have to work around it by passing $push a single expression that outputs your desired array. One way to do it would be:
[
{
"$match": {
"type": {
"$eq": "physical store"
}
}
},
{
"$group": {
"_id": null,
"stores": {
"$push": {
"$slice": [["$location", "$status"], 2]
}
}
}
}
]
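To see what shape the $match and $group stages are meant to produce, the same logic can be written as plain JavaScript over the sample documents. This is only an illustrative mirror for checking the expected result; the function name collectStores is made up, and unlike this sketch, the order of pushed elements in a real $group is not guaranteed without a preceding $sort.

```javascript
// Hypothetical plain-JS mirror of the $match + $group stages.
function collectStores(docs, type) {
  return {
    _id: null, // $group with a constant _id yields a single result document
    stores: docs
      .filter((d) => d.type === type) // $match
      .map((d) => [d.location, d.status]), // $push of a two-element array
  };
}

const sampleStores = [
  { _id: 1, status: "OPEN", location: "Costa Rica", type: "virtual store" },
  { _id: 2, status: "CLOSED", location: "El Salvador", type: "virtual store" },
  { _id: 3, status: "OPEN", location: "Mexico", type: "physical store" },
  { _id: 4, status: "CLOSED", location: "Nicaragua", type: "physical store" },
];

const grouped = collectStores(sampleStores, "physical store");
console.log(JSON.stringify(grouped));
```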
If the given documents are not sub-documents, then the approach is as follows:
db.collection.find({
type: {
$eq: "physical store"
}
},
{
location: 1,
status: 1
})
MongoPlayGround link for the above
If they are part of a field (meaning they are sub-documents), then the approach is as follows:
db.collection.aggregate([
{
$project: {
stores: {
$filter: {
input: "$stores",
as: "store",
cond: {
$eq: [
"$$store.type",
"physical store"
]
}
}
}
}
},
{
$unwind: "$stores"
},
{
$project: {
location: "$stores.location",
status: "$stores.status",
_id: "$stores._id"
}
}
])
MongoPlayGround link for the above
I have a database with some users, who belong to teams. Each team has a leader. Each user has a subject.
I want to collate teams by the leader's subject.
My data looks like this:
db={
"teams": [
{
_id: "t1",
members: [
{
"_id": "u1",
"leader": true
},
{
"_id": "u2"
},
{
"_id": "u3"
}
],
},
{
_id: "t2",
members: [
{
"_id": "u2",
"leader": true
},
{
"_id": "u4"
}
],
},
{
_id: "t3",
members: [
{
"_id": "u1",
"leader": true
},
{
"_id": "u4"
}
],
},
{
_id: "t4",
members: [
{
"_id": "u2",
"leader": true
}
],
},
],
"users": [
{
"_id": "u1",
"subject": "history"
},
{
"_id": "u2",
"subject": "maths"
},
{
"_id": "u3",
"subject": "geography"
},
{
"_id": "u4",
"subject": "french"
}
]
}
The result I want is:
{
"history": ["t1", "t3"],
"maths": ["t2", "t4"]
}
I have an aggregation that gets me the _id of every leader, and from there I can get the result I want in stages, by first finding the subject of every leader, then going back through the teams and assigning a subject to each team based on the identity of the leader. It works, but it is inelegant and I think it will be slow. It seems to me there should be some better way to do this, maybe something like a join?
Is there a nifty way to get the result I want from a single MongoDB operation?
Here is a Mongo Playground with my data:
https://mongoplayground.net/p/SIJv9-hVNzJ
Many thanks for any help.
Edit: my test data are confusing because '_id' is used in both collections, making it hard to unpack the answer. Here is an updated Mongo Playground that uses different key names for each collection and helped me to understand the perfect answer.
Yes, you should join your collections on users._id with a $lookup, and then transform the value into a key with $arrayToObject (introduced in MongoDB 3.4.4).
Here is a possible way to do this:
db.teams.aggregate([
{
"$unwind": "$members"
},
{
"$match": {
"members.leader": true
}
},
{
"$lookup": {
"from": "users",
"localField": "members._id",
"foreignField": "_id",
"as": "users"
}
},
{
"$unwind": "$users"
},
{
"$group": {
"_id": "$users.subject",
"team": {
"$push": "$_id"
}
}
},
{
"$replaceRoot": {
"newRoot": {
"$arrayToObject": [
[
{
k: "$_id",
v: "$team"
}
]
]
}
}
}
])
try it online: mongoplayground.net/p/TuEpMzHkI-0
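Because $group emits one document per subject, the pipeline above returns several small documents ({ history: [...] }, { maths: [...] }) rather than the single object shown in the question. If one object is needed, the results can be merged on the client. A minimal sketch, with a made-up helper name and the result documents hard-coded for illustration:

```javascript
// Hypothetical helper: merges the per-subject documents into a single object.
function mergeSubjectDocs(docs) {
  return Object.assign({}, ...docs);
}

// Stand-in for the documents returned by the aggregation above.
const perSubject = [{ history: ["t1", "t3"] }, { maths: ["t2", "t4"] }];

const bySubject = mergeSubjectDocs(perSubject);
console.log(JSON.stringify(bySubject));
```

In mongosh the input array could come from db.teams.aggregate([...]).toArray(). Each subject appears in exactly one document, so no keys collide during the merge.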