MongoDB aggregate - return as separate objects - mongodb

So I have 2 collections (tenants and campaigns) and I'm trying to compose a query to return 1 tenant and 1 campaign. As input, there is a tenant domain and a campaign slug. Since I first need the tenant _id to query the campaign (based on both tenantId and slug), aggregation seems like a more performant option than making 2 consecutive queries.
Technically speaking, I know how to do that:
[{
$match: { 'domains.name': '<DOMAIN_HERE>' },
}, {
$lookup: {
from: 'campaigns',
localField: '_id',
foreignField: 'tenantId',
as: 'campaign',
pipeline: [{
$match: { slug: '<SLUG_HERE>' },
}],
},
}]
which returns:
{
_id: ObjectId('...'),
campaign: [{
_id: ObjectId('...'),
}],
}
But it feels very uncomfortable: for one, the campaign is returned as a field of the tenant, and for another, it is returned as a single item in an array. I know I can process and reformat the result programmatically afterwards. But is there any way to „hack“ the aggregation to achieve a result that looks more like this?
{
tenant: {
_id: ObjectId('...'),
},
campaign: {
_id: ObjectId('...'),
},
}
This is just a simplified example. In reality the aggregation query is a bit more complicated (it spans more collections, and on a few of them I need to perform a very similar query), so it's not just about this one simple query. The ability to return an aggregated document as a separate object, rather than as an array field on the parent document, would be quite helpful. If it's not possible, the world won't fall apart :)

To all those whom it may concern...
Thanks to answers from some good samaritans here, I've figured it out as a combination of $addFields, $project and $unwind. Extending my original aggregation query, the final pipeline would look like this:
[{
$match: { 'domains.name': '<DOMAIN_HERE>' },
}, {
$addFields: { tenant: '$$ROOT' },
}, {
$project: { _id: 0, tenant: 1 },
}, {
$lookup: {
from: 'campaigns',
localField: 'tenant._id',
foreignField: 'tenantId',
as: 'campaign',
pipeline: [{
$match: { slug: '<SLUG_HERE>' },
}],
},
}, {
$unwind: {
path: '$campaign',
preserveNullAndEmptyArrays: true,
},
}]
Thanks for the help! 😊
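For what it's worth, the $addFields + $project pair can probably be collapsed into a single $replaceWith stage (available since MongoDB 4.2). The sketch below only builds the pipeline array. The function name buildTenantCampaignPipeline, its parameters, and the extra $limit are made up for illustration, and combining localField/foreignField with pipeline in $lookup assumes MongoDB 5.0 or newer.

```javascript
// Hypothetical helper: builds the pipeline for a tenant domain and a campaign slug.
function buildTenantCampaignPipeline(domain, slug) {
  return [
    { $match: { 'domains.name': domain } },
    // Wrap the whole tenant document under a `tenant` key in one stage.
    { $replaceWith: { tenant: '$$ROOT' } },
    {
      $lookup: {
        from: 'campaigns',
        localField: 'tenant._id',
        foreignField: 'tenantId',
        // Assumption: at most one campaign per tenant and slug is wanted.
        pipeline: [{ $match: { slug } }, { $limit: 1 }],
        as: 'campaign',
      },
    },
    // Turn the one-element array into a plain object; keep tenants without a campaign.
    { $unwind: { path: '$campaign', preserveNullAndEmptyArrays: true } },
  ];
}

const tenantCampaignPipeline = buildTenantCampaignPipeline('example.com', 'summer-sale');
console.log(JSON.stringify(tenantCampaignPipeline, null, 2));
```

It would be run as db.tenants.aggregate(tenantCampaignPipeline), assuming the collection is named tenants as in the question.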

Related

How to use the result of $graphLookup in $match stage

I have two collections with the following schema:
product_categories
interface Category {
_id: string;
name: string;
parentId: string;
}
This collection is hierarchical.
Example Data
Components
CPU
GPU
RAM
MOBO
Accessories
Keyboard
Mouse
Headphone
products
interface Product {
_id: string;
name: string;
categoryId: string;
// ... other irrelevant fields ...
}
I want to filter products by categoryId. For example: if I'm searching
with the _id of Components category then the query should also find
products from sub-categories like "CPU", "GPU" etc.
I've managed to come up with a query to find sub-categories recursively with the
$graphLookup stage.
db.product_categories.aggregate([
{ $match: { _id: "<parent-category-id-here>" } },
{
$graphLookup: {
as: "children",
startWith: "$_id",
connectFromField: "_id",
connectToField: "parentId",
from: "product_categories",
},
},
{ $unwind: "$children" },
{ $project: { _id: 1, childId: "$children._id" } },
]);
But, I don't know how to use it in the $match stage of the following query.
db.products.aggregate([
{
$match: {
categoryId: {
$in: [
"<parent-category-id>",
"<sub-category-ids-from-the-$graphLookup-stage>",
],
},
},
},
// ... other stages
]);
I know that, I can just fetch the subcategories before running second query but
I'm trying to do it all in one go. Is it possible?
Please give me some hint about how can I proceed further from this step. Is
there a better alternative solution to this problem?
Thanks in advance, I highly appreciate your time on SO 💝.
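One possible direction, sketched under assumptions and not tested against the asker's data: start from product_categories, collect the parent id plus all descendant ids into one array, and then $lookup into products with that array as localField (when localField is an array, $lookup matches each element against the foreign field). The helper name buildProductsByCategoryPipeline and the categoryIds field are hypothetical.

```javascript
// Hypothetical helper: builds a pipeline to run on product_categories.
function buildProductsByCategoryPipeline(parentCategoryId) {
  return [
    { $match: { _id: parentCategoryId } },
    {
      $graphLookup: {
        from: 'product_categories',
        startWith: '$_id',
        connectFromField: '_id',
        connectToField: 'parentId',
        as: 'children',
      },
    },
    // Parent id plus every descendant id in a single array.
    { $project: { categoryIds: { $concatArrays: [['$_id'], '$children._id'] } } },
    {
      $lookup: {
        from: 'products',
        localField: 'categoryIds',
        foreignField: 'categoryId',
        as: 'products',
      },
    },
    // Emit one output document per matching product.
    { $unwind: '$products' },
    { $replaceRoot: { newRoot: '$products' } },
    // ... other stages
  ];
}

console.log(JSON.stringify(buildProductsByCategoryPipeline('components-id'), null, 2));
```

The result would be passed to db.product_categories.aggregate(...). An index on products.categoryId is likely to matter for the $lookup stage.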

Access root document map in the $filter aggregation (MongoDb)

I apologize for the vague question description, but I have quite a complex question regarding filtration in MongoDB aggregations. Please, see my data schema to understand the question better:
Company {
_id: ObjectId
name: string
}
License {
_id: ObjectId
companyId: ObjectId
userId: ObjectId
}
User {
_id: ObjectId
companyId: ObjectId
email: string
}
The goal:
I would like to query all non-licensed users. In order to do this, you would need these plain MongoDB queries:
const licenses = db.licenses.find({ companyId }); // Get all licenses for specific company
const userIds = licenses.toArray().map(l => l.userId); // Collect all licensed user ids
const nonLicensedUsers = db.users.find({ _id: { $nin: userIds } }); // Query all users that don't hold a license
The problem:
The code above works perfectly fine. However, in our system, companies may have hundreds of thousands of users. Therefore, the first and the last step become exceptionally expensive. I'll elaborate on this. First things first, you need to fetch a big number of documents from DB and transmit them via the network, which is fairly expensive. Then, we need to pass a huge $nin query to MongoDB over the network again, which doubles overhead costs.
So, I would like to perform all the mentioned operations on the MongoDB end and return a small slice of non-licensed users to avoid network transmission costs. Are there ideas on how to achieve this?
I was able to come pretty close using the following aggregation (pseudo-code):
db.company.aggregate([
{ $match: { _id: id } }, // Step 1. Find the company entity by id
{ $lookup: {...} }, // Step 2. Joins 'users' collection by `companyId` field
{ $lookup: {...} }, // Step 3. Joins 'licenses' collection by `companyId` field
{
$project: {
licensesMap: // Step 4. Convert 'licenses' array to the map with the shape { 'user-id': true }. Could be done with $arrayToObject operator
}
},
{
$project: {
unlicensedUsers: {
$filter: {...} // And this is the place, where I stopped
}
}
}
]);
Let's have a closer look at the final stage of the above aggregation. I tried to utilize the $filter aggregation in the following manner:
{
$filter: {
input: "$users",
as: "user",
cond: {
$neq: ["$licensesMap[$$user._id]", true]
}
}
}
But, unfortunately, that didn't work. It seemed like MongoDB didn't apply interpolation and just tried to compare a raw "$licensesMap[$$user._id]" string with true boolean value.
Note #1:
Unfortunately, we're not in a position to change the current data schema. It would be costly for us.
Note #2:
I didn't include this in the aggregation example above, but I did convert Mongo object ids to strings to be able to create the licensesMap. And also, I stringified the ids of the users list to be able to access licensesMap properly.
Sample data:
Companies collection:
[
{ _id: "1", name: "Acme" }
]
Licenses collection
[
{ _id: "1", companyId: "1", userId: "1" },
{ _id: "2", companyId: "1", userId: "2" }
]
Users collection:
[
{ _id: "1", companyId: "1" },
{ _id: "2", companyId: "1" },
{ _id: "3", companyId: "1" },
{ _id: "4", companyId: "1" },
]
The expected result is:
[
{
_id: "1", // company id
name: "Acme",
unlicensedUsers: [
{ _id: "3", companyId: "1" },
{ _id: "4", companyId: "1" },
]
}
]
Explanation: unlicensedUsers list contains the third and the fourth users because they don't have corresponding entries in the licenses collection.
How about something simple like:
db.usersCollection.aggregate([
{
$lookup: {
from: "licensesCollection",
localField: "_id",
foreignField: "userId",
as: "licensedUsers"
}
},
{$match: {"licensedUsers.0": {$exists: false}}},
{
$group: {
_id: "$companyId",
unlicensedUsers: {$push: {_id: "$_id", companyId: "$companyId"}}
}
},
{
$lookup: {
from: "companiesCollection",
localField: "_id",
foreignField: "_id",
as: "company"
}
},
{$project: {unlicensedUsers: 1, company: {$arrayElemAt: ["$company", 0]}}},
{$project: {unlicensedUsers: 1, name: "$company.name"}}
])
playground example
The users and licenses collections together hold everything you need about the users. The first $lookup "merges" them, a simple $match keeps only the unlicensed users, and everything after that just shapes the output into the format you asked for.
Bonus: This solution can work with any type of id. For example playground
If you're facing a similar situation, bear in mind that the above solution will be fast only with a hashed index.
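Since the question asks for a small slice of one company's unlicensed users, the same anti-join idea can be narrowed down. This is a minimal sketch, assuming the collections are named users and licenses as in the question's plain queries; the helper name buildUnlicensedUsersPipeline and its limit parameter are hypothetical.

```javascript
// Hypothetical helper: builds a pipeline to run on the users collection.
function buildUnlicensedUsersPipeline(companyId, limit) {
  return [
    // Restrict to one company first so the join touches fewer documents.
    { $match: { companyId } },
    {
      $lookup: {
        from: 'licenses',
        localField: '_id',
        foreignField: 'userId',
        as: 'licenses',
      },
    },
    // Keep only users for whom the join found no license.
    { $match: { licenses: { $size: 0 } } },
    { $project: { licenses: 0 } },
    // Return only a small slice to limit network transfer.
    { $limit: limit },
  ];
}

console.log(JSON.stringify(buildUnlicensedUsersPipeline('1', 50), null, 2));
```

It would be run as db.users.aggregate(buildUnlicensedUsersPipeline(companyId, 50)). An index on licenses.userId is what keeps the $lookup cheap.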

What's the best way to manage large ObjectID arrays in mongoose / mongo

In this case:
const PostSchema = new mongoose.Schema({
"content": {
type: String,
required: true
},
"user": {
type: mongoose.Schema.Types.ObjectId,
required: true,
ref: "User"
},
"created": {
type: Date,
default: Date.now // pass the function itself; Date.now() would be evaluated only once, when the schema is defined
},
"comments": [{
type: mongoose.Schema.Types.ObjectId,
ref: 'Comment'
}]
})
I want to be able to get 10 comments at a time, but I see no way to do that without having to get all the comments every time.
You can use the pipeline form of $lookup (with let and pipeline) to join the collections and limit the result to 10. Here is an example; I used String for _id for easy understanding.
$lookup - there are two forms of $lookup; I used the pipeline form here, which lets you run aggregation stages on the joined collection. $match conditionally joins documents. $expr is required inside $match in order to reference the variables declared in let. $limit caps the number of joined documents. If needed, you can add more stages inside the pipeline.
Here is the script
db.PostSchema.aggregate([
{
"$lookup": {
"from": "Comment",
let: {
cId: "$comments"
},
"pipeline": [
{
$match: {
$expr: {
// keep only comments whose _id is listed in the post's comments array
$in: [
"$_id",
"$$cId"
]
}
}
},
{
$limit: 10
}
],
"as": "comments"
}
}
])
Working Mongo playground
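To fetch a specific page of 10 rather than just the first 10, $skip and a stable $sort can be added inside the joined pipeline. The sketch below is a pipeline builder only; the helper name buildCommentsPagePipeline, the 1-based page parameter, sorting by _id, and the collection name comments (mongoose's default pluralization of the Comment model) are assumptions.

```javascript
// Hypothetical helper: builds a pipeline to run on the posts collection.
function buildCommentsPagePipeline(postId, page, pageSize) {
  return [
    { $match: { _id: postId } },
    {
      $lookup: {
        from: 'comments',
        // Fall back to an empty array so $in never receives a missing value.
        let: { commentIds: { $ifNull: ['$comments', []] } },
        pipeline: [
          { $match: { $expr: { $in: ['$_id', '$$commentIds'] } } },
          // Stable order so that pages do not overlap.
          { $sort: { _id: 1 } },
          { $skip: (page - 1) * pageSize },
          { $limit: pageSize },
        ],
        as: 'comments',
      },
    },
  ];
}

console.log(JSON.stringify(buildCommentsPagePipeline('post-1', 2, 10), null, 2));
```

With mongoose this would be passed to Post.aggregate(...), where Post is assumed to be the model compiled from PostSchema and postId is already an ObjectId.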

How to lookup a field in an array of subdocuments in mongoose?

I have an array of review objects like this :
"reviews": {
"author": "5e9167c5303a530023bcae42",
"rate": 5,
"spoiler": false,
"content": "This is a comment This is a comment This is a comment.",
"createdAt": "2020-04-12T16:08:34.966Z",
"updatedAt": "2020-04-12T16:08:34.966Z"
},
What I want to achieve is to lookup the author field and get the user data, but the problem is that the lookup I am trying to use only returns this to me:
Code :
.lookup({
from: 'users',
localField: 'reviews.author',
foreignField: '_id',
as: 'reviews.author',
})
Response :
Any way to get the author's data in that field? That's where the author's Id is.
Try to execute below query on your database :
db.reviews.aggregate([
/** $unwind is generally not needed for `$lookup`, but it is needed here because each lookup result has to be matched with a specific element of the array */
{
$unwind: { path: "$reviews", preserveNullAndEmptyArrays: true },
},
{
$lookup: {
from: "users",
localField: "reviews.author",
foreignField: "_id",
as: "author", // Pull lookup result into 'author' field
},
},
/** Update 'reviews.author' field in 'reviews' object by checking if 'author' field got a match from 'users' collection.
* If Yes - As lookup returns an array get first elem & assign(As there will be only one element returned -uniques),
* If No - keep 'reviews.author' as is */
{
$addFields: {
"reviews.author": {
$cond: [
{ $ne: ["$author", []] },
{ $arrayElemAt: ["$author", 0] },
"$reviews.author",
],
},
},
},
/** Group back the documents based on '_id' field & push back all individual 'reviews' objects to 'reviews' array */
{
$group: {
_id: "$_id",
reviews: { $push: "$reviews" },
},
},
]);
Test : MongoDB-Playground
Note : If the document has other fields besides reviews that need to be preserved in the output, replace the $group stage with these stages :
{
$group: {
_id: "$_id",
data: {
$first: "$$ROOT"
},
reviews: {
$push: "$reviews"
}
}
},
{
$addFields: {
"data.reviews": "$reviews"
}
},
{
$project: {
"data.author": 0
}
},
{
$replaceRoot: {
newRoot: "$data"
}
}
Test : MongoDB-Playground
Note : Try to run such queries on smaller datasets, for example by adding $match as the first stage to filter documents, and make sure you have proper indexes.
You should use mongoose's populate('author') method on the query; it takes the id of the author and adds the user data to the result that mongoose returns.
Don't forget to set up your schema so that the two collections are connected.
In your review schema, add a ref to the model in which the author (user) is saved:
author: { type: Schema.Types.ObjectId, ref: 'users' },
You can follow this code
$lookup:{
from:'users',
localField:'reviews.author',
foreignField:'_id',
as:'reviews.author'
}
OR
When you find the doc, use populate:
reviews.find().populate("author")

How to $lookup/populate an embedded document that is inside an array?

Below is how my schema is looking like.
const CommentSchema = new mongoose.Schema({
commentText:{
type:String,
required: true
},
arrayOfReplies: [{
replyText:{
type:String,
required: true
},
replier: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
required: true,
}],
}],
});
How can I get query results that look like below:
[
{
commentText: 'comment text',
arrayOfReplies: [
{
replyText: 'replyText',
replier: {
username:"username",
bio: 'bio'
}
}
]
}
]
I am trying to populate the replier field inside the array arrayOfReplies. I have tried several variations of the aggregation query below. The ones that have come close to what I am trying to achieve have one short-coming. The comments that do not have replies have an arrayOfReplies array that has an empty object. I.e arrayOfReplies: [{}], essentially meaning that the array is not empty.
I have tried using add fields, $mergeObjects among other pipeline operators but to no avail.
How to $lookup/populate the replier document that is inside the arrayOfReplies array?
Below is a template of the main part of my aggregation query, minus trying populate the replier document.
Comment.aggregate([
{$unwind: {"path": '$arrayOfReplies', "preserveNullAndEmptyArrays": true }},
{$lookup:{from:"users",localField:"arrayOfReplies.replier",foreignField:"_id",as:"replier"}},
{$unwind: {"path": "$replier", "preserveNullAndEmptyArrays": true }},
{$group: {
_id : '$_id',
commentText:{$first: '$commentText'},
userWhoPostedThisComment:{$first: '$userWhoPostedThisComment'},
arrayOfReplies: {$push: '$arrayOfReplies' },
}},
After your lookup stage, each document will have
{
commentText: "text",
arrayOfReplies: <single reply, with replier ID>
replier: [<looked up replier data>]
}
Use an $addFields stage to move that replier data inside the reply object before the group, like:
{$addFields: {"arrayOfReplies.replier":"$replier"}}
Then your group stage will rebuild arrayOfReplies like you want.
You can use the following aggregate:
Playground
Comment.aggregate([
{
$unwind: {
"path": "$arrayOfReplies",
"preserveNullAndEmptyArrays": true
}
},
{
$lookup: {
from: "users",
localField: "arrayOfReplies.replier",
foreignField: "_id",
as: "replier"
}
},
{
$addFields: {
"arrayOfReplies.replier": {
$arrayElemAt: [
"$replier",
0
]
}
}
},
{
$project: {
"replier": 0
}
},
{
$group: {
_id: "$_id",
"arrayOfReplies": {
"$push": "$arrayOfReplies"
},
commentText: {
"$first": "$commentText"
}
}
}
]);
All the answers provided did not solve this issue as stated in the question.
I am trying to populate the replier field inside the array
arrayOfReplies. I have tried several variations of the aggregation
query below. The ones that have come close to what I am trying to
achieve have one short-coming. The comments that do not have replies
have an arrayOfReplies array that has an empty object. I.e
arrayOfReplies: [{}], essentially meaning that the array is not empty.
I wanted an aggregation that returns an empty array (not an array with an empty object) when the array is empty.
I was able to achieve what I wanted by using the code below:
arrayOfReplies:
{$cond:{
if: { $eq: ['$arrayOfReplies', {} ] },
then: "$$REMOVE",
else: {
_id : '$arrayOfReplies._id',
replyText:'$arrayOfReplies.replyText',
}
}}
If you combine the code above with @SuleymanSah's answer, you get the full working code.
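Put together, the combination might look roughly like the sketch below. It is not the exact code from either answer: it checks for a missing reply with $type instead of comparing against {}, attaches the replier with $mergeObjects, and relies on $push skipping missing values, which is worth verifying on your server version. The helper name buildCommentsWithRepliersPipeline is hypothetical.

```javascript
// Hypothetical helper: builds a pipeline to run on the comments collection.
function buildCommentsWithRepliersPipeline() {
  return [
    { $unwind: { path: '$arrayOfReplies', preserveNullAndEmptyArrays: true } },
    {
      $lookup: {
        from: 'users',
        localField: 'arrayOfReplies.replier',
        foreignField: '_id',
        as: 'replier',
      },
    },
    {
      $addFields: {
        arrayOfReplies: {
          $cond: [
            // Comments without replies have no arrayOfReplies field after $unwind.
            { $eq: [{ $type: '$arrayOfReplies' }, 'missing'] },
            '$$REMOVE',
            // Otherwise replace the replier ids with the first looked-up user.
            { $mergeObjects: ['$arrayOfReplies', { replier: { $arrayElemAt: ['$replier', 0] } }] },
          ],
        },
      },
    },
    { $project: { replier: 0 } },
    {
      $group: {
        _id: '$_id',
        commentText: { $first: '$commentText' },
        // Nothing is pushed for comments without replies, so the array stays empty.
        arrayOfReplies: { $push: '$arrayOfReplies' },
      },
    },
  ];
}

console.log(JSON.stringify(buildCommentsWithRepliersPipeline(), null, 2));
```

It would be passed to Comment.aggregate(...), as in the question.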