MongoDB query with fields from different collection - mongodb

I have no idea about how to build a query which does this:
I have a collection of users, each user has a field userdata which contains an array of String.
Each string is the string of the ObjectID of other documents (news already seen) in another collection.
I need, knowing the username of this user, to perform a query which gets all the news but not those which have been already seen.
I think the $nin operator does what I need but I don't know how to mix it with data from another collection.
Users
user
username: String
userdata: Object
news: Array of String
News
news1
_id: ObjectID
news2
_id: ObjectID
EXAMPLE:
Users: [{
username: 'mario',
userdata: {
news: ['10', '11']
}
}]
News: [{
_id: '10',
content: 'hello world10'
},{
_id: '11',
content: 'hello world11'
},{
_id: '12',
content: 'hello world12'
}]
Passing to the query the username (as a String) 'mario', I need to query the collection News and get back only the one with _id '12'.
Thanks

You need to run $lookup with custom pipeline. There's no $nin for aggregations but you can use $not along with $in. Then you can also try $unwind with $replaceRoot to promote filtered News to the root level:
db.Users.aggregate([
{ $match: { username: "mario" } },
{
$lookup: {
from: "News",
let: { user_news: "$userdata.news" },
pipeline: [{ $match: { $expr: { $not: { $in: [ "$_id", "$$user_news" ] } } } }],
as: "filteredNews"
}
},
{ $unwind: "$filteredNews" },
{ $replaceRoot: { newRoot: "$filteredNews" }}
])

Related

MongoDB aggregate - return as separate objects

So I have 2 collections (tenants and campaigns) and I'm trying to compose a query to return 1 tenant and 1 campaign. As an input, there is a tenant domain and campaign slug. Since I first need the tenant _id to query the campaign (based on both tenantId and slug), aggregation seems more performative option (than making 2 consecutive queries).
Technically speaking, I know how to do that:
[{
$match: { 'domains.name': '<DOMAIN_HERE>' },
}, {
$lookup: {
from: 'campaigns',
localField: '_id',
foreignField: 'tenantId',
as: 'campaign',
pipeline: [{
$match: { slug: '<SLUG_HERE>' },
}],
},
}]
which returns:
{
_id: ObjectId('...'),
campaign: [{
_id: ObjectId('...'),
}],
}
But it feels very uncomfortable, because for one the campaign is returned as a field of tenant and for other the campaign is returned as a single item in an array. I know, I can process and better format the result programmatically afterwards. But is there any way to „hack“ the aggregation to achieve a result that looks more like this?
{
tenant: {
_id: ObjectId('...'),
},
campaign: {
_id: ObjectId('...'),
},
}
This is just a simplified example, in reality this aggregation query is a bit more complicated (across more collections, upon few of which I need to perform a very similar query), so it's not just about this one simple query. So the ability to return an aggregated document as a separate object, rather than an array field on parent document would be quite helpful - if not, the world won't fall apart :)
To all those whom it may concern...
Thanks to answers from some good samaritans here, I've figured it out as a combination of $addFields, $project and $unwind. Extending my original aggregation query, the final pipeline would look like this:
[{
$match: { 'domains.name': '<DOMAIN_HERE>' },
}, {
$addFields: { tenant: '$$ROOT' },
}, {
$project: { _id: 0, tenant: 1 },
}, {
$lookup: {
from: 'campaigns',
localField: 'tenant._id',
foreignField: 'tenantId',
as: 'campaign',
pipeline: [{
$match: { slug: '<SLUG_HERE>' },
}],
},
}, {
$unwind: {
path: '$campaign',
preserveNullAndEmptyArrays: true,
},
}]
Thanks for the help! 😊

Access root document map in the $filter aggregation (MongoDb)

I apologize for the vague question description, but I have quite a complex question regarding filtration in MongoDB aggregations. Please, see my data schema to understand the question better:
Company {
_id: ObjectId
name: string
}
License {
_id: ObjectId
companyId: ObjectId
userId: ObjectId
}
User {
_id: ObjectId
companyId: ObjectId
email: string
}
The goal:
I would like to query all non-licensed users. In order to do this, you would need these plain MongoDB queries:
const licenses = db.licenses.find({ companyId }); // Get all licenses for specific company
const userIds = licenses.toArray().map(l => l.userId); // Collect all licensed user ids
const nonLicensedUsers = db.users.find({ _id: { $nin: userIds } }); // Query all users that don't hold a license
The problem:
The code above works perfectly fine. However, in our system, companies may have hundreds of thousands of users. Therefore, the first and the last step become exceptionally expensive. I'll elaborate on this. First things first, you need to fetch a big number of documents from DB and transmit them via the network, which is fairly expensive. Then, we need to pass a huge $nin query to MongoDB over the network again, which doubles overhead costs.
So, I would like to perform all the mentioned operations on the MongoDB end and return a small slice of non-licensed users to avoid network transmission costs. Are there ideas on how to achieve this?
I was able to come pretty close using the following aggregation (pseudo-code):
db.company.aggregate([
{ $match: { _id: id } }, // Step 1. Find the company entity by id
{ $lookup: {...} }, // Step 2. Joins 'users' collection by `companyId` field
{ $lookup: {...} }, // Step 3. Joins 'licenses' collection by `companyId` field
{
$project: {
licensesMap: // Step 4. Convert 'licenses' array to the map with the shape { 'user-id': true }. Could be done with $arrayToObject operator
}
},
{
$project: {
unlicensedUsers: {
$filter: {...} // And this is the place, where I stopped
}
}
}
]);
Let's have a closer look at the final stage of the above aggregation. I tried to utilize the $filter aggregation in the following manner:
{
$filter: {
input: "$users"
as: "user",
cond: {
$neq: ["$licensesMap[$$user._id]", true]
}
}
}
But, unfortunately, that didn't work. It seemed like MongoDB didn't apply interpolation and just tried to compare a raw "$licensesMap[$$user._id]" string with true boolean value.
Note #1:
Unfortunately, we're not in a position to change the current data schema. It would be costly for us.
Note #2:
I didn't include this in the aggregation example above, but I did convert Mongo object ids to strings to be able to create the licensesMap. And also, I stringified the ids of the users list to be able to access licensesMap properly.
Sample data:
Companies collection:
[
{ _id: "1", name: "Acme" }
]
Licenses collection
[
{ _id: "1", companyId: "1", userId: "1" },
{ _id: "2", companyId: "1", userId: "2" }
]
Users collection:
[
{ _id: "1", companyId: "1" },
{ _id: "2", companyId: "1" },
{ _id: "3", companyId: "1" },
{ _id: "4", companyId: "1" },
]
The expected result is:
[
_id: "1", // company id
name: "Acme",
unlicensedUsers: [
{ _id: "3", companyId: "1" },
{ _id: "4", companyId: "1" },
]
]
Explanation: unlicensedUsers list contains the third and the fourth users because they don't have corresponding entries in the licenses collection.
How about something simple like:
db.usersCollection.aggregate([
{
$lookup: {
from: "licensesCollection",
localField: "_id",
foreignField: "userId",
as: "licensedUsers"
}
},
{$match: {"licensedUsers.0": {$exists: false}}},
{
$group: {
_id: "$companyId",
unlicensedUsers: {$push: {_id: "$_id", companyId: "$companyId"}}
}
},
{
$lookup: {
from: "companiesCollection",
localField: "_id",
foreignField: "_id",
as: "company"
}
},
{$project: {unlicensedUsers: 1, company: {$arrayElemAt: ["$company", 0]}}},
{$project: {unlicensedUsers: 1, name: "$company.name"}}
])
playground example
users collection and licenses collection, both have anything you need on the users so after the first $lookup that "merges" them, and a simple $match to keep only the unlicensed users, all that left is just formatting to the format you request.
Bonus: This solution can work with any type of id. For example playground
If you're facing a similar situation. Bear in mind that the above solution will work fast only with the hashed index.

Mongodb elemMatch not reading variable from aggregate root

I'm trying to build an aggregate query from a users collection. Each user has an id property that is associated in a clients collection array:
// users...
[
{
id: '12345',
name: 'John'
}
// ...
}
// clients...
[
{
name: 'Foo',
members: [ { id: '1234', role: 'Admin' } ]
},
// ....
]
So what I'm trying to do is aggregate the users collection and do a $lookup to "join" the clients with which a user is a member (by the id)
db.users.aggregate([
$lookup: {
from: 'clients',
as: 'clients',
let: { user_id: '$id' },
pipeline: [
{
$match: {
members: {
$elemMatch: { id: '$user_id' },
},
},
},
],
};
}])
If I hard-code any user's id into the $elemMatch (replacing $user_id) it works, but I can't seem to get it to work as a variable from the user records.
From $lookup let,
A $match stage requires the use of an $expr operator to access the variables. The $expr operator allows the use of aggregation expressions inside of the $match syntax.
From your scenario,
Need $expr to access the variable.
Apply the $in operator instead of $elemMatch.
To reference the variable in the pipeline, use $$<variable> but not $<variable>.
{
$match: {
$expr: {
$in: [
"$$user_id",
"$members.id"
]
}
}
}
Sample Mongo Playground

Mongodb lookup with array

I have two collections first one is
user_profile collection
const userProfileSchema = mongoose.Schema({
phone_number: {
type: String,
required: false,
},
primary_skills: [
{
skill_id: {
type: mongoose.Schema.Types.ObjectId,
ref: 'Skill'
},
years: Number,
}
]
});
sample data
{
"phone_number":"222",
"primary_skills":[{skill_id:1,years:12},{skill_id:2,years:13}]
}
in the primary_skills the key skill_id is mapped with another collection named skills
skills collection
const skillSchema = mongoose.Schema({
name: {
type: String,
required: true,
unique:true,
},
});
sample data
[
{
id:1,
name:'php'
},
{
id:2,
name:'java'
}
]
I want to fetch all values in the user_profile collection along with the respective skills name
expected output:
{
"phone_number":"222",
"primary_skills":[{
name:"php",skill_id:1,years:12
},{
name:"java",skill_id:2,years:13}
]
}
I found a similar thread to my question MongoDB lookup when foreign field is an array of objects but it's doing the opposite of what I want
This is the query I tried
profile.aggregate([{
$lookup:{
from:'skills',
localField:'primary_skills.skill_id',
foreignField:'_id',
'as':'primary_skills'
}
}])
This works fine but it didn't contain the years key
You need to do it with $unwind and $group,
$unwind primary_skills because its an array and we need to lookup sub document wise
db.user_profile.aggregate([
{
$unwind: "$primary_skills"
},
$lookup to join primary_skills, that you have already did
{
$lookup: {
from: "skills",
localField: "primary_skills.skill_id",
foreignField: "id",
as: "primary_skills.name"
}
},
$unwind primary_skills.name that we have stored join result, its array and we are unwinding to do object
{
$unwind: {
path: "$primary_skills.name"
}
},
$addFields replace field name that we have object and we need only name
{
$addFields: {
"primary_skills.name": "$primary_skills.name.name"
}
},
$group by _id because we have unwind and we need to combine all documents
{
$group: {
_id: "$_id",
phone_number: {
$first: "$phone_number"
},
primary_skills: {
$push: "$primary_skills"
}
}
}
])
Playground: https://mongoplayground.net/p/bDmrOwmASn5

How to $lookup/populate an embedded document that is inside an array?

How to $lookup/populate an embedded document that is inside an array?
Below is how my schema is looking like.
const CommentSchema = new mongoose.Schema({
commentText:{
type:String,
required: true
},
arrayOfReplies: [{
replyText:{
type:String,
required: true
},
replier: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
required: true,
}],
}],
});
How can I get query results that look like below:
[
{
commentText: 'comment text',
arrayOfReplies: [
{
replyText: 'replyText',
replier: {
username:"username"
bio: 'bio'
}
}
]
}
]
I am trying to populate the replier field inside the array arrayOfReplies. I have tried several variations of the aggregation query below. The ones that have come close to what I am trying to achieve have one short-coming. The comments that do not have replies have an arrayOfReplies array that has an empty object. I.e arrayOfReplies: [{}], essentially meaning that the array is not empty.
I have tried using add fields, $mergeObjects among other pipeline operators but to no avail.
How to $lookup/populate the replier document that is inside the arrayOfReplies array?
Below is a template of the main part of my aggregation query, minus trying populate the replier document.
Comment.aggregate([
{$unwind: {"path": '$arrayOfReplies', "preserveNullAndEmptyArrays": true }},
{$lookup:{from:"users",localField:"$arrayOfReplies.replier",foreignField:"_id",as:"replier"}},
{$unwind: {"path": "$replier", "preserveNullAndEmptyArrays": true }},
{$group: {
_id : '$_id',
commentText:{$first: '$commentText'},
userWhoPostedThisComment:{$first: '$userWhoPostedThisComment'},
arrayOfReplies: {$push: '$arrayOfReplies' },
}},
After your lookup stage, each document will have
{
commentText: "text",
arrayOfReplies: <single reply, with replier ID>
replier: [<looked up replier data>]
}
Use an $addFields stage to move that replier data inside the reply object before the group, like:
{$addFields: {"arrayOfReplies.replier":"$replier"}}
Then your group stage will rebuild arrayOfReplies like you want.
You can use the following aggregate:
Playground
Comment.aggregate([
{
$unwind: {
"path": "$arrayOfReplies",
"preserveNullAndEmptyArrays": true
}
},
{
$lookup: {
from: "users",
localField: "arrayOfReplies.replier",
foreignField: "_id",
as: "replier"
}
},
{
$addFields: {
"arrayOfReplies.replier": {
$arrayElemAt: [
"$replier",
0
]
}
}
},
{
$project: {
"replier": 0
}
},
{
$group: {
_id: "$_id",
"arrayOfReplies": {
"$push": "$arrayOfReplies"
},
commentText: {
"$first": "$commentText"
}
}
}
]);
All the answers provided did not solve this issue as stated in the question.
I am trying to populate the replier field inside the array
arrayOfReplies. I have tried several variations of the aggregation
query below. The ones that have come close to what I am trying to
achieve have one short-coming. The comments that do not have replies
have an arrayOfReplies array that has an empty object. I.e
arrayOfReplies: [{}], essentially meaning that the array is not empty.
I wanted an aggregation that returns an empty array (not an array with an empty object) when the array is empty.
I was able to achieve what I wanted by using the code below:
arrayOfReplies:
{$cond:{
if: { $eq: ['$arrayOfReplies', {} ] },
then: "$$REMOVE",
else: {
_id : '$arrayOfReplies._id',
replyText:'$arrayOfReplies.replyText',
}
}}
If you combine the code above with #SuleymanSah's answer you get the full working code.