Issues with lookup and match multiple collections - mongodb

I'm having issues with aggregate and lookup in multiple stages. The issue is that I cannot match by userId in the last lookup. If I omit the { $eq: ['$userId', '$$userId'] } it works and matches by the other criteria, but not by the userId.
I've tried adding pools as a let and using it as { $eq: ['$userId', '$$pools.userId'] } in the last stage, but that doesn't work either. I get an empty coupon array.
I get this with the below query. I think I need to use $unwind in some way, but I haven't got that to work yet. Any pointers?
There are three collections in total to be joined. First the userModel; it should contain pools, and then the pools should contain a user's coupons.
{
"userId": "5df344a1372f345308dac12a", // Match this userId with the userId coming from the coupon below
"pools": [
{
"_id": "5e1ebbc6cffd4b042fc081ab",
"eventId": "id999",
"eventStartTime": "some date",
"trackName": "tracky",
"type": "foo bar",
"coupon": []
}
]
},
I need the coupon array to be filled with the correct data (below), which has a matching userId in it.
"coupon": [
{
"eventId": "id999",
"userId": "5df344a1372f345308dac12a", // This userId needs to match the one above
"checked": true,
"pool": "a pool"
}
]
poolProject:
const poolProject = {
eventId: 1,
eventStartTime: 1,
trackName: 1,
type: 1,
};
Userproject:
const userProjection = {
_id: {
$toString: '$_id',
},
paper: 1,
correctBetsLastWeek: 1,
correctBetsTotal: 1,
totalScore: 1,
role: 1,
};
The aggregate query
const result = await userModel.aggregate([
  { $project: userProjection },
  {
    $match: {
      $or: [{ role: 'User' }, { role: 'SuperUser' }],
    },
  },
  { $addFields: { userId: { $toString: '$_id' } } },
  {
    $lookup: {
      from: 'pools',
      as: 'pools',
      let: { eventId: '$eventId' },
      pipeline: [
        { $project: poolProject },
        {
          $match: {
            $expr: {
              $in: ['$eventId', eventIds],
            },
          },
        },
        {
          $lookup: {
            from: 'coupons',
            as: 'coupon',
            let: { innerUserId: '$$userId' },
            pipeline: [
              {
                $match: {
                  $expr: {
                    $eq: ['$userId', '$$innerUserId'],
                  },
                },
              },
            ],
          },
        },
      ],
    },
  },
]);
Thanks for any input!
Edit:
If I move the second lookup (coupon) so that both lookups are on the same "level" it works, but I would like to have it inside of the pool. If I add as: 'pools.coupon' in the last lookup, it overwrites the looked-up pool data.

The $$ prefix refers to a variable, either one you define in let or one of Mongo's "special" system variables, rather than to a field of the document.
I don't know exactly how Mongo resolves this internally, but you're defining two variables with the same name, which seems to cause a conflict.
So either remove userId: '$userId' from the first lookup, as you're not even using it there,
or rename the second userId: '$userId' to something different, like innerUserId: '$userId', to avoid conflicts when you access it.
Just don't forget to change { $eq: ['$userId', '$$userId'] } to { $eq: ['$userId', '$$innerUserId'] } afterwards.
EDIT:
Now that it's clear there's no userId field in the pools collection, just change the variable in the second lookup from:
let: { innerUserId: '$userId' } //userId does not exist in pools.
To:
let: { innerUserId: '$$userId' }
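Put together, the nested lookup could look like the sketch below. It assumes the outer let declares userId (the query in the question only shows eventId there), so that the inner let can forward it. buildPoolsLookup and the sample event id are hypothetical names used only for illustration; the code just builds the stage object and does not connect to a database.

```javascript
// Sketch: builds the outer $lookup stage with a nested coupon lookup.
// Assumption: the outer `let` exposes the user's id as $$userId.
function buildPoolsLookup(eventIds) {
  return {
    $lookup: {
      from: 'pools',
      as: 'pools',
      // $userId is a field of the user document (added earlier by $addFields).
      let: { userId: '$userId' },
      pipeline: [
        { $match: { $expr: { $in: ['$eventId', eventIds] } } },
        {
          $lookup: {
            from: 'coupons',
            as: 'coupon',
            // Pools have no userId field, so forward the outer variable ($$)
            // instead of reading a field ($).
            let: { innerUserId: '$$userId' },
            pipeline: [
              { $match: { $expr: { $eq: ['$userId', '$$innerUserId'] } } },
            ],
          },
        },
      ],
    },
  };
}

const stage = buildPoolsLookup(['id999']);
console.log(JSON.stringify(stage, null, 2));
```

The returned object can be dropped into the aggregate array in place of the original $lookup stage.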

Related

MongoDB query slow in join using $ne to look for not empty arrays

I'm new to Mongo and I have a slow query when using $ne in a $match stage (to get only the records that matched, rather than also the ones where the joined array is empty).
The query is as follows:
db.EN.aggregate([
  {
    $lookup: {
      from: 'csv_import',
      let: { pn: '$ICECAT-interface.Product.#Prod_id' },
      pipeline: [{
        $match: {
          $expr: {
            $eq: ["$$pn", "$part_no"]
          }
        }
      }],
      as: 'part_number_info'
    }
  },
  { $match: { part_number_info: { $ne: [] } } }
]).pretty();
When I remove the { $match: { part_number_info: { $ne: [] } } } stage the query executes in 21 seconds, versus almost 2 hours with the $ne clause.
There's already an index on ICECAT-interface.Product.#Prod_id, and here are sample documents from the two collections:
csv_import:
{
"_id": "ObjectId(\"6348339cc6e5c8ce0b7da5a4\")",
"index": 23679,
"product_id": 4019734,
"part_no": "CP-HAR-EP-ADVANCED-REN-1Y",
"vendor_standard": "Check Point"
}
EN:
[{
"_id": "1414",
"ICECAT-interface": {
"#xmlns:xsi": "http://www.w3.org/2001/XMLSchema-instance",
"#xsi:noNamespaceSchemaLocation": "https://data.icecat.biz/xsd/ICECAT-interface_response.xsd",
"Product": {
"#Code": "1",
"#HighPic": "https://images.icecat.biz/img/norm/high/1414-HP.jpg",
"#HighPicHeight": "400",
"#HighPicSize": "43288",
"#HighPicWidth": "400",
"#ID": "1414",
"#LowPic": "https://images.icecat.biz/img/norm/low/1414-HP.jpg",
"#LowPicHeight": "200",
"#LowPicSize": "17390",
"#LowPicWidth": "200",
"#Name": "C6614NE",
"#IntName": "C6614NE",
"#LocalName": "",
"#Pic500x500": "https://images.icecat.biz/img/gallery_mediums/img_1414_medium_1480667779_072_2323.jpg",
"#Pic500x500Height": "500",
"#Pic500x500Size": "101045",
"#Pic500x500Width": "500",
"#Prod_id": "C6614NE",
SOLUTION
I added an index on the part_no field in csv_import and changed the order of the query to go from the smaller collection to the larger one (EN is 27GB and csv_import is a few MB).
Final query (includes the suggestion made by nimrod serok):
db.csv_import.aggregate([
  {
    $lookup: {
      from: 'EN',
      let: { pn: '$part_no' },
      pipeline: [{
        $match: {
          $expr: {
            $eq: ["$$pn", "$ICECAT-interface.Product.#Prod_id"]
          }
        }
      }],
      as: 'part_number_info'
    }
  },
  { $match: { "part_number_info.0": { $exists: true } } }
])
Instead of comparing with { $ne: [] }, a better option is to check that the first element of the array exists:
{$match: {"part_number_info.0": {$exists: true}}}
See how it works on the playground example

Conditionally update/upsert embedded array with findOneAndUpdate in MongoDB

I have a collection in the following format:
[
{
"postId": ObjectId("62dffd0acb17483cf015375f"),
"userId": ObjectId("62dff9584f5b702d61c81c3c"),
"state": [
{
"id": ObjectId("62dffc49cb17483cf0153220"),
"notes": "these are my custom notes!",
"lvl": 3,
},
{
"id": ObjectId("62dffc49cb17483cf0153221"),
"notes": "hello again",
"lvl": 0,
},
]
},
]
My goal is to be able to update or add an element in this array in the following situations:
(1) If the ID of the new element is not in the state array, push the new element into the array.
(2) If the ID of the new element is in the state array and its lvl field is 0, update that element with the new information.
(3) If the ID of the new element exists in the array and its lvl field is not 0, nothing should happen. I will throw an error by seeing that no documents were matched.
Basically, to accomplish this I was thinking about using findOneAndUpdate with upsert, but I am not sure how to tell the query to update the state if lvl is 0, or to do nothing if it is greater than 0, when a match is found.
For solving (1), this is what I was able to come up with:
db.collection.findOneAndUpdate(
{
"postId": ObjectId("62dffd0acb17483cf015375f"),
"userId": ObjectId("62dff9584f5b702d61c81c3c"),
"state.id": {
"$ne": ObjectId("62dffc49cb17483cf0153222"),
},
},
{
"$push": {"state": {"id": ObjectId("62dffc49cb17483cf0153222"), "lvl": 1}}
},
{
"new": true,
"upsert": true,
}
)
What is the correct way to approach this issue? Should I just split the query into multiple ones?
Edit: as of now I have done this in more than one query (one to fetch the document, then I iterate over its state array to check if the ID exists in it, and then I perform (1), (2) and (3) in a normal if-else clause)
If the ID of the new element exists in the array, and its lvl field is not 0, then nothing should happen. I will throw an error by seeing that no documents were matched.
First, FYI:
upsert does not work on elements of a nested array
upsert will not add new elements to the array
upsert can add a new document containing the new element
if you want to throw an error when the record is not present, then you don't need upsert
Second, you can achieve this in one query by using an update with an aggregation pipeline, available starting in MongoDB 4.2.
Note: this query returns the updated document, but the response carries no flag or other indication of which of the three situations applied; you have to work that out in your client-side code from the query response.
The steps are:
check conditions for the postId and userId fields only
update the state field in a $set stage
check whether the provided id is present among the state ids
  true: $map to iterate over the state array
    check the conditions for id and lvl: 0
      true: $mergeObjects to merge the current object with the new information
      false: leave the element unchanged
  false: add the new element to the state array with the $concatArrays operator
db.collection.findOneAndUpdate(
  {
    postId: ObjectId("62dffd0acb17483cf015375f"),
    userId: ObjectId("62dff9584f5b702d61c81c3c")
  },
  [{
    $set: {
      state: {
        $cond: [
          { $in: [ObjectId("62dffc49cb17483cf0153221"), "$state.id"] },
          {
            $map: {
              input: "$state",
              in: {
                $cond: [
                  {
                    $and: [
                      { $eq: ["$$this.id", ObjectId("62dffc49cb17483cf0153221")] },
                      { $eq: ["$$this.lvl", 0] }
                    ]
                  },
                  {
                    $mergeObjects: [
                      "$$this",
                      {
                        // update your new fields here
                        "notes": "new note"
                      }
                    ]
                  },
                  "$$this"
                ]
              }
            }
          },
          {
            $concatArrays: [
              "$state",
              [
                // add new element
                {
                  "id": ObjectId("62dffc49cb17483cf0153221"),
                  "lvl": 1
                }
              ]
            ]
          }
        ]
      }
    }
  }],
  { returnNewDocument: true }
)
Playground
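Because the response does not say which of the three situations applied, it can help to mirror the decision logic in client code. The following is a plain JavaScript sketch of the same branching on an in-memory array. It does not talk to MongoDB, and applyStateUpdate, the outcome labels, and the short string ids are hypothetical stand-ins for illustration only.

```javascript
// Plain-JS mirror of the pipeline's branching (illustration only, no database).
function applyStateUpdate(state, input) {
  const exists = state.some((el) => el.id === input.id);
  if (!exists) {
    // Case 1: id not present -> append (the $concatArrays branch).
    return { outcome: 'pushed', state: [...state, input] };
  }
  let updated = false;
  const next = state.map((el) => {
    if (el.id === input.id && el.lvl === 0) {
      // Case 2: id present with lvl 0 -> merge new fields (the $mergeObjects branch).
      updated = true;
      return { ...el, ...input };
    }
    // Case 3 and all non-matching elements: left untouched.
    return el;
  });
  return { outcome: updated ? 'updated' : 'unchanged', state: next };
}

const state = [
  { id: 'a', notes: 'these are my custom notes!', lvl: 3 },
  { id: 'b', notes: 'hello again', lvl: 0 },
];

console.log(applyStateUpdate(state, { id: 'c', lvl: 1 }).outcome);            // pushed
console.log(applyStateUpdate(state, { id: 'b', notes: 'new note' }).outcome); // updated
console.log(applyStateUpdate(state, { id: 'a', notes: 'new note' }).outcome); // unchanged
```

Comparing the document before and after the update in the same way is one option for deciding whether to throw in the third situation.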
Third, you can execute two update queries.
The first query handles the case where the element is not present; it pushes a new element into state:
let response = db.collection.findOneAndUpdate(
  {
    postId: ObjectId("62dffd0acb17483cf015375f"),
    userId: ObjectId("62dff9584f5b702d61c81c3c"),
    "state.id": { $ne: ObjectId("62dffc49cb17483cf0153221") }
  },
  {
    $push: {
      state: {
        id: ObjectId("62dffc49cb17483cf0153221"),
        lvl: 1
      }
    }
  },
  {
    returnNewDocument: true
  }
)
The second query runs only if the response of the first query is null.
It checks the state id and lvl: 0 conditions; if they are fulfilled it updates the fields, otherwise it returns null because no document is found.
You can throw an error if it returns null; otherwise continue with the response data.
if (response == null) {
  response = db.collection.findOneAndUpdate(
    {
      postId: ObjectId("62dffd0acb17483cf015375f"),
      userId: ObjectId("62dff9584f5b702d61c81c3c"),
      state: {
        $elemMatch: {
          id: ObjectId("62dffc49cb17483cf0153221"),
          lvl: 0
        }
      }
    },
    {
      $set: {
        // add your update fields
        "state.$.notes": "new note"
      }
    },
    {
      returnNewDocument: true
    }
  );
  // not found and throw an error
  if (response == null) {
    return {
      // throw error;
    };
  }
}
// do stuff with "response" data and return result
return {
  // success;
};
Note: Of the options above, I would recommend the third one, executing two update queries.
What you're trying to do became possible with the introduction of pipelined updates. Here is how I would do it: use $concatArrays to concatenate the existing state array with the new input, and $ifNull to initialize an empty array in case of an upsert, like so:
const inputObj = {
  "id": ObjectId("62dffc49cb17483cf0153222"),
  "lvl": 1
};
db.collection.findOneAndUpdate(
  {
    "postId": ObjectId("62dffd0acb17483cf015375f"),
    "userId": ObjectId("62dff9584f5b702d61c81c3c")
  },
  [
    // on upsert the document has no state yet, so start from an empty array
    { $set: { state: { $ifNull: ["$state", []] } } },
    {
      $set: {
        state: {
          $concatArrays: [
            {
              // overwrite the element with the same id, but only if its stored lvl is 0
              $map: {
                input: "$state",
                in: {
                  $mergeObjects: [
                    "$$this",
                    {
                      $cond: [
                        {
                          $and: [
                            { $eq: ["$$this.id", inputObj.id] },
                            { $eq: ["$$this.lvl", 0] }
                          ]
                        },
                        inputObj,
                        {}
                      ]
                    }
                  ]
                }
              }
            },
            {
              // append the new element only when its id is not in the array yet
              $cond: [
                { $not: [{ $in: [inputObj.id, "$state.id"] }] },
                [inputObj],
                []
              ]
            }
          ]
        }
      }
    }
  ],
  {
    "new": true,
    "upsert": true
  }
)
Mongo Playground
Prior to version 4.2 and the introduction of this feature, what you're trying to do was not possible using the naive update syntax. If you are using an older version, you'd have to split this into two separate calls: first a findOne to see if the document exists, and only then an update based on that. Obviously this can cause stability issues if you have a high update volume.

Mongodb lookup like search: local field as array of objects

I have two collections, userProfile and skills.
E.g. userProfile:
{
"_id": "5f72c6d4e23732390c96b031",
"name": "name",
"other_skills": [
"1","2"
],
"primary_skills": [
{
"_id": "607ffd1549e13876fef7f2c5",
"years": 4.5,
"skill_id": "1"
},
{
"_id": "607ffd1549e13876fef7f2c6",
"years": 2,
"skill_id": "2"
},
{
"_id": "607ffd1549e13876fef7f2c7",
"years": 1,
"skill_id": "3"
}
]
}
E.g. skills:
{
"_id":1,
"name": "Ruby on Rails",
}
{
"_id":2,
"name": "PHP",
}
{
"_id":3,
"name": "php",
}
I want to retrieve the user profiles based on the skills.
E.g. for the input skill php, I want to retrieve the user profiles that match either in primary_skills or in other_skills.
But I got confused about the implementation. I think it can be done with a pipeline in $lookup and $elemMatch. This is the query I have tried so far:
const skills = ['php','PHP']
userProfile.aggregate([{
$lookup:{
from:'skills',
let:{'primary_skills':'$primary_skills'},
pipeline:[
{
$match:{
primary_skills:{
$elemMatch:{
name:'' //not sure how to write match
}
}
}
}
]
}
}])
Can somebody help me with this? Thanks in advance.
I'll first show you how to correct your pipeline so that it works. However, this approach is very inefficient, as you will have to $lookup on every single user in your db, which is obviously a lot of overhead.
Here is how to properly match your condition:
const skills = ['php','PHP']
db.userProfile.aggregate([
  {
    $lookup: {
      from: "skills",
      let: {
        "primary_skills": {
          $map: {
            input: "$primary_skills",
            as: "skill",
            in: "$$skill.skill_id"
          }
        },
        "other_skills": "$other_skills"
      },
      pipeline: [
        {
          $match: {
            $expr: {
              "$in": [
                "$_id",
                {
                  "$concatArrays": [
                    "$$other_skills",
                    "$$primary_skills"
                  ]
                }
              ]
            }
          }
        }
      ],
      as: "skills"
    }
  },
  {
    $match: {
      'skills.name': {$in: skills}
    }
  }
])
Mongo Playground
As I've said, I recommend you do not do this. What I suggest is to split it into two calls: first fetch the relevant skill ids, and then query the users.
By doing this you can also utilize indexes for much faster queries, like so:
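// skills and userProfile are the collection (or model) handles;
// the name list is called skillNames so it does not shadow the skills collection
const skillNames = ['php', 'PHP'];
const matchedSkillIds = await skills.distinct('_id', { name: { $in: skillNames } });
const users = await userProfile.find({
  $or: [
    { 'primary_skills.skill_id': { $in: matchedSkillIds } },
    { 'other_skills': { $in: matchedSkillIds } }
  ]
});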
Finally, if you do insist on doing it in one query, at the very least start the pipeline from the skills collection.
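A skills-first pipeline could be laid out roughly as in the sketch below. This is an assumption about how such a pipeline might look, not a query taken from the answer above. buildSkillsFirstPipeline is a hypothetical helper, the $ifNull guards assume some profiles may lack one of the two arrays, and skill ids are assumed to be stored with the same type in both collections. The code only builds the pipeline array, so it runs without a database.

```javascript
// Sketch: start from the (small) skills collection, then join matching profiles.
function buildSkillsFirstPipeline(skillNames) {
  return [
    // Narrow down to the requested skills first; this can use an index on name if one exists.
    { $match: { name: { $in: skillNames } } },
    {
      $lookup: {
        from: 'userProfile',
        let: { skillId: '$_id' },
        pipeline: [
          {
            $match: {
              $expr: {
                $or: [
                  // '$primary_skills.skill_id' resolves to the array of skill_id values.
                  { $in: ['$$skillId', { $ifNull: ['$primary_skills.skill_id', []] }] },
                  { $in: ['$$skillId', { $ifNull: ['$other_skills', []] }] },
                ],
              },
            },
          },
        ],
        as: 'users',
      },
    },
    // One output document per distinct user, even if several skills matched.
    { $unwind: '$users' },
    { $group: { _id: '$users._id', user: { $first: '$users' } } },
    { $replaceRoot: { newRoot: '$user' } },
  ];
}

const pipeline = buildSkillsFirstPipeline(['php', 'PHP']);
console.log(pipeline.map((stage) => Object.keys(stage)[0]).join(' -> '));
// prints "$match -> $lookup -> $unwind -> $group -> $replaceRoot"
```

The array would be passed to aggregate on the skills collection, so only the few matching skill documents trigger a lookup instead of every user.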

get collection data based on id fetched in listing in nested lookups and conditional lookup in mongodb aggregation

I just need the following output:
{
message: "Details are:",
status: 1,
data:[
{
leadId: "92106",
projectName: "Sales Rep Mobile App with Shopify Backend",
projectOverview: "<any description>",
notificationsData:[
{
_id:"6076e2593580d805814c338e",
content:"<strong>User Jangid</strong> posted a comment about this message on estimation portal.",
estimationId:"5f75a496c70f05559088d971",
commentId:"6076e2583580d805814c338b",
commentData:{
_id:"6076e2583580d805814c338b",
content:"<p>hello krishna this is and this is second </p>\n"
}
},
{
_id:"6077c7c75c1bfc051f8dff3e",
content:"<strong>User Nunna</strong> posted a comment about this message on estimation portal.",
estimationId:"5f75a496c70f05559088d971",
commentId:"6077c7c35c1bfc051f8dff3b",
commentData:{
_id:"6077c7c35c1bfc051f8dff3b",
content:"<p>hey hiiii</p>\n\n<p> </p>\n"
}
},
],
userName:"user Nunna",
profileImage:"profile url",
isViewed:true,
isSortByStatus:2,
isNotifyEst:1
}
]
}
In the above example I have a list of multiple items of the same type, and this list is aggregated from the Estimation collection.
So, based on the Estimation collection's _id, I need to fetch the notifications, and based on the commentId found inside each notification, I need to fetch the data from the comments collection.
I already aggregate with some more collections; I only need the notification and comment data inside the notificationsData array.
Yes, I achieved this. I used the following query:
let estData = await Estimations.aggregate([
  {
    $lookup: {
      from: "notifications", as: "notificationsData",
      let: {
        estimation_id: "$_id"
      },
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                { $eq: ['$estimationId', "$$estimation_id"] },
                { $eq: ['$userId', ObjectId(this.req.currentUser._id)] },
                { $eq: ['$isViewed', false] },
              ]
            }
          },
        },
        {
          $lookup: {
            from: "comments",
            localField: "commentId",
            foreignField: "_id",
            as: "commentData"
          }
        },
        { "$unwind": "$commentData" },
      ],
    }
  },
  {
    $project: {
      projectName: 1,
      "notificationsData.content": 1,
      "notificationsData.estimationId": 1,
      "notificationsData._id": 1,
      "notificationsData.commentId": 1,
      "notificationsData.commentData.content": 1,
      "notificationsData.commentData._id": 1,
      leadId: 1,
      projectOverview: 1
    }
  },
])
With the above query I get notificationsData based on estimationId, and commentData based on the commentId specified inside each notification.
So this is the solution I used, and it works fine.

MongoDB find document based on existing reference in other collection

I have a situation where I have the following database structure:
Collection "User":
[
{ _id: ObjectId("507f1f77bcf86c0000000001"), name: "Mike", status: "ACTIVE", verified: true },
{ _id: ObjectId("507f1f77bcf86c0000000002"), name: "Ben", status: "INACTIVE", verified: true },
{ _id: ObjectId("507f1f77bcf86c0000000003"), name: "Anastasia", status: "ACTIVE", verified: true }
]
Collection "Reports"
[
{ userRef: ObjectId("507f1f77bcf86c0000000001"), reportVerified: true },
{ userRef: ObjectId("507f1f77bcf86c0000000003"), reportVerified: false },
]
As you can see, I have a collection with all of my users and a different collection called "Reports", where each entry references a user and has a separate flag field called "reportVerified". Now I want to find all entries from the "User" collection which have specific properties in the "User" collection but are also referenced with a specific property in the "Reports" collection.
Example: I want to find all users which have status "ACTIVE" in the "User" collection and are referenced in the "Reports" collection by an entry with "reportVerified" set to true. This should match only "Mike" in my case.
Having the properties of the "Reports" collection directly in the "User" collection is not an option for me.
The situation would be quite easy if I only had find criteria either in the "User" collection (simple find) or in the "Reports" collection (using populate), but I need a combination of both.
The best way would be to use aggregate. First you need to use $lookup to add the user object to the report object.
For example:
mongoose.db(dbName).collection(cName).aggregate([ // cName: the reports collection
  {
    $match: {} // your match condition for report, e.g. { reportVerified: true }
  },
  {
    $lookup: {
      from: "user-collection-name",
      let: { user_ref: "$userRef" }, // userRef is a field of the report document
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                { $eq: ["$_id", "$$user_ref"] }, // for joining collections
                { $eq: ["$status", conditionInput] } // for querying on user collection, e.g. "ACTIVE"
              ]
            }
          }
        }
      ],
      as: "user"
    }
  },
  { $match: { "user.0": { $exists: true } } } // keep only reports whose user matched
])
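If the result should be the user documents themselves, the join can also be started from the user side. The following is a minimal sketch under stated assumptions, not the answerer's method: it assumes the reports collection is literally named reports, and buildVerifiedUsersPipeline and the temporary verifiedReports field are hypothetical names. It only builds the pipeline array, so it runs without a database.

```javascript
// Sketch: users with a given status that are referenced by at least one verified report.
function buildVerifiedUsersPipeline(status) {
  return [
    // Filter on the user's own properties first.
    { $match: { status } },
    {
      $lookup: {
        from: 'reports', // assumed collection name
        let: { userId: '$_id' },
        pipeline: [
          {
            $match: {
              $expr: {
                $and: [
                  { $eq: ['$userRef', '$$userId'] },    // report points at this user
                  { $eq: ['$reportVerified', true] },   // and is verified
                ],
              },
            },
          },
          { $limit: 1 }, // one matching report is enough to qualify the user
        ],
        as: 'verifiedReports',
      },
    },
    // Keep only users that have at least one verified report.
    { $match: { 'verifiedReports.0': { $exists: true } } },
    // Drop the helper field so plain user documents come back.
    { $project: { verifiedReports: 0 } },
  ];
}

const userPipeline = buildVerifiedUsersPipeline('ACTIVE');
console.log(JSON.stringify(userPipeline, null, 2));
```

With the sample data from the question, a pipeline of this shape run on the user collection would be expected to keep only "Mike", since "Anastasia" is referenced only by an unverified report and "Ben" is inactive.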