In the example below, I am looking for new partner suggestions for user abc. abc has already sent a request to 123, so 123 can be ignored. rrr has sent a request to abc, but in that document rrr is in the fromUser field and abc is in the toUser field, so rrr is still a valid suggestion for abc.
I have two collections:
User collection
[
{
_id: "abc",
name: "abc",
group: 1
},
{
_id: "xyz",
name: "xyyy",
group: 1
},
{
_id: "123",
name: "yyy",
group: 1
},
{
_id: "rrr",
name: "tttt",
group: 1
},
{
_id: "eee",
name: "uuu",
group: 1
}
]
Partnership collection (if users have already partnered)
[
{
_id: "abc_123",
fromUser: "abc",
toUser: "123"
},
{
_id: "rrr_abc",
fromUser: "rrr",
toUser: "abc"
},
{
_id: "xyz_rrr",
fromUser: "xyz",
toUser: "rrr"
}
]
My query below excludes the user rrr, but it should not, because rrr is not listed in the toUser field of any partnership document whose fromUser is abc.
How can I modify this query to include user rrr in this case?
db.users.aggregate([
{
$match: {
group: 1,
_id: {
$ne: "abc"
}
}
},
{
$lookup: {
from: "partnership",
let: {
userId: "$_id"
},
as: "prob",
pipeline: [
{
$set: {
users: [
"$fromUser",
"$toUser"
],
u: "$$userId"
}
},
{
$match: {
$expr: {
$and: [
{
$in: [
"$$userId",
"$users"
]
},
{
$in: [
"abc",
"$users"
]
}
]
}
}
}
]
}
},
{
$match: {
"prob.0": {
$exists: false
}
}
},
{
$sample: {
size: 1
}
},
{
$unset: "prob"
}
])
https://mongoplayground.net/p/utGMeHFRGmt
Your current query excludes every user who already has a connection with abc, regardless of the direction of that connection. If the direction matters, use:
db.users.aggregate([
{$match: {
group: 1,
_id: {$ne: "abc"}
}
},
{$lookup: {
from: "partnership",
let: { userId: {$concat: ["abc", "_", "$_id"]}},
as: "prob",
pipeline: [{$match: {$expr: {$eq: ["$_id", "$$userId"]}}}]
}
},
{$match: {"prob.0": {$exists: false}}},
{$sample: {size: 1}},
{$unset: "prob"}
])
See how it works on the playground example
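To make the directional check concrete, here is a minimal in-memory sketch in plain JavaScript (no MongoDB needed). It assumes, as the sample data suggests, that every partnership _id has the form fromUser_toUser; the helper name suggestByCompositeId is hypothetical and the $sample stage is left out.

```javascript
// Sample data copied from the question.
const users = [
  { _id: "abc", name: "abc", group: 1 },
  { _id: "xyz", name: "xyyy", group: 1 },
  { _id: "123", name: "yyy", group: 1 },
  { _id: "rrr", name: "tttt", group: 1 },
  { _id: "eee", name: "uuu", group: 1 },
];
const partnership = [
  { _id: "abc_123", fromUser: "abc", toUser: "123" },
  { _id: "rrr_abc", fromUser: "rrr", toUser: "abc" },
  { _id: "xyz_rrr", fromUser: "xyz", toUser: "rrr" },
];

// Hypothetical helper: mirrors the $concat-based $lookup, minus $sample.
function suggestByCompositeId(currentUser, group) {
  const existingIds = new Set(partnership.map((p) => p._id));
  return users
    // $match: same group, not the current user.
    .filter((u) => u.group === group && u._id !== currentUser)
    // Keep a user only if no "<currentUser>_<user>" document exists.
    .filter((u) => !existingIds.has(`${currentUser}_${u._id}`))
    .map((u) => u._id);
}

console.log(suggestByCompositeId("abc", 1)); // [ 'xyz', 'rrr', 'eee' ]
```

rrr stays in the result because only the key abc_rrr is checked; the existing rrr_abc document points the other way.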
For MongoDB 5 and later, I'd propose the following aggregation pipeline:
db.users.aggregate([
{
$match: {
group: 1,
_id: {
$ne: "abc"
}
}
},
{
$lookup: {
from: "partnership",
as: "prob",
localField: "_id",
foreignField: "toUser",
pipeline: [
{
$match: {
fromUser: "abc",
}
}
]
}
},
{
$match: {
"prob.0": {
$exists: false
}
}
},
{
$unset: "prob"
}
])
The following documents are returned (full result without the $sample stage):
[
{
"_id": "eee",
"group": 1,
"name": "uuu"
},
{
"_id": "rrr",
"group": 1,
"name": "tttt"
},
{
"_id": "xyz",
"group": 1,
"name": "xyyy"
}
]
The main difference is that the lookup joins the collections on the toUser field (see localField and foreignField) and uses a minimal pipeline to restrict the joined documents further, so that only requests sent by "abc" to the current user document are retrieved.
See this playground to test.
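As a rough illustration of what each stage of that pipeline does, the following plain JavaScript sketch replays the stages on in-memory arrays. It is not how MongoDB executes the query; the helper name suggestFor is hypothetical and the data is the question's sample data.

```javascript
const users = [
  { _id: "abc", name: "abc", group: 1 },
  { _id: "xyz", name: "xyyy", group: 1 },
  { _id: "123", name: "yyy", group: 1 },
  { _id: "rrr", name: "tttt", group: 1 },
  { _id: "eee", name: "uuu", group: 1 },
];
const partnership = [
  { _id: "abc_123", fromUser: "abc", toUser: "123" },
  { _id: "rrr_abc", fromUser: "rrr", toUser: "abc" },
  { _id: "xyz_rrr", fromUser: "xyz", toUser: "rrr" },
];

// Hypothetical helper: one array operation per pipeline stage.
function suggestFor(currentUser) {
  return users
    // $match: same group, not the current user.
    .filter((u) => u.group === 1 && u._id !== currentUser)
    // $lookup: join on users._id == partnership.toUser,
    // then apply the sub-pipeline's $match on fromUser.
    .map((u) => ({
      ...u,
      prob: partnership
        .filter((p) => p.toUser === u._id)
        .filter((p) => p.fromUser === currentUser),
    }))
    // $match on "prob.0" not existing: the joined array is empty.
    .filter((u) => u.prob.length === 0)
    // $unset: drop the helper field again.
    .map(({ prob, ...rest }) => rest);
}

console.log(suggestFor("abc").map((u) => u._id)); // [ 'xyz', 'rrr', 'eee' ]
```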
With MongoDB versions before 5.0, you cannot combine localField and foreignField with pipeline to run the pipeline only on a subset of the documents in the from collection. To work around this, you can use this aggregation pipeline:
db.users.aggregate([
{
$match: {
group: 1,
_id: {
$ne: "abc"
}
}
},
{
$lookup: {
from: "partnership",
as: "prob",
let: {
userId: "$_id"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$eq: [
"$fromUser",
"abc"
]
},
{
$eq: [
"$toUser",
"$$userId"
]
}
]
}
}
}
]
}
},
{
$match: {
"prob.0": {
$exists: false
}
}
},
{
$unset: "prob"
}
])
The results are the same as for the previous pipeline.
See this playground to test.
As yet another approach, this query starts from the partnership collection, finds the users to exclude, and then does a "$lookup" for everybody else. The remainder is just output formatting, although it looks like you may want to add a "$sample" stage at the end.
db.partnership.aggregate([
{
"$match": {
"fromUser": "abc"
}
},
{
"$group": {
"_id": null,
"exclude": {"$push": "$toUser" }
}
},
{
"$lookup": {
"from": "users",
"let": {
"exclude": {"$concatArrays": [["abc"], "$exclude"]
}
},
"pipeline": [
{
"$match": {
"$expr": {
"$not": {"$in": ["$_id", "$$exclude"]}
}
}
}
],
"as": "output"
}
},
{
"$project": {
"_id": 0,
"output": 1
}
},
{"$unwind": "$output"},
{"$replaceWith": "$output"}
])
Try it on mongoplayground.net.
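The exclude-list idea can be sketched in plain JavaScript on in-memory arrays. The helper name suggestFromPartnerships is hypothetical, and like the pipeline above the sketch does not filter on group.

```javascript
const users = [
  { _id: "abc", name: "abc", group: 1 },
  { _id: "xyz", name: "xyyy", group: 1 },
  { _id: "123", name: "yyy", group: 1 },
  { _id: "rrr", name: "tttt", group: 1 },
  { _id: "eee", name: "uuu", group: 1 },
];
const partnership = [
  { _id: "abc_123", fromUser: "abc", toUser: "123" },
  { _id: "rrr_abc", fromUser: "rrr", toUser: "abc" },
  { _id: "xyz_rrr", fromUser: "xyz", toUser: "rrr" },
];

// Hypothetical helper following the same three steps as the pipeline.
function suggestFromPartnerships(currentUser) {
  // $match + $group/$push: every toUser the current user already asked.
  const alreadyAsked = partnership
    .filter((p) => p.fromUser === currentUser)
    .map((p) => p.toUser);
  // $concatArrays: the current user is excluded as well.
  const excluded = new Set([currentUser, ...alreadyAsked]);
  // $lookup with $not/$in: everybody who is not excluded.
  return users.filter((u) => !excluded.has(u._id));
}

console.log(suggestFromPartnerships("abc").map((u) => u._id));
// [ 'xyz', 'rrr', 'eee' ]
```

One difference worth checking before relying on the pipeline: if the current user has not sent any request yet, the $match leaves nothing for $group to emit, so the pipeline should return no suggestions at all, whereas this sketch still returns every other user.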
Related
I have two collections in MongoDB, items and categories.
items is
{
_id: "some_id",
category_A: "foo",
category_B: "bar",
}
and categories is
{
_id: "foo_id",
name: "foo",
type: "A"
},
{
_id: "bar_id",
name: "bar",
type: "B"
}
I'm trying to use a pipeline to get foo_id and bar_id by using $lookup, but I don't understand why the category_A_out array always returns empty.
Here is the relevant step of the pipeline for category_A:
{
from: 'categories',
"let": {
"category": "$name",
"type": "$type"
},
"pipeline": [{
"$match": {
$expr: {
$and: [
{ $eq: ["$category_A", "$$category"] },
{ $eq: ["$$type", "A"] }
]
}
}
}],
as: 'category_A_out'
}
I am sure that foo and bar exist in the categories collection.
What am I doing wrong?
let should be used to declare variables from the LEFT collection, which is items.
If category_A holds the category's name, you need to match it against name.
Otherwise, match it against _id.
db.items.aggregate([
{
$lookup: {
from: "categories",
"let": {
"category_A": "$category_A"
},
"pipeline": [
{
"$match": {
$expr: {
$and: [
{
$eq: [
"$name", // Or Match with $_id if category_A holds id
"$$category_A"
]
},
{
$eq: [
"$type",
"A"
]
}
]
}
}
}
],
as: "category_A_out"
}
}
])
Sample Mongo Playground
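A small plain JavaScript sketch may help show why the original let produced an empty array. It is a simplified model, not MongoDB's implementation: the hypothetical lookup helper binds each let variable to a field of the left (items) document, and the predicate then runs against every categories document.

```javascript
const items = [{ _id: "some_id", category_A: "foo", category_B: "bar" }];
const categories = [
  { _id: "foo_id", name: "foo", type: "A" },
  { _id: "bar_id", name: "bar", type: "B" },
];

// Hypothetical helper: letSpec maps variable names to field names of the
// LEFT document; predicate sees one categories document plus the variables.
function lookup(leftDoc, letSpec, predicate) {
  const vars = {};
  for (const [name, field] of Object.entries(letSpec)) {
    vars[name] = leftDoc[field];
  }
  return categories.filter((cat) => predicate(cat, vars));
}

// Original attempt: items documents have no "name" or "type" field, so both
// variables are undefined and the check against "A" can never succeed.
const broken = lookup(
  items[0],
  { category: "name", type: "type" },
  (cat, v) => cat.category_A === v.category && v.type === "A"
);

// Corrected: the variable comes from items, the compared fields from categories.
const fixed = lookup(
  items[0],
  { category_A: "category_A" },
  (cat, v) => cat.name === v.category_A && cat.type === "A"
);

console.log(broken.length); // 0
console.log(fixed.map((c) => c._id)); // [ 'foo_id' ]
```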
I have two collections, viz: clib and mp.
The schema for clib is : {name: String, type: Number} and that for mp is: {clibId: String}.
Sample Document for clib:
{_id: ObjectId("6178008397be0747443a2a92"), name: "c1", type: 1}
{_id: ObjectId("6178008397be0747443a2a91"), name: "c2", type: 0}
Sample Document for mp:
{clibId: "6178008397be0747443a2a92"}
{clibId:"6178008397be0747443a2a91"}
While querying mp, I want the clibIds that have type = 0 in the clib collection.
Any ideas how this can be achieved?
One approach I can think of is to use $lookup, but that doesn't seem to work. Also, I'm not sure whether this is an anti-pattern for MongoDB; another approach is to copy the type from clib to mp when saving the mp document.
If I've understood correctly, you can use a pipeline like this:
This query gets the documents from clib whose _id matches clibId and whose type is 0. I've also added a $match stage so that documents without any match are not output.
db.mp.aggregate([
{
"$lookup": {
"from": "clib",
"let": {
"id": "$clibId"
},
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{
"$eq": [
{
"$toObjectId": "$$id"
},
"$_id"
]
},
{
"$eq": [
"$type",
0
]
}
]
}
}
}
],
"as": "result"
}
},
{
"$match": {
"result": {
"$ne": []
}
}
}
])
Example here
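The logic of that pipeline can be replayed on in-memory arrays in plain JavaScript. In this sketch the ObjectId values are written in Extended JSON form ({ $oid: "..." }) as a stand-in, so comparing the hex string plays the role of $toObjectId; the helper name mpWithTypeZero is hypothetical.

```javascript
// Sample documents from the question; _id uses the { $oid } stand-in.
const clib = [
  { _id: { $oid: "6178008397be0747443a2a92" }, name: "c1", type: 1 },
  { _id: { $oid: "6178008397be0747443a2a91" }, name: "c2", type: 0 },
];
const mp = [
  { clibId: "6178008397be0747443a2a92" },
  { clibId: "6178008397be0747443a2a91" },
];

// Hypothetical helper mirroring the $lookup and the final $match.
function mpWithTypeZero() {
  return mp
    .map((m) => ({
      ...m,
      // $lookup: the ids must refer to the same value AND type must be 0.
      result: clib.filter((c) => c._id.$oid === m.clibId && c.type === 0),
    }))
    // $match: drop mp documents whose lookup found nothing.
    .filter((m) => m.result.length > 0);
}

console.log(mpWithTypeZero().map((m) => m.clibId));
// [ '6178008397be0747443a2a91' ]
```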
db.mp.aggregate([
{
$lookup: {
from: "clib",
let: {
clibId: "$clibId"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$eq: [ "$_id", { $toObjectId: "$$clibId" } ], // clibId is stored as a string, _id is an ObjectId
}
]
}
}
},
{
$project: { type: 1, _id: 0 }
}
],
as: "clib"
}
},
{
"$unwind": "$clib"
},
{
"$match": {
"clib.type": 0
}
}
])
Test Here
I would like to get, for each subRoom, a count of all notifications that have not been read by a given user ("A", "B", "C", etc.). Since there could be millions of notification documents and hundreds of subRoom elements in the Rooms collection, I need to limit the work. For that reason I limit the $lookup to the first 100 elements and then check whether those notifications have been read by the user. I managed this using a document-level field (roomId) in $lookup, but I can't get it working with a subdocument field (subRoom.id).
The Notifications collection has a compound index on (roomId: 1, timestamp: -1).
Notifications Collection: (id corresponds to notification id and roomId is the link to Rooms collection)
[{
"_id": "XXX",
"id": "1",
"read": ["A", "B", "C"],
"roomId": "c1d87a4c-231d-4cc8-8438-35cf21ed7fc5",
"content": "XXX",
"timestamp": { "$date": "2021-12-31T22:50:53.000Z" }
},{
"_id": "XXX",
"id": "2",
"read": ["C"],
"roomId": "c1d87a4c-231d-4cc8-8438-35cf21ed7fc5",
"content": "XXX",
"timestamp": { "$date": "2021-12-31T22:50:53.000Z" }
},
...
]
Rooms Collection:
[{
"_id": "XXX",
"subRoom": [{
"id": "c1d87a4c-231d-4cc8-8438-35cf21ed7fc5",
"image": "XXX",
"name": "XXX"
}, {
"id": "c2d5081e-0cf1-4e69-937d-be357da1d104",
"image": "XXX",
"name": "XXX"
}, {
"id": "530c2c02-26e8-441c-af39-c5232dfe1f73",
"image": "XXX",
"name": "XXX"
}],
"id": "453a6458-6545-4842-8946-05f49efea216",
"name": "XXX",
},
...
]
Working code using roomId instead of subRoom.id:
{ $lookup: {
from: "notifications",
let: { "id": "$id" },
pipeline: [
{ $match: {
$expr:
{ $eq: [ "$roomId", "$$id" ] }
}},
{ $limit: 100},
{ $project: {_id: 0, read: 1}}
],
as: "messages"
}},
{ $project: {_id: 0, id: 1, notRead: {
$size: {
$filter: {
input: "$messages", // the array produced by the $lookup above
cond: {
$not: {
$in: [
"A",
"$$this.read"
]
}
}
}
}
}
}}
Code NOT WORKING using subRoom.id:
{ $lookup: {
from: "notifications",
let: { "id": "$subRoom.id" },
pipeline: [
{ $match: {
$expr:
{ $eq: [ "$roomId", "$$id" ] }
}},
{ $limit: 100},
{ $project: {_id: 0, read: 1}}
],
as: "messages"
}},
{
$addFields: {
items: {
$map: {
input: { $zip: { inputs: ["$subRoom", "$messages"] } },
in: { $mergeObjects: "$$this" },
},
},
},
},
.
. projection
.
Expected Result:
[{
"_id": "XXX",
"subRoom": [{
"id": "c1d87a4c-231d-4cc8-8438-35cf21ed7fc5",
"notRead": 50 //e.g
}, {
"id": "c2d5081e-0cf1-4e69-937d-be357da1d104",
"notRead": 35 //e.g
}, {
"id": "530c2c02-26e8-441c-af39-c5232dfe1f73",
"image": "XXX",
"notRead": 5 //e.g
}],
"id": "453a6458-6545-4842-8946-05f49efea216",
"name": "XXX",
},
...
]
Finally, and very importantly, I want a scalable solution that works with big data.
Thank you very much in advance.
$unwind deconstructs the subRoom array; preserveNullAndEmptyArrays keeps rooms whose subRoom is null or empty
$lookup on the notifications collection using a pipeline; let passes the subRoom id into the pipeline, which matches on roomId and keeps only notifications the user has not read
$group by null counts the total unread notifications
$addFields turns the lookup result into a number using $sum
$group by _id reconstructs the subRoom array with the required fields
db.rooms.aggregate([
{
$unwind: {
path: "$subRoom",
preserveNullAndEmptyArrays: true
}
},
{
$lookup: {
from: "notifications",
let: { id: "$subRoom.id" },
pipeline: [
{
$match: {
$and: [
{ $expr: { $eq: ["$$id", "$roomId"] } },
{ read: { $ne: "A" } }
]
}
},
{
$group: {
_id: null,
count: { $sum: 1 }
}
}
],
as: "subRoom.notRead"
}
},
{
$addFields: {
"subRoom.notRead": { $sum: "$subRoom.notRead.count" }
}
},
{
$group: {
_id: "$_id",
name: { $first: "$name" },
id: { $first: "$id" },
subRoom: { $push: "$subRoom" }
}
}
])
Playground
Second option, without the $unwind stage:
$lookup on the notifications collection using a pipeline; let passes the array of subRoom ids into the pipeline, which matches on roomId with $in and keeps only notifications the user has not read
$group by roomId counts the unread notifications per subRoom
$map iterates over the subRoom array
$filter picks, from the lookup result, the count document for the current subRoom
$let assigns that filtered result to a variable n and returns the $sum of its count
$mergeObjects merges the current subRoom object with the new notRead field
db.rooms.aggregate([
{
$lookup: {
from: "notifications",
let: { id: "$subRoom.id" },
pipeline: [
{
$match: {
$and: [
{ $expr: { $in: ["$roomId", "$$id"] } },
{ read: { $ne: "A" } }
]
}
},
{
$group: {
_id: "$roomId",
count: { $sum: 1 }
}
}
],
as: "notRead"
}
},
{
$project: {
id: 1,
name: 1,
subRoom: {
$map: {
input: "$subRoom",
as: "s",
in: {
$mergeObjects: [
"$$s",
{
notRead: {
$let: {
vars: {
n: {
$filter: {
input: "$notRead",
cond: { $eq: ["$$this._id", "$$s.id"] }
}
}
},
in: { $sum: "$$n.count" }
}
}
}
]
}
}
}
}
}
])
Playground
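The second option boils down to counting unread notifications per roomId once and then merging each count into its subRoom. Here is a minimal plain JavaScript sketch of that logic on trimmed sample data from the question; the helper name unreadPerSubRoom is hypothetical, and the sketch ignores indexing and the 100-document limit mentioned in the question.

```javascript
const notifications = [
  { id: "1", read: ["A", "B", "C"], roomId: "c1d87a4c-231d-4cc8-8438-35cf21ed7fc5" },
  { id: "2", read: ["C"], roomId: "c1d87a4c-231d-4cc8-8438-35cf21ed7fc5" },
];
const rooms = [
  {
    id: "453a6458-6545-4842-8946-05f49efea216",
    name: "XXX",
    subRoom: [
      { id: "c1d87a4c-231d-4cc8-8438-35cf21ed7fc5", name: "XXX" },
      { id: "c2d5081e-0cf1-4e69-937d-be357da1d104", name: "XXX" },
      { id: "530c2c02-26e8-441c-af39-c5232dfe1f73", name: "XXX" },
    ],
  },
];

// Hypothetical helper mirroring $lookup/$group followed by $map/$mergeObjects.
function unreadPerSubRoom(room, user) {
  const subRoomIds = new Set(room.subRoom.map((s) => s.id));
  // $match + $group by roomId: count notifications the user has not read.
  const counts = new Map();
  for (const n of notifications) {
    if (subRoomIds.has(n.roomId) && !n.read.includes(user)) {
      counts.set(n.roomId, (counts.get(n.roomId) || 0) + 1);
    }
  }
  // $map + $mergeObjects: attach notRead, defaulting to 0 when no group exists.
  return {
    ...room,
    subRoom: room.subRoom.map((s) => ({ ...s, notRead: counts.get(s.id) || 0 })),
  };
}

console.log(unreadPerSubRoom(rooms[0], "A").subRoom.map((s) => s.notRead));
// [ 1, 0, 0 ]
```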
I have two collections,
Users-
{
_id:,
name:"",
company_id:"",
}
Companies-
{
_id:,
name:"",
}
Now I want to query all companies where either the company name in the Companies collection matches, or the name of a user in the Users collection matches.
You can use $group
db.Users.aggregate([
{
$lookup: {
from: "Companies",
let: {companyId: "$company_id"},
pipeline: [
{
$match: {
$expr: {
$or: [
{ $eq: [ "$_id", "$$companyId" ] },
{ $eq: [ "$name", "Rockefeller" ] }
]
}
}
}
],
as: "Company"
}
},
{
$project: {
_id: 0,
Company: { "$arrayElemAt": [ "$Company", 0 ] }
}
},
{
"$group": {
"_id": null,
"companies": { "$push": "$Company" }
}
}
])
Working Mongo playground
I have a collection of Orders. each order has a list of Items, and each Item has catalog_id, which is an ObjectId pointing to the Catalogs collection.
I need an aggregate query that will retrieve certain orders - each order with its Items in extended fashion including the Catalog name and SKU. i.e:
Original data structure:
Orders: [{
_id : ObjectId('ord1'),
items : [{
catalog_id: ObjectId('xyz1'),
qty: 5
},
{
catalog_id: ObjectId('xyz2'),
qty: 3
}]
Catalogs: [{
_id : ObjectId('xyz1')
name: 'my catalog name',
SKU: 'XxYxZx1'
},{
_id : ObjectId('xyz2')
name: 'my other catalog name',
SKU: 'XxYxZx2'
}
]
ideal outcome would be:
Orders: [{
_id : ObjectId('ord1'),
items : [{
catalog_id: ObjectId('xyz1'),
catalog_name: 'my catalog name',
catalog_SKU: 'XxYxZx1' ,
qty: 5
},
{
catalog_id: ObjectId('xyz2'),
catalog_name: 'my other catalog name',
catalog_SKU: 'XxYxZx2' ,
qty: 3
}
]
What I did so far was:
db.orders.aggregate(
[
{
$match: {merchant_order_id: 'NIM333'}
},
{
$lookup: {
from: "catalogs",
//localField: 'items.catalog_id',
//foreignField: '_id',
let: { 'catalogId' : 'items.catalog_id' },
pipeline: [
{
$match : {$expr:{$eq:["$catalogs._id", "$$catalogId"]}}
},
{
$project: {"name": 1, "merchant_SKU": 1 }
}
],
as: "items_ex"
},
},
])
but items_ex comes out empty for a reason I cannot understand.
You first need to $unwind the items and then reconstruct the array with $group, so that each qty stays paired with its catalog_id from the items array.
db.orders.aggregate([
{ "$match": { "merchant_order_id": "NIM333" }},
{ "$unwind": "$items" },
{ "$lookup": {
"from": "catalogs",
"let": { "catalogId": "$items.catalog_id", "qty": "$items.qty" },
"pipeline": [
{ "$match": { "$expr": { "$eq": ["$_id", "$$catalogId"] } }},
{ "$project": { "name": 1, "merchant_SKU": 1, "qty": "$$qty" }}
],
"as": "items"
}},
{ "$unwind": "$items" },
{ "$group": {
"_id": "$_id",
"items": { "$push": "$items" },
"data": { "$first": "$$ROOT" }
}},
{ "$replaceRoot": {
"newRoot": {
"$mergeObjects": ["$data", { "items": "$items" }]
}
}}
])
MongoPlayground
You're missing a dollar sign when you define your pipeline variable. There should be:
let: { 'catalogId' : '$items.catalog_id' },
Also, this expression returns an array, so you need $in instead of $eq:
{
$lookup: {
from: "catalogs",
let: { 'catalogId' : '$items.catalog_id' },
pipeline: [
{
$match : {$expr:{$in:["$_id", "$$catalogId"]}}
},
{
$project: {"name": 1, "merchant_SKU": 1 }
}
],
as: "items_ex"
}
}
Mongo Playground
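To see why $in is needed, it can help to replay the lookup in plain JavaScript. In this sketch plain strings stand in for the ObjectId values, the helper name expandItems is hypothetical, and the final merge into the items array is an assumption about how the question's ideal outcome could be produced, not part of the answer above.

```javascript
// Sample data from the question, with strings standing in for ObjectIds.
const orders = [
  {
    _id: "ord1",
    items: [
      { catalog_id: "xyz1", qty: 5 },
      { catalog_id: "xyz2", qty: 3 },
    ],
  },
];
const catalogs = [
  { _id: "xyz1", name: "my catalog name", SKU: "XxYxZx1" },
  { _id: "xyz2", name: "my other catalog name", SKU: "XxYxZx2" },
];

function expandItems(order) {
  // "$items.catalog_id" on an array of sub-documents resolves to an ARRAY of ids,
  // which is why a single _id can never be $eq to it.
  const catalogIds = order.items.map((i) => i.catalog_id);
  // $in: keep catalogs whose _id is a member of that array.
  const itemsEx = catalogs.filter((c) => catalogIds.includes(c._id));
  // Merge by id rather than by position, since lookup order is not guaranteed.
  const byId = new Map(itemsEx.map((c) => [c._id, c]));
  return {
    ...order,
    items: order.items.map((i) => {
      const c = byId.get(i.catalog_id);
      return {
        catalog_id: i.catalog_id,
        catalog_name: c ? c.name : null,
        catalog_SKU: c ? c.SKU : null,
        qty: i.qty,
      };
    }),
  };
}

console.log(expandItems(orders[0]).items[0]);
```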