MongoDB join 2 tables and get ids on condition - mongodb

We are new to writing MongoDB queries. We have two MongoDB collections, Supplier1 and Supplier2. Documents in both share the same _id, but their version numbers can sometimes differ.
We need to find the _id values where the version differs between the two collections (i.e. Supplier1.version != Supplier2.version).
Supplier1
{
"_id" : ObjectId("60cd86b914dfed073d77300f"),
"companyName" : "Main Supplier",
"version" : NumberLong(246),
}
Supplier2
{
"_id" : ObjectId("60cd86b914dfed073d77300f"),
"companyName" : "Main Supplier",
"version" : NumberLong(247),
}
This is what we have written so far, and we have no idea how to move forward. Any help is highly appreciated.
db.getCollection("Supplier1").aggregate([
{
$lookup: {
from: "Supplier2",
localField: "_id",
foreignField: "_id",
as: "selected-supplier"
}
},

You can use a sub-pipeline in $lookup that matches on _id and keeps only the documents whose version differs. Then $unwind the result array to drop the suppliers that have no mismatch (an empty array).
db.Supplier1.aggregate([
{
"$lookup": {
"from": "Supplier2",
"let": {
id1: "$_id",
version1: "$version"
},
"pipeline": [
{
"$match": {
$expr: {
$and: [
{
$eq: [
"$$id1",
"$_id"
]
},
{
$ne: [
"$$version1",
"$version"
]
}
]
}
}
}
],
"as": "selected-supplier"
}
},
{
"$unwind": "$selected-supplier"
}
])
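To sanity-check the join condition without a database, the same comparison can be sketched in plain JavaScript. This is only an in-memory illustration, not the aggregation itself: the arrays, the second supplier document, and the function name idsWithDifferentVersion are assumptions made for the example, and ids are plain strings instead of ObjectId values.

```javascript
// Hypothetical in-memory copies of the two collections (string ids for brevity).
const supplier1 = [
  { _id: "60cd86b914dfed073d77300f", companyName: "Main Supplier", version: 246 },
  { _id: "60cd86b914dfed073d773010", companyName: "Other Supplier", version: 12 },
];
const supplier2 = [
  { _id: "60cd86b914dfed073d77300f", companyName: "Main Supplier", version: 247 },
  { _id: "60cd86b914dfed073d773010", companyName: "Other Supplier", version: 12 },
];

// Same condition as the $lookup sub-pipeline: equal _id, different version.
function idsWithDifferentVersion(left, right) {
  const versionById = new Map(right.map((doc) => [doc._id, doc.version]));
  return left
    .filter((doc) => versionById.has(doc._id) && versionById.get(doc._id) !== doc.version)
    .map((doc) => doc._id);
}

const mismatchedIds = idsWithDifferentVersion(supplier1, supplier2);
console.log(mismatchedIds);
```

Only the first supplier is reported here, because the second one has the same version in both arrays.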

Related

Mongodb combine aggregate queries

I have following collections in MongoDB
Profile Collection
> db.Profile.find()
{ "_id" : ObjectId("5ec62ccb8897af3841a46d46"), "u" : "Test User", "is_del": false }
Store Collection
> db.Store.find()
{ "_id" : ObjectId("5eaa939aa709c30ff4703ffd"), "id" : "5ec62ccb8897af3841a46d46", "a" : { "ci": "Test City", "st": "Test State" }, "ip" : false, "op" : [ ], "b" : [ "normal" ], "is_del": false }
Item Collection
> db.Item.find()
{ "_id" : ObjectId("5ea98a25f1246b53a46b9e10"), "sid" : "5eaa939aa709c30ff4703ffd", "n" : "sample", "is_del": false}
Relation among these collections are defined as follows:
Profile -> Store: It is 1:n relation. id field in Store relates with _id field in Profile.
Store -> Item: It is also 1:n relation. sid field in Item relates with _id field in Store.
Now, I need to write a query to find all the stores of each profile along with the count of Items for each store. Documents with is_del set to true must be excluded.
I am trying it following way:
Query 1 to find the count of item for each store.
Query 2 to find the store for each profile.
Then in the application logic use both the result to produce the combined output.
I have query 1 as follows:
db.Item.aggregate({$group: {_id: "$sid", count:{$sum:1}}})
Query 2 is as follows:
db.Profile.aggregate([{ "$addFields": { "pid": { "$toString": "$_id" }}}, { "$lookup": {"from": "Store","localField": "pid","foreignField": "id", "as": "stores"}}])
The is_del filter is also missing from both queries. Is there a simpler way to do all of this in a single query? If so, what would the scalability impact be?
You can use $lookup with a sub-pipeline (the let and pipeline fields), available from MongoDB v3.6:
db.Profile.aggregate([
{
$match: { is_del: false }
},
{
$lookup: {
from: "Store",
as: "stores",
let: {
pid: { $toString: "$_id" }
},
pipeline: [
{
$match: {
is_del: false,
$expr: { $eq: ["$$pid", "$id"] }
}
},
{
$lookup: {
from: "Item",
as: "items",
let: {
sid: { $toString: "$_id" }
},
pipeline: [
{
$match: {
is_del: false,
$expr: { $eq: ["$$sid", "$sid"] }
}
},
{
$count: "count"
}
]
}
},
{
$unwind: "$items"
}
]
}
}
])
To improve performance, I suggest you store the reference ids as ObjectId so you don't have to convert them in each step.
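One detail worth checking in a pipeline like this: $count emits no document when nothing matches, so a plain $unwind on the resulting empty items array removes stores that have no items. The plain JavaScript sketch below shows the intended output shape with such stores kept at a count of 0. It is an in-memory illustration only; the sample documents, the short string ids, the itemCount field, and the function name storesWithItemCounts are all assumptions for the example.

```javascript
// Hypothetical sample data mirroring the three collections (ids shortened to strings).
const profiles = [{ _id: "p1", u: "Test User", is_del: false }];
const stores = [
  { _id: "s1", id: "p1", is_del: false },
  { _id: "s2", id: "p1", is_del: false }, // has no items
  { _id: "s3", id: "p1", is_del: true }, // deleted, must be excluded
];
const items = [
  { _id: "i1", sid: "s1", n: "sample", is_del: false },
  { _id: "i2", sid: "s1", n: "removed", is_del: true }, // deleted, not counted
];

// For every non-deleted profile, attach its non-deleted stores,
// each with the number of its non-deleted items.
function storesWithItemCounts(profileDocs, storeDocs, itemDocs) {
  return profileDocs
    .filter((p) => !p.is_del)
    .map((p) => ({
      ...p,
      stores: storeDocs
        .filter((s) => !s.is_del && s.id === p._id)
        .map((s) => ({
          ...s,
          itemCount: itemDocs.filter((i) => !i.is_del && i.sid === s._id).length,
        })),
    }));
}

const profileResult = storesWithItemCounts(profiles, stores, items);
console.log(JSON.stringify(profileResult, null, 2));
```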

MongoDB lookup match two values

I have two collections, tennis matches with two players and players.
matches looks like this:
{
"_id" : ObjectId("5ce51febc6dd820a820f20a5"),
"players" : [
ObjectId("5ce51c1af3cd6009a171a5b3"),
ObjectId("5ce51c1af3cd6009a171a350")
],
"result" : "4:6 6:3 7:6(7) 7:6(8)"
},
{
"_id" : ObjectId("5ce51febc6dd820a820f20a6"),
"players" : [
ObjectId("5ce51c1af3cd6009a171a005"),
ObjectId("5ce51c1af3cd6009a171a16c")
],
"result" : "6:2 4:6 6:3"
},
[...]
and players like this:
{
"_id" : ObjectId("5ce51c1af3cd6009a171a5b3"),
"name" : "Serena Williams",
"country" : "USA"
},
{
"_id" : ObjectId("5ce51c1af3cd6009a171a350"),
"name" : "Garbiñe Muguruza",
"country" : "Spain"
},
[...]
I need all matches where players[0] is equal to a name and players[1] to another name.
I've tried this without success:
db.matches.aggregate([
{
$unwind: "$players"
},
{
$lookup: {
from: "players",
localField: "players",
foreignField: "_id",
as: "tmp_join"
}
},
{
$match: {
"tmp_join.name": ["Serena Williams","Garbiñe Muguruza"]
}
}
])
You first have to $unwind the tmp_join array; then you can use $in to find the documents whose joined player name is in the list.
db.matches.aggregate([
{ "$lookup": {
"from": "players",
"localField": "players",
"foreignField": "_id",
"as": "tmp_join"
}},
{ "$unwind": "$tmp_join" },
{ "$match": {
"tmp_join.name": {
"$in": ["Serena Williams","Garbiñe Muguruza"]
}
}}
])
Use the aggregation below if you are on MongoDB 3.6 or above:
db.matches.aggregate([
{ "$lookup": {
"from": "players",
"let": { "players": "$players" },
"pipeline": [
{ "$match": {
"$expr": { "$in": ["$_id", "$$players"] },
"name": { "$in": ["Serena Williams", "Garbiñe Muguruza"] }
}}
],
"as": "tmp_join"
}},
{ "$match": { "$expr": { "$gt": [{ "$size": "$tmp_join" }, 1] }}}
])
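The filtering idea behind the second pipeline (join only the players whose name is in the list, then keep matches where more than one player was joined) can be sketched in plain JavaScript. This is an in-memory illustration under assumptions: ids are shortened strings, the third player and the second match are made up, and matchesBetween is a hypothetical helper name. Like the pipeline, it does not check which player sits at index 0 and which at index 1.

```javascript
// Hypothetical sample data (ids shortened to strings).
const players = [
  { _id: "a5b3", name: "Serena Williams", country: "USA" },
  { _id: "a350", name: "Garbiñe Muguruza", country: "Spain" },
  { _id: "a005", name: "Hypothetical Player", country: "N/A" },
];
const matches = [
  { _id: "m1", players: ["a5b3", "a350"], result: "4:6 6:3 7:6(7) 7:6(8)" },
  { _id: "m2", players: ["a5b3", "a005"], result: "6:2 4:6 6:3" },
];

// Keep a match only if both of its players are among the requested names.
function matchesBetween(matchDocs, playerDocs, names) {
  return matchDocs.filter((match) => {
    const joined = playerDocs.filter(
      (p) => match.players.includes(p._id) && names.includes(p.name)
    );
    return joined.length > 1; // same idea as { $gt: [{ $size: "$tmp_join" }, 1] }
  });
}

const found = matchesBetween(matches, players, ["Serena Williams", "Garbiñe Muguruza"]);
console.log(found.map((m) => m._id));
```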

How can I use a field from aggregate in a regex $match in mongodb?

A very simplified version of my use case is to find all posts beginning with the author's name, something like this:
> db.users.find();
{ "_id" : ObjectId("5c4185be19da7e815cb18f59"), "name" : "User1" }
{ "_id" : ObjectId("5c4185be19da7e815cb18f5a"), "name" : "User2" }
db.posts.insert([
{author : ObjectId("5c4185be19da7e815cb18f59"), text: "User1 is my name"},
{author : ObjectId("5c4185be19da7e815cb18f5a"), text: "My name is User2, but this post doesn't start with it"}
]);
So I want to identify all posts that start with the author's name. I'm trying with an aggregate like this, but I don't know how to extract the user's name from the aggregate pipeline to use in a regex match:
db.users.aggregate([
{
$lookup: {
from: "posts",
localField: "_id",
foreignField: "author",
as: "post"
}
},
{
$match: { "post.text": { $regex: "^" + name}}
}
]).pretty();
The "name" used here is not defined anywhere; I need to extract the name from the users collection entry produced by the previous pipeline stage. For some reason I don't understand how to do that.
This is probably super simple and I'm definitely feeling thick as a brick here… Any help highly appreciated!
You can use the aggregation below with $indexOfCP:
db.users.aggregate([
{ "$lookup": {
"from": "posts",
"let": { "authorId": "$_id", "name": "$name" },
"pipeline": [
{ "$match": {
"$expr": {
"$and": [
// $indexOfCP returns 0 only when text starts with the name
{ "$eq": [{ "$indexOfCP": ["$text", "$$name"] }, 0] },
{ "$eq": ["$author", "$$authorId"] }
]
}
}}
],
"as": "post"
}}
])
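The position check is the important part: a prefix match corresponds to $indexOfCP returning 0, while a result that is merely different from -1 only shows that the name occurs somewhere in the text. A plain JavaScript sketch of the same idea follows. It is an in-memory illustration, with shortened string ids and a hypothetical function name; String.prototype.indexOf counts UTF-16 code units rather than code points, which makes no difference when only comparing against 0.

```javascript
// Hypothetical in-memory versions of the two collections (ids shortened to strings).
const users = [
  { _id: "u1", name: "User1" },
  { _id: "u2", name: "User2" },
];
const posts = [
  { author: "u1", text: "User1 is my name" },
  { author: "u2", text: "My name is User2, but this post doesn't start with it" },
];

// Attach to each user only the posts that begin with that user's name.
function postsStartingWithAuthorName(userDocs, postDocs) {
  return userDocs.map((user) => ({
    ...user,
    post: postDocs.filter(
      (p) => p.author === user._id && p.text.indexOf(user.name) === 0
    ),
  }));
}

const usersWithPosts = postsStartingWithAuthorName(users, posts);
console.log(JSON.stringify(usersWithPosts, null, 2));
```

With the sample data from the question, only User1 ends up with a non-empty post array.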

Count _id occurrences in other collection

We have a DB structure similar to the following:
Pet owners:
/* 1 */
{
"_id" : ObjectId("5baa8b8ce70dcbe59d7f1a32"),
"name" : "bob"
}
/* 2 */
{
"_id" : ObjectId("5baa8b8ee70dcbe59d7f1a33"),
"name" : "mary"
}
Pets:
/* 1 */
{
"_id" : ObjectId("5baa8b4fe70dcbe59d7f1a2a"),
"name" : "max",
"owner" : ObjectId("5baa8b8ce70dcbe59d7f1a32")
}
/* 2 */
{
"_id" : ObjectId("5baa8b52e70dcbe59d7f1a2b"),
"name" : "charlie",
"owner" : ObjectId("5baa8b8ce70dcbe59d7f1a32")
}
/* 3 */
{
"_id" : ObjectId("5baa8b53e70dcbe59d7f1a2c"),
"name" : "buddy",
"owner" : ObjectId("5baa8b8ee70dcbe59d7f1a33")
}
I need a list of all pet owners and additionally the number of pets they own. Our current query looks similar to the following:
db.getCollection('owners').aggregate([
{ $lookup: { from: 'pets', localField: '_id', foreignField: 'owner', as: 'pets' } },
{ $project: { '_id': 1, name: 1, numPets: { $size: '$pets' } } }
]);
This works, however it's quite slow and I'm asking myself if there's a more efficient way to perform the query?
[update and feedback] Thanks for the answers. The solutions work, but unfortunately I can see no performance improvement compared to the query given above. Apparently MongoDB still needs to scan the entire pets collection. My hope was that the owner index (which is present) on the pets collection could somehow be exploited to get just the counts (without touching the pet documents), but this does not seem to be the case.
Are there any other ideas or solutions for a very fast retrieval of the 'pet count' besides explicitly storing the count within the owner documents?
In MongoDB 3.6 you can define a custom $lookup pipeline that returns only a count instead of the entire pets documents. Try:
db.owners.aggregate([
{
$lookup: {
from: "pets",
let: { ownerId: "$_id" },
pipeline: [
{ $match: { $expr: { $eq: [ "$$ownerId", "$owner" ] } } },
{ $count: "count" }
],
as: "numPets"
}
},
{
$unwind: "$numPets"
}
])
You can try the aggregation below, which also returns owners without pets (numPets: 0):
db.owners.aggregate([
{ "$lookup": {
"from": "pets",
"let": { "ownerId": "$_id" },
"pipeline": [
{ "$match": { "$expr": { "$eq": [ "$$ownerId", "$owner" ] }}},
{ "$count": "count" }
],
"as": "numPets"
}},
{ "$project": {
"_id": 1,
"name": 1,
"numPets": { "$ifNull": [{ "$arrayElemAt": ["$numPets.count", 0] }, 0]}
}}
])
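The $ifNull fallback matters for owners without pets: the sub-pipeline's $count emits nothing for them, so the count has to default to 0. The expected result can be sketched in plain JavaScript as below. This is an in-memory illustration only; the short string ids, the third owner, and the function name ownersWithPetCounts are assumptions for the example.

```javascript
// Hypothetical sample data (ids shortened to strings); the third owner has no pets.
const owners = [
  { _id: "o1", name: "bob" },
  { _id: "o2", name: "mary" },
  { _id: "o3", name: "owner without pets" },
];
const pets = [
  { _id: "p1", name: "max", owner: "o1" },
  { _id: "p2", name: "charlie", owner: "o1" },
  { _id: "p3", name: "buddy", owner: "o2" },
];

// Count pets per owner in one pass, then default missing owners to 0.
function ownersWithPetCounts(ownerDocs, petDocs) {
  const countByOwner = new Map();
  for (const pet of petDocs) {
    countByOwner.set(pet.owner, (countByOwner.get(pet.owner) || 0) + 1);
  }
  return ownerDocs.map((o) => ({
    _id: o._id,
    name: o.name,
    numPets: countByOwner.get(o._id) || 0, // plays the role of $ifNull
  }));
}

const ownerCounts = ownersWithPetCounts(owners, pets);
console.log(ownerCounts);
```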

How to make lookup between two collections when an item in an array exists in the other collection?

Using $lookup with a pipeline, I would like to get the linked records for an array in the parent document.
// Orders
[{
"_id" : ObjectId("5b5b91a25c68de2538620689"),
"Name" : "Test",
"Products" : [
ObjectId("5b5b919a5c68de2538620688"),
ObjectId("5b5b925a5c68de2538621a15")
]
}]
// Products
[
{
"_id": ObjectId("5b5b919a5c68de2538620688"),
"ProductName": "P1"
},
{
"_id": ObjectId("5b5b925a5c68de2538621a15"),
"ProductName": "P2"
}
,
{
"_id": ObjectId("5b5b925a5c68de2538621a55"),
"ProductName": "P3"
}
]
How can I do a lookup between Orders and Products when the Products field is an array?
I tried this query
db.getCollection("Orders").
aggregate(
[
{
$lookup:
{
from: "Products",
let: { localId: "$_id" , prods: "$Products" },
pipeline: [
{
"$match":
{
"_id" : { $in: "$$prods" }
}
},
{
$project:
{
"_id": "$_id",
"name": "$prods" ,
}
}
],
as: "linkedData"
}
},
{
"$skip": 0
},
{
"$limit": 1
},
]
)
This is not working because $in is expecting an array, and even though $$prods is an array, it is not accepting it.
Is my whole approach correct? How can I make this join work?
You were going in the right direction. The only thing you missed is wrapping the condition in $expr and using the $in aggregation operator, which lets you compare a field of the document against a variable such as $$prods:
db.getCollection("Orders").aggregate([
{ "$lookup": {
"from": "Products",
"let": { "localId": "$_id" , "prods": "$Products" },
"pipeline": [
{ "$match": { "$expr": { "$in": [ "$_id", "$$prods" ] } } },
{ "$project": { "_id": 1, "name": "$ProductName" } }
],
"as": "linkedData"
}},
{ "$skip": 0 },
{ "$limit": 1 }
])
You just need a regular $lookup. The documentation states:
If your localField is an array, you may want to add an $unwind stage to your pipeline. Otherwise, the equality condition between the localField and foreignField is foreignField: { $in: [ localField.elem1, localField.elem2, ... ] }.
So for the aggregation below:
db.Orders.aggregate([
{
$lookup: {
from :"Products",
localField: "Products",
foreignField: "_id",
as: "Products"
}
}
])
you'll get the following result for your sample data:
{
"_id" : ObjectId("5b5b91a25c68de2538620689"),
"Name" : "Test",
"Products" : [
{
"_id" : ObjectId("5b5b919a5c68de2538620688"),
"ProductName" : "P1"
},
{
"_id" : ObjectId("5b5b925a5c68de2538621a15"),
"ProductName" : "P2"
}
]
}
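The array behaviour described in the documentation quote (each element of the local array is compared against the foreign field) can be mimicked in plain JavaScript to see why P3 is left out. This is an in-memory illustration with string ids instead of ObjectId values, and lookupProducts is a hypothetical helper name.

```javascript
// In-memory copies of the sample data from the question (string ids for brevity).
const orders = [
  {
    _id: "5b5b91a25c68de2538620689",
    Name: "Test",
    Products: ["5b5b919a5c68de2538620688", "5b5b925a5c68de2538621a15"],
  },
];
const products = [
  { _id: "5b5b919a5c68de2538620688", ProductName: "P1" },
  { _id: "5b5b925a5c68de2538621a15", ProductName: "P2" },
  { _id: "5b5b925a5c68de2538621a55", ProductName: "P3" },
];

// Replace each order's array of ids with the matching product documents,
// i.e. keep a product when its _id is one of the ids in order.Products.
function lookupProducts(orderDocs, productDocs) {
  return orderDocs.map((order) => ({
    ...order,
    Products: productDocs.filter((p) => order.Products.includes(p._id)),
  }));
}

const joinedOrders = lookupProducts(orders, products);
console.log(JSON.stringify(joinedOrders, null, 2));
```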
Have you tried $unwind before the $lookup? Use $unwind to break up the array and then do the lookup.