$lookup in MongoDB Provides Unexpected Data Structure Result - mongodb

I am trying to understand why a $lookup I'm using in my MongoDB aggregation is producing the result it is.
First off, my initial data looks like this:
"subscriptions": [
{
"agency": "3dg2672f145d0598be095634", // This is an ObjectId
"memberType": "primary"
}
]
Now, what I want to do is a simple $lookup that pulls in the related document for the ObjectId currently stored in the "agency" field.
What I tried doing was a $lookup like this:
{
"from" : "agencies",
"localField" : "subscriptions.0.agency",
"foreignField" : "_id",
"as" : "subscriptions.0.agency"
}
So, basically what I want to do is go get that info related to that ObjectId ref, and populate it right here, in place of where the ObjectId currently resides.
What I'd expect as a result is something like this:
"subscriptions": [
{
"agency": [
{
_id: <id-value>,
name: <name-value>,
address: <address-value>
}
],
"memberType": "primary"
}
]
Instead, I end up with this (with my "memberType" prop now nowhere to be found):
"subscriptions" : {
"0" : {
"agency" : [ <agency-data> ]
}
}
Why is this the result of the $lookup, and how can I get the data structure I'm looking for here?
To clarify further, in the docs they mention using an $unwind BEFORE the $lookup when it's an array field. But in this case, the actual local field being targeted and replaced by the $lookup is NOT an array, but it is within an array. So I'm not clear on what the problem is.

You need to $unwind the array so that "localField" can be matched against "foreignField", and then $group to roll the documents back up into an array:
db.collection.aggregate([
{ "$unwind": "$subscriptions" },
{ "$lookup": {
"from": Agency.collection.name,
"localField": "subscriptions.agency",
"foreignField": "_id",
"as": "subscriptions.agency"
}},
// "memberType" lives inside each subscription, so $push carries it along
{ "$group": {
"_id": "$_id",
"subscriptions": { "$push": "$subscriptions" }
}}
])
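To make the three stages concrete, here is a minimal plain-JavaScript sketch of what $unwind, $lookup and $group do to one document. It is an in-memory stand-in, not MongoDB itself, and the sample values ("a1", the agency name and address) are made up for illustration.

```javascript
// In-memory stand-ins for the two collections (sample values are hypothetical).
const agencies = [{ _id: "a1", name: "ABC", address: "1 some street" }];
const docs = [
  { _id: 1, subscriptions: [{ agency: "a1", memberType: "primary" }] }
];

// $unwind: one output document per element of `subscriptions`.
const unwound = docs.flatMap(doc =>
  doc.subscriptions.map(sub => ({ ...doc, subscriptions: sub }))
);

// $lookup with as: "subscriptions.agency": only the `agency` key is replaced,
// so sibling keys such as `memberType` are kept.
const lookedUp = unwound.map(doc => ({
  ...doc,
  subscriptions: {
    ...doc.subscriptions,
    agency: agencies.filter(a => a._id === doc.subscriptions.agency)
  }
}));

// $group by _id with $push: collect the sub-documents back into an array.
const grouped = new Map();
for (const doc of lookedUp) {
  if (!grouped.has(doc._id)) {
    grouped.set(doc._id, { _id: doc._id, subscriptions: [] });
  }
  grouped.get(doc._id).subscriptions.push(doc.subscriptions);
}
const result = [...grouped.values()];
console.log(JSON.stringify(result, null, 2));
```

Because the array is unwound first, the path "subscriptions.agency" points at a plain sub-document field, which is why the sibling "memberType" survives.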

Basically, the OP wants to transform the data into the desired shape after looking up another collection. Assume there are two collections, C1 and C2, where C1 contains the document
{ "_id" : ObjectId("5b50b8ebfd2b5637081105c6"), "subscriptions" : [ { "agency" : "3dg", "memberType" : "primary" } ] }
and C2 contains
{ "_id" : ObjectId("5b50b984fd2b5637081105c8"), "agency" : "3dg", "name" : "ABC", "address" : "1 some street" }
If the following query is executed against the database
db.C1.aggregate([
{$unwind: "$subscriptions"},
{
$lookup: {
from: "C2",
localField: "subscriptions.agency",
foreignField: "agency",
as: "subscriptions.agency"
}
}
])
we get the result
{
"_id": ObjectId("5b50b8ebfd2b5637081105c6"),
"subscriptions": {
"agency": [{
"_id": ObjectId("5b50b984fd2b5637081105c8"),
"agency": "3dg",
"name": "ABC",
"address": "1 some street"
}],
"memberType": "primary"
}
}
which is pretty close to what the OP is looking for.
Note: there may be some edge cases, but with minor tweaks this solution should work.
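One possible tweak, sketched here under the same C1/C2 assumptions, is to append a $group stage so that "subscriptions" becomes an array again. The pipeline is built as a plain value so it can be inspected before it is passed to aggregate().

```javascript
// Sketch only: same stages as above plus a $group that rebuilds the array.
const pipeline = [
  { $unwind: "$subscriptions" },
  {
    $lookup: {
      from: "C2",
      localField: "subscriptions.agency",
      foreignField: "agency",
      as: "subscriptions.agency"
    }
  },
  // $group keeps only the fields it names; add { $first: ... } accumulators
  // for any other top-level fields that should survive.
  { $group: { _id: "$_id", subscriptions: { $push: "$subscriptions" } } }
];

// In the mongo shell: db.C1.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```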

Related

How to lookup through an Array in MongoDB and Project Names From a Certain Collection

I have two collections, one named Exports and one named Service. Inside the Exports collection there is an object that holds inside of it an array of servicesIds.
I want to aggregate and look up the _ids stored in servicesIds in the Exports collection against the Service collection to find the names of the services.
The structure of a document in each of the two collections is as follows:
Exports:
{
"_id" : "818a2c4fc4",
"companyId" : "7feb1812d8",
"filter" : {
"servicesIds" : [
"0111138dc679d",
"0c18c499435e9",
],
},
"_created_at" : ISODate("2019-10-27T09:06:03.102+0000"),
"_updated_at" : ISODate("2019-10-27T09:06:05.099+0000"),
}
Service:
An example of one document; its _id is used as a foreign key in the servicesIds array inside the filter object:
{
"_id" : "0111138dc679d",
"name" : "Bay Services",
"character" : "B",
"company" : {
"id" : "f718a1c385",
"name" : "xxx"
},
"active" : true,
"tags" : [
],
"_created_at" : ISODate("2020-04-09T06:36:14.442+0000"),
"_updated_at" : ISODate("2020-06-06T03:52:16.770+0000"),
}
How can I do that?
Here is what I tried, but it keeps giving me an error reading
Mongo Server error '$in requires an array as a second argument, found: missing' on server
Here is my code:
db.getCollection("Exports").aggregate([
{
"$match": { "companyId":"818a2c4fc4" },
},
{
"$lookup": {
"from": "Service",
"let":{ id : "$_id" },
"pipeline": [
{
"$match":
{
"$expr":
{
"$in": ["$$id","$filter.servicesIds"]
}
}
}
],
"as":"services"
}
},
])
$unwind the array first. Alternatively, edit your question to show the result you expect, and I will correct my answer.
db.Exports.aggregate([
{
"$match": {
"companyId": "7feb1812d8"
}
},
{
"$unwind": "$filter.servicesIds"
},
{
"$lookup": {
"from": "Service",
"localField": "filter.servicesIds",
"foreignField": "_id",
"as": "docs"
}
}
])
https://mongoplayground.net/p/l2VweVYz1Fy
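As for the error in the question: inside a $lookup sub-pipeline, a path such as "$filter.servicesIds" is resolved against the Service documents, which have no such field, hence "found: missing". Fields of the Exports document have to be passed in through "let". The following is a sketch of that form, not a tested drop-in; the variable name "ids", the $ifNull guard and the $project stage are my own additions.

```javascript
// Sketch: pipeline-style $lookup that passes the local array in via `let`.
const pipeline = [
  { $match: { companyId: "7feb1812d8" } },
  {
    $lookup: {
      from: "Service",
      // Fall back to an empty array so $in never receives a missing value.
      let: { ids: { $ifNull: ["$filter.servicesIds", []] } },
      pipeline: [
        // "$_id" is the Service document's _id; "$$ids" comes from Exports.
        { $match: { $expr: { $in: ["$_id", "$$ids"] } } },
        // Keep only the service name in the joined documents.
        { $project: { _id: 0, name: 1 } }
      ],
      as: "services"
    }
  }
];

// In the mongo shell: db.Exports.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

With this form no $unwind is needed, and each Exports document keeps a single "services" array.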

How to get Child data after filtering dataset of Parents in a single mongo collection?

I have a single mongo collection called "CourseCollection" that contains both parent and child docs. Any document with the key "Parent" is a child doc, and a parent can have multiple child docs.
{
"_id" : "abracadavra",
"Name" : "abracadavra",
"Description" : "",
"Type" : "Spell",
"Parent" : {
"_id" : "Magic",
"Type" : "Course",
"Name" : "Magic"
}
},
{
"_id" : "Magic",
"Name" : "Magic",
"Type" : "Course",
"Access" : [
{
"_id" : "2sssdw5oe",
"Name" : "Abc"
},
{
"_id" : "4fddfye42",
"Name" : "Xyz"
}
]
}
What I'm trying to do is get all the child docs based on the Access of the parent doc.
Existing and working solution:
The solution that I have currently is to perform 2 queries.
Query 1. Get all the courses that the user has access to.
db.getCollection("CourseCollection").find({"Type": "Course", "Access._id": {"$in": ["2sssdw5oe"]}})
Query 2. Since I'm using Python, I do a list comprehension to get only the IDs of the course and then perform another query with this list
db.getCollection("CourseCollection").find({"Type": "Spell", "Parent._id": {"$in": course_list_id}})
Is there a way to get the child data after filtering the parents in a single query? I also tried aggregation, but only the results of the previous stage are passed to the next stage.
I guess you're trying to do something like this:
db.CourseCollection.aggregate([
{
"$match": {
"Type": "Spell"
}
},
{
"$lookup": {
"from": "CourseCollection",
"localField": "Parent._id",
"foreignField": "_id",
"as": "Parents"
}
},
{
"$match": {
"Parents": {
"$elemMatch": {
"Type": "Course",
"Access._id": {
"$in": [
"2sssdw5oe"
]
}
}
}
}
}
])
You can achieve the same result this way too (combining localField/foreignField with pipeline in one $lookup requires MongoDB 5.0 or later):
db.CourseCollection.aggregate([
{
"$match": {
"Type": "Spell"
}
},
{
"$lookup": {
"from": "CourseCollection",
"localField": "Parent._id",
"foreignField": "_id",
"as": "Parents",
"pipeline": [
{
"$match": {
"Type": "Course",
"Access._id": {
"$in": [
"2sssdw5oe"
]
}
}
}
]
}
},
{
"$match": {
"Parents.0": {
"$exists": true
}
}
}
])
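A further variant, sketched here as an assumption rather than as part of the answer above, starts from the parents instead: filter the courses first, join their children, and then promote each child to the top level. The field name "children" is hypothetical.

```javascript
// Sketch: filter parents first, then pull in and return their children.
const pipeline = [
  // Same filter as "Query 1" in the question.
  { $match: { Type: "Course", "Access._id": { $in: ["2sssdw5oe"] } } },
  {
    $lookup: {
      from: "CourseCollection",
      localField: "_id",
      foreignField: "Parent._id",
      as: "children"
    }
  },
  // One document per child, then make the child the root document.
  { $unwind: "$children" },
  { $replaceRoot: { newRoot: "$children" } }
];

// In the mongo shell: db.CourseCollection.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Filtering parents first may keep the $lookup input small when only a few courses match, though which direction is faster depends on the data and indexes.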

MongoDB Query Optimization using Multiple $lookup's and $group

This is a simplified schema of the database I am working with:
Collection: documents
{
"_Id": "1",
"business": "e.g food",
"relationships": "192",
"components": "ObjectId(34927493..)",
"_Score": "10",
...
}
Collection: components
{
"_Id": "280948304320",
"assessments": "8394",
"relationships": "192",
"results": "ObjectId(82394792343)...." // can be many results
}
Collection: results
{
"_Id": "7978394243",
"state": "severe",
"parentComponent": "ObjectId(28907403)",
"confidence": "0.5",
"category": "Inspection"
}
I have a mongoDB query which is taking 200+ seconds to execute. Here it is below:
db.documents.aggregate([
{$match:
{ "business" : "food"}
},
{
$unwind: "$components"
},
{
$lookup:
{
from: "components",
localField: "components",
foreignField: "_id",
as: "matching_components"
}
},
{
$unwind: "$matching_components"
},
{
$lookup:
{
from: "results",
localField: "components",
foreignField: "parentComponent",
as: "list_results"
}
},
{
$unwind: "$list_results"
},
{$group :
{ _id : '$list_results.state', count : {$sum : 1}}
}
])
I am wondering if there is any way for me to improve the performance of this query. I tried using a group statement in the beginning of the query that groups the documents into their business category, but that did not work as I realized it removes the fields needed for the rest of the query. I indexed for all the fields that I am looking across.
Just to be clear I want to group the documents by their business field. Then I want to map to another collection called components that contains results. After I use another lookup to finally map to the results collection, I want to ultimately count the frequency of each state by business. Currently as you can see, I am using a match in the beginning instead just to see if the query works for one business type. Though the query works, it is taking around 140 seconds.
EDIT: Example Result from this Aggregation:
{
"_id" : "State1",
"count" : 90699.0
}
{
"_id" : "State2",
"count" : 448869.0
}
{
"_id" : "State3",
"count" : 71399.0
}
{
"_id" : "State4",
"count" : 513928.0
}
{
"_id" : "State5",
"count" : 765509.0
}

How can I use a field from aggregate in a regex $match in mongodb?

A very simplified version of my use case is to find all posts beginning with the author's name, something like this:
> db.users.find();
{ "_id" : ObjectId("5c4185be19da7e815cb18f59"), "name" : "User1" }
{ "_id" : ObjectId("5c4185be19da7e815cb18f5a"), "name" : "User2" }
db.posts.insert([
{author : ObjectId("5c4185be19da7e815cb18f59"), text: "User1 is my name"},
{author : ObjectId("5c4185be19da7e815cb18f5a"), text: "My name is User2, but this post doesn't start with it"}
]);
So I want to identify all posts that start with the author's name. I'm trying with an aggregate like this, but I don't know how to extract the user's name from the aggregate pipeline to use in a regex match:
db.users.aggregate([
{
$lookup: {
from: "posts",
localField: "_id",
foreignField: "author",
as: "post"
}
},
{
$match: { "post.text": { $regex: "^" + name}}
}
]).pretty();
The "name" here is not defined anywhere; I need to extract the name from the users collection entry from the previous step of the pipeline. For some reason I don't understand how to do that.
This is probably super simple and I'm definitely feeling thick as a brick here... Any help highly appreciated!
You can use the aggregation below, which relies on $indexOfCP:
db.users.aggregate([
{ "$lookup": {
"from": "posts",
"let": { "authorId": "$_id", "name": "$name" },
"pipeline": [
{ "$match": {
"$expr": {
"$and": [
{ "$eq": [{ "$indexOfCP": ["$text", "$$name"] }, 0] }, // index 0: the text starts with the name
{ "$eq": ["$author", "$$authorId"] }
]
}
}}
],
"as": "post"
}}
])
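When reading the $indexOfCP comparison, it helps to keep "contains" and "starts with" apart: a result other than -1 only says the name occurs somewhere in the text, while a result of 0 says the text begins with it. A small plain-JavaScript illustration on the sample posts follows; String.prototype.indexOf counts UTF-16 code units rather than code points, which makes no difference for these ASCII strings.

```javascript
// The two sample posts, with the author's name already joined in.
const posts = [
  { name: "User1", text: "User1 is my name" },
  { name: "User2", text: "My name is User2, but this post doesn't start with it" }
];

// "Contains": index is anything other than -1.
const contains = posts.filter(p => p.text.indexOf(p.name) !== -1);

// "Starts with": index is exactly 0.
const startsWith = posts.filter(p => p.text.indexOf(p.name) === 0);

console.log(contains.length, startsWith.length);
```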

Aggregate Populate array of ids with Their Documents

I'm struggling with some aggregation functions in MongoDB.
I want to embed the book documents in the author's document, which currently holds only the book ids as an array of strings, like this:
Author Document
{
"_id" : "10",
"full_name" : "Joi Dark",
"books" : ["100", "200", "351"],
}
And other documents (books) :
{
"_id" : "100",
"title" : "Node.js In Action",
"ISBN" : "121215151515154",
"date" : "2015-10-10"
}
So as a result I want this:
{
"_id" : "10",
"full_name" : "Joi Dark",
"books" : [
{
"_id" : "100",
"title" : "Node.js In Action",
"ISBN" : "121215151515154",
"date" : "2015-10-10"
},
{
"_id" : "200",
"title" : "Book 2",
"ISBN" : "1212151454515154",
"date" : "2015-10-20"
},
{
"_id" : "351",
"title" : "Book 3",
"ISBN" : "1212151454515154",
"date" : "2015-11-20"
}
],
}
Use $lookup which retrieves data from the nominated collection in from based on the matching of the localField to the foreignField:
db.authors.aggregate([
{ "$lookup": {
"from": "books",
"foreignField": "_id",
"localField": "books",
"as": "books"
}}
])
The as is where in the document to write an "array" containing the related documents. If you specify an existing property ( such as is done here ) then that property is overwritten with the new array content in output.
If you are running a MongoDB version earlier than 3.4, you may need to $unwind the "books" array used as the localField first:
db.authors.aggregate([
{ "$unwind": "$books" },
{ "$lookup": {
"from": "books",
"foreignField": "_id",
"localField": "books",
"as": "books"
}}
])
Which creates a new document for each array member in the original document, therefore use $unwind again and $group to create the original form:
db.authors.aggregate([
{ "$unwind": "$books" },
{ "$lookup": {
"from": "books",
"foreignField": "_id",
"localField": "books",
"as": "books"
}},
{ "$unwind": "$books" },
{ "$group": {
"_id": "$_id",
"full_name": { "$first": "$full_name" },
"books": { "$push": "$books" }
}}
])
If in fact your _id values in the foreign collection are of ObjectId type, but the values in the localField are "string" versions of them, then you need to convert the data so the types match. There is no other way.
Run something like this through the shell to convert:
var ops = [];
db.authors.find().forEach(doc => {
doc.books = doc.books.map( book => new ObjectId(book.valueOf()) );
ops.push({
"updateOne": {
"filter": { "_id": doc._id },
"update": {
"$set": { "books": doc.books }
}
}
});
if ( ops.length >= 500 ) {
db.authors.bulkWrite(ops);
ops = [];
}
});
if ( ops.length > 0 ) {
db.authors.bulkWrite(ops);
ops = [];
}
That will convert all the values in the "books" array into real ObjectId values that can actually match in a $lookup operation.
Just adding on top of the previous answer. If your input consists of an array of strings and you want to convert them to ObjectIds, you can achieve this by using a projection, followed by a map and the $toObjectId method.
db.authors.aggregate([
{ $project: {
books: {
$map: {
input: '$books',
as: 'book',
in: { $toObjectId: '$$book' },
},
},
},},
{ $lookup: {
from: "books",
foreignField: "_id",
localField: "books",
as: "books"
}
},
])
Ideally, your database would be structured so that references are stored as ObjectIds, but where that is not an option, this is a viable solution.
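Note that a $project stage listing only "books" drops the author's other fields, such as "full_name". If those are needed, a sketch using $addFields in its place might look like the following. It assumes the foreign collection is named "books" and that every stored string is a valid 24-character hex ObjectId string, since $toObjectId raises an error otherwise.

```javascript
// Sketch: convert the string ids in place, keeping the other author fields.
const pipeline = [
  {
    $addFields: {
      books: {
        $map: {
          input: "$books",
          as: "book",
          in: { $toObjectId: "$$book" }
        }
      }
    }
  },
  {
    $lookup: {
      from: "books", // assumed collection name, no "$" prefix
      localField: "books",
      foreignField: "_id",
      as: "books"
    }
  }
];

// In the mongo shell: db.authors.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```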