How can I compare two fields in two different collections in MongoDB?

I am a beginner with MongoDB.
I am trying to write a query and would like to know whether the following is possible and, if so, how to do it.
collection:students
[{id:a, name:a-name}, {id:b, name:b-name}, {id:c, name:c-name}]
collection:school
[{
name:schoolA,
students:[a,b,c]
}]
collection:room
[{
name:roomA,
students:[c,a]
}]
Expected result for roomA
{
name:roomA,
students:[
{id:a name:a-name isRoom:YES},
{id:b name:b-name isRoom:NO},
{id:c name:c-name isRoom:YES}
]
}

Not sure about the isRoom property, but to perform a join across collections, you'd have two basic options:
code it yourself, with multiple queries
use the aggregation pipeline with $lookup operator
As a quick example of $lookup, you can take a given room, unwind its students array (meaning separate out each student element into its own entity), and then look up the corresponding student id in the students collection.
Assuming a slight tweak to your room collection document:
[{
name:"roomA",
students:[ {studentId: "c"}, {studentId: "a"}]
}]
Something like:
db.room.aggregate([
{
$unwind: "$students"
},
{
$lookup:
{
from: "students",
localField: "students.studentId",
foreignField: "id",
as: "classroomStudents"
}
},
{
$project:
{ _id: 0, name : 1 , classroomStudents : 1 }
}
])
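That would yield something like the following (strictly, after the $unwind you get one output document per student, so a final $group stage would be needed to collect them into a single array like this):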
{
name:"roomA",
classroomStudents: [
{id:"a", name:"a-name"},
{id:"c", name:"c-name"}
]
}
Disclaimer: I haven't actually run this aggregation, so there may be a few slight issues. Just trying to give you an idea of how you'd go about solving this with $lookup.
More info on $lookup is in the MongoDB documentation.
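For the isRoom flag specifically, the first option (coding it yourself) may be the simplest. Below is a minimal sketch in plain JavaScript, assuming the students, school and room documents have already been fetched with separate queries. The helper name markRoomMembership and the in-memory arrays are illustrative only and not part of any MongoDB API.

```javascript
// Sample data copied from the question (assumed already fetched from MongoDB).
const students = [
  { id: 'a', name: 'a-name' },
  { id: 'b', name: 'b-name' },
  { id: 'c', name: 'c-name' },
];
const school = { name: 'schoolA', students: ['a', 'b', 'c'] };
const room = { name: 'roomA', students: ['c', 'a'] };

// Hypothetical helper: list every student of the school and flag
// whether each one also appears in the given room.
function markRoomMembership(students, school, room) {
  const byId = new Map(students.map((s) => [s.id, s]));
  const inRoom = new Set(room.students);
  return {
    name: room.name,
    students: school.students
      .filter((id) => byId.has(id)) // skip ids that have no student document
      .map((id) => ({
        id,
        name: byId.get(id).name,
        isRoom: inRoom.has(id) ? 'YES' : 'NO',
      })),
  };
}

const result = markRoomMembership(students, school, room);
console.log(JSON.stringify(result, null, 2));
```

The Set makes each membership check constant time, so the cost is roughly one pass over the school's student list.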

Related

Mongo $lookup, which way is the fastest?

It has been a while since I began using MongoDB aggregation.
It's a great way to perform complex queries, and it has improved my app's performance in ways I never thought possible.
However, I came across $lookup, and it appears that there are three ways of performing it. I would like to know the advantages and drawbacks of each of them.
For the examples below, I start from collectionA and use fieldA to match documents from collectionB on fieldB.
What I'd call preset $lookup
{
$lookup: {
from: 'collectionB',
localField: 'fieldA',
foreignField: 'fieldB',
as: 'documentsB'
}
}
What I'd call custom $lookup
{
$lookup: {
from: 'collectionB',
let: { valueA: '$fieldA' },
pipeline: [
{
$match: {
$expr: {
$eq: ['$$valueA', '$fieldB']
}
}
}
],
as: 'documentsB'
}
}
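Performing a find, then an aggregate on collectionB
// toArray() returns a Promise in the Node.js driver, so it must be awaited (inside an async function)
const docsA = await db.collection('collectionA').find({}).toArray();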
// Basically I will extract all values possible for the query to docB
const valuesForB = docsA.map((docA) => docA.fieldA);
db.collection('collectionB').aggregate([
{
$match: {
fieldB: { $in: valuesForB }
}
}
]);
I'd like to know:
which one is the fastest
whether any parameters make one faster than the others
whether any of them has limitations
From what I can tell:
find + aggregate is faster than preset $lookup, which is faster than custom $lookup
But then I wonder why custom $lookup exists...
If the data set is large, the preset $lookup will be faster.
Why? With $lookup the join is performed entirely at the database level, so the intermediate data never has to be held in a variable on the client.
With find + aggregate, the time grows with the data: you first pull the values into the application and then send an ever larger $in list back in the aggregate.
Tip: if you still want to use find + aggregate, have a look at MongoDB's distinct query, which returns each value only once.
Example:
// distinct() returns a Promise in the Node.js driver, so it must be awaited
const arr = await db.collection('collectionA').distinct('fieldA', {});
db.collection('collectionB').aggregate([
{
$match: {
fieldB: { $in: arr}
}
}
]);
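To illustrate why distinct helps, here is a rough in-memory sketch in plain JavaScript. The sample arrays are made up for illustration, and the Set and filter calls only mimic what distinct and $match with $in do on the server; they are not driver calls.

```javascript
// In-memory stand-ins for the two collections (illustrative data only).
const collectionA = [
  { fieldA: 1 }, { fieldA: 2 }, { fieldA: 1 }, { fieldA: 3 }, { fieldA: 2 },
];
const collectionB = [
  { fieldB: 1, label: 'one' },
  { fieldB: 3, label: 'three' },
  { fieldB: 4, label: 'four' },
];

// Step 1: roughly what distinct('fieldA') returns, each value once.
const distinctValues = [...new Set(collectionA.map((doc) => doc.fieldA))];

// Step 2: roughly what { $match: { fieldB: { $in: distinctValues } } } selects.
const wanted = new Set(distinctValues);
const matchedB = collectionB.filter((doc) => wanted.has(doc.fieldB));

console.log(distinctValues.length, 'values instead of', collectionA.length);
console.log(matchedB.map((doc) => doc.label));
```

The fewer repeated values end up in the $in list, the smaller the query that has to be sent back to the server.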

Why is MongoDB sort slow with lookup collections?

I have two collections in my MongoDB database:
employee_details, with approximately 330000 documents, each of which has a department_id referencing the departments collection
departments, with two fields: _id and dept_name
I want to join these two collections with $lookup, using department_id as the foreign key. The join works fine, but the query takes a long time once I add a sort.
Note: execution is fast if I remove either the $sort stage or the $lookup stage.
I have read several blog posts and SO answers, but none of them gives a solution that includes a sort.
My query is given below:
db.getCollection("employee_details").aggregate([
{
$lookup: {
from: "departments",
localField: "department_id",
foreignField: "_id",
as: "Department"
}
},
{ $unwind: { path: "$Department", preserveNullAndEmptyArrays: true } },
{ $sort: { employee_fname: -1 } },
{ $limit: 10 }
]);
Can someone suggest a way to make the above query run without the delay? My client cannot accept this performance, and I hope there is a fix, since NoSQL is intended to handle large databases.
Is there an indexing method I could use while keeping the same collection structure?
Thanks in advance.
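Currently the lookup is performed for every employee_details document, i.e. 330000 times. If you sort and limit before the lookup, it is performed only 10 times, which greatly decreases the query time. Reordering is safe here because the sort field belongs to employee_details and the $unwind keeps documents without a matching department.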
db.getCollection('employee_details').aggregate([
{$sort : {employee_fname: -1}},
{$limit :10},
{
$lookup : {
from : "departments",
localField : "department_id",
foreignField : "_id",
as : "Department"
}
},
{ $unwind : { path: "$Department", preserveNullAndEmptyArrays: true }},
])
After this, if you want to decrease the response time even further, you can define an index on the sort field.
db.employee_details.createIndex( { employee_fname: -1 } )
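The effect of the stage order can be illustrated with a small in-memory simulation in plain JavaScript. Everything here is made up for illustration (1000 generated employees, three departments, and a lookupDepartment function standing in for the $lookup stage); it only counts how many lookups each ordering performs and is not a benchmark of MongoDB itself.

```javascript
// Illustrative in-memory data: 1000 employees, 3 departments.
const departments = new Map([[1, 'Sales'], [2, 'IT'], [3, 'HR']]);
const employees = Array.from({ length: 1000 }, (_, i) => ({
  employee_fname: 'name' + String(i).padStart(4, '0'),
  department_id: (i % 3) + 1,
}));

let lookups = 0;
// Stand-in for the $lookup stage: one department fetch per input document.
function lookupDepartment(emp) {
  lookups += 1;
  return { ...emp, Department: { dept_name: departments.get(emp.department_id) } };
}
// Stand-in for { $sort: { employee_fname: -1 } }.
const byNameDesc = (a, b) => b.employee_fname.localeCompare(a.employee_fname);

// Original order: lookup, then sort, then limit.
lookups = 0;
const joinFirst = employees.map(lookupDepartment).sort(byNameDesc).slice(0, 10);
const lookupsJoinFirst = lookups;

// Reordered: sort, then limit, then lookup.
lookups = 0;
const limitFirst = [...employees].sort(byNameDesc).slice(0, 10).map(lookupDepartment);
const lookupsLimitFirst = lookups;

console.log('lookups when joining first:', lookupsJoinFirst);
console.log('lookups when limiting first:', lookupsLimitFirst);
```

Both orderings return the same ten documents in this simulation; only the number of lookups differs.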

MongoDB: aggregate IDs efficiently for bulk searches?

I have more than 8 references in a MongoDB document. They are ObjectIds stored in the origin document, and to get the actual data of the referenced documents I have to run an aggregation, something like this:
{
$lookup: {
from: "departments",
let: { "department": "$_department" },
pipeline: [
{ $match: { $expr: { $eq: ["$_id", "$$department"] }}},
],
as: "department"
}
},
{
$unwind: { "path": "$department", "preserveNullAndEmptyArrays": true }
},
That works: instead of the ObjectId I get the actual department object.
However, it takes time and makes my find queries very slow.
I have noticed that the same IDs occur many times, so it would be better to collect all of the unique IDs, fetch them from the DB once, and then reuse the same objects.
I don't know of any plugin or service that does this with MongoDB. I can write one myself, but before I start I would like to know whether such a service or package already exists on GitHub.
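The idea of collecting the unique IDs and fetching each one only once can be sketched in a few lines of plain JavaScript. This is only a rough illustration under stated assumptions: fetchDepartmentsByIds is a hypothetical stand-in for a single find with $in, populateDepartments is a made-up helper name, and the IDs are plain strings (real ObjectIds would have to be compared by their string form).

```javascript
// Hypothetical stand-in for one query such as
// departments.find({ _id: { $in: ids } }); it also counts how often it is called.
let queryCount = 0;
const departmentTable = [
  { _id: 'd1', dept_name: 'Sales' },
  { _id: 'd2', dept_name: 'IT' },
];
function fetchDepartmentsByIds(ids) {
  queryCount += 1;
  return departmentTable.filter((d) => ids.includes(d._id));
}

// Replace each document's _department id with the referenced object,
// fetching every distinct id only once.
function populateDepartments(docs) {
  const uniqueIds = [...new Set(docs.map((d) => d._department))];
  const byId = new Map(fetchDepartmentsByIds(uniqueIds).map((d) => [d._id, d]));
  return docs.map((doc) => ({
    ...doc,
    department: byId.get(doc._department) ?? null, // null when nothing matches
  }));
}

const populated = populateDepartments([
  { name: 'x', _department: 'd1' },
  { name: 'y', _department: 'd1' },
  { name: 'z', _department: 'd2' },
  { name: 'w', _department: 'd9' },
]);
console.log(queryCount, populated.map((d) => d.department && d.department.dept_name));
```

Documents that share an ID end up pointing at the same department object, so nothing is fetched or copied twice.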

Mongo aggregation framework on big data

Could you please help me with a MongoDB aggregation? Here is what I would like to do:
I have collection A. A document from A represents an object like:
{
nameA: 'first',
items: [
'item1',
'item2',
'item3',
'item4'
]
}
And I have the Collection B with documents like:
[
{
item: 'item3',
info: 'info1'
},
{
item: 'item3',
info: 'info2'
},
{
item: 'item3',
info: 'info3'
}
]
I work with big data, so it would be better to do it in one query. Imagine that we already have all the data from collection A. I would like to build a query on collection B that returns the following structure:
{
'first'/*nameA*/: ['info1', 'info2', 'info3'],
....
}
How do I achieve the desired result with MongoDB aggregation?
As Rahul Kumar mentioned in his comment, your design leans towards a relational database schema, which makes it quite difficult to design efficient MongoDB queries for it.
However, it is still possible to achieve the functionality you are looking for by leveraging the $lookup stage of the aggregation framework, as follows:
db.A.aggregate([
{
$unwind: {
path: "$items"
}
},
{
$lookup: {
from: "B",
localField: "items",
foreignField: "item",
as: "item_info"
}
},
{
$unwind: {
path: "$item_info"
}
},
{
$group: {
_id: "$nameA",
item_info: { $addToSet: "$item_info.info" }
}
}
]);
In the first $unwind stage you normalize the items array on collection A in order to be able to pass its output to the next stage.
In the $lookup stage you make a left join between two collections that are part of the same database, in this case used to get the item information from collection B.
In the second $unwind stage you normalize the data you extracted from collection B in order to flatten the array containing the objects from collection B that were mapped to the corresponding items in collection A.
Finally, in the $group stage you group all the entries of the result set by nameA and create an array of unique item information values. If you would like to have all the duplicate occurrences of the item information values, you can replace the $addToSet accumulator with $push.
Below is the result of running the above aggregation pipeline on the collections that you provided:
{ "_id" : "second", "item_info" : [ "info3", "info2", "info1" ] }
{ "_id" : "first", "item_info" : [ "info3", "info2", "info1" ] }
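To make the four stages more concrete, here is a rough plain-JavaScript equivalent that runs on in-memory arrays. It uses only the sample documents shown in the question (a single document named 'first') and is meant to illustrate what each stage does to the data, not how MongoDB executes it. Note that $addToSet does not guarantee any particular order, whereas the Set used here keeps insertion order.

```javascript
// Sample documents from the question.
const A = [{ nameA: 'first', items: ['item1', 'item2', 'item3', 'item4'] }];
const B = [
  { item: 'item3', info: 'info1' },
  { item: 'item3', info: 'info2' },
  { item: 'item3', info: 'info3' },
];

// First $unwind: one record per (nameA, item) pair.
const unwoundItems = A.flatMap((doc) =>
  doc.items.map((item) => ({ nameA: doc.nameA, items: item }))
);

// $lookup: attach all B documents whose item equals the record's item.
const joined = unwoundItems.map((rec) => ({
  ...rec,
  item_info: B.filter((b) => b.item === rec.items),
}));

// Second $unwind: one record per matched B document; records with an
// empty item_info array are dropped, as a plain $unwind does.
const unwoundInfo = joined.flatMap((rec) =>
  rec.item_info.map((info) => ({ ...rec, item_info: info }))
);

// $group with $addToSet: collect the unique info values per nameA.
const grouped = new Map();
for (const rec of unwoundInfo) {
  if (!grouped.has(rec.nameA)) grouped.set(rec.nameA, new Set());
  grouped.get(rec.nameA).add(rec.item_info.info);
}
const output = [...grouped].map(([nameA, infos]) => ({
  _id: nameA,
  item_info: [...infos],
}));
console.log(JSON.stringify(output));
```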

mongodb - perform batch query

I need to query data from collection a first and then, based on that data, query collection b. For example:
For each id queried from a
query data from b where "_id" == id
In SQL, this can be done in a single select by joining tables a and b. In MongoDB it seems to need multiple queries, which looks inefficient. Can it be done with just 2 queries (one for a, another for b, rather than 1 plus n)? I know NoSQL doesn't support joins, but is there a way to batch the queries from the for loop into a single query?
You'll need to do it as two steps.
Look into the $in operator, which allows passing an array of _ids, for example. Many would suggest doing this in batches of, say, 1000 _ids.
db.myCollection.find({ _id : { $in : [ 1, 2, 3, 4] }})
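Batching the ids could look roughly like the sketch below, written in plain JavaScript. The helper names chunk and buildInFilters are made up for illustration, and the batch size of 1000 is just the figure mentioned above; each resulting filter object could then be passed to a separate find call.

```javascript
// Split an array of ids into batches of at most `size` elements.
function chunk(ids, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size));
  }
  return batches;
}

// Build one $in filter per batch; each filter could be passed to find().
function buildInFilters(ids, size = 1000) {
  return chunk(ids, size).map((batch) => ({ _id: { $in: batch } }));
}

// Example: 2500 ids are split into three filters.
const exampleIds = Array.from({ length: 2500 }, (_, i) => i + 1);
const filters = buildInFilters(exampleIds);
console.log(filters.map((f) => f._id.$in.length));
```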
It's very simple: don't make a separate DB call for each id, which is very inefficient. You can execute a single query that returns all documents matching the ids in a single pass using MongoDB's $in operator, which is analogous to the IN syntax in SQL. For example, to find the documents for 5 ids in a single pass:
const ids = ['id1', 'id2', 'id3', 'id4', 'id5'];
const results = db.collectionName.find({ _id : { $in : ids }})
This will get you all the relevant documents in a single pass.
This is basically a join in SQL parlance, and you can do it in MongoDB with the $lookup aggregation stage.
For example if you had some types in collections like these:
interface IFoo {
_id: ObjectId
name: string
}
interface IBar {
_id: ObjectId
fooId: ObjectId
title: string
}
Then query with an aggregate like this:
await db.foos.aggregate([
{
$match: { _id: { $in: ids } } // query criteria here
},
{
$lookup: {
from: 'bars',
localField: '_id',
foreignField: 'fooId',
as: 'bars'
}
}
])
This may produce result objects like this:
{
"_id": "foo0",
"name": "example foo",
"bars": [
{ "_id": "bar0", "fooId": "foo0", "title": "crow bar" }
]
}