Mongo aggregation framework on big data - mongodb

Could you please help me with mongoDB aggregation. Here is what I would like to do next:
I have collection A. A document from A represents an object like:
{
nameA: 'first',
items: [
'item1',
'item2',
'item3',
'item4'
]
}
And I have the Collection B with documents like:
[
{
item: 'item3',
info: 'info1'
},
{
item: 'item3',
info: 'info2'
},
{
item: 'item3',
info: 'info3'
}
]
I work with big data, so it would be better to do it in one query. Imagine that we already have all data from collection A. I would like to build a query on collection B to get next structure result:
{
'first'/*nameA*/: ['info1', 'info2', 'info3'],
....
}
How do I achieve the desired result with MongoDB aggregation?

As Rahul Kumar mentioned in his comment, your design is more leaning towards a relational database schema design, and it makes it quite difficult to design efficient MongoDB it.
However, it is still possible to achieve the functionality you are looking for by leveraging the $lookup stage of the aggregation framework, as follows:
db.A.aggregate([
{
$unwind: {
path: "$items"
}
},
{
$lookup: {
from: "B",
localField: "items",
foreignField: "item",
as: "item_info"
}
},
{
$unwind: {
path: "$item_info"
}
},
{
$group: {
_id: "$nameA",
item_info: { $addToSet: "$item_info.info" }
}
}
]);
In the first $unwind stage you normalize the items array on
collection A in order to be able to pass its output to the next
stage
In the $lookup stage you make a left join between two collections
that are part of the same database, in this case used to get the
item information from collection B
In the second $unwind stage you normalize the data you extracted
from collection B in order to flatten the array containing the
objects from collection B that were mapped to the corresponding
items in collection A
Finally, in the $group stage you group all the entries of the
result set by nameA and create an array of unique item information
values. If you would like to have all the duplicate occurrences of
the item information values, you can replace the $addToSet
accumulator with $push.
Below is the result of running the above aggregation pipeline on the collections that you provided:
{ "_id" : "second", "item_info" : [ "info3", "info2", "info1" ] }
{ "_id" : "first", "item_info" : [ "info3", "info2", "info1" ] }

Related

How to update fields in a MongoDB collection if certain conditions met between two collections?

What am I doing?
So I am trying to update two fields in my MongoDB collection. The collection name is mydata and looks like this:
{
id: 123,
name: "John",
class: "A-100",
class_id: "", <-- Need to update this field,
class_type: "", <-- Need to update this field
}
What do I want to do?
I have another collection that is older, but it contains two fields that I need that I do not have in my current collection. But they both have the id field that corresponds. This is how it looks like the other collection:
{
id: 123,
name: "John",
class: "A-100",
class_id: 235, <-- Field That I need,
class_type: "Math" <-- Field That I need
}
What have I done so far?
I started an aggregate function that starts with a $lookup then $unwind then $match then $project. Looks like this:
db.mydata.aggregate([
{
$lookup: {
from: "old_collection",
localField: "id",
foreignField: "id",
as: "newData"
}
},
{
$unwind: "newData"
},
{
$match: {"class": "A-100"}
},
{
$project: {
_id: 0,
"id": "$newData.id",
"class_id": "$newData.class_id",
"class_type": "$newData.class_type"
}
},
Need help here to update mydata collection in the
fields that I pointed in the top
])
In summary
What I am trying to do is: If two objects from different collections have the same Id then pick the keys from the second object and update the keys in the first object.
Is there a way to do that in MongoDB?
Thanks.

Is it possible to retrieve an item randomly and modify it with Mongodb?

I'm looking to retrieve an item randomly from my collection and modify it with a fixed value, unfortunately I can't find anything to do this in a single query. I have looked at the pipelines that aggregate () offers but none of them allow me to modify my database.
you may try to play with $sample and $merge aggregation stages:
db.test.aggregate(
[{
$sample: {
size: 1
}
},
{
$addFields: { // you may also look at $project
test: 1
}
},
{
$merge: {
into: "test",
on: "_id"
}
}
])

Find documents matching ObjectIDs in a foreign array

I have a collection Users:
{
_id: "5cds8f8rfdshfd"
name: "Ted"
attending: [ObjectId("2cd9fjdkfsld")]
}
I have another collection Events:
{
_id: "2cd9fjdkfsld"
title: "Some Event Attended"
},
{
_id: "34dshfj29jg"
title: "Some Event NOT Attended"
}
I would like to return a list of all events being attended by a given user. However, I need to do this query from the Events collection as this is part of a larger query.
I have gone through the following questions:
$lookup on ObjectId's in an array - This question has the array as a local field; mine is foreign
MongoDB lookup when foreign field is an array of objects - The array is of objects themselves
MongoDB lookup when foreign field is an array
I have tried various ways of modifying the above answers to fit my situation but have been unsuccessful. The second answer from the third question gets me closest but I would like to filter out unmatching results rather than have them returned with a value of 0.
My desired output:
[
{
_id: "2cd9fjdkfsld"
title: "Some Event Attended"
},
]
One option would be like this:
db.getCollection('Events').aggregate({
$lookup: // join
{
from: "Users", // on Users collection
let: { eId: "$_id" }, // keep a local variable "eId" that points to the currently looked at event's "_id"
pipeline: [{
$match: { // filter where
"_id": ObjectId("5c6efc937ef75175b2b8e7a4"), // a specific user
$expr: { $in: [ "$$eId", "$attending" ] } // attends the event we're looking at
}
}],
as: "users" // push all matched users into the "users" array
}
}, {
$match: { // remove events that the user does not attend
"users": { $ne: [] }
}
})
You could obviously get rid of the users field by adding another projection if needed.

How can I compare two fields in diffrent two collections in mongodb?

I am beginner in the MongoDB.
Right now, I am making one query by using mongo. Please look this and let me know is it possible? If it is possible, how can I do?
collection:students
[{id:a, name:a-name}, {id:b, name:b-name}, {id:c, name:c-name}]
collection:school
[{
name:schoolA,
students:[a,b,c]
}]
collection:room
[{
name:roomA,
students:[c,a]
}]
Expected result for roomA
{
name:roomA,
students:[
{id:a name:a-name isRoom:YES},
{id:b name:b-name isRoom:NO},
{id:c name:c-name isRoom:YES}
]
}
Not sure about the isRoom property, but to perform a join across collections, you'd have two basic options:
code it yourself, with multiple queries
use the aggregation pipeline with $lookup operator
As a quick example of $lookup, you can take a given room, unwind its students array (meaning separate out each student element into its own entity), and then look up the corresponding student id in the student collection.
Assuming a slight tweak to your room collection document:
[{
name:"roomA",
students:[ {studentId: "c"}, {studentId: "a"}]
}]
Something like:
db.room.aggregate([
{
$unwind: "$students"
},
{
$lookup:
{
from: "students",
localField: "studentid",
foreignField: "id",
as: "classroomStudents"
}
},
{
$project:
{ _id: 0, name : 1 , classroomStudents : 1 }
}
])
That would yield something like:
{
name:"roomA",
classroomStudents: [
{id:"a", name:"a-name"},
{id:"c", name:"c-name"}
]
}
Disclaimer: I haven't actually run this aggregation, so there may be a few slight issues. Just trying to give you an idea of how you'd go about solving this with $lookup.
More info on $lookup is here.

mongodb - perform batch query

I need query data from collection a first, then according to those data, query from collection b. Such as:
For each id queried from a
query data from b where "_id" == id
In SQL, this can be done by join table a & b in a single select. But in mongodb, it needs do multi query, it seems inefficient, doesn't it? Or it can be done by just 2 queries?(one for a, another for b, rather than 1 plus n) I know NoSQL doesn't support join, but is there a way to batch execute queries in for loop into a single query?
You'll need to do it as two steps.
Look into the $in operator (reference) which allows passing an array of _ids for example. Many would suggest you do those in batches of, say, 1000 _ids.
db.myCollection.find({ _id : { $in : [ 1, 2, 3, 4] }})
It's very simple, don't make so many DB calls for each id, it is very inefficient, it is possible to execute a single query which will return all documents relevant to each of the ids in a single pass using the $in operator in MongoDB, which is synonymous to in syntax in SQL so for example if you need to find out the documents for 5 ids in a single pass then
const ids = ['id1', 'id2', 'id3', 'id4', 'id5'];
const results = db.collectionName.find({ _id : { $in : ids }})
This will get you all the relevant documents in a single pass.
This is basically a join in SQL parlance and you can do it in mongo using an aggregate query called $lookup.
For example if you had some types in collections like these:
interface IFoo {
_id: ObjectId
name: string
}
interface IBar {
_id: ObjectId
fooId: ObjectId
title: string
}
Then query with an aggregate like this:
await db.foos.aggregate([
{
$match: { _id: { $in: ids } } // query criteria here
},
{
$lookup: {
from: 'bars',
localField: '_id',
foreignField: 'fooId',
as: 'bars'
}
}
])
May produce resulting objects like this:
{
"_id": "foo0",
"name": "example foo",
"bars": [
{ _id: "bar0", "fooId": "foo0", title: "crow bar" }
]
}