Problem 1
I have a collection named recipe in which every document has an array field ingredients. I want to count the items in that array and write the count into a new field ingredient_count.
Problem 2
There is also a collection named ingredient. The docs have a count field which is the total number of uses in all recipes.
My Current Approach
My solution right now is a script that aggregates over the collection and updates all documents one by one:
// PROBLEM 1: update recipe documents
db.recipe.aggregate(
[
{
$project: {
numberOfIngredients: { $size: "$ingredients" }
}
}
]
).forEach(function(recipe) {
db.recipe.updateOne(
{ _id: recipe._id },
{ $set: { ingredient_count: recipe.numberOfIngredients } }
)
});
// PROBLEM 2: update ingredient documents
db.ingredient.find().snapshot().forEach(function(ingredient) {
db.ingredient.updateOne(
{ _id: ingredient._id },
{ $set: { count: db.recipe.count({ ingredients: { $in: [ingredient.name] } }) } }
)
});
This is terribly slow. Any idea how to do this more efficiently?
For both problems, you can run a single aggregation that writes its output to a new collection, which then replaces the existing one:
Problem 1
The aggregation uses one $project stage that counts the ingredients and lists the fields to keep:
db.recipe.aggregate([{
$project: {
ingredients: 1,
numberOfIngredients: { $size: "$ingredients" }
}
}, {
$out: "recipeNew"
}])
which gives you:
{ "_id" : ObjectId("58155bc09c924e717c5c4240"), "ingredients" : [......], "numberOfIngredients" : 5 }
{ "_id" : ObjectId("58155bc19c924e717c5c4241"), "ingredients" : [......], "numberOfIngredients" : 3 }
The result of the aggregation is written to a new collection recipeNew, which can replace the existing recipe collection.
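If replacing the whole collection is not desirable, an in-place update may be worth considering. The sketch below assumes MongoDB 4.2 or later, where updateMany accepts an aggregation pipeline. The $ifNull guard for recipes without an ingredients array is an addition that is not in the answer above, and the variable name countIngredients is purely illustrative. The typeof db check only lets the snippet load outside the mongo shell.

```javascript
// Update pipeline: store the length of the ingredients array in ingredient_count.
// $ifNull substitutes [] when ingredients is missing or null (an assumption
// about the data; $size fails on a non-array argument).
const countIngredients = [
  { $set: { ingredient_count: { $size: { $ifNull: ["$ingredients", []] } } } }
];

// Runs only inside the mongo shell, where the global `db` exists.
if (typeof db !== "undefined") {
  db.recipe.updateMany({}, countIngredients);
}
```

This issues one command for the whole collection instead of one updateOne per recipe.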
Problem 2
The aggregation contains:
1 $unwind to flatten the ingredients array into one document per ingredient
1 $group to count the occurrences of each ingredient, grouped by ingredient _id
1 $lookup to join the ingredients collection and retrieve the remaining fields of each ingredient
1 $unwind to flatten the array of joined ingredient documents
1 $project to select the fields to keep
1 $out to write the result to a new collection
The query is:
db.recipe.aggregate([{
$unwind: "$ingredients"
}, {
$group: { _id: "$ingredients", IngredientsNumber: { $sum: 1 } }
}, {
$lookup: {
from: "ingredients",
localField: "_id",
foreignField: "_id",
as: "ingredientsDB"
}
}, {
$unwind: { path: "$ingredientsDB", preserveNullAndEmptyArrays: true }
}, {
$project: {
ingredientsNumber: "$IngredientsNumber",
name: "$ingredientsDB.name"
}
}, {
$out: "ingredientsTemp"
}])
That gives :
{ "_id" : ObjectId("5812caaeb4829937f4599b54"), "ingredientsNumber" : 2, "name" : "ingredients5" }
{ "_id" : ObjectId("5812caaeb4829937f4599b53"), "ingredientsNumber" : 1, "name" : "ingredients4" }
{ "_id" : ObjectId("5812caaeb4829937f4599b52"), "ingredientsNumber" : 2, "name" : "ingredients3" }
{ "_id" : ObjectId("5812caaeb4829937f4599b51"), "ingredientsNumber" : 1, "name" : "ingredients2" }
{ "_id" : ObjectId("5812caaeb4829937f4599b50"), "ingredientsNumber" : 2, "name" : "ingredients1" }
The cons of this solution:
It uses $project, so you need to list the fields to keep.
The new ingredientsTemp collection contains only ingredients that actually appear in recipes, so one additional aggregation with a $lookup is needed to join the existing ingredients collection with the aggregation result.
The following joins the existing ingredients collection with the one we have created:
db.ingredients.aggregate([{
$lookup: {
from: "ingredientsTemp",
localField: "_id",
foreignField: "_id",
as: "ingredientsDB"
}
}, {
$unwind: { path: "$ingredientsDB", preserveNullAndEmptyArrays: true }
}, {
$project: {
name: "$name",
ingredientsNumber: "$ingredientsDB.ingredientsNumber"
}
}])
Then you would have :
{ "_id" : ObjectId("5812caaeb4829937f4599b50"), "name" : "ingredients1", "ingredientsNumber" : 2 }
{ "_id" : ObjectId("5812caaeb4829937f4599b51"), "name" : "ingredients2", "ingredientsNumber" : 1 }
{ "_id" : ObjectId("5812caaeb4829937f4599b52"), "name" : "ingredients3", "ingredientsNumber" : 2 }
{ "_id" : ObjectId("5812caaeb4829937f4599b53"), "name" : "ingredients4", "ingredientsNumber" : 1 }
{ "_id" : ObjectId("5812caaeb4829937f4599b54"), "name" : "ingredients5", "ingredientsNumber" : 2 }
{ "_id" : ObjectId("5812caaeb4829937f4599b57"), "name" : "ingredients6" }
The pros:
It uses only aggregation, so it should be faster than updating the documents one by one.
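On MongoDB 4.2 or later, the temporary collection and the second join can possibly be avoided with a $merge stage. This is a sketch under the same assumptions as the answer above: the ingredients array of each recipe holds ingredient _id values, and the target collection is called ingredients. The variable name mergeCounts is illustrative.

```javascript
const mergeCounts = [
  // One document per (recipe, ingredient) pair.
  { $unwind: "$ingredients" },
  // Count the occurrences of each ingredient _id across all recipes.
  { $group: { _id: "$ingredients", count: { $sum: 1 } } },
  // Merge the count field into the ingredient document with the same _id;
  // ids without a matching ingredient document are discarded.
  {
    $merge: {
      into: "ingredients",
      on: "_id",
      whenMatched: "merge",
      whenNotMatched: "discard"
    }
  }
];

// Runs only inside the mongo shell, where the global `db` exists.
if (typeof db !== "undefined") {
  db.recipe.aggregate(mergeCounts);
}
```

Ingredients that appear in no recipe are not touched by this pipeline, so their count keeps its previous value; resetting count to 0 beforehand would be a separate step.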
Related
I have two collections events & members :
events Schema :
{
name : String,
members: [{status : Number, memberId : {type: Schema.Types.ObjectId, ref: 'members'}}]
}
events Sample Doc :
"_id" : ObjectId("5e8b0bac041a913bc608d69d")
"members" : [
{
"status" : 4,
"_id" : ObjectId("5e8b0bac041a913bc608d69e"),
"memberId" : ObjectId("5e7dbf5b257e6b18a62f2da9"),
"date" : ISODate("2020-04-06T10:59:56.997Z")
},
{
"status" : 1,
"_id" : ObjectId("5e8b0bf2041a913bc608d6a3"),
"memberId" : ObjectId("5e7e2f048f80b46d786bfd67"),
"date" : ISODate("2020-04-06T11:01:06.463Z")
}
],
members Schema :
{
firstname : String,
photo : String
}
members Sample Doc :
[{
"_id" : ObjectId("5e7dbf5b257e6b18a62f2da9"),
"firstname" : "raed",
"photo" : "/users/5e7dbf5b257e6b18a62f2da9/profile/profile-02b13aef6e.png"
},
{
"_id" : ObjectId("5e7e2f048f80b46d786bfd67"),
"firstname" : "sarra",
"photo" : "/5e7e2f048f80b46d786bfd67/profile/profile-c79f91aa2e.png"
}]
I wrote an aggregate query with $lookup to populate the members' data, and I want to prepend a string to each member's photo field with $concat, but I get an error.
How can I do the concat?
Query :
db.getCollection('events').aggregate([
{ $match: { _id: ObjectId("5e8b0bac041a913bc608d69d")}},
{
"$lookup": {
"from": "members",
"localField": "members.memberId",
"foreignField": "_id",
"as": "Members"
}
},
{
$project: {
"Members.firstname" : 1,
"Members.photo": 1,
//"Members.photo": {$concat:["http://myurl", "$Members.photo"]},
"Members._id" : 1,
},
}
])
Result without the concat :
{
"_id" : ObjectId("5e8b0bac041a913bc608d69d"),
"Members" : [
{
"_id" : ObjectId("5e7dbf5b257e6b18a62f2da9"),
"firstname" : "raed",
"photo" : "/users/5e7dbf5b257e6b18a62f2da9/profile/profile-02b13aef6e.png"
},
{
"_id" : ObjectId("5e7e2f048f80b46d786bfd67"),
"firstname" : "sarra",
"photo" : "/5e7e2f048f80b46d786bfd67/profile/profile-c79f91aa2e.png"
}
]
}
Error :
$concat only supports strings, not array
You can do that by adding a pipeline to the $lookup stage, so that $concat runs on each member document rather than on the joined array:
db.events.aggregate([
{
$match: {
_id: ObjectId("5e8b0bac041a913bc608d69d"),
},
},
{
$lookup: {
from: "members",
let: { memberId: "$members.memberId" },
pipeline: [
{ $match: { $expr: { $in: ["$_id", "$$memberId"] } } },
{
$project: {
firstname: 1,
photo: { $concat: ["http://myurl", "$photo"] }
}
}
],
as: "Members",
}
},
/** Optional */
{$project : {Members: 1}}
]);
Test : MongoDB-Playground
As an alternative to the $lookup pipeline in the answer above, we can use $unwind, $project, and $group:
db.events.aggregate([
{
$match: { _id: ObjectId("5e8b0bac041a913bc608d69d") }
},
{
$unwind: '$members' // to spread the members array into a stream of documents
},
{
$lookup: {
from: "members",
localField: "members.memberId",
foreignField: "_id",
as: "member"
}
},
{
$unwind: '$member' // each document will have array of only one member, so do this unwind to convert it to an object
},
{
$project: { // do the project here to be able to use the $concat operator
'member._id': 1,
'member.firstname': 1,
'member.photo': { $concat: ['http://myurl', '$member.photo'] } // member is now an object, so $member.photo is a single string and $concat works
}
},
{
$group: { // do that grouping stage to gather all the members belong to the same document in one array again
_id: '$_id',
Members: {
$addToSet: '$member'
}
}
}
])
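The error in the question arises because "$Members.photo" resolves to an array of strings once Members is an array. A third option, sketched here, keeps the original $lookup and rewrites each array element with $map. The variable name withPhotoUrls is illustrative; the URL prefix and the _id are the ones from the question.

```javascript
const withPhotoUrls = [
  {
    $lookup: {
      from: "members",
      localField: "members.memberId",
      foreignField: "_id",
      as: "Members"
    }
  },
  {
    $project: {
      Members: {
        // Rebuild every joined member, prefixing its photo path.
        $map: {
          input: "$Members",
          as: "m",
          in: {
            _id: "$$m._id",
            firstname: "$$m.firstname",
            photo: { $concat: ["http://myurl", "$$m.photo"] }
          }
        }
      }
    }
  }
];

// Runs only inside the mongo shell, where `db`, `ObjectId` and `printjson` exist.
if (typeof db !== "undefined") {
  db.events.aggregate([
    { $match: { _id: ObjectId("5e8b0bac041a913bc608d69d") } },
    ...withPhotoUrls
  ]).forEach(doc => printjson(doc));
}
```

Unlike the $unwind and $group variant, this keeps the members in the order returned by $lookup and needs no regrouping.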
I want to get the order of some user from a list after $sort aggregation pipeline.
Let's say we have a leaderboard, and I need to get my rank in the leaderboard with only one query getting only my data.
I have tried $addFields and some queries with $map
Let's say we have these documents
/* 1 createdAt:8/18/2019, 4:42:41 PM*/
{
"_id" : ObjectId("5d5963e1c6c93b2da849f067"),
"name" : "x4",
"points" : 69
},
/* 2 createdAt:8/18/2019, 4:42:41 PM*/
{
"_id" : ObjectId("5d5963e1c6c93b2da849f07b"),
"name" : "x24",
"points" : 968
},
/* 3 createdAt:8/18/2019, 4:42:41 PM*/
{
"_id" : ObjectId("5d5963e1c6c93b2da849f06a"),
"name" : "x7",
"points" : 997
},
And I want to write a query like this
db.table.aggregate(
[
{ $sort : { points : 1 } },
{ $addFields: { order : "$index" } },
{ $match : { name : "x24" } }
]
)
I need to inject the order field with something like $index
I expect to have something like this in return
{
"_id" : ObjectId("5d5963e1c6c93b2da849f07b"),
"name" : "x24",
"points" : 968,
"order" : 2
}
I need something like the metadata of the result here which return 2
/* 2 createdAt:8/18/2019, 4:42:41 PM*/
One workaround is to collect all documents into a single array, use $unwind with includeArrayIndex to recover each document's position in that array, and finally project the required fields. Note that the index is zero-based, so the first document gets order 0.
db.collection.aggregate([
{ $sort: { points: 1 } },
{
$group: {
_id: 1,
register: { $push: { _id: "$_id", name: "$name", points: "$points" } }
}
},
{ $unwind: { path: "$register", includeArrayIndex: "order" } },
{ $match: { "register.name": "x4" } },
{
$project: {
_id: "$register._id",
name: "$register.name",
points: "$register.points",
order: 1
}
}
]);
To make it more efficient, you can add $limit, $match, or a filter as your requirements allow.
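On MongoDB 5.0 or later, a window function may be simpler than pushing every document into one array. This sketch assumes the $setWindowFields stage with $documentNumber, which numbers documents from 1 in sort order, so x24 would be expected to get order 2 for the three sample documents. The collection name table is taken from the question; rankPipeline is an illustrative name.

```javascript
const rankPipeline = [
  {
    // Number all documents by ascending points before filtering.
    $setWindowFields: {
      sortBy: { points: 1 },
      output: { order: { $documentNumber: {} } }
    }
  },
  // Keep only the requested user, with its position attached.
  { $match: { name: "x24" } }
];

// Runs only inside the mongo shell, where `db` and `printjson` exist.
if (typeof db !== "undefined") {
  db.table.aggregate(rankPipeline).forEach(doc => printjson(doc));
}
```

$rank or $denseRank can replace $documentNumber if documents with equal points should share a position.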
I have many collections whose documents all have the same shape as the JSON below. I want to combine them by _id into a single collection. How can I do it?
A.json
{
"_id" : ObjectId("58455d2d506c1cab1c82152c"),
"value" : 515835.0
}
{
"_id" : ObjectId("58455d2d506c1cab1c82153c"),
"value" : 6621696.0
}
B.json
{
"_id" : ObjectId("58455d2d506c1cab1c82152c"),
"value" : 2118.0
}
{
"_id" : ObjectId("58455d2d506c1cab1c82153c"),
"value" : 1190.0
}
{
"_id" : ObjectId("423232d2d506c1cab1c1232c"),
"value" : 10.0
}
If an _id in collection A matches an _id in collection B, the two values should be added together.
If an _id appears in only one collection, only that value should be shown.
The resulting collection should contain one _id and value pair per _id.
Result.json
{
"_id" : ObjectId("58455d2d506c1cab1c82152c"),
"value" : 517953.0 // A.value + B.value
}
{
"_id" : ObjectId("58455d2d506c1cab1c82153c"),
"value" : 6622886.0 // A.value + B.value
}
{
"_id" : ObjectId("423232d2d506c1cab1c1232c"),
"value" : 10.0 // B.value (A.value : null)
}
I want this for multiple collections.
For the one-to-zero-or-one relation in the example, you can use $lookup as follows:
db.B.aggregate([
{$lookup: {
from: "A",
localField: "_id",
foreignField: "_id",
as: "a"
}},
{$unwind: {path: "$a", preserveNullAndEmptyArrays: true}},
{$project: {
value: {$add: ["$value", {$ifNull: ["$a.value", 0 ]}]}
}}
]);
It ignores any documents in A that have no corresponding document in B, i.e. the result has the same number of documents as collection B.
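To also keep documents that exist in only one collection, and to extend the idea to more than two collections, $unionWith followed by $group is one possible approach. This sketch assumes MongoDB 4.4 or later; the collection name "C" and the variable names otherCollections and sumById are hypothetical.

```javascript
// Names of the additional collections to fold into A ("C" is hypothetical).
const otherCollections = ["B", "C"];

const sumById = [
  // Append the documents of every other collection to the stream from A.
  ...otherCollections.map(coll => ({ $unionWith: { coll: coll } })),
  // Add up the values of all documents that share an _id.
  { $group: { _id: "$_id", value: { $sum: "$value" } } }
];

// Runs only inside the mongo shell, where `db` and `printjson` exist.
if (typeof db !== "undefined") {
  db.A.aggregate(sumById).forEach(doc => printjson(doc));
}
```

An _id that occurs in a single collection simply forms a group of one, so its value passes through unchanged.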
Is there a query I can use on the following collection to get the result at the bottom?
Example:
{
"_id" : ObjectId(xyz),
"name" : "Carl",
"something":"else"
},
{
"_id" : ObjectId(aaa),
"name" : "Lenny",
"something":"else"
},
{
"_id" : ObjectId(bbb),
"name" : "Carl",
"something":"other"
}
I need a query to get this result:
{
"_id" : ObjectId(xyz),
"name" : "Carl"
},
{
"_id" : ObjectId(aaa),
"name" : "Lenny"
}
A set of documents with no identical names. It's not important which _ids are kept.
You can use the aggregation framework to get this shape; the query could look like this:
db.collection.aggregate(
[
{
$group:
{
_id: "$name",
id: { $first: "$_id" }
}
},
{
$project:{
_id:"$id",
name:"$_id"
}
}
]
)
As long as you don't need other fields, this will be sufficient.
If you need other fields, please update the document structure and expected result.
As you don't care about the _ids, it can be simplified (each name is then returned in the _id field):
db.collection.aggregate([{$group:{_id: "$name"}}])
I have the following MongoDB collection db.students:
/* 0 */
{
"id" : "0000",
"name" : "John",
"subjects" : [
{
"professor" : "Smith",
"day" : "Monday"
},
{
"professor" : "Smith",
"day" : "Tuesday"
}
]
}
/* 1 */
{
"id" : "0001",
"name" : "Mike",
"subjects" : [
{
"professor" : "Smith",
"day" : "Monday"
}
]
}
I want to find the number of subjects for a given student. I have a query:
db.students.find({'id':'0000'})
that will return the student document. How do I find the count for 'subjects'? Is it doable in a simple query?
If the query returns just one document:
db.students.find({'id':'0000'})[0].subjects.length;
For multiple documents in the cursor:
db.students.find({'id':'0000'}).forEach(function(doc) {
print(doc.subjects.length);
})
Do not forget to check that subjects exists, either in the query or before reading .length.
You could use the aggregation framework:
db.students.aggregate(
[
{ $match : {'id': '0000'}},
{ $unwind : "$subjects" },
{ $group : { _id : null, number : { $sum : 1 } } }
]
);
The $match stage filters on the student's id.
The $unwind stage deconstructs the subjects array into one document per subject.
The $group stage does the counting. _id is null because you are counting for only one student and need a single total.
You will have a result like :
{ "result" : [ { "_id" : null, "number" : 187 } ], "ok" : 1 }
Just another nice and simple aggregation solution:
db.students.aggregate([
{ $match : { 'id':'0000' } },
{ $project: {
subjectsCount: { $cond: {
if: { $isArray: "$subjects" },
then: { $size: "$subjects" },
else: 0
}
}
}
}
]).then(result => {
// handle result
}).catch(err => {
throw err;
});
Thanks!