MongoDB aggregate function with reference to another collection

I have two collections
Persons
{
"name": "Tom",
"car_id": "55b73e3e8ead0e220d8b45f3"
}
...
Cars
{
"model": "BMW",
"car_id": "55b73e3e8ead0e220d8b45f3"
}
...
How can I write a query that gives the following results (for example)?
BMW : 2
Toyota: 3
Below is my aggregate function. I can get the data out, but it returns the car_id instead of the model name.
db.persons.aggregate([
{
$group: {
_id: {
car_Id: "$car_id",
},
carsCount: {$sum: 1}
},
},
]);
Appreciate any assistance.

With one query you can aggregate within a single collection. I would suggest keeping the aggregated data in a form like this, or similar:
{
car_id: "273645jhg2f52345hj3564jh6",
sum: 4
}
Then replace the ID with the name when you need to expose the data to the user.
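The last step, swapping the ID for the model name before showing it to the user, could look roughly like this in application code. It is a minimal sketch that assumes the aggregated counts have the { car_id, sum } shape suggested in this answer and that the Cars documents have already been fetched with a plain find(). The function name attachModelNames and the sample data are hypothetical.

```javascript
// Hypothetical helper: replace car_id with the model name on the client side.
// `counts` is assumed to be the aggregation output, `cars` the Cars documents.
function attachModelNames(counts, cars) {
  // Build a car_id -> model lookup table once.
  const modelById = new Map(cars.map((car) => [car.car_id, car.model]));
  return counts.map((row) => ({
    // Fall back to the raw id if no matching car document exists.
    model: modelById.has(row.car_id) ? modelById.get(row.car_id) : row.car_id,
    count: row.sum,
  }));
}

// Made-up sample data for illustration.
const counts = [
  { car_id: "55b73e3e8ead0e220d8b45f3", sum: 2 },
  { car_id: "55b73e3e8ead0e220d8b45f4", sum: 3 },
];
const cars = [
  { model: "BMW", car_id: "55b73e3e8ead0e220d8b45f3" },
  { model: "Toyota", car_id: "55b73e3e8ead0e220d8b45f4" },
];

for (const { model, count } of attachModelNames(counts, cars)) {
  console.log(`${model}: ${count}`);
}
```

Building the Map once keeps the merge linear in the number of cars plus the number of count rows.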

Related

Can't remove object in array using Mongoose

This has been extensively covered here, but none of the solutions seems to be working for me. I'm attempting to remove an object from an array using that object's id. Currently, my Schema is:
const scheduleSchema = new Schema({
//unrelated
_id: ObjectId,
shifts: [
{
_id: Types.ObjectId,
name: String,
shift_start: Date,
shift_end: Date,
},
],
});
I've tried almost every variation of something like this:
.findOneAndUpdate(
{ _id: req.params.id },
{
$pull: {
shifts: { _id: new Types.ObjectId(req.params.id) },
},
}
);
Within these variations, the usual response I've gotten has been either an empty array or null.
I was able to find a partial way around this and accomplish the deletion by using the main _id of the Schema (instead of the nested one):
.findOneAndUpdate(
{ _id: <main _id> },
{ $pull: { shifts: { _id: new Types.ObjectId(<nested _id>) } } },
{ new: true }
);
But I was hoping to figure out a way to do this by just using the nested _id. Any suggestions?
The problem is that you are using the same value (req.params.id) both to find the parent document and to match the nested shift.
In MongoDB, the update method takes three objects: query, update and options.
query selects the document in the collection that will be updated.
update is the action to perform on that document (add a field, change a value...).
options holds any additional options.
Then, assuming you have this collection:
[
{
"_id": 1,
"shifts": [
{
"_id": 2
},
{
"_id": 3
}
]
}
]
If you look for a document whose _id is 2, the response will obviously be empty.
And if no document is found, no document is updated.
What happens if we look for a document using shifts._id:2?
This tells mongo "search for a document where the shifts field contains an object with _id equal to 2". This query works, but be careful: it returns the WHOLE document, not only the array element that matches the _id.
It does not return:
[
{
"_id": 1,
"shifts": [
{
"_id": 2
}
]
}
]
With this query mongo returns the ENTIRE document that has a field called shifts containing an object with _id equal to 2. This includes the whole array.
So, with that, you know why the find works. Adding this to an update gives you the following queries.
This one removes all shifts whose _id is equal to 2:
db.collection.update({
"shifts._id": 2
},
{
$pull: {
shifts: {
_id: 2
}
}
})
Or this one, to remove the shift with _id 2 only if the parent _id is equal to 1:
db.collection.update({
"_id": 1
},
{
$pull: {
shifts: {
_id: 2
}
}
})
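Applied to the Mongoose schema from the question, the first variant (matching only on the nested _id) might be wrapped as below. This is a sketch rather than a tested solution: buildPullShift is a hypothetical helper and Schedule is an assumed model name. The id is passed as a plain value so the snippet runs without a database; with real data you would pass new Types.ObjectId(...) as in the question.

```javascript
// Hypothetical helper: builds the arguments for findOneAndUpdate so that a
// shift is located and removed using only the nested shift _id.
function buildPullShift(shiftId) {
  return {
    // Find the parent document that contains the shift.
    filter: { "shifts._id": shiftId },
    // Remove every array element whose _id equals shiftId.
    update: { $pull: { shifts: { _id: shiftId } } },
    // Ask Mongoose to return the document as it looks after the update.
    options: { new: true },
  };
}

const { filter, update, options } = buildPullShift(2);
console.log(JSON.stringify(filter));
console.log(JSON.stringify(update));

// Assumed usage with a Mongoose model named Schedule (not run here):
// Schedule.findOneAndUpdate(filter, update, options);
```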

MongoDB - Query to fetch all related documents with same relation

Let's say I have a student document that looks something like that:
{
"_id": 1,
"full_name": "John doe",
"gpa": 87,
"class_id": 17
}
My input is one student_id and I need to fetch all of the students from the same class.
I can of course first fetch the student by its id and then fetch all of the related students by the class_id, but that would take 2 queries.
The question is, can I do it using only one query that returns an array of all the class students (including the student with the student_id I have as input)?
First you can group by class_id, so all students are divided by class. Then match on the student's _id to keep only the class that contains that student.
db.collection.aggregate([
{
$group: {
_id: "$class_id",
students: {
$push: "$$ROOT"
}
}
},
{
$match: {
"students._id": 1
}
}
])
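To make the shape of the result concrete, here is a plain JavaScript imitation of the two stages, run on made-up data. It only illustrates what the pipeline returns (one document per class, with _id holding the class_id); it is not how MongoDB executes it, and classOf and the extra sample students are hypothetical.

```javascript
// Imitates: $group by class_id (pushing whole documents),
// then $match on "students._id".
function classOf(students, studentId) {
  const groups = new Map();
  for (const s of students) {
    if (!groups.has(s.class_id)) groups.set(s.class_id, []);
    groups.get(s.class_id).push(s); // like $push: "$$ROOT"
  }
  const result = [];
  for (const [classId, members] of groups) {
    // like $match: { "students._id": studentId }
    if (members.some((m) => m._id === studentId)) {
      result.push({ _id: classId, students: members });
    }
  }
  return result;
}

// Made-up sample data; only the first document comes from the question.
const students = [
  { _id: 1, full_name: "John doe", gpa: 87, class_id: 17 },
  { _id: 2, full_name: "Jane roe", gpa: 91, class_id: 17 },
  { _id: 3, full_name: "Sam poe", gpa: 78, class_id: 18 },
];

console.log(JSON.stringify(classOf(students, 1)));
```

Because the $group stage comes first, the pipeline groups every student in the collection before the filter is applied, so on a large collection it may be worth measuring it against the two-query approach.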

Mongo DB - How to create a dynamic field based on the presence of element in array?

I have a problem in Mongo that I can't find an efficient way to solve.
Say I have a 'Course' collection something like this (index is created on the 'studentIds' field):
{
"courseId": 1,
"name": "Mathematics",
"studentIds": [1,3,5]
...
...
}
{
"courseId": 2,
"name": "Physics",
"studentIds": [2,3,5]
...
...
}
I am trying to write a query which would return records in the below format:
Say student 1 is querying the courses. He is enrolled in courseId 1, so 'enrolled' is true, but he is not enrolled in courseId 2, so 'enrolled' is false:
{
"courseId": 1,
"name": "Mathematics",
"enrolled": true
}
{
"courseId": 2,
"name": "Physics",
"enrolled": false
}
The only solution I can think of is to run two queries: the first finds all course IDs the student is enrolled in, and then, while iterating the cursor over the courses in the second query, I add the 'enrolled' field based on whether the courseId appears in the result of the first query. I am looking for a way to achieve this in a single query.
Thanks.
You just need the $in operator:
let studentId = 1;
db.collection.aggregate([
{
$project: {
courseId: 1,
name: 1,
enrolled: { $in: [ studentId, "$studentIds" ] }
}
}
])
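For the two sample courses, the $in expression behaves per document much like Array.prototype.includes. The sketch below imitates the projection in plain JavaScript on the data from the question. withEnrolled is a hypothetical name, and the real $project stage would also return _id unless it is excluded.

```javascript
// Plain JavaScript imitation of the $project stage (illustration only).
function withEnrolled(courses, studentId) {
  return courses.map((course) => ({
    courseId: course.courseId,
    name: course.name,
    // Same idea as { $in: [ studentId, "$studentIds" ] }
    enrolled: course.studentIds.includes(studentId),
  }));
}

// Sample documents taken from the question.
const courses = [
  { courseId: 1, name: "Mathematics", studentIds: [1, 3, 5] },
  { courseId: 2, name: "Physics", studentIds: [2, 3, 5] },
];

console.log(JSON.stringify(withEnrolled(courses, 1)));
```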

MongoDB aggregate query extremely slow

I have a MongoDB query which runs extremely slowly without an index, but the query fields are too big to index, so I'm looking for advice on how to optimise this query or create a valid index for it:
collection.aggregate([{
$match: {
article_id: {
$nin: read_article_ids
},
author_id: {
$in: liked_authors,
$nin: disliked_authors
},
word_count: {
$gte: 1000,
$lte: 10000
},
article_sentiment: {
$elemMatch: {
sentiments: mood
}
}
}
}, {
$sample: {
size: 4
}
}])
The collection in this case is a collection of articles with article_id, author_id, word_count, and article_sentiment. There are around 1.6 million documents in the collection, and a query like this takes upwards of 10 seconds without an index. The box has 56 GB of memory and is generally well specced.
The query's function is to retrieve a batch of 4 articles by authors the user likes, that they have not read, and that match a given sentiment. (The article_sentiment key holds a nested array of key:value pairs.)
So is this query incorrect for what I'm trying to achieve? Is there a way to improve it?
EDIT: Here is a sample document for this collection.
{
"_id": ObjectId("57f7dd597a1026d326fc02c4"),
"publication_name": "National News Inc",
"author_name": "John Hardwell",
"title": "How Shifting Policy Has Stunted Cultural Growth",
"article_id": "2f0896cd47c9423cb5a309c7277dd90d",
"author_id": "51b7f46f6c0f46f2949608c9ec2624d4",
"word_count": 1202,
"article_sentiment": [{
"sentiments": "happy",
"weight": 0.528596282005
}, {
"sentiments": "serious",
"weight": 0.569274544716
}, {
"sentiments": "relaxed",
"weight": 0.825395524502
}]
}

Can I use populate before aggregate in mongoose?

I have two models, one is user
userSchema = new Schema({
userID: String,
age: Number
});
and the other is the score recorded several times everyday for all users
ScoreSchema = new Schema({
userID: {type: String, ref: 'User'},
score: Number,
created_date: Date,
....
})
I would like to run some queries/calculations on the scores of users meeting a specific requirement. Say I would like to calculate, day by day, the average score of all users older than 20.
My thought is to first populate Scores with the users' ages and then run the aggregate.
Something like:
Score.
populate('userID','age').
aggregate([
{$match: {'userID.age': {$gt: 20}}},
{$group: ...},
{$group: ...}
], function(err, data){});
Is it OK to use populate before aggregate? Or should I first find all the userIDs meeting the requirement, save them in an array, and then use $in to match the score documents?
No, you cannot call .populate() before .aggregate(), and there is a very good reason why you cannot. But there are different approaches you can take.
The .populate() method works "client side" where the underlying code actually performs additional queries ( or more accurately an $in query ) to "lookup" the specified element(s) from the referenced collection.
In contrast .aggregate() is a "server side" operation, so you basically cannot manipulate content "client side", and then have that data available to the aggregation pipeline stages later. It all needs to be present in the collection you are operating on.
A better approach is available with MongoDB 3.2 and later, via the $lookup aggregation pipeline stage. It is also probably best to start from the User collection in this case, in order to narrow down the selection:
User.aggregate(
[
// Filter first
{ "$match": {
"age": { "$gt": 20 }
}},
// Then join
{ "$lookup": {
"from": "scores",
"localField": "userID",
"foreignField": "userID",
"as": "score"
}},
// More stages
],
function(err,results) {
}
)
This is basically going to include a new field "score" within the User object as an "array" of items that matched on "lookup" to the other collection:
{
"userID": "abc",
"age": 21,
"score": [{
"userID": "abc",
"score": 42,
// other fields
}]
}
The result is always an array, as the general expected usage is a "left join" of a possible "one to many" relationship. If no result is matched then it is just an empty array.
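The "always an array" behaviour can be illustrated with a small plain JavaScript imitation of the join. This is only a sketch of the semantics described here, not MongoDB's implementation; the lookup function and the sample users and scores are made up.

```javascript
// Plain JavaScript imitation of a $lookup "left join" (illustration only).
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((doc) => ({
    ...doc,
    // Every local document keeps its place; unmatched ones get [].
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

// Made-up sample data.
const users = [
  { userID: "abc", age: 21 },
  { userID: "xyz", age: 30 },
];
const scores = [
  { userID: "abc", score: 42 },
  { userID: "abc", score: 17 },
];

const joined = lookup(users, scores, "userID", "userID", "score");
console.log(JSON.stringify(joined, null, 2));
```

In this sample the user "abc" ends up with two entries in score, while "xyz" is kept with an empty array.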
To use the content, just work with the array as you would with any other. For instance, you can use the $arrayElemAt operator to get only the first element of the array for later stages, and then use the content like any normal embedded field:
{ "$project": {
"userID": 1,
"age": 1,
"score": { "$arrayElemAt": [ "$score", 0 ] }
}}
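For the day-by-day average asked about in the question, the remaining stages could instead unwind the joined array and group by calendar day. The pipeline below is a sketch under assumptions: it reuses the collection name scores and the field created_date from the schemas in the question, the output field name averageScore is made up, and it has not been run against real data. It relies on $lookup (MongoDB 3.2+) and $dateToString (MongoDB 3.0+).

```javascript
// Sketch of a full pipeline, intended for User.aggregate(pipeline).
const pipeline = [
  // Keep only users older than 20.
  { $match: { age: { $gt: 20 } } },
  // Join each user's score documents into the "score" array.
  {
    $lookup: {
      from: "scores",
      localField: "userID",
      foreignField: "userID",
      as: "score",
    },
  },
  // One document per (user, score) pair; users without scores drop out.
  { $unwind: "$score" },
  // Group by calendar day and average the scores of that day.
  {
    $group: {
      _id: {
        $dateToString: { format: "%Y-%m-%d", date: "$score.created_date" },
      },
      averageScore: { $avg: "$score.score" },
    },
  },
  // Oldest day first.
  { $sort: { _id: 1 } },
];

console.log(JSON.stringify(pipeline, null, 2));
```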
If you don't have MongoDB 3.2 available, then your other option to process a query limited by the relations of another collection is to first get the results from that collection and then use $in to filter on the second:
// Match the user collection
User.find({ "age": { "$gt": 20 } },function(err,users) {
// Get id list
var userList = users.map(function(user) {
return user.userID;
});
Score.aggregate(
[
// use the id list to select items
{ "$match": {
"userID": { "$in": userList }
}},
// more stages
],
function(err,results) {
}
);
});
So getting the list of valid users from the other collection to the client, and then feeding that list to the other collection in a query, is the only way to get this to happen in earlier releases.