Sum values when a condition is satisfied in MongoDB - mongodb

I am trying to get the sum of values when a certain condition is satisfied in the document.
In the query below, I want to get the sum of currentValue only when componentId = "ABC".
db.Pointnext_Activities.aggregate(
{ $project: {
_id: 0,
componentId:1,
currentValue:1
}
},
{ $group:
{ _id: "$componentId",
total: { $sum: "$currentValue" }
}
}
)

Please try this:
db.Pointnext_Activities.aggregate([{ $match: { componentId: 'ABC' } },
{
$group:
{
_id: "$componentId",
total: { $sum: "$currentValue" }
}
}, { $project: { 'componentId': '$_id', total: 1, _id: 0 } }])
If you just need the total value and don't care about componentId being returned, try this:
db.Pointnext_Activities.aggregate([{ $match: { componentId: 'ABC' } },
{
$group:
{
_id: "",
total: { $sum: "$currentValue" }
}
}, {$project :{total :1, _id:0}}])
In an aggregation it is ideal to always start with a filter operation, i.e. $match, so that only the documents you need are passed on to the later stages.
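To make the effect of the stages concrete, here is a minimal plain-JavaScript sketch that mirrors $match followed by $group with $sum on an in-memory array. The sample documents and the helper name sumForComponent are made up for illustration; only the field names componentId and currentValue come from the question.

```javascript
// Hypothetical sample documents; only the field names come from the question.
const activities = [
  { componentId: "ABC", currentValue: 10 },
  { componentId: "XYZ", currentValue: 5 },
  { componentId: "ABC", currentValue: 7 },
];

// The pipeline from the answer, written as plain data.
const sumPipeline = [
  { $match: { componentId: "ABC" } },
  { $group: { _id: "$componentId", total: { $sum: "$currentValue" } } },
  { $project: { componentId: "$_id", total: 1, _id: 0 } },
];

// Plain-JS equivalent (illustration only): filter first, then sum.
function sumForComponent(docs, componentId) {
  const total = docs
    .filter((d) => d.componentId === componentId) // $match
    .reduce((acc, d) => acc + d.currentValue, 0); // $group + $sum
  return { componentId, total };
}

console.log(sumForComponent(activities, "ABC")); // total is 17
```

One difference to keep in mind: when nothing matches, the real pipeline returns no document at all, while this sketch returns a total of 0.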

Related

Finding top 3 students in each subject MongoDB

I have tried searching for ways to solve my problem, but the examples I found assume a database that is set up differently from mine.
The documents in my collection look something like this:
{name:"MAX",
date:"2020-01-01",
Math:98,
Science:60,
English:80},
{name:"JANE",
date:"2020-01-01",
Math:80,
Science:70,
English:79},
{name:"ALEX",
date:"2020-01-01",
Math:95,
Science:68,
English:70},
{name:"JOHN",
date:"2020-01-01",
Math:95,
Science:68,
English:70},
{name:"MAX",
date:"2020-06-01",
Math:97,
Science:78,
English:90},
{name:"JANE",
date:"2020-06-01",
Math:78,
Science:76,
English:66},
{name:"ALEX",
date:"2020-06-01",
Math:93,
Science:75,
English:82},
{name:"JOHN",
date:"2020-06-01",
Math:92,
Science:80,
English:50}
I want to find the top 3 students for each subject, regardless of date. So far I have only managed to find the top 3 students in one subject.
I group the students by name first and add a field for the maximum score in one subject (Math in this case), then sort in descending order and limit the results to 3.
db.student_scores.aggregate(
[
{$group:{
_id: "$name",
maxMath: { $max: "$Math" }}},
{$sort:{"maxMath":-1}},
{$limit : 3}
]
)
Is there any way to get the top 3 students for each subject?
That is, the top 3 for Math, the top 3 for Science, and the top 3 for English, something like:
{
Math:{MAX, JANE, JOHN},
Science:{JOHN, ALEX, JANE},
English:{JANE, MAX, JOHN}
}
I just applied your code three times, once per subject, using $facet.
If you prefer a more compact result, add this stage after the $facet stage:
{$project:{English:"$Eng._id", Science:"$sci._id", Math:"$math._id"}}
Pipeline:
db.collection.aggregate([
{
"$facet": {
"math": [
{
$group: {
_id: "$name",
maxMath: {
$max: "$Math"
}
}
},
{
$sort: {
"maxMath": -1
}
},
{
$limit: 3
}
],
"sci": [
{
$group: {
_id: "$name",
maxSci: {
$max: "$Science"
}
}
},
{
$sort: {
"maxSci": -1
}
},
{
$limit: 3
}
],
"Eng": [
{
$group: {
_id: "$name",
maxEng: {
$max: "$English"
}
}
},
{
$sort: {
"maxEng": -1
}
},
{
$limit: 3
}
]
}
}
])
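To check what each $facet branch should produce, here is a small plain-JavaScript sketch that runs the same group / sort / limit logic on the sample documents from the question. The helper name topStudents is made up for illustration, and this is not how MongoDB evaluates the pipeline internally.

```javascript
// Sample documents from the question.
const scores = [
  { name: "MAX", date: "2020-01-01", Math: 98, Science: 60, English: 80 },
  { name: "JANE", date: "2020-01-01", Math: 80, Science: 70, English: 79 },
  { name: "ALEX", date: "2020-01-01", Math: 95, Science: 68, English: 70 },
  { name: "JOHN", date: "2020-01-01", Math: 95, Science: 68, English: 70 },
  { name: "MAX", date: "2020-06-01", Math: 97, Science: 78, English: 90 },
  { name: "JANE", date: "2020-06-01", Math: 78, Science: 76, English: 66 },
  { name: "ALEX", date: "2020-06-01", Math: 93, Science: 75, English: 82 },
  { name: "JOHN", date: "2020-06-01", Math: 92, Science: 80, English: 50 },
];

// Hypothetical helper: best score per student for one subject, then the top n names.
function topStudents(docs, subject, n = 3) {
  const best = new Map(); // name -> highest score seen ($group with $max)
  for (const d of docs) {
    const prev = best.has(d.name) ? best.get(d.name) : -Infinity;
    best.set(d.name, Math.max(prev, d[subject]));
  }
  return [...best.entries()]
    .sort((a, b) => b[1] - a[1]) // $sort descending
    .slice(0, n)                 // $limit
    .map(([name]) => name);      // like projecting "$math._id"
}

// One entry per subject, like the branches of $facet.
const result = Object.fromEntries(
  ["Math", "Science", "English"].map((s) => [s, topStudents(scores, s)])
);
console.log(result);
```

With this data, Science comes out as JOHN, MAX, JANE and English as MAX, ALEX, JANE. In Math, ALEX and JOHN tie at 95 behind MAX; MongoDB does not guarantee the relative order of tied documents unless you add a second sort key.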
Your question is not entirely clear, but I can think of two scenarios.
Scenario 1: rank individual scores, so a student can appear more than once, along with the date:
$project to show the required fields and convert the subjects to an array using $objectToArray
$unwind the subjects array
$sort by score (subjects.v) in descending order
$group by subject name and collect an array of students
$project to keep the first 3 (highest-scoring) students from the students array
db.collection.aggregate([
{
$project: {
name: "$name",
date: "$date",
subjects: {
$objectToArray: {
Math: "$Math",
Science: "$Science",
English: "$English"
}
}
}
},
{ $unwind: "$subjects" },
{ $sort: { "subjects.v": -1 } },
{
$group: {
_id: "$subjects.k",
students: {
$push: {
name: "$name",
date: "$date",
score: "$subjects.v"
}
}
}
},
{
$project: {
_id: 0,
subject: "$_id",
students: { $slice: ["$students", 3] }
}
}
])
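The reshaping done by $objectToArray and $unwind can be hard to picture, so here is a rough plain-JavaScript sketch of the same steps on two of the sample documents. The variable names (rows, bySubject, top3) are only for illustration.

```javascript
// Two documents from the question, enough to show the reshaping.
const docs = [
  { name: "MAX", date: "2020-01-01", Math: 98, Science: 60, English: 80 },
  { name: "JANE", date: "2020-01-01", Math: 80, Science: 70, English: 79 },
];
const SUBJECTS = ["Math", "Science", "English"];

// $objectToArray + $unwind: one { name, date, k, v } row per (document, subject).
const rows = docs.flatMap((d) =>
  SUBJECTS.map((k) => ({ name: d.name, date: d.date, k, v: d[k] }))
);

// $sort by score (v) descending.
rows.sort((a, b) => b.v - a.v);

// $group by subject (k), pushing students in the sorted order.
const bySubject = {};
for (const r of rows) {
  if (!bySubject[r.k]) bySubject[r.k] = [];
  bySubject[r.k].push({ name: r.name, date: r.date, score: r.v });
}

// $project with $slice: keep at most the first 3 students per subject.
const top3 = Object.entries(bySubject).map(([subject, students]) => ({
  subject,
  students: students.slice(0, 3),
}));
console.log(JSON.stringify(top3, null, 2));
```

Because the rows are sorted before they are grouped, the first entries of each students array are the highest scores, which is why slicing the first 3 gives the top 3.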
Scenario 2: sum each student's scores across all dates, so each student appears only once:
$group by name and get the sum of each subject using $sum
$project to convert the subjects to an array using $objectToArray
$unwind the subjects array
$sort by score (subjects.v) in descending order
$group by subject name and collect an array of students
$project to keep the first 3 (highest-scoring) students from the students array
db.collection.aggregate([
{
$group: {
_id: "$name",
Math: { $sum: "$Math" },
Science: { $sum: "$Science" },
English: { $sum: "$English" }
}
},
{
$project: {
subjects: {
$objectToArray: {
Math: "$Math",
Science: "$Science",
English: "$English"
}
}
}
},
{ $unwind: "$subjects" },
{ $sort: { "subjects.v": -1 } },
{
$group: {
_id: "$subjects.k",
students: {
$push: {
name: "$_id",
score: "$subjects.v"
}
}
}
},
{
$project: {
_id: 0,
subject: "$_id",
students: { $slice: ["$students", 3] }
}
}
])

MongoDB multiple levels embedded array query

I have a document like this:
{
_id: 1,
data: [
{
_id: 2,
rows: [
{
myFormat: [1,2,3,4]
},
{
myFormat: [1,1,1,1]
}
]
},
{
_id: 3,
rows: [
{
myFormat: [1,2,7,8]
},
{
myFormat: [1,1,1,1]
}
]
}
]
},
I want to get distinct myFormat values as a complete array.
For example: I need the result as: [1,2,3,4], [1,1,1,1], [1,2,7,8]
How can I write a MongoDB query for this?
Thanks for the help.
Please try this if every object in rows has only the one field, myFormat:
db.getCollection('yourCollection').distinct('data.rows')
Ref: MongoDB distinct values for a field
Or, if you need the result in a single array, or the objects in rows have other fields as well, try this:
db.yourCollection.aggregate([{$project :{'data.rows.myFormat':1}},{ $unwind: '$data' }, { $unwind: '$data.rows' },
{ $group: { _id: '$data.rows.myFormat' } },
{ $group: { _id: '', distinctValues: { $push: '$_id' } } },
{ $project: { distinctValues: 1, _id: 0 } }])
Or else:
db.yourCollection.aggregate([{ $project: { values: '$data.rows.myFormat' } }, { $unwind: '$values' }, { $unwind: '$values' },
{ $group: { _id: '', distinctValues: { $addToSet: '$values' } } }, { $project: { distinctValues: 1, _id: 0 } }])
The aggregation queries above will return what you want, but they can be slow on large datasets, so run them and check the performance. If this is a one-time job, you can consider passing {allowDiskUse: true} if needed. Either way, decide whether $unwind should use preserveNullAndEmptyArrays: true, which keeps documents whose array is missing, null, or empty.
Ref: allowDiskUse, $unwind preserveNullAndEmptyArrays
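For comparison, here is a plain-JavaScript sketch of what the two $unwind stages plus the de-duplication amount to, run on the document from the question. The helper name distinctFormats is hypothetical, and using JSON.stringify as the comparison key is an assumption of this sketch (it treats arrays as equal only when they have the same values in the same order).

```javascript
// The document from the question.
const doc = {
  _id: 1,
  data: [
    { _id: 2, rows: [{ myFormat: [1, 2, 3, 4] }, { myFormat: [1, 1, 1, 1] }] },
    { _id: 3, rows: [{ myFormat: [1, 2, 7, 8] }, { myFormat: [1, 1, 1, 1] }] },
  ],
};

// Hypothetical helper: collect every myFormat array and keep each distinct one once.
function distinctFormats(docs) {
  const seen = new Map(); // JSON text of the array -> the array itself
  for (const d of docs) {
    for (const item of d.data || []) {     // first $unwind (data)
      for (const row of item.rows || []) { // second $unwind (rows)
        if (row.myFormat === undefined) continue; // skip rows without the field
        const key = JSON.stringify(row.myFormat);
        if (!seen.has(key)) seen.set(key, row.myFormat);
      }
    }
  }
  return [...seen.values()];
}

console.log(JSON.stringify(distinctFormats([doc]))); // [[1,2,3,4],[1,1,1,1],[1,2,7,8]]
```

The sketch returns the arrays in first-seen order. With $addToSet the order of the resulting array is not specified, so sort the result if the order matters.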

Mongodb Aggregate - Count fields that equals value in array, but keep both arrays

I need to calculate the percentage of finalized/total items. The problem I have is counting how many items in the array are equal to 'finished'. With my current solution I get the finished count correctly, but the total count comes out the same as the finished count.
This is what I'm doing:
Items.aggregate([
{
$match: {
status: {
$nin: ['cancelled','pending']
}
}
},
{
$group: {
_id: '$person',
items: {
$push: {
total: '$status',
finished: {
$cond: [
{
$eq: ['$status', 'finished']
},
'$status',
null
]
}
}
}
}
},
{
$unwind: '$items'
},
{
$match: {
'items.finished': {
$ne: null
},
}
},
{
$group: {
_id: '$_id',
success: {
$push : '$items.finished'
},
total: {
$push: '$items.total'
}
}
},
{
$project: {
successCount: {
$size: '$success'
},
totalCount: {
$size: '$total'
}
}
},
{
$project: {
successScore: {
$divide: [ "$successCount", "$totalCount"]
}
}
}
]);
I also tried a simpler solution, but I can't figure out how to keep the totalCount field in the pipeline after doing $unwind:
Items.aggregate([
{
$group: {
_id: '$_id',
totalCount: {$sum: 1},
finished: { $push: { $cond: [ { $eq: ['$status', 'finished'] }, '$status', null ] } }
}
},
{ $unwind: '$finished'},
...
Then I can't access totalCount later
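One possible way to keep both counts, sketched here rather than taken from the thread, is to compute them in the same $group with $sum and $cond, so no $unwind is needed and totalCount never gets lost. The sample items, the status value "started", and the helper name successScores are assumptions for illustration; the plain-JavaScript function only mirrors the pipeline so the arithmetic can be checked without a server.

```javascript
// Hypothetical sample items; only the field names person and status come from the question.
const items = [
  { person: "ann", status: "finished" },
  { person: "ann", status: "started" },
  { person: "ann", status: "cancelled" },
  { person: "bob", status: "finished" },
];

// Both counters are computed in the same $group stage.
const scorePipeline = [
  { $match: { status: { $nin: ["cancelled", "pending"] } } },
  {
    $group: {
      _id: "$person",
      totalCount: { $sum: 1 },
      successCount: {
        $sum: { $cond: [{ $eq: ["$status", "finished"] }, 1, 0] },
      },
    },
  },
  {
    $project: {
      successScore: { $divide: ["$successCount", "$totalCount"] },
    },
  },
];

// Plain-JS equivalent of the pipeline above (illustration only).
function successScores(docs) {
  const groups = new Map(); // person -> { totalCount, successCount }
  for (const d of docs) {
    if (["cancelled", "pending"].includes(d.status)) continue; // $match
    const g = groups.get(d.person) || { totalCount: 0, successCount: 0 };
    g.totalCount += 1;                                // $sum: 1
    if (d.status === "finished") g.successCount += 1; // $sum with $cond
    groups.set(d.person, g);
  }
  return [...groups].map(([person, g]) => ({
    _id: person,
    successScore: g.successCount / g.totalCount,      // $divide
  }));
}

console.log(successScores(items));
```

With the sample data, ann has 1 finished out of 2 counted items (0.5) and bob has 1 out of 1 (1). With Mongoose, the pipeline would be passed as Items.aggregate(scorePipeline).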

MongoDB -- Find duplicate documents by multiple keys

I have a collection with documents that look like the following:
{
"_id" : ObjectId("55b377cb66b393427367c3e2"),
"comment" : "This is a comment",
"url_key" : "55b377cb66b393427367c3df", //This is an ObjectId from another record in a different collection
}
I need to find records in this collection that contain duplicate values for both the comment AND the url_key.
I can easily generate (using aggregate) duplicate records for the same, single, key (eg: comment), but I can't figure out how to group by/aggregate for multiple keys.
Here's my current aggregation pipeline:
db.comments.aggregate([ { $group: { _id: { comment: "$comment" }, uniqueIds: { $addToSet: "$_id" }, count: { $sum: 1 } } }, { $match: { count: { $gte: 2 } } }, { $sort: { count : -1} }, { $limit: 10 } ]);
Is it as simple as grouping by multiple keys or did I misunderstand your question?
...
{ $group: { _id: { comment: "$comment", url_key: "$url_key" }, count: { $sum: 1 } } },
{ $match: { count: { $gte: 2 } } },
...
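As a sketch of grouping on both fields, the pipeline below uses a compound _id made of comment and url_key, and the small plain-JavaScript function mirrors it on in-memory data. The sample documents (with simple numeric _id values) and the helper name findDuplicates are made up for illustration.

```javascript
// Group on the pair (comment, url_key); a group with count >= 2 is a duplicate.
const dupPipeline = [
  {
    $group: {
      _id: { comment: "$comment", url_key: "$url_key" },
      uniqueIds: { $addToSet: "$_id" },
      count: { $sum: 1 },
    },
  },
  { $match: { count: { $gte: 2 } } },
  { $sort: { count: -1 } },
  { $limit: 10 },
];

// Hypothetical sample documents.
const comments = [
  { _id: 1, comment: "This is a comment", url_key: "a" },
  { _id: 2, comment: "This is a comment", url_key: "a" },
  { _id: 3, comment: "This is a comment", url_key: "b" },
  { _id: 4, comment: "Another comment", url_key: "a" },
];

// Plain-JS equivalent of the pipeline above (illustration only).
function findDuplicates(docs) {
  const groups = new Map();
  for (const d of docs) {
    const key = JSON.stringify([d.comment, d.url_key]); // compound key
    if (!groups.has(key)) {
      groups.set(key, {
        _id: { comment: d.comment, url_key: d.url_key },
        uniqueIds: [],
        count: 0,
      });
    }
    const g = groups.get(key);
    g.uniqueIds.push(d._id);
    g.count += 1;
  }
  return [...groups.values()]
    .filter((g) => g.count >= 2)        // $match
    .sort((a, b) => b.count - a.count)  // $sort
    .slice(0, 10);                      // $limit
}

console.log(findDuplicates(comments));
```

Only documents 1 and 2 share both values, so the sketch reports a single group with count 2. Document 3 has the same comment but a different url_key and is not counted as a duplicate.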

How to count the documents duplicated in mongodb?

I searched for how to count duplicated documents in MongoDB and found this pipeline. It returns the duplicated documents, grouped by field1 and field2.
db.job_crawler_models_jobs_crawlings.aggregate(
{ $group: {
_id: { field1: "$field1", field2: "$field2" },
count: { $sum: 1 }
}},
{ $match: {
count: { $gt : 1 }
}}
)
But I want to get the number of duplicated documents. How can I do that?
You could try adding another $group to the pipeline, although I'm not sure this is exactly what you are looking for.
db.job_crawler_models_jobs_crawlings.aggregate(
{ $group: {
_id: { field1: "$field1", field2: "$field2" },
count: { $sum: 1 }
}},
{ $match: {
count: { $gt : 1 }
}},
{ $group: { _id: null, duplicatedCounts: { $sum:1 } } }
)
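Note that $sum: 1 in the final $group counts the number of duplicated key combinations, not the number of documents involved. If you want both numbers, a variation (a sketch, not part of the original answer) is to also sum the per-group count. The field names duplicatedGroups and duplicatedDocuments, the sample data, and the helper countDuplicates are assumptions for illustration.

```javascript
// Final $group reports both the number of groups and the documents in them.
const countPipeline = [
  {
    $group: {
      _id: { field1: "$field1", field2: "$field2" },
      count: { $sum: 1 },
    },
  },
  { $match: { count: { $gt: 1 } } },
  {
    $group: {
      _id: null,
      duplicatedGroups: { $sum: 1 },           // how many key combinations repeat
      duplicatedDocuments: { $sum: "$count" }, // how many documents are in those groups
    },
  },
];

// Hypothetical sample documents.
const jobs = [
  { field1: "a", field2: 1 },
  { field1: "a", field2: 1 },
  { field1: "a", field2: 1 },
  { field1: "b", field2: 2 },
  { field1: "b", field2: 2 },
  { field1: "c", field2: 3 },
];

// Plain-JS equivalent of the pipeline above (illustration only).
function countDuplicates(docs) {
  const counts = new Map(); // compound key -> number of documents
  for (const d of docs) {
    const key = JSON.stringify([d.field1, d.field2]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const repeated = [...counts.values()].filter((c) => c > 1); // $match
  return {
    duplicatedGroups: repeated.length,
    duplicatedDocuments: repeated.reduce((sum, c) => sum + c, 0),
  };
}

console.log(countDuplicates(jobs)); // 2 groups, 5 documents
```

When there are no duplicates at all, the real pipeline returns no document, whereas this sketch returns zeros for both fields.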