Getting distinct value for a field in Mongo - mongodb

I have a database collection and this is its document structure.
{
_id: ObjectId("xxxxddsdsfdfdfdf")
category: electronics
sku: 10902
}
{
_id: ObjectId("dfdfdgfsdfdsgsf")
category: apparels
sku: 90345
}
{
_id: ObjectId("sdfdfdsggfgsgsdgsgsf")
category: electronics
sku: 10345
}
{
_id: ObjectId("dfndsnfkjdfdfsdnfsdf")
category: electronics
sku: 43435
}
I am trying to find the total number of SKUs per category. It should eliminate duplication and keep the values distinct. For example, electronics: 3, apparels: 1.
I have written a query, but it returns the total number of SKUs across all categories, which is not what I intended.
db.ecomm_sku_count.aggregate([
  {
    $group: {
      _id: {
        category: '$category',
        sku_count: '$sku'
      },
      total_sku: {
        $sum: 1
      }
    }
  },
  {
    $count: "total_sku_units"
  }
])
// output: [ { total_sku_units: 4 } ]
The intended output must be somewhat like this.
[
{ _id: { category: 'electronics', sku_count: 3 } },
{ _id: { category: 'apparels', sku_count: 1} }
]
I am trying to find the distinct SKU values per category.
I am a beginner with the MongoDB aggregation framework, so pardon me if this is a basic question.

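I think the pipeline below is what you are looking for. It groups by category only, collects the distinct SKUs of each category with $addToSet, and then replaces that set with its $size: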
db.collection.aggregate([
  {
    "$group": {
      "_id": {
        "category": "$category"
      },
      // $addToSet keeps each SKU only once per category
      "total_sku": {
        "$addToSet": "$sku"
      }
    }
  },
  {
    "$project": {
      // replace the set of SKUs with its element count
      "total_sku": {
        "$size": "$total_sku"
      }
    }
  }
])
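To make the effect of $addToSet followed by $size concrete, here is a minimal sketch in plain JavaScript (no MongoDB driver involved) that reproduces the same computation on the four sample documents from the question. The function name `distinctSkuCounts` is made up for this illustration, and the sketch only mirrors what the stages compute, not how the server executes them.

```javascript
// Sample documents mirroring the question (the _id field is omitted).
const docs = [
  { category: "electronics", sku: 10902 },
  { category: "apparels", sku: 90345 },
  { category: "electronics", sku: 10345 },
  { category: "electronics", sku: 43435 },
];

// Hypothetical helper: emulates $group with $addToSet, then $project with $size.
function distinctSkuCounts(documents) {
  const skusByCategory = new Map();
  for (const { category, sku } of documents) {
    if (!skusByCategory.has(category)) skusByCategory.set(category, new Set());
    skusByCategory.get(category).add(sku); // a Set drops duplicates, like $addToSet
  }
  return [...skusByCategory].map(([category, skus]) => ({
    _id: { category },
    total_sku: skus.size, // like $size on the collected array
  }));
}

console.log(distinctSkuCounts(docs));
```

Adding a second document with an already-seen SKU to a category leaves that category's count unchanged, which is the "distinct" behaviour the question asks for.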

Related

Mongodb aggregate grouping elements of an array type field

I have below data in my collection:
[
{
"_id":{
"month":"Jan",
"year":"2022"
},
"products":[
{
"product":"ProdA",
"status":"failed",
"count":15
},
{
"product":"ProdA",
"status":"success",
"count":5
},
{
"product":"ProdB",
"status":"failed",
"count":20
},
{
"product":"ProdB",
"status":"success",
"count":10
}
]
},
...//more such data
]
I want to group the elements of the products array by product name, so that each month's record holds the success and failure counts of each product. Every record is guaranteed to have both a success and a failure count each month. The output should look like this:
[
{
"_id":{
"month":"Jan",
"year":"2022"
},
"products":[
{
"product":"ProdA","status":[{"name":"success","count":5},{"name":"failed","count":15}]
},
{
"product":"ProdB","status":[{"name":"success","count":10},{"name":"failed","count":20}]
}
]
},
...//data for succeeding months
]
I have tried to do something like this:
db.collection.aggregate([{ $unwind: "$products" },
{
$group: {
"_id": {
month: "$_id.month",
year: "$_id.year"
},
products: { $push: { "product": "$product", status: { $push: { name: "$status", count: "$count" } } } }
}
}]);
But the above query doesn't work.
At which level do I need to group the fields to obtain the above output?
Please help me find out what I am doing wrong.
Thank You!
Your first $group stage needs to group by both the _id and the product name and aggregate a list of status counts. A second $group stage then groups by the original _id and forms the products list:
db.collection.aggregate([
  { $unwind: "$products" },
  {
    // one group per (month/year, product); collect that product's status counts
    $group: {
      _id: {
        id: "$_id",
        product: "$products.product"
      },
      status: {
        $push: {
          name: "$products.status",
          count: "$products.count"
        }
      }
    }
  },
  {
    // regroup by the original _id and rebuild the products array
    $group: {
      _id: "$_id.id",
      products: {
        $push: {
          product: "$_id.product",
          status: "$status"
        }
      }
    }
  }
])
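As a rough illustration of what the two $group stages compute, the following plain JavaScript sketch performs the same regrouping on the January 2022 document from the question. It runs without MongoDB; `regroupProducts` is a hypothetical helper name, and the sketch handles one document at a time, which is equivalent here because the second $group regroups by the original _id.

```javascript
// The January 2022 document from the question.
const monthDoc = {
  _id: { month: "Jan", year: "2022" },
  products: [
    { product: "ProdA", status: "failed", count: 15 },
    { product: "ProdA", status: "success", count: 5 },
    { product: "ProdB", status: "failed", count: 20 },
    { product: "ProdB", status: "success", count: 10 },
  ],
};

// Hypothetical helper: mirrors $unwind + first $group (statuses per product)
// + second $group (rebuild the products array) for a single document.
function regroupProducts(doc) {
  const statusByProduct = new Map();
  for (const { product, status, count } of doc.products) {
    if (!statusByProduct.has(product)) statusByProduct.set(product, []);
    statusByProduct.get(product).push({ name: status, count });
  }
  const products = [...statusByProduct].map(([product, status]) => ({ product, status }));
  return { _id: doc._id, products };
}

console.log(JSON.stringify(regroupProducts(monthDoc), null, 2));
```

One difference to keep in mind: the sketch keeps products in first-seen order, whereas $group does not guarantee the order of its output documents, so a $sort stage may be needed if the order of products matters.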

MongoDB Aggregation total fields and group by field name

I have a collection of documents like so:
{
gameId: '0001A',
score: 40,
name: 'Bob',
city: 'London'
}
I am trying to run an aggregation on my documents that will output the following view FOR EACH gameId:
{
cities: [
London: {
totalScore: 500 // sum of the scores for all documents that have a city of London
people: [
'Bob',
'Anna',
'Sally',
'Sue'
],
peopleCount: 4 // a count of all people who also have the city of London
},
Barcelona: {
totalScore: 400 // sum of the scores for all documents that have a city of Barcelona
people: [
'Tim',
'Tina',
'Amir'
], // names of all people who also have the city of Barcelona
peopleCount: 3 // count of how many names appear
},
]
I've tried to achieve this using $facet and also $bucket in the aggregation pipeline. However, this doesn't seem to fit the bill, as $bucket and $bucketAuto seem to require ranges or a number of buckets respectively. $bucketAuto then sets a min and max value in the objects.
I'm able to group the total number of people, names, and scores straightforwardly like so at the moment:
$group: {
_id: '$gameId',
totalScore: {
$sum: '$score'
},
uniqueClients: {
$addToSet: '$name'
}
},
$addFields: {
uniqueClientCount: {
$size: '$uniqueClients'
}
}
How do I break it down by city?
You could try two $group stages, as follows. The first groups by game and city; the second regroups by game and pushes one object per city:
db.collection.aggregate([
  {
    "$group": {
      "_id": { game: "$gameId", city: "$city" },
      "totalScore": { "$sum": "$score" },
      "people": { "$addToSet": "$name" }
    }
  },
  {
    "$addFields": {
      "peopleCount": { "$size": "$people" }
    }
  },
  {
    "$group": {
      "_id": "$_id.game",
      "cities": {
        "$push": {
          "$arrayToObject": [
            [
              {
                k: "$_id.city",
                v: {
                  people: "$people",
                  totalScore: "$totalScore",
                  peopleCount: "$peopleCount"
                }
              }
            ]
          ]
        }
      }
    }
  }
])
See on mongoplayground https://mongoplayground.net/p/f4uItCb0BwW
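Note that the pipeline above yields cities as an array of single-key objects, while the question sketches cities keyed by city name. The plain JavaScript sketch below (no MongoDB involved) shows the keyed target shape. Only the first row comes from the question; the other two rows and the helper name `summarizeCities` are invented for illustration.

```javascript
// Hypothetical rows: only the first one appears in the question.
const rows = [
  { gameId: "0001A", score: 40, name: "Bob", city: "London" },
  { gameId: "0001A", score: 60, name: "Anna", city: "London" },
  { gameId: "0001A", score: 25, name: "Tim", city: "Barcelona" },
];

// Hypothetical helper: per game, one entry per city with total score,
// distinct names, and the number of distinct names.
function summarizeCities(documents) {
  const games = {};
  for (const { gameId, score, name, city } of documents) {
    if (!games[gameId]) games[gameId] = {};
    const cities = games[gameId];
    if (!cities[city]) cities[city] = { totalScore: 0, people: [], peopleCount: 0 };
    const entry = cities[city];
    entry.totalScore += score; // like $sum
    if (!entry.people.includes(name)) entry.people.push(name); // like $addToSet
    entry.peopleCount = entry.people.length; // like $size
  }
  return games;
}

console.log(JSON.stringify(summarizeCities(rows), null, 2));
```

Within the pipeline itself, pushing the { k, v } pairs in the second $group and applying $arrayToObject once to the whole array in a later stage should produce a similar keyed object, though that variation is an untested assumption here.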

Complex nested array of object get record from mongodb

My project has documents with the following complex format in MongoDB:
{
_id: class_1,
students:[
{
_id: student_1,
questions: [
{
_id: s1q1,
answers:[
{
_id: s1q1a1,
},
{
_id: s1q1a2,
},
{
_id: s1q1a3,
}
]
},
{
_id: s1q2,
answers:[
{
_id: s1q2a1,
},
{
_id: s1q2a2,
},
{
_id: s1q2a3,
}
]
}
]
},
{
_id: student_2,
questions: [
{
_id: s2q1,
answers:[
{
_id: s2q1a1,
},
{
_id: s2q1a2,
},
{
_id: s2q1a3,
}
]
},
{
_id: s2q2,
answers:[
{
_id: s2q2a1,
},
{
_id: s2q2a2,
},
{
_id: s2q2a3,
}
]
}
]
}
]
}
I tried a lot with aggregation but was unable to get a response like the following:
{
_id: class_1,
students:[
{
_id: student_1,
questions: [
{
_id: s1q1,
answers:[
{
_id: s1q1a1,
}
]
}
]
}
]
}
The query is something like db.collection.find({"students.questions.answers._id": "s1q1a1"}).
A query like this returns all the child elements as well, so how can I get only the selected object while keeping the nesting hierarchy?
I also tried aggregation; it gives me the result down to the second level of the hierarchy, but beyond that I cannot filter because of MongoDB errors.
Try this. I did it using aggregation. It might not be the best way to achieve the desired result, but it will work:
https://mongoplayground.net/p/WQGWuZKh2zQ
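Since the linked pipeline is not reproduced here, the following is only a sketch of the pruning logic in plain JavaScript, not the answerer's actual solution. It keeps the class, student, and question levels but drops every branch that does not lead to the requested answer. The helper name `pruneToAnswer` and the shortened sample document with string ids are assumptions for illustration. In an aggregation pipeline, nested $map and $filter expressions are the usual counterparts of the `map` and `filter` calls below.

```javascript
// Shortened, hypothetical version of the class document, with string ids.
const classDoc = {
  _id: "class_1",
  students: [
    {
      _id: "student_1",
      questions: [
        { _id: "s1q1", answers: [{ _id: "s1q1a1" }, { _id: "s1q1a2" }] },
        { _id: "s1q2", answers: [{ _id: "s1q2a1" }] },
      ],
    },
    {
      _id: "student_2",
      questions: [{ _id: "s2q1", answers: [{ _id: "s2q1a1" }] }],
    },
  ],
};

// Hypothetical helper: keep only the path class > student > question > answer.
function pruneToAnswer(doc, answerId) {
  const students = doc.students
    .map((student) => ({
      ...student,
      questions: student.questions
        .map((q) => ({ ...q, answers: q.answers.filter((a) => a._id === answerId) }))
        .filter((q) => q.answers.length > 0), // drop questions without the answer
    }))
    .filter((student) => student.questions.length > 0); // drop students without a match
  return { ...doc, students };
}

console.log(JSON.stringify(pruneToAnswer(classDoc, "s1q1a1"), null, 2));
```

The filtering runs from the innermost array outwards, which is why each level can simply check whether the level below it ended up empty.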

Finding top 3 students in each subject MongoDB

I have searched for ways to solve my problem, but the examples I found assume a database that is set up differently from mine.
The documents in my collection look something like this:
{name:"MAX",
date:"2020-01-01"
Math:98,
Science:60,
English:80},
{name:"JANE",
date:"2020-01-01"
Math:80,
Science:70,
English:79},
{name:"ALEX",
date:"2020-01-01"
Math:95,
Science:68,
English:70},
{name:"JOHN",
date:"2020-01-01"
Math:95,
Science:68,
English:70}
{name:"MAX",
date:"2020-06-01"
Math:97,
Science:78,
English:90},
{name:"JANE",
date:"2020-06-01"
Math:78,
Science:76,
English:66},
{name:"ALEX",
date:"2020-06-01"
Math:93,
Science:75,
English:82},
{name:"JOHN",
date:"2020-06-01"
Math:92,
Science:80,
English:50}
I want to find the top 3 students for each subject without regard for the dates. I only managed to find the top 3 students in 1 subject.
So I group the students by name first and add a field holding the maximum score for one subject, Math in this case. Then I sort in descending order and limit the results to 3.
db.student_scores.aggregate(
[
{$group:{
_id: "$name",
maxMath: { $max: "$Math" }}},
{$sort:{"maxMath":-1}},
{$limit : 3}
]
)
Is there any way to get the top 3 students for each subject?
So, it would be top 3 for math, top 3 for science, top 3 for english
{
Math:{MAX, JANE, JOHN},
Science:{JOHN, ALEX, JANE},
English:{JANE, MAX, JOHN}
}
I applied your pipeline three times, once per subject, inside a single $facet stage.
If you prefer a more compact result, append this stage:
{$project:{English:"$Eng._id", Science:"$sci._id", Math:"$math._id"}}
Pipeline:
db.collection.aggregate([
  {
    "$facet": {
      "math": [
        { $group: { _id: "$name", maxMath: { $max: "$Math" } } },
        { $sort: { "maxMath": -1 } },
        { $limit: 3 }
      ],
      "sci": [
        { $group: { _id: "$name", maxSci: { $max: "$Science" } } },
        { $sort: { "maxSci": -1 } },
        { $limit: 3 }
      ],
      "Eng": [
        { $group: { _id: "$name", maxEng: { $max: "$English" } } },
        { $sort: { "maxEng": -1 } },
        { $limit: 3 }
      ]
    }
  }
])
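To check what each $facet branch should return for the sample data, here is a small plain JavaScript sketch of the same group, sort, and limit steps. It does not use MongoDB; `topStudents` is a hypothetical helper name, and the data is copied from the question.

```javascript
// Sample documents from the question.
const scores = [
  { name: "MAX", date: "2020-01-01", Math: 98, Science: 60, English: 80 },
  { name: "JANE", date: "2020-01-01", Math: 80, Science: 70, English: 79 },
  { name: "ALEX", date: "2020-01-01", Math: 95, Science: 68, English: 70 },
  { name: "JOHN", date: "2020-01-01", Math: 95, Science: 68, English: 70 },
  { name: "MAX", date: "2020-06-01", Math: 97, Science: 78, English: 90 },
  { name: "JANE", date: "2020-06-01", Math: 78, Science: 76, English: 66 },
  { name: "ALEX", date: "2020-06-01", Math: 93, Science: 75, English: 82 },
  { name: "JOHN", date: "2020-06-01", Math: 92, Science: 80, English: 50 },
];

// Hypothetical helper: best score per student ($group + $max),
// sorted descending ($sort), truncated ($limit).
function topStudents(documents, subject, limit = 3) {
  const best = new Map();
  for (const doc of documents) {
    const current = best.get(doc.name);
    if (current === undefined || doc[subject] > current) best.set(doc.name, doc[subject]);
  }
  return [...best]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([name]) => name);
}

const top = {};
for (const subject of ["Math", "Science", "English"]) {
  top[subject] = topStudents(scores, subject);
}
console.log(top);
```

In Math, ALEX and JOHN tie at 95. The sketch keeps them in first-seen order, but in MongoDB the relative order of documents with equal sort keys is not guaranteed, so a secondary sort key (for example _id) may be worth adding if the order of tied students matters.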
Your question is not entirely clear, but I can see two possible scenarios.
Scenario 1: students may repeat, and each entry carries its date:
$project to show the required fields and convert the subjects object to an array using $objectToArray
$unwind the subjects array
$sort by score (subjects.v) in descending order
$group by subject name and collect an array of students
$project to keep the first 3 students (the highest scores, given the preceding sort) from the students array
db.collection.aggregate([
  {
    $project: {
      name: "$name",
      date: "$date",
      subjects: {
        $objectToArray: {
          Math: "$Math",
          Science: "$Science",
          English: "$English"
        }
      }
    }
  },
  { $unwind: "$subjects" },
  { $sort: { "subjects.v": -1 } },
  {
    $group: {
      _id: "$subjects.k",
      students: {
        $push: {
          name: "$name",
          date: "$date",
          score: "$subjects.v"
        }
      }
    }
  },
  {
    $project: {
      _id: 0,
      subject: "$_id",
      students: { $slice: ["$students", 3] }
    }
  }
])
Scenario 2: sum each student's scores across all dates (one entry per unique student):
$group by name and sum each subject's scores using $sum
$project to convert the subjects object to an array using $objectToArray
$unwind the subjects array
$sort by score (subjects.v) in descending order
$group by subject name and collect an array of students
$project to keep the first 3 students (the highest totals, given the preceding sort) from the students array
db.collection.aggregate([
  {
    $group: {
      _id: "$name",
      Math: { $sum: "$Math" },
      Science: { $sum: "$Science" },
      English: { $sum: "$English" }
    }
  },
  {
    $project: {
      subjects: {
        $objectToArray: {
          Math: "$Math",
          Science: "$Science",
          English: "$English"
        }
      }
    }
  },
  { $unwind: "$subjects" },
  { $sort: { "subjects.v": -1 } },
  {
    $group: {
      _id: "$subjects.k",
      students: {
        $push: {
          name: "$_id",
          score: "$subjects.v"
        }
      }
    }
  },
  {
    $project: {
      _id: 0,
      subject: "$_id",
      students: { $slice: ["$students", 3] }
    }
  }
])

Mongodb Group stage, and query last two document

In MongoDB, the $first and $last operators are helpful when we want the first or last document of each group in a $group stage. I want to group the collection by _id: "$department" and take both the last document and the one before it (last-1). Is there any way to achieve this?
Assume you have this collection:
[
{ department: "HR", employees: 20 },
{ department: "Finance", employees: 30 },
{ department: "Sales", employees: 5 },
{ department: "IT", employees: 50 }
]
Then you can run this aggregation:
db.collection.aggregate([
  { $sort: { department: 1 } },
  {
    $group: {
      _id: null,
      employees: { $sum: "$employees" },
      all_departments: { $push: "$department" }
    }
  },
  {
    $set: {
      last_departments: {
        $concatArrays: [
          [{ $arrayElemAt: ["$all_departments", -1] }],
          [{ $arrayElemAt: ["$all_departments", -2] }]
        ]
      }
    }
  }
])
Update
With $slice it is even shorter:
db.collection.aggregate([
  { $sort: { department: 1 } },
  {
    $group: {
      _id: null,
      employees: { $sum: "$employees" },
      all_departments: { $push: "$department" }
    }
  },
  { $set: { last_departments: { $slice: ["$all_departments", -2] } } }
])