Count the number of duplicate elements in MongoDB - mongodb

I have the collection below in MongoDB:
{
"Id": "5",
"Group": [
{
"Name": "frank",
"Roll": "123"
}
]
},
{
"Id": "6",
"Group": [
{
"Name": "John",
"Roll": "124"
}
]
},
{
"Id": "7",
"Group": [
{
"Name": "John",
"Roll": "125"
}
]
}
The name "John" appears twice. I would like to display the count for each name that appears more than once:
{"Name": "John", "Count":2 }

You can use this aggregation query:
First, $unwind deconstructs the Group array so that each element becomes its own document.
Then $group by the name and $sum 1 for each occurrence.
Next, $match keeps only the names that occur more than once (i.e. the repeated ones).
The last stage, $project, outputs the values you want, in this case Name and Count.
db.collection.aggregate([
{
"$unwind": "$Group"
},
{
"$group": {
"_id": "$Group.Name",
"Count": {
"$sum": 1
},
}
},
{
"$match": {
"Count": {
"$gt": 1
}
}
},
{
"$project": {
"_id": 0,
"Name": "$_id",
"Count": 1
}
}
])
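To make the four stages easier to follow, here is a minimal plain-Python sketch that mimics them on the sample documents from the question. It does not talk to MongoDB at all, and the function name `count_duplicate_names` is made up for illustration.

```python
from collections import Counter

# Sample documents copied from the question.
docs = [
    {"Id": "5", "Group": [{"Name": "frank", "Roll": "123"}]},
    {"Id": "6", "Group": [{"Name": "John", "Roll": "124"}]},
    {"Id": "7", "Group": [{"Name": "John", "Roll": "125"}]},
]


def count_duplicate_names(documents):
    """Hypothetical helper that mirrors the pipeline stage by stage."""
    # $unwind: one record per element of the Group array.
    unwound = [member for doc in documents for member in doc["Group"]]
    # $group with $sum: 1 -> number of occurrences per name.
    counts = Counter(member["Name"] for member in unwound)
    # $match (Count > 1), then $project into the Name/Count shape.
    return [{"Name": name, "Count": n} for name, n in counts.items() if n > 1]


result = count_duplicate_names(docs)
print(result)
```

Only "John" survives the filter, because "frank" occurs once.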

Related

How to count embedded array object elements in mongoDB

{
"orderNo": "123",
"bags": [{
"type": "small",
"products": [{
"id": "1",
"name": "ABC",
"returnable": true
}, {
"id": "2",
"name": "XYZ"
}
]
},{
"type": "big",
"products": [{
"id": "3",
"name": "PQR",
"returnable": true
}, {
"id": "4",
"name": "UVW"
}
]
}
]
}
I have an orders collection where documents are in this format. I want to get the total count of products that have the returnable flag, e.g. for the above order the count should be 2. I am very new to MongoDB and wanted to know how to write a query to find this out. I have tried a few things, but they did not help.
This is what I tried, but it did not work:
db.orders.aggregate([
{ "$unwind": "$bags" },
{ "$unwind": "$bags.products" },
{ "$unwind": "$bags.products.returnable" },
{ "$group": {
"_id": "$bags.products.returnable",
"count": { "$sum": 1 }
}}
])
For the inner array you can use $filter to check the returnable flag and $size to get the number of such items. For the outer one you can take advantage of $reduce to sum the values from the inner arrays:
db.collection.aggregate([
{
$project: {
totalReturnable: {
$reduce: {
input: "$bags",
initialValue: 0,
in: {
$add: [
"$$value",
{
$size: {
$filter: {
input: "$$this.products",
as: "prod",
cond: {
$eq: [ "$$prod.returnable", true ]
}
}
}
}
]
}
}
}
}
}
])
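As a rough illustration of what $reduce, $filter and $size do together, here is a plain-Python sketch on the order from the question. It runs without MongoDB, and `total_returnable` is a hypothetical name used only for this example.

```python
# The order document from the question, written as a Python dict.
order = {
    "orderNo": "123",
    "bags": [
        {"type": "small", "products": [
            {"id": "1", "name": "ABC", "returnable": True},
            {"id": "2", "name": "XYZ"},
        ]},
        {"type": "big", "products": [
            {"id": "3", "name": "PQR", "returnable": True},
            {"id": "4", "name": "UVW"},
        ]},
    ],
}


def total_returnable(doc):
    """Hypothetical helper: count products whose returnable flag is true."""
    total = 0  # plays the role of initialValue in $reduce
    for bag in doc["bags"]:  # $reduce walks the outer array
        # $filter keeps products with returnable == true; $size counts them.
        matching = [p for p in bag["products"] if p.get("returnable") is True]
        total += len(matching)  # $add: ["$$value", <size>]
    return total


print(total_returnable(order))
```

Products without the returnable field are simply skipped, which matches the `$eq` check against `true` in the pipeline.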

Group by date in mongoDB while counting other fields

I've been using MongoDB for just a week and I have problems achieving this result: I want to group my documents by date while also keeping track of the number of entries that have a certain field set to a certain value.
So, my documents look like this:
{
"_id" : ObjectId("5f3f79fc266a891167ca8f65"),
"recipe" : "A",
"timestamp" : ISODate("2020-08-22T09:38:36.306Z")
}
where recipe is either "A", "B" or "C". Right now I'm grouping the documents by date using this pymongo query:
mongo.db.aggregate(
# Pipeline
[
# Stage 1
{
"$project": {
"createdAt": {
"$dateToString": {
"format": "%Y-%m-%d",
"date": "$timestamp"
}
},
"progressivo": 1,
"temperatura_fusione": 1
}
},
# Stage 2
{
"$group": {
"_id": {
"createdAt": "$createdAt"
},
"products": {
"$sum": 1
}
}
},
# Stage 3
{
"$project": {
"label": "$_id.createdAt",
"value": "$products",
"_id": 0
}
}])
Which gives me results like this:
[{"label": "2020-08-22", "value": 1}, {"label": "2020-08-15", "value": 2}, {"label": "2020-08-11", "value": 1}, {"label": "2020-08-21", "value": 5}]
What I'd also like to have is a count of how many times each recipe appears on every date. So, if for example on August 21 I have 2 entries with the "A" recipe, 3 with the "B" recipe and 0 with the "C" recipe, the desired output would be
{"label": "2020-08-21", "value": 5, "A": 2, "B":3, "C":0}
Do you have any tips?
Thank you!
What you have done so far is a good start. You can extend it as follows:
The first $group counts the entries per date and recipe. The second $group then collects, per date, the total value and the count of each recipe.
$map is used to turn each element of the recipes array into a key/value pair.
$arrayToObject is used to convert the array of key/value pairs produced by $map into an object.
$ifNull covers the case where a date has no entries for "A", "B" or "C": the value then defaults to 0, as in your expected output.
Here is the code:
[
{
"$project": {
"createdAt": {
"$dateToString": {
"format": "%Y-%m-%d",
"date": "$timestamp"
}
},
recipe: 1,
"progressivo": 1,
"temperatura_fusione": 1
}
},
{
"$group": {
"_id": {
"createdAt": "$createdAt",
"recipeName": "$recipe",
},
"products": {
$sum: 1
}
}
},
{
"$group": {
"_id": "$_id.createdAt",
value: {
$sum: "$products"
},
recipes: {
$push: {
name: "$_id.recipeName",
val: "$products"
}
}
}
},
{
$project: {
"content": {
"$arrayToObject": {
"$map": {
"input": "$recipes",
"as": "el",
"in": {
"k": "$$el.name",
"v": "$$el.val"
}
}
}
},
value: 1
}
},
{
$project: {
_id: 1,
value: 1,
A: {
$ifNull: [
"$content.A",
0
]
},
B: {
$ifNull: [
"$content.B",
0
]
},
C: {
$ifNull: [
"$content.C",
0
]
}
}
}
]
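If it helps to see the idea outside the aggregation framework, here is a small plain-Python sketch of the same two-level counting. The entries are hypothetical (the question shows only one document) and follow the 21 August example of 2 "A" and 3 "B". The name `counts_by_date` is invented, and the sketch uses the `label` key from the desired output rather than `_id`.

```python
from collections import Counter, defaultdict
from datetime import datetime

RECIPES = ("A", "B", "C")

# Hypothetical entries: 2 x "A" and 3 x "B" on 21 August, 1 x "A" on 22 August.
entries = [
    {"recipe": "A", "timestamp": datetime(2020, 8, 21, 9, 0)},
    {"recipe": "A", "timestamp": datetime(2020, 8, 21, 10, 0)},
    {"recipe": "B", "timestamp": datetime(2020, 8, 21, 11, 0)},
    {"recipe": "B", "timestamp": datetime(2020, 8, 21, 12, 0)},
    {"recipe": "B", "timestamp": datetime(2020, 8, 21, 13, 0)},
    {"recipe": "A", "timestamp": datetime(2020, 8, 22, 9, 38)},
]


def counts_by_date(docs):
    """Hypothetical helper: per-date total plus one count per recipe."""
    per_date = defaultdict(Counter)
    for doc in docs:
        day = doc["timestamp"].strftime("%Y-%m-%d")  # like $dateToString
        per_date[day][doc["recipe"]] += 1  # like grouping by date and recipe
    rows = []
    # Sorting by date is an extra step here; the pipeline above does not sort.
    for day in sorted(per_date):
        recipe_counts = per_date[day]
        row = {"label": day, "value": sum(recipe_counts.values())}
        for name in RECIPES:
            row[name] = recipe_counts.get(name, 0)  # like $ifNull with 0
        rows.append(row)
    return rows


rows = counts_by_date(entries)
print(rows)
```

The fixed `RECIPES` tuple plays the same role as the hard-coded A, B and C fields in the final $project: a recipe that never occurs on a date still appears with 0.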

How to get distinct name and count in MongoDB using PyMongo

I have the collection shown below. All I want is each distinct "Name" and its count. For example, Betty appears 2 times, so the output I want is Betty:2, Vic:1, Veronica:2. I am able to get the distinct names by issuing the command "db.Car.find().distinct('Name')", but I am not sure how to get the count.
{
"Name": "Betty",
"Car": "Jeep",
}
{
"Name": "Betty",
"Car": "Van",
}
{
"Name": "Vic",
"Car": "Ferrari",
}
{
"Name": "Veronica",
"Car": "Bus",
}
{
"Name": "Veronica",
"Car": "Van",
}
You can just use $group to group by Name field and use $sum operator in it to get the Count field.
Something like below:
db.collection.aggregate([
{
"$group": {
"_id": "$Name",
"Count": {
"$sum": 1
}
}
},
{
"$project": {
"Name": "$_id",
"Count": 1,
"_id":0
}
}
])
The above will produce the following output:
[
{
"Count": 2,
"Name": "Betty"
},
{
"Count": 1,
"Name": "Vic"
},
{
"Count": 2,
"Name": "Veronica"
}
]
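Since the question asks about PyMongo, here is a sketch of the same pipeline written as a Python list, which is the form `Collection.aggregate` accepts. The MongoDB call is only shown in a comment because it needs a running server, and `db` there is an assumed database handle. The `Counter` part is a plain-Python cross-check on the sample data, not something PyMongo does for you.

```python
from collections import Counter

# Sample documents copied from the question.
cars = [
    {"Name": "Betty", "Car": "Jeep"},
    {"Name": "Betty", "Car": "Van"},
    {"Name": "Vic", "Car": "Ferrari"},
    {"Name": "Veronica", "Car": "Bus"},
    {"Name": "Veronica", "Car": "Van"},
]

# The pipeline from the answer as a Python list of stage dicts.
pipeline = [
    {"$group": {"_id": "$Name", "Count": {"$sum": 1}}},
    {"$project": {"Name": "$_id", "Count": 1, "_id": 0}},
]

# With a live server this would be something like (db is an assumed handle):
#     results = list(db.Car.aggregate(pipeline))

# Plain-Python equivalent of the $group stage, for checking the expected counts.
name_counts = Counter(doc["Name"] for doc in cars)
print(dict(name_counts))
```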

Aggregate query result in mongodb

I have a collection with documents like this one:
{
"_id": 1,
"people": [
{
"name": "Bob",
"age": "15"
},
{
"name": "Alice",
"age": "18"
}
]
}
My query is:
db.groups.aggregate({ $match: { "_id": 1 }}, { $project: { "_id": 0, "people.name": 1 } })
This query returns:
{
"people": [
{
"name": "Bob"
},
{
"name": "Alice"
}
]
}
But I need the result like:
{ "names": [ "Bob", "Alice" ] }
Which parameters should I add to the .aggregate() function?
The solution is:
db.groups.aggregate({ $match: { "_id": 1 }}, { $project: { "_id": 0, "names": "$people.name" } })
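The reason this works is that a field path such as "$people.name" applied to an array of subdocuments yields an array of the name values. A minimal plain-Python sketch of that behaviour on the sample document (the function name `project_names` is made up for illustration):

```python
# The sample document from the question.
group = {
    "_id": 1,
    "people": [
        {"name": "Bob", "age": "15"},
        {"name": "Alice", "age": "18"},
    ],
}


def project_names(doc):
    """Hypothetical helper mirroring { "names": "$people.name" }."""
    # Collect the name of every element of the people array, in order.
    return {"names": [person["name"] for person in doc["people"]]}


print(project_names(group))
```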

How to $push a field depending on a condition?

I'm trying to conditionally push a field into an array during the $group stage of the MongoDB aggregation pipeline.
Essentially I have documents with the name of the user, and an array of the actions they performed.
If I group the user actions like this:
{ $group: { _id: { "name": "$user.name" }, "actions": { $push: "$action" } } }
I get the following:
[{
"_id": {
"name": "Bob"
},
"actions": ["add", "wait", "subtract"]
}, {
"_id": {
"name": "Susan"
},
"actions": ["add"]
}, {
"_id": {
"name": "Susan"
},
"actions": ["add", "subtract"]
}]
So far so good. The idea would be to now group together the actions array to see which set of user actions are the most popular. The problem is that I need to remove the "wait" action before taking into account the group. Therefore the result should be something like this, taking into account that the "wait" element should not be considered in the grouping:
[{
"_id": ["add"],
"total": 1
}, {
"_id": ["add", "subtract"],
"total": 2
}]
Test #1
If I add this $group stage:
{ $group : { _id : "$actions", total: { $sum: 1} }}
I get the count that I want, but it takes into account the unwanted "wait" array element.
[{
"_id": ["add"],
"total": 1
}, {
"_id": ["add", "subtract"],
"total": 1
}, {
"_id": ["add", "wait", "subtract"],
"total": 1
}]
Test #2
{ $group: { _id: { "name": "$user.name" }, "actions": { $push: { $cond: { if: { $ne: [ "$action", 'wait' ] }, then: "$action", else: null } } } } },
{ $group: { _id: "$actions", total: { $sum: 1 } } }
This is as close as I've gotten, but this pushes null values where the wait would be, and I can't figure out how to remove them.
[{
"_id": ["add"],
"total": 1
}, {
"_id": ["add", "subtract"],
"total": 1
}, {
"_id": ["add", null, "subtract"],
"total": 1
}]
UPDATE:
My simplified documents look like this:
{
"_id": ObjectID("573e0c6155e2a8f9362fb8ff"),
"user": {
"name": "Bob",
},
"action": "add",
}
You need a preliminary $match stage in your pipeline to select only those documents where "action" is not equal to "wait".
db.collection.aggregate([
{ "$match": { "action": { "$ne": "wait" } } },
{ "$group": {
"_id": "$user.name",
"actions": { "$push": "$action" },
"total": { "$sum": 1 }
}}
])
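To show how filtering first leads to the counts the question asks for, here is a plain-Python sketch that also applies the second grouping from the question's Test #1 (grouping on the actions array). The events are hypothetical: the third user "Carol" and the helper name `popular_action_sets` are invented for the example. List order here is just insertion order, whereas $push order in MongoDB depends on the order in which documents reach $group.

```python
from collections import Counter, defaultdict

# Hypothetical action documents in the simplified shape from the question.
events = [
    {"user": {"name": "Bob"}, "action": "add"},
    {"user": {"name": "Bob"}, "action": "wait"},
    {"user": {"name": "Bob"}, "action": "subtract"},
    {"user": {"name": "Susan"}, "action": "add"},
    {"user": {"name": "Carol"}, "action": "add"},
    {"user": {"name": "Carol"}, "action": "subtract"},
]


def popular_action_sets(docs):
    """Hypothetical helper: drop "wait", group per user, then count the sets."""
    # $match: keep only documents whose action is not "wait".
    kept = [d for d in docs if d["action"] != "wait"]
    # First $group: push each user's remaining actions into a list.
    per_user = defaultdict(list)
    for d in kept:
        per_user[d["user"]["name"]].append(d["action"])
    # Second $group (from Test #1): count identical action lists.
    totals = Counter(tuple(actions) for actions in per_user.values())
    return [{"_id": list(actions), "total": n} for actions, n in totals.items()]


result = popular_action_sets(events)
print(result)
```

Because "wait" is removed before any grouping, no null placeholders ever end up in the arrays.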