How to calculate percentage using MongoDB aggregation - mongodb

I want to calculate percentages with the help of MongoDB aggregation.
My collection has the following data:
subject_id  gender   other_data
1           Male     XYZ
1           Male     ABC
1           Male     LMN
2           Female   TBZ
3           Female   NDA
4           Unknown  UJY
I want the output to look something like this:
[{
gender: 'Male',
total: 1,
percentage: 25.0
},{
gender: 'Female',
total: 2,
percentage: 50.0
},{
gender: 'Unknown',
total: 1,
percentage: 25.0
}]
I have tried various methods, but none of them works. The main problem is that I cannot get the combined total of Male, Female and Unknown, which I need in order to calculate the percentage. The trickiest part is that there are only 4 members in the example above, but a subject_id may be repeated, once for each other_data value.
Thanks in Advance.

You can use this aggregation query:
First, $group by subject_id so that each person is counted only once.
Then use $facet to run two sub-pipelines over that result: one uses $count to get the total number of documents (nDocs), and the other groups the documents by gender (groupValues).
$facet returns each result as an array, so an $addFields stage with $arrayElemAt takes the first element of nDocs, which is where the total is.
Next, $unwind groupValues so that every gender group becomes its own document carrying the nDocs value.
Finally, output the values you want using $project. To get the percentage, $divide total by nDocs and $multiply the result by 100.
db.collection.aggregate([
  {
    "$group": {
      "_id": "$subject_id",
      "gender": { "$first": "$gender" }
    }
  },
  {
    "$facet": {
      "nDocs": [
        { "$count": "nDocs" }
      ],
      "groupValues": [
        {
          "$group": {
            "_id": "$gender",
            "total": { "$sum": 1 }
          }
        }
      ]
    }
  },
  {
    "$addFields": {
      "nDocs": { "$arrayElemAt": ["$nDocs", 0] }
    }
  },
  {
    "$unwind": "$groupValues"
  },
  {
    "$project": {
      "_id": 0,
      "gender": "$groupValues._id",
      "total": "$groupValues.total",
      "percentage": {
        "$multiply": [
          { "$divide": ["$groupValues.total", "$nDocs.nDocs"] },
          100
        ]
      }
    }
  }
])
Example here
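If you want to sanity-check the arithmetic without a MongoDB server, the same steps can be mirrored in plain JavaScript. This is only a minimal sketch: the function name genderPercentages and the sample array are made up for illustration, and the code follows the stages of the pipeline above rather than replacing it.

```javascript
// In-memory mirror of the pipeline stages above (hypothetical helper, not a MongoDB API).
function genderPercentages(docs) {
  // Stage 1 ($group by subject_id with $first): keep one gender per person.
  const genderBySubject = new Map();
  for (const doc of docs) {
    if (!genderBySubject.has(doc.subject_id)) {
      genderBySubject.set(doc.subject_id, doc.gender);
    }
  }

  // Stage 2 ($facet): total number of people, and a count per gender.
  const nDocs = genderBySubject.size;
  const totals = new Map();
  for (const gender of genderBySubject.values()) {
    totals.set(gender, (totals.get(gender) || 0) + 1);
  }

  // Stages 3-5 ($addFields, $unwind, $project): one output row per gender.
  return [...totals].map(([gender, total]) => ({
    gender,
    total,
    percentage: (total / nDocs) * 100,
  }));
}

// Sample data copied from the question.
const sample = [
  { subject_id: 1, gender: "Male", other_data: "XYZ" },
  { subject_id: 1, gender: "Male", other_data: "ABC" },
  { subject_id: 1, gender: "Male", other_data: "LMN" },
  { subject_id: 2, gender: "Female", other_data: "TBZ" },
  { subject_id: 3, gender: "Female", other_data: "NDA" },
  { subject_id: 4, gender: "Unknown", other_data: "UJY" },
];

console.log(genderPercentages(sample));
```

Subject 1 appears three times but is counted once, which is why Male ends up with a total of 1 out of 4 people.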

Related

MongoDB merge queries as aggregated pipeline

I have two separate queries.
The first one finds removed items.
The second one checks, by itemId, whether an item is removed in every country.
Can I merge them into a single pipeline?
Example Input:
itemId removed country
1 true US
1 false TR
2 true US
2 true RU
Example Output:
itemId: 2
Expected: (itemId: 2) Find the itemIds that have been removed in every country.
My code looks like:
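// Collect every itemId that has at least one removed document: [1, 2]
const itemIdList = db.collection.distinct("itemId", { removed: true });
// Keep only the ids that have no non-removed document in any country.
const remainingItems = itemIdList.filter(
  (id) => db.collection.countDocuments({ removed: false, itemId: id }) === 0
);
// remainingItems: [2]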
Hi, you can achieve this with an aggregation. First, group the documents by itemId, counting both the total number of documents and the number with removed: true. Then, in a $match stage, keep only the groups where the two counts are equal, which means that itemId is removed in every country. Finally, push the remaining itemIds into an array.
Working solution
db.collection.aggregate([
  {
    "$group": {
      "_id": "$itemId",
      "itemsArrCount": { "$sum": 1 },
      "totalRemovedTrue": {
        "$sum": {
          "$cond": [
            { "$eq": ["$removed", true] },
            1,
            0
          ]
        }
      }
    }
  },
  {
    "$match": {
      "$expr": {
        "$eq": ["$itemsArrCount", "$totalRemovedTrue"]
      }
    }
  },
  {
    "$group": {
      "_id": null,
      "itemIds": { "$push": "$_id" }
    }
  }
])
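To see why comparing the two counts works, here is a rough plain-JavaScript equivalent of the pipeline, run against the example input from the question. The helper name fullyRemovedItemIds is hypothetical and exists only for this illustration.

```javascript
// In-memory mirror of the pipeline above (hypothetical helper, not a MongoDB API).
function fullyRemovedItemIds(docs) {
  // $group by itemId: count all documents and those with removed === true.
  const counts = new Map(); // itemId -> { total, removed }
  for (const { itemId, removed } of docs) {
    const entry = counts.get(itemId) || { total: 0, removed: 0 };
    entry.total += 1;
    if (removed === true) entry.removed += 1;
    counts.set(itemId, entry);
  }

  // $match with $expr: keep groups where every document is removed,
  // then collect the ids (the final $group with $push).
  return [...counts]
    .filter(([, entry]) => entry.total === entry.removed)
    .map(([itemId]) => itemId);
}

// Example input from the question.
const items = [
  { itemId: 1, removed: true, country: "US" },
  { itemId: 1, removed: false, country: "TR" },
  { itemId: 2, removed: true, country: "US" },
  { itemId: 2, removed: true, country: "RU" },
];

console.log(fullyRemovedItemIds(items));
```

Item 1 has two documents but only one with removed: true, so the counts differ and it is dropped; item 2 has two of each and is kept.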

MongoDB - Obtain full document of a group taking into account the minimum value of one property

Good afternoon. I'm just starting with MongoDB and I have a question about the $group aggregation stage.
From the following set of documents, I need to get the cheapest room among all similar rooms (grouping by room identifier).
{"_id":"874521035","provider":{"id":{"$numberInt":"2"},"name":"HotelBeds"},"accommodation":{"id":{"$numberInt":"36880"},"name":"Hotel Goya"},"room":{"id":{"$numberInt":"1"},"name":"Doble"},"board":{"id":{"$numberInt":"1"},"name":"Sólo alojamiento"},"fare":{"id":"NRF","name":"No reembolsable"},"price":{"cost":{"$numberInt":"115"},"net":{"$numberInt":"116"},"pvp":{"$numberInt":"126"}},"fees":{"agency":{"$numberInt":"10"},"cdv":{"$numberInt":"1"}},"cancellation-deadeline":"2019-12-31","payment-deadeline":"2019-12-30"}
{"_id":"123456789","provider":{"id":{"$numberInt":"2"},"name":"HotelBeds"},"accommodation":{"id":{"$numberInt":"36880"},"name":"Hotel Goya"},"room":{"id":{"$numberInt":"1"},"name":"Doble"},"board":{"id":{"$numberInt":"2"},"name":"Alojamiento y desayuno"},"fare":{"id":"NOR","name":"Reembolsable"},"price":{"cost":{"$numberInt":"120"},"net":{"$numberInt":"121"},"pvp":{"$numberInt":"131"}},"fees":{"agency":{"$numberInt":"10"},"cdv":{"$numberInt":"1"}},"cancellation-deadeline":"2019-12-31","payment-deadeline":"2019-12-30"}
{"_id":"987654321","provider":{"id":{"$numberInt":"2"},"name":"HotelBeds"},"accommodation":{"id":{"$numberInt":"36880"},"name":"Hotel Goya"},"room":{"id":{"$numberInt":"2"},"name":"Triple"},"board":{"id":{"$numberInt":"1"},"name":"Sólo alojamiento"},"fare":{"id":"NOR","name":"Reembolsable"},"price":{"cost":{"$numberInt":"125"},"net":{"$numberInt":"126"},"pvp":{"$numberInt":"136"}},"fees":{"agency":{"$numberInt":"10"},"cdv":{"$numberInt":"1"}},"cancellation-deadeline":"2019-12-31","payment-deadeline":"2019-12-30"}
{"_id":"852963147","provider":{"id":{"$numberInt":"2"},"name":"HotelBeds"},"accommodation":{"id":{"$numberInt":"36880"},"name":"Hotel Goya"},"room":{"id":{"$numberInt":"3"},"name":"Doble uso individual"},"board":{"id":{"$numberInt":"1"},"name":"Sólo alojamiento"},"price":{"cost":{"$numberInt":"99"},"net":{"$numberInt":"100"},"pvp":{"$numberInt":"110"}},"fees":{"agency":{"$numberInt":"10"},"cdv":{"$numberInt":"1"}},"cancellation-deadeline":"2019-12-31","payment-deadeline":"2019-12-30"}
So far I have only managed to obtain the cheapest price, the room identifier and the number of repetitions.
db.consolidation.aggregate([
  {
    $group: {
      _id: "$room.id",
      "cheapest": { $min: "$price.pvp" },
      "qty": { $sum: 1 }
    }
  }
]);
{"_id": 2, "cheapest": 136, "qty": 1}
{"_id": 3, "cheapest": 110, "qty": 1}
{"_id": 1, "cheapest": 126, "qty": 2}
While investigating, I saw that whole documents can be obtained with $first or $last, but that is not the data I need, since the result depends on the position of the document.
What I need is to obtain, from the set of documents, the full document of the cheapest room in each group. This is the result I expect:
{"_id":"874521035","provider":{"id":{"$numberInt":"2"},"name":"HotelBeds"},"accommodation":{"id":{"$numberInt":"36880"},"name":"Hotel Goya"},"room":{"id":{"$numberInt":"1"},"name":"Doble"},"board":{"id":{"$numberInt":"1"},"name":"Sólo alojamiento"},"fare":{"id":"NRF","name":"No reembolsable"},"price":{"cost":{"$numberInt":"115"},"net":{"$numberInt":"116"},"pvp":{"$numberInt":"126"}},"fees":{"agency":{"$numberInt":"10"},"cdv":{"$numberInt":"1"}},"cancellation-deadeline":"2019-12-31","payment-deadeline":"2019-12-30"}
{"_id":"987654321","provider":{"id":{"$numberInt":"2"},"name":"HotelBeds"},"accommodation":{"id":{"$numberInt":"36880"},"name":"Hotel Goya"},"room":{"id":{"$numberInt":"2"},"name":"Triple"},"board":{"id":{"$numberInt":"1"},"name":"Sólo alojamiento"},"fare":{"id":"NOR","name":"Reembolsable"},"price":{"cost":{"$numberInt":"125"},"net":{"$numberInt":"126"},"pvp":{"$numberInt":"136"}},"fees":{"agency":{"$numberInt":"10"},"cdv":{"$numberInt":"1"}},"cancellation-deadeline":"2019-12-31","payment-deadeline":"2019-12-30"}
{"_id":"852963147","provider":{"id":{"$numberInt":"2"},"name":"HotelBeds"},"accommodation":{"id":{"$numberInt":"36880"},"name":"Hotel Goya"},"room":{"id":{"$numberInt":"3"},"name":"Doble uso individual"},"board":{"id":{"$numberInt":"1"},"name":"Sólo alojamiento"},"price":{"cost":{"$numberInt":"99"},"net":{"$numberInt":"100"},"pvp":{"$numberInt":"110"}},"fees":{"agency":{"$numberInt":"10"},"cdv":{"$numberInt":"1"}},"cancellation-deadeline":"2019-12-31","payment-deadeline":"2019-12-30"}
I hope I have explained myself clearly.
Thanks in advance.
Regards.
You can capture $$ROOT as part of your $group stage and then use $filter to compare each document in the group against the minimum value. $replaceRoot then restores the original document shape:
db.collection.aggregate([
  {
    $group: {
      _id: "$room.id",
      "cheapest": { $min: "$price.pvp" },
      "qty": { $sum: 1 },
      docs: { $push: "$$ROOT" }
    }
  },
  {
    $replaceRoot: {
      newRoot: {
        $arrayElemAt: [
          { $filter: { input: "$docs", cond: { $eq: ["$$this.price.pvp", "$cheapest"] } } },
          0
        ]
      }
    }
  }
])
Mongo Playground
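The selection logic can also be sketched in plain JavaScript, which may help when checking the expected result by hand. The helper name cheapestPerRoom is hypothetical, and the sample documents are trimmed to _id, room.id and price.pvp with plain numbers instead of the $numberInt wrappers, which is an assumption made to keep the example short.

```javascript
// In-memory mirror of the $group + $filter + $replaceRoot pipeline above
// (hypothetical helper, not a MongoDB API).
function cheapestPerRoom(docs) {
  const best = new Map(); // room.id -> cheapest document seen so far
  for (const doc of docs) {
    const current = best.get(doc.room.id);
    // Strict "<" keeps the first document on a price tie, matching
    // $arrayElemAt index 0 on the filtered array.
    if (current === undefined || doc.price.pvp < current.price.pvp) {
      best.set(doc.room.id, doc);
    }
  }
  return [...best.values()];
}

// Trimmed versions of the four documents from the question.
const rooms = [
  { _id: "874521035", room: { id: 1, name: "Doble" }, price: { pvp: 126 } },
  { _id: "123456789", room: { id: 1, name: "Doble" }, price: { pvp: 131 } },
  { _id: "987654321", room: { id: 2, name: "Triple" }, price: { pvp: 136 } },
  { _id: "852963147", room: { id: 3, name: "Doble uso individual" }, price: { pvp: 110 } },
];

console.log(cheapestPerRoom(rooms).map((doc) => doc._id));
```

Keep in mind that $push: "$$ROOT" holds every document of a group in memory, so on very large groups it may be worth testing how the pipeline behaves.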

How to aggregate data which an array field sum is between two values?

I have two values which are minCount and maxCount.
In my model I have a field called counts, something like this:
{
createdAt: date
counts: [ 0,200,100] ==> Sum 300
},
{
createdAt: date
counts: [ 200,500,0] ==> Sum 700
},
{
createdAt: date
counts: [ 0,1100,100] ==> Sum 1200
},
For each document, I need to return the sum of its counts array, but only when that sum is between minCount and maxCount.
Example:
minCount= 400
maxCount= 1300
Return
{
createdAt: date
total: 700
},
{
createdAt: date
total: 1200
},
I already filter createdAt between two dates in the first stage of the pipeline, like this:
Record.aggregate ([
{
$match: {
createdAt: {
$gte: new Date (req.body.startDate),
$lte: new Date (req.body.endDate),
},
},
},
// Next stage: get the total of counts and filter on it, which is the part I could not do.
])
I am fairly new to the aggregation pipeline, so please help.
Working example - https://mongoplayground.net/p/I6LOLhTA-yA
db.collection.aggregate([
  {
    "$project": {
      "counts": 1,
      "createdAt": 1,
      "totalCounts": { "$sum": "$counts" }
    }
  },
  {
    "$match": {
      "totalCounts": {
        "$gte": 400,
        "$lte": 1300
      }
    }
  }
])
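The answer above leaves out the date filter from the question. A possible way to combine both, sketched as a small pipeline builder: the function name buildCountsPipeline and the example dates are assumptions for illustration, and the output field is named total to match the result shape the question asks for.

```javascript
// Hypothetical helper that assembles the full pipeline; it only builds plain
// objects and does not talk to MongoDB itself.
function buildCountsPipeline(startDate, endDate, minCount, maxCount) {
  return [
    // Date filter taken from the question's first stage.
    {
      $match: {
        createdAt: { $gte: new Date(startDate), $lte: new Date(endDate) },
      },
    },
    // $sum over an array field adds up its numeric elements.
    { $project: { _id: 0, createdAt: 1, total: { $sum: "$counts" } } },
    // Keep only documents whose total lies within the requested range.
    { $match: { total: { $gte: minCount, $lte: maxCount } } },
  ];
}

// Example values; the dates are placeholders, the counts come from the question.
const pipeline = buildCountsPipeline("2021-01-01", "2021-12-31", 400, 1300);
console.log(JSON.stringify(pipeline, null, 2));
// With the Mongoose model from the question this would be passed as
// Record.aggregate(pipeline).
```

Putting the date $match first lets MongoDB use an index on createdAt, if one exists, before any sums are computed.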

Getting average of all the values for a field in a Mongo DB collection using mongoose

Here's what the data looks like:
[{_id:1,price:"5"},{_id:2,price:"10"},{_id:3,price: null}]
The expected outcome is the average of all the values in the price field, i.e. the average of 5 and 10, which is 7.5.
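You should be able to do a $group with $avg (https://docs.mongodb.com/manual/reference/operator/aggregation/avg/). Note that $avg ignores non-numeric values and the prices in the sample data are stored as strings, so they need to be converted first. $toDouble (assuming MongoDB 4.0 or later) does that and leaves null as null, which $avg then skips.
db.collection.aggregate([
  {
    $group: {
      _id: "",
      price: {
        // "5" and "10" become numbers; null stays null and is ignored by $avg.
        $avg: { $toDouble: "$price" }
      }
    }
  }
])
When executed against the sample data, this will output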
[
{
"_id": "",
"price": 7.5
}
]

$facet \ $bucket date manipulation

I am using $facet aggregation for an e-commerce style platform.
For this case I have a products collection, and the product schema contains many fields.
I am using the same aggregation process to get all the required facets.
The question is whether I can use the same $facet \ $bucket aggregation to group documents by a transformation of a specific field.
For example, in the product schema I have a releaseDate field, which is a Date (type) field.
Currently the aggregation query looks like this:
let facetsBodyExample = {
'facetName': [
{ $unwind: `$${fieldData.key}` },
{ $sortByCount: `$${fieldData.key}` }
],
'facetName2': [
{ $unwind: `$${fieldData.key}` },
{ $sortByCount: `$${fieldData.key}` }
],
...,
...
};
let results = await Product.aggregate([
  { $facet: facetsBodyExample }
]);
The documents look like this:
{
_id : 'as5d16as1d65sa65d165as1d',
name : 'product name',
releaseDate: '2015-07-01T00:00:00.000Z',
field1:13,
field2:[1,2,3],
field3:'abx',
...
}
I want to create custom facets (groups) by quarter + year, in a format like 'Qn/YYYY', without defining any boundaries.
Right now I am getting groups by exact date match, and I want to group them by quarter + year instead, if possible in the same $facet aggregation.
Query result for the date field I want to customize:
CURRENT:
{
relaseDate: [
{ "_id": "2017-01-01T00:00:00.000Z", "count": 26 },
{ "_id": "2013-04-01T00:00:00.000Z", "count": 25 },
{ "_id": "2013-07-01T00:00:00.000Z", "count": 23 },
...
]
}
DESIRED:
{
relaseDate: [
{ "_id": "Q1/2014", "count": 100 },
{ "_id": "Q2/2014", "count": 200 },
{ "_id": "Q3/2016", "count": 300 },
...
]
}
Thanks !!
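One possible approach, sketched under a few assumptions: releaseDate is stored as a real Date (if it is a string, as in the sample document, it would need something like $toDate first), the server is MongoDB 4.0 or later so that $toString and $toInt are available, and quarters are computed in UTC. The names quarterLabel, quarterYearExpr and releaseDateFacet are made up for this example. Since $sortByCount accepts an expression, the label can be built inline inside the existing $facet.

```javascript
// Plain-JavaScript version of the label, handy for checking the arithmetic.
function quarterLabel(date) {
  // Months 1-3 -> quarter 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4.
  const quarter = Math.ceil((date.getUTCMonth() + 1) / 3);
  return `Q${quarter}/${date.getUTCFullYear()}`;
}

// The same computation written as an aggregation expression.
// fieldPath is expected to be something like "$releaseDate".
function quarterYearExpr(fieldPath) {
  return {
    $concat: [
      "Q",
      // $ceil returns a double, so convert to int before turning it into text.
      { $toString: { $toInt: { $ceil: { $divide: [{ $month: fieldPath }, 3] } } } },
      "/",
      { $toString: { $year: fieldPath } },
    ],
  };
}

// A facet sub-pipeline that could sit next to the other facets.
const releaseDateFacet = [{ $sortByCount: quarterYearExpr("$releaseDate") }];

console.log(quarterLabel(new Date("2015-07-01T00:00:00.000Z")));
console.log(JSON.stringify(releaseDateFacet, null, 2));
```

The sub-pipeline would be added to facetsBodyExample under whatever key the date facet should use; no $unwind is needed because releaseDate is not an array.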