MongoDB - Get rank of the document based on frequency

[
{_id: 1, query: 'A', createdAt: 1660610671 },
{_id: 2, query: 'A', createdAt: 1660610672 },
{_id: 3, query: 'A', createdAt: 1660610673 },
{_id: 4, query: 'A', createdAt: 1660610674 },
{_id: 5, query: 'B', createdAt: 1660610675 },
{_id: 6, query: 'C', createdAt: 1660610676 },
{_id: 7, query: 'C', createdAt: 1660610677 },
{_id: 8, query: 'C', createdAt: 1660610678 },
{_id: 9, query: 'D', createdAt: 1660610680 },
{_id: 10, query: 'D', createdAt: 1660610681 },
]
I have the above collection. I want to rank each query value by how often it occurs within a specific period.
Maybe the call would look something like this:
Queries.getRank({ key: 'query', createdAt: {$gte: startUnix, $lt: endUnix } })
I expect the result as below.
Rank
[
{rank: 1, query: 'A', frequency: 4},
{rank: 2, query: 'C', frequency: 3},
{rank: 3, query: 'D', frequency: 2},
{rank: 4, query: 'B', frequency: 1}
]
Is there a way to achieve it? Thanks.

$match - Filter documents by the createdAt range (if needed).
$group - Group by query and count the documents in each group as frequency.
$project - Shape the output document(s).
$setWindowFields - Use $rank to rank by frequency in descending order. Tied frequencies share a rank under $rank, but the following rank is skipped; consider $denseRank if ranks should stay consecutive.
db.collection.aggregate([
// $match stage
{
$group: {
_id: "$query",
frequency: {
$sum: 1
}
}
},
{
$project: {
_id: 0,
query: "$_id",
frequency: "$frequency"
}
},
{
$setWindowFields: {
partitionBy: null,
sortBy: {
frequency: -1
},
output: {
rank: {
$rank: {}
}
}
}
},
])
Demo # Mongo Playground
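The `// $match stage` placeholder in the pipeline above can be filled in with the time window from the question. This is a minimal sketch, assuming Unix-second timestamps as in the sample data; `buildRankPipeline` and its `dense` flag are hypothetical names used for illustration, not part of any MongoDB API. Note that `$setWindowFields` requires MongoDB 5.0 or later.

```javascript
// Hypothetical helper: builds the aggregation pipeline for a given time window.
// Pass dense = true to use $denseRank instead of $rank.
function buildRankPipeline(startUnix, endUnix, dense = false) {
  const rankOperator = dense ? "$denseRank" : "$rank";
  return [
    // Keep only documents created inside [startUnix, endUnix).
    { $match: { createdAt: { $gte: startUnix, $lt: endUnix } } },
    // Count how many documents share each query value.
    { $group: { _id: "$query", frequency: { $sum: 1 } } },
    // Rename _id to query and keep the frequency.
    { $project: { _id: 0, query: "$_id", frequency: 1 } },
    // Rank across the whole result set, highest frequency first.
    {
      $setWindowFields: {
        sortBy: { frequency: -1 },
        output: { rank: { [rankOperator]: {} } },
      },
    },
  ];
}

const pipeline = buildRankPipeline(1660610671, 1660610682);
// In mongosh this would be run as: db.collection.aggregate(pipeline)
```

With tied frequencies, `$rank` produces ranks such as 1, 1, 3, while `$denseRank` produces 1, 1, 2.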

You can write the following aggregation pipeline:
db.collection.aggregate([
{
"$group": {
"_id": "$query",
"frequency": {
"$sum": 1
}
}
},
{
"$project": {
"query": "$_id",
"frequency": 1,
"_id": 0
}
},
{
"$sort": {
frequency: -1
}
},
{
"$group": {
"_id": null,
"array": {
"$push": "$$ROOT"
}
}
},
{
"$unwind": {
path: "$array",
"includeArrayIndex": "rank"
}
},
{
"$project": {
_id: 0,
rank: {
"$add": [
"$rank",
1
]
},
frequency: "$array.frequency",
query: "$array.query"
}
}
]);
Playground link.
Here, we first calculate the frequency of each query, then sort by frequency, and finally push all documents into an array and derive the rank from the array index.
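To make the index-based ranking concrete, here is a plain JavaScript emulation of the same three steps, run on the sample data from the question. It is only a sketch of the logic, not MongoDB code, and `rankByFrequency` is a hypothetical name. As in the pipeline, tied frequencies would receive distinct consecutive ranks.

```javascript
// Plain JavaScript emulation of the pipeline's logic (not MongoDB code).
function rankByFrequency(docs) {
  // $group: count occurrences of each query value.
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.query, (counts.get(doc.query) || 0) + 1);
  }
  // $sort: order by frequency, highest first.
  const sorted = [...counts.entries()]
    .map(([query, frequency]) => ({ query, frequency }))
    .sort((a, b) => b.frequency - a.frequency);
  // $unwind with includeArrayIndex: the zero-based index plus one is the rank.
  return sorted.map((entry, index) => ({ rank: index + 1, ...entry }));
}

const sampleQueries = [
  { _id: 1, query: "A" }, { _id: 2, query: "A" }, { _id: 3, query: "A" },
  { _id: 4, query: "A" }, { _id: 5, query: "B" }, { _id: 6, query: "C" },
  { _id: 7, query: "C" }, { _id: 8, query: "C" }, { _id: 9, query: "D" },
  { _id: 10, query: "D" },
];

const ranked = rankByFrequency(sampleQueries);
console.log(ranked);
```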

Related

Mongodb aggregation pipeline, combine result from two facet pipeline

I have a claim type:
type TClaim = {
insuredId: number,
treatmentInfo: { amount: number }[]
}
and a list of claims:
[
{
insuredId: 1,
treatmentInfo: [{amount: 1}, {amount: 2}]
},
{
insuredId: 1,
treatmentInfo: [{amount: 3}, {amount: 4}]
},
{
insuredId: 2,
treatmentInfo: [{amount: 1}, {amount: 2}]
}
]
I want to get the result like:
[{insuredId: 1, numberOfClaims: 2, amount: 10},{insuredId: 2, numberOfClaims: 1, amount: 3}]
I'm using the $facet operator in a MongoDB aggregation, with one sub-pipeline counting numberOfClaims and one calculating the amount for each insured. But I can't combine them to get the result that I want.
$facet: {
totalClaims: [ { $group: { _id: '$insuredId', totalClaims: { $count: {} } } } ],
amount: [
{ $unwind: { path: '$treatmentInfo' } },
{ $group: { _id: '$insuredId', amount: { $sum: '$treatmentInfo.amount' } } }
]
}
Is there a reason why you want to use $facet? - I am just curious
You just need to add a new field that sums all the amounts in the array, and then group by insuredId. The query is largely self-explanatory.
db.collection.aggregate([
{
"$addFields": {
"totalAmount": {
"$sum": "$treatmentInfo.amount"
}
}
},
{
"$group": {
"_id": "$insuredId",
"numberOfClaims": {
"$sum": 1
},
"amount": {
"$sum": "$totalAmount"
}
}
}
])
Result:
[
{
"_id": 1,
"amount": 10,
"numberOfClaims": 2
},
{
"_id": 2,
"amount": 3,
"numberOfClaims": 1
}
]
MongoDB Playground
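As a cross-check of the arithmetic, the two stages can be emulated in plain JavaScript on the sample claims. This is a sketch only; `summarizeClaims` is a hypothetical name. The result above keys each group by `_id`, so the sketch also includes a `$project` stage (an addition, not part of the original answer) that could be appended to rename `_id` to `insuredId` as requested in the question.

```javascript
// Plain JavaScript emulation of $addFields + $group (not MongoDB code).
function summarizeClaims(claims) {
  const byInsured = new Map();
  for (const claim of claims) {
    // $addFields: total of the amounts inside this claim's treatmentInfo array.
    const totalAmount = claim.treatmentInfo.reduce((sum, t) => sum + t.amount, 0);
    // $group: one accumulator per insuredId.
    const acc = byInsured.get(claim.insuredId) ||
      { insuredId: claim.insuredId, numberOfClaims: 0, amount: 0 };
    acc.numberOfClaims += 1;
    acc.amount += totalAmount;
    byInsured.set(claim.insuredId, acc);
  }
  return [...byInsured.values()];
}

// Optional extra stage for the real pipeline: rename _id to insuredId.
const renameStage = {
  $project: { _id: 0, insuredId: "$_id", numberOfClaims: 1, amount: 1 },
};

const claims = [
  { insuredId: 1, treatmentInfo: [{ amount: 1 }, { amount: 2 }] },
  { insuredId: 1, treatmentInfo: [{ amount: 3 }, { amount: 4 }] },
  { insuredId: 2, treatmentInfo: [{ amount: 1 }, { amount: 2 }] },
];

const summary = summarizeClaims(claims);
console.log(summary);
```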

Projection and group on nested object mongodb aggregation query

How can I reference nested object fields in the $project and $group stages of a MongoDB aggregation query?
[
{
city: "Mumbai",
meta: {
luggage: 2,
scanLuggage: 1,
upiLuggage: 1
},
cash: 10
},
{
city: "Mumbai",
meta: {
luggage: 4,
scanLuggage: 3,
upiLuggage: 1
},
cash: 24
},
]
I want to $match the above on the basis of city, and return the sum of each luggage type.
My code is as follows but $project is not working -
City.aggregate([
{
$match: { city: 'Mumbai' }
},
{
$project: {
city: 1,
mata.luggage: 1,
meta.scanLuggage: 1,
meta.upiLuggage: 1
}
},
{
$group: {
id: city,
luggage: {$sum: '$meta.luggage'},
scanLuggage: {$sum: '$meta.scanLuggage'},
upiLuggage: {$sum: '$meta.upiLuggage'}
}
}
])
But the $project throws an error. I want my output to look like -
{
city: 'Mumbai',
luggage: 6,
scanLuggage: 4,
upiLuggage: 2
}
You should put nested field paths in quotes when using them as keys in $project. Also, the grouping key must be named _id, and its value must be a quoted field reference ("$city").
db.collection.aggregate([
{
$match: {
city: "Mumbai"
}
},
{
$project: {
city: 1,
"meta.luggage": 1,
"meta.scanLuggage": 1,
"meta.upiLuggage": 1
}
},
{
$group: {
_id: "$city",
luggage: {
$sum: "$meta.luggage"
},
scanLuggage: {
$sum: "$meta.scanLuggage"
},
upiLuggage: {
$sum: "$meta.upiLuggage"
}
}
}
])
This is the playground link.
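The pipeline above returns the city under `_id`, while the question asks for a `city` field. One possible variation, offered as a sketch rather than as the answer's method: `$group` can read nested paths such as `"$meta.luggage"` directly, so the intermediate `$project` is not required, and a final `$project` can rename `_id` to `city`. The constant name `luggagePipeline` is hypothetical.

```javascript
// Variation on the pipeline: no intermediate $project, final rename of _id.
const luggagePipeline = [
  { $match: { city: "Mumbai" } },
  {
    $group: {
      _id: "$city", // grouping key must be named _id
      luggage: { $sum: "$meta.luggage" },
      scanLuggage: { $sum: "$meta.scanLuggage" },
      upiLuggage: { $sum: "$meta.upiLuggage" },
    },
  },
  // Expose the grouping key as "city" and hide _id.
  { $project: { _id: 0, city: "$_id", luggage: 1, scanLuggage: 1, upiLuggage: 1 } },
];

// In mongosh this would be run as: db.collection.aggregate(luggagePipeline)
```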

MongoDB - Calculate field based on previous item

I have a tricky scenario: I need to calculate an extra field based on the value from the previous document. I have no idea how to do it in a performant manner. Any thoughts?
Data:
{
_id: 1,
score: 66,
created_at: "2021-04-01"
},
{
_id: 2,
score: 12,
created_at: "2021-04-03"
},
{
_id: 3,
score: 7,
created_at: "2021-04-06"
}
What I want to achieve
{
_id: 1,
score: 66,
total_score: 66, // The oldest item, so total score is the same as current score
created_at: "2021-04-01"
},
{
_id: 2,
score: 12,
total_score: 78, // Sum of current score and previous total_score
created_at: "2021-04-03"
},
{
_id: 3,
score: 7,
total_score: 85, // Sum of current score and previous total_score
created_at: "2021-04-06"
}
Any insights appreciated.
You can try this aggregation query:
$lookup with a pipeline on the same collection, matching documents whose _id is less than the current _id, to select all previous records
$group by null to get the sum of their scores
$arrayElemAt to get the first element of the lookup result
$ifNull to return 0 when the lookup result is empty (the oldest document), otherwise the summed value
$add to add the current score to the sum returned by the lookup
db.collection.aggregate([
{
$lookup: {
from: "collection",
let: { id: "$_id" },
pipeline: [
{ $match: { $expr: { $gt: ["$$id", "$_id"] } } },
{
$group: {
_id: null,
score: { $sum: "$score" }
}
}
],
as: "total_score"
}
},
{
$addFields: {
total_score: {
$add: [
"$score",
{
$ifNull: [
{ $arrayElemAt: ["$total_score.score", 0] },
0
]
}
]
}
}
}
])
Playground
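The `$lookup` above re-reads all earlier documents for every document. On MongoDB 5.0 or later, a different technique is available: `$setWindowFields` with a cumulative window. The sketch below is an alternative to the answer, not a restatement of it; it assumes ordering by `created_at` (stored as ISO date strings, as in the sample), and the names `runningTotalPipeline` and `runningTotal` are hypothetical. The plain JavaScript function only emulates the expected result.

```javascript
// Alternative pipeline (MongoDB 5.0+): cumulative sum over a window that
// spans from the first document up to the current one.
const runningTotalPipeline = [
  {
    $setWindowFields: {
      sortBy: { created_at: 1 },
      output: {
        total_score: {
          $sum: "$score",
          window: { documents: ["unbounded", "current"] },
        },
      },
    },
  },
];

// Plain JavaScript emulation of the same running total (not MongoDB code).
function runningTotal(docs) {
  let total = 0;
  return [...docs]
    .sort((a, b) => a.created_at.localeCompare(b.created_at)) // oldest first
    .map((doc) => {
      total += doc.score;
      return { ...doc, total_score: total };
    });
}

const scores = [
  { _id: 1, score: 66, created_at: "2021-04-01" },
  { _id: 2, score: 12, created_at: "2021-04-03" },
  { _id: 3, score: 7, created_at: "2021-04-06" },
];

const totals = runningTotal(scores);
console.log(totals.map((d) => d.total_score));
```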

MongoDB Filter only IF ANY

Mongo Playground
Let's say I have these results:
A)
[
{_id: 1, Name: 'A', Price: 10, xx:0},
{_id: 2, Name: 'B', Price: 15, xx:0},
{_id: 3, Name: 'A', Price: 100, xx:1},
{_id: 4, Name: 'B', Price: 150, xx:1},
]
B)
[
{_id: 1, Name: 'A', Price: 10, xx:0},
{_id: 2, Name: 'B', Price: 15, xx:0},
]
I want to:
If at least one document with xx: 1 exists, return only the xx: 1 documents
If there is no document with xx: 1, return all xx: 0 documents
Should I do a MAP & FILTER on root docs? or some kind of MATCH with conditionals? or Redact?
Results desired Ex.:
A) Removed xx:0 because xx:1 exists, so only xx:1 is returned
[
{_id: 3, Name: 'A', xx:1},
{_id: 4, Name: 'B', xx:1},
]
B) Returned only xx:0 as there are only xx:0
[
{_id: 1, Name: 'A', xx:0},
{_id: 2, Name: 'B', xx:0},
]
Group the documents by the xx field and add the grouped docs to the docs array using $push.
Sort the groups by the _id field (which now holds the xx value) in descending order.
Limit the result to 1.
If there are documents with both xx: 0 and xx: 1 values, only the xx: 1 group would be returned since we're sorting in descending order and limiting the result to the first group. If there are no documents with xx: 1 but documents with xx: 0 exist, the first group would be xx: 0 which gets returned.
You can then use $unwind to return a document for each grouped document and $replaceRoot to lift the document to the root level.
db.collection.aggregate([
{
$group: {
_id: "$xx",
docs: {
$push: "$$ROOT",
}
}
},
{
$sort: {
_id: -1,
}
},
{
$limit: 1,
},
{
$unwind: "$docs"
},
{
$replaceRoot: {
newRoot: "$docs"
},
}
])
MongoPlayground
If there might be docs with an xx value other than 0 and 1, you should filter those out using $match before grouping the docs using $group.
db.collection.aggregate([
{
$match: {
xx: {
$in: [
0,
1
]
}
}
},
{
$group: {
_id: "$xx",
docs: {
$push: "$$ROOT",
}
}
},
{
$sort: {
_id: -1,
}
},
{
$limit: 1,
},
{
$unwind: "$docs"
},
{
$replaceRoot: {
newRoot: "$docs"
},
}
])
MongoPlayground
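The group, sort, limit idea can be checked against both inputs from the question with a small plain JavaScript emulation. This is a sketch of the logic only, not MongoDB code, and `preferFlagged` is a hypothetical name.

```javascript
// Plain JavaScript emulation of $match + $group + $sort + $limit + $unwind.
function preferFlagged(docs) {
  // $match: keep only documents whose xx is 0 or 1.
  const candidates = docs.filter((d) => d.xx === 0 || d.xx === 1);
  if (candidates.length === 0) return [];
  // $sort by group key descending + $limit 1: pick the highest xx present.
  const highest = Math.max(...candidates.map((d) => d.xx));
  // $unwind + $replaceRoot: return the documents of that single group.
  return candidates.filter((d) => d.xx === highest);
}

const inputA = [
  { _id: 1, Name: "A", Price: 10, xx: 0 },
  { _id: 2, Name: "B", Price: 15, xx: 0 },
  { _id: 3, Name: "A", Price: 100, xx: 1 },
  { _id: 4, Name: "B", Price: 150, xx: 1 },
];
const inputB = [
  { _id: 1, Name: "A", Price: 10, xx: 0 },
  { _id: 2, Name: "B", Price: 15, xx: 0 },
];

const resultA = preferFlagged(inputA);
const resultB = preferFlagged(inputB);
console.log(resultA.map((d) => d._id), resultB.map((d) => d._id));
```

Like the pipeline, this keeps the Price field; a `$project` stage would be needed to drop it as in the desired output.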

Duplicate elements in a mongo db collection

Is there a quick, efficient way to duplicate elements in a MongoDB collection based on a property? In the example below, I am trying to duplicate the elements based on a jobId.
I am using Spring boot, so any example using Spring boot API would be even more helpful.
Original Collection
{ _id: 1, jobId: 1, product: "A"},
{ _id: 2, jobId: 1, product: "B"},
{ _id: 3, jobId: 1, product: "C"},
After duplication
{ _id: 1, jobId: 1, product: "A"},
{ _id: 2, jobId: 1, product: "B"},
{ _id: 3, jobId: 1, product: "C"},
{ _id: 4, jobId: 2, product: "A"},
{ _id: 5, jobId: 2, product: "B"},
{ _id: 6, jobId: 2, product: "C"},
You can use following aggregation:
db.col.aggregate([
{
$group: {
_id: null,
values: { $push: "$$ROOT" }
}
},
{
$addFields: {
size: { $size: "$values" },
range: { $range: [ 0, 3 ] }
}
},
{
$unwind: "$range"
},
{
$unwind: "$values"
},
{
$project: {
_id: { $add: [ "$values._id", { $multiply: [ "$range", "$size" ] } ] },
jobId: { $add: [ "$values.jobId", "$range" ] },
product: "$values.product",
}
},
{
$sort: {
_id: 1
}
},
{
$out: "outCollection"
}
])
The algorithm is quite simple here: we want to iterate over two sets:
first one defined by all items from your source collection (that's why I'm grouping by null)
second one defined artificially by $range operator. It will define how many times we want to multiply our collection (3 times in this example)
The double $unwind generates as many documents as we need. The formula for each new _id is then: _id = _id + range * size. The last step writes the aggregation output to the target collection with $out.
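The formula can be traced in plain JavaScript on the three sample documents. This is a sketch of the logic only, not MongoDB code; `duplicateJobs` and `copies` are hypothetical names, and it assumes, as the sample does, that the source `_id` values are the contiguous integers 1 to size so that generated ids do not collide.

```javascript
// Plain JavaScript emulation of the double $unwind and the _id formula.
function duplicateJobs(docs, copies) {
  const size = docs.length; // $size of the grouped array
  const result = [];
  // Outer loop corresponds to $unwind on range: [0, copies).
  for (let range = 0; range < copies; range++) {
    // Inner loop corresponds to $unwind on values.
    for (const doc of docs) {
      result.push({
        _id: doc._id + range * size, // _id = _id + range * size
        jobId: doc.jobId + range,
        product: doc.product,
      });
    }
  }
  return result;
}

const original = [
  { _id: 1, jobId: 1, product: "A" },
  { _id: 2, jobId: 1, product: "B" },
  { _id: 3, jobId: 1, product: "C" },
];

const duplicated = duplicateJobs(original, 2);
console.log(duplicated);
```

Here `copies = 2` reproduces the six documents shown in the question, which corresponds to `$range: [ 0, 2 ]`. The `$range: [ 0, 3 ]` used in the pipeline corresponds to `copies = 3` and yields nine documents with jobId 1, 2, and 3.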