I store price-change data for each date in MongoDB as follows:
{ "_id" : "A1",
"Price" :
{
"2020-08-25": {"P" : [1200, 1300, 1250]},
"2020-08-26": {"P" : [1310, 1400, 1200]},
"2020-08-27": {"P" : [1500, 1300, 1300]},
...
}
},
{ "_id" : "A2",
"Price" :
{
"2020-08-25": {"P" : [1200, 1300, 1250]},
"2020-08-26": {"P" : [1310, 1400, 1200]},
"2020-08-27": {"P" : [1500, 1300, 1300]},
...
}
}
Now, I want to get the maximum price across all dates. How can I do this without listing each date field in my query? For a specific date such as "2020-08-25", one can use $group and $max to obtain the maximum price on that date. But how do I write a query that returns the maximum price among all dates?
Thanks
To get the maximum price across all dates in the collection, run an aggregation that first computes the maximum for each document. Use $map to build an array of per-date maxima and $max to reduce that array to a single value.
To build that array, first convert the Price document into an array of key/value pairs (dates and their prices) using $objectToArray:
db.getCollection('collection').aggregate([
{ '$set': {
'maxPricePerDocument': {
'$max': {
'$map': {
'input': { '$objectToArray': '$Price' },
'in': { '$max': '$$this.v.P' }
}
}
}
} },
{ '$group': {
'_id': 0,
'maxPriceForEntireCollection': { '$max': '$maxPricePerDocument' }
} }
])
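To make the logic of the pipeline above easier to follow, here is a minimal plain-JavaScript sketch of the same computation on in-memory data. The sample documents mirror the question; `maxPricePerDocument` and `maxPriceForCollection` are hypothetical helper names, and `Object.entries` stands in for $objectToArray. It only illustrates the logic and is not a replacement for the aggregation.

```javascript
// Sample documents shaped like the ones in the question.
const docs = [
  { _id: "A1", Price: { "2020-08-25": { P: [1200, 1300, 1250] },
                        "2020-08-26": { P: [1310, 1400, 1200] },
                        "2020-08-27": { P: [1500, 1300, 1300] } } },
  { _id: "A2", Price: { "2020-08-25": { P: [1200, 1300, 1250] },
                        "2020-08-26": { P: [1310, 1400, 1200] },
                        "2020-08-27": { P: [1500, 1300, 1300] } } },
];

// Mirrors $objectToArray + $map + $max: one maximum per date,
// then the maximum of those per-date maxima.
function maxPricePerDocument(doc) {
  const perDateMax = Object.entries(doc.Price).map(([, v]) => Math.max(...v.P));
  return Math.max(...perDateMax);
}

// Mirrors the $group stage: the maximum over all documents.
function maxPriceForCollection(documents) {
  return Math.max(...documents.map(maxPricePerDocument));
}

console.log(maxPriceForCollection(docs)); // 1500
```

The two-step shape (per-date maximum first, overall maximum second) is the same reason the pipeline nests one $max inside another.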
Update:
To get the maximum across the whole collection:
db.collection.aggregate([
{
$project: {
"prices": {
"$objectToArray": "$Price"
}
}
},
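{
// "$prices.v.P" is an array of arrays, and $max does not descend into nested arrays,
// so reduce each date's array to its own maximum before taking the overall maximum
$group: {
"_id": null,
"maxPrice": {
$max: {
$max: {
$map: {
input: "$prices",
in: { $max: "$$this.v.P" }
}
}
}
}
}
},
{
$project: {
"_id": 0,
"maxPrice": 1
}
}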
])
To get the maximum for each date:
db.collection.aggregate([
{
$project: {
"prices": {//reshape it to perform object wise operations - mainly converting to array
"$objectToArray": "$Price"
}
}
},
{//getting one by one entries
"$unwind": "$prices"
},
{
$group: {//grouping by date; the inner $max reduces each entry's array to a number, the outer $max accumulates across documents
"_id": "$prices.k",
"values": {
$max: { $max: "$prices.v.P" }
}
}
},
{
$project: {
"date": "$_id",
"maxPrice": "$values",//already the maximum across all entries for this date
"_id": 0
}
}
])
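As a rough illustration of what a per-date maximum has to compute, here is a plain-JavaScript sketch on in-memory data. The sample values are hypothetical and differ between the two documents on purpose, so that the result depends on looking at every document; `maxPricePerDate` is an assumed helper name, not part of MongoDB.

```javascript
// Hypothetical sample: the documents disagree, so the per-date maximum
// has to consider every document, not just one of them.
const priceDocs = [
  { _id: "A1", Price: { "2020-08-25": { P: [1200, 1300, 1250] },
                        "2020-08-26": { P: [1310, 1400, 1200] } } },
  { _id: "A2", Price: { "2020-08-25": { P: [1100, 5000, 1] },
                        "2020-08-26": { P: [900, 950, 1000] } } },
];

function maxPricePerDate(documents) {
  const result = new Map();
  for (const doc of documents) {
    // Object.entries plays the role of $objectToArray followed by $unwind.
    for (const [date, entry] of Object.entries(doc.Price)) {
      const localMax = Math.max(...entry.P); // max within this entry
      const seen = result.has(date) ? result.get(date) : -Infinity;
      result.set(date, Math.max(seen, localMax)); // max across documents
    }
  }
  return result;
}

const perDate = maxPricePerDate(priceDocs);
console.log(perDate.get("2020-08-25")); // 5000
console.log(perDate.get("2020-08-26")); // 1400
```

Note that the array [1100, 5000, 1] holds the largest value even though its first element is the smallest, which is why each array is reduced to a number before documents are compared.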
I am trying to find the products whose price is above the average price.
I know how to get the average:
db.products.aggregate([{
"$group": {
"_id": null,
"average": { "$avg": "$price" }
}
},
{ $project : { _id : 0 } } ])
But how can I use it in a $gt clause?
For instance, I tried to save the result in a variable:
var averageValue =
db.products.aggregate([{
"$group": {
"_id": null,
"average": { "$avg": "$price" }
}
},
{ $project : { _id : 0 } } ])
And then use it in the $gt clause:
db.products.find({ "price": { "$gt": averageValue} })
However, this does not print anything.
I am also wondering whether this can be done in a single query.
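If you use MongoDB 5.0 or later, you can use $setWindowFields to compute the average over all documents in the collection and add the result as a field to each document. The documentation describes the stage as follows:
"Performs operations on a specified span of documents in a collection, known as a window, and returns the results based on the chosen window operator."
With no partitionBy and no window bounds, as in the query below, the window covers the whole collection.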
db.products.aggregate([
{
"$setWindowFields": {
"output": {
"average": {
"$avg": "$price"
}
}
}
},
{
$match: {
$expr: {
$gt: [
"$price",
"$average"
]
}
}
}
])
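For readers who want to see the logic without a database, here is a small plain-JavaScript sketch of the same idea: compute one average over all items, then keep the items whose price exceeds it. The sample products and the helper name `aboveAverage` are hypothetical.

```javascript
// Hypothetical sample products; only the price field matters here.
const products = [
  { name: "a", price: 10 },
  { name: "b", price: 20 },
  { name: "c", price: 60 },
];

function aboveAverage(items) {
  if (items.length === 0) return [];
  // Mirrors $setWindowFields with $avg over the whole collection.
  const average = items.reduce((sum, p) => sum + p.price, 0) / items.length;
  // Mirrors $match with $expr: { $gt: ["$price", "$average"] }.
  return items.filter((p) => p.price > average);
}

console.log(aboveAverage(products).map((p) => p.name)); // [ 'c' ]
```

The two-query attempt in the question returns nothing because aggregate() yields a cursor rather than a number, so the $gt comparison never sees the average itself; the sketch compares against the number directly.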
I have the schema below and need to identify the object with the highest rank.
{ "team" : {
"member1" : [ { "rank": 2, "goal": 50 } ],
"member2" : [ { "rank": 5, "goal": 30 } ],
"member3" : [ { "rank": 1, "goal": 80 } ]
}}
$unwind does not work on nested objects. I tried converting this object to an array and finding the maximum of the rank key. Any help would be appreciated.
If the intent is only to find the maximum rank that exists: the idea is a two-stage aggregation that uses $project with $objectToArray to give the members common keys (k and v), so that $max can be applied to the required attribute.
Query:
db.collection.aggregate([
{
$project: {
teamsData: {
$objectToArray: "$team"
}
}
},
{
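$project: {
maxRank: {
// "$teamsData.v.rank" yields nested arrays such as [[2], [5], [1]], so take each member's maximum first
$max: {
$map: { input: "$teamsData", in: { $max: "$$this.v.rank" } }
}
}
}
}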
]);
To get the details of the object with the maximum rank: use $unwind on the array projected in the previous stage, sort by rank in descending order with $sort, and then pick the first item with $first in the $group stage.
Query:
db.collection.aggregate([
{
$project: {
team: {
$objectToArray: "$team"
}
}
},
{
$unwind: "$team"
},
{
$sort: {
"team.v.rank": -1
}
},
{
$group: {
_id: null,
maxRankObj: {
$first: "$$ROOT"
}
}
}
]);
Sample output:
[
{
"_id": null,
"maxRankObj": {
"_id": ObjectId("5a934e000102030405000000"),
"team": {
"k": "member2",
"v": [
{
"goal": 30,
"rank": 5
}
]
}
}
}
]
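The same selection can be sketched in plain JavaScript, which may help when checking what the pipeline is expected to return. The document below is the one from the question; `maxRankMember` is a hypothetical helper name, and the loop over `Object.entries` stands in for $objectToArray plus $unwind.

```javascript
// The document from the question.
const teamDoc = {
  team: {
    member1: [{ rank: 2, goal: 50 }],
    member2: [{ rank: 5, goal: 30 }],
    member3: [{ rank: 1, goal: 80 }],
  },
};

// Returns the member key and entry with the highest rank, or null if
// the team object has no entries.
function maxRankMember(doc) {
  let best = null;
  for (const [k, v] of Object.entries(doc.team)) {
    for (const entry of v) {
      if (best === null || entry.rank > best.entry.rank) {
        best = { k, entry };
      }
    }
  }
  return best;
}

const best = maxRankMember(teamDoc);
console.log(best.k, best.entry.rank); // member2 5
```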
I have the following documents stored in a collection:
{
"REQUESTTIMESTAMP" : "26-JUN-19 01.34.10.095000000 AM",
"UNHANDLED_INTENT" : 0,
"USERID" : "John",
"START_OF_INTENT_SKILL_CONVERSATION" : 0,
"PROPERTYCODE" : ""
}
I want to group these documents by hour, which has to be extracted from REQUESTTIMESTAMP.
Earlier, I stored the documents in a different structure that had a separate hour field, and I grouped on that field:
Previous aggregation query:
collection.aggregate([
{'$match': query}, {
'$group': {
"_id": {
"hour": "$hour",
"sessionId": "$sessionId"
}
}
}, {
"$group": {
"_id": "$_id.hour",
"count": {
"$sum": 1
}
}
}
])
Previous collection structure:
{
"timestamp" : "1581533210921",
"date" : "12-02-2020",
"hour" : "13",
"month" : "02",
"time" : "13:46:50",
"weekDay" : "Wednesday",
"__v" : 0
}
How can I run the same aggregation against the new document structure, extracting the hour from the REQUESTTIMESTAMP field?
You should convert your timestamp to a Date object and then take the hour from that Date object.
db.collection.aggregate([{
'$match': query
}, {
$project: {
date: {
$dateFromString: {
dateString: '$REQUESTTIMESTAMP',
format: "%m-%d-%Y" //This should be your date format
}
}
}
}, {
$group: {
_id: {
hour: {
$hour: "$date"
}
}
}
}])
The problem is that month names are not supported by MongoDB. Either you write a lot of code or you use a library such as Moment.js. First convert REQUESTTIMESTAMP to a proper Date object, then you can group on it.
db.collection.find().forEach(function (doc) {
var d = moment(doc.REQUESTTIMESTAMP, "DD-MMM-YY hh.mm.ss.SSS a");
db.collection.updateOne(
{ _id: doc._id },
{ $set: { date: d.toDate() } }
);
})
db.collection.aggregate([
{
$group: {
_id: { $hour: "$date" },
count: { $sum: 1 }
}
}
])
If you cannot update the database with an actual date field and want to keep the existing format, try the query below. It adds an hour field extracted from the string field REQUESTTIMESTAMP:
Query:
db.collection.aggregate([
{
$addFields: {
hour: {
$let: {
/** split the string into date, time and AM/PM, then keep the last two parts */
vars: { hour: { $slice: [{ $split: ["$REQUESTTIMESTAMP", " "] }, 1, 2] } },
in: {
$cond: [{ $in: ["AM", "$$hour"] }, // Check whether AM exists in the array
{ $mod: [{ $toInt: { $substr: [{ $arrayElemAt: ["$$hour", 0] }, 0, 2] } }, 12] }, // If AM, take the first 2 characters of the time as an int; $mod maps 12 AM to 0
{ $add: [{ $mod: [{ $toInt: { $substr: [{ $arrayElemAt: ["$$hour", 0] }, 0, 2] } }, 12] }, 12] } ] // If PM, add 12; $mod keeps 12 PM at 12 instead of 24
}
}
}
}
}
])
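The string handling is easy to get wrong around midnight and noon, so here is a small plain-JavaScript sketch of the hour extraction and the per-hour count. It assumes every REQUESTTIMESTAMP has the exact shape shown in the question ("DD-MON-YY hh.mm.ss.fraction AM|PM"); `hourFromTimestamp` and `countByHour` are hypothetical helper names.

```javascript
// Extracts the 24-hour clock hour from strings such as
// "26-JUN-19 01.34.10.095000000 AM" (format assumed from the question).
function hourFromTimestamp(ts) {
  const [, time, meridiem] = ts.split(" ");
  const hour12 = parseInt(time.slice(0, 2), 10) % 12; // maps 12 to 0
  return meridiem === "PM" ? hour12 + 12 : hour12;
}

// Counts documents per hour, like $group with count: { $sum: 1 }.
function countByHour(docs) {
  const counts = {};
  for (const doc of docs) {
    const hour = hourFromTimestamp(doc.REQUESTTIMESTAMP);
    counts[hour] = (counts[hour] || 0) + 1;
  }
  return counts;
}

const sample = [
  { REQUESTTIMESTAMP: "26-JUN-19 01.34.10.095000000 AM" },
  { REQUESTTIMESTAMP: "26-JUN-19 01.59.00.000000000 AM" },
  { REQUESTTIMESTAMP: "26-JUN-19 12.05.00.000000000 PM" },
];
console.log(countByHour(sample)); // hour 1 twice, hour 12 once
```

Taking the hour modulo 12 before adding the PM offset is what keeps 12 AM at hour 0 and 12 PM at hour 12.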
I have a document which describes counts of different things observed by a camera within a 15 minute period. It looks like this:
{
"_id" : ObjectId("5b1a709a83552d002516ac19"),
"start" : ISODate("2018-06-08T11:45:00.000Z"),
"end" : ISODate("2018-06-08T12:00:00.000Z"),
"recording" : ObjectId("5b1a654683552d002516ac16"),
"data" : {
"counts" : {
"5b434d05da1f0e00252566be" : 12,
"5b434d05da1f0e00252566cc" : 4,
"5b434d05da1f0e00252566ca" : 1
}
}
}
The keys inside the data.counts object change with each document and refer to additional data that is fetched at a later date. There can be any number of keys inside data.counts (usually about 20).
I am trying to aggregate all these 15 minute documents up to daily aggregated documents.
I have this query at the moment to do that:
db.getCollection("segments").aggregate([
{$match:{
"recording": ObjectId("5bf7f68ad8293a00261dd83f")
}},
{$project:{
"start": 1,
"recording": 1,
"data": 1
}},
{$group:{
_id: { $dateToString: { format: "%Y-%m-%d", date: "$start" } },
"segments": { $push: "$$ROOT" }
}},
{$sort: {_id: -1}},
]);
This does the grouping and returns all the segments in an array.
I want to also aggregate the information inside data.counts, so that I get the sum of values for all keys that are the same within the daily group.
This would save me from having another service loop through each 15 minute segment summing values with the same keys. E.g. the query would return something like this:
{
"_id" : "2019-02-27",
"counts" : {
"5b434d05da1f0e00252566be" : 351,
"5b434d05da1f0e00252566cc" : 194,
"5b434d05da1f0e00252566ca" : 111
... any other keys that were found within a day
}
}
How might I amend the query I already have, or use a different query?
Thanks!
You could use the $facet pipeline stage to create two sub-pipelines: one for segments and another for counts. Their outputs can be joined by using $zip to pair them up and $map with $mergeObjects to merge each 2-element array that $zip produces. Note that this only works correctly if the sub-pipelines output sorted arrays of the same size, which is why each sub-pipeline groups by start_date and then sorts on the resulting _id.
Here's the query:
db.getCollection("segments").aggregate([{
$match: {
recording: ObjectId("5b1a654683552d002516ac16")
}
}, {
$project: {
start: 1,
recording: 1,
data: 1,
start_date: { $dateToString: { format: "%Y-%m-%d", date: "$start" }}
}
}, {
$facet: {
segments_pipeline: [{
$group: {
_id: "$start_date",
segments: {
$push: {
start: "$start",
recording: "$recording",
data: "$data"
}
}
}
}, {
$sort: {
_id: -1
}
}],
counts_pipeline: [{
$project: {
start_date: "$start_date",
count: { $objectToArray: "$data.counts" }
}
}, {
$unwind: "$count"
}, {
$group: {
_id: {
start_date: "$start_date",
count_id: "$count.k"
},
count_sum: { $sum: "$count.v" }
}
}, {
$group: {
_id: "$_id.start_date",
counts: {
$push: {
$arrayToObject: [[{
k: "$_id.count_id",
v: "$count_sum"
}]]
}
}
}
}, {
$project: {
counts: { $mergeObjects: "$counts" }
}
}, {
$sort: {
_id: -1
}
}]
}
}, {
$project: {
result: {
$map: {
input: { $zip: { inputs: ["$segments_pipeline", "$counts_pipeline"] }},
in: { $mergeObjects: "$$this" }
}
}
}
}, {
$unwind: "$result"
}, {
$replaceRoot: {
newRoot: "$result"
}
}])
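To check what the counts side of this pipeline should produce, the following plain-JavaScript sketch sums the values of equal keys per day on in-memory data. The short keys a, b and c are hypothetical stand-ins for the ObjectId-like keys in the question, and `dailyCounts` is an assumed helper name.

```javascript
// Hypothetical segments; a, b and c stand in for the real count keys.
const segments = [
  { start: new Date("2018-06-08T11:45:00.000Z"), data: { counts: { a: 12, b: 4 } } },
  { start: new Date("2018-06-08T12:00:00.000Z"), data: { counts: { a: 3, c: 1 } } },
  { start: new Date("2018-06-09T00:00:00.000Z"), data: { counts: { b: 7 } } },
];

function dailyCounts(docs) {
  const days = {};
  for (const doc of docs) {
    // toISOString is always UTC, like $dateToString without a timezone.
    const day = doc.start.toISOString().slice(0, 10);
    if (!days[day]) days[day] = {};
    // Object.entries plays the role of $objectToArray plus $unwind.
    for (const [key, value] of Object.entries(doc.data.counts)) {
      days[day][key] = (days[day][key] || 0) + value;
    }
  }
  return days;
}

const totals = dailyCounts(segments);
console.log(totals["2018-06-08"]); // a: 15, b: 4, c: 1
console.log(totals["2018-06-09"]); // b: 7
```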
I have a field that matches this:
"invoice_items" : [
{
"price" : 10,
"quantity" : 600
},
{
"price" : 499.99,
"quantity" : 1
}
]
I want to aggregate it and get a total: multiply price by quantity for each item and add the results together. There can be any number of items in the array. The last pipeline I tried was:
[{
$match: {
"is_closed": false
}
},
{
$group: {
_id: "$_id",
invoice_items: "$invoice_items"
}
},
{
$project: {
_id: 0,
count: {
$multiply: ["$invoice_items.price", "$invoice_items.quantity"]
}
}
}]
I thought this would return something, but it only failed with the error "The field 'invoice_items' must be an accumulator object".
You can use the aggregation below.
You need $map to iterate over the invoice_items array, multiplying price by quantity for each item, and $sum to add up the results:
db.collection.aggregate([
{ "$project": {
"count": {
"$sum": {
"$map": {
"input": "$invoice_items",
"in": { "$multiply": ["$$this.price", "$$this.quantity"] }
}
}
}
}}
])
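The arithmetic behind the $map and $sum combination can be sketched in plain JavaScript as follows. The invoice is the one from the question; `invoiceTotal` is a hypothetical helper name.

```javascript
// The invoice_items array from the question.
const invoice = {
  invoice_items: [
    { price: 10, quantity: 600 },
    { price: 499.99, quantity: 1 },
  ],
};

// Mirrors $map (price * quantity per item) followed by $sum.
function invoiceTotal(doc) {
  return doc.invoice_items.reduce(
    (sum, item) => sum + item.price * item.quantity,
    0
  );
}

console.log(invoiceTotal(invoice)); // approximately 6499.99
```

An empty invoice_items array gives a total of 0 here.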