How to get argmax/argmin of multiple fields simultaneously in mongodb? - mongodb

Here's the data example I'm working with.
[
{
"uid": "111",
"a": 1,
"b": 3,
"c": 1
},
{
"uid": "222",
"a": 2,
"b": 2,
"c": 2
},
{
"uid": "333",
"a": 3,
"b": 1,
"c": 3
}
]
Then I want to perform argmax on fields "a" and "b", and argmin on field "c" and return the "uid" as the result.
For example:
For "a", its maximum value is 3 and the corresponding "uid" is "333", so the argmax of "a" should be "uid": "333".
The question is what query should be executed so that I can get the result as below?
[
{
"argmax_of_a": "333",
"argmax_of_b": "111",
"argmin_of_c": "111"
}
]
Here's the code snippet I'm playing with: https://mongoplayground.net/p/gEDuHd-aCiZ
I can find a way to get the argmax/argmin of one specific field, but I have no idea how to work on multiple fields simultaneously.
Thanks in advance!

Give this aggregation pipeline a try:
db.collection.aggregate(
[
{
$group: {
_id: null,
a: { $push: { uid: '$uid', val: '$a' } },
b: { $push: { uid: '$uid', val: '$b' } },
c: { $push: { uid: '$uid', val: '$c' } }
}
},
{
$project: {
_id: 0,
// find the index of the max (or min) value, then pick the element at that index
max_of_a: { $arrayElemAt: ["$a", { $indexOfArray: ["$a.val", { $max: '$a.val' }] }] },
max_of_b: { $arrayElemAt: ["$b", { $indexOfArray: ["$b.val", { $max: '$b.val' }] }] },
// the question asks for the argmin of "c", so use $min here
min_of_c: { $arrayElemAt: ["$c", { $indexOfArray: ["$c.val", { $min: '$c.val' }] }] }
}
},
{
$project: {
argmax_of_a: '$max_of_a.uid',
argmax_of_b: '$max_of_b.uid',
argmin_of_c: '$min_of_c.uid'
}
}
])
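To sanity-check what the pipeline should return, here is a minimal plain-JavaScript sketch of the same argmax/argmin idea. It does not touch MongoDB at all; the helper name argBest is made up for illustration, and the sample data is copied from the question.

```javascript
// Sample documents copied from the question.
const docs = [
  { uid: "111", a: 1, b: 3, c: 1 },
  { uid: "222", a: 2, b: 2, c: 2 },
  { uid: "333", a: 3, b: 1, c: 3 },
];

// Hypothetical helper: return the uid of the document whose `field`
// is best according to `better`. On ties the first document wins,
// which mirrors $indexOfArray returning the first matching index.
function argBest(items, field, better) {
  let best = items[0];
  for (const item of items) {
    if (better(item[field], best[field])) best = item;
  }
  return best.uid;
}

const result = {
  argmax_of_a: argBest(docs, "a", (x, y) => x > y),
  argmax_of_b: argBest(docs, "b", (x, y) => x > y),
  argmin_of_c: argBest(docs, "c", (x, y) => x < y),
};

console.log(result);
```

For the sample data this gives "333" for a, "111" for b and "111" for c, which is the result asked for in the question.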

Related

Get current state from snapshot documents - mongoDB

I'm trying to get a list of current holders at specific times from a collection. My collection looks like this:
[
{
"time": 1,
"holdings": [
{ "owner": "A", "tokens": 2 },
{ "owner": "B", "tokens": 1 }
]
},
{
"time": 2,
"holdings": [
{ "owner": "B", "tokens": 2 }
]
},
{
"time": 3,
"holdings": [
{ "owner": "A", "tokens": 3 },
{ "owner": "B", "tokens": 1 },
{ "owner": "C", "tokens": 1 }
]
},
{
"time": 4,
"holdings": [
{ "owner": "C", "tokens": 0 }
]
}
]
tokens shows the current holdings of an owner, but only if the holdings have changed compared to the previous document. I would like to change the collection so that holdings always includes the full current holdings for any point in time.
At time: 1, the holdings are: A: 2, B: 1.
At time: 2, the holdings are: A: 2, B: 2. The collection does not include A's holdings, however, because they haven't changed. So what I'd like to get is:
[
{
"time": 1,
"holdings": [
{ "owner": "A", "tokens": 2 },
{ "owner": "B", "tokens": 1 }
]
},
{
"time": 2,
"holdings": [
{ "owner": "A", "tokens": 2 }, // merged from prev doc.
{ "owner": "B", "tokens": 2 }
]
},
{
"time": 3,
"holdings": [
{ "owner": "A", "tokens": 3 },
{ "owner": "B", "tokens": 1 },
{ "owner": "C", "tokens": 1 }
]
},
{
"time": 4,
"holdings": [
{ "owner": "A", "tokens": 3 }, // merged from prev
{ "owner": "B", "tokens": 1 }, // merged from prev
{ "owner": "C", "tokens": 0 }
]
}
]
From what I understand $mergeObjects does that, but I don't understand how I can merge all previous docs in order up to the current doc for each doc. So I'm looking for a way to combine setWindowFields with mergeObjects I think.
This is a nice challenge.
So far I have this rather complicated solution:
1. Collect all timestamps across all documents. This is the purpose of the first 4 stages; $setWindowFields is used to accumulate them.
2. $group by owner and compute the timestamps missing for that owner as wantedTimes (next 5 stages).
3. $set the missing timestamps with tokens: null, to be filled with actual data, and $unwind to separate them (next 3 stages).
4. Use $setWindowFields to find the last known tokens for each owner at each timestamp.
5. Fill in this last known state for entries with unknown tokens (2 stages).
6. $group and format the answer:
db.collection.aggregate([
{
$setWindowFields: {
sortBy: {time: 1},
output: {
allTimes: {$addToSet: "$time", window: {documents: ["unbounded", "current"]}
}
}
}
},
{
$setWindowFields: {
sortBy: {time: -1},
output: {
allTimes: {$addToSet: "$allTimes", window: {documents: ["unbounded", "current"]}
}
}
}
},
{
$set: {
allTimes: {
$reduce: {
input: "$allTimes",
initialValue: [],
in: {"$concatArrays": ["$$value", "$$this"]}
}
}
}
},
{$set: {allTimes: {$setIntersection: "$allTimes"}}},
{$unwind: "$holdings"},
{$sort: {time: 1}},
{$group: { _id: "$holdings.owner",
tokens: {$push: {tokens: "$holdings.tokens", time: "$time"}},
times: {$push: "$time"}, firstTime: {$first: "$time"},
allTimes: {$first: "$allTimes"}}
},
{
$addFields: {
wantedTimes: {
$filter: {
input: "$allTimes",
as: "item",
cond: {$gte: ["$$item", "$firstTime"]}
}
}
}
},
{
$project: {
tokens: 1,
wantedTimes: {$setDifference: ["$wantedTimes", "$times"]}
}
},
{
$set: {
data: {
$map: {
input: "$wantedTimes",
as: "item",
in: {time: "$$item", tokens: null}
}
}
}
},
{$project: {tokens: {"$concatArrays": ["$tokens", "$data"]}}},
{$unwind: "$tokens"},
{
$setWindowFields: {
partitionBy: "$_id",
sortBy: {"tokens.time": 1},
output: {
lastTokens: {
$push: "$tokens.tokens",
window: {documents: ["unbounded", "current"]}
}
}
}
},
{
$set: {
lastTokens: {
$filter: {
input: "$lastTokens",
as: "item",
cond: {$ne: ["$$item", null]}
}
}
}
},
{
$set: {
"tokens.tokens": {$ifNull: ["$tokens.tokens", {$last: "$lastTokens"}]}
}
},
{
$group: {
_id: "$tokens.time",
holdings: {$push: {owner: "$_id", tokens: "$tokens.tokens" }}
}
},
{$project: {time: "$_id", holdings: 1, _id: 0}},
{$sort: {time: 1}}
])
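For comparison, the carry-forward behaviour this pipeline implements can be sketched in plain JavaScript. This is only an illustration of the expected result, not a replacement for the aggregation; the function name fillForward is made up, and the snapshots are copied from the question.

```javascript
// Snapshot documents copied from the question.
const snapshots = [
  { time: 1, holdings: [{ owner: "A", tokens: 2 }, { owner: "B", tokens: 1 }] },
  { time: 2, holdings: [{ owner: "B", tokens: 2 }] },
  { time: 3, holdings: [{ owner: "A", tokens: 3 }, { owner: "B", tokens: 1 }, { owner: "C", tokens: 1 }] },
  { time: 4, holdings: [{ owner: "C", tokens: 0 }] },
];

// Hypothetical helper: walk the snapshots in time order and keep the
// latest known token count per owner, emitting the full state each time.
function fillForward(docs) {
  const current = new Map(); // owner -> latest known tokens
  return [...docs]
    .sort((x, y) => x.time - y.time)
    .map((doc) => {
      for (const { owner, tokens } of doc.holdings) current.set(owner, tokens);
      const holdings = [...current.entries()]
        .sort(([o1], [o2]) => (o1 < o2 ? -1 : o1 > o2 ? 1 : 0)) // stable owner order
        .map(([owner, tokens]) => ({ owner, tokens }));
      return { time: doc.time, holdings };
    });
}

const filled = fillForward(snapshots);
console.log(JSON.stringify(filled, null, 1));
```

At time 2 this yields A: 2, B: 2 and at time 4 it yields A: 3, B: 1, C: 0, matching the output requested in the question.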
From a performance perspective, I recommend splitting this into 2 calls. The first is a quick findOne just to get the maximum time value in the collection.
Once you have that value, the pipeline can be much leaner:
// Node.js driver syntax: sort is passed as an option, since findOne does not return a cursor
const maxItem = await db.collection.findOne({}, { sort: { time: -1 } });
db.collection.aggregate([
{
$unwind: "$holdings"
},
{
$group: {
_id: "$holdings.owner",
times: {
$push: {
time: "$time",
tokens: "$holdings.tokens"
}
},
minTime: {
$min: "$time"
}
}
},
{
$addFields: {
times: {
$reduce: {
input: {
$range: [
"$minTime",
maxItem.time + 1 // this is max time
]
},
initialValue: {
values: [],
lastIndex: 0
},
in: {
values: {
"$concatArrays": [
"$$value.values",
[
{
$cond: [
{
$in: [
"$$this",
"$times.time"
]
},
{
"$arrayElemAt": [
"$times",
"$$value.lastIndex"
]
},
{
"$mergeObjects": [
{
tokens: 0
},
{
"$arrayElemAt": [
"$times",
{
$subtract: [
"$$value.lastIndex",
1
]
}
]
},
{
time: "$$this"
}
]
}
]
}
]
]
},
lastIndex: {
$cond: [
{
$in: [
"$$this",
"$times.time"
]
},
{
$sum: [
"$$value.lastIndex",
1
]
},
"$$value.lastIndex"
]
}
}
}
}
}
},
{
$unwind: "$times.values"
},
{
$group: {
_id: "$times.values.time",
holdings: {
$push: {
owner: "$_id",
tokens: "$times.values.tokens"
}
}
}
},
{
$project: {
_id: 0,
time: "$_id",
holdings: 1
}
},
{
$sort: {
time: 1
}
}
])
This is still quite a heavy query, as it has to $unwind and $group the entire collection; however, there is no way around this given the requirements. If the collection is too big for this approach, I recommend iterating owner by owner, or time by time, and doing separate updates accordingly.
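What the $reduce stage does for a single owner can be sketched in plain JavaScript as follows. The sketch assumes, as the pipeline does through $range, that time values are integers, and that the owner's times array is in ascending time order. The function name fillOwnerTimeline is hypothetical.

```javascript
// times: [{ time, tokens }] for ONE owner, i.e. what the $group stage
// collects for that owner. minTime/maxTime bound the range to fill.
function fillOwnerTimeline(times, minTime, maxTime) {
  const known = new Map(times.map((t) => [t.time, t.tokens]));
  const values = [];
  let last = 0; // same default as the { tokens: 0 } fallback in the pipeline
  for (let t = minTime; t <= maxTime; t++) {
    if (known.has(t)) last = known.get(t); // a snapshot exists: use it
    values.push({ time: t, tokens: last }); // otherwise carry the last value
  }
  return values;
}

// Owner "A" from the question has snapshots at time 1 and 3; max time is 4.
const timelineA = fillOwnerTimeline(
  [{ time: 1, tokens: 2 }, { time: 3, tokens: 3 }],
  1,
  4
);
console.log(timelineA);
```

Each owner starts at their own minTime, so an owner never shows up in a snapshot before their first recorded holding.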
If you don't care about performance at all and want a single query, you can still use the same pipeline. You first have to extract the max time in the collection, which requires an initial $group stage, like so:
db.collection.aggregate([
{
$group: {
_id: null,
maxTime: {
$max: "$time"
},
roots: {
$push: "$$ROOT"
}
}
},
{
$unwind: "$roots"
},
{
$replaceRoot: {
newRoot: {
"$mergeObjects": [
"$roots",
{
maxTime: "$maxTime"
}
]
}
}
},
... same pipeline ...
])

How can I get a single item from the array and display it as an object? and not as an array Mongodb

I have a collection from which I need a specific element of the notes array (e.g. block 2 and curse 5) as an object, not as an array.
{
"year":2020,
"grade":4,
"seccion":"A",
"id": 100,
"name": "pedro",
"notes":[{"curse":5,
"block":1,
"score":{ "a1": 5,"a2": 10, "a3": 15}
},{"curse":5,
"block":2,
"score":{ "b1": 10,"b2": 20, "b3": 30}
}
]
}
My query
notas.find({
"$and":[{"grade":1},{"seccion":"A"},{"year":2020}]},
{"projection":{ "grade":1, "seccion":1,"name":1,"id":1,
"notes":{"$elemMatch":{"block":2,"curse":5}},"notes.score":1} })
It works, but returns notes as an array:
{
"_id": "55",
"id": 100,
"grade": 5,
"name": "pedro",
"seccion": "A",
"notes": [
{"score": { "b1": 10,"b2": 20, "b3": 30} }
]
}
But I need it like this: score at the same level as the other fields and, if it doesn't exist, an empty "score": {}
{
"year":2020,
"grade":5,
"seccion":"A",
"id": 100,
"name": "pedro",
"score":{ "b1": 10,"b2": 20, "b3": 30}
}
Demo - https://mongoplayground.net/p/XlJqR2DYW1X
You can use an aggregation query:
db.collection.aggregate([
{
$match: { // filter
"grade": 1,
"seccion": "A",
"year": 2020,
"notes": {
"$elemMatch": {
"block": 2,
"curse": 5
}
}
}
},
{ $unwind: "$notes" }, //break into individual documents
{
$match: { // match query on individual note
"notes.block": 2,
"notes.curse": 5
}
},
{
$project: { // projection
"grade": 1,
"seccion": 1,
"name": 1,
"id": 1,
"score": "$notes.score"
}
}
])
Update
Demo - https://mongoplayground.net/p/mq5Kue3UG42
Use $filter
db.collection.aggregate([
{
$match: {
"grade": 1,
"seccion": "A",
"year": 2020
}
},
{
$set: {
"score": {
"$filter": {
"input": "$notes",
"as": "note",
"cond": {
$and: [
{
$eq: [ "$$note.block",3]
},
{
$eq: [ "$$note.curse", 5 ]
}
]
}
}
}
}
},
{
$project: {
// projection
"grade": 1,
"seccion": 1,
"name": 1,
"id": 1,
"score": {
"$first": "$score.score"
}
}
}
])
If you want an empty object for score when no match is found, you can do this:
Demo - https://mongoplayground.net/p/dumax58kgrc
{
$set: {
score: {
$cond: [
{ $size: "$score" }, // check array length
{ $first: "$score" }, // true - take 1st
{ score: {} } // false - set empty object
]
}
}
},
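If doing this client-side is an option, the same "first matching note, else empty object" logic is short to write in plain JavaScript. This is only a sketch under that assumption; the function name flattenScore is made up and the sample document is copied from the question.

```javascript
// Sample document copied from the question.
const student = {
  year: 2020, grade: 4, seccion: "A", id: 100, name: "pedro",
  notes: [
    { curse: 5, block: 1, score: { a1: 5, a2: 10, a3: 15 } },
    { curse: 5, block: 2, score: { b1: 10, b2: 20, b3: 30 } },
  ],
};

// Hypothetical helper: drop the notes array and lift the score of the
// first note matching block/curse to the top level, or {} if none matches.
function flattenScore(doc, block, curse) {
  const { notes = [], ...rest } = doc;
  const match = notes.find((n) => n.block === block && n.curse === curse);
  return { ...rest, score: match ? match.score : {} };
}

console.log(flattenScore(student, 2, 5)); // score of block 2, curse 5
console.log(flattenScore(student, 3, 5)); // no such note: score is {}
```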

how to create an array on the output of a response using aggregate in Mongodb

I have in my collection a list of objects with this structure:
[
{
"country": "colombia",
"city":"medellin",
"calification": [
{
"_id": 1,
"stars": 5
},
{
"_id": 2,
"stars": 3
}
]
},
{
"country": "colombia",
"city":"manizales",
"calification": [
{
"_id": 1,
"stars": 5
},
{
"_id": 2,
"stars": 5
}
]
},
{
"country": "argentina",
"city":"buenos aires",
"calification": [
{
"_id": 1,
"stars": 5
},
]
},
{
"country": "perú",
"city":"cusco",
"calification": [
{
"_id": 3,
"stars": 3
},
]
}
]
I am trying to build a query whose output has one array per country. This is an example of the output I want.
avg would be the sum of 'stars' divided by calification.length
{
"colombia": [
{
"city": "medellin",
"avg": 4,
"calification": [
{
"_id": 1,
"stars": 5
},
{
"_id": 2,
"stars": 3
}
]
},
{
"city": "manizales",
"avg": 5,
"calification": [
{
"_id": 1,
"stars": 5
},
{
"_id": 2,
"stars": 3
}
]
}
],
"argentina": {
"city": "buenos aires",
"avg": 5,
"calification": [
{
"_id": 1,
"stars": 5
}
]
},
"peru": {
"city": "cusco",
"avg": 4,
"calification": [
{
"_id": 1,
"stars": 4
}
]
}
}
I am trying to do this:
Alcalde.aggregate([
{
$addFields: {
colombia: {
"$push": {
"$cond": [{ $eq: ["$country", "'Colombia'"] }, true, null]
}
}
}
},
{
$project: { colombia: "$colombia" }
}
])
How can I do it?
We can make it more elegant.
MongoDB has $avg operator, let's use it. Also, we can use $group operator to group cities for the same country.
At the end, by applying $replaceRoot + $arrayToObject** we transform the documents into the desired result.
** This is needed because we cannot use an expression such as {"$country": "$city"}.
$replaceRoot: { data: { "key": "val", "key2": "val2" } } --> { "key": "val", "key2": "val2" }
$arrayToObject: [ { k: "key", v: "val" }, { k: "key2", v: "val2" } ] --> { "key": "val", "key2": "val2" }
Try this one:
Alcalde.aggregate([
{
$group: {
_id: "$country",
city: {
$push: {
"city": "$city",
"avg": { $avg: "$calification.stars"},
"calification": "$calification"
}
}
}
},
{
$replaceRoot: {
newRoot: {
$arrayToObject: [ [{ "k": "$_id", "v": "$city"}] ]
}
}
}
])
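As a rough client-side analogue of the $group + $arrayToObject idea, here is a plain-JavaScript sketch. Note one difference: the pipeline returns one document per country, while this sketch builds a single object keyed by country. The function name groupByCountry is made up, and the data is a subset of the question's sample.

```javascript
// Subset of the sample documents from the question.
const cities = [
  { country: "colombia", city: "medellin", calification: [{ _id: 1, stars: 5 }, { _id: 2, stars: 3 }] },
  { country: "colombia", city: "manizales", calification: [{ _id: 1, stars: 5 }, { _id: 2, stars: 5 }] },
  { country: "argentina", city: "buenos aires", calification: [{ _id: 1, stars: 5 }] },
];

// Hypothetical helper: use the country value as a dynamic key (what
// $arrayToObject makes possible in the pipeline) and add avg per city.
function groupByCountry(docs) {
  const grouped = {};
  for (const { country, ...rest } of docs) {
    const stars = rest.calification.map((c) => c.stars);
    const avg = stars.length
      ? stars.reduce((sum, s) => sum + s, 0) / stars.length
      : null; // no ratings: no average
    if (!grouped[country]) grouped[country] = [];
    grouped[country].push({ ...rest, avg });
  }
  return grouped;
}

const byCountry = groupByCountry(cities);
console.log(JSON.stringify(byCountry, null, 1));
```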
EDIT: Generic way to populate city inner object
$$ROOT is a variable which stores the root document
$mergeObjects adds or overrides fields in the final object
Alcalde.aggregate([
{
$group: {
_id: "$country",
city: {
$push: {
$mergeObjects: [
"$$ROOT",
{
"avg": { "$avg": "$calification.stars" }
}
]
}
}
}
},
{
$project: {
"city.country": 0
}
},
{
$replaceRoot: {
newRoot: {
$arrayToObject: [
[ { "k": "$_id", "v": "$city" } ]
]
}
}
}
])

MongoDB aggregating multiple arrays of objects based on shared key

I'm writing a query to calculate multiple metrics for each user in my DB.
I've calculated all of the metrics, and have a structure like this
{
"metric1": [{"user_id": 1, "val": 13},{"user_id": 2, "val": 100}],
"metric2": [{"user_id": 2, "val": 29},{"user_id": 1, "val": 123}],
"metric3": [{"user_id": 1, "val": 46},{"user_id": 2, "val": 111}]
}
I'm trying to convert the above into this structure
{
"user_id": [1,2],
"metric1": [13, 100],
"metric2": [29,123],
"metric3": [46,111]
}
So that I can display a table showing each user and the three metrics (one metric per column, and one user per row).
Considering that your data is what you've said:
{
"metric1": [
{"id1": 1}, {"id2": 2}
],
"metric2": [
{"id2": 22}, {"id1": 11}
],
"metric3": [
{"id2": 222}, {"id1": 111}
]
}
all you have to do is use $unwind to break up the arrays and then $objectToArray to get access to the keys:
db.blah.aggregate([
{ $unwind: '$metric1' },
{ $unwind: '$metric2' },
{ $unwind: '$metric3' },
{ $project: {'metric1': { $objectToArray: '$metric1' }, 'metric2': { $objectToArray: '$metric2' }, 'metric3': { $objectToArray: '$metric3' }} },
{ $sort: { 'metric1.k' : -1} },
{ $sort: { 'metric2.k' : -1} },
{ $sort: { 'metric3.k' : -1} },
{ $unwind: '$metric1' },
{ $unwind: '$metric2' },
{ $unwind: '$metric3' },
{ $group: {
_id: null,
user_id: { $addToSet: '$metric1.k' },
metric1: { $addToSet: '$metric1.v' },
metric2: { $addToSet: '$metric2.v' },
metric3: { $addToSet: '$metric3.v' },
} },
{ $project: { _id: 0 } }
]).pretty()
which results in the following (note that $addToSet does not guarantee the order of the elements):
{
"user_id" : [
"id1",
"id2"
],
"metric1" : [
1,
2
],
"metric2" : [
11,
22
],
"metric3" : [
111,
222
]
}
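For the structure shown in the question itself (objects with user_id and val), a client-side sketch of the conversion could look like this. It is not the answer's pipeline; it aligns every metric column with a sorted user_id column, so for the sample data metric2 comes out as [123, 29] for users [1, 2]. The function name toColumns is made up.

```javascript
// Structure copied from the question (with the missing brace restored).
const metrics = {
  metric1: [{ user_id: 1, val: 13 }, { user_id: 2, val: 100 }],
  metric2: [{ user_id: 2, val: 29 }, { user_id: 1, val: 123 }],
  metric3: [{ user_id: 1, val: 46 }, { user_id: 2, val: 111 }],
};

// Hypothetical helper: build one column per metric, with row i of every
// column belonging to user_id[i]. Missing values become null.
function toColumns(input) {
  const ids = new Set();
  const lookup = {};
  for (const [name, rows] of Object.entries(input)) {
    lookup[name] = new Map(rows.map((r) => [r.user_id, r.val]));
    for (const r of rows) ids.add(r.user_id);
  }
  const user_id = [...ids].sort((x, y) => x - y);
  const out = { user_id };
  for (const name of Object.keys(input)) {
    out[name] = user_id.map((id) =>
      lookup[name].has(id) ? lookup[name].get(id) : null
    );
  }
  return out;
}

const table = toColumns(metrics);
console.log(table);
```

Keeping the columns aligned by index is what makes it safe to render one user per row afterwards.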

Use $size with $sort in array and sub array

Here's part of the structure of my collection:
_id: ObjectId("W"),
names: [
{
number: 1,
subnames: [ { id: "X", day: 1 }, { id: "Y", day: 10 }, { id: "Z", day: 2 } ],
list: ["A","B","C"],
day: 1
},
{
number: 2,
day: 5
},
{
number: 3,
subnames: [ { id: "X", day: 8 }, { id: "Z", day: 5 } ],
list: ["A","C"],
day: 2
},
...
],
...
I use this request:
db.publication.aggregate([
{ $match: { _id: ObjectId("W") } },
{ $group: {
_id: "$_id",
SizeName: { $first: { $size: { $ifNull: [ "$names", [] ] } } },
names: { $first: "$names" }
} },
{ $unwind: "$names" },
{ $sort: { "names.day": 1 } },
{ $group: {
_id: "$_id",
SzNames: { $sum: 1 },
names: { $push: {
number: "$names.number",
subnames: "$names.subnames",
list: "$names.list",
SizeList: { $size: { $ifNull: [ "$names.list", [] ] } }
} }
} }
]);
but I would now like to use $sort on my names array AND my subnames array to obtain this result (subnames may not exist):
_id: ObjectId("W"),
names: [
{
number: 2,
SizeList: 0,
day: 5
},
{
number: 3,
subnames: [ { id: "Z", day: 5 }, { id: "X", day: 8 } ],
list: ["A","C"],
SizeList: 2,
day: 2
},
{
number: 1,
subnames: [ { id: "X", day: 1 }, { id: "Z", day: 2 }, { id: "Y", day: 10 } ],
list: ["A","B","C"],
SizeList: 3,
day: 1
}
...
],
...
Can you help me?
You can do this, but with great difficulty. I for one would gladly vote for an inline version of $sort along the lines of the $map operator. That would make things so much easier.
For now though you need to de-construct and re-build the arrays after sorting. And you have to be very careful about this. Hence make false arrays with a single entry before processing $unwind:
db.publication.aggregate([
{ "$project": {
"SizeNames": {
"$size": {
"$ifNull": [ "$names", [] ]
}
},
"names": { "$ifNull": [{ "$map": {
"input": "$names",
"as": "el",
"in": {
"SizeList": {
"$size": {
"$ifNull": [ "$$el.list", [] ]
}
},
"SizeSubnames": {
"$size": {
"$ifNull": [ "$$el.subnames", [] ]
}
},
"number": "$$el.number",
"day": "$$el.day",
"subnames": { "$ifNull": [ "$$el.subnames", [0] ] },
"list": "$$el.list"
}
}}, [0] ] }
}},
{ "$unwind": "$names" },
{ "$unwind": "$names.subnames" },
{ "$sort": { "_id": 1, "names.subnames.day": 1 } },
{ "$group": {
"_id": {
"_id": "$_id",
"SizeNames": "$SizeNames",
"names": {
"SizeList": "$names.SizeList",
"SizeSubnames": "$names.SizeSubnames",
"number": "$names.number",
"list": "$names.list",
"day": "$names.day"
}
},
"subnames": { "$push": "$names.subnames" }
}},
{ "$sort": { "_id._id": 1, "_id.names.day": 1 } },
{ "$group": {
"_id": "$_id._id",
"SizeNames": { "$first": "$_id.SizeNames" },
"names": {
"$push": { "$cond": [
{ "$ne": [ "$_id.names.SizeSubnames", 0 ] },
{
"number": "$_id.names.number",
"subnames": "$subnames",
"list": "$_id.names.list",
"SizeList": "$_id.names.SizeList",
"day": "$_id.names.day"
},
{
"number": "$_id.names.number",
"list": "$_id.names.list",
"SizeList": "$_id.names.SizeList",
"day": "$_id.names.day"
}
]}
}
}},
{ "$project": {
"SizeNames": 1,
"names": {
"$cond": [
{ "$ne": [ "$SizeNames", 0 ] },
"$names",
[]
]
}
}}
])
You can kind of "hide away" the original empty array from the inner document as shown, but it's really difficult to remove all presence of the outer "names" array without pulling a similar conditional array "push" technique, and that really isn't a practical approach.
If all of this is just about sorting array elements in individual documents though, the aggregation framework should not be the tool to do this. It can be done as shown, but per document this is much easier to do in client side code.
Output:
{
"_id" : ObjectId("54b5cff8102f292553ce9bb5"),
"SizeNames" : 3,
"names" : [
{
"number" : 1,
"subnames" : [
{
"id" : "X",
"day" : 1
},
{
"id" : "Z",
"day" : 2
},
{
"id" : "Y",
"day" : 10
}
],
"list" : [
"A",
"B",
"C"
],
"SizeList" : 3,
"day" : 1
},
{
"number" : 3,
"subnames" : [
{
"id" : "Z",
"day" : 5
},
{
"id" : "X",
"day" : 8
}
],
"list" : [
"A",
"C"
],
"SizeList" : 2,
"day" : 2
},
{
"number" : 2,
"SizeList" : 0,
"day" : 5
}
]
}
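Since the answer points out that per-document sorting is much easier in client-side code, here is a minimal plain-JavaScript sketch of that alternative. It assumes the document has already been fetched; the function name sortNames is made up, and the sample data is taken from the question (without the _id).

```javascript
// Sample document based on the question.
const publication = {
  names: [
    { number: 1, subnames: [{ id: "X", day: 1 }, { id: "Y", day: 10 }, { id: "Z", day: 2 }], list: ["A", "B", "C"], day: 1 },
    { number: 2, day: 5 },
    { number: 3, subnames: [{ id: "X", day: 8 }, { id: "Z", day: 5 }], list: ["A", "C"], day: 2 },
  ],
};

// Hypothetical helper: sort names and each subnames array by day
// (ascending) and add the size fields, without mutating the input.
function sortNames(doc) {
  const byDay = (x, y) => x.day - y.day;
  const names = (doc.names || [])
    .map((n) => {
      const out = { ...n, SizeList: (n.list || []).length };
      if (n.subnames) out.subnames = [...n.subnames].sort(byDay); // may not exist
      return out;
    })
    .sort(byDay);
  return { ...doc, SizeNames: names.length, names };
}

const sorted = sortNames(publication);
console.log(JSON.stringify(sorted, null, 1));
```

For the sample document this gives the same ordering as the aggregation output above: names 1, 3, 2 by day, with each subnames array sorted by day as well.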