Averaging across list indices during aggregation pipeline?

I currently have a MongoDB aggregation pipeline that ends with the following type of synthetic documents
[
{
'_id': '2019-09-10',
'grouped_foos':
[
{... 'foo': [1, 78, 100]},
{... 'foo': [8, 66, 98]},
{... 'foo': [99, 5, 33]},
{... 'foo': [120, 32, 2]}
]
},
{
'_id': '2019-09-09',
'grouped_foos':
[
{... 'foo': [10, 27]},
{... 'foo': [19, 66]}
]
},
{
'_id': '2019-09-08',
'grouped_foos':
[
{... 'foo': [1]}
]
}
]
I would like to continue this pipeline and average the indices of the foo lists together to form documents that look like
[
{
'_id': '2019-09-10',
'avg_foo': [57, 45.25, 58.25]
},
{
'_id': '2019-09-09',
'avg_foo': [14.5, 46.5]
},
{
'_id': '2019-09-08',
'avg_foo': [1]
}
]
Is this type of averaging possible during aggregation? Do I potentially need to $unwind the lists with indexing and assign new _id for uniqueness to make documents that look like
[
{
'_id': UUID,
'old_id': '2019-09-10',
'foo': 1,
'index': 0
},
{
'_id': UUID,
'old_id': '2019-09-10',
'foo': 78,
'index': 1
},
........
]

You could do this with $unwind, but an easier and faster approach is to $reduce over grouped_foos with an inner $map that uses $add to add each foo element to the running sum at the same index. A second $map can then use $divide to divide each sum by the number of rows and get the average.
db.collection.aggregate([
{
$project: {
size: { $size: "$grouped_foos" },
foo_sum: {
$reduce: {
input: "$grouped_foos",
initialValue: [],
in: {
$map: {
input: { $range: [ 0, { $size: "$$this.foo" }, 1 ] },
as: "index",
in: {
$add: [
{ $arrayElemAt: [ "$$this.foo", "$$index" ] },
{ $ifNull: [ { $arrayElemAt: [ "$$value", "$$index" ] }, 0 ] }
]
}
}
}
}
}
}
},
{
$project: {
_id: 1,
avg_foo: {
$map: {
input: "$foo_sum",
in: {
$divide: [ "$$this", "$size" ]
}
}
}
}
}
])
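As a cross-check of the arithmetic, the same sum-then-divide logic can be sketched in plain JavaScript. This is only a model of what the $reduce and $map stages compute, not MongoDB code. The helper name averageByIndex is made up for this example, and it assumes every foo array in a group has the same length, as in the sample documents.

```javascript
// Plain-JavaScript model of the pipeline above (hypothetical helper name).
// Assumes all `foo` arrays within one group have the same length.
function averageByIndex(groupedFoos) {
  const size = groupedFoos.length; // mirrors { $size: "$grouped_foos" }
  // Mirrors $reduce with the inner $map: add each element to the running sum at its index.
  const sums = groupedFoos.reduce(
    (acc, row) => row.foo.map((v, i) => v + (acc[i] ?? 0)), // `?? 0` mirrors $ifNull
    []
  );
  // Mirrors the second $map with $divide.
  return sums.map((s) => s / size);
}

const sample = [
  { foo: [1, 78, 100] },
  { foo: [8, 66, 98] },
  { foo: [99, 5, 33] },
  { foo: [120, 32, 2] }
];
console.log(averageByIndex(sample)); // [ 57, 45.25, 58.25 ]
```

The result matches the avg_foo expected for '2019-09-10' in the question.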

Related

Mongoose aggregate: how to create fields dynamically from the user request

Please someone help me! I can't find the solution in documentation or other topics.
I'm using mongodb aggregation in Mongoose/Nest.js project to return the document data with some formatting and filtering. I have the structure of the mongo document like
{
_id: '1',
outputs: [
{
fileName: 'fileName1',
data: [
{
columnName1: 3,
columnName2: 4,
........
columnName30: 5
},
{
columnName1: 1,
columnName2: 2,
........
columnName30: 3
},
...........
]
},
{
fileName: 'fileName1',
data: [
{
columnName1: 3,
columnName2: 4,
........
columnName30: 5
},
{
columnName1: 1,
columnName2: 2,
........
columnName30: 3
},
...........
]
}
........
]
}
I've already done some formatting, but now I need the response to include only the fields requested by the user (columnNamesToChoose), with their values filtered by the gte and lte bounds on mainColumnName. Inside $project I was going to use a mapping like this, but it doesn't work. Could you please help me fix this part of the code?
...columnNamesToChoose.map((columnName) => ({ [columnName]: {
$map: {
input: {
$filter: {
input: '$outputs.data',
as: 'item',
cond: {
$and: [
{ $gte: [`$$item.${mainColumnName}`, gte] },
{ $lte: [`$$item.${mainColumnName}`, lte] },
],
},
},
},
as: 'file',
in: `$$file.${columnName}`,
},
} })),
This is the full code of aggregation:
mainColumnName = 'column1' (from the body of the user request)
columnNamesToChoose = ['column2', 'column5'] (from the body of the user request)
myModel.aggregate([
{
$match: { _id: Number(id) },
},
{ $unwind: '$outputs' },
{
$match: { 'outputs.fileName': fileName },
},
{
$project: {
_id: '$_id',
fileName: '$outputs.fileName',
[mainColumnName]: {
$map: {
input: {
$filter: {
input: '$outputs.data',
as: 'item',
cond: {
$and: [
{ $gte: [`$$item.${mainColumnName}`, gte] },
{ $lte: [`$$item.${mainColumnName}`, lte] },
],
},
},
},
as: 'file',
in: `$$file.${mainColumnName}`,
},
},
},
},
])
My result:
{
"0": {
"column2": [
4,
2,
1,
5
]
},
"1": {
"column5": [
1,
8,
9,
0
]
},
"_id": 1,
"fileName": "somefilename.txt",
"column1": [
3,
1,
2,
20
],
}
Expected result:
{
"_id": 1,
"fileName": "somefilename.txt",
"column1": [
3,
1,
2,
20
],
"column2": [
4,
2,
1,
5
],
"column5": [
1,
8,
9,
0
],
}

One option is to first $reduce and then $unwind, $match and $group, where the $group stage is built dynamically in code (with a for-loop) from the user's input:
db.collection.aggregate([
{$match: {_id: id}},
{$project: {
outputs: {
$reduce: {
input: "$outputs",
initialValue: [],
in: {
$concatArrays: [
"$$value",
{$cond: [
{$eq: ["$$this.fileName", fileName]},
"$$this.data",
[]
]
}
]
}
}
}
}
},
{$unwind: "$outputs"},
{$match: {"outputs.columnName1": {$gte: gte, $lte: lte}}},
{$group: {
_id: 0,
column1: {$push: "$outputs.columnName1"},
column2: {$push: "$outputs.columnName2"},
column5: {$push: "$outputs.columnName5"}
}},
{$set: {fileName: fileName}}
])
In JavaScript it would look something like:
const matchStage = {$match: {}};
matchStage.$match[`outputs.${mainColumnName}`] = {$gte: gte, $lte: lte};
const groupStage = {$group: {_id: 0}};
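// Include the main column too, as in the static pipeline above
for (const col of [mainColumnName, ...columnNamesToChoose]) {
// The field path must be a plain string such as "$outputs.column2", without embedded quotes
groupStage.$group[col] = {$push: `$outputs.${col}`};
}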
const aggregation = [
{$match: {_id: id}},
{$project: {
outputs: {$reduce: {
input: "$outputs",
initialValue: [],
in: {$concatArrays: [
"$$value",
{$cond: [
{$eq: ["$$this.fileName", fileName]},
"$$this.data",
[]
]}
]}
}}
}},
{$unwind: "$outputs"},
matchStage,
groupStage,
{$set: {fileName: fileName}}
];
const res = await myModel.aggregate(aggregation)
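As a side note on the question's own attempt: the "0" and "1" keys in its result most likely come from spreading an array into an object literal, which copies the array indices as keys. A minimal sketch of the difference follows; asProjection is a hypothetical helper, and the field path it returns is a simplified stand-in for the question's $map/$filter expression.

```javascript
const columnNamesToChoose = ['column2', 'column5']; // sample request values

// Hypothetical helper: builds a one-key object per requested column.
// The value is a simplified stand-in for the $map/$filter expression.
const asProjection = (columnName) => ({
  [columnName]: `$outputs.data.${columnName}`
});

// Spreading an array into an object literal copies its indices as keys.
const spread = { ...columnNamesToChoose.map(asProjection) };

// Object.assign merges the one-key objects into a single flat object instead.
const merged = Object.assign({}, ...columnNamesToChoose.map(asProjection));

console.log(Object.keys(spread)); // [ '0', '1' ]
console.log(Object.keys(merged)); // [ 'column2', 'column5' ]
```

Spreading merged (rather than the mapped array) into $project would put the requested columns at the top level next to _id and fileName.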

More than one 'in' expression within $map

I've recently been playing around with an array field within documents of a collection, and was wondering whether I could apply more than one expression inside in.
$project: {
"new_field": {
$map: {
input: "$test",
as: "item",
in: {$trim: {input: "$$item"}}
}
}
}
To avoid performing another projection, I was thinking of doing something like this, but it just fails with "An object representing an expression must have exactly one field".
$project: {
"new_field": {
$map: {
input: "$test",
as: "item",
in: {$trim: {input: "$$item"}, $concat: ["$$item", "testing"]}
}
}
}
Is my only option to just do another project step with all the fields?

You can apply more than one operation, but each one is evaluated independently against the original element; the second is not applied on top of the first expression's result.
Sample Query:
db.collection.aggregate([
{
$project: {
adjustedGrades: {
$map: {
input: "$quizzes",
as: "grade",
in: [
{
$add: [
"$$grade",
2
]
},
{
$add: [
"$$grade",
3
]
}
]
}
}
}
}
])
Difference:
I specified in as an array.
in: [
{
$add: [
"$$grade",
2
]
},
{
$add: [
"$$grade",
3
]
}
]
Sample input:
[
{
_id: 1,
quizzes: [
5,
6,
7
]
},
{
_id: 2,
quizzes: []
},
{
_id: 3,
quizzes: [
3,
8,
9
]
}
]
Sample output:
[
{
"_id": 1,
"adjustedGrades": [
[
7,
8
],
[
8,
9
],
[
9,
10
]
]
},
{
"_id": 2,
"adjustedGrades": []
},
{
"_id": 3,
"adjustedGrades": [
[
5,
6
],
[
10,
11
],
[
11,
12
]
]
}
]
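If the goal is instead to apply the second operator to the result of the first, the expressions can be nested rather than listed side by side. The sketch below uses the question's $trim and $concat; the projectStage constant and the sample strings are illustrative only.

```javascript
// Sketch: nest $trim inside $concat so both apply in sequence within one $map.
const projectStage = {
  $project: {
    new_field: {
      $map: {
        input: "$test",
        as: "item",
        // The `in` object still has exactly one top-level operator ($concat);
        // $trim is evaluated first and its result is passed to $concat.
        in: { $concat: [ { $trim: { input: "$$item" } }, "testing" ] }
      }
    }
  }
};

// Plain-JavaScript equivalent of the per-element logic, on made-up sample strings.
const chained = ["  a ", "b  "].map((item) => item.trim() + "testing");
console.log(chained); // [ 'atesting', 'btesting' ]
```

This keeps everything in a single $project stage, which is what the question was trying to achieve.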

MongoDB - Update a parent array field using another child array field

I've a collection like this
db.aa1.insertMany([
{ parentArr: [] },
{ parentArr: [
{childArr: [ {childField: 2}, {childField: 4} ]}
] },
{ parentArr: [
{childArr: []}
] },
{ parentArr: [
{childArr: [ {childField: 3}, {childField: 5} ]}
] },
])
Now I want the end result to be like
[
{ parentArr: [] },
{ parentArr: [ { childArr: [] } ] },
{ parentArr: [
{
childArr: [ {childField: 2}, {childField: 4} ],
parentField: [2, 4]
},
] },
{ parentArr: [
{
childArr: [ {childField: 3}, {childField: 5} ],
parentField: [3, 5]
}
] },
]
Here I've copied the childArr.childField values in the parentArr.parentField.
Now in plain JS, I could do something like this
parentArr.forEach(p => p.parentField = p.childArr ? p.childArr.map(c => c.childField) : [])
How can I achieve this using a MongoDB Query?
I've tried the following $push $set combinations, of course, one at a time.
For the example sake, I've written all push and set together.
db.myCollection.update(
{
"parentArr.childArr.0": {$exists: true}
},
{
$set: {"parentArr.$[].parentField": ["$parentArr.$[].childArr.$[].childField"]}
$set: {"parentArr.parentField": ["$parentArr.childArr.childField"]}
$push: {
"parentArr.$[].parentField": {$each: ["$parentArr.$[].childArr.$[].childField"]}
}
$push: {
"parentArr.parentField": {$each: ["$parentArr.childArr.childField"]}
}
},
{
upsert: true,
multi: true
}
)

If you're using MongoDB 4.2 or later, you can use pipeline updates (updates expressed as an aggregation pipeline), which give more power when updating:
db.aa1.updateMany(
{
"parentArr.childArr.childField": {$exists: true}
},
[
{
$set: {
"parentArr.parentField": {
$reduce: {
input: {
$map: {
input: "$parentArr",
as: "parent",
in: {
$map: {
input: "$$parent.childArr",
as: "child",
in: "$$child.childField"
}
}
}
},
initialValue: [],
in: {$setUnion: ["$$value", "$$this"]}
}
}
}
}
]
)
If you're on an older MongoDB version, you'll have to do it in application code; since you already posted a relevant snippet, I have nothing more to add.
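One caveat about the pipeline update above: it computes a single merged array per document and writes it to every element of parentArr, and $setUnion removes duplicates without guaranteeing order. If each parent should receive only its own children's values, a per-parent variant can be sketched with $map and $mergeObjects. This is an untested sketch rather than a definitive answer; the filter and update constants and the sample parents are introduced only for illustration.

```javascript
// Sketch (assumption: each parent should get only its own childField values).
const filter = { "parentArr.childArr.childField": { $exists: true } };
const update = [
  {
    $set: {
      parentArr: {
        $map: {
          input: "$parentArr",
          as: "parent",
          in: {
            // Copy the parent and add parentField from its own childArr.
            $mergeObjects: [
              "$$parent",
              { parentField: "$$parent.childArr.childField" }
            ]
          }
        }
      }
    }
  }
];
// Usage against a live server would be: db.aa1.updateMany(filter, update)

// Plain-JavaScript model of the per-parent logic, on made-up sample data.
const parents = [
  { childArr: [{ childField: 2 }, { childField: 4 }] },
  { childArr: [{ childField: 3 }] }
];
const perParent = parents.map((p) => ({
  ...p,
  parentField: p.childArr.map((c) => c.childField)
}));
console.log(perParent.map((p) => p.parentField)); // [ [ 2, 4 ], [ 3 ] ]
```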

MongoDB aggregations. Get value differences

I've been struggling with mongo trying to find a solution to show the differences between values.
I have values like this:
[
{val: 1},
{val: 4},
{val: 7},
{val: 8},
{val: 11}
]
And I want to receive something like this:
[
{diff: 3},
{diff: 3},
{diff: 1},
{diff: 3}
]
Every value is evaluated by taking the next one (4) and subtracting the previous one (1). After all this, we receive 3 in output, which is located in the second list as the first item.
Is it possible to achieve it using MongoDB aggregations?

You need to group them into an array, calculate the differences, and flatten the result again.
Pseudocode
// $group: collect all documents into a single array
const _data = [{val: 1}, {val: 4}, {val: 7}, {val: 8}, {val: 11}];
// $range: indices 0 .. n-2, so that every index has a next item
const idx = [...Array(_data.length - 1).keys()];
// $map: subtract each item from the one that follows it
const data = idx.map(i => ({diff: _data[i + 1].val - _data[i].val}));
// $unwind: one document per array element
// {data: {diff: 3}}, {data: {diff: 3}}, {data: {diff: 1}}, {data: {diff: 3}}
// $replaceRoot: promote the nested object to the top level
// {data: {diff: 3}} --> {diff: 3}
Add these steps into your pipeline:
db.collection.aggregate([
{
$group: {
_id: null,
data: { $push: "$$ROOT" }
}
},
{
$addFields: {
data: {
$map: {
input: {
$range: [
0,
{
$subtract: [
{ $size: "$data" },
1
]
},
1
]
},
as: "idx",
in: {
diff: {
$subtract: [
{
$arrayElemAt: [
"$data.val",
{
$add: [ "$$idx", 1 ]
}
]
},
{
$arrayElemAt: [ "$data.val", "$$idx" ]
}
]
}
}
}
}
}
},
{
$unwind: "$data"
},
{
$replaceRoot: {
newRoot: "$data"
}
}
])

Mongodb aggregation - count arrays with elements having integer value greater than

I need to write a MongoDB aggregation pipeline to count the objects whose arrays contain two types of values:
>=10
>=20
This is my dataset:
[
{ values: [ 1, 2, 3] },
{ values: [12, 1, 3] },
{ values: [1, 21, 3] },
{ values: [1, 2, 29] },
{ values: [22, 9, 2] }
]
This would be the expected output
{
has10s: 4,
has20s: 3
}
Mongo's $in (aggregation) seems to be the tool for the job, except I can't get it to work.
This is my (non working) pipeline:
db.mytable.aggregate([
{
$project: {
"has10s" : {
"$in": [ { "$gte" : [10, "$$CURRENT"]}, "$values"]}
},
"has20s" : {
"$in": [ { "$gte" : [20, "$$CURRENT"]}, "$values"]}
}
},
{ $group: { ... sum ... } }
])
The output of $in seems to be always true. Can anyone help?

You can try something like this:
db.collection.aggregate([{
$project: {
_id: 0,
has10: {
$size: {
$filter: {
input: "$values",
as: "item",
cond: { $gte: [ "$$item", 10 ] }
}
}
},
has20: {
$size: {
$filter: {
input: "$values",
as: "item",
cond: { $gte: [ "$$item", 20 ] }
}
}
}
}
},
{
$group: {
_id: 1,
has10: { $sum: "$has10" },
has20: { $sum: "$has20" }
}
}
])
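This uses $project with $filter to keep the matching elements and $size to count them; $group then sums those counts. Note that it counts matching elements rather than documents, so the totals equal the expected document counts only when each array has at most one matching value, as in the sample data.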
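If the counts should be per document regardless of how many elements match, one possible variant tests each array with $anyElementTrue and then sums a 1 or 0 per document. This is a sketch under that assumption, not a tested pipeline; hasAny and countWith are hypothetical helper names, and the plain-JavaScript check at the end only models the intended result on the question's dataset.

```javascript
// Hypothetical helper: true when any element of `values` is >= threshold.
const hasAny = (threshold) => ({
  $anyElementTrue: [
    { $map: { input: "$values", as: "item", in: { $gte: ["$$item", threshold] } } }
  ]
});

const pipeline = [
  { $project: { _id: 0, has10: hasAny(10), has20: hasAny(20) } },
  {
    $group: {
      _id: null,
      // Each document contributes 1 or 0, however many elements matched.
      has10s: { $sum: { $cond: ["$has10", 1, 0] } },
      has20s: { $sum: { $cond: ["$has20", 1, 0] } }
    }
  }
];

// Plain-JavaScript model of the intended result on the question's dataset.
const docs = [
  { values: [1, 2, 3] },
  { values: [12, 1, 3] },
  { values: [1, 21, 3] },
  { values: [1, 2, 29] },
  { values: [22, 9, 2] }
];
const countWith = (t) => docs.filter((d) => d.values.some((v) => v >= t)).length;
console.log({ has10s: countWith(10), has20s: countWith(20) }); // { has10s: 4, has20s: 3 }
```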
