MongoDB Aggregation: Combine two arrays - mongodb

I have the following type of documents stored in a collection.
{
"_id" : "318036:2014010100",
"data": [
{"flow": [6, 10, 12], "occupancy": [0.0356, 0.06, 0.0856], time: 0},
{"flow": [2, 1, 4], "occupancy": [0.01, 0.0056, 0.0422], time: 30},
...
]
}
I want to compute an aggregated value from the first, second, ..., nth values of the flow and occupancy arrays, preserving the order within the arrays. Assuming I want to compute the sum, the result should look like the following:
{
"_id" : "318036:2014010100",
"data": [
{"flow": [6, 10, 12], "occupancy": [0.0356, 0.06, 0.0856], sum: [6.0356, 10.06, 12.0856], time: 0},
{"flow": [2, 1, 4], "occupancy": [0.01, 0.0056, 0.0422], sum: [2.01, 1.0056, 4.0422], time: 30},
...
]
}
I tried to solve this with the aggregation framework, but my current approach does not preserve the ordering and produces too many sums.
db.sens.aggregate([
{$match: {"_id":/^318036:/}},
{$limit: 1},
{$unwind: "$data"},
{$unwind: "$data.flow"},
{$unwind: "$data.occupancy"},
{
$group: {
_id: {id: "$_id", time: "$data.time", o: "$data.occupancy", f: "$data.flow", s: {$add: ["$data.occupancy", "$data.flow"]}}
}
},
{
$group: {
_id: {id: "$_id.id", time: "$_id.time"}, occ: { $addToSet: "$_id.o"}, flow: {$addToSet: "$_id.f"}, speed: {$addToSet: "$_id.s"}
}
}
])
I am not sure if it is possible to solve this problem with the aggregation framework, so a solution using MapReduce would also be fine. How can I produce the desired result?

An alternative solution that uses neither the aggregation framework nor map/reduce:
db.sens.find().forEach(function (doc) {
    // Add an element-wise sum array to every entry of doc.data.
    doc.data.forEach(function (dataElement) {
        var sumArray = [];
        for (var j = 0; j < dataElement.flow.length; j++) {
            sumArray[j] = dataElement.flow[j] + dataElement.occupancy[j];
        }
        dataElement.sum = sumArray;
    });
    // Write the document back once, after all entries have been updated.
    db.sens.replaceOne({_id: doc._id}, doc);
});
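If the sums only need to be computed at query time rather than stored, an aggregation pipeline can also preserve the element order. The following is a minimal sketch, assuming MongoDB 3.6 or later (for $mergeObjects) and assuming flow and occupancy always have the same length. The names sumPipeline and sumArrays are made up for illustration; sumArrays only mirrors the pipeline's logic in plain JavaScript so it can be checked without a server.

```javascript
// Hypothetical pipeline: for every entry of data, add a sum array whose
// i-th element is flow[i] + occupancy[i]. Order is preserved because the
// arrays are indexed by position instead of being unwound.
const sumPipeline = [
    {$match: {_id: /^318036:/}},
    {$addFields: {
        data: {
            $map: {
                input: "$data",
                as: "d",
                in: {
                    $mergeObjects: [
                        "$$d",
                        {sum: {
                            $map: {
                                input: {$range: [0, {$size: "$$d.flow"}]},
                                as: "i",
                                in: {$add: [
                                    {$arrayElemAt: ["$$d.flow", "$$i"]},
                                    {$arrayElemAt: ["$$d.occupancy", "$$i"]}
                                ]}
                            }
                        }}
                    ]
                }
            }
        }
    }}
];

// Plain JavaScript mirror of the same logic (illustrative helper only).
function sumArrays(doc) {
    return Object.assign({}, doc, {
        data: doc.data.map(function (d) {
            const sum = d.flow.map(function (f, i) {
                return f + d.occupancy[i];
            });
            return Object.assign({}, d, {sum: sum});
        })
    });
}
```

In the shell the pipeline would be run as db.sens.aggregate(sumPipeline).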

Related

mongo - get count of a intermediary document

Currently I have a query:
const result = await getInstances();
that provides me with an array of documents:
[{name: "first", age: 13},
{name: "second", age: 21},
{name: "third", age: 11},
{name: "fourth", age: 14}
...]
The query goes something like this:
...
return Instances.aggregate
.match({//condition})
.skip(skipValue).limit(pageSize) // pagination done here
I want a query that returns the paginated data together with a count of the total number of documents before pagination, e.g.:
...
return Instances.aggregate
.match({//condition}) ## I WANT THE COUNT OF THIS STEP TO BE APPENDED
.<SOME_PIPELINE_HERE>
.skip(skipValue).limit(pageSize) // pagination done here
would return something like:
{
data: [{name: "first", age: 12}....<ALL_PAGINATED_DATA>],
totalCount: 54 #count of data before pagination
}
What I tried that didn't work:
Instances.aggregate()
.match({//CONDITION})
.addFields({count: {$size: _id}})
.skip(value).limit(value)
It seems this is calculated for each document instead of for the whole result set.
One option is to use $facet to "fork" the query in the middle, so the same data can be used by different pipelines. For example:
db.collection.aggregate([
{$match: {a: {$in: [7, 8, 9]}}},
{
$facet: {
total: [{$group: {_id: null, count: {$sum: 1}}}],
data: [{$skip: 1}]
}
},
{$project: {data: 1, total: {$first: "$total.count"}}}
])
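Applied to the pagination case from the question, the same idea could be wrapped in a small helper. This is only a sketch: buildPagedPipeline and its parameters are hypothetical names, and it uses the $count stage plus $arrayElemAt with $ifNull so that totalCount falls back to 0 when nothing matches.

```javascript
// Hypothetical helper: builds a pipeline that returns one document of the
// form {data: [...page...], totalCount: <matches before pagination>}.
function buildPagedPipeline(condition, skipValue, pageSize) {
    return [
        {$match: condition},
        {$facet: {
            // Counts every matched document, ignoring pagination.
            total: [{$count: "count"}],
            // Applies pagination to the same matched documents.
            data: [{$skip: skipValue}, {$limit: pageSize}]
        }},
        {$project: {
            data: 1,
            // "$total.count" is an array with zero or one element.
            totalCount: {$ifNull: [{$arrayElemAt: ["$total.count", 0]}, 0]}
        }}
    ];
}

// Example: third page of ten documents for an assumed condition.
const pipeline = buildPagedPipeline({age: {$gt: 10}}, 20, 10);
```

The resulting array can be passed to any aggregate call that accepts a pipeline array.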

Is it possible to update a document using its own properties?

I've tried this to no avail. Here's the example:
db.examples.insert({
_id: 1,
topLevelValue: 10,
values: [
{value: 10},
{value: 14},
{value: 5}
]
})
db.examples.updateOne({_id: 1}, {
$set: {
'values.$.calculatedValue': {
$divide: ['values.$.value', '$topLevelValue']
}
}
})
The expected outcome is that every element of values gets a new field calculatedValue, which is the result of its value divided by topLevelValue.
I am not saying this SHOULD work; I am asking whether what I want to do is possible via MongoDB updates.
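One approach that may work, assuming MongoDB 4.2 or later where update commands accept an aggregation pipeline, is to rebuild the values array with $map so that each element can reference topLevelValue. This is a sketch rather than a confirmed answer; updatePipeline and addCalculatedValues are illustrative names, and the plain JavaScript function only mirrors what the pipeline is expected to do.

```javascript
// Pipeline-style update (note the array brackets): expressions such as
// "$topLevelValue" are evaluated against the document being updated.
const updatePipeline = [
    {$set: {
        values: {
            $map: {
                input: "$values",
                as: "v",
                in: {
                    $mergeObjects: [
                        "$$v",
                        {calculatedValue: {$divide: ["$$v.value", "$topLevelValue"]}}
                    ]
                }
            }
        }
    }}
];
// In the shell: db.examples.updateOne({_id: 1}, updatePipeline)

// Plain JavaScript mirror of the pipeline (illustrative helper only).
function addCalculatedValues(doc) {
    return Object.assign({}, doc, {
        values: doc.values.map(function (v) {
            return Object.assign({}, v, {calculatedValue: v.value / doc.topLevelValue});
        })
    });
}
```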

Sum reversed Arrays on Mongodb

Supposing I have the following situation on mongodb:
{
"_id" : 654321,
"first_name" : "John",
"demand" : [1, 20, 4, 10 ],
"group" : [1, 2]
}
{
"_id" : 654321,
"first_name" : "Bert",
"demand" : [4, 10 ],
"group" : [1, 3]
}
1 - Is it possible to groupby based on the first index of "group" array ([1]) ?
2- Is it possible to reverse the index order, and sum those demand arrays vertically ?
Desired output:
1 - Select only group.0 : 1
2 - reverse the array order $reverseArray
[1, 20, 4, 10 ] -> [10, 4, 20, 1] (reversed)
[4, 10] -> [10, 4] (reversed)
3 - Sum (vertical axis)
[20, 8, 20, 1]
Finally, return the normal order:
[1, 20, 8, 20]
1 - Is it possible to groupby based on the first index of "group"
array ([1]) ?
To get the first index position (i.e., 0; array indexes start from 0) use the $arrayElemAt aggregation operator:
db.collection.aggregate([ { $group: { _id: { $arrayElemAt: [ "$group", 0 ] } } }, ] )
2- Is it possible to reverse the index order, and sum those demand
arrays vertically ?
You can reverse an array using the $reverseArray aggregation array operator.
To sum the values at each element position, (i) unwind the array with the includeArrayIndex option to capture the index of each value, and then (ii) group by that index and sum the values.
db.collection.aggregate( [
{
$addFields: {
demand: { $reverseArray: "$demand" }
}
},
{
$unwind: { path: "$demand", includeArrayIndex: "ix" }
},
{
$group: {
_id: "$ix",
sum: { $sum: "$demand" }
}
},
{
$sort: { _id: 1 } // this is optional; sorts by index position
}
] )
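The pipeline above stops at one sum per index. To follow the whole procedure end to end, including the group filter and the final reversal back to the original order, here is a plain JavaScript sketch of the same steps. The function name sumReversed is made up for illustration, and it assumes every demand array holds only numbers.

```javascript
function sumReversed(docs, groupId) {
    const sums = [];
    docs
        // Step 1: keep documents whose first group element matches.
        .filter(function (doc) { return doc.group[0] === groupId; })
        .forEach(function (doc) {
            // Step 2: reverse a copy so the last elements line up at index 0.
            doc.demand.slice().reverse().forEach(function (value, ix) {
                // Step 3: add up values that share an index.
                sums[ix] = (sums[ix] || 0) + value;
            });
        });
    // Step 4: restore the original orientation.
    return sums.reverse();
}

const docs = [
    {first_name: "John", demand: [1, 20, 4, 10], group: [1, 2]},
    {first_name: "Bert", demand: [4, 10], group: [1, 3]}
];
console.log(sumReversed(docs, 1)); // [ 1, 20, 8, 20 ]
```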

How to add every other columns together in Mongo?

I've been cracking my head over the addition of every 'other' columns together during aggregation in Mongo.
A sample of my data:
[
{'item': 'X',
'USA': 3,
'CAN': 1,
'CHN': 1,
'IDN': 1,
:
:
:
},
{'item': 'R',
'USA': 2,
'CAN': 2,
'CHN': 1,
'IDN': 2,
:
:
:
}
]
At the aggregate stage, I would like to have a new field called 'OTHER', which is the sum of all the fields that are not explicitly specified.
My desired result is this:
[
{'item': 'X',
'NAM': 79,
'IDN': 51,
'OTHER': 32
},
{'item': 'R',
'NAM': 42,
'IDN': 11,
'OTHER': 20
}
]
So far, the closest I could get is using this:
mycoll.aggregate([
{'$addFields':{
'NAM': {'$add':[{'$ifNull':['$CAN', 0]},{'$ifNull':['$USA', 0]}]},
'INDIA': {'$ifNull':['$IDN', 0]},
'OTHER': /* $add all the fields that are not $USA, $CAN, $IDN*/
}},
])
Mongo gurus, please enlighten this poor soul. Deeply appreciate it. Thanks!
The general idea is to convert the document into an array of key/value pairs with $objectToArray, so that we can iterate over it while skipping the unwanted fields.
{
    '$addFields': {
        'NAM': {'$add': [{'$ifNull': ['$CAN', 0]}, {'$ifNull': ['$USA', 0]}]},
        'INDIA': {'$ifNull': ['$IDN', 0]},
        'OTHER': {
            '$reduce': {
                // Turn the document into [{k: <field name>, v: <value>}, ...]
                'input': {'$objectToArray': '$$ROOT'},
                // Accumulate a plain number so OTHER ends up numeric
                'initialValue': 0,
                'in': {
                    '$cond': {
                        // Skip fields that are handled elsewhere or are not numeric
                        'if': {'$in': ['$$this.k', ['_id', 'item', 'CAN', 'USA', 'IDN']]},
                        'then': '$$value',
                        'else': {'$add': ['$$value', '$$this.v']}
                    }
                }
            }
        }
    }
}
Obviously, you should also add to the exclusion list any other fields in your documents that you do not want to sum up or that are not numeric.
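The same key/value iteration can be sketched in plain JavaScript to make the logic easier to follow. The helper name sumOtherFields and the JPN field in the example are hypothetical; unlike the pipeline, this version skips non-numeric values with a typeof check instead of requiring them to be listed.

```javascript
function sumOtherFields(doc, excluded) {
    // Object.entries plays the role of $objectToArray: [[key, value], ...].
    return Object.entries(doc).reduce(function (sum, entry) {
        const key = entry[0];
        const value = entry[1];
        if (excluded.includes(key) || typeof value !== "number") {
            return sum; // named or non-numeric field: leave the total unchanged
        }
        return sum + value;
    }, 0);
}

const row = {item: "X", USA: 3, CAN: 1, CHN: 1, IDN: 1, JPN: 2};
const other = sumOtherFields(row, ["_id", "item", "CAN", "USA", "IDN"]);
console.log(other); // 3
```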

Remove all leafs from graph

I've built a relations graph in a MongoDB collection, for example:
{ "user_id": 1, "follower_id": 2 }
{ "user_id": 1, "follower_id": 3 }
{ "user_id": 2, "follower_id": 1 }
{ "user_id": 2, "follower_id": 3 }
{ "user_id": 3, "follower_id": 4 }
{ "user_id": 5, "follower_id": 2 }
This represents a directed graph like this:
Is there an efficient way to remove "leafs" from the graph? In the example I'd like to remove node 4 from the graph, because that node only has one link with node 3 and remove node 5 because only node 2 links to it.
Or to say it with graph terminology: only keep vertices with indegree > 1 or outdegree > 1
The short answer is no: there is no efficient way to do what you want with a schema like this. It can be done by iterating over all nodes, for example with the aggregation framework, and removing the leaf nodes in a separate operation, but I think that is all that can be done. Assuming the edges are stored in a collection named graph, it could look something like the code below, but it is far from efficient:
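db.graph.aggregate([
    // Emit each edge twice: index 0 for the user_id end, index 1 for the follower_id end.
    {$project: {index: {$literal: [0, 1]}, user_id: 1, follower_id: 1}},
    {$unwind: "$index"},
    {$project: {id: {$cond: [{$eq: ["$index", 0]}, "$user_id", "$follower_id"]}}},
    // Count how many edges touch each node; leaves are touched by at most one.
    {$group: {_id: "$id", count: {$sum: 1}}},
    {$match: {count: {$lte: 1}}}
]).forEach(function (node) {
    // Remove every edge that starts or ends at the leaf node.
    db.graph.deleteMany({$or: [{user_id: node._id}, {follower_id: node._id}]});
});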
You could use a more document-like schema if you want operations like this to be efficient:
{
user_id: 1,
follows: [2, 3],
followed_by: [2]
}
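For reference, the counting step of the aggregation can be sketched in plain JavaScript over an in-memory edge list. The function name findLeaves is made up for illustration; like the pipeline's $match on count, it treats a node as a leaf when at most one edge touches it.

```javascript
function findLeaves(edges) {
    const degree = new Map();
    edges.forEach(function (edge) {
        // Every edge contributes one to the degree of both of its endpoints.
        [edge.user_id, edge.follower_id].forEach(function (id) {
            degree.set(id, (degree.get(id) || 0) + 1);
        });
    });
    // Keep only the node ids touched by at most one edge, in ascending order.
    return Array.from(degree.keys())
        .filter(function (id) { return degree.get(id) <= 1; })
        .sort(function (a, b) { return a - b; });
}

// The edge list from the question.
const edges = [
    {user_id: 1, follower_id: 2},
    {user_id: 1, follower_id: 3},
    {user_id: 2, follower_id: 1},
    {user_id: 2, follower_id: 3},
    {user_id: 3, follower_id: 4},
    {user_id: 5, follower_id: 2}
];
console.log(findLeaves(edges)); // [ 4, 5 ]
```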