I need to find the quoteIds whose screening values differ, where the keys must start with 1, 2 or 3. Could you please help?
{
"quoteId": 1,
"screening": {
"101": 1,
"201": 1,
"301": 1,
"100": 1,
"200": 1,
"300": 1,
"111": 1,
"211": 1,
"311": 1
}
}
{
"quoteId": 2,
"screening": {
"101": 1,
"201": 1,
"301": 1,
"100": 1,
"200": 1,
"300": 1,
"111": 1,
"211": 2,
"311": 1
}
}
$set - Create a screenings array field: convert the screening object to an array of key-value documents (via $objectToArray), then use $filter with $regexMatch to keep only the entries whose key starts with 1, 2 or 3.
$match - Keep only the documents whose screenings array is not empty.
$unset - Remove the temporary screenings field.
db.collection.aggregate([
{
$set: {
screenings: {
$filter: {
input: {
"$objectToArray": "$screening"
},
cond: {
"$regexMatch": {
input: "$$this.k",
regex: "^(1|2|3)"
}
}
}
}
}
},
{
$match: {
screenings: {
$ne: []
}
}
},
{
$unset: "screenings"
}
])
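To make the $set and $match stages easier to follow, here is a minimal plain-JavaScript sketch of the same logic. It is an emulation only and was not run against a MongoDB server; the shortened sample documents and the quoteId 3 document are assumptions added for illustration.

```javascript
// Plain-JavaScript emulation of the $set + $match stages (illustration only).
const docs = [
  { quoteId: 1, screening: { "101": 1, "201": 1, "311": 1 } },
  { quoteId: 2, screening: { "101": 1, "211": 2, "311": 1 } },
  { quoteId: 3, screening: { "901": 1 } }, // hypothetical: no key starts with 1, 2 or 3
];

// Object.entries plays the role of $objectToArray: { k: v } -> [[k, v], ...].
// The filter plays the role of $filter + $regexMatch on the key.
function matchingKeys(screening) {
  return Object.entries(screening)
    .filter(([k]) => /^(1|2|3)/.test(k))
    .map(([k]) => k);
}

// Keep the documents whose filtered array is not empty ($match with $ne: []).
const kept = docs
  .filter((d) => matchingKeys(d.screening).length > 0)
  .map((d) => d.quoteId);

console.log(kept); // [ 1, 2 ]
```

Note that this filter inspects only the key names. It does not compare the values, so both quoteId 1 and quoteId 2 pass.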
Sample Mongo Playground
I am trying to write an aggregation query that counts deeply nested elements quickly. The collection is pretty big (100M docs / 1 TB / MongoDB 4.4), so any $unwind makes the task very slow. Please advise whether $reduce / $filter or some other faster option can be used:
Example document:
{
"_id": ObjectId("5c05984246a0201286d4b57a"),
f: "x",
"_a": [
{
"_onlineStore": {}
},
{
"_p": [
{
"pid": 1,
"s": {
"a": {
"t": [
{
id: 1,
"dateP": "20200-09-20",
lang: "EN"
},
{
id: 2,
"dateP": "20200-09-20",
lang: "En"
}
]
},
"c": {
"t": [
{
id: 3,
lang: "en"
},
{
id: 4,
lang: "En"
},
{
id: 5,
"dateP": "20300-09-23"
}
]
}
},
h: "Some data"
}
]
}
]
}
I need to count the number of "_a[]._p[].s.c.t[]" array elements where lang: $in:["En","en","EN","eN"]
Note: elements under "_a._p.s.a.t" or "_a._p.s.d.t" must not be included in the count.
Expected result 1:
{ count:2}
Expected result 2:
{
id: 3,
lang: "en"
},
{
id: 4,
lang: "En"
}
Please advise.
Thanks
1. Extended example that needs to be fixed: playground (count expected to be 8)
Here is my unwind version, but for a big collection it looks pretty expensive:
2. Playground unwind version (expensive)
db.myCollection.aggregate([
{
$project: {
count: {
$size: {
$filter: {
input: "$_a._p.s.t",
as: "t",
cond: { $ne: ["$$t", null] }
}
}
}
}
}
])
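As a rough picture of the $unwind-free logic the question asks about, the sketch below walks the nested arrays per document in plain JavaScript and filters only the s.c.t elements. It is not a MongoDB pipeline and was not benchmarked; the function name matchingT and the trimmed sample document are assumptions for illustration. In a pipeline the same shape would presumably be expressed with nested $map / $reduce and $filter inside a single $project.

```javascript
// Languages accepted by the question's $in list.
const LANGS = new Set(["En", "en", "EN", "eN"]);

// Collect the matching elements under _a[]._p[].s.c.t[] only;
// s.a.t and s.d.t are never visited.
function matchingT(doc) {
  return (doc._a ?? []).flatMap((a) =>
    (a._p ?? []).flatMap((p) =>
      (p.s?.c?.t ?? []).filter((t) => LANGS.has(t.lang))
    )
  );
}

// Trimmed copy of the example document (dateP fields omitted).
const doc = {
  f: "x",
  _a: [
    { _onlineStore: {} },
    {
      _p: [
        {
          pid: 1,
          s: {
            a: { t: [{ id: 1, lang: "EN" }, { id: 2, lang: "En" }] },
            c: { t: [{ id: 3, lang: "en" }, { id: 4, lang: "En" }, { id: 5 }] },
          },
          h: "Some data",
        },
      ],
    },
  ],
};

const hits = matchingT(doc);
console.log({ count: hits.length }); // { count: 2 }
console.log(hits.map((t) => t.id)); // [ 3, 4 ]
```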
Documents
{ color: 'red',
value: {
red: {
level1: {
level2: 5
}}}}
{ color: 'blue',
value: {
blue: {
level1: {
level2: 8
}}}}
How to aggregate the values of value.red.level1.level2 and value.blue.level1.level2?
The keys red and blue come from the key color.
@turivishal requested more info:
I want to use $bucket.
{ '$bucket': {
groupBy: '$value.*red*.level1.level2',
boundaries: [1,2,3,4,5,6,7,8,9],
output: {
count: { '$sum': 1 }}}}
The expected result would be
[{ id: 5, count: 1}, { id: 8, count: 1 }]
You can access it by converting the object to an array of objects:
$objectToArray converts an object to an array of objects in k (key) / v (value) format
$arrayElemAt gets the first element of an array; you can use it directly in $bucket's groupBy property
db.collection.aggregate([
{
$addFields: {
value: { $objectToArray: "$value" }
}
},
{
$bucket: {
groupBy: { $arrayElemAt: ["$value.v.level1.level2", 0] },
boundaries: [1, 2, 3, 4, 5, 6, 7, 8, 9],
output: {
count: { $sum: 1 }
}
}
}
])
Playground
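For readers who want to trace the two operators by hand, here is a small plain-JavaScript sketch of what the pipeline above computes for the sample documents. It only emulates the behavior; the helper names groupByValue and bucketId are assumptions made for this illustration.

```javascript
const docs = [
  { color: "red", value: { red: { level1: { level2: 5 } } } },
  { color: "blue", value: { blue: { level1: { level2: 8 } } } },
];
const boundaries = [1, 2, 3, 4, 5, 6, 7, 8, 9];

// $objectToArray + $arrayElemAt 0: take the value stored under the first (only) key.
function groupByValue(doc) {
  return Object.values(doc.value)[0].level1.level2;
}

// $bucket: the _id of a bucket is the lower bound of the half-open range [lower, upper).
function bucketId(x) {
  for (let i = 0; i < boundaries.length - 1; i++) {
    if (x >= boundaries[i] && x < boundaries[i + 1]) return boundaries[i];
  }
  // $bucket likewise raises an error when no default bucket is given.
  throw new Error("value outside boundaries");
}

const counts = new Map();
for (const d of docs) {
  const id = bucketId(groupByValue(d));
  counts.set(id, (counts.get(id) ?? 0) + 1);
}
const result = [...counts].map(([_id, count]) => ({ _id, count }));
console.log(result); // [ { _id: 5, count: 1 }, { _id: 8, count: 1 } ]
```

The upper boundary is exclusive, so with these boundaries a level2 value of 9 would fall outside every bucket.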
In the second approach, you can do everything directly inside $bucket's groupBy property using the $let operator.
If there are no projection stages before $bucket, this approach avoids the extra stage.
db.collection.aggregate([
{
$bucket: {
groupBy: {
$let: {
vars: { value: { $objectToArray: "$value" } },
in: { $arrayElemAt: ["$$value.v.level1.level2", 0] }
}
},
boundaries: [1, 2, 3, 4, 5, 6, 7, 8, 9],
output: {
count: { $sum: 1 }
}
}
}
])
Playground
I ended up using $addFields and $ifNull.
$ifNull takes an array of expressions and returns the first value that is not null.
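The poster did not show the resulting stage, so the following is only a guess at the idea, written as plain JavaScript rather than as a pipeline: try each color path in turn and keep the first value that is present. The ifNull helper is a hypothetical stand-in for the operator, and passing several input expressions to $ifNull requires, as far as I know, MongoDB 5.0 or later.

```javascript
// Stand-in for $ifNull with several inputs:
// returns the first value that is neither null nor missing, else null.
function ifNull(...values) {
  const found = values.find((v) => v !== null && v !== undefined);
  return found === undefined ? null : found;
}

const docs = [
  { color: "red", value: { red: { level1: { level2: 5 } } } },
  { color: "blue", value: { blue: { level1: { level2: 8 } } } },
];

// Roughly: { $ifNull: ["$value.red.level1.level2", "$value.blue.level1.level2"] }
const picked = docs.map((d) =>
  ifNull(d.value.red?.level1?.level2, d.value.blue?.level1?.level2)
);
console.log(picked); // [ 5, 8 ]
```

This only works when the set of possible colors is known in advance, since each path has to be listed explicitly.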
Using mongodb, I have a collection of documents where each document has a fixed length vector of floating point values such as below:
items = [
{"id": "1", "vec": [1, 2, 0]},
{"id": "2", "vec": [6, 4, 1]},
{"id": "3", "vec": [3, 2, 2]},
]
I would like to take the row wise average of these vectors. In this example I would expect the result to return
[ (1 + 6 + 3) / 3, (2 + 4 + 2) / 3, (0 + 1 + 2) / 3 ]
This answer is very close to what I am looking for, but as far as I can tell it will only work on vectors of size 2. mongoDB - average on array values
An answer has been provided that is not very performant for large arrays. For context I am using ~700 dimension vectors.
This should work: https://mongoplayground.net/p/PKXqmmW31nW
[
{
$group: {
_id: null,
a: {
$push: {
$arrayElemAt: ["$vec", 0]
}
},
b: {
$push: {
$arrayElemAt: ["$vec", 1]
}
},
c: {
$push: {
$arrayElemAt: ["$vec", 2]
}
}
}
},
{
$project: {
a: {
$avg: "$a"
},
b: {
$avg: "$b"
},
c: {
$avg: "$c"
}
}
}
]
Which outputs:
[
{
"_id": null,
"a": 3.3333333333333335,
"b": 2.6666666666666665,
"c": 1
}
]
Here's a more efficient version without the $avg operator. I'll leave the other answer up for reference.
https://mongoplayground.net/p/rVERc8YjKZv
db.collection.aggregate([
{
$group: {
_id: null,
a: {
$sum: {
$arrayElemAt: ["$vec", 0]
}
},
b: {
$sum: {
$arrayElemAt: ["$vec", 1]
}
},
c: {
$sum: {
$arrayElemAt: ["$vec", 2]
}
},
totalDocuments: {
$sum: 1
}
}
},
{
$project: {
a: {
$divide: ["$a", "$totalDocuments"]
},
b: {
$divide: ["$b", "$totalDocuments"]
},
c: {
$divide: ["$c", "$totalDocuments"]
}
}
}
])
You can use $unwind to get the values into separate documents; the key is to keep the index of each value. Then you can $group by the index and calculate the average using the $avg operator.
db.collection.aggregate([
{
$unwind: {
path: "$vec",
includeArrayIndex: "i" // unwind and keep index
}
},
{
$group: {
_id: "$i", // group by index
avg: { $avg: "$vec" }
}
}, // at this stage, you already get all the values you need, in separate documents. The following stages will put all the values in an array
{
$sort: { _id: 1 }
},
{
$group: {
_id: null,
avg: { $push: "$avg" }
}
}
])
Mongo Playground
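The "group by index" idea is independent of the vector length, which matters for the ~700-dimension case. A minimal plain-JavaScript sketch of the same computation, outside MongoDB, might look like this; columnAverages is a hypothetical helper name, and the sketch assumes every vec has the same length.

```javascript
const items = [
  { id: "1", vec: [1, 2, 0] },
  { id: "2", vec: [6, 4, 1] },
  { id: "3", vec: [3, 2, 2] },
];

// Sum the values position by position (the "group by index" step),
// then divide each sum by the number of documents.
function columnAverages(rows) {
  const sums = new Array(rows[0].vec.length).fill(0);
  for (const { vec } of rows) {
    vec.forEach((x, i) => {
      sums[i] += x;
    });
  }
  return sums.map((s) => s / rows.length);
}

console.log(columnAverages(items)); // averages for index 0, 1 and 2
```

Keeping one running sum per index means the work grows with documents times dimensions, without hard-coding a field per dimension as the a / b / c answers do.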
I need to write a MongoDB aggregation pipeline to count the objects having arrays containing two type of values:
>=10
>=20
This is my dataset:
[
{ values: [ 1, 2, 3] },
{ values: [12, 1, 3] },
{ values: [1, 21, 3] },
{ values: [1, 2, 29] },
{ values: [22, 9, 2] }
]
This would be the expected output
{
has10s: 4,
has20s: 3
}
Mongo's $in (aggregation) seems to be the tool for the job, except I can't get it to work.
This is my (non working) pipeline:
db.mytable.aggregate([
{
$project: {
"has10s" : {
"$in": [ { "$gte" : [10, "$$CURRENT"]}, "$values"]}
},
"has20s" : {
"$in": [ { "$gte" : [20, "$$CURRENT"]}, "$values"]}
}
},
{ $group: { ... sum ... } }
])
The output of $in seems to be always true. Can anyone help?
You can try something like this:
db.collection.aggregate([{
$project: {
_id: 0,
has10: {
$size: {
$filter: {
input: "$values",
as: "item",
cond: { $gte: [ "$$item", 10 ] }
}
}
},
has20: {
$size: {
$filter: {
input: "$values",
as: "item",
cond: { $gte: [ "$$item", 20 ] }
}
}
}
}
},
{
$group: {
_id: 1,
has10: { $sum: "$has10" },
has20: { $sum: "$has20" }
}
}
])
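This uses $project with $filter to get the matching elements, and then $size to get the length of the filtered array. Note that summing these sizes counts matching elements, not documents; the two agree for the sample data only because no document holds more than one matching value.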
See it working here
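To show the difference between counting matching elements and counting documents, here is a short plain-JavaScript sketch over the question's dataset. It is an illustration only, not a pipeline; the function names and the extra document are assumptions added for the example.

```javascript
const data = [
  { values: [1, 2, 3] },
  { values: [12, 1, 3] },
  { values: [1, 21, 3] },
  { values: [1, 2, 29] },
  { values: [22, 9, 2] },
];

// What summing $size of $filter yields: the number of matching elements.
function countElements(docs, min) {
  return docs.reduce((n, d) => n + d.values.filter((v) => v >= min).length, 0);
}

// What the question asks for: the number of documents with at least one match.
function countDocs(docs, min) {
  return docs.filter((d) => d.values.some((v) => v >= min)).length;
}

console.log({ has10s: countDocs(data, 10), has20s: countDocs(data, 20) });
// { has10s: 4, has20s: 3 }

// The two counts diverge once a single document holds several matches.
const extra = [{ values: [15, 25] }]; // hypothetical document
console.log(countElements(extra, 10), countDocs(extra, 10)); // 2 1
```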
I have such a collection of documents:
{ _id: 1, Meters: { gasmeter: 1000.0 } }
{ _id: 2, Meters: { gasmeter: 1007.0 } }
{ _id: 3, Meters: { gasmeter: 1010.0 } }
And I am trying to get the difference between the gasmeter elements as such:
{ difference: 7 } // Difference between _id=1 and _id=2
{ difference: 3 } // Difference between _id=2 and _id=3
I've tried:
db.MeterData.aggregate([{$project: { item: 1, difference: {$subtract: [ {$add: [ "$Meters.gasmeter","$Meters.gasmeter" ] }, "$Meters.gasmeter" ] }}}])
But this does not work. I still get the same values.
Any idea how to do that aggregation with mongodb?
Since I'm not entirely sure how you wish to solve the problem (in C# or in Mongo itself), I've come up with this in Mongo.
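db.MeterData.find().sort({ _id: 1 }).forEach(
function (doc) {
// Look up the document with the next-higher _id in the same collection.
// Sorting makes sure we get the closest one rather than an arbitrary match.
var next = db.MeterData.find({
_id: {
"$gt": doc._id
}
}).sort({ _id: 1 }).limit(1).toArray()[0];
if (next != null) {
var diff = next.Meters.gasmeter - doc.Meters.gasmeter;
printjson("{ difference : " + diff + "} - Difference between Id=" + doc._id + " and Id=" + next._id);
}
})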
Which returns
"{ difference : 7} - Difference between Id=1 and Id=2"
"{ difference : 3} - Difference between Id=2 and Id=3"
I hope this helps you a bit.
Starting in MongoDB 5.0, this is a perfect use case for the new $setWindowFields aggregation stage:
// { id: 1, gasmeter: 1000 }
// { id: 2, gasmeter: 1007 }
// { id: 3, gasmeter: 1010 }
db.collection.aggregate([
{ $setWindowFields: {
sortBy: { id: 1 },
output: {
difference: {
$push: "$gasmeter",
window: { range: [-1, "current"] }
}
}
}},
// { id: 1, gasmeter: 1000, difference: [1000] }
// { id: 2, gasmeter: 1007, difference: [1000, 1007] }
// { id: 3, gasmeter: 1010, difference: [1007, 1010] }
{ $match: { $expr: { $eq: [{ $size: "$difference" }, 2] } } },
// { id: 2, gasmeter: 1007, difference: [1000, 1007] }
// { id: 3, gasmeter: 1010, difference: [1007, 1010] }
{ $set: {
difference: { $subtract: [{ $last: "$difference" }, { $first: "$difference" }] }
}}
])
// { id: 2, gasmeter: 1007, difference: 7 }
// { id: 3, gasmeter: 1010, difference: 3 }
This:
starts with a $setWindowFields aggregation stage, which adds the difference field to each document (output: { difference: { ... }})
as an array of gasmeter values built with $push ($push: "$gasmeter")
from the specified span of documents (the window), which here is the "current" document and the previous one "-1": window: { range: [-1, "current"] }. Note that range is measured in values of the sortBy field, so it reaches exactly the previous document only because the ids are consecutive integers; documents: [-1, "current"] would select the previous document by position instead.
then filters out the first document, which has no previous document (i.e. we can't diff its value with a previous document's value):
to do so we simply check that the size of the array produced by the previous stage is 2: { $match: { $expr: { $eq: [{ $size: "$difference" }, 2] } } },
and finally subtracts the 2 values in the array to get the actual difference: $subtract: [{ $last: "$difference" }, { $first: "$difference" }]
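The three steps can also be traced in plain JavaScript. The sketch below is an emulation under the assumption of a positional window (previous document plus current one); it is not MongoDB code, and consecutiveDifferences is a hypothetical helper name.

```javascript
const readings = [
  { id: 1, gasmeter: 1000 },
  { id: 2, gasmeter: 1007 },
  { id: 3, gasmeter: 1010 },
];

function consecutiveDifferences(docs) {
  // sortBy: { id: 1 }
  const sorted = [...docs].sort((a, b) => a.id - b.id);
  return sorted
    // Window: the previous document (if any) plus the current one.
    .map((doc, i) => ({ doc, window: sorted.slice(Math.max(0, i - 1), i + 1) }))
    // Drop the first document, whose window holds a single element.
    .filter(({ window }) => window.length === 2)
    // Last minus first, as in the final $set stage.
    .map(({ doc, window }) => ({
      ...doc,
      difference: window[1].gasmeter - window[0].gasmeter,
    }));
}

console.log(consecutiveDifferences(readings));
// [ { id: 2, gasmeter: 1007, difference: 7 }, { id: 3, gasmeter: 1010, difference: 3 } ]
```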