MongoDB query: keys and their distinct values (each key independently)? - mongodb

If you have this collection of objects:
{ "a": 10, "b": 20, "c": 30 }
{ "a": 11, "b": 20, "c": 31 }
{ "a": 10, "b": 20, "c": 31 }
There is a way to get distinct values, for example, for field "a":
[10, 11]
There is also a way to get distinct values of any tuple, for example, for pairs of ("b", "c"):
[
{"b": 20, "c": 30},
{"b": 20, "c": 31}
]
Is there a way to query distinct values for each field individually in a single query?
For example, I can simply use query 1 above 3 times for "a", "b", "c":
[10, 11]
[20]
[30, 31]
But I guess it might be less efficient and there should be a better option.
Bonus: How to do it if the list of fields is not known upfront?
Ideally, the single query should return all keys and their distinct values:
{
"a": [10, 11],
"b": [20],
"c": [30, 31]
}

Assuming you don't know the full list of the fields beforehand, you need to use $objectToArray to convert the $$ROOT document into an array of k-v tuples. Then group by the field name and $addToSet the values.
db.collection.aggregate([
{
"$project": {
_id: 0,
arr: {
"$objectToArray": "$$ROOT"
}
}
},
{
"$unwind": "$arr"
},
{
$match: {
"arr.k": {
$ne: "_id"
}
}
},
{
$group: {
_id: "$arr.k",
values: {
"$addToSet": "$arr.v"
}
}
}
])
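The pipeline above returns one document per key (for example `{ _id: "a", values: [10, 11] }`) rather than the single object shown in the question. As far as I can tell, a further `$group` on `_id: null` that pushes `{ k, v }` pairs, followed by `$replaceRoot` with `$arrayToObject`, would fold them into one document. The following is only an in-memory JavaScript sketch of what the stages compute, not a server-side query; the function name `distinctValuesPerKey` is made up for illustration.

```javascript
// In-memory mirror of the pipeline: collect the distinct values of every key.
// Hypothetical helper for illustration; it runs on plain objects, not on MongoDB.
function distinctValuesPerKey(docs) {
  const sets = new Map();
  for (const doc of docs) {
    // Object.entries plays the role of $objectToArray on $$ROOT.
    for (const [key, value] of Object.entries(doc)) {
      if (key === "_id") continue; // same role as the $match stage
      if (!sets.has(key)) sets.set(key, new Set());
      sets.get(key).add(value); // same role as $addToSet (primitive values only)
    }
  }
  // Fold the per-key sets into one object, like $arrayToObject would.
  const result = {};
  for (const [key, values] of sets) result[key] = [...values];
  return result;
}

const sampleDocs = [
  { _id: 1, a: 10, b: 20, c: 30 },
  { _id: 2, a: 11, b: 20, c: 31 },
  { _id: 3, a: 10, b: 20, c: 31 },
];
console.log(distinctValuesPerKey(sampleDocs));
```

Note that `$addToSet` does not guarantee any ordering of the values, whereas this sketch keeps first-seen order.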

Related

How to add calculated fields inside subdocuments without using unwind?

I'm looking for a simple solution to add a field to a subdocument without using $unwind and $group.
I need only to calculate the sum and the size of a nested subdocuments and show it in a new field.
This is my starting collection:
{
"_id": ObjectId("5a934e000102030405000000"),
"subDoc": [
{
"a": 1,
"subArray": [1,2,3]
},
{
"b": 2
}
]
},
{
"_id": ObjectId("5a934e000102030405000001"),
"subDoc": [
{
"a": 1,
"subArray": [4,5,6]
},
{
"b": 2
},
{
"c": 3,
"subArray": [8,8,8]
}
]
}
And this is my desired result, where I've added sum (sum of subArray) and size (number of elements in subArray):
{
"_id": ObjectId("5a934e000102030405000000"),
"subDoc": [
{
"a": 1,
"subArray": [1,2,3],
"sum": 6,
"size": 3
},
{
"b": 2,
"sum": 0,
"size": 0
}
]
},
{
"_id": ObjectId("5a934e000102030405000001"),
"subDoc": [
{
"a": 1,
"subArray": [4,5,6],
"sum": 15,
"size": 3
},
{
"b": 2,
"sum": 0,
"size": 0
},
{
"c": 3,
"subArray": [8,8,8],
"sum": 24,
"size": 3
}
]
}
I know how to obtain this result using $unwind and then $group, but I'd like to know if there is any other way (or a better way!) to achieve the same result. I've tried using $addFields and $map without success.
Working playground example: https://mongoplayground.net/p/fK8t6SLlOHa
$map iterates over the elements of the subDoc array
$sum totals the numbers in subArray (it returns 0 when the field is missing)
$ifNull substitutes an empty array when subArray is missing or null, because $size accepts only an array
$size counts the elements in subArray
$mergeObjects merges the original element with the newly computed fields
db.collection.aggregate([
{
$addFields: {
subDoc: {
$map: {
input: "$subDoc",
in: {
$mergeObjects: [
"$$this",
{
sum: { $sum: "$$this.subArray" },
size: {
$size: { $ifNull: ["$$this.subArray", []] }
}
}
]
}
}
}
}
}
])
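To make the data flow of the pipeline above easier to follow, here is a rough in-memory JavaScript equivalent. It is a sketch on plain objects, not something the server runs, and the helper name `addSumAndSize` is hypothetical.

```javascript
// In-memory mirror of the $addFields / $map / $mergeObjects pipeline.
// Hypothetical helper for illustration only.
function addSumAndSize(doc) {
  return {
    ...doc,
    // Array.prototype.map plays the role of $map over subDoc.
    subDoc: doc.subDoc.map((item) => {
      // Same role as $ifNull: fall back to [] when subArray is missing or null.
      const arr = item.subArray ?? [];
      return {
        ...item, // same role as $mergeObjects with "$$this"
        sum: arr.reduce((total, n) => total + n, 0),
        size: arr.length,
      };
    }),
  };
}

console.log(
  JSON.stringify(addSumAndSize({ subDoc: [{ a: 1, subArray: [1, 2, 3] }, { b: 2 }] }))
);
```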

Get value from a field if another field matches a condition in MongoDB

For example, I have a document with this structure:
{
"_id": "1230987",
"Z": [{
"A": [{
"B": {
"C": [{
"E": "2104331180",
"D": "boroda.jpg"
}, {
"E": "1450987095",
"D": "small.PNG"
}]
},
}],
}]
}
How can I get the value of field E when the value of field D matches a condition?
Use an $elemMatch projection:
db.collection.find({ "Z.A.B.C.D": <condition> }, { "Z.A.B.C": { $elemMatch: { D: <condition> } } })
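Two caveats, stated with some caution: an `$elemMatch` projection returns only the first matching array element (the whole element, not just E), and to my knowledge MongoDB can reject `$elemMatch` projections on dotted (nested) paths, so the query may need to be reworked as an aggregation. The sketch below shows the intended extraction on a plain JavaScript object after the document has been fetched; `valuesOfEWhereDMatches` and its `predicate` parameter are made-up names for illustration.

```javascript
// Client-side sketch: walk Z -> A -> B.C and collect E wherever D matches.
// Hypothetical helper; `predicate` stands in for the <condition> placeholder.
function valuesOfEWhereDMatches(doc, predicate) {
  const results = [];
  for (const z of doc.Z ?? []) {
    for (const a of z.A ?? []) {
      for (const c of a.B?.C ?? []) {
        if (predicate(c.D)) results.push(c.E);
      }
    }
  }
  return results;
}

const sampleDoc = {
  _id: "1230987",
  Z: [{ A: [{ B: { C: [
    { E: "2104331180", D: "boroda.jpg" },
    { E: "1450987095", D: "small.PNG" },
  ] } }] }],
};
console.log(valuesOfEWhereDMatches(sampleDoc, (d) => d === "small.PNG"));
```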

Get number of documents per field in MongoDB, like Studio3T Schema operation

How do I obtain the distribution of fields in a MongoDB collection, i.e., the total count of documents for each field (without knowing the fields upfront)?
E.g. considering these documents :
{ "doc": { "a": …, "b": … } }
{ "doc": { "a": …, "c": … } }
{ "doc": { "a": …, "c": …, "d": { "e": … } } }
I would like to get
{ "a": 3, "b": 1, "c": 2, "d": 1, "d.e": 1 }
Studio3T has a "Schema" feature which does exactly that (and a bit more) for a random sample of the DB, how is the query constructed ?
One way, if the field names are known, is to use db.collection.countDocuments() with the $exists operator, once per field:
db.collection.countDocuments({ "doc.a": { $exists: true } });
db.collection.countDocuments({ "doc.b": { $exists: true } });
db.collection.countDocuments({ "doc.c": { $exists: true } });
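The `countDocuments` approach needs the field names in advance and one round trip per field. When the fields are unknown, one option is to discover them from the documents themselves. The sketch below does this in plain JavaScript on already-fetched documents (for example a sample of the collection), including nested paths such as "d.e". It is an assumption-laden illustration, not how Studio 3T builds its schema view; `collectPaths` and `countFieldPaths` are hypothetical names.

```javascript
// Collect every field path in a (possibly nested) plain object,
// e.g. { a: 1, d: { e: 2 } } yields "a", "d", "d.e".
function collectPaths(value, prefix, paths) {
  for (const [key, child] of Object.entries(value)) {
    const path = prefix ? `${prefix}.${key}` : key;
    paths.add(path);
    // Recurse into nested plain objects only; arrays are counted as leaves here.
    if (child !== null && typeof child === "object" && !Array.isArray(child)) {
      collectPaths(child, path, paths);
    }
  }
  return paths;
}

// Count, for each path under the "doc" field, how many documents contain it.
function countFieldPaths(docs) {
  const counts = {};
  for (const { doc } of docs) {
    for (const path of collectPaths(doc ?? {}, "", new Set())) {
      counts[path] = (counts[path] ?? 0) + 1;
    }
  }
  return counts;
}

console.log(
  countFieldPaths([
    { doc: { a: 1, b: 2 } },
    { doc: { a: 1, c: 3 } },
    { doc: { a: 1, c: 3, d: { e: 5 } } },
  ])
);
```

Driver-specific values such as ObjectId or Date instances are also objects in JavaScript, so a real implementation would need to treat them as leaves.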

MongoDB document structure: Map vs. Array for element-wise aggregations

We want to store ratings of a metric (say sales, profit) for some category (say city) in MongoDB. Example rating scale: [RED, YELLOW, GREEN]; the length is fixed. We are considering the following two document structures:
Structure 1: Ratings as an array
{
"_id": 1,
"city": "X",
"metrics": ["sales", "profit"],
"ratings" : {
"sales" : [1, 2, 3], // frequency of RED, YELLOW, GREEN ratings, fixed length array
"profit": [4, 5, 6],
},
}
{
"_id": 2,
"city": "X",
"metrics": ["sales", "profit"],
"ratings" : {
"sales" : [1, 2, 3], // frequency of RED, YELLOW, GREEN ratings, fixed length array
"profit": [4, 5, 6],
},
}
Structure 2: Ratings as a map
{
"_id": 1,
"city": "X",
"metrics": ["sales", "profit"],
"ratings" : {
"sales" : { // map will always have "RED", "YELLOW", "GREEN" keys
"RED": 1,
"YELLOW": 2,
"GREEN": 3
},
"profit" : {
"RED":4,
"YELLOW": 5,
"GREEN": 6
},
},
}
{
"_id": 2,
"city": "X",
"metrics": ["sales", "profit"],
"ratings" : {
"sales" : { // map will always have "RED", "YELLOW", "GREEN" keys
"RED": 1,
"YELLOW": 2,
"GREEN": 3
},
"profit" : {
"RED":4,
"YELLOW": 5,
"GREEN": 6
},
},
}
Our use case:
aggregate ratings grouped by city and metric
we do not intend to index on the "ratings" field
So for structure 1, aggregating the ratings requires element-wise aggregation, which will likely involve $unwind stages or map-reduce. The resulting document would look something like this:
{
"city": "X",
"sales": [2, 4, 6],
"profit": [8, 10, 12]
}
For structure 2, I think aggregation would be relatively straightforward using the aggregation pipeline, e.g. (aggregating just sales):
db.getCollection('Collection').aggregate([
{
$group: {
"_id": {"city": "$city" },
"sales_RED": {$sum: "$ratings.sales.RED"},
"sales_YELLOW": {$sum: "$ratings.sales.YELLOW"},
"sales_GREEN": {$sum: "$ratings.sales.GREEN"}
}
},
{
$project: {"_id": 0, "city": "$_id.city", "sales": ["$sales_RED", "$sales_YELLOW", "$sales_GREEN"]}
}
])
Would give the following result:
{
"city": "X",
"sales": [2, 4, 6]
}
Query:
I am leaning towards the second structure, mainly because I am not clear on how to achieve element-wise array aggregation in MongoDB. From what I have seen, it will probably involve unwinding. The second structure produces larger documents because the rating field names are repeated, but the aggregation itself is simple. Based on our use case, how do the two structures compare in computational efficiency, and am I missing any points worth considering?
I was able to achieve the aggregation with the array structure using $arrayElemAt. (However, this still involves having to specify aggregations for individual array elements, which is the same as the case for document structure 2)
db.getCollection('Collection').aggregate([
{
$group: {
"_id": {"city": "$city" },
"sales_RED": {$sum: { $arrayElemAt: [ "$ratings.sales", 0] }},
"sales_YELLOW": {$sum: { $arrayElemAt: [ "$ratings.sales", 1] }},
"sales_GREEN": {$sum: { $arrayElemAt: [ "$ratings.sales", 2] }},
}
},
{
$project: {"_id": 0, "city": "$_id.city", "sales": ["$sales_RED", "$sales_YELLOW", "$sales_GREEN"]}
}
])
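For reference, the element-wise sum that both pipelines compute can be written generically, without naming each array position. The sketch below does it in plain JavaScript on in-memory documents shaped like structure 1; it only illustrates the target computation and says nothing about server-side performance. The function name `sumRatingsByCity` is hypothetical.

```javascript
// Element-wise sum of the fixed-length ratings arrays, grouped by city.
// Hypothetical helper operating on plain objects shaped like structure 1.
function sumRatingsByCity(docs, metric) {
  const totals = new Map();
  for (const doc of docs) {
    const ratings = doc.ratings[metric];
    // Start from an all-zero array of the same length for a new city.
    const current = totals.get(doc.city) ?? ratings.map(() => 0);
    totals.set(doc.city, current.map((sum, i) => sum + ratings[i]));
  }
  return [...totals].map(([city, sums]) => ({ city, [metric]: sums }));
}

console.log(
  sumRatingsByCity(
    [
      { _id: 1, city: "X", ratings: { sales: [1, 2, 3], profit: [4, 5, 6] } },
      { _id: 2, city: "X", ratings: { sales: [1, 2, 3], profit: [4, 5, 6] } },
    ],
    "sales"
  )
);
```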

How to Merge Array and Document Field in Mongo DB

I have this document in MongoDB
[
{
"locations": [5, 5],
"id": "fff"
},
{
"locations": [7, 7],
"id": "aaa"
},
{
"locations": [9, 9],
"id": "ccc"
}
]
And I want to merge the array field and string field into a field that contains a combination of them like this
{
"device": [
["fff", 5, 5],
["aaa", 7, 7],
["ccc", 9, 9]
]
}
Is it possible to do this with aggregation? Thank you.
You can use $concatArrays to combine the two fields into one array per document, and then $group with $push to collect those arrays into a two-dimensional array:
db.collection.aggregate([
{ "$group": {
"_id": null,
"devices": { "$push": { "$concatArrays": [["$id"], "$locations"] } }
}}
])
Output
[
{
"devices": [
["fff", 5, 5],
["aaa", 7, 7],
["ccc", 9, 9]
]
}
]
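The same transformation, written as an in-memory JavaScript sketch to show what `$concatArrays` and `$push` are doing for each document; `mergeIdAndLocations` is a made-up name and the code runs on plain objects, not on the server.

```javascript
// Prepend each document's id to its locations array and collect the rows.
// Hypothetical helper for illustration only.
function mergeIdAndLocations(docs) {
  return {
    // [doc.id, ...doc.locations] mirrors $concatArrays: [["$id"], "$locations"].
    devices: docs.map((doc) => [doc.id, ...doc.locations]),
  };
}

console.log(
  JSON.stringify(
    mergeIdAndLocations([
      { locations: [5, 5], id: "fff" },
      { locations: [7, 7], id: "aaa" },
      { locations: [9, 9], id: "ccc" },
    ])
  )
);
```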