Mongo count unknown amount of field labels - mongodb

Assume I have the following structure:
"KnownName" : {
"unknownName1" : {
"id" : "unknownName1",
"value" : "5"
},
"unknownName2" : {
"id" : "unknownName2",
"value" : "5"
},
"unknownName3" : {
"id" : "unknownName3",
"value" : "5"
},
"unknownName4" : {
"id" : "unknownName4",
"value" : "5"
},
"unknownName5" : {
"id" : "unknownName5_v2",
"value" : "5"
},
"unknownName6" : {
"id" : "unknownName6",
"value" : "5"
}
... many more documents like the above, with varying keys
and I want to get all of these counted like this:
unknownName1 : 24
unknownName2 : 27
unknownName3 : 10
....
unknownName37 : 12
I know my structure up to the 'KnownName' node, but within this node there can be several different labels (here unknownName1 to unknownName6); there can be more or fewer, and they can differ from document to document. Typically the id in each sub-document has the same name as its label, but that is not a given (as in unknownName5).
I was looking for ways to get a distinct count of all these 'unknownNames', but this seems to be more challenging than expected.
Any advice on how this can be achieved (preferably using the aggregation framework)?
If there is an easy way to get all (deep) children labelled "id" in the "KnownName" tree without needing to know the unknown parent name, that would also work for me. I'm aware there is no such thing as wildcards in Mongo, but I'm looking for an alternative to something like KnownName.*.id

Since your keys are unknown, you need to start with $objectToArray. That gives you an array of keys and values that can be processed with $group to get the counts. You can then use $replaceRoot and $arrayToObject to turn the counts into dynamic keys in the root object:
db.col.aggregate([
    {
        $addFields: {
            unknown: { $objectToArray: "$KnownName" }
        }
    },
    {
        $unwind: "$unknown"
    },
    {
        $group: {
            _id: "$unknown.k",
            count: { $sum: 1 }
        }
    },
    {
        $sort: { _id: 1 }
    },
    {
        $group: {
            _id: null,
            data: { $push: { k: "$_id", v: "$count" } }
        }
    },
    {
        $replaceRoot: {
            newRoot: {
                $arrayToObject: "$data"
            }
        }
    }
])
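If the counts should be keyed by the inner id field rather than by the key name (the KnownName.*.id case from the question), the same approach appears to work by grouping on the value side of each key/value pair. A minimal sketch, assuming the collection is called col as above; the variable name pipeline is only for illustration:

```javascript
// Sketch only: same idea as above, but grouped by the nested "id" value
// instead of the dynamic key. "pipeline" is an illustrative name.
const pipeline = [
    // Turn the KnownName sub-document into [{ k: <key>, v: <sub-document> }, ...]
    { $project: { unknown: { $objectToArray: "$KnownName" } } },
    // Emit one document per key/value pair
    { $unwind: "$unknown" },
    // Count how often each inner id occurs across all documents
    { $group: { _id: "$unknown.v.id", count: { $sum: 1 } } },
    { $sort: { _id: 1 } }
];

// In the mongo shell: db.col.aggregate(pipeline)
```

This yields one { _id, count } document per distinct id rather than a single object with dynamic keys.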

Related

I need limited nested array in mongodb document

I have a document like
{
"deviceId" : "1106",
"orgId" : "5ffe9fe1c9e77c0006f0aad3",
"values" : [
{
"paramVal" : 105.0,
"dateTime" : ISODate("2021-05-05T09:18:08.000Z")
},
{
"paramVal" : 110.0,
"dateTime" : ISODate("2021-05-05T09:18:08.000Z")
},
{
"paramVal" : 115.0,
"dateTime" : ISODate("2021-05-05T10:18:08.000Z")
},
{
"paramVal" : 125.0,
"dateTime" : ISODate("2021-05-05T11:18:08.000Z")
},
{
"paramVal" : 135.0,
"dateTime" : ISODate("2021-05-05T12:18:08.000Z")
}
]
}
Now I need to filter for a document, which I can do easily with $match or find, but the values subarray in that document should contain only the latest 2 values, because in future the array can hold more than 100 entries.
The output should look like this:
{
"deviceId" : "1106",
"orgId" : "5ffe9fe1c9e77c0006f0aad3",
"values" : [
{
"paramVal" : 125.0,
"dateTime" : ISODate("2021-05-05T11:18:08.000Z")
},
{
"paramVal" : 135.0,
"dateTime" : ISODate("2021-05-05T12:18:08.000Z")
}
]
}
Try the $slice operator to select a number of elements; pass a negative value to select elements from the end of the array:
db.collection.aggregate([
{ $set: { values: { $slice: ["$values", -2] } } }
])
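If no other aggregation stages are needed, a plain find with the $slice projection operator should give the same result. A sketch, assuming the document is selected by deviceId (the value is taken from the sample document; filter and projection are illustrative names):

```javascript
// Sketch only: select the document, then project the last 2 array elements.
const filter = { deviceId: "1106" };
// A negative $slice in a projection keeps elements from the end of the array
const projection = { values: { $slice: -2 } };

// In the mongo shell: db.collection.find(filter, projection)
```

Like the aggregation version, this only returns the latest values if the array is already stored in date order.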
I need the array values in sorted order by date
There is no straightforward way to do this. The aggregation query below works, but it can cause performance issues; I would suggest changing your schema so that the data is kept in date order.
$unwind deconstructs the values array
$sort by dateTime in descending order
$group by _id to reconstruct the values array and return the other required fields
$slice to select the first 2 elements, which are the latest ones after the descending sort (so they come back newest first)
db.collection.aggregate([
    { $unwind: "$values" },
    { $sort: { "values.dateTime": -1 } },
    {
        $group: {
            _id: "$_id",
            deviceId: { $first: "$deviceId" },
            orgId: { $first: "$orgId" },
            values: { $push: "$values" }
        }
    },
    { $set: { values: { $slice: ["$values", 2] } } }
])
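On MongoDB 5.2 or newer, the $sortArray operator can sort the embedded array in place, which avoids the $unwind and $group round trip. A sketch under that version assumption (pipeline is an illustrative name):

```javascript
// Sketch only: requires MongoDB 5.2+ for $sortArray.
const pipeline = [
    {
        $set: {
            values: {
                // Sort the array oldest-to-newest, then keep the last 2 elements,
                // i.e. the 2 latest readings in ascending date order
                $slice: [
                    { $sortArray: { input: "$values", sortBy: { dateTime: 1 } } },
                    -2
                ]
            }
        }
    }
];

// In the mongo shell: db.collection.aggregate(pipeline)
```

Because nothing is unwound or regrouped, the other fields of the document (deviceId, orgId) pass through unchanged.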

How to convert multiple documents from a single collection to a single document containing one array

I have an aggregation pipeline that nearly does what I want. I've used match / unwind / project / sort to get 99% of the way. It is returning multiple documents:
[
{
"_id" : 254.8
},
{
"_id" : 93.7
},
{
"_id" : 89.9
},
{
"_id" : 94.15
},
{
"_id" : 102.1
},
{
"_id" : 93.9
},
{
"_id" : 102.7
}
]
Note: I've added the array brackets and commas to make it more readable, but you can also read it as:
{
"_id" : 254.8
}
{
"_id" : 93.7
}
{
"_id" : 89.9
}
{
"_id" : 94.15
}
{
"_id" : 102.1
}
I need the contents of the ID fields from all 7 documents in an array of values in one document:
{values: [254.8, 93.7, 89.9, 94.15, 102.1, 93.9, 102.7]}
It would be easy to sort this with JS once I have the results but I'd rather do it in the pipeline if possible so my JS stays 100% generic and only returns pure pipeline data.
Here is what you need to complete the job:
db.collection.aggregate([
    {
        "$group": {
            "_id": null,
            "values": {
                $push: "$_id"
            }
        }
    },
    {
        "$project": {
            _id: false
        }
    }
])
The result will be:
[
{
"values": [
254.8,
93.7,
89.9,
94.15,
102.1,
93.9,
102.7
]
}
]
https://mongoplayground.net/p/pTmR_rni0J1

MongoDB Query: How to get the all values of a nested field where parent node is dynamic?

Here is a sample of one of the documents in our MongoDB collection. I need to get all the values of the number field from all the documents in this collection with a MongoDB query. What would that query be?
{
"_id" : "5w1669ba-3f8a-4695-a585-9fa510d13e59",
"display_title" : "SWE Test Series!",
"production_year" : "2020",
"type" : "series",
"created_timestamp" : 1597940264,
"seasons" : {
"8c399fbc-dc65-4c2e-b86c-5c6289835b45" : {
"number" : "1",
"uuid" : "8c399fbc-dc65-4c2e-b86c-5c6289835b45",
"created_timestamp" : 1597940441
}
}
}
You can do that; $objectToArray makes it easy.
db.collection.aggregate([
    {
        $project: {
            a: { $objectToArray: "$seasons" }
        }
    },
    {
        $unwind: "$a"
    },
    {
        $project: {
            "a.v.number": 1
        }
    }
])
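The pipeline above returns one document per season, with the number nested under a.v. If a single flat array of all number values is wanted instead, a $group stage can collect them. A sketch (the field name numbers and the variable pipeline are illustrative, not part of the original answer):

```javascript
// Sketch only: gather every season's "number" into one array.
const pipeline = [
    // seasons: { <uuid>: {...} }  ->  a: [{ k: <uuid>, v: {...} }, ...]
    { $project: { a: { $objectToArray: "$seasons" } } },
    { $unwind: "$a" },
    // Collect the number of every season across all documents
    { $group: { _id: null, numbers: { $push: "$a.v.number" } } },
    { $project: { _id: 0, numbers: 1 } }
];

// In the mongo shell: db.collection.aggregate(pipeline)
```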

Add field to documents after $sort aggregation pipeline which include its index in sorted list using MongoDb aggregation

I want to get the position of a particular user in a list after a $sort aggregation stage.
Let's say we have a leaderboard, and I need to get my rank in it with a single query that returns only my data.
I have tried $addFields and some queries with $map.
Let's say we have these documents:
/* 1 createdAt:8/18/2019, 4:42:41 PM*/
{
"_id" : ObjectId("5d5963e1c6c93b2da849f067"),
"name" : "x4",
"points" : 69
},
/* 2 createdAt:8/18/2019, 4:42:41 PM*/
{
"_id" : ObjectId("5d5963e1c6c93b2da849f07b"),
"name" : "x24",
"points" : 968
},
/* 3 createdAt:8/18/2019, 4:42:41 PM*/
{
"_id" : ObjectId("5d5963e1c6c93b2da849f06a"),
"name" : "x7",
"points" : 997
},
And I want to write a query like this
db.table.aggregate(
[
{ $sort : { points : 1 } },
{ $addFields: { order : "$index" } },
{ $match : { name : "x24" } }
]
)
I need to inject the order field with something like $index
I expect to have something like this in return
{
"_id" : ObjectId("5d5963e1c6c93b2da849f07b"),
"name" : "x24",
"points" : 968,
"order" : 2
}
I need something like the position shown in the result metadata here, which is 2:
/* 2 createdAt:8/18/2019, 4:42:41 PM*/
One workaround for this situation is to collect all your documents into a single array, resolve each document's index in that array with $unwind (using includeArrayIndex), and finally project the fields as required.
db.collection.aggregate([
    { $sort: { points: 1 } },
    {
        // Collect every document into one array, in sorted order
        $group: {
            _id: 1,
            register: { $push: { _id: "$_id", name: "$name", points: "$points" } }
        }
    },
    // includeArrayIndex stores each element's array position in "order"
    { $unwind: { path: "$register", includeArrayIndex: "order" } },
    // Keep only the user from the question
    { $match: { "register.name": "x24" } },
    {
        $project: {
            _id: "$register._id",
            name: "$register.name",
            points: "$register.points",
            order: 1
        }
    }
]);
To make it more efficient you can apply limit, match, and filter as per your requirement.
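Note that includeArrayIndex is zero-based, so the workaround above reports positions starting at 0. On MongoDB 5.0 or newer, $setWindowFields with $documentNumber may be a simpler option, since it numbers documents starting at 1 in the given sort order. A sketch under that version assumption (pipeline is an illustrative name):

```javascript
// Sketch only: requires MongoDB 5.0+ for $setWindowFields.
const pipeline = [
    {
        $setWindowFields: {
            // Number the documents in ascending order of points
            sortBy: { points: 1 },
            // $documentNumber is 1 for the first document in sort order
            output: { order: { $documentNumber: {} } }
        }
    },
    // Filter after numbering so the position reflects the whole leaderboard
    { $match: { name: "x24" } }
];

// In the mongo shell: db.collection.aggregate(pipeline)
```

With the three sample documents from the question, this should give order: 2 for x24. This also avoids pushing the whole collection into one array, which the $group approach has to fit within the document size limit.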

Obtaining $group result with group count

Assuming I have a collection called "posts" (in reality it is a more complex collection, posts is too simple) with the following structure:
> db.posts.find()
{ "_id" : ObjectId("50ad8d451d41c8fc58000003"), "title" : "Lorem ipsum", "author" :
"John Doe", "content" : "This is the content", "tags" : [ "SOME", "RANDOM", "TAGS" ] }
I expect this collection to grow to hundreds of thousands, perhaps millions, of documents. I need to query for posts by tags, group the results by tag, and display the results paginated. This is where the aggregation framework comes in. I plan to use the aggregate() method to query the collection:
db.posts.aggregate([
{ "$unwind" : "$tags" },
{ "$group" : {
_id: { tag: "$tags" },
count: { $sum: 1 }
} }
]);
The catch is that to create the paginator I would need to know the length of the output array. I know that to do that you can do:
db.posts.aggregate([
    { "$unwind" : "$tags" },
    { "$group" : {
        _id: { tag: "$tags" },
        count: { $sum: 1 }
    } },
    { "$group" : {
        _id: null,
        total: { $sum: 1 }
    } }
]);
But that would discard the output of the previous stage (the first $group). Is there a way to combine the two operations while preserving each stage's output? I know that the output of the whole aggregate operation can be cast to an array in some language and its contents counted, but the pipeline output may exceed the 16 MB limit. Also, running the same query again just to obtain the count seems like a waste.
So is obtaining the document result and count at the same time possible? Any help is appreciated.
Use $project to save the tag and its count into tmp.
Use $push or $addToSet to store tmp in your data list.
Code:
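db.test.aggregate([
    // One document per tag occurrence
    {$unwind: '$tags'},
    // Count occurrences per tag
    {$group: {_id: '$tags', count: {$sum: 1}}},
    // Wrap tag and count in a single sub-document
    {$project: {tmp: {tag: '$_id', count: '$count'}}},
    // Count the groups and keep them all in one data list
    {$group: {_id: null, total: {$sum: 1}, data: {$addToSet: '$tmp'}}}
])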
Output:
{
"result" : [
{
"_id" : null,
"total" : 5,
"data" : [
{
"tag" : "SOME",
"count" : 1
},
{
"tag" : "RANDOM",
"count" : 2
},
{
"tag" : "TAGS1",
"count" : 1
},
{
"tag" : "TAGS",
"count" : 1
},
{
"tag" : "SOME1",
"count" : 1
}
]
}
],
"ok" : 1
}
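On MongoDB 3.4 or newer, $facet offers another way to get one page of results and the total number of groups in a single pass. A sketch, assuming a page size of 10 and the first page (pageSize, page, and pipeline are illustrative names, not from the question):

```javascript
// Sketch only: requires MongoDB 3.4+ for $facet and $count.
const pageSize = 10; // assumed page size
const page = 0;      // zero-based page index
const pipeline = [
    { $unwind: "$tags" },
    { $group: { _id: "$tags", count: { $sum: 1 } } },
    // A stable order is needed for skip/limit pagination to be meaningful
    { $sort: { _id: 1 } },
    {
        $facet: {
            // Only the requested page of tag groups
            data: [{ $skip: page * pageSize }, { $limit: pageSize }],
            // Total number of tag groups, as [{ total: <n> }]
            total: [{ $count: "total" }]
        }
    }
];

// In the mongo shell: db.posts.aggregate(pipeline)
```

The $facet result is still a single document, so it remains subject to the document size limit; restricting data to one page keeps it small.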
I'm not sure you need the aggregation framework for this, other than for counting all the tags, e.g.:
db.posts.aggregate([
    { "$unwind" : "$tags" },
    { "$group" : {
        _id: { tag: "$tags" },
        count: { $sum: 1 }
    } }
]);
To paginate through the posts for a given tag, you can just use the normal query syntax, like so:
db.posts.find({tags: "RANDOM"}).skip(10).limit(10)