I would like to retrieve the _id of the documents matching a given list of ObjectId values, sorted by timestamp in descending order and capped by a limit.
This is the query I have so far:
db.collection.aggregate([
{ $match: { _id: { $in: [ObjectId("X"), ObjectId("Y")] } } },
{ $sort: { timestamp: -1 } },
{ $group: { _id: "$_id" } },
{ $skip: 0 },
{ $limit: 100 }
])
Given that the list built by the loop may contain well over 1000 ObjectId values (in the $in array), do you think my solution is viable? Is there a faster and less resource-intensive way?
Best Regards.
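One note on the pipeline above: $group does not guarantee the order of its output, so placing it after $sort can lose the ordering, and since _id is already unique the grouping does not merge anything. Below is a minimal sketch of a variant that sorts, limits, and then projects only _id. buildIdPipeline is a hypothetical helper name, not part of any driver, and whether this is actually faster on your data is an assumption you would need to check with explain().

```javascript
// Hypothetical helper: builds a pipeline that returns the _id values found in
// `ids`, newest first (by `timestamp`), capped at `limit` documents.
function buildIdPipeline(ids, limit = 100) {
  return [
    { $match: { _id: { $in: ids } } }, // same filter as in the question
    { $sort: { timestamp: -1 } },      // newest first
    { $limit: limit },                 // cap the result right after sorting
    { $project: { _id: 1 } }           // keep only _id, without regrouping
  ];
}

// Usage in the shell, with ObjectId values as in the question:
// db.collection.aggregate(buildIdPipeline([ObjectId("X"), ObjectId("Y")], 100))
```

The pipeline is built as a plain array so the list of ids can come straight from the loop that collects them.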
I have the following data of users and model cars:
[
{
"user_id":"ebebc012-082c-4e7f-889c-755d2679bdab",
"car_1a58db0b-5449-4d2b-a773-ee055a1ab24d":1,
"car_37c04124-cb12-436c-902b-6120f4c51782":0,
"car_b78ddcd0-1136-4f45-8599-3ce8d937911f":1
},
{
"user_id":"f3eb2a61-5416-46ba-bab4-459fbdcc7e29",
"car_1a58db0b-5449-4d2b-a773-ee055a1ab24d":1,
"car_0d15eae9-9585-4f49-a416-46ff56cd3685":1
}
]
I want to count, using MongoDB, how many users have each car_ field set to 1. The result should look something like:
{"car_1a58db0b-5449-4d2b-a773-ee055a1ab24d": 2}
For this example.
The issue is that I will never know in advance what the car_ field names are going to be; they have a random structure (wildcard).
Notes:
The car_ fields and user_id are at the same level.
No car id is given; I simply want to know, for the entire database, which car_ fields most commonly have the value 1.
$group by _id and convert the root document to an array of key/value pairs using $objectToArray
$unwind to deconstruct the root array
$match to keep only the entries where root.v is 1
$group by root.k and get the total count
db.collection.aggregate([
{
$group: {
_id: "$_id",
root: { $first: { $objectToArray: "$$ROOT" } }
}
},
{ $unwind: "$root" },
{ $match: { "root.v": 1 } },
{
$group: {
_id: "$root.k",
count: { $sum: 1 }
}
}
])
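To make the stages above easier to follow, here is a plain JavaScript sketch that mirrors them on the sample data from the question. It is only an illustration of the logic, not what the server executes, and countFieldsEqualToOne is a made-up name. Note that, like the pipeline, it counts any field whose value is 1, not only fields starting with car_.

```javascript
// Sample documents from the question.
const users = [
  {
    user_id: "ebebc012-082c-4e7f-889c-755d2679bdab",
    "car_1a58db0b-5449-4d2b-a773-ee055a1ab24d": 1,
    "car_37c04124-cb12-436c-902b-6120f4c51782": 0,
    "car_b78ddcd0-1136-4f45-8599-3ce8d937911f": 1
  },
  {
    user_id: "f3eb2a61-5416-46ba-bab4-459fbdcc7e29",
    "car_1a58db0b-5449-4d2b-a773-ee055a1ab24d": 1,
    "car_0d15eae9-9585-4f49-a416-46ff56cd3685": 1
  }
];

// Hypothetical helper: for every field name, count the documents holding 1 in it.
function countFieldsEqualToOne(docs) {
  const counts = {};
  for (const doc of docs) {
    // Object.entries plays the role of $objectToArray followed by $unwind.
    for (const [key, value] of Object.entries(doc)) {
      if (value === 1) {                      // $match: { "root.v": 1 }
        counts[key] = (counts[key] || 0) + 1; // $group with count: { $sum: 1 }
      }
    }
  }
  return counts;
}

console.log(countFieldsEqualToOne(users));
```

If other fields in the real documents can also hold 1, an extra check such as key.startsWith("car_") would be needed; in the pipeline that would be a further condition on root.k.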
Assume I have a collection with millions of documents. Below is a sample of what the documents look like:
[
{ _id:"1a1", points:[2,3,5,6] },
{ _id:"1a2", points:[2,6] },
{ _id:"1a3", points:[3,5,6] },
{ _id:"1b1", points:[1,5,6] },
{ _id:"1c1", points:[5,6] },
// ... more documents
]
I want to query a document by _id and return a document that looks like below:
{
_id:"1a1",
totalPoints: 16,
rank: 29
}
I know I can query the whole collection, sort it in descending order, then find the index of the document I want by _id and add one to get its rank. But I have concerns about this method.
If there are millions of documents, won't this be overdoing it: querying a whole collection just to get one document? Is there a way to get what I want without querying the whole collection? Or does the whole collection have to be involved because of the ranking?
I cannot save them ranked because the points keep changing. The actual code is more complex, but the takeaway is that I cannot save them ranked.
Total points is the sum of the values in the points array. The rank is calculated by sorting all documents by total points in descending order. The first document gets rank 1, and so on.
An aggregation pipeline like the following can produce the result you want, but how it performs on a collection of millions of documents remains to be seen. Note that the example looks up the _id '1a3'.
db.collection.aggregate(
[
{
$group: {
_id: null,
docs: {
$push: { _id: '$_id', totalPoints: { $sum: '$points' } }
}
}
},
{
$unwind: '$docs'
},
{
$replaceWith: '$docs'
},
{
$sort: { totalPoints: -1 }
},
{
$group: {
_id: null,
docs: { $push: '$$ROOT' }
}
},
{
$set: {
docs: {
$map: {
input: {
$filter: {
input: '$docs',
as: 'x',
cond: { $eq: ['$$x._id', '1a3'] }
}
},
as: 'xx',
in: {
_id: '$$xx._id',
totalPoints: '$$xx.totalPoints',
rank: {
$add: [{ $indexOfArray: ['$docs._id', '1a3'] }, 1]
}
}
}
}
}
},
{
$unwind: '$docs'
},
{
$replaceWith: '$docs'
}
])
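As a small worked example of the ranking rule from the question, here is a plain JavaScript sketch over the five sample documents. It computes the rank as one plus the number of documents with a strictly higher total, which matches "position in the descending sort plus one" as long as there are no ties; how ties should be ranked is not stated in the question, so that part is an assumption. rankOf and total are illustrative names only.

```javascript
// The five sample documents from the question.
const players = [
  { _id: "1a1", points: [2, 3, 5, 6] },
  { _id: "1a2", points: [2, 6] },
  { _id: "1a3", points: [3, 5, 6] },
  { _id: "1b1", points: [1, 5, 6] },
  { _id: "1c1", points: [5, 6] }
];

// Sum of the points array, like { $sum: "$points" }.
const total = (doc) => doc.points.reduce((sum, p) => sum + p, 0);

// Hypothetical helper: rank = 1 + number of documents with a higher total.
function rankOf(docs, id) {
  const target = docs.find((d) => d._id === id);
  if (!target) return null; // unknown _id
  const totalPoints = total(target);
  const higher = docs.filter((d) => total(d) > totalPoints).length;
  return { _id: id, totalPoints, rank: higher + 1 };
}

console.log(rankOf(players, "1a3")); // { _id: '1a3', totalPoints: 14, rank: 2 }
```

Framed this way, the rank only needs a count of the documents that score higher, not the full sorted list, although every total still has to be computed unless it is stored on the document.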
I have documents like these:
{
data: [{"channel":"712064846325219432","message":1019},{"channel":"712064884812021801","message":4}],
user: '290494169783205888',
},
{
data: [{"channel":"712064846325219432","message":2000},{"channel":"712064884812021801","message":500}],
user: '534099893979971584',
}
How can I sum the message values in data for each document and sort the documents by that total in descending order?
Use aggregation pipeline stages $unwind and $group to count the message for each user then sort by the total number of messages. Check the example.
db.collection.aggregate([
{
$unwind: {
path: "$data"
}
},
{
$group: {
_id: "$user",
total_message: {
$sum: "$data.message"
}
}
},
{
$sort: {
total_message: -1
}
}
])
Results:
[
{
"_id": "534099893979971584",
"total_message": 2500
},
{
"_id": "290494169783205888",
"total_message": 1023
}
]
You can use Query.sort().
For descending order you can use -1, desc, or descending:
Query.sort({ message: -1 })
I have these documents in my collection
{
id:1,
small:[{k:'A',v:1},{k:'B',v:2},{k:'D',v:3}],
big:[{k:'A',v:2},{k:'B',v:3},{k:'C',v:1},{k:'D',v:4}]
},
{
id:2,
small:[{k:'A',v:1},{k:'B',v:2},{k:'D',v:3}],
big:[{k:'A',v:2},{k:'B',v:3},{k:'C',v:1},{k:'D',v:4}]
},
{
id:3,
small:[{k:'A',v:1},{k:'B',v:2},{k:'D',v:3}],
big:[{k:'A',v:2},{k:'B',v:3},{k:'C',v:1},{k:'D',v:4}]
}
Now, I want to get the sum for each key in both lists. I want my output to look like this:
{k:'A',small:3, big:6},
{k:'B',small:6, big:9},
{k:'D',small:9, big:12}
Notice that the output does not contain the key 'C'. This is because I only want to output the keys that exist in the 'small' list. What MongoDB functions should I use for this?
Thanks!
Try below aggregation:
db.col.aggregate([
{ $unwind: "$small" },
{ $unwind: "$big" },
{ $redact: {
$cond: {
if: { $eq: [ "$small.k", "$big.k" ] },
then: "$$KEEP",
else: "$$PRUNE"
}
}
},
{
$group: { _id: "$small.k", small: { $sum: "$small.v" }, big: { $sum: "$big.v" } }
},
{
$sort: { "_id": 1 }
}
])
Each document needs to carry exactly one small entry and one big entry before the two can be compared, which is why the pipeline uses two $unwind stages. $redact then keeps only the documents where the two keys are equal. This is where C is filtered out, because it has no matching entry in small. The final aggregation is just a $group with $sum, followed by a $sort on the key.
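Here is the same idea as a plain JavaScript sketch on the three sample documents, in case the double $unwind is hard to picture. It is an illustration of the logic only; sumMatchingKeys is a made-up name, and the output uses k instead of _id to match the shape asked for in the question.

```javascript
// Three identical sample documents, as in the question.
const sampleDocs = [1, 2, 3].map((id) => ({
  id,
  small: [{ k: "A", v: 1 }, { k: "B", v: 2 }, { k: "D", v: 3 }],
  big: [{ k: "A", v: 2 }, { k: "B", v: 3 }, { k: "C", v: 1 }, { k: "D", v: 4 }]
}));

// Hypothetical helper: sums small.v and big.v per key, only for keys in `small`.
function sumMatchingKeys(documents) {
  const totals = new Map();
  for (const doc of documents) {
    for (const s of doc.small) {     // first $unwind
      for (const b of doc.big) {     // second $unwind
        if (s.k !== b.k) continue;   // $redact: prune pairs whose keys differ
        const t = totals.get(s.k) || { k: s.k, small: 0, big: 0 };
        t.small += s.v;              // $sum: "$small.v"
        t.big += b.v;                // $sum: "$big.v"
        totals.set(s.k, t);
      }
    }
  }
  // Sort by key, like { $sort: { _id: 1 } }.
  return [...totals.values()].sort((a, b) => (a.k < b.k ? -1 : a.k > b.k ? 1 : 0));
}

console.log(sumMatchingKeys(sampleDocs));
```

The nested loops also show the cost of the approach: every small entry is paired with every big entry of the same document before the non-matching pairs are dropped.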
Let's say that I want to group documents in MongoDB by the Description field.
Running the following (case-sensitive by default):
db['Products'].aggregate(
{ $group: {
_id: { 'Description': "$Description" },
count: { $sum: 1 },
docs: { $push: "$_id" }
}},
{ $match: {
count: { $gt : 1 }
}}
);
on my sample data gives me 1000 results, which is fine.
But now I expect that running a case-insensitive query (using $toLower) should give me 1000 results or fewer:
db['Products'].aggregate(
{ $group: {
_id: { 'Description': {$toLower: "$Description"} },
count: { $sum: 1 },
docs: { $push: "$_id" }
}},
{ $match: {
count: { $gt : 1 }
}}
);
But instead I get more than 1000 results. That can't be right, can it? Entries that differ only by case should get grouped together and yield fewer groups in total ... I think.
So my aggregation query is probably wrong, which brings me to my question:
How should case-insensitive grouping be performed in a MongoDB aggregation?
Your approach to case-insensitive grouping is correct, so perhaps your observation is not? ;)
Try this example:
// insert two documents
db.getCollection('test').insertOne({"name" : "Test"}) // uppercase 'T'
db.getCollection('test').insertOne({"name" : "test"}) // lowercase 't'
// perform the grouping
db.getCollection('test').aggregate({ $group: { "_id": { $toLower: "$name" }, "count": { $sum: 1 } } }) // case insensitive
db.getCollection('test').aggregate({ $group: { "_id": "$name", "count": { $sum: 1 } } }) // case sensitive
You may have a typo somewhere?
The documentation also states that
$toLower only has a well-defined behavior for strings of ASCII characters.
Perhaps that's what's biting you here?
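Another possible explanation, which the question does not confirm, is the $match on count greater than 1: lowercasing can only reduce the total number of groups, but it can increase the number of groups that contain more than one document. A plain JavaScript sketch with made-up sample values shows the effect. duplicateGroups is a hypothetical helper, and toLowerCase stands in for $toLower on ASCII-only strings.

```javascript
// Made-up descriptions: each exact spelling occurs once.
const descriptions = ["Widget", "widget", "Gadget", "gadget", "Bolt"];

// Hypothetical helper: groups values by keyOf(value) and reports how many
// groups exist in total and how many of them hold more than one value.
function duplicateGroups(values, keyOf) {
  const counts = new Map();
  for (const v of values) {
    const key = keyOf(v);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const sizes = [...counts.values()];
  return {
    totalGroups: sizes.length,
    groupsWithDuplicates: sizes.filter((c) => c > 1).length // count: { $gt: 1 }
  };
}

const caseSensitive = duplicateGroups(descriptions, (s) => s);
const caseInsensitive = duplicateGroups(descriptions, (s) => s.toLowerCase());

console.log(caseSensitive);   // { totalGroups: 5, groupsWithDuplicates: 0 }
console.log(caseInsensitive); // { totalGroups: 3, groupsWithDuplicates: 2 }
```

So to compare the two queries fairly, it may help to count the groups before the $match stage as well as after it.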