Querying a range from a MongoDB collection

I can query the first 20 data points from my collection using the following code:
db.collection.aggregate([{$project: {"text": 1}}, {$limit: 20}])
How do I query a range from my collection? Let's say, from data points 20 through 40?

Below are the possible ways:
Using db.collection.find()
db.collection.find({},{"text": 1}).skip(20).limit(20);
Using aggregation framework
db.collection.aggregate([{$project: {"text": 1}}, { $skip : 20 }, {$limit:20}])

Related

Complex Conditional mongodb query to match value if exists

I'm trying out a condition and am not sure whether it's possible; I'd like some answers on whether it can be done.
Let's say I have a collection named: Resturant
Resturant
id
name
food
and I have 4 rows:
1 , resturant1, burger
2 , resturant2, sandwich
3 , resturant2, burger
4 , resturant3, burger
So what I'm trying to achieve here is, in a single query, to fetch the resturant1 & resturant2 values, like:
{"$match": {"$expr": {"$in": ["$name", ["resturant1", "resturant2"]]}}}
but if the food already exists in resturant1, then don't fetch that value from resturant2; so if burger already exists in resturant1, do not fetch it from resturant2. The result will then be only 2 rows:
1 , resturant1, burger
2 , resturant2, sandwich
We can achieve this by fetching the results and then overriding the values that already exist, but I wanted to see if the condition can be expressed in the MongoDB query itself.
One option is using $setWindowFields (available since MongoDB version 5.0):
$match the relevant documents by name
Use $setWindowFields to temporarily add a set of all food types seen up to this document
$match only documents whose food is not already in that set
Remove the added field
db.collection.aggregate([
  {$match: {name: {$in: ["resturant1", "resturant2"]}}},
  {$setWindowFields: {
    sortBy: {name: 1},
    output: {foodSet: {$addToSet: "$food", window: {documents: ["unbounded", -1]}}}
  }},
  {$match: {$expr: {$not: {$in: ["$food", "$foodSet"]}}}},
  {$unset: "foodSet"}
])
See how it works on the playground example

Comparing max of column with another column

I'm trying to compare two string columns in MongoDB using aggregate. My job is to list all docs that have a field (field1) greater than the max of another field/column (field2).
db.collection.aggregate([
  {"$match": {"id": "XXX"}},
  {"$match": {"$expr": {"$gt": ["$field1", {"$max": "$field2"}]}}},
  {"$project": {"id": 1, "field1": 1}}
])
This doesn't seem to work. I'm using MongoDB 4.
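The likely reason this fails: inside $expr, {"$max": "$field2"} is evaluated per document, where it simply returns that document's own field2, so the pipeline compares field1 against field2 row by row rather than against the collection-wide maximum. One MongoDB 4-compatible option (a sketch, not from the original thread; field names as in the question, and note $push buffers the matched docs, so this assumes they fit in the 16 MB document limit) is to compute the max in a $group and then re-expand:
db.collection.aggregate([
  {"$match": {"id": "XXX"}},
  // Compute the overall max of field2 while keeping the docs around
  {"$group": {
    "_id": null,
    "maxField2": {"$max": "$field2"},
    "docs": {"$push": {"_id": "$_id", "id": "$id", "field1": "$field1"}}
  }},
  {"$unwind": "$docs"},
  // Keep only docs whose field1 exceeds the collection-wide max of field2
  {"$match": {"$expr": {"$gt": ["$docs.field1", "$maxField2"]}}},
  {"$replaceRoot": {"newRoot": "$docs"}},
  {"$project": {"id": 1, "field1": 1}}
])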

MongoDB sort is slow for non-index dynamic field

Following is my MongoDB query to show the organization listing along with the user count per organization. As per my data model, the "users" collection has an array, userOrgMap, which maintains the organizations (by orgId) to which the user belongs. The "organizations" collection doesn't store the list of assigned users. The "users" collection has 11,200 documents and "organizations" has 10,500 documents.
db.organizations.aggregate([
  {$lookup: {from: "users", localField: "_id", foreignField: "userOrgMap.orgId", as: "user"}},
  {$project: {_id: 1, name: 1, "noOfUsers": {$size: "$user"}}},
  {$sort: {noOfUsers: -1}},
  {$limit: 15},
  {$skip: 0}
]);
Without the sort, the query works fast. With the sort, the query works very slow. It takes around 200 secs.
I even tried another way which is also taking more time.
db.organizations.aggregate([
  {$lookup: {from: "users", localField: "_id", foreignField: "userOrgMap.orgId", as: "user"}},
  {$unwind: "$user"},
  {$group: {_id: "$_id", name: {$first: "$name"}, userCount: {$sum: 1}}},
  {$sort: {userCount: -1}},
  {$limit: 15},
  {$skip: 0}
]);
For the above query, without the $sort itself takes more time.
Need help on how to solve this issue.
Get the aggregation to use an index that begins with noOfUsers, as I do not see a $match stage here.
The problem is resolved. I created an index on "userOrgMap.orgId". The query is fast now.
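For reference, that index can be created with the following; it is a multikey index on the embedded array field, which lets the $lookup's foreignField match use the index instead of scanning "users" once per organization:
db.users.createIndex({"userOrgMap.orgId": 1})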

mongodb statistics 5 million doc too slow

A collection of 5 million mongo docs:
{
  _id: xxx,
  devID: 123,
  logLevel: 5,
  logTime: 1468464358697
}
indexes:
devID
my aggregate:
[
  {$match: {devID: 123}},
  {$group: {_id: {level: "$logLevel"}, count: {$sum: 1}}}
]
aggregate result:
{ "_id" : { "level" : 5 }, "count" : 5175872 }
{ "_id" : { "level" : 1 }, "count" : 200000 }
aggregate explain:
numYields:42305
29399ms
Q:
If mongo is not writing (saving) data, the aggregation takes 29 seconds.
If mongo is writing (saving) data, it takes 2 minutes.
My aggregate result needs to be returned to the web, so 29 sec or 2 min are too long.
How can I solve it? Preferably 10 seconds or less.
Thanks all
In your example, the aggregation query for {devID: 123, logLevel:5} returns a count of 5,175,872 which looks like it counted all the documents in your collection (since you mentioned you have 5 million documents).
In this particular example, I'm guessing that the {$match: {devID: 123}} stage matches pretty much every document, hence the aggregation is doing what is essentially a collection scan. Depending on your RAM size, this could have the effect of pushing your working set out of memory, and slow down every other query your server is doing.
If you cannot provide a more selective criteria for the $match stage (e.g. by using a range of logTime as well as devID), then a pre-aggregated report may be your best option.
In general terms, a pre-aggregated report is a document that contains the aggregated information you require, and you update this document every time you insert into the related collection. For example, you could have a single document in a separate collection that looks like:
{
  log: {
    devID: 123,
    levelCount: [
      {level: 5, count: 5175872},
      {level: 1, count: 200000}
    ]
  }
}
where that document is updated with the relevant details every time you insert into the log collection.
Using a pre-aggregated report, you don't need to run the aggregation query anymore. The aggregated information you require is instead available with a single find() query.
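A minimal sketch of the pattern, assuming a hypothetical logStats collection shaped like the document above, with the levelCount entry for the level already present:
// Run alongside each insert into the log collection:
db.logStats.updateOne(
  {"log.devID": 123, "log.levelCount.level": 5},
  {$inc: {"log.levelCount.$.count": 1}}   // positional $ bumps the matched level's count
)
// Reading the report is then a single point query instead of an aggregation:
db.logStats.findOne({"log.devID": 123})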
For more examples on pre-aggregated reports, please see https://docs.mongodb.com/ecosystem/use-cases/pre-aggregated-reports/

MongoDB - sort by subdocument match

Say I have a users collection in MongoDB. A typical user document contains a name field, and an array of subdocuments, representing the user's characteristics. Say something like this:
{
  "name": "Joey",
  "characteristics": [
    {
      "name": "shy",
      "score": 0.8
    },
    {
      "name": "funny",
      "score": 0.6
    },
    {
      "name": "loving",
      "score": 0.01
    }
  ]
}
How can I find the top X funniest users, sorted by how funny they are?
The only way I've found so far, was to use the aggregation framework, in a query similar to this:
db.users.aggregate([
  {$project: {"_id": 1, "name": 1, "characteristics": 1, "_characteristics": "$characteristics"}},
  {$unwind: "$_characteristics"},
  {$match: {"_characteristics.name": "funny"}},
  {$sort: {"_characteristics.score": -1}},
  {$limit: 10}
]);
Which seems to be exactly what I want, except for the fact that according to MongoDB's documentation on using indexes in pipelines, once I call $project or $unwind in an aggregation pipeline, I can no longer utilize indexes to match or sort the collection, which renders this solution somewhat unfeasible for a very large collection.
I think you are half way there. I would do
db.users.aggregate([
  {$match: {'characteristics.name': 'funny'}},
  {$unwind: '$characteristics'},
  {$match: {'characteristics.name': 'funny'}},
  {$project: {_id: 0, name: 1, 'characteristics.score': 1}},
  {$sort: {'characteristics.score': -1}},
  {$limit: 10}
])
I add a match stage to get rid of users without the funny attribute (which can be easily indexed).
unwind and match again to keep only the relevant part of the data
keep only the necessary data with project
sort by the correct score
and limit the results.
that way you can use an index for the first match.
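For instance (assuming no such index exists yet), the multikey index backing that first $match would be:
db.users.createIndex({"characteristics.name": 1})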
The way I see it, if the characteristics you are interested about are not too many, IMO it would be better to have your structure as
{
  "name": "Joey",
  "shy": 0.8,
  "funny": 0.6,
  "loving": 0.01
}
That way you can use an index (sparse or not) to make your life easier!
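A sketch of how that flattened schema would be indexed and queried (field names as in the example above; whether to make the index sparse depends on how many users lack the score):
// Index the score directly:
db.users.createIndex({funny: -1})
// Top 10 funniest users, with the sort served by the index:
db.users.find({funny: {$exists: true}}, {name: 1, funny: 1}).sort({funny: -1}).limit(10)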