MongoDB / MongoEngine: Get 2 Counts in 1 Query

I'm trying to minimize the number of database calls in an application.
Is it possible to complete these two queries in a single call?
system_0 = System.objects(platform_id=platform_id, type=0).count()
system_1 = System.objects(platform_id=platform_id, type=1).count()

I don't know MongoEngine, but you should be able to translate this mongo shell answer into whatever is appropriate for you. Yes, you can achieve it with aggregation. For example:
db.collection.aggregate([{
$match : { platform_id : ... }
}, {
$group: {
_id: "$type",
count: { $sum: 1 }
}
}]);
If the collection contains types other than 0 and 1, you can exclude them in the $match stage as well.
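The $group stage returns one row per type that actually occurs, so a type with no matching documents is simply missing from the result. Below is a minimal sketch of building the pipeline and turning its rows into two counts. It assumes the rows have the { _id, count } shape produced by the $group stage above; buildPipeline and countsByType are hypothetical helper names, and the $in filter is just one way to exclude other types.

```javascript
// Hypothetical helper: pipeline for one platform, keeping only types 0 and 1.
function buildPipeline(platformId) {
  return [
    { $match: { platform_id: platformId, type: { $in: [0, 1] } } },
    { $group: { _id: '$type', count: { $sum: 1 } } },
  ];
}

// Hypothetical helper: maps result rows to { 0: n, 1: m }.
// Types that produced no row default to a count of 0.
function countsByType(rows, types = [0, 1]) {
  const counts = {};
  for (const t of types) counts[t] = 0;
  for (const row of rows) {
    if (row._id in counts) counts[row._id] = row.count;
  }
  return counts;
}
```

With the rows from a single aggregate call, countsByType(rows)[0] and countsByType(rows)[1] would play the role of system_0 and system_1 from the question.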

Related

What is the best practice to find mongo documents count?

I want to know the performance difference between countDocuments and a find query.
I need the count of documents that match a certain filter. Which approach is better and takes less time?
db.collection.countDocuments({ userId: 12 })
or
db.collection.find({ userId: 12 }) and then using the length of the resulting array.
You should definitely use db.collection.countDocuments() if you don't need the data. This method runs an aggregation pipeline with the filter you pass and returns only the count, so you don't waste processing and time waiting for an array of all the results.
This:
db.collection.countDocuments({ userId: 12 })
Is equivalent to:
db.collection.aggregate([
{ $match: { userId: 12 } },
{ $group: { _id: null, n: { $sum: 1 } } }
])

Why is sorting documents by _id slower with $match than without it in MongoDB?

So, I tried to query
db.collection('collection_name').aggregate([
{
$match: { owner_id: '5be9b2f03ef77262c2bd49e6' }
},
{
$sort: { _id: -1 }
}])
The query above takes 20s,
but if I run this query instead:
db.collection('collection_name').aggregate([{$sort : {_id : -1}}])
it only takes 0.7s.
Why is the query without $match actually faster than the one with $match?
Update:
When I try this query:
db.getCollection('callbackvirtualaccounts').aggregate([
{
$match: { owner_id: '5860457640b4fe652bd9c3eb' }
},
{
$sort: { created: -1 }
}
])
it only takes 0.781s.
Why is sorting by _id slower than sorting by the created field?
Note: I'm using MongoDB v3.0.0.
db.collection('collection_name').aggregate([
{
$match: { owner_id: '5be9b2f03ef77262c2bd49e6' }
},
{
$sort: { _id: -1 }
}])
This collection probably doesn't have an index on owner_id. Try creating one with either of the commands below and rerun your previous code.
db.collection('collection_name').createIndex({ owner_id: 1 }) // Simple index
or
db.collection('collection_name').createIndex({ owner_id: 1, _id: -1 }) // Compound index
Note: If you aren't familiar with compound indexes yet, you can create simple indexes individually on each key used in the match or the sort, which should also make the query more efficient.
Query speed depends on many factors: the size of the collection, the size of the documents, the indexes defined on the collection (and whether the queries use them properly), the hardware (CPU, RAM, network), and other processes running at the same time as the query.
For further analysis, you need to say which indexes are defined on the collection in question. This command retrieves them: db.collection.getIndexes()
Note the unique index on the _id field is created by default, and cannot be modified or deleted.
(i)
But if I tried to query: db.collection.aggregate( [ { $sort : { _id : -1 } } ] ) it only takes 0.7s.
The query is faster because there is an index on the _id field and it is used for the sort. An aggregation query can use an index for a $sort stage when the sort happens early in the pipeline. You can verify whether the index is used by generating a query plan (use explain with executionStats mode). If it is, the plan will contain an index scan (IXSCAN).
(ii)
db.collection.aggregate([
{
$match: { owner_id: '5be9b2f03ef77262c2bd49e6' }
},
{
$sort: { _id: -1 }
}
])
The query above takes 20s.
When I try this query it only takes 0.781s.
db.collection.aggregate([
{
$match: { owner_id: '5860457640b4fe652bd9c3eb' }
},
{
$sort: { created: -1 }
}
])
Why is sorting by _id slower than sorting by the created field?
No conclusion can be drawn from the available information. In general, $match and $sort stages placed early in an aggregation pipeline can use indexes created on the fields they operate on.
Generating a query plan will reveal what the issues are.
Please run explain in executionStats mode and post the query plan details for all the queries in question. The MongoDB v3.0.0 documentation covers generating query plans with explain: db.collection.explain()
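Once you have a plan, checking it for an index scan comes down to walking the nested stages. The sketch below assumes a winningPlan-style tree in which each node has a stage name and an optional inputStage or inputStages. collectStages is a hypothetical helper, and samplePlan is hand-written for illustration, not real server output.

```javascript
// Recursively collects stage names from a winningPlan-style tree.
function collectStages(plan, out = []) {
  if (!plan || typeof plan !== 'object') return out;
  if (plan.stage) out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  return out;
}

// Hand-written sample plan for illustration only, not real server output.
const samplePlan = {
  stage: 'FETCH',
  inputStage: { stage: 'IXSCAN', indexName: 'owner_id_1' },
};

const stages = collectStages(samplePlan);
console.log(stages.includes('IXSCAN') ? 'uses an index' : 'no index scan found');
```

If the collected stages contain COLLSCAN instead of IXSCAN, the query is scanning the whole collection, which would be consistent with a missing index on owner_id.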

MongoDB, right projection subfield [duplicate]

Is it possible to rename the fields returned by a find query? I would like to use something like $rename, but I don't want to change the documents I'm accessing. I just want to retrieve them differently, something that works like SELECT COORDINATES AS COORDS in SQL.
What I do now:
db.tweets.findOne({}, {'level1.level2.coordinates': 1, _id:0})
{'level1': {'level2': {'coordinates': [10, 20]}}}
What I would like to be returned is:
{'coords': [10, 20]}
So basically using .aggregate() instead of .find():
db.tweets.aggregate([
{ "$project": {
"_id": 0,
"coords": "$level1.level2.coordinates"
}}
])
And that gives you the result that you want.
In MongoDB 2.6 and above, .aggregate() returns a "cursor" just like .find() does.
See $project and other aggregation framework operators for more details.
For most cases you should simply rename the fields as returned from .find() when processing the cursor. For JavaScript as an example, you can use .map() to do this.
From the shell:
db.tweets.find({},{'level1.level2.coordinates': 1, _id:0}).map( doc => {
doc.coords = doc['level1']['level2'].coordinates;
delete doc['level1'];
return doc;
})
Or more inline:
db.tweets.find({},{'level1.level2.coordinates': 1, _id:0}).map( doc =>
({ coords: doc['level1']['level2'].coordinates })
)
This avoids any additional overhead on the server and should be used in such cases where the additional processing overhead would outweigh the gain of actual reduction in size of the data retrieved. In this case ( and most ) it would be minimal and therefore better to re-process the cursor result to restructure.
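The same client-side reshaping can be written as a small reusable function. This is a sketch in plain JavaScript; getPath and projectAs are hypothetical helper names and not part of any MongoDB driver.

```javascript
// Reads a dotted path such as 'level1.level2.coordinates' from a plain object.
// Returns undefined if any segment along the path is missing.
function getPath(doc, path) {
  return path
    .split('.')
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Returns a new object exposing the value at `path` under the name `alias`.
function projectAs(doc, path, alias) {
  return { [alias]: getPath(doc, path) };
}

const doc = { level1: { level2: { coordinates: [10, 20] } } };
console.log(projectAs(doc, 'level1.level2.coordinates', 'coords'));
```

A function like this could be passed to .map() on the cursor, for example doc => projectAs(doc, 'level1.level2.coordinates', 'coords').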
As mentioned by @Neil Lunn, this can be achieved with an aggregation pipeline.
Starting in Mongo 4.2, the $replaceWith aggregation stage can be used to replace a document with a sub-document:
// { level1: { level2: { coordinates: [10, 20] }, b: 4 }, a: 3 }
db.collection.aggregate([
{ $replaceWith: { coords: "$level1.level2.coordinates" } }
])
// { "coords" : [ 10, 20 ] }
Since you mention findOne, you can also limit the number of resulting documents to 1 as such:
db.collection.aggregate([
{ $replaceWith: { coords: "$level1.level2.coordinates" } },
{ $limit: 1 }
])
Prior to Mongo 4.2 and starting Mongo 3.4, $replaceRoot can be used in place of $replaceWith:
db.collection.aggregate([
{ $replaceRoot: { newRoot: { coords: "$level1.level2.coordinates" } } }
])
In general, the $project stage takes field names with 1 or 0 (or true or false) to include or exclude them from the output. You can also assign an expression to a field name instead of true or false, which effectively renames the field. The syntax is shown below:
db.test_collection.aggregate([
{$group: {
_id: '$field_to_group',
totalCount: {$sum: 1}
}},
{$project: {
_id: false,
renamed_field: '$_id', // here assigning a value instead of 0 or 1 / true or false effectively renames the field.
totalCount: true
}}
])
Stages (>= 4.2)
{ $addFields: { "New": "$Old" } }
{ $unset: "Old" }
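The two stages can be generated for any pair of field names. This is a sketch only: renameFieldStages is a hypothetical helper, and it relies on the $unset stage taking the field name as a plain string without a leading $.

```javascript
// Hypothetical helper: builds the two stages that copy oldName to newName
// and then drop oldName from the output (MongoDB >= 4.2 for the $unset stage).
function renameFieldStages(oldName, newName) {
  return [
    { $addFields: { [newName]: '$' + oldName } }, // copy the value under the new name
    { $unset: oldName },                          // remove the original field
  ];
}

console.log(JSON.stringify(renameFieldStages('Old', 'New')));
```

The returned array can be spread into a larger pipeline, for example [...renameFieldStages('Old', 'New'), { $limit: 1 }].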

How to count the number of documents on date field in MongoDB

Scenario: I have the following collection in MongoDB:
{
"_id" : "CustomeID_3723",
"IsActive" : "Y",
"CreatedDateTime" : "2013-06-06T14:35:00Z"
}
Now I want to know the number of documents created on a particular day (say, 2013-03-04).
So I am trying to find a solution using the aggregation framework.
Information:
So far I have the following query built:
collection.aggregate([
{ $group: {
_id: '$CreatedDateTime'
}
},
{ $group: {
_id: null,
count: { $sum: 1 }
}
},
{ $project: {
_id: 0,
"count" :"$count"
}
}
])
Issue: The query above gives me a count, but not one based on the date alone. It takes the time into account as well when determining unique values.
Question: Given that the field holds an ISO date, can anyone tell me how to count the documents by date only (i.e. excluding the time)?
Replace your two groups with
{$project:{day:{$dayOfMonth:'$CreatedDateTime'},month:{$month:'$CreatedDateTime'},year:{$year:'$CreatedDateTime'}}},
{$group:{_id:{day:'$day',month:'$month',year:'$year'}, count: {$sum:1}}}
You can read more about the date operators here: http://docs.mongodb.org/manual/reference/aggregation/#date-operators
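To make the grouping concrete, here is a plain JavaScript mirror of what the $project and $group stages compute. It is a sketch, not the server's implementation: countByDay is a hypothetical helper, the sample documents are made up, and it assumes CreatedDateTime holds real Date values rather than strings. It uses UTC parts because the date operators work in UTC when no timezone is given.

```javascript
// Counts documents per calendar day (UTC), ignoring the time of day.
function countByDay(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const d = doc.CreatedDateTime;
    // Same three parts the pipeline extracts: $year, $month, $dayOfMonth.
    const key = `${d.getUTCFullYear()}-${d.getUTCMonth() + 1}-${d.getUTCDate()}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Made-up sample documents for illustration.
const docs = [
  { CreatedDateTime: new Date('2013-06-06T14:35:00Z') },
  { CreatedDateTime: new Date('2013-06-06T23:59:59Z') },
  { CreatedDateTime: new Date('2013-06-07T00:00:00Z') },
];

const counts = countByDay(docs);
console.log(counts.get('2013-6-6'), counts.get('2013-6-7'));
```

Documents created at different times on the same day fall into the same bucket, which is the behaviour the question asks for.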

Getting first and last element of array in MongoDB

MongoDB: I'm looking to make one query that returns both the first and last element of an array. I realize that I can do this with multiple queries, but I would really like to do it with one.
Assume a collection "test" where each object has an array "arr" of numbers:
db.test.find({},{arr:{$slice: -1},arr:{$slice: 1}});
This will result in the following:
{ "_id" : ObjectId("xxx"), "arr" : [ 1 ] } <-- 1 is the first element
Is there a way to maybe alias the results? Similar to what the mysql AS keyword would allow in a query?
This is not possible at the moment, but it will be with the Aggregation Framework, which is in development now, if I understand your functional requirement correctly.
You have to wonder about your schema if you have this requirement in the first place though. Are you sure there isn't a more elegant way to get this to work by changing your schema accordingly?
This can be done with the aggregation framework using the array operators $first and $last (available starting in MongoDB 4.4) as follows:
db.test.aggregate([
{ '$addFields': {
'firstElem': { '$first': '$arr' },
'lastElem': { '$last': '$arr' }
} }
])
or using $slice, which returns one-element arrays rather than the bare elements:
db.test.aggregate([
{ '$addFields': {
'firstElem': { '$slice': [ '$arr', 1 ] },
'lastElem': { '$slice': [ '$arr', -1 ] }
} }
])
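On versions that lack the $first and $last array operators, $arrayElemAt is another option that returns the element itself rather than a one-element array. This is a swapped-in technique and not part of the answers above. The sketch below assumes the same "test" collection and "arr" field, and firstAndLastPipeline is a hypothetical variable name.

```javascript
// Pipeline sketch: index 0 is the first element, index -1 is the last.
const firstAndLastPipeline = [
  {
    $addFields: {
      firstElem: { $arrayElemAt: ['$arr', 0] },
      lastElem: { $arrayElemAt: ['$arr', -1] },
    },
  },
];

console.log(JSON.stringify(firstAndLastPipeline));
```

In the shell this would be run as db.test.aggregate(firstAndLastPipeline).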