How to run aggregate after calling createIndexes in MongoDB?

I want to optimize my query as much as possible, so I found a method called "createIndexes". However, I don't understand how to use it together with "aggregate".
db.createIndexes({age:1})
.then(_ => {
  return db.aggregate(
    [
      {"$match":{"age":age}},
      {"$group":{
        _id: '$name',
        hobby:{$addToSet:'$hobby'}
      }},
      {$project:{
        _id:0,
        name:'$_id',
        hobbyCount:{$size:'$hobby'}
      }}
    ]
  )
})
I want to put an index on the age field and then run the aggregate operation, but the indexing is not happening. Does anyone have an idea why this is not working?

createIndexes() is a one-time operation. Once you call this method, the indexes are built and all subsequent queries will use them whenever possible.
That said, createIndexes does not affect how you normally call aggregate.
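As a rough sketch of that separation, the index can be created once (for example at application startup) and the aggregation called on its own afterwards. The sketch assumes `collection` is a Node.js MongoDB driver collection object, which the question does not confirm; the function names `ensureAgeIndex`, `buildHobbyCountPipeline`, and `hobbyCountsByAge` are hypothetical.

```javascript
// One-time setup: build the index on `age` (e.g. at application startup).
// Assumes `collection` exposes the Node.js driver's createIndex() method.
function ensureAgeIndex(collection) {
  return collection.createIndex({ age: 1 });
}

// Build the pipeline from the question. The leading $match on `age`
// is the stage that can take advantage of the index.
function buildHobbyCountPipeline(age) {
  return [
    { $match: { age: age } },
    { $group: { _id: '$name', hobby: { $addToSet: '$hobby' } } },
    { $project: { _id: 0, name: '$_id', hobbyCount: { $size: '$hobby' } } },
  ];
}

// Regular call: no index-related code is needed here.
function hobbyCountsByAge(collection, age) {
  return collection.aggregate(buildHobbyCountPipeline(age)).toArray();
}
```

Whether the index is actually picked up can be checked from the explain output of the aggregation.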

Related

Is it possible to $setOnInsert with aggregation pipeline?

MongoDB has recently added an option to perform an update operation by providing an aggregation pipeline rather than the standard modifier object. Check MongoDB's docs on this topic.
The ability to use an aggregation pipeline, whose statements can refer to existing document properties, can be extremely useful in situations where certain fields need to be evaluated based on other fields, e.g. during data migration.
Moreover, most of the standard update operators like $set, $push, $inc, etc. can be successfully replicated with the aggregation expression language, so in some sense this new functionality generalizes the good old modifiers technique. Though, I must admit the pipeline can become quite verbose if one tries to do things like $addToSet. This of course brings up a whole bunch of performance-related questions, but let's ignore them for now.
So far, there's been just one thing which I haven't been able to fully replicate with the aggregation pipeline update, namely the $setOnInsert operator. Let's assume that I want to perform an upsert:
db.test.update(selector, pipeline, { upsert: true });
My initial intuition was that the $$ROOT variable (which I can use in the pipeline) will equal null unless there exists a document that matches selector. Unfortunately, but probably for a good reason, MongoDB developers decided that $$ROOT should be derived from selector by default. It makes sense when you think about how normal $setOnInsert works, but it also makes it practically impossible to distinguish between an update and an insert within pipeline.
I know what you're thinking. You can look at $$ROOT._id. This is a good idea, though if _id is part of the selector it doesn't work anymore. I have figured out that this can be bypassed by tricking MongoDB a little bit and doing things like:
selector = {
_id: { $in: [value, 'fake'] },
}
instead of the simpler { _id: value }, but this doesn't look clean. Please note that if $in contains only one element, then Mongo is actually clever enough to figure out what the identifier should be and it populates $$ROOT accordingly (sic!).
I am wondering if anyone has a better idea how to approach this. Maybe there's some hidden variable that I could potentially use inside the pipeline itself to distinguish between update and insert (e.g. in $merge stage there's $$new variable which serves a similar purpose)?

If there are no matching documents, $$ROOT will contain only the _id field. So you can convert $$ROOT to an array of its key/value pairs and check whether the size of that array equals 1. If it does, create the new document; if it does not, leave the document unchanged.
$objectToArray and $size: convert $$ROOT to an array of its key/value pairs and get the size of that array.
$cond: check whether that size equals 1. If it does, merge the current $$ROOT (which holds only the _id field) with the update object; otherwise return the current $$ROOT. In both cases, store the outcome in the result field.
$mergeObjects: merge $$ROOT with the update you are sending; its output is what ends up in the result field.
$replaceRoot: replace the root with the result field from the previous stage.
db.collection.update(
  { _id: 1 },
  [
    {
      $set: {
        result: {
          $cond: {
            if: { $eq: [{ $size: { $objectToArray: "$$ROOT" } }, 1] },
            then: { $mergeObjects: ["$$ROOT", { key: 3 }] },
            else: "$$ROOT"
          }
        }
      }
    },
    { $replaceRoot: { newRoot: "$result" } }
  ],
  { upsert: true }
)
Working example
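The same pipeline can be wrapped in a small helper so that the insert-only fields are passed in as a parameter. This is only a sketch: `buildInsertOnlyPipeline` is a hypothetical name, and it inherits the assumption from the answer above that a freshly upserted document starts out with only its `_id` (so an existing document that holds nothing but `_id` would be treated like an insert).

```javascript
// Build an update pipeline that applies `insertFields` only when $$ROOT
// holds a single key, i.e. the document was just created by the upsert.
function buildInsertOnlyPipeline(insertFields) {
  return [
    {
      $set: {
        result: {
          $cond: {
            if: { $eq: [{ $size: { $objectToArray: '$$ROOT' } }, 1] },
            then: { $mergeObjects: ['$$ROOT', insertFields] },
            else: '$$ROOT',
          },
        },
      },
    },
    // Promote the temporary `result` field to be the whole document.
    { $replaceRoot: { newRoot: '$result' } },
  ];
}

// Usage in the shell (same call shape as the answer above):
// db.collection.update({ _id: 1 }, buildInsertOnlyPipeline({ key: 3 }), { upsert: true });
```

Values inside `insertFields` are evaluated as aggregation expressions, so a string value that starts with `$` would need to be wrapped in `$literal`.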

Getting the latest xx records with mongoose: how to order them?

I'm trying to get the last 20 records of user collection with mongoose:
User.find({'owner': req.params.id}).
sort({date: -1}).
limit(20).
exec(.....)
This works well and returns the last 20 items.
However, the items in the resulting array are sorted from most recent to oldest. Is there any way to reverse this with mongoose?
Thanks

You can certainly do this with an aggregation, such as this:
db.user.aggregate([
  { $match : {"owner" : req.params.id}},
  { $sort : {"date" : -1}},
  { $limit : 20},
  { $sort : {"date" : 1}}
])
Notes on this aggregation:
The first three stages do the same job as the find() in your question
The fourth part applies a further sort, which re-orders the returned 20 records from oldest to most recent
I have written it in native MongoDB aggregation syntax; you will need to adjust the code to generate the same aggregation from Mongoose.
Update: I don't think this is possible with find() and cursor methods alone, because it would need two different sort() operations. MongoDB does not treat cursor methods as a sequence of independent operations: the docs give an example where sort().limit() is equivalent to limit().sort(), which shows that the order in which they are chained cannot be relied upon as meaningful.
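One possible Mongoose translation of the aggregation above is sketched here. It assumes `User` is the Mongoose model from the question; `lastNOldestFirst` is a hypothetical helper name. Mongoose does not cast values inside pipeline stages, so if `owner` is stored as an ObjectId, the string from `req.params.id` would have to be converted first.

```javascript
// Build the four-stage pipeline: newest n for an owner, then oldest first.
function lastNOldestFirst(ownerId, n) {
  return [
    { $match: { owner: ownerId } },
    { $sort: { date: -1 } }, // newest first, so $limit keeps the latest n
    { $limit: n },
    { $sort: { date: 1 } },  // re-order the kept documents oldest first
  ];
}

// With a Mongoose model this could be run as:
// const users = await User.aggregate(lastNOldestFirst(req.params.id, 20));
```

A simpler alternative, not mentioned above, is to keep the original find() query and call `.reverse()` on the resulting array in application code.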

Count the total and then select only the latest 20. This may not be the most efficient approach, but it will solve your problem.
User.count({'owner': req.params.id}, function(err, count){
  if(count){
    // Skip everything except the last 20; never pass a negative skip
    var skipItem = Math.max(count - 20, 0);
    User.find({'owner': req.params.id})
      .sort({date: 1})
      .skip(skipItem)
      .limit(20)
      .exec(.....)
  }
});

db.users.aggregate([
  { $match: {
    'owner': req.params.id
  }},
  // Replace [arrayFieldName] with the name of the array field
  { $unwind: '$[arrayFieldName]' },
  { $sort: {
    '[arrayFieldName]': -1, // or 1 for ascending order
    'date': -1
  }}
])

$group with _id equal to null, or Array.prototype.length?

After performing an aggregation operation on a Mongo collection, my last step is to get the length of the resulting array. Now I have two options:
Use one more $group stage whose _id is null:
db.col.aggregate([
// ...,
{
$group: {
_id: null,
length: { $sum: 1},
},
},
]);
Or use the .length property:
db.col.aggregate([
// ...
]).length;
Both of them work well and give me the expected result. I just wonder which way is better in terms of performance. What do you think?

I would use the .length property, as it's likely to be a plain attribute of the JS Array object (it might depend on the JS engine your code is using).
I believe that using $group will make the mongo engine process all the data and then count how many documents it returns, which would be much slower.
As felix said, you can run a small benchmark and see which option is faster.
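A third option, not mentioned in the question, is the `$count` stage (available since MongoDB 3.4), which is shorthand for counting with `$group` and returns only the count to the client. The helper name `withCount` and the `status` filter in the usage comment are hypothetical.

```javascript
// Append a $count stage to an existing pipeline without mutating it.
function withCount(pipeline, fieldName) {
  return pipeline.concat([{ $count: fieldName }]);
}

// Example (the $match filter is made up for illustration):
// db.col.aggregate(withCount([{ $match: { status: 'A' } }], 'length'))
```

Note that `$count` emits no document at all when the preceding stages produce no results. Whether this is faster than reading .length on the client depends on the result size, so a benchmark like the one suggested above is still worthwhile.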

Meteor collection get last document of each selection

Currently I use the following find query to get the latest document of a certain ID
Conditions.find({
caveId: caveId
},
{
sort: {diveDate:-1},
limit: 1,
fields: {caveId: 1, "visibility.visibility":1, diveDate: 1}
});
How can I do the same for multiple IDs, for example with $in?
I tried it with the following query. The problem is that it limits the result to 1 document across all the found caveIds, whereas the limit should apply to each caveId separately.
Conditions.find({
caveId: {$in: caveIds}
},
{
sort: {diveDate:-1},
limit: 1,
fields: {caveId: 1, "visibility.visibility":1, diveDate: 1}
});
One solution I came up with is using the aggregate functionality.
var conditionIds = Conditions.aggregate(
[
{"$match": { caveId: {"$in": caveIds}}},
{
$group:
{
_id: "$caveId",
conditionId: {$last: "$_id"},
diveDate: { $last: "$diveDate" }
}
}
]
).map(function(child) { return child.conditionId});
var conditions = Conditions.find({
_id: {$in: conditionIds}
},
{
fields: {caveId: 1, "visibility.visibility":1, diveDate: 1}
});

You don't want to use $in here as noted. You could solve this problem by looping through the caveIds and running the query on each caveId individually.
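The loop described above could look roughly like this. It is a sketch that assumes `Conditions` is a Meteor collection with the classic synchronous `findOne` API on the server; `latestConditionPerCave` is a hypothetical helper name.

```javascript
// Look up the most recent condition for each caveId, one query per id.
function latestConditionPerCave(Conditions, caveIds) {
  return caveIds
    .map(function (caveId) {
      return Conditions.findOne(
        { caveId: caveId },
        {
          sort: { diveDate: -1 }, // newest dive first
          fields: { caveId: 1, 'visibility.visibility': 1, diveDate: 1 },
        }
      );
    })
    .filter(Boolean); // drop caves that have no conditions yet
}
```

This issues one query per caveId, so it suits short id lists better than long ones.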

You're basically looking at a join query here: you need all caveIds and then have to look up the last dive for each.
In my opinion (and this is only an opinion!), this is a problem of database schema/denormalization:
You could, as mentioned here, look up all caveIds and then run a single query for each, every time you need to look up the last dives.
However, I think you are much better off recording/updating the last dive inside your cave document, and then looking up all caveIds of interest, pulling only the lastDive field.
That will give you immediately what you need, rather than going through expensive search/sort queries. This is at the expense of maintaining that field in the document, but it sounds like it should be fairly trivial as you only need to update the one field when a new event occurs.
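A minimal sketch of that denormalization follows. Everything beyond the `lastDive` field name is an assumption made for illustration: a `Caves` collection whose `_id` matches `caveId`, the shape stored under `lastDive`, the helper name `recordCondition`, and the premise that dives are recorded in chronological order.

```javascript
// Insert a new condition and copy its key fields onto the cave document.
// Assumes Meteor-style collections with synchronous insert/update.
function recordCondition(Conditions, Caves, condition) {
  var conditionId = Conditions.insert(condition);
  Caves.update(
    { _id: condition.caveId },
    {
      $set: {
        lastDive: {
          conditionId: conditionId,
          diveDate: condition.diveDate,
          visibility: condition.visibility,
        },
      },
    }
  );
  return conditionId;
}

// Reading the latest dives is then a single query:
// Caves.find({ _id: { $in: caveIds } }, { fields: { lastDive: 1 } })
```

If dives can be entered out of order, the update would also need to check that the new diveDate is later than the stored one.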

is it possible to use "$where" in mongodb aggregation functions

I need to get the length of a string value in MongoDB using aggregation functions.
It works in:
db.collection_name.find({"$where":"this.app_name.length===12"})
but when I use it in:
db.collection_name.aggregate({$match:
{"$where":"this.app_name.length===12"}
},
{
$group :
{
_id : 1,
app_downloads : {$sum: "$app_downloads"}
}
}
);
I got this result:
failed: exception: $where is not allowed inside of a $match aggregation expression
The question is: is it possible to use $where in aggregation functions?
Or is there any other way of getting the length of a string value in an aggregation function?
Thanks in advance
Eric

MongoDB doesn't support $where in the aggregation pipeline, and I hope it never will, because JavaScript slows things down. Nevertheless, you still have options:
1) Maintain an additional field (e.g. app_name_len) that stores the length of app_name, and query it when needed.
2) You can try the extremely slow MapReduce framework, where you are allowed to write aggregations in JavaScript.

Today I had the same problem.
MongoDB doesn't support this.app_name.length there, but you can express the condition with $regex. It is not very quick, but it works.
{"app_name": { $regex: /^.{12}$/ }}

A simple way to achieve the behaviour the OP expects is to combine $expr with $strLenCP:
db.collection.find({
$expr: {
$eq: [
12,
{
$strLenCP: "$app_name"
}
]
}
})
Mongo Playground
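Since the question is about aggregate rather than find, the same `$expr` condition can be placed in a `$match` stage ahead of the `$group` from the question. This sketch assumes MongoDB 3.6 or later (where `$expr` is available) and that every document has a string `app_name`, since `$strLenCP` is expected to raise an error on null or missing values; `downloadsForNameLength` is a hypothetical helper name.

```javascript
// Build a pipeline that sums app_downloads for documents whose
// app_name has exactly `length` code points.
function downloadsForNameLength(length) {
  return [
    { $match: { $expr: { $eq: [{ $strLenCP: '$app_name' }, length] } } },
    { $group: { _id: 1, app_downloads: { $sum: '$app_downloads' } } },
  ];
}

// Usage in the shell:
// db.collection_name.aggregate(downloadsForNameLength(12))
```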