I'm trying to sort the values in the collection "Vehicle" by the text field "condition", although it seems that the sorting is having no effect on the result set. I believe the syntax is correct, so what would be the cause of this problem?
db.Vehicle.aggregate([
{
$geoNear: {
near:[26.243640,-80.265397],
maxDistance: 2500/111.12,
query: { isActive: true, condition: {$in: ['New','Used']} },
distanceField: "distance",
limit: 10
}
},
{
$project: {condition: 1, distance: 1}
},
{
$sort: {condition: -1}
}
])
I have every combination of compound indexes for this collection. The aggregation query I used is:
db.products.aggregate( [
{
$facet: {
"categorizedByColor": [
{
$match: {
size: { $in : [50,60,70] },
brand: { $in : ["Raymond","Allen Solly","Van Heusen"] }
}
},
{
$bucket: {
groupBy: "$color",
default: "Other",
output: {
"count": { $sum: 1 }
}
}
}
],
"categorizedBySize": [
{
$match: {
color: { $in : ["Red","Green","Blue"] },
brand: { $in : ["Raymond","Allen Solly","Van Heusen"] }
}
},
{
$bucket: {
groupBy: "$size",
default: "Other",
output: {
"count": { $sum: 1 }
}
}
}
],
"categorizedByBrand": [
{
$match: {
color: { $in : ["Red","Green","Blue"] },
size: { $in : [50,60,70] }
}
},
{
$bucket: {
groupBy: "$brand",
default: "Other",
output: {
"count": { $sum: 1 }
}
}
}
],
"productResults": [
{
$match: {
color: { $in : ["Red","Green","Blue"] },
size: { $in : [50,60,70] },
brand: { $in : ["Raymond","Allen Solly","Van Heusen"] }
}
}
]
}
}
]);
This query took around 6s to return the results. Is there any alternative approach that can make use of MongoDB indexes?
Note: The real aggregation query has more than 14 facet pipelines. To keep it readable I have provided only 4 of them.
Sometimes 14 separate queries can do the job, and sometimes they cannot.
If the $facet is the first stage in the aggregation pipeline, 14 queries are the more efficient option. If the $facet follows a complex pipeline that creates or filters these documents, there are alternatives to the $match inside each $facet sub-pipeline. Sometimes one also needs a consistent snapshot of the data, which 14 queries cannot give, since the data may change in between.
Since we don't know what the earlier stages in this pipeline do, and the question is about alternatives that let the indexes speed up the rest of the query, I can offer one option as an example. From the data we have here it is hard to tell whether it will be faster than the other options, but it does allow the indexes to be used, which is the main point of the question:
The first step follows the smart suggestion from both @Takis and @Gibbs.
The second step makes each $facet's $match much simpler, by marking in advance which documents belong to which $facet pipeline.
db.collection.aggregate([
{
$match: {$or: [
{size: {$in: [50, 60, 70]}},
{color: {$in: ["Red", "Green", "Blue"]}},
{brand: {$in: ["Raymond", "Allen Solly", "Van Heusen"]}}
]
}
},
{
$addFields: {
categorizedByColor: {
$cond: [{$and: [{$in: ["$size",[50, 60, 70]]},
{$in: ["$brand",["Raymond", "Allen Solly", "Van Heusen"]]}]
}, true, false]
},
categorizedBySize: {
$cond: [{$and: [{$in: [ "$color", ["Red", "Green", "Blue"]]},
{$in: ["$brand",["Raymond", "Allen Solly", "Van Heusen"]]}]
}, true, false]
},
categorizedByBrand: {
$cond: [{$and: [{$in: [ "$color", ["Red", "Green", "Blue"]]},
{$in: ["$size",[50, 60, 70]]}]
}, true, false]
},
productResults: {
$and: [{$in: ["$color", ["Red", "Green", "Blue"]]},
{$in: ["$size",[50, 60, 70]]},
{$in: ["$brand",["Raymond", "Allen Solly", "Van Heusen"]]}]
}
}
},
{
$facet: {
"categorizedByColor": [
{$match: {categorizedByColor: true}},
{
$bucket: {
groupBy: "$color",
default: "Other",
output: {"count": {$sum: 1}}
}
}
],
"categorizedBySize": [
{$match: {categorizedBySize: true}},
{
$bucket: {
groupBy: "$size",
default: "Other",
output: {"count": {$sum: 1}}
}
}
],
"categorizedByBrand": [
{$match: {categorizedByBrand: true}},
{
$bucket: {
groupBy: "$brand",
default: "Other",
output: {"count": {$sum: 1}}
}
}
],
"productResults": [{$match: {productResults: true}}]
}
}
])
Playground example
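With 14 facets, writing every flag by hand gets repetitive. Here is a minimal sketch of generating the same three stages from a filter spec. The `filters` object, the `buildFacetPipeline` helper and the flag naming are my own assumptions for illustration, and the counting step uses $group instead of the $bucket stages above (per the MongoDB documentation, $bucket also expects a boundaries array):

```javascript
// Hypothetical filter spec: one entry per filterable field.
const filters = {
  color: ["Red", "Green", "Blue"],
  size: [50, 60, 70],
  brand: ["Raymond", "Allen Solly", "Van Heusen"],
};

// Flag/facet name for a field, e.g. "color" -> "categorizedByColor" (naming is an assumption).
const flagName = (field) =>
  "categorizedBy" + field[0].toUpperCase() + field.slice(1);

// Builds: an index-friendly $match, the $addFields flags, and the $facet.
function buildFacetPipeline(filters) {
  const fields = Object.keys(filters);
  const inExpr = (f) => ({ $in: ["$" + f, filters[f]] });

  // First stage: plain query operators, so the planner can consider indexes.
  const match = {
    $match: { $or: fields.map((f) => ({ [f]: { $in: filters[f] } })) },
  };

  const flags = {};
  const facets = {};
  for (const field of fields) {
    const name = flagName(field);
    // A facet on `field` applies every filter except its own.
    flags[name] = { $and: fields.filter((f) => f !== field).map(inExpr) };
    facets[name] = [
      { $match: { [name]: true } },
      { $group: { _id: "$" + field, count: { $sum: 1 } } },
    ];
  }
  // The result list applies all filters.
  flags.productResults = { $and: fields.map(inExpr) };
  facets.productResults = [{ $match: { productResults: true } }];

  return [match, { $addFields: flags }, { $facet: facets }];
}

const pipeline = buildFacetPipeline(filters);
// In the shell this would be passed as: db.collection.aggregate(pipeline)
```

Adding a 15th filter then only means adding one entry to `filters`.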
Going a step further, there is even a way to get these results in one query without the $facet step at all, by using $group with $push and $cond instead. This should iterate over the documents once, instead of 14 times, but may result in a large document (with the data duplicated for each categorization). The main idea of such a solution can be seen on this MongoDB playground. It is important to say that these methods are not necessarily better or worse than the others. The "right" solution depends on your specific case and data, which we can't see here. You asked for alternative approaches that allow the indexes to be used, so I'm pointing out some directions.
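A rough sketch of that $group idea, in case the playground link is not available. The `inSet` and `pushIf` helpers are hypothetical names of mine, and the sketch assumes MongoDB 3.6+ so that $$REMOVE inside $push skips documents that fail the condition:

```javascript
const colors = ["Red", "Green", "Blue"];
const sizes = [50, 60, 70];
const brands = ["Raymond", "Allen Solly", "Van Heusen"];

// Aggregation-expression form of "field is one of values".
const inSet = (field, values) => ({ $in: ["$" + field, values] });

// Push `value` only when every condition holds; otherwise push nothing.
const pushIf = (conditions, value) => ({
  $push: { $cond: [{ $and: conditions }, value, "$$REMOVE"] },
});

// One pass over the documents, one array per categorization.
const groupStage = {
  $group: {
    _id: null,
    categorizedByColor: pushIf([inSet("size", sizes), inSet("brand", brands)], "$color"),
    categorizedBySize: pushIf([inSet("color", colors), inSet("brand", brands)], "$size"),
    categorizedByBrand: pushIf([inSet("color", colors), inSet("size", sizes)], "$brand"),
    productResults: pushIf(
      [inSet("color", colors), inSet("size", sizes), inSet("brand", brands)],
      "$$ROOT"
    ),
  },
};
// In the shell: db.collection.aggregate([ /* indexed $match */, groupStage ])
```

The arrays hold raw values, so counting per color, size or brand still needs a later stage, and the single output document is subject to the 16 MB document limit.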
The $facet stage cannot use indexes and always performs a COLLSCAN (full collection scan) when executed.
Because of that, you should filter (and sort) as early as possible in your pipeline, so that the data common to all the $facet sub-pipelines is selected first.
So, in your case, the filters:
$match: {
color: { $in : ["Red","Green","Blue"] },
size: { $in : [50,60,70] },
brand: { $in : ["Raymond","Allen Solly","Van Heusen"] }
}
should be used as the first stage in the pipeline, followed by $facet.
Hope I was clear enough. :)
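A minimal sketch of that ordering, assuming every sub-pipeline should work on the same filtered set. Note that this differs from the question, where each facet leaves out its own filter. The `countBy` helper is a hypothetical name, and it counts with $group rather than $bucket:

```javascript
// Common filter, placed first so it can be answered from an index.
const commonMatch = {
  $match: {
    color: { $in: ["Red", "Green", "Blue"] },
    size: { $in: [50, 60, 70] },
    brand: { $in: ["Raymond", "Allen Solly", "Van Heusen"] },
  },
};

// Sub-pipeline that counts documents per distinct value of `field`.
const countBy = (field) => [
  { $group: { _id: "$" + field, count: { $sum: 1 } } },
];

const pipeline = [
  commonMatch,
  {
    $facet: {
      categorizedByColor: countBy("color"),
      categorizedBySize: countBy("size"),
      categorizedByBrand: countBy("brand"),
      // Pass-through sub-pipeline: returns the filtered documents as they are.
      productResults: [{ $match: {} }],
    },
  },
];

// To check in mongosh whether the first stage uses an index:
//   db.products.explain("executionStats").aggregate(pipeline)
```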
I am trying to run the aggregate query in Mongo using $addFields and $match
.aggregate([
{
$addFields: {
level: { $sum: '$members.level' },
},
},
{
$match: {
level: { $gte: level }
},
},
{
$project: {
_id: 0,
logo: 1,
name: 1,
level: 1,
id: '$_id',
joinType: 1,
countryId: 1,
minimumJoinLevel: 1,
membersCount: { $size: '$members' },
},
},
])
The issue is that level is not an indexed field; it is calculated inside the query.
My question is: how can I run this query efficiently, so that it avoids a "COLLSCAN" and executes as an "IXSCAN"?
I would like to conditionally leave out fields in a response.
I have an aggregation query which uses $geoNear to find the nearest POI, and I would like to retrieve all the information only if the distance between the query point and the POI is less than 500.
Let's assume I want to leave out "someField" if the distance is less than or equal to 500.
Here is what I have come up with:
db.pois.aggregate([
{
"$geoNear": {
"near": {
type: "Point",
coordinates: [49.607857, 6.129143]
},
"maxDistance": 0.5 * 1000,
"spherical": true,
"distanceField": "distance"
}
}, {
$project: {
_id:0,
"someField": {
$cond: [{$lte: ["$distance", 500]}, 1, 0 ]
}
}
}
]).pretty()
But instead of leaving the field out of the response, this query somehow replaces the value of "someField" with 0 or 1.
I would appreciate any help.
Starting in MongoDB 3.6, you can use the system variable $$REMOVE in aggregation expressions to conditionally exclude a field.
Query:
db.pois
.aggregate([
{
$geoNear: {
near: {
type: "Point",
coordinates: [49.607857, 6.129143]
},
maxDistance: 0.5 * 1000,
spherical: true,
distanceField: "distance"
}
},
{
$project: {
_id: 0,
someField: {
$cond: {
if: { $lte: ["$distance", 500] },
then: "$$REMOVE",
else: "$someField"
}
}
}
}
])
.pretty();
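To make the effect concrete, here is a plain JavaScript imitation of what such a $project does to each document. It is only an illustration, not MongoDB code: the `REMOVE` sentinel, the `projectPoi` function and the sample documents are made up, and it assumes the else branch returns the field's own value.

```javascript
// Sentinel standing in for MongoDB's $$REMOVE (illustrative only).
const REMOVE = Symbol("REMOVE");

// Imitates: $project { _id: 0, someField: $cond(distance <= 500, $$REMOVE, someField) }
function projectPoi(doc) {
  const value = doc.distance <= 500 ? REMOVE : doc.someField;
  const out = {};
  // A field whose expression evaluates to REMOVE is left out entirely,
  // instead of being set to a literal such as 0 or 1.
  if (value !== REMOVE && value !== undefined) {
    out.someField = value;
  }
  return out;
}

// Hypothetical sample documents.
const near = projectPoi({ _id: 1, someField: "details", distance: 120 });
const far = projectPoi({ _id: 2, someField: "details", distance: 900 });

console.log(near); // {}
console.log(far);  // { someField: 'details' }
```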
I would like to get the documents with the N highest fields for each of N categories. For example, the posts with the 3 highest scores from each of the past 3 months. So each month would have 3 posts that "won" for that month.
Here is what my work so far has gotten, simplified.
// simplified
db.posts.aggregate([
{$bucket: {
groupBy: "$createdAt",
boundaries: [
ISODate('2019-06-01'),
ISODate('2019-07-01'),
ISODate('2019-08-01')
],
default: "Other",
output: {
posts: {
$push: {
// ===
// This gets all the posts, bucketed by month
score: '$score',
title: '$title'
// ===
}
}
}
}},
{$match: {_id: {$ne: "Other"}}}
])
I attempted to use the $slice operator in between the // === markers, but got an error (below).
postResults: {
$each: [{
score: '$score',
title: '$title'
}],
$sort: {score: -1},
$slice: 2
}
An object representing an expression must have exactly one field: { $each: [ { score: \"$score\", title: \"$title\" } ], $sort: { baseScore: -1.0 }, $slice: 2.0 }
The $slice you're trying to use is the update modifier, which only works together with $push in update operations. To get the top N posts you need to run $unwind, then $sort and $group to rebuild an ordered array. As the last step you can apply the aggregation $slice operator. Try:
db.posts.aggregate([
{$bucket: {
groupBy: "$createdAt",
boundaries: [
ISODate('2019-06-01'),
ISODate('2019-07-01'),
ISODate('2019-08-01')
],
default: "Other",
output: {
posts: {
$push: {
score: '$score',
title: '$title'
}
}
}
}},
{ $match: {_id: {$ne: "Other"}}},
{ $unwind: "$posts" },
{ $sort: { "posts.score": -1 } },
{ $group: { _id: "$_id", posts: { $push: { "score": "$posts.score", "title": "$posts.title" } } } },
{ $project: { _id: 1, posts: { $slice: [ "$posts", 3 ] } } }
])
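If it helps to see the logic outside of MongoDB, the same bucket, sort and slice steps can be imitated in plain JavaScript. This is only a sketch with made-up sample posts, and `topNPerMonth` is a hypothetical helper name:

```javascript
// Hypothetical sample posts.
const posts = [
  { title: "a", score: 5, createdAt: new Date("2019-06-03") },
  { title: "b", score: 9, createdAt: new Date("2019-06-10") },
  { title: "c", score: 7, createdAt: new Date("2019-06-21") },
  { title: "d", score: 1, createdAt: new Date("2019-06-28") },
  { title: "e", score: 4, createdAt: new Date("2019-07-02") },
  { title: "f", score: 8, createdAt: new Date("2019-07-15") },
];

function topNPerMonth(posts, n) {
  // Step 1 ($bucket): collect posts per month, keyed by "YYYY-MM" (UTC).
  const byMonth = new Map();
  for (const post of posts) {
    const month = post.createdAt.toISOString().slice(0, 7);
    if (!byMonth.has(month)) byMonth.set(month, []);
    byMonth.get(month).push({ score: post.score, title: post.title });
  }
  // Steps 2 and 3 ($sort, then $slice): highest scores first, keep n per month.
  const result = {};
  for (const [month, list] of byMonth) {
    result[month] = list.sort((x, y) => y.score - x.score).slice(0, n);
  }
  return result;
}

const winners = topNPerMonth(posts, 3);
console.log(winners["2019-06"].map((p) => p.title)); // [ 'b', 'c', 'a' ]
console.log(winners["2019-07"].map((p) => p.title)); // [ 'f', 'e' ]
```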
I'm aware there are a few questions very similar to this, but I haven't been able to find one which clearly outlines how to keep both the total count after match (in this case geoNear), but before skip/limit operations, as well as the skip/limit results.
My current aggregation pipeline is:
[ { '$geoNear':
{ near: [Object],
maxDistance: 4000,
distanceField: 'distance',
spherical: true,
query: {} } },
{ '$sort': { distance: 1 } },
{ '$skip': 0 },
{ '$limit': 20 } ]
Ideally I'd want something like this returned:
{
total: 123,
results: [<results respecting $skip & $limit>]
}
You can do it with the following stages:
one $group to count the results of the preceding $geoNear stage and push each $$ROOT document into an array, to keep the records
one $unwind, necessary to turn that array back into individual documents
your $skip
your $limit
one last $group to format the final result with only the total count and the array of documents
The query is:
db.coll.aggregate([{
$geoNear: {
near: [Object],
maxDistance: 4000,
distanceField: 'distance',
spherical: true,
query: {}
}
}, {
$sort: {
distance: 1
}
}, {
$group: {
_id: 0,
count: {
$sum: 1
},
document: {
$push: "$$ROOT"
}
}
}, {
$unwind: "$document"
}, {
$skip: 0
}, {
$limit: 20
}, {
$group: {
_id: 0,
total: {
$first: "$count"
},
results: {
$push: "$document"
}
}
}])
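Since only $skip and $limit change from page to page, the stages after $sort can be produced by a small helper. A minimal sketch, where `paginateWithTotal` and `geoNearStage` are hypothetical names of mine:

```javascript
// Returns the stages to append after $geoNear and $sort.
function paginateWithTotal(skip, limit) {
  return [
    // Count every match and keep each full document.
    { $group: { _id: 0, count: { $sum: 1 }, document: { $push: "$$ROOT" } } },
    // Back to one document per result, each carrying the total count.
    { $unwind: "$document" },
    { $skip: skip },
    { $limit: limit },
    // Final shape: { total, results }.
    {
      $group: {
        _id: 0,
        total: { $first: "$count" },
        results: { $push: "$document" },
      },
    },
  ];
}

const tail = paginateWithTotal(40, 20);
// In the shell, with geoNearStage being your own $geoNear stage:
//   db.coll.aggregate([geoNearStage, { $sort: { distance: 1 } }, ...tail])
```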
You can use $slice instead of $skip and $limit, and set preserveNullAndEmptyArrays on the $unwind so that an empty slice still produces an empty results array:
db.coll.aggregate([{
$geoNear: {
near: [Object],
maxDistance: 4000,
distanceField: 'distance',
spherical: true,
query: {}
}
}, {
$sort: {
distance: 1
}
}, {
$group: {
_id: 0,
count: {
$sum: 1
},
document: {
$push: "$$ROOT"
}
}
}, {
$project: {
count: 1,
document: { $slice: ["$document", 0, 20] }
}
}, {
$unwind: { path: "$document", preserveNullAndEmptyArrays: true }
}, {
$group: {
_id: 0,
total: {
$first: "$count"
},
results: {
$push: "$document"
}
}
}])
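With the $slice variant, the page offset goes into the second argument of $slice. A small sketch of computing it from a page number, assuming 1-based pages (the `sliceStage` helper is a hypothetical name):

```javascript
// Builds the $project stage for a given page; pages are assumed to start at 1.
function sliceStage(page, pageSize) {
  const skip = (page - 1) * pageSize;
  return {
    $project: {
      count: 1,
      // $slice: [array, position, n] keeps n elements starting at position.
      document: { $slice: ["$document", skip, pageSize] },
    },
  };
}

const stage = sliceStage(3, 20);
console.log(JSON.stringify(stage.$project.document));
// {"$slice":["$document",40,20]}
```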