Mongo ordering a $near with another secondary sort - mongodb

I have a list of shops; each has a useCount and a geolocation.
How would I search and order by useCount, but also have a property on each returned object indicating how close it is to me?
Schema:
{
name: String,
useCount: { type: Number, index: true },
location: { 'type': {type: String, enum: ["Point"], default: "Point"}, coordinates: { type: [Number], default: [0,0]} }
}
Example results:
shop1 usecount-12 closest-3 geo-1333.222,222.222
shop2 usecount-3 closest-1 geo-1333.222,222.222
shop3 usecount-1 closest-2 geo-1333.222,222.222

Presuming your data is actually properly arranged for MongoDB and looks something like this:
{
"shop": 1,
"usecount": 12,
"closest": 3,
"geo": {
"type": "Point",
"coordinates": [1333.222,222.222]
}
}
And presuming your coordinates are in fact in "longitude/latitude" order, as required by GeoJSON and MongoDB, and that you have a "2dsphere" geospatial index, then your best option for a "composite sort" is the $geoNear aggregation pipeline stage, along with the aggregation $sort:
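Model.aggregate(
  [
    // Compute the distance from the query point and store it in "dist"
    { "$geoNear": {
      "near": {
        "type": "Point",
        "coordinates": [1333.222,222.222]
      },
      "distanceField": "dist",
      "spherical": true
    }},
    // Nearest first; ties on "dist" are broken by highest "usecount"
    { "$sort": { "dist": 1, "usecount": -1 } }
  ],
  function(err,results) {
    // handle err, or use results here
  }
)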
Here $geoNear projects the distance into the nominated field, "dist", and $sort then uses it along with the other field, "usecount". As written, results are ordered by "dist" ascending, and documents with the same "dist" are ordered by "usecount" descending, so the largest value comes first. Swap the order of the keys in $sort if "usecount" should be the primary key.
The aggregation framework, through .aggregate(), does more than just "aggregate" documents. It is your main tool for projecting new values into a document, which is useful for things like sorting results by values that are calculated in one way or another.
Unlike $near ( or $nearSphere ) the distance is returned as a true field in the document rather than just a "default" sort order. This allows that key to be used in the sorted results, along with any other field value present or projected into the document at the $sort stage.
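To make the effect of the key order in $sort concrete, here is a small in-memory sketch in plain JavaScript. It does not talk to MongoDB; the three documents and their "dist" and "usecount" values are made up for illustration, and the comparator only mimics how a compound sort orders its keys.

```javascript
// Hypothetical $geoNear output: "dist" was added by the distanceField option.
const results = [
  { shop: 1, dist: 250, usecount: 3 },
  { shop: 2, dist: 100, usecount: 1 },
  { shop: 3, dist: 250, usecount: 12 },
];

// Mimics { $sort: { dist: 1, usecount: -1 } }:
// nearest first, and usecount only breaks ties on dist.
const distFirst = [...results].sort(
  (a, b) => a.dist - b.dist || b.usecount - a.usecount
);

// Mimics { $sort: { usecount: -1, dist: 1 } }:
// most used first, and dist only breaks ties on usecount.
const useCountFirst = [...results].sort(
  (a, b) => b.usecount - a.usecount || a.dist - b.dist
);

console.log(distFirst.map((r) => r.shop));     // [ 2, 3, 1 ]
console.log(useCountFirst.map((r) => r.shop)); // [ 3, 1, 2 ]
```

Either way the distance stays available on every returned document, which is the part $near alone cannot give you.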
Also note that your data here does not appear to be valid spherical coordinates, which is going to cause problems with GeoJSON storage and with a "2dsphere" index. If these are not real global coordinates but coordinates on a plane, then just use a plain legacy array for "geo", as in [1333.222,222.222], and a "2d" index only. The argument to "near" within $geoNear is then simply an array as well, and the "spherical" option is not required.
Though this may just be a typing error in your question.
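A rough sketch of that planar variant, written as a plain JavaScript helper that only builds the pipeline. The function name buildPlanarNearPipeline and the index bounds are assumptions for illustration; as far as I know a "2d" index defaults to bounds of -180 to 180, so values like 1333.222 would need explicit "min" and "max" options.

```javascript
// Hypothetical index definition for planar data. The bounds are an
// assumption chosen to fit the sample coordinates; adjust to your data.
const planarIndex = {
  keys: { geo: "2d" },
  options: { min: -2000, max: 2000 },
};

// Builds the same "distance, then usecount" pipeline for legacy [x, y] pairs.
// No "spherical" option: distances are computed on a flat plane.
function buildPlanarNearPipeline(x, y) {
  return [
    { $geoNear: { near: [x, y], distanceField: "dist" } },
    { $sort: { dist: 1, usecount: -1 } },
  ];
}

const pipeline = buildPlanarNearPipeline(1333.222, 222.222);
console.log(JSON.stringify(pipeline));
```

The index would be created with something like collection.createIndex(planarIndex.keys, planarIndex.options), and the pipeline passed to Model.aggregate() as before.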

Related

MongoDB - How can I use a field's value in the first argument of $centerSphere

I'm trying to get a negative match for $geoWithin, which will be used in MongoDB Charts.
All of the required information is in the result of the latest stage of an aggregation I'm constructing in MongoDB Compass. The result of that stage looks like this:
{
"PizzaId": "123",
"info": {
"timestamp": {
"$date": "2021-02-15T05:00:00.000Z"
},
"location": {
"type": "Point",
"coordinates": [33.21883773803711, 33.802675247192383]
},
"dayOfWeek": 2
},
"PizzaLocation": [{
"_id": "456",
"location": {
"type": "Point",
"coordinates": [37.83396911621094, 37.07674026489258]
}
}]
}
I want to add a stage after that: a filter that checks that info.location is not within a 100 km radius of PizzaLocation.0.location:
{
$match: {
"info.location.coordinates": {
$not:
{
$geoWithin: {
$centerSphere: [
"$PizzaLocation.0.location.coordinates",
100 / 6378.1
]
}
}
}
}
}
I get an error: Point must be an array or object
Things I tried:
Playing with the field name in $centerSphere: removing the 0 or the $, or using:
{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]}
I even used the [lon,lat] format and put:
[{$arrayElemAt: [{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]},0]},
{$arrayElemAt: [{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]},1]}]
Setting literal coordinates instead of a field name. This worked, but I need to use a field.
Creating a view that holds the $centerSphere itself and using a $lookup to get it, but MongoDB didn't recognize $geoWithin or $centerSphere in an $addFields stage.
Things I verified:
I used a $project stage on {$arrayElemAt: ["$PizzaLocation.location.coordinates",0]}, and it did show the array [lon,lat].
I used a $project stage on
{$arrayElemAt: [{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]},0]}
and
{$arrayElemAt: [{$arrayElemAt: ["$PizzaLocation.location.coordinates",0]},1]}
and it did show a number for each one.
So, how can I use a field's value(s) in the first argument of $centerSphere?
Thank you.
You can't do this; let's first understand why not.
From the $match docs:
The query syntax is identical to the read operation query syntax;
This means $match queries use the same syntax as find queries, and unsurprisingly $geoWithin is a query operator.
Unfortunately, query syntax cannot access document values as part of the query. This is also why your query fails: the "coordinates" you pass are parsed as a literal string, not as a reference to a field.
For example this following query:
{
$match: {
field1: {$eq: "$field2"}
}
}
This matches { field1: "$field2" } but not { field1: 1, field2: 1 }.
Again, this is just the query language parser's behaviour, so there's not much you can do.
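The difference between a literal comparison and a field-to-field comparison can be mimicked in plain JavaScript. This is only an in-memory illustration, with made-up helper names; it is not how the server evaluates queries. For simple comparisons MongoDB 3.6+ does offer $expr, which resolves "$field" paths, but as far as I know there is no aggregation-expression form of $geoWithin, so that does not rescue the geo case.

```javascript
// Query-language equality: the right-hand side is always a literal value.
function matchesLiteral(doc, field, value) {
  return doc[field] === value;
}

// $expr-style equality: both sides are resolved from the document first.
function matchesFieldToField(doc, fieldA, fieldB) {
  return doc[fieldA] === doc[fieldB];
}

const docs = [
  { field1: "$field2" },
  { field1: 1, field2: 1 },
];

// Rough equivalent of { $match: { field1: { $eq: "$field2" } } }
const literalHits = docs.filter((d) => matchesLiteral(d, "field1", "$field2"));

// Rough equivalent of { $match: { $expr: { $eq: ["$field1", "$field2"] } } }
const exprHits = docs.filter((d) => matchesFieldToField(d, "field1", "field2"));

console.log(literalHits); // [ { field1: '$field2' } ]
console.log(exprHits);    // [ { field1: 1, field2: 1 } ]
```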
The alternative is to use the $geoNear stage, but not only is there no easy way to combine it with $not logic, there are additional restrictions, such as it having to be the first stage of the pipeline.
The best I can recommend is to split your query into two parts: first fetch the document you need, and only then re-query using $geoWithin with the proper coordinates as input.
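A minimal sketch of the second step, assuming the first step has already fetched a document shaped like the one in the question. The helper name buildOutsideRadiusFilter is hypothetical; the field paths and the 6378.1 km Earth radius are taken from the question. The point is that the coordinates are read in application code and inserted as literal values, which the question already confirmed works.

```javascript
const EARTH_RADIUS_KM = 6378.1; // same constant the question uses

// Builds a query filter with literal coordinates taken from a fetched document.
function buildOutsideRadiusFilter(doc, radiusKm) {
  const center = doc.PizzaLocation[0].location.coordinates; // [lon, lat]
  if (!Array.isArray(center) || center.length !== 2) {
    throw new TypeError("expected PizzaLocation[0].location.coordinates to be [lon, lat]");
  }
  return {
    "info.location.coordinates": {
      $not: {
        $geoWithin: {
          // $centerSphere takes the radius in radians: distance / Earth radius
          $centerSphere: [center, radiusKm / EARTH_RADIUS_KM],
        },
      },
    },
  };
}

// Step 1 result (sample values from the question).
const fetched = {
  PizzaLocation: [
    { location: { type: "Point", coordinates: [37.83396911621094, 37.07674026489258] } },
  ],
};

const filter = buildOutsideRadiusFilter(fetched, 100);
console.log(JSON.stringify(filter, null, 2));
// Step 2 would then be something like: collection.find(filter) or { $match: filter }
```

This costs one extra round trip per centre point, so it fits best when there are few distinct PizzaLocation values to check.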

MongoDB $geoNear aggregation pipeline (using query option and using $match pipeline operation) giving different no of results

I am using a $geoNear as the first step in the aggregation framework. I need to filter out the results based on "tag" field and it works fine but I see there are 2 ways both giving different results.
Sample MongoDB Document
{
"position": [
40.80143,
-73.96095
],
"tag": "pizza"
}
I have added 2dsphere index to the "position" key
db.restaurants.createIndex( { 'position' : "2dsphere" } )
Query 1
Uses the $match aggregation pipeline operation to filter the results based on the "tag" key
db.restaurants.aggregate(
[
{
"$geoNear":{
"near": { type: "Point", coordinates: [ 55.8284,-4.207] },
"limit":100,
"maxDistance":10*1000,
"distanceField": "dist.calculated",
"includeLocs": "dist.location",
"distanceMultiplier":1/1000,
"spherical": true
}
},{
"$match":{"tag":"pizza"}
},
{
"$group":{"_id":null,"totalDocs":{"$sum":1}}
}
]
);
Query 2
Uses query inside the $geoNear aggregation operation to filter results based on "tag" key
db.restaurants.aggregate(
[
{
"$geoNear":{
"query" : {"tag":"pizza"},
"near": { type: "Point", coordinates: [ 55.8284,-4.207] },
"limit":100,
"maxDistance":10*1000,
"distanceField": "dist.calculated",
"includeLocs": "dist.location",
"distanceMultiplier":1/1000,
"spherical": true
}
},
{
"$group":{"_id":null,"totalDocs":{"$sum":1}}
}
]
);
The $group stage is just there to count the documents returned by each query.
The totalDocs values returned by the two queries differ.
Can someone explain the difference between the two queries?
A few assumptions:
1. Assume there are 300 records that match based on the location.
2. Assume the first set of 100 results do not have the tag "pizza". The remaining 200 documents (101 to 300) have the tag "pizza".
Query 1:
There are 2 pipeline operations, $geoNear and $match.
The output of the $geoNear pipeline operation is the input to the $match pipeline operation.
$geoNear finds at most 100 results (the limit we specified) based on the location, sorted from nearest to farthest. (Note that the 100 results returned are based purely on location, so none of them has the tag "pizza".)
These 100 results are sent to the next pipeline operation, $match, where the filtering happens. But since the first set of 100 results did not have the tag "pizza", the output is empty.
Query 2:
There is only 1 pipeline operation, $geoNear.
A query field is included in the $geoNear pipeline operation.
$geoNear finds at most 100 results (the limit we specified) based on the location, sorted from nearest to farthest, and on the query tag=pizza.
Here results 101 to 200 are returned as output, because the query is applied within the $geoNear pipeline operation. Put simply: find the documents nearest to location [x,y] that have tag=pizza.
P.S.: The $group pipeline stage is added just to get the count, so I have not covered it in the explanation.
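The two orderings can be reproduced in memory with plain JavaScript under the same assumptions. This is only a simulation: the array stands in for documents already sorted nearest-first, and the "burger" tag for the first 100 is invented for the example.

```javascript
// 300 documents ordered nearest-first; the nearest 100 are not tagged "pizza".
const byDistance = Array.from({ length: 300 }, (_, i) => ({
  rank: i + 1,
  tag: i < 100 ? "burger" : "pizza",
}));

const LIMIT = 100;
const isPizza = (doc) => doc.tag === "pizza";

// Query 1: $geoNear applies the limit first, then $match filters the survivors.
const query1 = byDistance.slice(0, LIMIT).filter(isPizza);

// Query 2: the "query" option filters inside $geoNear, then the limit applies.
const query2 = byDistance.filter(isPizza).slice(0, LIMIT);

console.log(query1.length); // 0
console.log(query2.length, query2[0].rank, query2[LIMIT - 1].rank); // 100 101 200
```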
// If you have to apply multiple criteria to find locations, this query might be helpful
const userLocations = await userModel.aggregate([
  {
    $geoNear: {
      near: { type: "Point", coordinates: [data.lon1, data.lat1] }, // set the university point
      spherical: true,
      distanceField: "calcDistance",
      // maxDistance: 2400, // in metres; 25 km would be 25000
      distanceMultiplier: 0.001, // convert metres to kilometres
    }
  },
  { $unwind: "$location" },
  { $match: {
    location: {
      $geoWithin: {
        // check whether the user point lies within 20 km of this centre
        $centerSphere: [[73.780553, 18.503327], 20 / 6378.1]
      }
    }
  }},
])

MongoDB performance with geo data

I am using a single MongoDB 3.0.1 instance (without sharded cluster, replicas, etc) with one database containing a collection with 15324247 points. Of course, points are indexed with a 2dsphere index. Queries are done through a Node.js app.
Looking for points near a specific lon/lat takes 11710 ms to return 59925 points.
The same query with a restricted projection (geometry only) still takes 4351 ms to return 59925 points.
find({
"geometry": {
"$near": {
"$geometry": {
"type": "Point",
"coordinates": [110.30838012695312,
-20.86522808076763]
},
"$maxDistance": 1000
}
}
},
{
geometry: 1
})
Changing the query to use an aggregation instead of find takes 5799 ms but returns 8882 points; projecting only the geometry takes 4606 ms and also returns 8882 points.
aggregate([{
"$geoNear": {
"near": {
"type": "Point",
"coordinates": [110.30838012695312,
-20.86522808076763]
},
"distanceField": "dist.calculated",
"maxDistance": 964,
"spherical": true,
"num": 70000
}
}])
Given that all the elements are indexed, is this normal performance? How could it be improved? I have tried $geoWithin instead of $geoNear, adding more keys to the geo index, using cursor.get instead of cursor.on on the Node.js side, and increasing/decreasing batchSize for the aggregation, but performance is quite similar in all cases.
The second question is: why does the aggregation return fewer results than the find?

MongoDB Geospatial Query Spheres Overlapping Single Point

I am trying to create a geospatial query in MongoDB that finds all circles (with varying radius) that overlap a single point.
My data looks something like this:
{
name: "Pizza Hut",
lat: <latitude>,
lon: <longitude>,
radius: 20,
...
}
Basically, I am trying to do exactly what is described in this SO post, but with MongoDB: Get all points(circles with radius), that overlap given point
$geoIntersects (http://docs.mongodb.org/manual/reference/operator/query/geoIntersects/) looks like what I need. But in my case the lat, lon, and radius are stored with each MongoDB document; the radius is not a fixed value that is part of the query. Can this be done?
A different approach would be to find all documents whose distance from my query point is less than the value of their radius field (i.e. 20 km in the example above). How do you structure a MongoDB query where the calculated distance is part of the query filter criteria?
Thanks!
It would be nicer if you could use a GeoJSON object to represent the location, but at present the supported types are limited, so a "Circle" type, which would be ideal, is not supported.
The closest you could get is a "Polygon" approximating a circle, but this is probably a little too much work to construct just for this query. The other gotcha with doing this and then applying $geoIntersects is that the results will not be sorted by distance from the query point, which seems to defeat the purpose of finding the "nearest pizza" to the point of origin.
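For reference, approximating the circle mostly means walking around the centre at a fixed distance. The following is a minimal sketch in plain JavaScript and is not part of the approach recommended here; destination, circleToPolygon and the 32-step default are hypothetical names and values. It assumes a spherical Earth of radius 6378.1 km and does nothing special for the poles or the antimeridian.

```javascript
const EARTH_RADIUS_KM = 6378.1; // spherical-Earth assumption
const toRad = (deg) => (deg * Math.PI) / 180;
const toDeg = (rad) => (rad * 180) / Math.PI;

// Point reached from [lon, lat] after distanceKm along an initial bearing
// (degrees clockwise from north), using the great-circle destination formula.
function destination([lon, lat], distanceKm, bearingDeg) {
  const d = distanceKm / EARTH_RADIUS_KM; // angular distance in radians
  const b = toRad(bearingDeg);
  const lat1 = toRad(lat);
  const lon1 = toRad(lon);
  const lat2 = Math.asin(
    Math.sin(lat1) * Math.cos(d) + Math.cos(lat1) * Math.sin(d) * Math.cos(b)
  );
  const lon2 =
    lon1 +
    Math.atan2(
      Math.sin(b) * Math.sin(d) * Math.cos(lat1),
      Math.cos(d) - Math.sin(lat1) * Math.sin(lat2)
    );
  return [toDeg(lon2), toDeg(lat2)];
}

// GeoJSON Polygon whose vertices lie on the circle around center ([lon, lat]).
function circleToPolygon(center, radiusKm, steps = 32) {
  const ring = [];
  for (let i = 0; i < steps; i++) {
    // Decreasing bearings walk the ring counter-clockwise.
    ring.push(destination(center, radiusKm, 360 - (i * 360) / steps));
  }
  ring.push([...ring[0]]); // a GeoJSON ring must end where it starts
  return { type: "Polygon", coordinates: [ring] };
}

const polygon = circleToPolygon([151.00211262702942, -33.81696995135973], 20);
console.log(polygon.coordinates[0].length); // 33
```

Because the vertices sit on the circle, the polygon is inscribed and slightly smaller than the true circle; more steps reduce the gap at the cost of a larger document.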
Fortunately there is a $geoNear operation in the aggregation framework, available as of MongoDB 2.4. The good thing here is that it allows the "projection" of a distance field in the results. This then lets you do the logical filtering on the server, keeping only those points whose distance from the point of origin is within their own radius. It also allows sorting on the server.
But you are still going to need to change your schema to support the index:
db.places.insert({
"name": "Pizza Hut",
"location": {
"type": "Point",
"coordinates": [
151.00211262702942,
-33.81696995135973
]
},
"radius": 20
})
db.places.createIndex({ "location": "2dsphere" })
And for the aggregation query:
db.places.aggregate([
// Query and project distance
{ "$geoNear": {
"near": {
"type": "Point",
"coordinates": [
150.92094898223877,
-33.77654333272719
]
},
"distanceField": "distance",
"distanceMultiplier": 0.001,
"maxDistance": 100000,
"spherical": true
}},
// Calculate if distance is within delivery sphere
{ "$project": {
"name": 1,
"location": 1,
"radius": 1,
"distance": 1,
"within": { "$gt": [ "$radius", "$distance" ] }
}},
// Filter any false results
{ "$match": { "within": true } },
// Sort by shortest distance from origin
{ "$sort": { "distance": 1 } }
])
Basically this says:
"Out to 100 kilometers from a given location, find the places with their distance from that point. If the distance is within their "delivery radius" then return them, sorted by the closest."
There are other options you can pass to $geoNear to refine the result, such as returning more than the default 100 results if required, or passing a "query" on other document fields such as a "type" or "name" or whatever other information you have on the document.
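On MongoDB 3.6 or later, the $project and $match pair can likely be folded into a single $match using $expr, since $expr can compare two fields of the same document. The sketch below only builds the pipeline as a plain JavaScript value; buildDeliveryPipeline and its maxKm parameter are hypothetical names, and the field names follow the schema used in this answer.

```javascript
// Builds a pipeline that keeps places whose own "radius" (km) covers the
// query point, nearest first. origin is [lon, lat].
function buildDeliveryPipeline([lon, lat], maxKm = 100) {
  return [
    {
      $geoNear: {
        near: { type: "Point", coordinates: [lon, lat] },
        distanceField: "distance",
        distanceMultiplier: 0.001, // metres -> kilometres
        maxDistance: maxKm * 1000, // metres
        spherical: true,
      },
    },
    // Same test as the "within" projection, done in one stage.
    { $match: { $expr: { $gt: ["$radius", "$distance"] } } },
    // Ascending distance: closest place first.
    { $sort: { distance: 1 } },
  ];
}

const deliveryPipeline = buildDeliveryPipeline([150.92094898223877, -33.77654333272719]);
console.log(JSON.stringify(deliveryPipeline, null, 2));
// Would be passed as: db.places.aggregate(deliveryPipeline)
```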