MongoDB: Matching points from one collection with polygons from another

I'm trying to match points in one collection with regions stored in another collection.
Here are examples of documents.
Points:
{
"_id" : ObjectId("5e36d904618c0ea59f1eb04f"),
"gps" : { "lat" : 50.073288, "lon" : 14.43979 },
"timeAdded" : ISODate("2020-02-02T15:13:22.096Z")
}
Regions:
{
"_id" : ObjectId("5e49a469afae4a11c4ff3cf7"),
"type" : "Feature",
"geometry" : {
"type" : "Polygon",
"coordinates" : [
[
[ -748397.88, -1049211.61 ],
[ -748402.77, -1049212.2 ],
...
[ -748410.41, -1049213.11 ],
[ -748403.05, -1049070.62 ]
]
]
},
"properties" : {
"Name" : "Region 1"
}
}
And the query I'm trying to construct is something like this:
db.points.aggregate([
{$project: {
coordinates: ["$gps.lon", "$gps.lat"]
}},
{$lookup: {
from: "regions", pipeline: [
{$match: {
coordinates: {
$geoWithin: {
$geometry: {
type: "Polygon",
coordinates: "$geometry.coordinates"
}
}
}
}}
],
as: "district"
}}
])
I'm getting an error:
assert: command failed: {
"ok" : 0,
"errmsg" : "Polygon coordinates must be an array",
"code" : 2,
"codeName" : "BadValue"
} : aggregate failed
I noticed that the document $geoWithin expects has the same structure as the geometry I store for each region, so I tried this query:
db.points.aggregate([
{$project: {
coordinates: ["$gps.lon", "$gps.lat"]
}},
{$lookup: {
from: "regions", pipeline: [
{$match: {
coordinates: {
$geoWithin: "$geometry.coordinates"
}
}}
],
as: "district"
}}
])
The error was the same.
I searched for geo queries, but surprisingly every example I found used a static region document instead of one taken from a collection. So I'm wondering: is it possible at all to match points with regions when both sets of documents come from collections in the database rather than being hard-coded?

Unfortunately, this is not possible.
The query below would work only if $geometry could evaluate MongoDB aggregation expressions, which it cannot:
db.points.aggregate([
{
$lookup: {
from: "regions",
let: {
coordinates: [
"$gps.lon",
"$gps.lat"
]
},
pipeline: [
{
$addFields: {
coordinates: "$$coordinates"
}
},
{
$match: {
coordinates: {
$geoWithin: {
$geometry: {
type: "Polygon",
coordinates: "$geometry.coordinates"
}
}
}
}
}
],
as: "district"
}
}
])
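Since $geometry inside $match only accepts literal values, one possible workaround is to move the loop into application code: read each region and issue a separate $geoWithin query for it. The sketch below only builds the filter documents. buildRegionFilter and the location field are hypothetical names, and it assumes the points have been converted to GeoJSON Points and that the region polygons use longitude/latitude (the coordinates in the question look projected and would have to be converted first).

```javascript
// Hypothetical helper: turns one region document into a find() filter.
// Assumes each point stores a GeoJSON Point in `pointField` (an assumed name).
function buildRegionFilter(region, pointField = "location") {
  return {
    [pointField]: {
      // The region's stored geometry is passed as a literal value.
      $geoWithin: { $geometry: region.geometry }
    }
  };
}

// Illustrative region with made-up longitude/latitude values.
const sampleRegion = {
  properties: { Name: "Region 1" },
  geometry: {
    type: "Polygon",
    coordinates: [[[14.4, 50.0], [14.5, 50.0], [14.5, 50.1], [14.4, 50.1], [14.4, 50.0]]]
  }
};

const filter = buildRegionFilter(sampleRegion);
console.log(JSON.stringify(filter));

// In mongosh the filters could then be used one region at a time, e.g.:
// db.regions.find().forEach(r =>
//   printjson({ region: r.properties.Name,
//               points: db.points.find(buildRegionFilter(r)).toArray() }));
```

This costs one query per region, so it is only a sketch for collections with a moderate number of regions.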

Related

How to mongo aggregate with lookup and geoIntersects

I want to look up which users are in a specific area. I have these objects in the users collection:
{
"_id" : ObjectId("6197a78308591026b742cbc7"),
"longitude" : -8.88180350512795,
"latitude" : 38.5628716186268
}
/* 2 */
{
"_id" : ObjectId("6199798916317b0c2dcab874"),
"longitude" : -9.15904389999993,
"latitude" : 38.7235087
}
/* 3 */
{
"_id" : ObjectId("6199798916317b0c2dcab874"),
"longitude" : -8.6923178,
"latitude" : 41.1846394
}
And I have another collection, deliveryareas, with this object:
{
"_id" : ObjectId("6197ebf6fdbd8f6b06c97c57"),
"area" : {
"type" : "Polygon",
"coordinates" : [
[
[
-9.0556241,
38.4989521
],
[
-9.0551516,
38.4964531
],
[
-9.0522346,
38.4950589
],
[
-9.0526648,
38.4940872
]
]
],
"_id" : ObjectId("6197ebf6fdbd8f6b06c97c58")
}
}
Making this query:
db.getCollection('users').aggregate([
{
$lookup: {
from: 'deliveryareas',
let: { longitude: '$longitude', latitude: '$latitude' },
pipeline: [
{ $match: {
area: {
$geoIntersects: {
$geometry: {
type: 'Point',
coordinates: [ '$$longitude', '$$latitude' ],
},
},
}
} },
],
as: 'inRegion',
},
},
])
If I run the query directly against the deliveryareas collection it works, but inside the $lookup pipeline it fails with the error "Point must only contain numeric elements". Can anyone tell me why?
One more thing: if I first run this:
db.getCollection('users').aggregate([
{ $unwind: '$addresses' },
{ $project: { coordinates: [ '$addresses.region.longitude', '$addresses.region.latitude' ] } },
])
I get these results:
/* 1 */
{
"_id" : ObjectId("6197a78308591026b742cbc7"),
"coordinates" : [
-8.88180350512795,
38.5628716186268
]
}
/* 2 */
{
"_id" : ObjectId("6199798916317b0c2dcab874"),
"coordinates" : [
-9.15904389999993,
38.7235087
]
}
/* 3 */
{
"_id" : ObjectId("6199798916317b0c2dcab874"),
"coordinates" : [
-8.6923178,
41.1846394
]
}
Then, if I add the $lookup:
db.getCollection('users').aggregate([
{ $unwind: '$addresses' },
{ $project: { coordinates: [ '$addresses.region.longitude', '$addresses.region.latitude' ] } },
{
$lookup: {
from: 'deliveryareas',
let: { userCoordinates: '$coordinates' },
pipeline: [
{ $match: {
area: {
$geoIntersects: {
$geometry: {
type: 'Point',
coordinates: '$$userCoordinates',
},
},
}
} },
],
as: 'inRegion',
},
},
])
I get the error: Point must be an array or object
But if I replace the variable $$userCoordinates with a fixed value, for example [ -8.88180350512795, 38.5628716186268 ] (which I have in users), the query runs successfully.

How to use $geoNear after $lookup

Hi, I have the following problem:
I would like to use $geoNear (to calculate the distance between two points), but after $lookup (and on the collection that I joined).
This is the model for the companyBases collection (the one I would like to join):
{
"_id" : ObjectId("5d7cfe13f42e7345d967b378"),
"location" : {
"type" : "Point",
"coordinates" : [
20.633856,
49.761268
]
},
"vehicles" : [
{
"_id" : ObjectId("5d7cfe13f42e7345d967b340"),
...other fields that doesn't matter
}
]
}
This is the vehicles collection:
{
"_id" : ObjectId("5d7cfe13f42e7345d967b340"),
...other fields that doesn't matter
}
I would like to join the companyBases collection in an aggregation on the vehicles collection:
db.vehicles.aggregate([
{
$lookup: {
from: "companybases",
let: {
vehicleId: "$_id"
},
pipeline: [
{
$match: {
$expr: { $in: ["$$vehicleId", "$vehicles._id"] }
}
}
],
as: "companyBases"
}
},
{
$unwind: "$companyBases"
},
{
$geoNear: {
near: {
type: "Point",
coordinates: [50.02485, 20.0008]
},
distanceField: "distance",
spherical: true
}
}
]);
But it returns:
{
"message" : "$geoNear is only valid as the first stage in a pipeline.",
"operationTime" : "Timestamp(1568472833, 1)",
"ok" : 0,
"code" : 40602,
"codeName" : "Location40602",
"$clusterTime" : {
"clusterTime" : "Timestamp(1568472833, 1)",
"signature" : {
"hash" : "AAAAAAAAAAAAAAAAAAAAAAAAAAA=",
"keyId" : "0"
}
},
"name" : "MongoError"
}
When I run the same $geoNear stage on the companybases collection, it returns documents with the calculated distance:
db.companybases.aggregate([
{
$geoNear: {
near: {
type: "Point",
coordinates: [50.02485, 20.0008]
},
distanceField: "distance",
spherical: true
}
}
]);
And the result:
{
"_id" : ObjectId("5d7cfe13f42e7345d967b378"),
"location" : {
"type" : "Point",
"coordinates" : [
20.633856,
49.761268
]
},
"vehicles" : [
{
...some fields
},
],
...some fields
"distance" : 4209673.447019393
}
I realize that the error may be caused by a missing index on the vehicles collection. So is there any way to calculate the distance with $geoNear together with $lookup? Or is it impossible, so that I have to compute it on my own?
A simple solution (you can put $geoNear inside the $lookup pipeline):
db.vehicles.aggregate([
{
$lookup: {
from: "companybases",
let: {
vehicleId: "$_id"
},
pipeline: [
{
$geoNear: {
near: {
type: "Point",
coordinates: [50.02485, 20.0008]
},
distanceField: "distance",
spherical: true
}
},
{
$match: {
$expr: { $in: ["$$vehicleId", "$vehicles._id"] }
}
}
],
as: "companyBases"
}
},
{
$unwind: "$companyBases"
}
]);
But that strongly impacts performance (it takes at least 5 seconds), because $geoNear runs before the $match.
Your error says "$geoNear is only valid as the first stage in a pipeline.", so it is clear that you cannot put $geoNear after $lookup.
You can only use $geoNear as the first stage of a pipeline, so just swap the order of your stages:
use $geoNear first, then $lookup.
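Reordering the stages as described could look like the sketch below: the aggregation starts on companybases (the collection that holds the location), and the vehicles are joined afterwards. It assumes a 2dsphere index on location. The vehicle output field is a hypothetical name, and the near coordinates are copied unchanged from the question (note that GeoJSON expects [longitude, latitude] order).

```javascript
// A sketch only: $geoNear comes first, the join happens afterwards.
const pipeline = [
  {
    $geoNear: {
      near: { type: "Point", coordinates: [50.02485, 20.0008] }, // as in the question
      distanceField: "distance",
      spherical: true
    }
  },
  // One document per embedded vehicle reference.
  { $unwind: "$vehicles" },
  // Join the full vehicle document; "vehicle" is an assumed field name.
  {
    $lookup: {
      from: "vehicles",
      localField: "vehicles._id",
      foreignField: "_id",
      as: "vehicle"
    }
  },
  { $unwind: "$vehicle" }
];

const firstStage = Object.keys(pipeline[0])[0];
console.log(firstStage);

// In mongosh: db.companybases.aggregate(pipeline)
```

Each resulting document is a company base with its distance plus one joined vehicle, rather than a vehicle with its bases, so a final $project or $replaceRoot may be needed to get the original shape.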

In MongoDB, how do I use a field in the document as input to a $geoWithin/$centerSphere expression?

I'm trying to write a MongoDB query that searches for documents within a radius centered on a specified location.
The query below works. It finds all documents that are within searching.radius radians of searching.coordinates.
However, what I would like to do is add the current document's allowed_radius value to the searching.radius value, so that the allowed sphere is actually larger.
How can I phrase this query to make this possible?
Present Query:
collection.aggregate([
{
$project:{
location: "$location",
allowed_radius: "$allowed_radius"
}
},
{
$match: {
$and:
[
{ location: { $geoWithin: { $centerSphere: [ searching.coordinates, searching.radius ] }}},
{...},
...]
...}
]);
What I am trying to do (pseudo-query):
collection.aggregate([
{
$project:{
location: "$location",
allowed_radius: "$allowed_radius"
}
},
{
$match: {
$and:
[
{ location: { $geoWithin: { $centerSphere: [ searching.coordinates, { $add: [searching.radius, $allowed_radius]} ] }}},
{...},
...]
...}
]);
I tried using $geoWithin / $centerSphere, but couldn't make it work this way.
Here is another way of doing so, using the $geoNear operator:
Given this input:
db.collection.insert({
"airport": "LGW",
"id": 1,
"location": { type: "Point", coordinates: [-0.17818, 51.15609] },
"allowed_radius": 100
})
db.collection.insert({
"airport": "LGW",
"id": 2,
"location": { type: "Point", coordinates: [-0.17818, 51.15609] },
"allowed_radius": 0
})
db.collection.insert({
"airport": "ORY",
"id": 3,
"location": { type: "Point", coordinates: [2.35944, 48.72528] },
"allowed_radius": 10
})
And this index (which is required for $geoNear):
db.collection.createIndex( { location : "2dsphere" } )
With searching.radius = 1000:
db.collection.aggregate([
{ $geoNear: {
near: { "type" : "Point", "coordinates": [7.215872, 43.658411] },
distanceField: "distance",
spherical: true,
distanceMultiplier: 0.001
}},
{ $addFields: { radius: { "$add": ["$allowed_radius", 1000] } } },
{ $addFields: { isIn: { "$subtract": ["$distance", "$radius" ] } } },
{ $match: { isIn: { "$lte": 0 } } }
])
would return documents with id 1 (distance=1002 <= radius=1000+100) and 3 (distance=676 <= radius=1000+10) and discard id 2 (distance=1002 > 1000+0).
The distanceMultiplier parameter is used to bring back units to km.
$geoNear must be the first stage of an aggregation (because it relies on the index, I think), but it accepts a query parameter that filters on other fields.
Even though it requires the geospatial index, you can add additional dimensions to that index.
$geoNear doesn't take the location field as an argument, because it requires the collection to have a geospatial index. Thus $geoNear implicitly uses the indexed field (whatever its name) as the location field.
Finally, I'm pretty sure the last stages can be simplified.
The $geoNear stage is only used to project the distance on each record:
{ "airport" : "ORY", "distance" : 676.5790971238937, "location" : { "type" : "Point", "coordinates" : [ 2.35944, 48.72528 ] }, "allowed_radius" : 10, "id" : 3 }
{ "airport" : "LGW", "distance" : 1002.3351814526812, "location" : { "type" : "Point", "coordinates" : [ -0.17818, 51.15609 ] }, "allowed_radius" : 100, "id" : 1 }
{ "airport" : "LGW", "distance" : 1002.3351814526812, "location" : { "type" : "Point", "coordinates" : [ -0.17818, 51.15609 ] }, "allowed_radius" : 0, "id" : 2 }
In fact, the geoNear operator requires the use of the distanceField argument, which is used to project the computed distance on each record for the next stages of the query. At the end of the aggregation, returned records look like:
{
"airport" : "ORY",
"location" : { "type" : "Point", "coordinates" : [ 2.35944, 48.72528 ] },
"allowed_radius" : 10,
"id" : 3,
"distance" : 676.5790971238937,
"radius" : 1010,
"isIn" : -333.4209028761063
}
If necessary, you can remove the helper fields the pipeline produces (distance, radius, isIn) with a final $project stage. For instance: {"$project":{"distance":0}}
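The last three stages can probably be collapsed into a single $match with $expr (available since MongoDB 3.6). The sketch below shows that shorter pipeline and, as a sanity check, applies the same comparison in plain JavaScript to the three distances listed above. searchRadiusKm and isWithinAllowedRadius are hypothetical names.

```javascript
const searchRadiusKm = 1000;

// Shorter variant of the pipeline: one $match instead of two $addFields + $match.
const pipeline = [
  { $geoNear: {
      near: { type: "Point", coordinates: [7.215872, 43.658411] },
      distanceField: "distance",
      spherical: true,
      distanceMultiplier: 0.001 // metres -> kilometres
  }},
  { $match: { $expr: {
      $lte: ["$distance", { $add: ["$allowed_radius", searchRadiusKm] }]
  }}}
];

// The same comparison in plain JavaScript, for checking the numbers.
function isWithinAllowedRadius(doc, radius) {
  return doc.distance <= doc.allowed_radius + radius;
}

// Distances as reported by the $geoNear stage above.
const docs = [
  { id: 3, distance: 676.5790971238937, allowed_radius: 10 },
  { id: 1, distance: 1002.3351814526812, allowed_radius: 100 },
  { id: 2, distance: 1002.3351814526812, allowed_radius: 0 }
];

const matchedIds = docs
  .filter(d => isWithinAllowedRadius(d, searchRadiusKm))
  .map(d => d.id);
console.log(matchedIds); // [ 3, 1 ]

// In mongosh: db.collection.aggregate(pipeline)
```

With this variant no radius or isIn field is created, so only distance would need to be projected away.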

How to group longitude and latitude reducing the decimal places of those points?

I have the following aggregate:
db.locations.aggregate(
// Pipeline
[
// Stage 1
{
$geoNear: {
near: { type: "Point", coordinates: [-47.121314, -18.151515 ] },
distanceField: "dist.calculated",
maxDistance: 500,
includeLocs: "dist.location",
num: 50000,
spherical: true
}
},
// Stage 2
{
$group: {
"_id" : {
'loc' : '$loc'
},
qtd: { $sum:1 }
}
},
], );
And the following collection:
{
"_id" : ObjectId(),
"loc" : {
"type" : "Point",
"coordinates" : [
-47.121311,
-18.151512
]
}
},
{
"_id" : ObjectId(),
"loc" : {
"type" : "Point",
"coordinates" : [
-47.121311,
-18.151512
]
}
},
{
"_id" : ObjectId(),
"loc" : {
"type" : "Point",
"coordinates" : [
-47.121312,
-18.151523
]
}
},
{
"_id" : ObjectId(),
"loc" : {
"type" : "Point",
"coordinates" : [
-47.121322,
-18.151533
]
}
}
When I run the aggregate, I have the following result:
{
"_id" : {
"loc" : {
"type" : "Point",
"coordinates" : [
-47.121311,
-18.151512
]
}
},
"qtd" : 2.0
},
{
"_id" : {
"loc" : {
"type" : "Point",
"coordinates" : [
-47.121312,
-18.151523
]
}
},
"qtd" : 1.0
},
{
"_id" : {
"loc" : {
"type" : "Point",
"coordinates" : [
-47.121322,
-18.151533
]
}
},
"qtd" : 1.0
}
I would like to group these locations into a single document, since they are very close to each other.
I thought of reducing the precision of each coordinate, so that -47.121314 becomes something like -47.1213.
Something like this:
{
"_id" : {
"loc" : {
"type" : "Point",
"coordinates" : [
-47.1213,
-18.1515
]
}
},
"qtd" : 4.0
}
But I have no idea how to group these documents.
Is it possible?
The way to reduce the floating point precision is to $multiply the number by the required precision factor, truncate it to an integer, and then $divide it back down to the desired precision.
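The same multiply, truncate, divide arithmetic can be checked outside the database. The following plain JavaScript sketch (the helper name truncateTo is hypothetical) truncates the four sample points from the question to four decimal places and counts how many fall into each bucket, which is what a { $sum: 1 } accumulator would do in a $group stage.

```javascript
// Truncate (not round) a number to the given number of decimal places.
function truncateTo(value, places) {
  const factor = Math.pow(10, places);
  return Math.trunc(value * factor) / factor;
}

// The four sample points from the question, as [longitude, latitude].
const points = [
  [-47.121311, -18.151512],
  [-47.121311, -18.151512],
  [-47.121312, -18.151523],
  [-47.121322, -18.151533]
];

// Count points per truncated coordinate pair.
const counts = {};
for (const point of points) {
  const key = point.map(c => truncateTo(c, 4)).join(",");
  counts[key] = (counts[key] || 0) + 1;
}
console.log(counts); // { '-47.1213,-18.1515': 4 }
```

Because the values are truncated toward zero rather than rounded, two points that straddle a bucket boundary can still end up in different groups even when they are very close.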
For recent MongoDB releases (since MongoDB 3.2) you can use $trunc:
db.locations.aggregate([
{ "$geoNear": {
"near": {
"type": "Point",
"coordinates": [ -47.121314, -18.151515 ]
},
"distanceField": "qtd",
"maxDistance": 500,
"num": 50000,
"spherical": true
}},
{ "$group": {
"_id": {
"type": '$loc.type',
"coordinates": {
"$map": {
"input": '$loc.coordinates',
"in": {
"$divide": [
{ "$trunc": { "$multiply": [ '$$this', 10000 ] } },
10000
]
}
}
}
},
"qtd": { "$sum": '$qtd' }
}}
]);
For releases prior to that, you can use $mod and $subtract to remove the "remainder" instead:
db.locations.aggregate([
{ "$geoNear": {
"near": {
"type": "Point",
"coordinates": [ -47.121314, -18.151515 ]
},
"distanceField": "qtd",
"maxDistance": 500,
"num": 50000,
"spherical": true
}},
{ "$group": {
"_id": {
"type": '$loc.type',
"coordinates": {
"$map": {
"input": '$loc.coordinates',
"as": "coord",
"in": {
"$divide": [
{ "$subtract": [
{ "$multiply": [ '$$coord', 10000 ] },
{ "$mod": [
{ "$multiply": [ '$$coord', 10000 ] },
1
]}
]},
10000
]
}
}
}
},
"qtd": { "$sum": '$qtd' }
}}
]);
Both return the same result:
/* 1 */
{
"_id" : {
"type" : "Point",
"coordinates" : [
-47.1213,
-18.1515
]
},
"qtd" : 4.01180839007879
}
We use $map here to reshape the "coordinates" array, applying the truncation to each value in it. Note the slightly different usage with "as" in the second example: the ability to use $$this as a default reference was only added in MongoDB 3.2, and the second listing presumes you do not have that release, since otherwise you would simply use $trunc.
Note also that $geoNear, which is essentially a "nearest" search, returns only 100 documents by default, or up to the number given in the "num" or "limit" option. That cap always governs the number of results whenever more documents would otherwise satisfy constraints such as "maxDistance".
There is also no need to follow the documentation so literally: "distanceField" is the only other mandatory parameter, aside from "spherical", which is required when a "2dsphere" index is used. The value of "distanceField" can be whatever you want, and in this case we simply supply the name of the property you want in the output.

$geoWithin with mongoDB aggregate causes BadValue bad geo query

I am trying to use a $geoWithin query in an aggregate pipeline, but I am getting this error:
MongoError: exception: bad query: BadValue bad geo query: { $geoWithin: { $box: [ [ "13.618240356445312", "51.01343066212905" ], [ "13.865432739257812", "51.09662294502995" ] ] } }
My query is:
{
$match: {
'gps.coordinates.matched': {
$geoWithin: {
$box: [
[ swlng, swlat ],
[ nelng , nelat ]
]
}
}
}
},
{ $project : {shortGeohash: {$substr: ["$gps.geohash.original", 0, 11]}}},
{ $group: {_id: "$shortGeohash", count: {$sum:1}, originalDoc:{$push: "$$ROOT"}}}
The $geoWithin query on its own, and the $project and $group stages on their own, work well, but the error occurs when they are combined.
I tried your query and it seems to actually work. I executed the query over a collection with documents such as this.
[{
"_id" : "5a2404674eb6d938c8f44856",
"code" : "M.12345",
"loc" : {
"type" : "Point",
"coordinates" : [
41.9009789,
12.5010465
]
}
},
...
]
The aggregation pipeline is this.
{
$match: {
'loc': {
$geoWithin: {
$box: [
[ 0, 0 ],
[ 5, 5 ]
]
}
}
}
},
{ $project : {subCode: {$substr: ["$code", 0, 4]}}},
{ $group: {_id: "$subCode", count: {$sum:1}, originalDoc:{$push: "$$ROOT"}}}
One of the results is this.
{
"_id" : "M.10",
"count" : 12.0,
"originalDoc" : [
{
"_id" : "5a2481c44eb6d92b6895633a",
"subCode" : "M.10"
},
.... //11 more items
]
}
Results are correctly returned with mongod v3.4.9.
It seems like $geoWithin is not one of the aggregation operators.
The reference example works; sadly, I am not aware of a way to add an aggregation to it.
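One thing worth checking in the original error: the $box corners are printed as quoted strings ("13.618240356445312"), which suggests that swlng, swlat, nelng and nelat reached the query as strings (for example from URL parameters) rather than as numbers. A minimal sketch of converting them before building the $match stage, with toBox as a hypothetical helper:

```javascript
// Hypothetical helper: converts the four corner values to numbers and
// returns them in the [[swlng, swlat], [nelng, nelat]] shape $box expects.
function toBox(swlng, swlat, nelng, nelat) {
  const values = [swlng, swlat, nelng, nelat].map(Number);
  if (!values.every(Number.isFinite)) {
    throw new TypeError("box corners must be numeric");
  }
  return [[values[0], values[1]], [values[2], values[3]]];
}

// The string values are taken from the error message in the question.
const box = toBox(
  "13.618240356445312", "51.01343066212905",
  "13.865432739257812", "51.09662294502995"
);

const matchStage = {
  $match: {
    "gps.coordinates.matched": { $geoWithin: { $box: box } }
  }
};
console.log(JSON.stringify(matchStage));
```

Whether this is the actual cause depends on how the variables are populated, which the question does not show.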