Using Mongo Geospatial queries on lots of data - mongodb

I am using a Mongo database with Mongoose in Nest, a TypeScript server framework.
I have two Mongo collections: one contains 20,000 user locations, and the other contains 10,000 points of interest (POIs) gathered from the Google Places API.
I now want to find intersections between the gathered locations and these places (each of which contains a GeoJSON point with lat and lng).
In other words, I want to see where users were in relation to these POIs.
Currently, I have an async method that finds all the locations near a point, using the $nearSphere operator.
I think the next step is to iterate over each place (10,000 of them) in the Mongo collection and run this method on each one. That way I will have a list of which POIs were 'nearby' when a specific location was captured.
Is there a better way to do this? I expect this approach to struggle with performance. I cannot find a Mongo geospatial query that will let me compare two sets of locations with each other.
Get all locations near point:
async findAllNearPlace(coords): Promise<Location[]> {
  return await this.locationModel.find({
    location: {
      $nearSphere: {
        $geometry: { type: "Point", coordinates: coords },
        $minDistance: 0,
        $maxDistance: 100
      }
    }
  });
}
Each POI - check locations:
async findUsersInProximity(places): Promise<Location[]> {
  const locations = [];
  // Query one place at a time. The original `while (i > 0)` loop
  // never reached index 0, so the first place was skipped.
  for (const place of places) {
    const intersectingLocations = await this.findAllNearPlace(
      place.location.coordinates
    );
    locations.push(...intersectingLocations);
  }
  return locations;
}
As expected, the performance of this is poor and takes minutes.

You could probably use an aggregation with a $lookup stage. I have not tested this and I don't know for sure whether it performs better, but something like the following might work:
let pipeline = [
  {
    $geoNear: {
      includeLocs: "location",
      distanceField: "distance",
      near: { type: 'Point', coordinates: "$$point" },
      maxDistance: 20,
      spherical: true
    }
  }
];
UsersModel.aggregate([{
  $lookup: {
    from: 'locations',
    let: { point: '$address' }, // whatever the field for the coordinates is
    pipeline,
    as: 'places'
  }
}])
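Independently of the $lookup idea, a smaller change is to stop awaiting the 10,000 queries one at a time and run them in bounded batches with Promise.all. The sketch below is untested against a real database: mapInBatches and the batch size of 50 are illustrative assumptions, not part of Mongoose, and fakeFindAllNearPlace stands in for findAllNearPlace so the example runs on its own. It also keeps the results grouped per place, so each POI stays associated with its nearby locations.

```javascript
// Hypothetical helper: runs `fn` over `items` with at most `batchSize`
// calls in flight at once, preserving input order in the result.
async function mapInBatches(items, batchSize, fn) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const results = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Every call in this batch runs concurrently; the next batch starts
    // only after all of them have resolved.
    const batchResults = await Promise.all(batch.map((item) => fn(item)));
    results.push(...batchResults);
  }
  return results;
}

// Stand-in for the real query so the sketch runs without a database.
// In the service it would be:
//   (place) => this.findAllNearPlace(place.location.coordinates)
async function fakeFindAllNearPlace(place) {
  return [{ nearPlace: place.name }];
}

async function findUsersInProximity(places, batchSize = 50) {
  const perPlace = await mapInBatches(places, batchSize, fakeFindAllNearPlace);
  // Pair each place with its own matches instead of flattening everything.
  return places.map((place, i) => ({ place, locations: perPlace[i] }));
}
```

Whether this helps depends on the connection pool size and server load, so the batch size is worth measuring rather than guessing.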

Related

.count aggregate significantly slows execution time

I am refactoring some old code, hoping to speed it up and make user information more secure. To do so, I am trying to use the MongoDB aggregation framework to get all users near a specific location. Originally this was solved with spherical calculations on the front end, but that required sending the lat, lng coordinates of each user to the front end, which is insecure and exposes users' location data unnecessarily.
My solution is to use the aggregation framework and $geoNear to get the list of nearby users and also populate a field called distanceAway with the distance in miles. Everything works well until I try to set up pagination: the $count stage slows the execution of the route dramatically.
Below is my code:
module.exports = async function findUsersNearLocationAggregate(baseLocation, page = 1, limit = 20) {
  // quick validation on location object passed
  if (!lodash.get(baseLocation, 'geometry.location', undefined))
    throw new Error('NO_LOCATION_SPECIFIED')
  // Grab user location to use as location origin
  const { lat, lng } = baseLocation.geometry.location
  let query = {
    $geoNear: {
      near: { type: "Point", coordinates: [lng, lat] },
      distanceField: "distanceAway",
      spherical: true,
      distanceMultiplier: CONVERT_METERS_TO_MILES
    }
  }
  const users = await User.aggregate([
    query,
    { $skip: ((page - 1) * limit) },
    { $limit: limit }
  ])
  // const [{ count }] = await User.aggregate([
  //   query,
  //   { $count: 'count' }
  // ])
  return {
    user: users,
    // totalPages: Math.ceil(count / limit)
    // currentPage: page
  }
}
This function was meant to return the list of users (limit of 20 at a time) and show other data such as totalPages and currentPage to track the pagination on the front-end and make subsequent requests.
When I comment out the following lines:
const [{ count }] = await User.aggregate([
  query,
  { $count: 'count' }
])
the route that uses this call takes at most 100ms to execute.
When I uncomment them, the call jumps to approx. 1100ms.
Is there some way to get the pagination data I am looking for without a significant increase in request time? 10x seems quite large.
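One commonly used pattern is to compute the page and the total in a single aggregation by placing a $facet stage after $geoNear, so the geo stage runs once rather than once per query. The sketch below only builds the pipeline and the pagination metadata. buildPagedGeoNearPipeline and paginationMeta are hypothetical helper names, and this has not been benchmarked against the two-query version, so any speed-up is an assumption to verify. Note that $geoNear has to stay the first stage, outside the $facet.

```javascript
// Hypothetical helper: one pipeline that returns both the requested page
// and the total match count. `geoNearStage` is the { $geoNear: ... } object
// from the question and must remain the first stage.
function buildPagedGeoNearPipeline(geoNearStage, page = 1, limit = 20) {
  if (!Number.isInteger(page) || page < 1) throw new RangeError('page must be >= 1');
  if (!Number.isInteger(limit) || limit < 1) throw new RangeError('limit must be >= 1');
  return [
    geoNearStage,
    {
      $facet: {
        // The page of users, in the distance order produced by $geoNear.
        users: [{ $skip: (page - 1) * limit }, { $limit: limit }],
        // A single { count: N } document, or an empty array if nothing matched.
        total: [{ $count: 'count' }]
      }
    }
  ];
}

// Turns the single document produced by the $facet stage into the response shape.
function paginationMeta(facetResult, page, limit) {
  const count = facetResult.total.length > 0 ? facetResult.total[0].count : 0;
  return {
    user: facetResult.users,
    totalPages: Math.ceil(count / limit),
    currentPage: page
  };
}
```

In the function from the question this would be used roughly as `const [facetResult] = await User.aggregate(buildPagedGeoNearPipeline(query, page, limit))`, followed by `return paginationMeta(facetResult, page, limit)`.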

mongodb - $geoIntersects sorted by overlapping percentage

I have a collection with this kind of documents:
{
  _id: ObjectId,
  type: "string", // COUNTRY, STATE, CITY
  geometry: {
    type: "MultiPolygon",
    coordinates: <coordinates>
  }
}
I want to work out, for example, which COUNTRY each STATE is in. I've tried this:
var state = db.entity.findOne({_id:ObjectId("someStateID")});
db.entity.find({
  geometry: {
    $geoIntersects: {
      $geometry: state.geometry
    }
  },
  type:"COUNTRY"
});
This works just fine, except in some cases where the outline of a STATE touches the border of a neighbouring COUNTRY, in which case I get two or more documents.
Here is an example image
Is there a way I can sort this result by overlapping percentage? Or any kind of filter to know which one is the exact parent?
I've found that there is no way to do this in Mongo; I had to do it in code. Here is an easy way to solve this problem

How to do geo searches on two properties as opposed to an array of [ lng, lat ]?

All of the examples and implementations of geo search in Mongo, that I've been able to find, use an array property [lng, lat] to perform the search on:
An example record would be:
{
  name: 'foo bar',
  locations: [
    {
      name: 'franks house',
      geo: [-0.12, 34.51]
    },
    {
      name: 'jennifers house',
      geo: [-0.09, 32.17]
    }
  ]
}
And the resulting query:
db.events.find({ 'locations.geo': { $nearSphere: [ -0.12, 34.51 ], $maxDistance: 0.02 }})
This works fine, but the format of the record is not great for users to read or write because it's not self-documenting where lat and lng go in that array. There's room for gotchas. I'd like to make the records more human friendly:
{
  name: 'foo bar',
  locations: [
    {
      name: 'franks house',
      lat: 34.51,
      lng: -0.12
    },
    {
      name: 'jennifers house',
      lat: 32.17,
      lng: -0.09
    }
  ]
}
What would the resulting mongo query look like for this type of record? I haven't been able to find any examples of this so am wondering if it's even possible.
It's not recommended to use separate fields for latitude and longitude. The 2dsphere index is used to index geospatial data which requires an array of coordinates, see documentation. This is the reason you can't find examples for separate coordinate fields.
You should separate representation from data storage. Just because coordinates are stored in an array doesn't mean you have to present them to the user as an array. For example, you could use a pre-save hook to combine the separate fields into an array, something like:
var schema = new Schema(..);
schema.pre('save', function (next) {
  // longitude first, then latitude
  this.coordinates = [this.longitude, this.latitude]
  next();
});
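The same idea can be kept independent of Mongoose as a pair of small conversion functions, so the friendly { lat, lng } shape is used at the edges and the indexed [lng, lat] array is used for storage. This is a minimal sketch: toStoredLocation and toFriendlyLocation are hypothetical names, and the field names follow the records in the question.

```javascript
// Convert the human-friendly record into the shape stored and indexed in Mongo.
function toStoredLocation({ name, lat, lng }) {
  if (!Number.isFinite(lat) || !Number.isFinite(lng)) {
    throw new TypeError('lat and lng must be finite numbers');
  }
  // Longitude first, then latitude: the order Mongo's geo indexes expect.
  return { name, geo: [lng, lat] };
}

// Convert a stored record back into the human-friendly shape.
function toFriendlyLocation({ name, geo }) {
  const [lng, lat] = geo;
  return { name, lat, lng };
}
```

With this split, the query from the question keeps working unchanged on 'locations.geo', while users only ever read and write named lat and lng fields.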

Mobile Meteor App - calculating nearby locations and storing as sortable collection

I'm putting together a mobile Meteor application that I want to use to list local amenities (say, within a 20 mile radius). I have a collection of these amenities with corresponding latlng data. I'd like to pass the app my current location (using Cordova) and generate a list (or collection?) that is sorted closest first.
I have two specific problems that I'd really appreciate some advice on:
1. Can I use Mongo's $near for this, or should I use a Node.js add-on (e.g. 'GeoLib' - https://github.com/manuelbieh/geolib) to do the distance calculation?
2. How do I generate a temporary (locally stored) collection of these locations to display in my list? Presumably, if I don't use $near, I have to iterate through my locations, calculate the distance for each one, and return those under a certain threshold, but this seems expensive given that my location list will keep growing.
Sorry, this is the first time I've attempted something like this; I'd really appreciate any advice from a more seasoned developer!
EDIT - MY CODE (WHY IS IT NOT WORKING?!)
I'm storing locations like this in a collection:
Beaches.insert({
  name: 'Venice Beach CA',
  geometry: {
    type: "Point",
    coordinates: [-118.473314,118.473314]
  }
});
...
Beaches._ensureIndex({'geometry.coordinates':'2d'}, function(err, result) {
  if(err) return console.dir(err);
});
I'm querying these entries like this (passing in the lat and lng):
getNearBeaches = function(lng,lat) {
  return Beaches.find({'geometry.coordinates':
    {
      $near: {
        $geometry: {
          type: "Point",
          coordinates: [lng,lat]
        }
      },
      $maxDistance: 20000 //meters
    }
  })
};
I can list my collection with a straight find() but my location search returns nothing, even if I set the $maxDistance to a huge number, or search directly on an already stored set of coords.
What have I done wrong??
You can fetch the current location, send it to the server, and subscribe to a collection filtered by MongoDB's $near:
Meteor.subscribe("amenities", Session.get("latlng"));
And something like this on the server:
Meteor.publish("amenities", function (latlng) {
  // latlng must be a [longitude, latitude] array, in that order
  return Amenities.find({
    location: {
      $near: {
        $geometry: {
          type: "Point",
          coordinates: latlng
        },
        $maxDistance: 20000 //meters
      }
    }
  });
});
You must have the correct data types and indexes, and a field named location (-:
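Two details in the question's code are worth checking against the MongoDB documentation. First, a $near query with a GeoJSON $geometry needs a 2dsphere index on the GeoJSON field itself (here geometry), not a 2d index on geometry.coordinates. Second, for GeoJSON points $maxDistance belongs inside $near next to $geometry, and latitude must lie within [-90, 90], which the stored value 118.473314 does not. The sketch below builds such a filter and rejects out-of-range coordinates. buildNearFilter is a hypothetical helper, not a Meteor or Mongo API.

```javascript
// Hypothetical helper: builds a { $near: ... } filter for a GeoJSON Point field.
// Assumes the collection has a 2dsphere index on that field.
function buildNearFilter(lng, lat, maxDistanceMeters) {
  if (!Number.isFinite(lng) || lng < -180 || lng > 180) {
    throw new RangeError(`longitude out of range: ${lng}`);
  }
  if (!Number.isFinite(lat) || lat < -90 || lat > 90) {
    throw new RangeError(`latitude out of range: ${lat}`);
  }
  return {
    $near: {
      $geometry: { type: 'Point', coordinates: [lng, lat] },
      // For GeoJSON points, $maxDistance is in meters and sits inside $near.
      $maxDistance: maxDistanceMeters
    }
  };
}
```

With the collection from the question, this would be used roughly as `Beaches.find({ geometry: buildNearFilter(lng, lat, 20000) })`.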

Mongo geolocation using $near and 2d index not being accurate

I've written an application that finds local establishments and delivers them via a RESTful API. My stack is: express, express-resource and mongoose. Here is a sample of my model:
var PlaceSchema = new mongoose.Schema(
  {
    name: {
      type: String,
      required: true
    },
    geolocation: {
      lat: Number,
      lng: Number
    },
    address: {
      type: String
    }
  }
);
PlaceSchema.index({
  geolocation: '2d'
});
I've checked a few times and my lon/lat values are being correctly stored in the db, so there aren't any data errors.
Then I run the query to grab the data with the specific query values:
mongoose.connection.db.executeDbCommand(
  {
    geoNear: 'places',
    near: [
      parseFloat(req.body.lat),
      parseFloat(req.body.lng)
    ],
    num: limit,
    query: {
      // Some additional queries that
      // don't impact the geo results.
    },
    spherical: true,
    distanceMultiplier: 6371, // converting results to km
    maxDistance: distance / 6371,
  },
  function(err, result)
  {
    res.send(200, {
      status: true,
      data: theaters
    });
  }
);
There are a few issues with the results that it's returning: a) the calculation in km is really wrong. In most cases there's a 6-7km difference, but it varies, b) places that are closer are appearing farther than other places.
I'm using the direct mongoose query because it will return the calculated distance (which I require for my API returns). Switching to the mongoose find method apparently wouldn't let me gain access to this data.
Wondering if my index is perhaps wrong? Should it be 2dsphere instead? The documentation in that regard is slightly confusing, but most of the examples I see use just 2d.
Thanks a bunch!
No matter what type of geospatial index you use in MongoDB, you must always store longitude first and then latitude.
From http://docs.mongodb.org/manual/core/2d/#store-points-on-a-2d-plane and multiple other places in the docs:
Whether as an array or document, if you use longitude and latitude,
store coordinates in this order: longitude, latitude.
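To see how much a swapped pair can distort results, the effect can be reproduced without a database. The sketch below implements the standard haversine formula with the same 6371 km Earth radius used in the question. haversineKm and kmToRadians are hypothetical helper names, and the sample points are made up for illustration.

```javascript
const EARTH_RADIUS_KM = 6371;

const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance in km between two [lng, lat] pairs (longitude first).
function haversineKm([lng1, lat1], [lng2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLng = toRadians(lng2 - lng1);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.atan2(Math.sqrt(h), Math.sqrt(1 - h));
}

// Spherical queries on legacy coordinate pairs take distances in radians.
const kmToRadians = (km) => km / EARTH_RADIUS_KM;

// Two made-up points at latitude 50, one degree of longitude apart.
const p1 = { lat: 50, lng: 10 };
const p2 = { lat: 50, lng: 11 };

const correct = haversineKm([p1.lng, p1.lat], [p2.lng, p2.lat]); // longitude first
const swapped = haversineKm([p1.lat, p1.lng], [p2.lat, p2.lng]); // latitude first (wrong)

console.log(correct.toFixed(1), swapped.toFixed(1));
```

For these two points the correctly ordered call gives about 71 km, while the swapped call gives about 111 km, because the one-degree difference is then treated as latitude instead of longitude. The size and direction of the error change with location, which is consistent with distances that are off by varying amounts and with nearer places appearing farther away.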