.count aggregate significantly slows execution time - mongodb

I am refactoring some old code to speed it up and make user information more secure. To do so, I am trying to use the MongoDB aggregation framework to get all users near a specific location. Originally this was solved with some spherical calculations on the front end, but that meant sending the lat, lng coordinates of each user to the front end, which is insecure and exposes users' location data unnecessarily.
My solution is to use the aggregation framework and $geoNear to get the list of nearby users and also populate a field called distanceAway with the distance in miles. Everything works well until I try to set up pagination. The $count stage slows the execution of the route dramatically.
Below is my code:
module.exports = async function findUsersNearLocationAggregate(baseLocation, page = 1, limit = 20) {
  // quick validation on location object passed
  if(!lodash.get(baseLocation, 'geometry.location', undefined))
    throw new Error('NO_LOCATION_SPECIFIED')
  // Grab user location to use as location origin
  const { lat, lng } = baseLocation.geometry.location
  let query = {
    $geoNear: {
      near: { type: "Point", coordinates: [lng, lat] },
      distanceField: "distanceAway",
      spherical: true,
      distanceMultiplier: CONVERT_METERS_TO_MILES
    }
  }
  const users = await User.aggregate([
    query,
    { $skip: ((page - 1) * limit) },
    { $limit: limit }
  ])
  // const [{ count }] = await User.aggregate([
  //   query,
  //   { $count: 'count' }
  // ])
  return {
    user: users,
    // totalPages: Math.ceil(count / limit)
    // currentPage: page
  }
}
This function is meant to return the list of users (20 at a time) along with other data such as totalPages and currentPage, so the front end can track pagination and make subsequent requests.
When I comment out the following lines:
const [{ count }] = await User.aggregate([
  query,
  { $count: 'count' }
])
the route that uses this call takes at most 100ms.
When I comment them back in, the call jumps to approximately 1100ms.
Is there some way to get the pagination data I am looking for without a significant increase in request time? A 10x increase seems quite large.
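One commonly suggested way to avoid the second round trip is to compute the page and the total in a single pipeline, with a $facet stage placed after $geoNear. The sketch below only builds that pipeline and unpacks its result shape. The helper names buildNearbyUsersPipeline and toPagedResult, and the value given to CONVERT_METERS_TO_MILES, are assumptions for illustration. Whether this is actually faster than two aggregations has not been measured here, since the count still has to walk every document that $geoNear emits.

```javascript
// Assumed value: the original constant is not shown in the question.
const CONVERT_METERS_TO_MILES = 1 / 1609.344

// Hypothetical helper: builds one pipeline that returns a page of users
// and the total count together, using $facet after $geoNear.
function buildNearbyUsersPipeline(lng, lat, page = 1, limit = 20) {
  return [
    {
      $geoNear: {
        near: { type: 'Point', coordinates: [lng, lat] },
        distanceField: 'distanceAway',
        spherical: true,
        distanceMultiplier: CONVERT_METERS_TO_MILES
      }
    },
    {
      $facet: {
        users: [{ $skip: (page - 1) * limit }, { $limit: limit }],
        total: [{ $count: 'count' }]
      }
    }
  ]
}

// Hypothetical helper: turns the single $facet result document into the
// shape the route returns. `total` is an empty array when nothing matches.
function toPagedResult(facetResult, page, limit) {
  const [{ users = [], total = [] } = {}] = facetResult
  const count = total.length > 0 ? total[0].count : 0
  return { users, totalPages: Math.ceil(count / limit), currentPage: page }
}
```

With Mongoose this would be used roughly as toPagedResult(await User.aggregate(buildNearbyUsersPipeline(lng, lat, page, limit)), page, limit).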

Related

Using Mongo Geospatial queries on lots of data

I am using a Mongo database with mongoose in Nest, a TypeScript server framework.
I have 2 Mongo collections. One contains 20,000 user locations; the other contains 10,000 points of interest (POIs) gathered from the Google Places API.
I now want to find intersections between the gathered locations and these places (each of which contains a GeoJSON point with lat and lng).
In other words, I want to see where users were in relation to these POIs.
Currently, I have an async method that finds all the locations near a point, using the $nearSphere operator.
I think the next step is to iterate over each place (10,000 of them) in the Mongo collection and run this method on each one. That way I will have a list of which POIs were 'nearby' when that specific location was caught.
Is there a better way to do this? I believe this approach will struggle on performance. I cannot find a Mongo geospatial query that will let me compare 2 sets of locations.
Get all locations near point:
async findAllNearPlace(coords): Promise<Location[]> {
  return await this.locationModel.find(
    {
      location:
        { $nearSphere:
          {
            $geometry: { type: "Point", coordinates: coords },
            $minDistance: 0,
            $maxDistance: 100
          }
        }
    }
  );
}
Each POI - check locations:
async findUsersInProximity(places): Promise<Location[]> {
  const locations = [];
  let i = places.length - 1;
  while (i > 0) {
    await this.findAllNearPlace(
      places[i].location.coordinates
    ).then(intersectingLocations => {
      locations.push(...intersectingLocations);
      i--;
    });
  }
  return await locations;
}
As expected, the performance of this is poor and takes minutes.
What you can probably do is write an aggregation with a $lookup. I have not tested it and I don't know for sure whether it performs better, but you could try something like the following:
let pipeline = [
  {
    $geoNear: {
      includeLocs: "location",
      distanceField: "distance",
      near: { type: 'Point', coordinates: "$$point" },
      maxDistance: 20,
      spherical: true
    }
  }
];
UsersModel.aggregate([{
  $lookup : {
    from : 'locations',
    let : {point : '$address'}, //whatever the field for the coordinates is
    pipeline ,
    as : 'places'
  }
}])
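If the per-place loop from the question is kept instead, one modest improvement is to stop awaiting each query one at a time and run them in bounded batches. The sketch below is independent of Mongoose: findNear stands in for findAllNearPlace, and the helper name and the default batch size of 50 are assumptions, not tuned values.

```javascript
// Hypothetical helper: runs `findNear` for every place, `batchSize` queries
// at a time, and collects all matching locations in input order.
async function findLocationsNearPlaces(places, findNear, batchSize = 50) {
  const locations = []
  for (let start = 0; start < places.length; start += batchSize) {
    const batch = places.slice(start, start + batchSize)
    // Queries within one batch run concurrently
    const results = await Promise.all(
      batch.map((place) => findNear(place.location.coordinates))
    )
    for (const found of results) locations.push(...found)
  }
  return locations
}
```

This version also visits every place, including index 0, which the while (i > 0) loop in the question never reaches.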

Meteor publication sorting

I have a Meteor-React app that contains a collection with a lot of data. I am displaying the data with pagination.
On the server side I publish only the data for the current page.
So, I am publishing some data on the server side:
Meteor.publish('animals', function(currPage, displayPerPage, options) {
  const userId = this.userId;
  if (userId) {
    const currentUser = Meteor.users.findOne({ _id: userId });
    let skip = (currPage - 1) * displayPerPage;
    if (displayPerPage > 0) {
      Counts.publish(this, 'count-animals', Animals.find(
        {$and: [
          // Counter Query
        ]}
      ), {fastCount: true});
      return Animals.find(
        {$and: [
          // Data query
        ]}, {sort: options.sortOption, skip: skip, limit: displayPerPage });
    } else {
      Counts.publish(this, 'count-animals', 0);
      return [];
    }
  }
});
And on the client side I am using tracker:
export default AnimalsContainer = withTracker(({subscriptionName, subscriptionFun, options, counterName}) => {
  let displayPerPage = Session.get("displayPerPage");
  let currPage = Session.get("currPage");
  let paginationSub = Meteor.subscribe(subscriptionName, currPage, displayPerPage, options );
  let countAnimals = Counts.get(counterName);
  let data = Animals.find({}).fetch({});
  // console.log(data);
  return {
    // data: data,
    data: data.length <= displayPerPage ? data : data.slice(0, displayPerPage),
    countAnimals: countAnimals,
  }
})(Animals);
The problem is:
When I change the sort options on the client side, the server does not sort starting from the first document; it skips some of the first ones. Sometimes it starts from the 20th, sometimes from the 10th.
The type checks are done on both sides.
Two things I can think of.
Keep an eye on the {sort: options.sortOption, skip: skip, limit: displayPerPage } options. As far as I know, the cursor always sorts first, then skips, then limits, regardless of the order in which the options are written.
Do the sort on both client and server. When the sort happens on the server and the result is streamed to the client, the client holds the documents in Minimongo, which doesn't guarantee an order. Therefore you need to sort on the client as well.
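In a Meteor client the second point usually amounts to passing the same sort option to the client-side find, for example Animals.find({}, { sort: options.sortOption }). The plain JavaScript sketch below mimics that step outside Meteor to show why it matters. The sortDocs helper and the sample documents are made up for illustration and are not part of Minimongo.

```javascript
// Hypothetical helper mimicking a Mongo-style sort spec such as { name: 1 }
// or { age: -1, name: 1 } on an array of plain documents.
function sortDocs(docs, sortSpec) {
  const fields = Object.entries(sortSpec)
  // Copy first so the cached array is not mutated
  return [...docs].sort((a, b) => {
    for (const [field, direction] of fields) {
      if (a[field] < b[field]) return -direction
      if (a[field] > b[field]) return direction
    }
    return 0
  })
}

// Documents as they might sit in the client cache: right page, arbitrary order
const cached = [
  { name: 'zebra', age: 4 },
  { name: 'ant', age: 1 },
  { name: 'lynx', age: 9 }
]
const page = sortDocs(cached, { name: 1 })
console.log(page.map((doc) => doc.name))
```

The server-side sort decides which documents belong to the page; the client-side sort decides the order in which that page is displayed.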

Returning the Results as well as subscribing in Meteor Mongo Geosearch

Using Meteor.js I've got the following code for my pub/sub working flawlessly. I'm able to pass my arguments through and return the cursors with no problem.
My objective is to display the distance between the current user location and the database result.
As mongodb has already calculated the distance to get the result set I don't want to calculate it again someplace else. I'd like to return the geoNear.results[n].dis results of the $geoNear documented here but can't work out a practical way to go about it. I appreciate the publish only returns a cursor to the docs but wondered if there was some way to attach the results somehow...
Meteor.publishComposite("public", function(location, distance) {
  return {
    find: function() {
      return Tutors.find({},
        {
          $geoNear: {
            $geometry: {
              type: "Point" ,
              coordinates: [ location.lng , location.lat ]
            },
            $maxDistance: distance,
          },
        }
      );
    }
  }
});
My subscribe arguments are simply a lat/lng object and distance in metres.
What if I told you that you could use Mongo aggregation? The general idea is to have the distance between the current user location and each database result update automatically when the 'Tutors' collection changes, so use a publication with an observe to achieve this.
Here's the set-up. The first step is to get the aggregation framework package, which wraps up some Mongo methods for you. Just run meteor add meteorhacks:aggregate and you should be home and dry. This adds an aggregate() method to your collections.
An alternative to adding the aggregation package is to talk to MongoDB directly and use the underlying collection methods, in this case aggregate(). Use this to connect to MongoDB:
var db = MongoInternals.defaultRemoteCollectionDriver().mongo.db,
    Tutors = db.collection("tutors");
Now you can dive into the aggregation framework and build your pipeline queries. The following example shows how to make the aggregation reactive by using an observe inside the publish function, with ES6 in Meteor. It follows the 'counts-by-room' example in the Meteor docs. With the observe, you know when a location has been added, changed or removed. For simplicity, re-run the aggregation each time: if a location was previously published, update it; if it is no longer in the results, remove it from the publication; and for a new location use added:
Meteor.publish('findNearestTutors', function(opts) {
  let initializing = true
  // Ids already sent to this subscriber (assumes string _id values)
  const published = new Set()
  const run = () => {
    // Define the aggregation pipeline
    const pipeline = [
      {
        $geoNear: {
          near: {type: 'Point', coordinates: [Number(opts.lng), Number(opts.lat)]},
          distanceField: 'distance',
          maxDistance: opts.distance,
          spherical: true
          // no sort option needed: $geoNear returns results nearest first
        }
      }
    ]
    const nearby = new Set()
    Tutors.aggregate(pipeline).forEach((location) => {
      nearby.add(location._id)
      // New locations are added; previously published ones are updated
      const action = published.has(location._id) ? 'changed' : 'added'
      this[action]('nearest-locations', location._id, location)
      published.add(location._id)
    })
    // Locations that dropped out of the results are removed from the publication
    published.forEach((id) => {
      if (!nearby.has(id)) {
        this.removed('nearest-locations', id)
        published.delete(id)
      }
    })
  }
  // Run the aggregation initially to publish the first result set
  run()
  this.ready()
  // Track any changes on the collection you are going to use for aggregation
  const handle = Tutors.find({}).observeChanges({
    added: () => {
      // observeChanges only returns after the initial `added` callbacks
      // have run. Until then, skip re-running the aggregation - hence
      // tracking the `initializing` state.
      if (!initializing) run()
    },
    removed: () => run(),
    changed: () => run()
  })
  initializing = false
  // Stop observing the cursor when client unsubs.
  // Stopping a subscription automatically takes
  // care of sending the client any removed messages.
  this.onStop(() => handle.stop())
})

Mobile Meteor App - calculating nearby locations and storing as sortable collection

I'm putting together a mobile meteor application that I want to use to list local (say within a 20 mile radius) amenities. I have a collection of these amenities with corresponding latlng data - I'd like to be able to pass the app my current location (using Cordova) and generate a list (/collection?) that is sorted closest first.
I have two specific problems that I'd really appreciate some advice on!
Can I use mongo's $near for this or should I be using a node.js add-on (eg 'GeoLib' - https://github.com/manuelbieh/geolib) to do the distance calculation?
How do I generate a temporary (locally stored) collection of these locations to display in my list? Presumably if I don't use $near I have to iterate through my locations, calculating the distance on all of them and then returning any where the distance is under a certain threshold, but this seems like an expensive way of doing it when my location list will grow and grow.
Sorry, this is the first time I've attempted something like this; I'd really appreciate any advice from a more seasoned developer!
EDIT - MY CODE (WHY IS IT NOT WORKING?!)
I'm storing locations like this in a collection:
Beaches.insert({
  name: 'Venice Beach CA',
  geometry: {
    type: "Point",
    coordinates: [-118.473314,118.473314]
  }
});
...
Beaches._ensureIndex({'geometry.coordinates':'2d'}, function(err, result) {
  if(err) return console.dir(err);
});
I'm querying these entries like this (passing in the lat and lng):
getNearBeaches = function(lng,lat) {
  return Beaches.find({'geometry.coordinates':
    {
      $near: {
        $geometry: {
          type: "Point",
          coordinates: [lng,lat]
        }
      },
      $maxDistance: 20000 //meters
    }
  })
};
I can list my collection with a straight find() but my location search returns nothing, even if I set the $maxDistance to a huge number, or search directly on an already stored set of coords.
What have I done wrong??
You can fetch the current location, send it to the server and subscribe to a collection filtered by MongoDB's $near:
Meteor.subscribe("amenities", Session.get("latlng"));
And something like this on the server:
Meteor.publish("amenities", function (latlng) {
  // latlng is expected to be a [longitude, latitude] array
  return Amenities.find({
    location: {
      $near: {
        $geometry: {
          type: "Point",
          coordinates: latlng
        },
        $maxDistance: 20000 //meters
      }
    }
  });
});
You must have the correct data type and indexes: a field named location holding a GeoJSON Point, with a 2dsphere index on it (-:
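The checklist above can be sketched as a small filter builder. buildNearFilter is a hypothetical helper, not a Meteor or MongoDB API. It only encodes the points that matter here: GeoJSON order is [longitude, latitude], latitude must lie within -90 to 90, and with $geometry the $maxDistance value belongs inside $near. The sample document in the question stores 118.473314 as its second coordinate, which a check like this would reject.

```javascript
// Hypothetical helper: validates a position and builds a $near filter for a
// field that holds a GeoJSON Point and has a 2dsphere index.
function buildNearFilter(field, lng, lat, maxMeters) {
  if (!(lng >= -180 && lng <= 180)) {
    throw new RangeError(`longitude out of range: ${lng}`)
  }
  if (!(lat >= -90 && lat <= 90)) {
    throw new RangeError(`latitude out of range: ${lat}`)
  }
  return {
    [field]: {
      $near: {
        // GeoJSON order is [longitude, latitude]
        $geometry: { type: 'Point', coordinates: [lng, lat] },
        // With $geometry, $maxDistance is in meters and sits inside $near
        $maxDistance: maxMeters
      }
    }
  }
}
```

The index would then go on the GeoJSON field itself, for example with the same _ensureIndex call the question uses but passing { location: '2dsphere' }.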

Mongo geolocation using $near and 2d index not being accurate

I've written an application that finds local establishments and delivers them via a RESTful API. My stack is: express, express-resource and mongoose. Here is a sample of my model:
var PlaceSchema = new mongoose.Schema(
  {
    name: {
      type: String,
      required: true
    },
    geolocation: {
      lat: Number,
      lng: Number
    },
    address: {
      type: String
    }
  }
);
PlaceSchema.index({
  geolocation: '2d'
});
I've checked a few times and my lon/lat values are being correctly stored in the db, so there aren't any data errors.
Then I run the query to grab the data with the specific query values:
mongoose.connection.db.executeDbCommand(
  {
    geoNear: 'places',
    near: [
      parseFloat(req.body.lat),
      parseFloat(req.body.lng)
    ],
    num: limit,
    query: {
      // Some additional queries that
      // don't impact the geo results.
    },
    spherical: true,
    distanceMultiplier: 6371, // converting results to km
    maxDistance: distance / 6371,
  },
  function(err, result)
  {
    res.send(200, {
      status: true,
      data: theaters
    });
  }
);
There are a few issues with the results that it's returning: a) the calculation in km is really wrong. In most cases there's a 6-7km difference, but it varies, b) places that are closer are appearing farther than other places.
I'm using the direct mongoose query because it will return the calculated distance (which I require for my API returns). Switching to the mongoose find method apparently wouldn't let me gain access to this data.
Wondering if my index is perhaps wrong? Should it be 2dsphere instead? The documentation in that regard is slightly confusing, but most of the examples I see use just 2d.
Thanks a bunch!
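No matter what type of geospatial index you use in MongoDB, you must always store longitude first and then latitude. In the question, both the geolocation sub-document (lat, lng) and the near array ([lat, lng]) use the opposite order.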
From http://docs.mongodb.org/manual/core/2d/#store-points-on-a-2d-plane and multiple other places in the docs:
Whether as an array or document, if you use longitude and latitude,
store coordinates in this order: longitude, latitude.
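A minimal sketch of the corrected ordering for the query in the question, in plain JavaScript. toLegacyPair, kmToRadians and the sample request body are hypothetical names and values used only for illustration. The radians conversion mirrors the 6371 factor the question already uses for legacy coordinate pairs with spherical: true.

```javascript
const EARTH_RADIUS_KM = 6371

// Hypothetical helper: legacy coordinate pair in the order MongoDB expects
function toLegacyPair({ lat, lng }) {
  return [lng, lat]
}

// With legacy pairs and spherical: true, distances are expressed in radians
function kmToRadians(km) {
  return km / EARTH_RADIUS_KM
}

// Hypothetical request body
const body = { lat: '40.7128', lng: '-74.0060' }
const near = toLegacyPair({ lat: parseFloat(body.lat), lng: parseFloat(body.lng) })
console.log(near, kmToRadians(10))
```

The same order applies to the stored data, so existing documents saved as lat, lng would need to be rewritten before the results line up.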