Geonear and more than one 2dsphere indexes - mongodb

I have a question about using $near versus geoNear to return the distance from points stored in the database to a user-entered point of interest, when the schema storing the points has more than one 2dsphere index.
The use case is below.
In my schema I have a source and a destination location as below. The query using Intracity.find works properly and gives me sorted entries from an entered point of interest.
var baseShippingSchema = new mongoose.Schema({
startDate : Date,
endDate : Date,
locSource: {
type: [Number],
index: '2dsphere'
},
locDest: {
type: [Number],
index: '2dsphere'
}
});
var search_begin = moment(request.body.startDate0, "DD-MM-YYYY").toDate();
var search_end = moment(request.body.endDate1, "DD-MM-YYYY").toDate();
var radius = 7000;
Intracity.find({
locSource: {
$near: {
$geometry: {type: "Point",
coordinates: [request.body.lng0, request.body.lat0]},
$minDistance: 0,
$maxDistance: radius
}
}
}).where('startDate').gte(search_begin)
.where('endDate').lte(search_end)
.limit(limit).exec(function(err, results)
{
// pass the callback's err through to the template
response.render('test.html', {results : results, error: err});
});
However, I also want to return the "distance" of the stored points from the point of interest, which as per my knowledge and findings, is not possible using $near but is possible using geonear api.
However, the documentation of geonear says the following.
geoNear requires a geospatial index. However, the geoNear command requires that a collection have at most only one 2d index and/or only one 2dsphere.
Since my schema has two 2dsphere indexes, the following geoNear call fails with the error "more than one 2d index, not sure which to run geoNear on"
var point = { name: 'locSource', type : "Point",
coordinates : [request.body.lng0 , request.body.lat0] };
Intracity.geoNear(point, { limit: 10, spherical: true, maxDistance:radius, startDate:{ $gte: search_begin}, endDate:{ $lte:search_end}}, function(err, results, stats) {
if (err){return done(err);}
response.render('test.html', {results : results, error: err});
});
So my question is how can I also get the distance for each of these stored points, from entered point of interest using the schema described above.
Any help would be really great, as my Internet search is not going anywhere.
Thank you
Mrunal

As you noted, the MongoDB docs state that
The geoNear command and the $geoNear pipeline stage require that a collection have at most only one 2dsphere index and/or only one 2d index
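For what it's worth, newer servers relax this restriction: as far as I know, from MongoDB 4.0 onward the $geoNear aggregation stage accepts a key option that names the indexed field to search, so a collection with several 2dsphere indexes can still be queried. A minimal sketch, assuming MongoDB 4.0 or later and the schema from the question; buildGeoNearPipeline and the output field name dist are made up for illustration:

```javascript
// Hypothetical helper (not part of mongoose or the driver).
// Assumes MongoDB 4.0+, where $geoNear accepts the `key` option.
function buildGeoNearPipeline(lng, lat, radiusMeters, searchBegin, searchEnd, limit) {
  return [
    {
      // $geoNear has to be the first stage of the pipeline
      $geoNear: {
        near: { type: "Point", coordinates: [Number(lng), Number(lat)] },
        key: "locSource",        // which 2dsphere-indexed field to use
        distanceField: "dist",   // each result gets its distance (meters) here
        maxDistance: radiusMeters,
        spherical: true,
        query: { startDate: { $gte: searchBegin }, endDate: { $lte: searchEnd } }
      }
    },
    { $limit: limit }
  ];
}

// Usage (not run here):
// Intracity.aggregate(buildGeoNearPipeline(lng0, lat0, 7000, search_begin, search_end, 10))
//   .exec(function (err, results) { /* results[i].dist holds the distance */ });
```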
On the other hand, calculating distances inside MongoDB is only possible with the aggregation framework, because the distance is a specialized projection. That leaves you two options:
relational DB approach: maintain a separate distance table between all items
document store approach: calculate the distances in your server-side JS code. You would have to stay within memory limits by paginating the results.
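The server-side calculation could look roughly like the sketch below. It uses the haversine formula on a spherical earth with a mean radius of 6371 km, which is an approximation I am assuming here; haversineMeters and withDistances are illustrative names, and locSource is the [lng, lat] array from the question's schema.

```javascript
// Great-circle distance in meters between two [lng, lat] pairs given in degrees.
// Assumes a spherical earth with mean radius 6371 km (an approximation).
function haversineMeters(a, b) {
  const R = 6371000;
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b[1] - a[1]);
  const dLng = toRad(b[0] - a[0]);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a[1])) * Math.cos(toRad(b[1])) * Math.sin(dLng / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Pair every document returned by the $near query with its distance
// from the point of interest `poi` ([lng, lat]).
function withDistances(results, poi) {
  return results.map((doc) => ({
    doc: doc,
    distance: haversineMeters(poi, doc.locSource)
  }));
}
```

Since $near already returns the documents sorted by distance, the extra pass only annotates them and does not need to re-sort.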

Related

Filter Documents by Distance Stored in Document with $near

I am using the following example to better explain my need.
I have a set of points(users) on a map and collection schema is as below
{
location:{
latlong:[long,lat]
},
maxDistance:Number
}
I have another collection with events happening in the area. Its schema is given below
{
eventLocation:{
latlong:[long,lat]
}
}
Users can save their location and the maximum distance they are willing to travel to attend an event.
Whenever a new event is posted, every user whose preferences it satisfies should get a notification. How do I query for that? I tried the following query on the user schema
{
$where: {
'location.latlong': {
$near: {
$geometry: {
type: "Point",
coordinates: [long,lat]
},
$maxDistance: this.distance
}
}
}
}
and got an error
error: {
"$err" : "Can't canonicalize query: BadValue $where got bad type",
"code" : 17287
}
How do I query for the above case, given that maxDistance is defined per user and is not fixed? I am using a 2dsphere index.
Presuming you have already worked out how to act on the event data as you receive it and have it in hand ( if you have not, then that is another question, but look at tailable cursors ), you should have an object with that data with which to query the users.
This is therefore not a case for JavaScript evaluation with $where, as it cannot access the query data returned from a $near operation anyway. What you want instead is $geoNear from the aggregation framework. This can project the "distance" found from the query, and allow a later stage to "filter" the results against the user stored value for the maximum distance they want to travel to published events:
// Represent retrieved event data
var eventData = {
eventLocation: {
latlong: [long,lat]
}
};
// Find users near that event within their stored distance
User.aggregate(
[
{ "$geoNear": {
"near": {
"type": "Point",
"coordinates": eventData.eventLocation.latlong
},
"distanceField": "eventDistance",
"limit": 100000,
"spherical": true
}},
{ "$redact": {
"$cond": {
"if": { "$lt": [ "$eventDistance", "$maxDistance" ] },
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
],
function(err,results) {
// Work with results in here
}
)
Now you do need to be careful with the returned number: since you appear to be storing "legacy coordinate pairs" instead of GeoJSON, the distance returned from this operation will be in radians and not a standard distance unit. So presuming you are storing "miles" or "kilometers" on the user objects, you need to convert using the formula given in the manual under "Calculate Distances Using Spherical Geometry".
The basics are that you need to divide by the equatorial radius of the earth, being either 3,963.2 miles or 6,378.1 kilometers to convert for a comparison to what you have stored.
The alternate is to store in GeoJSON instead, where there is a consistent measurement in meters.
Assuming "kilometers" that "if" line becomes:
"if": { "$lt": [
"$eventDistance",
{ "$divide": [ "$maxDistance", 6378.1 ] }
]},
This reliably compares your stored kilometer value to the radian result returned.
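As a quick sanity check of that conversion, here is the same arithmetic in plain JavaScript. The helper names are made up for illustration and the radius is the equatorial value quoted above. A user willing to travel 50 kilometers ends up with a threshold of roughly 0.0078 radians:

```javascript
// Equatorial radius of the earth in kilometers, as quoted above.
const EARTH_RADIUS_KM = 6378.1;

// Convert a stored kilometer value into radians for comparison
// with the distance $geoNear reports for legacy coordinate pairs.
function kmToRadians(km) {
  return km / EARTH_RADIUS_KM;
}

// Convert a radian distance from $geoNear back into kilometers.
function radiansToKm(radians) {
  return radians * EARTH_RADIUS_KM;
}

console.log(kmToRadians(50));
```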
The other thing to be aware of is that $geoNear has a default "limit" of 100 results, so you need to "pump up" the "limit" argument there to the number of users you expect could match. You might even want to do this in "range lists" of user ids for a really large system, but you can go as big as memory allows within a single aggregation operation and possibly add allowDiskUse where needed.
If you don't tune that parameter, then only the nearest 100 results ( default ) will be returned, which may well not even suit your next operation of filtering those "near" the event to start with. Use common sense though, as you surely have a max distance beyond which potential users can be excluded, and that can be added to the query as well.
As stated, the point here is returning the distance for comparison, so the next stage is the $redact operation, which can filter the user's own "travel distance" value against the returned distance from the event. The end result gives only those users that fall within their own distance constraint from the event, and who therefore qualify for notification.
That's the logic. You project the distance from the user to the event and then compare to the user stored value for what distance they are prepared to travel. No JavaScript, and all native operators that make it quite fast.
Also as noted in the options and the general commentary, I really do suggest you use a "2dsphere" index for accurate spherical distance calculation as well as converting to GeoJSON storage for your coordinate storage in your database Objects, as they are both general standards that produce consistent results.
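A rough sketch of what the GeoJSON variant might look like. Both helper names are hypothetical, and I am assuming maxDistance stays in kilometers on the user documents, so it is scaled to meters before the comparison because $geoNear reports meters for GeoJSON points:

```javascript
// Hypothetical migration helper: turns { location: { latlong: [lng, lat] } }
// into a document whose location is a GeoJSON Point.
function toGeoJsonUser(user) {
  const [lng, lat] = user.location.latlong;
  return {
    ...user,
    location: { type: "Point", coordinates: [lng, lat] }
  };
}

// $redact stage for GeoJSON storage: eventDistance is in meters,
// maxDistance is assumed to be stored in kilometers.
function withinOwnRangeStage() {
  return {
    $redact: {
      $cond: {
        if: { $lt: ["$eventDistance", { $multiply: ["$maxDistance", 1000] }] },
        then: "$$KEEP",
        else: "$$PRUNE"
      }
    }
  };
}
```

The 2dsphere index would then go on the location field itself rather than on location.latlong.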
Try it without embedding your query in $where: {. The $where operator is for passing a JavaScript function to the database, which you don't seem to want to do here (and which you should generally avoid for performance and security reasons). It has nothing to do with location.
{
'location.latlong': {
$near: {
$geometry: {
type: "Point",
coordinates: [long,lat]
},
$maxDistance: this.distance
}
}
}

How to find nearby events or tweets

I'm new to NoSQL databases and I'm stuck with a fairly basic query.
I have a collection of tweets in a MongoDB database, which I'm querying through both the Mongo shell and pyMongo. The documents are similar to:
{ loc : { lng : 40, lat : 3 },
timestamp : 124125512,
userid : 55 }
I need to find all pairs of users with events close to each other with less than 4 hours of difference. The most naive way would be:
db.tweets.find().forEach(function(tweet)
{
found = db.tweets.find({ "timestamp": { "$gt" : tweet['timestamp'] - 60*60*4,
"$lt" : tweet['timestamp'] + 60*60*4},
"loc" : {"$near" : [ tweet['loc']['lng'],
tweet['loc']['lat'] ],
"$maxDistance" : 500 }
});
//... extract the users from those tweets...
});
This is of course extremely slow (the collection can contain as many as a few million tweets).
I haven't been able to express this query using either aggregation or MapReduce. How would you do it? What is the most NoSQL-y, efficient and clear way of making this kind of query?
EDIT: I've kind of given up. A friend has convinced me that it is not going to be worth using Mongo for this. I can leverage the time restriction to avoid iterating over the whole collection and do it in a simple, more traditional iterative script. Since the dataset is not so huge that it doesn't fit in RAM, that is going to be faster.
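For anyone landing here, the iterative idea from the edit might be sketched like this: sort once by timestamp, then compare each tweet only with the later tweets that fall inside the time window. This is just one way to do it; findClosePairs is a made-up name, distFn is whatever distance function you trust for your coordinates, and userid is assumed to be numeric as in the sample document.

```javascript
// Returns the distinct user pairs ("smallerId-largerId") that have tweets
// closer than maxDist to each other and less than maxSeconds apart.
function findClosePairs(tweets, maxSeconds, maxDist, distFn) {
  // Sort a copy so the caller's array is left untouched.
  const sorted = tweets.slice().sort((a, b) => a.timestamp - b.timestamp);
  const pairs = new Set();
  for (let i = 0; i < sorted.length; i++) {
    for (let j = i + 1; j < sorted.length; j++) {
      // Sorted by time: once the gap is too large, later tweets are too.
      if (sorted[j].timestamp - sorted[i].timestamp >= maxSeconds) break;
      if (sorted[i].userid === sorted[j].userid) continue;
      if (distFn(sorted[i].loc, sorted[j].loc) <= maxDist) {
        const lo = Math.min(sorted[i].userid, sorted[j].userid);
        const hi = Math.max(sorted[i].userid, sorted[j].userid);
        pairs.add(lo + "-" + hi);
      }
    }
  }
  return Array.from(pairs);
}

// Example with a flat (degrees) distance, purely for illustration.
const flat = (p, q) => Math.hypot(p.lng - q.lng, p.lat - q.lat);
const sample = [
  { loc: { lng: 0, lat: 0 }, timestamp: 0, userid: 1 },
  { loc: { lng: 0.001, lat: 0 }, timestamp: 3600, userid: 2 },
  { loc: { lng: 0, lat: 0 }, timestamp: 100000, userid: 3 }
];
console.log(findClosePairs(sample, 60 * 60 * 4, 0.01, flat)); // [ '1-2' ]
```

The inner loop only ever looks at tweets within four hours of the current one, so the cost depends on how dense the tweets are in time rather than on the square of the collection size.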
Using $near in conjunction with $maxDistance is the recommended way:
db.collectionName.find({loc: {$near: [50, 50], $maxDistance: 5}});
For performance issues you can try creating index as mentioned below:
To create a geospatial index for GeoJSON-formatted data, use the ensureIndex() method and set the value of the location field for your collection to 2dsphere.
db.points.ensureIndex( { loc : "2dsphere" } );
For more information:
Index creation
Build a 2dsphere index
Geospatial indexes and queries

mongodb and geospatial schema

I'm racking my brain over Mongo and geospatial queries,
so maybe someone has an idea or a solution for this:
my object schema is like this sample for geoJSON taken from http://geojson.org/geojson-spec.html.
{
"name":"name",
"geoJSON":{
"type":"FeatureCollection",
"features":[
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[100,0],[101,0],[101,1],[100,1],[100,0]]]},"properties":{}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[100,0],[101,0],[101,1],[100,1],[100,0]]]},"properties":{}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[100,0],[101,0],[101,1],[100,1],[100,0]]]},"properties":{}}
]}}
additional info: I'm using spring data but that shouldn't influence the answer.
The main problem is how/where to put indexes in this schema. I need a query that, for a given Point, finds all documents in which some polygon intersects it.
thanks in advance.
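By creating a 2dsphere index on geoJSON.features.geometry you should be able to create an index covering all of the GeoJSON objects (as far as I know a 2d index only handles legacy coordinate pairs, so it is not the right choice for GeoJSON polygons).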
To get all documents where at least one of the sub-object in the features array covers a certain point, you can use the $geoIntersects operator with a geoJSON Point:
db.yourcollection.find(
{ "geoJSON.features.geometry" :
{ $geoIntersects :
{ $geometry :
{ type : "Point" ,
coordinates: [ 100.5 , 0.5 ]
}
}
}
}
)

Search all polygons that contains a series of points in mongodb

My question is similar to this one: "Given a set of polygons and a series of points, find which polygons the points are located in".
I have a MongoDB database with two collections: regions, which stores a set of polygons (Italian provinces), and points, which stores a set of specimens, each with a coordinate pair. I am using MongoDB 2.4 and the GeoJSON format to store the data, and both collections have a 2dsphere index.
I am able to find if a given point is inside a given polygon.
Now I would like to find all polygons that contain at least one of a list of specimens, to draw a map like this one http://www.peerates.org/province.png
Is there a better solution, leveraging MongoDB geo indexes, than iterating over all points and checking whether each one is inside each polygon?
Edit:
I found a partial solution using a function stored in the system.js collection
function(){
var found = [];
var notfound = [];
db.regions.find().forEach(
function(region){
var regionId = region._id;
var query = {
'loc':{
$geoWithin: {
$geometry: region.loc
}
}
};
var len = db.points.find(query).size();
if(len>0){
found.push(regionId);
}else{
notfound.push(regionId);
}
}
);
return {
"found":found,
"notfound":notfound
};
}
Sadly I cannot use it on mongohq.com; it looks like eval() is no longer supported.
@RickyA thank you, I will consider moving to PostGIS
There is $geoIntersects in the mongodb geospatial library that does that.
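A minimal sketch of how that could be used here, turning the loop around so that each point does one indexed lookup against regions instead of each region scanning points. regionsContainingQuery is a made-up helper name, and loc is assumed to be the GeoJSON field on both collections, as in the question's function:

```javascript
// Builds a query matching every region whose polygon intersects
// (for a point: contains) the given GeoJSON Point.
function regionsContainingQuery(point) {
  return { loc: { $geoIntersects: { $geometry: point } } };
}

// Usage in the mongo shell (not run here): collect the ids of all regions
// that hold at least one specimen.
//
// var found = {};
// db.points.find().forEach(function (p) {
//   db.regions.find(regionsContainingQuery(p.loc), { _id: 1 }).forEach(function (r) {
//     found[r._id] = true;
//   });
// });
```

Whether this beats the per-region loop probably depends on how many points there are compared to regions, so treat it as something to measure rather than a guaranteed win.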

how to deal with complicated query in mongodb?

I use mongodb to save the temporal and spatial data, and the document item is structured as follows:
doc = { time:t,
geo:[x,y]
}
If the difference between two docs is defined as:
dist(doc1, doc2) = |t1-t2| + |x1-x2| + |y1 - y2|
How can I query the documents by mongodb and sort the results by their distance to a given document doc0 ={ time:t0, geo:[x0,y0] }?
thanks
Instead of calculating the distance manually, you could trust MongoDB with that task. MongoDB has built-in geospatial query support.
This would look like this:
db.docs.find( {
"time": t0,
"geo" : { $near : [x0,y0] }
} ).limit(20)
The result would be all documents near the given location [x0,y0], automatically ordered by distance to that point.
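Note that $near orders by geographic distance only, and the query above matches time exactly. If you really need the combined metric from the question, one option is to compute it in an aggregation pipeline and sort on it. A sketch, assuming time is stored as a number and a server with $addFields (MongoDB 3.4+, as far as I know); dist and buildDistPipeline are illustrative names, and this form scans the whole collection because no index can serve the computed field:

```javascript
// The metric from the question, in plain JavaScript (handy for checking results).
function dist(a, b) {
  return (
    Math.abs(a.time - b.time) +
    Math.abs(a.geo[0] - b.geo[0]) +
    Math.abs(a.geo[1] - b.geo[1])
  );
}

// The same metric as an aggregation pipeline, sorted ascending by distance to doc0.
function buildDistPipeline(doc0, limit) {
  const absDiff = (expr, value) => ({ $abs: { $subtract: [expr, value] } });
  return [
    {
      $addFields: {
        dist: {
          $add: [
            absDiff("$time", doc0.time),
            absDiff({ $arrayElemAt: ["$geo", 0] }, doc0.geo[0]),
            absDiff({ $arrayElemAt: ["$geo", 1] }, doc0.geo[1])
          ]
        }
      }
    },
    { $sort: { dist: 1 } },
    { $limit: limit }
  ];
}

// Usage in the mongo shell (not run here):
// db.docs.aggregate(buildDistPipeline({ time: t0, geo: [x0, y0] }, 20))
```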