How to deal with a complicated query in MongoDB? - mongodb

I use MongoDB to store temporal and spatial data, and each document is structured as follows:
doc = { time:t,
geo:[x,y]
}
If the distance between two docs is defined as:
dist(doc1, doc2) = |t1-t2| + |x1-x2| + |y1-y2|
How can I query the documents in MongoDB and sort the results by their distance to a given document doc0 = { time:t0, geo:[x0,y0] }?
Thanks

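Instead of calculating the spatial part of the distance manually, you could leave that task to MongoDB, which has built-in geospatial query support (this form of $near needs a 2d index on the geo field).
This would look like this:
db.docs.find( {
"time": t0,
"geo" : { $near : [x0,y0] }
} ).limit(20)
The result would be all documents with time equal to t0 near the given location [x0,y0], automatically ordered by spatial distance to that point. Note that the |t1-t2| term from the question is not part of this ordering.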
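Since $near ranks by spatial distance only, the combined metric from the question has to be applied somewhere else. A minimal sketch, assuming a candidate set has already been fetched and is small enough to sort in application code; the helper names dist and sortByDist are hypothetical and the sample data is made up:

```javascript
// dist() implements the metric defined in the question:
// |t1-t2| + |x1-x2| + |y1-y2|
function dist(a, b) {
  return Math.abs(a.time - b.time) +
         Math.abs(a.geo[0] - b.geo[0]) +
         Math.abs(a.geo[1] - b.geo[1]);
}

// Orders already-fetched documents by their distance to doc0.
function sortByDist(docs, doc0) {
  // Copy first so the caller's array is left untouched.
  return docs.slice().sort(function (a, b) {
    return dist(a, doc0) - dist(b, doc0);
  });
}

// Made-up example data.
var doc0 = { time: 10, geo: [0, 0] };
var docs = [
  { time: 20, geo: [1, 1] },
  { time: 11, geo: [3, 0] },
  { time: 10, geo: [0, 1] }
];
var sorted = sortByDist(docs, doc0);
```

Sorting in application code only scales to modest result sets, so it helps to narrow the candidates first with a time range or a $near query.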

Related

MongoDB best optimisation on embedded documents using index

In MongoDB, how can you best optimize a collection with embedded documents using an index, if you need to query the embedded document?
For example, in a collection with the following format:
{
  name: "Andy",
  address: {
    city: "London",
    street: "Sunny St."
  }
}
If we need to query by:
db.collection.find( {$and: [ {"address.city": "London"}, {"address.street": "Sunny St."} ] } )
Which type of index would be better?
1. db.collection.createIndex({"address":1})
2. db.collection.createIndex({"address.city":1})
db.collection.createIndex({"address.street":1})
3. db.collection.createIndex({"address.city":1, "address.street":1})
Thanks
For the given query, proposal number 3
db.collection.createIndex({"address.city":1, "address.street":1})
will do the job, as there is a logical relation city => street.
To see in more detail how MongoDB uses the index, and to run your own tests, append .explain("executionStats") to the query and check the index usage.
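MongoDB can use a compound index for queries on any leading prefix of its keys, which is why the key order city, street matters. The following is only a simplified illustration of that prefix rule in plain JavaScript; the function usablePrefix is hypothetical and ignores sorts, range predicates, and everything else the query planner considers:

```javascript
// Returns the leading index keys that are covered by the query's
// equality fields. An empty result means no prefix is usable.
function usablePrefix(indexKeys, queryFields) {
  var prefix = [];
  for (var i = 0; i < indexKeys.length; i++) {
    if (queryFields.indexOf(indexKeys[i]) === -1) break;
    prefix.push(indexKeys[i]);
  }
  return prefix;
}

var compound = ["address.city", "address.street"];
var both = usablePrefix(compound, ["address.city", "address.street"]);
var cityOnly = usablePrefix(compound, ["address.city"]);
var streetOnly = usablePrefix(compound, ["address.street"]);
```

Here both covers the full index, cityOnly covers the first key, and streetOnly is empty, so a street-only query would not benefit from this key order. The explain output remains the authoritative check.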

Geonear and more than one 2dsphere indexes

I have a question about using $near versus geoNear to return the distance between points stored in the database and a point of interest entered by the user, when the schema storing the points has more than one 2dsphere index.
The use case is below.
In my schema I have a source and a destination location as below. The query using Intracity.find works properly and gives me sorted entries from an entered point of interest.
var baseShippingSchema = new mongoose.Schema({
startDate : Date,
endDate : Date,
locSource: {
type: [Number],
index: '2dsphere'
},
locDest: {
type: [Number],
index: '2dsphere'
}
});
var search_begin = moment(request.body.startDate0, "DD-MM-YYYY").toDate();
var search_end = moment(request.body.endDate1, "DD-MM-YYYY").toDate();
var radius = 7000;
Intracity.find({
  locSource: {
    $near: {
      $geometry: {type: "Point",
        coordinates: [request.body.lng0, request.body.lat0]},
      $minDistance: 0,
      $maxDistance: radius
    }
  }
}).where('startDate').gte(search_begin)
  .where('endDate').lte(search_end)
  .limit(limit).exec(function(err, results)
{
  // pass the callback's err through to the template
  response.render('test.html', {results : results, error: err});
});
However, I also want to return the distance of each stored point from the point of interest. As far as I know, this is not possible with $near but is possible with the geoNear API.
The documentation of geoNear, however, says the following.
geoNear requires a geospatial index. However, the geoNear command requires that a collection have at most only one 2d index and/or only one 2dsphere.
Since my schema has two 2dsphere indexes, the following geoNear call fails with the error "more than one 2d index, not sure which to run geoNear on"
var point = { name: 'locSource', type : "Point",
coordinates : [request.body.lng0 , request.body.lat0] };
Intracity.geoNear(point, { limit: 10, spherical: true, maxDistance:radius, startDate:{ $gte: search_begin}, endDate:{ $lte:search_end}}, function(err, results, stats) {
if (err){return done(err);}
response.render('test.html', {results : results, error: err});
});
So my question is how can I also get the distance for each of these stored points, from entered point of interest using the schema described above.
Any help would be really great, as my Internet search is not going anywhere.
Thank you
Mrunal
As you noted, the MongoDB docs state that
The geoNear command and the $geoNear pipeline stage require that a collection have at most only one 2dsphere index and/or only one 2d index
On the other hand, calculating distances inside MongoDB is only possible with the aggregation framework, as it is a specialized projection. (MongoDB 4.0 and later relax the restriction above: the $geoNear stage accepts a key option naming the indexed field to use.) Where that is not available, you have two options:
relational DB approach: maintain a separate distance table between all items
document store approach: calculate distances in your server-side JS code. You would have to stay within memory limits by paginating results.
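The document store approach can be sketched as follows: run the existing $near query, then compute a great-circle distance for each result in application code. This assumes coordinates are stored as [lng, lat] pairs as in the schema above; the helper names haversine and withDistance are hypothetical, and the Earth radius is the usual mean-radius approximation, so the values may differ slightly from what MongoDB itself would report:

```javascript
var EARTH_RADIUS_M = 6371000; // mean Earth radius in meters (approximation)

function toRad(deg) {
  return deg * Math.PI / 180;
}

// Great-circle (haversine) distance in meters between two [lng, lat] pairs.
function haversine(a, b) {
  var dLat = toRad(b[1] - a[1]);
  var dLng = toRad(b[0] - a[0]);
  var h = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
          Math.cos(toRad(a[1])) * Math.cos(toRad(b[1])) *
          Math.sin(dLng / 2) * Math.sin(dLng / 2);
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
}

// Pairs each query result with its distance from the point of interest.
function withDistance(results, origin) {
  return results.map(function (doc) {
    return { doc: doc, distance: haversine(origin, doc.locSource) };
  });
}
```

Inside the exec callback of the find query, something like withDistance(results, [request.body.lng0, request.body.lat0]) would then produce the list to hand to the template.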

How to find nearby events or tweets

I'm new to NoSQL databases and I'm stuck with a fairly basic query.
I have a collection of tweets in a MongoDB database, which I'm querying through both the Mongo shell and pyMongo. The documents are similar to:
{ loc : { lng : 40, lat : 3 },
timestamp : 124125512,
userid : 55 }
I need to find all pairs of users with events close to each other with less than 4 hours of difference. The most naive way would be:
db.tweets.find().forEach(function(tweet)
{
  // one extra query per tweet: this is what makes the approach slow
  var found = db.tweets.find({ "timestamp": { "$gt" : tweet['timestamp'] - 60*60*4,
                                              "$lt" : tweet['timestamp'] + 60*60*4},
                               "loc" : {"$near" : [ tweet['loc']['lng'],
                                                    tweet['loc']['lat'] ],
                                        "$maxDistance" : 500 }
  });
  //... extract the users from those tweets...
});
This is of course extremely slow (the collection can contain as many as a few million tweets).
I haven't been able to express this query using either aggregation or MapReduce. How would you do it? What is the most NoSQL-y, efficient and clear way of making this kind of query?
EDIT: I've kind of given up. A friend has convinced me that it is not going to be worth using Mongo for this. I can leverage the time restriction to avoid iterating over the whole collection and do it in a simple, more traditional iterative script. Since the dataset is not too large to fit in RAM, that is going to be faster.
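The iterative script mentioned in the edit could look roughly like this: sort the tweets by timestamp and compare each tweet only against the ones still inside the four-hour window. This is a sketch under assumptions, not the asker's actual script: timestamps are taken to be in seconds, closePairs is a hypothetical name, and the spatial check is left to a caller-supplied isClose predicate so that no particular distance unit is implied:

```javascript
var WINDOW_S = 4 * 60 * 60; // four hours, assuming timestamps in seconds

// Returns [userA, userB] pairs whose tweets are less than WINDOW_S apart
// and satisfy the caller-supplied isClose(locA, locB) predicate.
function closePairs(tweets, isClose) {
  var sorted = tweets.slice().sort(function (a, b) {
    return a.timestamp - b.timestamp;
  });
  var pairs = [];
  var start = 0; // index of the oldest tweet still inside the window
  for (var i = 0; i < sorted.length; i++) {
    // Advance the window start past tweets that are too old.
    while (sorted[i].timestamp - sorted[start].timestamp >= WINDOW_S) {
      start++;
    }
    for (var j = start; j < i; j++) {
      if (sorted[j].userid !== sorted[i].userid &&
          isClose(sorted[j].loc, sorted[i].loc)) {
        pairs.push([sorted[j].userid, sorted[i].userid]);
      }
    }
  }
  return pairs;
}

// Made-up example: "close" here means within 0.01 degrees on both axes.
var sample = [
  { userid: 1, timestamp: 0,     loc: { lng: 40,     lat: 3 } },
  { userid: 2, timestamp: 3600,  loc: { lng: 40.001, lat: 3 } },
  { userid: 3, timestamp: 20000, loc: { lng: 40,     lat: 3 } }
];
var found = closePairs(sample, function (a, b) {
  return Math.abs(a.lng - b.lng) < 0.01 && Math.abs(a.lat - b.lat) < 0.01;
});
```

The same pair of users can appear more than once if they have several matching tweets, so the result may need deduplicating depending on what is wanted.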
Using $near in conjunction with $maxDistance is the most recommended way:
db.collectionName.find({loc: {$near: [50, 50], $maxDistance: 5}});
For performance, you can try creating an index as described below:
To create a geospatial index for GeoJSON-formatted data, use the createIndex() method (ensureIndex() in older shells) and set the value of the location field for your collection to 2dsphere.
db.points.createIndex( { loc : "2dsphere" } );
Note that the legacy coordinate form of $near shown above needs a 2d index; with a 2dsphere index, pass a GeoJSON point via $geometry instead.
For more information:
Index creation
Build a 2dsphere index
Geospatial indexes and queries

mongodb and geospatial schema

I'm racking my brain over MongoDB and geospatial queries, so maybe someone has an idea or a solution for this.
My object schema is like this sample for GeoJSON, taken from http://geojson.org/geojson-spec.html.
{
"name":"name",
"geoJSON":{
"type":"FeatureCollection",
"features":[
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[100,0],[101,0],[101,1],[100,1],[100,0]]]},"properties":{}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[100,0],[101,0],[101,1],[100,1],[100,0]]]},"properties":{}},
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[100,0],[101,0],[101,1],[100,1],[100,0]]]},"properties":{}}
]}}
Additional info: I'm using Spring Data, but that shouldn't influence the answer.
The main problem is how and where to put indexes in this schema. I need a query that finds all documents in which some polygon intersects a given Point.
Thanks in advance.
By creating a 2dsphere index on geoJSON.features.geometry you should be able to create an index covering all of the GeoJSON objects (a 2d index only handles legacy coordinate pairs, not GeoJSON geometries).
To get all documents where at least one of the sub-objects in the features array covers a certain point, you can use the $geoIntersects operator with a GeoJSON Point:
db.yourcollection.find(
  // the dotted field path must be a quoted string key
  { "geoJSON.features.geometry" :
    { $geoIntersects :
      { $geometry :
        { type : "Point" ,
          coordinates: [ 100.5 , 0.5 ]
        }
      }
    }
  }
)
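To sanity-check which points the sample polygon should match before querying, a planar ray-casting test can be run locally. This is only an approximation of what the database does: a 2dsphere index uses spherical geometry, so results can differ for large polygons or points near an edge. The function name pointInRing is hypothetical:

```javascript
// ring: array of [x, y] vertices forming a closed GeoJSON linear ring.
// point: [x, y]. Returns true when the point lies inside the ring
// under flat (planar) geometry.
function pointInRing(point, ring) {
  var x = point[0], y = point[1];
  var inside = false;
  for (var i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    var xi = ring[i][0], yi = ring[i][1];
    var xj = ring[j][0], yj = ring[j][1];
    // Does a horizontal ray from the point cross the edge (j -> i)?
    var crosses = (yi > y) !== (yj > y) &&
                  x < (xj - xi) * (y - yi) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Outer ring of the polygon from the question's sample document.
var ring = [[100, 0], [101, 0], [101, 1], [100, 1], [100, 0]];
var hit = pointInRing([100.5, 0.5], ring);
var miss = pointInRing([102, 0.5], ring);
```

With the sample ring, the query point [100.5, 0.5] falls inside and [102, 0.5] falls outside, which is the result one would expect the $geoIntersects query to agree with.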

Updating MongoDB document for geospatial searching

Currently, I have my lat/long in separate fields in my MongoDB database, but if I want to do geospatial searching I need to have them in this format:
{ location : [ 50 , 30 ] }
By what means can I transpose the values of my lat/long keys into a new key per document as per above?
TIA!
You will have to iterate through all your documents that don't have a location field and add it (presumably deleting the lat/long fields unless this will break your application).
db.mycollection.find( { location : { $exists : false } } ).forEach(
function (doc) {
// Add [lon, lat] pair .. order is important; the array form matches
// the { location : [ x , y ] } format asked for in the question
doc.location = [ doc.lon, doc.lat ];
// Remove old properties
delete doc.lon;
delete doc.lat;
// Save the updated document
db.mycollection.save(doc);
}
)
Note that the order for MongoDB geospatial indexing should be consistent in your document as (longitude, latitude).
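Before running the migration against a live collection, the per-document transformation can be tried out on plain objects. A minimal sketch, assuming the existing fields are named lon and lat as in the loop above (the question does not give the actual field names); toLocationDoc is a hypothetical helper:

```javascript
// Returns a copy of doc in which the lon/lat fields are replaced by a
// single location array in [longitude, latitude] order.
function toLocationDoc(doc) {
  var out = {};
  Object.keys(doc).forEach(function (key) {
    if (key !== "lon" && key !== "lat") {
      out[key] = doc[key];
    }
  });
  out.location = [doc.lon, doc.lat]; // longitude first
  return out;
}

var converted = toLocationDoc({ _id: 1, name: "a", lon: 30, lat: 50 });
```

Because the helper copies rather than mutates, the original object can be compared against the converted one while testing.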