Geospatial index in MongoDB makes no difference to performance

I'm trying to find which documents are geo-located within a given rectangle. I have a Mongo collection looking a bit like:
{
  ...
  "metadata" : {
    ...
    "geometry" : { "type" : "Point", "coordinates" : [ -0.000, 51.477 ] }
  }
}
And my query looks like:
db.my_coll.find({ "$query" : {
  "metadata.geometry" : {
    "$geoIntersects" : {
      "$geometry" : { "type" : "Polygon", "coordinates" : [ [ [..., ...], ... ] ] }
    }
  }
}, "$explain" : 1 })
With no geospatial index I get:
{
"cursor" : "BasicCursor",
"isMultiKey" : false,
"n" : 646,
"nscannedObjects" : 19539,
"nscanned" : 19539,
"nscannedObjectsAllPlans" : 19539,
"nscannedAllPlans" : 19539,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 152,
"nChunkSkips" : 0,
"millis" : 125,
...
With the geospatial index db.my_coll.ensureIndex({"metadata.geometry" : "2dsphere"}); I get:
{
"cursor" : "BtreeCursor metadata.geometry_2dsphere",
"isMultiKey" : false,
"n" : 646,
"nscannedObjects" : 18726,
"nscanned" : 18727,
"nscannedObjectsAllPlans" : 18726,
"nscannedAllPlans" : 18727,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 146,
"nChunkSkips" : 0,
"millis" : 161,
...
i.e. it's slower with the index when explaining. Querying from an outside application shows no significant difference in query time with or without the index (ms resolution). What am I doing wrong? Shouldn't the index make the query rather faster than this?
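For reference, the same rectangle can also be expressed with $geoWithin against the 2dsphere index; the sketch below is illustrative only, with placeholder corner coordinates (a GeoJSON Polygon ring must end with a copy of its first point):

// Illustrative sketch only: rectangle query via $geoWithin.
// The corner coordinates are placeholders, not taken from the question.
db.my_coll.find({
  "metadata.geometry" : {
    "$geoWithin" : {
      "$geometry" : {
        "type" : "Polygon",
        "coordinates" : [ [
          [ -0.5, 51.3 ],  // SW corner (placeholder)
          [  0.5, 51.3 ],  // SE corner (placeholder)
          [  0.5, 51.6 ],  // NE corner (placeholder)
          [ -0.5, 51.6 ],  // NW corner (placeholder)
          [ -0.5, 51.3 ]   // close the ring
        ] ]
      }
    }
  }
}).explain()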
Thanks :-)

Related

mongodb $nearSphere performance issue with huge data (over 2 million records)

The MongoDB version is 2.6 and the collection contains over 2 million records. Details below.
The document structure is as follows:
{
  "postid":NumberLong(97040),
  "accountid":NumberLong(348670),
  "location":{
    "type":"Point",
    "coordinates":[
      112.56531,
      32.425657
    ]
  },
  "type":NumberLong(1),
  "countspreads":NumberLong(6),
  "countavailablespreads":NumberLong(6),
  "timestamp":NumberLong(1428131578)
}
The collection has a 2dsphere index:
{
  "v" : 1,
  "key" : {
    "location" : "2dsphere"
  },
  "name" : "location_2dsphere",
  "ns" : "fly.postspreads",
  "2dsphereIndexVersion" : 2
},
The query command is as below:
db.example.find({"location":{"$nearSphere":{"$geometry":{"type":"Point","coordinates":[113.547821,22.18648]},"$maxDistance":50000, "$minDistance":0}}}).explain()
Result:
{
"cursor" : "S2NearCursor",
"isMultiKey" : false,
"n" : 145255,
"nscannedObjects" : 1290016,
"nscanned" : 1290016,
"nscannedObjectsAllPlans" : 1290016,
"nscannedAllPlans" : 1290016,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 4087,
"indexBounds" : {
},
"server" : "DB-SH-01:27017",
"filterSet" : false
}
The value of $maxDistance is very large here. From the result above, we can see that the query scanned over 1 million records and took 4087 ms.
If we reduce $maxDistance to 500, the new result is as below:
{
"cursor" : "S2NearCursor",
"isMultiKey" : false,
"n" : 21445,
"nscannedObjects" : 102965,
"nscanned" : 102965,
"nscannedObjectsAllPlans" : 102965,
"nscannedAllPlans" : 102965,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 634,
"indexBounds" : {
},
"server" : "DB-SH-01:27017",
"filterSet" : false
}
Now the query scanned over 100,000 records and took 634 ms, which is still too slow. Even if I reduce $maxDistance to 0.0001, it still scans over 80,000 records and takes about 600 ms.
The query time is unacceptable, but I can't find what is wrong.
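One thing that may be worth measuring here (an assumption, not verified against this data set): if the results do not need to be sorted by distance, the same 50 km search can be expressed as an unsorted $geoWithin / $centerSphere query over the existing 2dsphere index, which skips the distance-ordering work that $nearSphere performs. A minimal sketch:

// Sketch: unsorted radius search; $centerSphere takes its radius in radians,
// so divide the distance in metres by the earth's radius (~6378137 m).
db.example.find({
  "location" : {
    "$geoWithin" : {
      "$centerSphere" : [ [ 113.547821, 22.18648 ], 50000 / 6378137 ]
    }
  }
}).explain()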

MongoDB index intersection

Hey, I want to evaluate the performance of index intersection, but I'm not able to get an intersection between two indices.
I've inserted some dummy records into my DB following this manual:
http://docs.mongodb.org/manual/core/index-intersection/
Insert code:
for (var i = 0; i < 1000; i++) {
  for (var j = 0; j < 100; j++) {
    db.t.insert({ item: "abc" + i, qty: j })
  }
}
Indices:
[
  {
    "v" : 1,
    "key" : {
      "_id" : 1
    },
    "name" : "_id_",
    "ns" : "db.t"
  },
  {
    "v" : 1,
    "key" : {
      "qty" : 1
    },
    "name" : "qty_1",
    "ns" : "db.t"
  },
  {
    "v" : 1,
    "key" : {
      "item" : 1
    },
    "name" : "item_1",
    "ns" : "db.t"
  }
]
Query:
db.t.find({item:"abc123",qty:{$gt:15}}).explain()
Result of explain:
{
"cursor" : "BtreeCursor item_1",
"isMultiKey" : false,
"n" : 84,
"nscannedObjects" : 100,
"nscanned" : 100,
"nscannedObjectsAllPlans" : 201,
"nscannedAllPlans" : 305,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 2,
"nChunkSkips" : 0,
"millis" : 1,
"indexBounds" : {
"item" : [
[
"abc123",
"abc123"
]
]
},
"server" : "brews18:27017",
"filterSet" : false
}
My question is: why is Mongo only using item as an index and not using an intersection?
Thanks in advance
Well, it actually does consider an intersection, even though it does not use one in this case. To really see what is happening you need to look at the "verbose" form of explain, by adding true:
db.t.find({item:"abc123",qty:{$gt:15}}).explain(true)
{
"cursor" : "BtreeCursor item_1",
"isMultiKey" : false,
"n" : 84,
"nscannedObjects" : 100,
"nscanned" : 100,
"nscannedObjectsAllPlans" : 201,
"nscannedAllPlans" : 304,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 2,
"nChunkSkips" : 0,
"millis" : 2,
"indexBounds" : {
"item" : [
[
"abc123",
"abc123"
]
]
},
"allPlans" : [
{
"cursor" : "BtreeCursor item_1",
"isMultiKey" : false,
"n" : 84,
"nscannedObjects" : 100,
"nscanned" : 100,
"scanAndOrder" : false,
"indexOnly" : false,
"nChunkSkips" : 0,
"indexBounds" : {
"item" : [
[
"abc123",
"abc123"
]
]
}
},
{
"cursor" : "BtreeCursor qty_1",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : 101,
"nscanned" : 102,
"scanAndOrder" : false,
"indexOnly" : false,
"nChunkSkips" : 0,
"indexBounds" : {
"qty" : [
[
15,
Infinity
]
]
}
},
{
"cursor" : "Complex Plan",
"n" : 0,
"nscannedObjects" : 0,
"nscanned" : 102,
"nChunkSkips" : 0
}
],
The output is cut short, but the last part is what you are looking for. As explained in the manual, the appearance of "Complex Plan" means an index intersection is being considered.
{
"cursor" : "Complex Plan",
"n" : 0,
"nscannedObjects" : 0,
"nscanned" : 102,
"nChunkSkips" : 0
}
The catch here is that while the intersection is being "looked at", it is not being chosen by the optimizer as the most "optimal" plan in this case. The optimizer is saying that the plan using just the one selected index is the one that will complete in the most responsive fashion.
So while the "intersection" was considered, it was not the "best fit", and the single index was chosen.
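If the goal is simply to make this query fast, rather than to force an intersection, a single compound index on both fields is what the planner typically ends up preferring; a minimal sketch using the 2.6-era ensureIndex helper:

// Sketch: one compound index that covers both predicates, so neither a
// single-field plan nor an intersection plan is needed.
db.t.ensureIndex({ item : 1, qty : 1 })
db.t.find({ item : "abc123", qty : { $gt : 15 } }).explain()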

mongodb slow query with $near and other condition

I have a mongodb collection named rooms, and it has a 2d index on the location field. I've queried like this:
db.rooms.find( { "location" : { "$near" : { "latitude" : 37.3356135, "longitude" : 127.12383030000001 } }, "status": "open", "updated" : { "$gt" : ISODate("2014-06-03T15:34:22.213Z") }}).explain()
The result:
{
"cursor" : "GeoSearchCursor",
"isMultiKey" : false,
"n" : 7,
"nscannedObjects" : 143247,
"nscanned" : 143247,
"nscannedObjectsAllPlans" : 143247,
"nscannedAllPlans" : 143247,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 1457,
"indexBounds" : {
},
"server" : "ip-10-162-39-56:27017",
"filterSet" : false
}
Sometimes it takes more than 2000 ms. But if I remove the $gt condition on the updated field, the query is fast, about 5~30 ms.
> db.rooms.find( { "location" : { "$near" : { "latitude" : 37.3356135, "longitude" : 127.12383030000001 } }, "status": "open"}).explain()
{
"cursor" : "GeoSearchCursor",
"isMultiKey" : false,
"n" : 100,
"nscannedObjects" : 1635,
"nscanned" : 2400,
"nscannedObjectsAllPlans" : 1635,
"nscannedAllPlans" : 2400,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 22,
"indexBounds" : {
},
"server" : "ip-10-162-39-56:27017",
"filterSet" : false
}
I've tried a compound index of {location: "2d", updated: -1}, but it didn't help. How can I make this query faster?
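One mitigation that is sometimes worth trying (an assumption, not verified against this data set) is to bound the $near search with $maxDistance so the geo cursor stops expanding once the nearby area has been exhausted; with a 2d index the distance is in the same units as the coordinates, i.e. degrees here, and the 0.1 value below is purely illustrative:

// Sketch: bound the $near expansion; 0.1 degrees is roughly 11 km and is
// only a placeholder value.
db.rooms.find({
  "location" : { "$near" : { "latitude" : 37.3356135, "longitude" : 127.12383030000001 }, "$maxDistance" : 0.1 },
  "status" : "open",
  "updated" : { "$gt" : ISODate("2014-06-03T15:34:22.213Z") }
}).explain()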

Mongodb indexing

I have a query
db.messages.find({'headers.Date':{'$gt': new Date(2001,3,1)}},{'headers.From':1, _id:0}).sort({'headers.From':1})
I have created an index on headers.From. Now, which part of the query will use this index: the find part or the sort part?
Explain output is
{
"cursor" : "BtreeCursor headers.From_1",
"isMultiKey" : false,
"n" : 83057,
"nscannedObjects" : 120477,
"nscanned" : 120477,
"nscannedObjectsAllPlans" : 120581,
"nscannedAllPlans" : 120581,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 250,
"indexBounds" : {
"headers.From" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
},
"server" : "Andrews-iMac.local:27017"
}
Any help is appreciated!
The index is being used for the sort part, not for the query, as your query doesn't use the headers.From field and your sort does.
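If the intent is for one index to serve both the headers.Date filter and the headers.From sort, a compound index with the sort field first is the usual shape to try; a sketch (the field order here is an assumption, not something from the question):

// Sketch: sort field first so documents come back already ordered by
// headers.From, with headers.Date narrowing the scan within each From value.
db.messages.ensureIndex({ "headers.From" : 1, "headers.Date" : 1 })
db.messages.find(
  { "headers.Date" : { "$gt" : new Date(2001, 3, 1) } },
  { "headers.From" : 1, "_id" : 0 }
).sort({ "headers.From" : 1 }).explain()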

slow Mongodb $near search with additional criteria

I have a collection, the data look like this:
{
  "_id" : ObjectId("4e627655677c27cf24000000"),
  "gps" : {
    "lng" : 116.343079,
    "lat" : 40.034283
  },
  "lat" : 1351672296
}
And I built a compound index:
{
  "v" : 1,
  "key" : {
    "gps" : "2d",
    "lat" : 1
  },
  "ns" : "test.user",
  "name" : "gps__lat_1"
}
A pure $near query like below can be very fast ( < 20ms ):
>db.user.find({"gps":{"$near":{"lng":116.343079,"lat":40.034283}}}).explain()
{
"cursor" : "GeoSearchCursor",
"nscanned" : 100,
"nscannedObjects" : 100,
"n" : 100,
"millis" : 23,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
}
}
But the query with the "lat" criterion is very slow (900 ms+):
>db.user.find({"gps":{"$near":{"lng":116.343079,"lat":40.034283}},"lat":{"$gt":1351413167}}).explain()
{
"cursor" : "GeoSearchCursor",
"nscanned" : 3,
"nscannedObjects" : 3,
"n" : 3,
"millis" : 665,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
}
}
Can anybody explain this? Thanks!
I updated my MongoDB to 2.2.0, and the problem disappeared.
127.0.0.1/test> db.user.find({gps:{$near:[116,40]},lat:{$gt:1351722342}}).explain()
{
"cursor" : "BasicCursor",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : 0,
"nscanned" : 0,
"nscannedObjectsAllPlans" : 0,
"nscannedAllPlans" : 0,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : {
},
"server" : "zhangshenjiamatoMacBook-Air.local:27017"
}
From the explain above, it doesn't look like the geoIndex is being used at all - also, it looks like the above query didn't return any results!
If your query is using the 2d index, the explain should contain:
"cursor" : "GeoSearchCursor"
Can you check if upgrading to 2.2.0 really solved your issue? :)
Sundar
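To double-check whether the compound geo index really is in play after the upgrade, listing the indexes and re-running explain on the original query shape is a cheap test; a sketch:

// Sketch: confirm the gps__lat_1 index exists, then check that explain
// reports a GeoSearchCursor rather than a BasicCursor.
db.user.getIndexes()
db.user.find({
  "gps" : { "$near" : { "lng" : 116.343079, "lat" : 40.034283 } },
  "lat" : { "$gt" : 1351413167 }
}).explain()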