How to get closest 100 documents in Mongo? - mongodb

I want to query MongoDB for the 100 documents that are closest to me. Once I have those 100 documents, I want to sort them by custom fields in the documents, such as createdAt or points. It seems like $near is what I want, but the docs say:
When using sort() with geospatial queries, consider using $geoWithin operator, which does not sort documents, instead of $near.
So it seems like they suggest using $geoWithin, but I don't want to constrain the search to a specific range. Suggestions?
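One possible approach, sketched under assumptions rather than taken from the thread: use the $geoNear aggregation stage to get documents nearest-first, cut the stream off at 100 with $limit, and only then apply $sort on the custom field. The field name `location`, the collection name `places`, and the helper `closestThenSort` are hypothetical.

```javascript
// Sketch only: builds an aggregation pipeline; it does not connect to a database.
// Assumed names: a `location` field with a 2dsphere index (hypothetical).
function closestThenSort(lng, lat, sortField, n = 100) {
  return [
    // $geoNear must be the first stage; it emits documents nearest-first.
    {
      $geoNear: {
        near: { type: "Point", coordinates: [lng, lat] },
        distanceField: "distance", // computed distance is stored in this field
        key: "location",
        spherical: true,
      },
    },
    // Keep only the n nearest documents.
    { $limit: n },
    // Re-order those n documents by the custom field, highest first;
    // _id is added as a tie-breaker so the order is deterministic.
    { $sort: { [sortField]: -1, _id: 1 } },
  ];
}

const pipeline = closestThenSort(-73.99, 40.73, "createdAt");
// In mongosh: db.places.aggregate(pipeline)  // `places` is a hypothetical collection
```

Because the $limit stage comes before the $sort stage, the sort only ever sees the 100 nearest documents, and no search radius has to be chosen.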

Related

MongoDB Geospatial and createdAt sorting

I'm struggling to figure out how to properly sort data from MongoDB. The collection has a 2dsphere index and a createdAt timestamp. The goal is to show the latest pictures (that's what this collection is about, just a mediaUrl field...), but they have to be close to the user. I'm not very familiar with complex MongoDB aggregation queries, so I thought this would be a good place to ask. Sorting with $near only returns items sorted by distance. But there's also the upload time: e.g., if an item is 5 minutes old but about 500 meters farther away than an older item, it should still be sorted higher.
An ugly way would be to iterate every few hundred meters and collect data, but maybe there's a smarter way?
So, if I understand correctly, you want to be able to sort on two fields?
distance
timestamp
You should check out the $sort aggregation stage:
https://docs.mongodb.com/manual/reference/operator/aggregation/sort/
It allows you to sort on multiple fields.
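A plain two-field sort on exact distance would rarely let the timestamp matter, since two documents almost never share the same distance. One way to make a multi-field sort useful here, offered as a sketch and not as the answerer's method, is to round the distance into coarse bands and sort by band first, then by createdAt. The band width and the helper name `bandedFeedPipeline` are assumptions.

```javascript
// Sketch only: builds an aggregation pipeline; it does not connect to a database.
// Assumes the collection has a single 2dsphere index and a `createdAt` field.
function bandedFeedPipeline(lng, lat, bandMeters = 500) {
  return [
    // Compute each document's distance from the user.
    {
      $geoNear: {
        near: { type: "Point", coordinates: [lng, lat] },
        distanceField: "distance",
        spherical: true,
      },
    },
    // Group distances into bands: 0 for [0, bandMeters), 1 for the next band, and so on.
    { $addFields: { band: { $floor: { $divide: ["$distance", bandMeters] } } } },
    // Nearer bands first; within a band, newest first.
    { $sort: { band: 1, createdAt: -1 } },
  ];
}

const feed = bandedFeedPipeline(24.1, 56.95, 1000);
// In mongosh: db.media.aggregate(feed)  // `media` is a hypothetical collection
```

With a band width of 1000, an item 600 meters away that is 5 minutes old falls into the same band as an older item 100 meters away, so the fresher one sorts higher. This replaces the manual "iterate every few hundred meters" loop with a single query.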

Mongo intersection measure select

I have the following documents:
{'variations': ['BlueViolet', 'CadetBlue', 'Cyan']}
{'variations': ['LightPink', 'VioletRed']}
And I want to write a query that selects all documents where the size of the intersection between the variations field and {'Cyan', 'CadetBlue', 'SmoothsRed'} is greater than 2.
Can this be performed with mongodb operators?
This link explains how to achieve what you want - https://docs.mongodb.com/manual/reference/operator/aggregation/setIntersection/
In order to write the comparison, I'll have to assume each document can be uniquely identified using an _id value, and then you can write your query using the solution given in this answer - How to find set intersection of sets between the documents in a single collection in MongoDB?
Good luck
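For the per-document case asked about in the question, the $setIntersection operator linked above can be combined with $size inside $expr. The following is a sketch under assumptions: it builds the filter object only, and the helper names `intersectionFilter` and `intersectionSize` are hypothetical. A plain JavaScript equivalent is included so the logic can be checked against the sample documents.

```javascript
const wanted = ["Cyan", "CadetBlue", "SmoothsRed"];

// Server-side filter: size of (variations ∩ values) must exceed minSize.
// Assumes every document has an array-valued `variations` field.
function intersectionFilter(values, minSize) {
  return {
    $expr: {
      $gt: [{ $size: { $setIntersection: ["$variations", values] } }, minSize],
    },
  };
}

// Plain JavaScript equivalent of the same measure, for checking by hand.
function intersectionSize(a, b) {
  const setB = new Set(b);
  return new Set(a.filter((x) => setB.has(x))).size;
}

const docs = [
  { variations: ["BlueViolet", "CadetBlue", "Cyan"] },
  { variations: ["LightPink", "VioletRed"] },
];
const sizes = docs.map((d) => intersectionSize(d.variations, wanted));
// In mongosh: db.colors.find(intersectionFilter(wanted, 2))  // `colors` is hypothetical
```

For the two sample documents the intersection sizes are 2 and 0, so neither passes a strict "greater than 2" test; the first would match if the comparison were changed to $gte.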

MongoDB query: Using Limit together with $near skips few documents

I am currently developing an app which gets a specific number of documents from a collection if their location coordinates fall within a certain distance. I am using an active record library for CodeIgniter, and the query that is generated is as follows:
db.updates.find({locs: { $near: [72.844102008984, 19.130207090604 ], $maxDistance: 5000 }, posted_on : { $lt :1398425538.1942 },}).sort( { posted_on: -1 } ).limit(10).toArray()
The problem I am facing is that the above query skips a few documents which should actually get pulled. But if I remove the limit(10) from the above query, then the proper documents get pulled.
I am not sure, but does using limit() in MongoDB omit some results? Or does it limit the results to only the closest (nearest) documents?
P.S. The documents skipped when using the limit are not always the same; the results appear random.
I suspect you are running into problems with the special nature of the $near query. $near performs both a limit() and a sort() on the cursor that returns the results:
Specifies a point for which a geospatial query returns the closest documents first. The query sorts the documents from nearest to farthest.
By default, queries that use a 2d index return a limit of 100 documents; however you may use limit() to change the number of results.
http://docs.mongodb.org/manual/reference/operator/query/near/
The documentation does specifically discuss overriding the limit of 100 with your own limit call:
You can further limit the number of results using cursor.limit().
However, it is silent on adding your own sort(), or on both sorting and overriding the limit at the same time. I suspect you are running into side effects of doing both. Note that it's not incorrect to do both; it just may not produce the results you are looking for. I'd suggest trying the same query using $geoWithin:
http://docs.mongodb.org/manual/reference/operator/query/geoWithin/
$geoWithin does not apply a sort or a limit on the results, so it gives you something of a more raw result set.
Do you have any identical posted_on dates in the system? I recommend sorting by a second key, perhaps _id. If the sort order is non-deterministic, the system may skip documents in a non-deterministic manner. Adding the _id field to your sort order is generally not that expensive if you have an index on the other fields, as they will already be very close to the correct order and _id is part of all indexes. ("By default, all collections have an index on the _id field, and applications and users may add additional indexes to support important queries and operations." http://docs.mongodb.org/manual/core/index-single/ )
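The two suggestions above ($geoWithin instead of $near, and _id as a tie-breaker) could be combined roughly as follows. This is a sketch, not a tested fix for the original query: it assumes the 5000 in the question was meant as meters, which the thread does not confirm, and the helper name `recentNearbyQuery` is hypothetical.

```javascript
// Equatorial Earth radius in meters, used to convert a distance to radians.
const EARTH_RADIUS_METERS = 6378100;

// Sketch only: builds the filter and sort documents; it does not connect to a database.
function recentNearbyQuery(lng, lat, radiusMeters, before) {
  return {
    filter: {
      // $geoWithin filters by area but applies no sort and no implicit limit.
      locs: {
        $geoWithin: {
          $centerSphere: [[lng, lat], radiusMeters / EARTH_RADIUS_METERS],
        },
      },
      posted_on: { $lt: before },
    },
    // _id breaks ties between identical posted_on values, making the order deterministic.
    sort: { posted_on: -1, _id: -1 },
  };
}

const q = recentNearbyQuery(72.844102008984, 19.130207090604, 5000, 1398425538.1942);
// In mongosh: db.updates.find(q.filter).sort(q.sort).limit(10)
```

With this shape, limit(10) is applied to the date-sorted set of all documents inside the circle, so the limit no longer interacts with the distance ordering.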

How does sorting work in MongoDB?

I have this query: (Doctrine2 ODM)
....
->field('coordinates')
->near(
(float)$lat,
(float)$lng)
->field('date')
->lte($lastdate)
->sort('date','desc')
->limit(9)
->getQuery()
->execute()
->toArray();
It gives me documents with the following dates : (example)
2014-03-31 01:51:06
2014-03-31 01:51:02
2014-03-31 01:50:46
2014-03-31 01:50:07
If I change the limit to 20, for example, I get these dates:
2014-03-31 01:52:01
2014-03-31 01:51:42
2014-03-31 01:51:16
2014-03-31 01:51:06
My question is: why were these dates skipped in the first query?
Does MongoDB collect the first documents that match the criteria and then sort them?
That would be very stupid!!
I changed the order in the query (criteria after sorting), but it doesn't seem to have any effect... WTF
$near will get you the nearest documents; it's not going to get you newer documents further away.
From the doc:
Specifies a point for which a geospatial query returns the closest documents first. The query sorts the documents from nearest to farthest.
$near without $maxDistance is inherently a sort; without specifying the bounds of the data you're interested in, applying a sort that supersedes the $near query would effectively negate the $near.
You might be able to combine a $near query that specifies both $geometry and $maxDistance to filter a result set, and then use sort() to order that filtered set by date, though. I haven't tried it, but it would change the semantics of your $near clause such that it may work.
If you want to sort by distance, use $near, but if you want to sort by a different attribute, use the $within operator (now $geoWithin), which returns results that are within a radius without sorting them, allowing you to sort on a different criterion.
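The behavior the question describes ("collect the first documents, then sort them") can be modeled in memory to see why raising the limit surfaces newer dates. This is a toy model with made-up data, not MongoDB's actual implementation; the function names are hypothetical.

```javascript
// Toy data: `distance` from the query point, `date` as a simple increasing number.
const docs = [
  { id: "a", distance: 10, date: 1 },
  { id: "b", distance: 20, date: 2 },
  { id: "c", distance: 30, date: 5 },
  { id: "d", distance: 40, date: 4 },
];

// Model of "take the n nearest, then sort those by date descending".
function nearestThenSort(items, n) {
  return [...items]
    .sort((x, y) => x.distance - y.distance)
    .slice(0, n)
    .sort((x, y) => y.date - x.date);
}

// Model of "sort everything by date descending, then take n".
function sortThenLimit(items, n) {
  return [...items].sort((x, y) => y.date - x.date).slice(0, n);
}

const ids = (items) => items.map((d) => d.id).join(",");
console.log(ids(nearestThenSort(docs, 2))); // b,a
console.log(ids(nearestThenSort(docs, 4))); // c,d,b,a
console.log(ids(sortThenLimit(docs, 2)));   // c,d
```

In the first model the newest documents ("c" and "d") only appear once the limit is large enough to reach them by distance, which matches the symptom in the question. A bounded filter such as $geoWithin followed by a sort corresponds to the second model.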

Search two location points in one go using Mongodb

Say I have a document, which looks like
{loc:{start:[x,y],end:[x1,y1]}}
with 2dsphere index, and I want to search for all the documents that have the same start and end points in one query. Is it possible to do it in MongoDB?
A geospatial index can only cover one field with a legacy coordinate pair (array with two values) or a valid GeoJSON Point, Polygon or LineString. That means you can only index either the start-position or the end-position with a geospatial index.
But this is not a problem when you are searching for an exact match. A 2dsphere index offers some additional geographical operators, but when you only need to test for exact equality, you don't need those and can treat your geo-coordinates like plain old data. So when you just add a vanilla index on the loc field and use a vanilla find() with the values you are searching, you will get your matches.
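A minimal sketch of the exact-match approach described above, assuming the stored documents always write `start` before `end` inside `loc` (equality on an embedded document is sensitive to field order). The helper name `exactRouteFilter` and the collection name `routes` are hypothetical.

```javascript
// Ordinary (non-geospatial) ascending index on the whole embedded `loc` document.
const indexSpec = { loc: 1 };

// Sketch only: builds an exact-equality filter; it does not connect to a database.
function exactRouteFilter(start, end) {
  // Field order must match the stored documents: start first, then end.
  return { loc: { start: start, end: end } };
}

const filter = exactRouteFilter([12.5, 41.9], [2.35, 48.85]);
// In mongosh:
//   db.routes.createIndex(indexSpec)
//   db.routes.find(filter)
```

This only covers exact equality on both points; any "near" or "within" search on the start or end point would still need a geospatial index and operator.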