Query near vs. within - mongodb

Using MongoDB I'm querying homes that are within 25 miles of a lat/long.
My first attempt to do this used the near command, like so:
var near = Query.Near("Coordinates", coordinates.Latitude, coordinates.Longitude, find.GetRadiansAway(), false);
var query = Collection().Find(near);
var listings = query.ToList();
The issue with near is that it only returns 100 listings, whereas I want to return all listings within 25 miles of the coordinates.
My next attempt was to use within:
var within = Query.WithinCircle("Coordinates", coordinates.Latitude, coordinates.Longitude, find.GetRadiansAway(), false);
var query = Collection().Find(within);
var listings = query.ToList();
Within returns all listings within 25 miles, which is great, however it doesn't sort them by how close they are to the center coordinates like near does.
So my question is, how do I get the best of both worlds? How do I get all listings within 25 miles AND have them sorted by proximity to the center coordinates?

Geospatial $near queries set a default limit() of 100 results. You should be able to get more results by setting a new limit().
While "near" queries are sorted by distance, "within" is not (although "within" doesn't have a default limit).
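If raising the limit is not an option, another approach (not the one described above) is to fetch everything with the "within" query and sort the results on the client. The sketch below is a minimal illustration in Python; it assumes each listing is a dict whose "Coordinates" field holds [latitude, longitude], and the helper names are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3959.0  # approximate mean Earth radius


def miles_to_radians(miles):
    """Convert a distance in miles to radians on the Earth's surface."""
    return miles / EARTH_RADIUS_MILES


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/long points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def sort_by_proximity(listings, center_lat, center_lon):
    """Return the listings ordered from nearest to farthest from the center.

    Assumes doc["Coordinates"] == [latitude, longitude] (an assumption).
    """
    return sorted(
        listings,
        key=lambda doc: haversine_miles(
            center_lat, center_lon, doc["Coordinates"][0], doc["Coordinates"][1]
        ),
    )
```

Sorting on the client costs memory proportional to the number of listings returned, so it only makes sense when the 25-mile result set is reasonably small.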

Related

Overpass API: query for counting amenity of specified type around set of lat lons

I'm trying to query data from the OSM Overpass API. Specifically I'm trying to determine the count of amenities of a given type around a point (using the 'around' syntax). When running this for many locations (lat, lons) I'm running into a TooManyRequests error.
I have tried to work around this by adding sleep pauses and adjusting the timeout header and retry time, but I'm running into the same issue. I'm trying to find a way to adapt the query so that it just returns the count of amenities (of the specified type) around each point, rather than the full JSON of nodes, which is more data intensive. My current script is as follows:
# Running Overpass query for each point
results = {}
for n in range(0, 200):
    name = df.loc[n]['city']
    state = df.loc[n]['state_name']
    rad = df.loc[n]['radius_m']
    lat = df.loc[n]['lat']
    lon = df.loc[n]['lng']
    # Overpass query for amenities
    start_time = time.time()
    api = overpy.Overpass(max_retry_count=None, retry_timeout=2)
    r = api.query(f"""
    [out:json][timeout:180];
    (node["amenity"="charging_station"](around:{rad}, {lat}, {lon});
    );
    out;
    """)
    print("query time for "+str(name)+", number "+str(n)+" = "+str(time.time() - start_time))
    results[name] = len(r.nodes)
    time.sleep(2)
Any help is much appreciated from other Overpass users!
Thanks
In general, you can use out count; instead of out; to return a count from an Overpass API query.
It's hard to say without knowing how your data is structured, but you might have better luck using area to look at specific cities or regions.
Here is an example that returns the count of all nodes tagged as charging station in Portland, Oregon:
/* charging stations in portland */
area[name="Oregon"]->.state;
area[name="Portland"]->.city;
(
node["amenity"="charging_station"](area.state)(area.city);
);
out count;
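Applied to the per-point "around" query from the question, the same idea might look like the Python sketch below. It only builds the query text and reads the count out of a decoded JSON response; the function names are hypothetical, and the response shape (a single element of type "count" whose tags hold the totals as strings) is my understanding of what out count; produces with [out:json], so check it against a real response.

```python
def build_count_query(amenity, radius_m, lat, lon, timeout=180):
    """Build an Overpass QL query that returns only a count of matching nodes."""
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'node["amenity"="{amenity}"](around:{radius_m},{lat},{lon});\n'
        "out count;"
    )


def parse_count(response):
    """Extract the node count from a decoded Overpass JSON response.

    Assumes the response contains an element of type "count" whose
    "tags" dict carries the counts as strings (an assumption).
    """
    for element in response.get("elements", []):
        if element.get("type") == "count":
            return int(element["tags"]["nodes"])
    return 0
```

Because the server returns a single small element per query, each response is much lighter than the full node list, although the number of requests stays the same.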

How to retrieve all airports around a given osm-ID

I want to retrieve all airports around a city (e.g. Leipzig) within a radius of 50 km.
I tried this:
[out:csv(::type,name, 'name', ::id; true; ";")];
(
relation["iata"~".*"](around: 50000, 51.3124404, 12.4131075);
node["iata"~".*"](around: 50000, 51.3124404, 12.4131075);
way["iata"~".*"](around: 50000, 51.3124404, 12.4131075);
);
out;
This returns:
#type;name;name;#id
way;Flughafen Leipzig/Halle;Flughafen Leipzig/Halle;435715317
relation;Leipzig-Altenburg Airport;Leipzig-Altenburg Airport;6116338
But I want to use the osm-id directly.
When I search within Leipzig (admin_level = 8) and a radius
[out:csv(::type,name, 'name', ::id; true; ";")];
area(3600062649) -> .leipzig;
way(around.leipzig:50000)[iata~".*"]({{bbox}});
out;
I retrieve an empty result set.
When I search within Saxony (admin_level = 4)
area(3600062467) -> .sachsen;
way(area.sachsen)[iata~".*"]({{bbox}});
out;
then I retrieve
#type;name;name;#id
way;Flughafen Leipzig/Halle;Flughafen Leipzig/Halle;435715317
I want to retrieve the same result as when I use the lat/lon information.
Edit:
I get the expected result by retrieving the center lat/lon of a relation (given by an OSM ID) and using that lat/lon to retrieve all airports within a given radius.
Now I want to skip the "retrieve center lat/lon" step. I want to use the OSM ID of a relation and a given radius directly to retrieve all airports within that radius.
@mmd: I hope it's more understandable now.

Printing the calculated distance in SQLAlchemy

I am using Flask-SQLAlchemy with Postgres, Postgis and GEOAlchemy. I am able to sort entries in a table according to a point submitted by the user. I wonder how I could also return the calculated distance...
This is how I sort the items:
results = Event.query.order_by(func.ST_Distance(Event.address_gps, coordinates_point)).paginate(page, 10).items
for result in results:
    result_dict = result.to_dict()
return result_dict
The items are sorted according to the user's position (coordinates_point). I would like to add an entry to each result_dict that contains the distance the item was ordered by. How do I do that? What does func.ST_Distance return?
I tried to add this in the for loop above:
current_distance = func.ST_Distance(Event.address_gps, coordinates_point)
result_dict['distance'] = current_distance
But that did not seem to work.
You can use column labels:
query = Event.query.with_entities(Event, func.ST_Distance(Event.address_gps, coordinates_point).label('distance')).order_by('distance')
results = query.paginate(page, 10).items
for result in results:
    event = result.Event
    distance = result.distance
    result_dict = event.to_dict()
    result_dict['distance'] = distance
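The reason the label works is that func.ST_Distance(...) is a SQL expression evaluated by the database, not a Python number, so its value only exists as a column of the returned rows. The sketch below shows the same "select the expression under an alias, then order by the alias" pattern with Python's built-in sqlite3 module. It is only an analogy: the event table, its x/y columns, and the squared-distance arithmetic are hypothetical stand-ins for the PostGIS geometry column and ST_Distance.

```python
import sqlite3

# In-memory database with a hypothetical "event" table (x/y stand in for a geometry column).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO event (name, x, y) VALUES (?, ?, ?)",
    [("far", 10.0, 0.0), ("near", 1.0, 0.0), ("mid", 3.0, 4.0)],
)


def events_with_distance(conn, px, py):
    """Return (name, distance) rows ordered by the aliased distance column.

    The squared Euclidean distance is a stand-in for ST_Distance.
    """
    sql = """
        SELECT name,
               (x - :px) * (x - :px) + (y - :py) * (y - :py) AS distance
        FROM event
        ORDER BY distance
    """
    return conn.execute(sql, {"px": px, "py": py}).fetchall()


rows = events_with_distance(conn, 0.0, 0.0)
```

Each returned row carries both the entity data and the computed value, which is what result.Event and result.distance expose in the SQLAlchemy version.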

MongoDB Query advice for weighted randomized aggregation

So far I have encountered ways of selecting random documents, but my problem is a bit more of a pickle. So here goes.
I have a collection which contains, say, 1000+ documents (products).
Each document has a more or less generic format. Say, for simplicity, it is:
{"_id":{},"name":"Product1","groupid":5}
The groupid is a number, say between 1 and 20, denoting the group the product belongs to.
Now if my query input is something like an array of {groupid->weight}, e.g. {[{"2":4},{"7":6}]}, and another parameter n (say n = 10), then I need to be able to pick 4 random documents that belong to groupid 2 and 6 random documents that belong to groupid 7.
The only solution I can think of is to run m subqueries, where m is the length of the array in the query input.
How do I accomplish this in an efficient manner in MongoDB, probably using MapReduce?
Picking n random documents for each group:
Group the records by the groupid field. Emit the groupid as key and the record as value.
For each group, pick n random documents from the values array.
Let,
var parameter = {"5":1,"6":2}; //groupid->weight, keep it as an Object.
be the input to the map reduce functions.
The map function emits only those group ids that we have provided in the parameter.
var map = function map(){
    if(parameter.hasOwnProperty(this.groupid)){
        emit(this.groupid,this);
    }
}
The reduce function, for each group, picks random records according to the parameter object in scope.
var reduce = function(key,values){
    var length = values.length;
    // Never ask for more documents than the group contains.
    var wanted = Math.min(parameter[key], length);
    var docs = [];
    var added = [];
    while(docs.length < wanted){
        var index = Math.floor(Math.random()*length);
        // Only accept an index that has not been picked yet; otherwise retry.
        if(added.indexOf(index) == -1){
            docs.push(values[index]);
            added.push(index);
        }
    }
    return {result:docs};
}
Invoking map reduce on the collection, by passing the parameter object in scope.
db.collection.mapReduce(map,
reduce,
{out: "sam",
scope:{"parameter":{"5":1,"6":2,"n":10}}})
To get the dumped output:
db.sam.find({},{"_id":0,"value.result":1}).pretty()
When you bring the parameter n into picture, you need to specify the number of documents for each group as a ratio, or else that parameter is not necessary at all.
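For comparison, the same "group, then pick a fixed number at random from each group" logic can be sketched on the client side in Python. This is an illustration of the technique rather than a replacement for the map-reduce job; sample_by_group is a hypothetical helper, and it assumes the documents are already loaded as dicts with a groupid field.

```python
import random


def sample_by_group(documents, weights, rng=random):
    """Pick weights[groupid] random documents from each requested group.

    documents: iterable of dicts with a "groupid" key.
    weights:   dict mapping groupid -> number of documents wanted.
    Groups with fewer documents than requested return all they have.
    """
    by_group = {}
    for doc in documents:
        gid = doc["groupid"]
        if gid in weights:  # mirrors the filter in the map function
            by_group.setdefault(gid, []).append(doc)

    picked = {}
    for gid, wanted in weights.items():
        candidates = by_group.get(gid, [])
        # random.sample draws without replacement, so no duplicates.
        picked[gid] = rng.sample(candidates, min(wanted, len(candidates)))
    return picked
```

For example, sample_by_group(products, {2: 4, 7: 6}) would return 4 documents from group 2 and 6 from group 7, provided each group has at least that many.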

Number of items in the aggregation with MongoDB 2.6

My query looks like this:
var x = db.collection.aggregate(...);
I want to know the number of items in the result set. The documentation says that this function returns a cursor. However, it contains far fewer methods/fields than the cursor returned by db.collection.find().
for (var k in x) print(k);
Produces
_firstBatch
_cursor
hasNext
next
objsLeftInBatch
help
toArray
forEach
map
itcount
shellPrint
pretty
No count() method! Why is this cursor different from the one returned by find()? itcount() returns some type of count, but the documentation says "for testing only".
Using a group stage in my aggregation ({$group:{_id:null,cnt:{$sum:1}}}), I can get the count, like this:
var cnt = x.hasNext() ? x.next().cnt : 0;
Is there a more straightforward way to get this count? As in db.collection.find(...).count()?
Barno's answer is correct to point out that itcount() is a perfectly good method for counting the number of results of the aggregation. I just wanted to make a few more points and clear up some other points of confusion:
No count() method! Why is this cursor different from the one returned by find()?
The trick with the count() method is that it counts the number of results of find() on the server side. itcount(), as you can see in the code, iterates over the cursor, retrieving the results from the server, and counts them. The "it" is for "iterate". There is currently (as of MongoDB 2.6) no way to just get the count of results from an aggregation pipeline without returning the cursor of results.
Using a group stage in my aggregation ({$group:{_id:null,cnt:{$sum:1}}}), I can get the count
Yes. This is a reasonable way to get the count of results and should be more performant than itcount() since it does the work on the server and does not need to send the results to the client. If the point of the aggregation within your application is just to produce the number of results, I would suggest using the $group stage to get the count. In the shell and for testing purposes, itcount() works fine.
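As a rough illustration of the $group approach from application code, the Python sketch below appends the counting stage to an existing pipeline and reads the count from whatever the driver returns. The helper names are hypothetical; with PyMongo the extended pipeline would be passed to collection.aggregate(), but the helpers themselves only manipulate plain lists and dicts.

```python
# The counting stage from the question: one output document holding the total.
COUNT_STAGE = {"$group": {"_id": None, "cnt": {"$sum": 1}}}


def with_count_stage(pipeline):
    """Return a new pipeline that ends with the counting $group stage."""
    return list(pipeline) + [COUNT_STAGE]


def extract_count(results):
    """Read the count from the aggregation results.

    An empty result (no documents matched) means a count of 0,
    mirroring `x.hasNext() ? x.next().cnt : 0` in the shell.
    """
    first = next(iter(results), None)
    return first["cnt"] if first is not None else 0
```

Only one small document crosses the network this way, which is the advantage over iterating the whole cursor with itcount().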
Where have you read that itcount() is "for testing only"?
If in the mongo shell I do
var p = db.collection.aggregate(...);
printjson(p.help)
I receive
function () {
    // This is the same as the "Cursor Methods" section of DBQuery.help().
    print("\nCursor methods");
    print("\t.toArray() - iterates through docs and returns an array of the results")
    print("\t.forEach( func )")
    print("\t.map( func )")
    print("\t.hasNext()")
    print("\t.next()")
    print("\t.objsLeftInBatch() - returns count of docs left in current batch (when exhausted, a new getMore will be issued)")
    print("\t.itcount() - iterates through documents and counts them")
    print("\t.pretty() - pretty print each document, possibly over multiple lines")
}
If I do
printjson(p)
I find that
"itcount" : function (){
    var num = 0;
    while ( this.hasNext() ){
        num++;
        this.next();
    }
    return num;
}
This loop
while ( this.hasNext() ){
    num++;
    this.next();
}
is very similar to var cnt = x.hasNext() ? x.next().cnt : 0; and this while loop is perfectly fine for counting.