Filter polygons by area - openstreetmap

I'm trying to filter the polygons of an .osm file by their area.
Specifically, I'm working with Swiss lakes. I extracted all the polygons using the "natural=water" filter, but that still leaves me with every pond in Switzerland. Therefore I'd like to add a filter based on the area of the polygons.
How can I do this?
I have already searched for solutions but was unable to find a good answer.
The best I found was this question, but I don't know where I should execute it or whether it is compatible with OSM data.
Thanks for your answers

One way to solve this issue would be to use Atlas. Here are the steps you would need to follow (with links to other related SO answers inline):
Convert your OSM file to a .osm.pbf using osmosis
Load the .osm.pbf into an Atlas file, using atlas-shell-tools's pbf2atlas subcommand
Write a small Java class that opens the Atlas file, gets all the lakes using a TaggableFilter, and then filters them by area, using Polygon.getSurface().
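If a Python route is easier to try first, here is a rough alternative sketch (not the Atlas/Java class described above): GDAL/OGR's Python bindings can read an .osm file directly and expose closed ways and multipolygon relations as a "multipolygons" layer. The file name, the 1 km² threshold and the Swiss LV95 projection below are assumptions, not anything from the original question.
from osgeo import ogr, osr

MIN_AREA_M2 = 1_000_000  # assumed threshold: keep water polygons larger than 1 km^2
wgs84 = osr.SpatialReference()
wgs84.ImportFromEPSG(4326)
wgs84.SetAxisMappingStrategy(osr.OAMS_TRADITIONAL_GIS_ORDER)  # keep lon/lat axis order on GDAL 3
lv95 = osr.SpatialReference()
lv95.ImportFromEPSG(2056)  # Swiss LV95, a metric projection, so areas come out in m^2
to_metric = osr.CoordinateTransformation(wgs84, lv95)

source = ogr.Open('switzerland.osm')      # requires GDAL built with the OSM driver
layer = source.GetLayer('multipolygons')  # closed ways and multipolygon relations

lakes = []
for feature in layer:
    if feature.GetField('natural') != 'water':
        continue
    geometry = feature.GetGeometryRef().Clone()
    geometry.Transform(to_metric)         # reproject so GetArea() returns square metres
    if geometry.GetArea() >= MIN_AREA_M2:
        lakes.append(feature.GetField('name'))
print(len(lakes), 'polygons kept')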

Related

accessing p-values in PySpark UnivariateFeatureSelector module

I'm currently in the process of performing feature selection on a fairly large dataset and decided to try out PySpark's UnivariateFeatureSelector module.
I've been able to get everything sorted out except one thing -- how on earth do you access the actual p-values that have been calculated for a given set of features? I've looked through the documentation and searched online, and I'm starting to wonder if you can't... but that seems like a gross oversight for a package like this.
Thanks in advance!
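As a hedged sketch of one possible route: the fitted UnivariateFeatureSelectorModel only seems to expose the selected feature indices, while the statistics it is built on live in pyspark.ml.stat, so one workaround is to run the matching test directly. The toy data, column names and choice of ChiSquareTest below are assumptions.
from pyspark.sql import SparkSession
from pyspark.ml.linalg import Vectors
from pyspark.ml.stat import ChiSquareTest  # or ANOVATest / FValueTest in Spark >= 3.1

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(0.0, Vectors.dense(0.0, 1.0)), (1.0, Vectors.dense(1.0, 0.0))],
    ["label", "features"],
)  # stand-in for the real dataset
row = ChiSquareTest.test(df, featuresCol="features", labelCol="label").head()
print(row.pValues)  # one p-value per feature, in the order of the feature vector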

Spatial geometry columns vs float/decimal columns for storing longitude/latitude

I need to store and give out long/lat coordinates and display them on Google Maps. I would need to store points, lines, and polygons, then add metadata to them for generating info.
I'm currently looking into PostGIS, and it seems a fair bit to absorb. Now I'm wondering whether I need to delve into it.
Is it advisable to use a spatial database for this purpose, or are float/decimal columns for long/lat fine?
It is a lot to absorb. Storing as a float/decimal gets you nothing. Nothing at all. Spatial functions require spatial types. You gotta learn it. You don't have to learn all of it, but you have to learn it. And, it's not too hard to get started.
CREATE TABLE foo (id, geom)
AS
VALUES ( 1, ST_MakePoint(8.55, 47.37)::geography );  -- longitude first, then latitude (example coordinates)
Etc.
I highly suggest PostGIS in Action 2nd Edition
If the only thing you need is to store and give out coordinates, then float/decimal columns would be more than enough. However, if you are querying for spatial relations (whether a point is located within a polygon, whether polygons intersect, etc.), you would be better off using either PostGIS or, for instance, MySQL's extensions for spatial data.
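As a rough sketch of the kind of spatial-relation query that needs spatial types rather than plain float columns, here is a point-in-polygon test run against PostGIS from Python with psycopg2; the table name, column names and connection string are made up for the example, and the geom column is assumed to be a geometry in SRID 4326.
import psycopg2

conn = psycopg2.connect("dbname=gis")  # placeholder connection string
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT id
        FROM regions
        WHERE ST_Contains(geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326))
        """,
        (8.55, 47.37),  # longitude, latitude of the point to test
    )
    matching_ids = [row[0] for row in cur.fetchall()]
print(matching_ids)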

How to use wget to download large osm dataset?

I want to create a global dataset of wetlands using the OSM database. Since overpass turbo and similar tools have problems with huge datasets, I thought I could use wget to download the planet file and filter it for only the data I'm interested in. The problem is, I don't know much about wget. So I wanted to know whether there is a way to filter the data from the planet file while downloading and unzipping it.
In general, I'm looking for the least time- and disk-space-consuming way to get that data. Would you have any suggestions?
wget is just a download tool; it can't filter on the fly. You could probably pipe the data to a second tool that filters on the fly, but I don't see any advantage in that, and the disadvantage is that you can't verify the file's checksum afterwards.
Download the planet and filter it afterwards using osmosis or osmfilter.
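If a scripted route is preferred over the osmosis/osmfilter commands, a rough alternative sketch is to scan the downloaded file with pyosmium; the file name and the natural=wetland tag are assumptions, and this only reads the data rather than writing a filtered extract.
import osmium

class WetlandCounter(osmium.SimpleHandler):
    def __init__(self):
        super().__init__()
        self.count = 0
    def way(self, w):
        if w.tags.get('natural') == 'wetland':
            self.count += 1

handler = WetlandCounter()
handler.apply_file('planet-latest.osm.pbf')  # placeholder file name
print(handler.count, 'wetland ways found')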

How to use mapnik + shapefiles with dynamically filtered results without PostgreSQL or manual changes to the dbf file?

I have an unusual challenge that I'm struggling with. I'm hoping someone has come across this and can point me towards a solution.
Overview:
I am searching for a way to use mapnik + shapefiles with dynamically filtered results without PostgreSQL or manual changes to the dbf file.
Details:
I need to draw territories made up of ZIP codes on a map. My current solution pulls polys from a MySQL database and draws them on a Google Map at runtime. This is obviously very slow. I'd prefer to move this to a mapnik-compatible solution so I can render the tiles on the fly and have the maps respond much more quickly.
Now, before you say "just use Postgres instead of shapefiles", let me say that this server doesn't and can't have Postgres on it. It sucks, but it's a hard limitation. The dynamic data source is MySQL, which mapnik doesn't support.
I can, of course, render these territories manually offline (I happen to use TileMill for this). To do so, I must manipulate the shapefile's dbf file in OpenOffice by adding a simple column that flags if a ZIP is in a territory or not and to whom it belongs. Then in my CartoCSS, it's a simple matter of looking for [INCLUDE="Y"] { ... } to draw or hide the poly, and a simple color assignment based on the territory owner's id. No problem there except for the fact that this is incredibly time- and labor-intensive.
The real problem is that the client has nearly a hundred territories that change from time to time. To make matters worse, when they change, a ripple effect can cause neighboring territories to change as well. And to make matters even worse, they don't want the territory owners to see each other's territories, meaning each territory has to be filtered and drawn in isolation.
So, I'm looking for an automated hybrid solution to handle filtering the shapefile so I don't have to redo this by hand every time the territories change.
My hair is turning gray on this one. I'm open to options so long as the solution doesn't rely on PostgreSQL or manual manipulation of the shapefile's dbf database.
Any help and advice is much appreciated.
Thanks,
Steve
TLDR: I need to use mapnik + shapefiles with dynamically filtered results without PostgreSQL or manual changes to the dbf file
You can do this by rendering territories with unique colors, and then manipulating the color palette of the rendered images when you serve them. Doing this on the server is cheap if done in C, or you can attempt to do it in JavaScript (performance unknown to me; see "Possible to change an image color palette using javascript?").
There's a description of this here: http://blog.webfoot.com/2013/03/12/optimizing-map-tile-generation/
If you're rendering boundaries only, then this will not work.
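A rough sketch of that palette trick in Python with Pillow; the tile path, the palette index of the viewer's territory and the colors are all made-up example values.
from PIL import Image

OWNER_COLOR = (30, 120, 200)    # color the viewer is allowed to see
HIDDEN_COLOR = (240, 240, 240)  # neutral fill for everyone else's territories
OWNER_INDEX = 5                 # assumed palette index encoding the viewer's territory

tile = Image.open('tile.png')   # must be an indexed ("P" mode) image
palette = tile.getpalette()     # flat [r, g, b, r, g, b, ...] list
for index in range(1, 16):      # indices 1-15 assumed to encode territories
    r, g, b = OWNER_COLOR if index == OWNER_INDEX else HIDDEN_COLOR
    palette[index * 3:index * 3 + 3] = [r, g, b]
tile.putpalette(palette)
tile.save('tile_for_this_viewer.png')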
If you haven't already, you can speed up polygon rendering in browser for your existing system by using encoded polylines (https://developers.google.com/maps/documentation/utilities/polylinealgorithm). This is a huge performance bump, and reduces wire-transfer size as well.
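For reference, a minimal sketch of producing such an encoded polyline in Python with the third-party polyline package; the coordinates are made-up sample points.
import polyline  # pip install polyline

ring = [(47.37, 8.54), (47.38, 8.56), (47.36, 8.57), (47.37, 8.54)]  # (lat, lon) pairs
encoded = polyline.encode(ring)  # compact string, much smaller than raw coordinate arrays
print(encoded)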

mongodb: inserting and querying geometries and WMS

I am discovering MongoDB; it looks nice, but I am still wondering whether it can meet my needs.
The situation is that we have 16 million data points and we want to intersect part of them with polygons to get statistics (how many points fall in each polygon).
Basic geometries would be degree cells (1 degree, 0.5 degree...) covering the whole world. In that case the $within function would work, right?
But I wonder, how do I insert these geometries (coming from a shapefile) into MongoDB? Until now I was using PostgreSQL/PostGIS, for which I have a lot of tools, but for MongoDB... I am also wondering whether more complex geometries could be inserted and queried against points.
MongoDB only returns JSON, right? If we want to plot a few hundred points that would be no problem, but hundreds of thousands would have to be converted to vector data via JavaScript... It is for this reason that WMS services are useful, as they provide a single image.
Any hope of connecting MongoDB to any WMS? I saw someone announcing a plugin for GeoServer, but that was a year ago and nothing has happened since then.
In case it is not possible, roughly how many GeoJSON features can be plotted at a time while keeping decent browser performance?
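As a hedged sketch of the counting query described above (note that $within has since been renamed $geoWithin): the database, collection and field names and the grid cell below are assumptions, and the points need a 2dsphere index.
from pymongo import MongoClient

points = MongoClient().gis.points           # assumed database/collection names
points.create_index([('loc', '2dsphere')])  # 'loc' holds GeoJSON points

cell = {                                    # a 1-degree grid cell as a GeoJSON polygon
    'type': 'Polygon',
    'coordinates': [[[7.0, 46.0], [8.0, 46.0], [8.0, 47.0], [7.0, 47.0], [7.0, 46.0]]],
}
count = points.count_documents({'loc': {'$geoWithin': {'$geometry': cell}}})
print(count, 'points in this cell')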
Not much help, but I saw a talk by someone who added MongoDB as a back end to GeoServer last year.
IIRC, he said he would open source it (if his company approved), so maybe it's worth tracking him down.
EDIT: Looks like he got approval. Dug up some code here but not sure where associated documentation is. The Geotools/opengeo mailing list is where I found that.
I'm also starting to investigate using NoSQL for geographic data.
There is an article whose example code uses Python, PyMongo and the OGR libraries to convert shapefiles to a MongoDB collection and vice versa.
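A minimal sketch of that shapefile-to-MongoDB direction, assuming the shapefile is already in WGS84 lon/lat and using made-up file, database and collection names:
import json
from osgeo import ogr
from pymongo import MongoClient

collection = MongoClient().gis.cells
source = ogr.Open('cells.shp')  # placeholder shapefile
layer = source.GetLayer()

docs = [json.loads(feature.ExportToJson()) for feature in layer]  # GeoJSON features
collection.insert_many(docs)
collection.create_index([('geometry', '2dsphere')])  # enables $geoWithin / $near queries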