OpenStreetMap - continuous road? - PostgreSQL

I imported the OSM data for Switzerland into Postgres, and I am interested in getting the road data for a continuous part of a highway (I know its name), that is, the part that connects two specific cities. The highway is quite big (A1) and connects a lot of cities.
I am not sure how the sequence of road segments is stored in Postgres (i.e., how one knows that one road segment comes directly after another). How should I query Postgres to get a linestring with the route from one city to another? I can visualize the data of the whole highway (which spans multiple cities) in Quantum GIS with the query:
select osm_id,way from planet_osm_roads where highway='motorway' and ref='A1';
but I don't know how to get only the osm_ids that I am interested in, in the order they appear along the route. I do not want to use a bounding-box constraint in the WHERE clause, because I am looking for a general solution, and I am still not sure how the order of the road segments is stored.

I did this with pgRouting, specifically its pgr_dijkstra function. I loaded the OSM data into a format that pgRouting can use with the osm2pgrouting tool.

Related

pgRouting with custom network?

I have a cost network, but it's not a street mapping network. I know the nodes and edges, as I defined them. pgRouting looks like a good choice, but every single example I can find uses OpenStreetMap as the data. I don't have GPS coordinates. The x1,y1 for nodes makes no sense in my graphs: my nodes have specific ids, not coordinates. The costs aren't calculated from the coordinates; I assign them to the various edges based on knowledge specific to my domain.
Are there any examples of how to create a custom network in pgRouting? I'm really struggling because the examples are "and then you use this tool to import OSM data"...which doesn't help me at all.
@Chris Kessel
I don't know if this is still relevant, but it may help others:
Basically, what you need is a table of edges in which the 'source' column holds the id of the node at one end of the edge and the 'target' column holds the id of the node at the other end. You also need a defined cost for each edge. I'm not sure what this will be for you; usually it's a distance or a time.
Usually this table is built from geographic data with the pgr_createTopology function, but in your case you will just need to create it yourself, I suppose.
I think this link can help you:
https://anitagraser.com/2011/02/07/a-beginners-guide-to-pgrouting/
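To make the edge-table idea concrete, here is a minimal sketch in plain Python rather than pgRouting itself. It assumes a hypothetical set of edge rows shaped like (id, source, target, cost), with node ids and costs invented for illustration, and runs a textbook Dijkstra search over them. No coordinates are involved anywhere, which is the point: only ids and costs are needed.

```python
import heapq
from collections import defaultdict

# Hypothetical edge rows, shaped like (id, source, target, cost).
# The node ids and costs are made up; no coordinates are needed.
EDGES = [
    (1, 10, 20, 4.0),
    (2, 10, 30, 1.0),
    (3, 30, 20, 2.0),
    (4, 20, 40, 5.0),
]


def shortest_path(edges, start, goal, directed=False):
    """Return (total_cost, [node ids]) or (None, []) if goal is unreachable."""
    graph = defaultdict(list)
    for _edge_id, source, target, cost in edges:
        graph[source].append((target, cost))
        if not directed:
            graph[target].append((source, cost))

    best = {start: 0.0}   # cheapest known cost to reach each node
    prev = {}             # predecessor of each node on its cheapest path
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph[node]:
            candidate = dist + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                prev[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))

    if goal not in best:
        return None, []
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return best[goal], path[::-1]


total_cost, route = shortest_path(EDGES, 10, 40)
```

Here the cheapest route from node 10 to node 40 goes through nodes 30 and 20, because the two cheap edges together cost less than the direct 10 to 20 edge.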
The answer to the question "Are there any examples of how to create a custom network in pgRouting?" is: yes, there are.

How to do in-memory search for polygons that contain a given point?

I have a PostgreSQL table with a geometry column that stores different simple polygons (possibly intersecting). The polygons are all areas within a city. I receive a point as input (a latitude-longitude pair) and need to find the list of polygons that contain it. What I have currently:
Unclustered GiST index defined on the polygon column.
Use ST_Contains(@param_Point, table.Polygon) on the whole table.
It is quite slow, so I am looking for a more performant in-memory alternative. I have the following ideas:
Maintain dictionary of polygons in Redis, keyed by their geohash. Polygons with same geohash would be saved as a list. When I receive the point, calculate its geohash and trim to a desired level. Then search in the Redis map and keep trimming the point's geohash until I find the first result (or enough results).
Have a trie of geohashes loaded from the database. Update the trie periodically or by receiving update events. Calculate the point's geohash, search in the trie until I find enough results. I prefer this because the map may have long lists for a geohash, given the nature of the polygons.
Any other approaches?
I have read about libraries like GeoTrie and Polygon Geohasher but can't seem to integrate them with the database and the above ideas.
Any cues or starting points, please?
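As a starting point for prototyping the in-memory idea, here is a minimal sketch in plain Python. It swaps in a simpler technique than the geohash trie described above: polygons are bucketed by the cells of a uniform grid that their bounding boxes touch (the grid stands in for geohash cells, which is an assumption of this sketch), and candidates from the point's cell are confirmed with an even-odd ray-casting test. All names are hypothetical, coordinates are in arbitrary planar units, and polygons are assumed to have no holes.

```python
from collections import defaultdict
from math import floor


def contains(polygon, point):
    """Even-odd ray casting; polygon is a list of (x, y) vertices, no holes."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


class PolygonIndex:
    """Coarse grid filter followed by an exact point-in-polygon test."""

    def __init__(self, polygons, cell=1.0):
        self.polygons = polygons          # dict: polygon id -> vertex list
        self.cell = cell
        self.grid = defaultdict(list)     # (cell_x, cell_y) -> polygon ids
        for pid, poly in polygons.items():
            xs = [px for px, _ in poly]
            ys = [py for _, py in poly]
            for cx in range(floor(min(xs) / cell), floor(max(xs) / cell) + 1):
                for cy in range(floor(min(ys) / cell), floor(max(ys) / cell) + 1):
                    self.grid[(cx, cy)].append(pid)

    def query(self, point):
        key = (floor(point[0] / self.cell), floor(point[1] / self.cell))
        return [pid for pid in self.grid.get(key, [])
                if contains(self.polygons[pid], point)]


# Two overlapping squares, invented for illustration.
index = PolygonIndex({
    "a": [(0, 0), (5, 0), (5, 5), (0, 5)],
    "b": [(3, 3), (8, 3), (8, 8), (3, 8)],
})
hits = index.query((4, 4))
```

The cell size is the main knob: smaller cells mean shorter candidate lists per lookup but more memory, which is the same trade-off as choosing a geohash precision.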
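Have you tried using ST_Within? I'm not sure it meets your criteria, but I believe it is meant to be faster than ST_Contains. Note, though, that ST_Within(A, B) is defined as ST_Contains(B, A), so it is worth measuring before relying on it.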

What are the pros and cons of multiple rows of POLYGON vs one MULTIPOLYGON field?

So for the first time I'm gonna do a project that involves maps and layers on top of maps which have many points and many polygons on them.
I have the tendency to create separate tables for points and polygons and then create many-to-many relationships between them and the layers table. If I do that I end up with 5 tables: points, polygons, layers, layers_points and layers_polygons.
However, I see PostGIS also offers types called MULTIPOINT and MULTIPOLYGON. If I use those types, then I could put it all in the layers table. I guess that would make queries faster, because I need fewer joins. However, I'm not sure whether I might regret it later, if it means that working with the individual points and polygons becomes impossible. I'm not even sure yet whether it will be necessary to perform calculations on the individual points and polygons, but it would be nice to know whether that's possible in both approaches.
So basically I'm asking, what the pros and cons are of these different approaches?
In general, you would consider using multipolygons to represent entities that have disjoint surfaces (for example, the geometry of Alaska) or other topologies that you can't represent as polygons. The key here is that a single entity needs to be expressed with a multipolygon.
What you wouldn't do is group unrelated polygons into a multipolygon, because you won't be able to perform queries at a child polygon level, unless you extract the rings into another geometry. If the polygons are unrelated, chances are you will need to query them individually. Even if they share a layer, you can manage that relation with business logic without merging them as they aren't representing the same entity.
Keep in mind that geometry tools in the frontend won't necessarily treat multipolygons as a valid geometry or as a multi object. Point-in-polygon algorithms, which look like your use case, won't necessarily work when checking whether a point is contained in a multipolygon.
Tools like Wicket.js (which converts between WKT, GeoJSON, and native objects) don't support multipolygons. Google Maps API v3 doesn't support multipolygons except in the data layer (but you can't operate on the data layer as you would on a polygon feature). Turf.js has operations that run on a FeatureCollection containing several polygons, yet not on a multipolygon.
Without knowing your exact use case, that's the best I can tell you, and TL/DR: keep your polygons as they are.

Postgis ST_Split not splitting roads by all points

I have a table containing the street network of Chicago, and I also have a table of crimes committed in Chicago. I am trying to create k-means clusters for the crimes by assigning them to the cluster centre that is the shortest road distance away.
Firstly, I interpolated all the crimes onto the closest road. So far, so good.
Now what I'm trying to do is to split each road by all the crime points that fall on it, so that I can then create a network topology using pgrouting and route from one crime location to another.
Problem is, the ST_Split function does not seem to be splitting most of the roads, and I have no idea why. Given that I have a million crime points, the roads should be split into a large number of segments, but I'm only getting about a thousand more rows than there are in the original street network table. This is the command I'm using:
CREATE TABLE algorithms.crime_network AS
SELECT road.id AS road_id,
       (ST_Dump(ST_Split(road.geom, road.crime_points))).geom
FROM (
    SELECT r.geom AS geom,
           r.gid AS id,
           ST_Multi(ST_Collect(c.geom)) AS crime_points
    FROM public.transportation r
    INNER JOIN chicago_data.interpolated_crimes c ON c.road_id = r.gid
    GROUP BY r.gid
) AS road;
I am using PostGIS version 2.2.2, so the fact that I'm splitting by multipoints isn't a problem.
Any help would be appreciated!
As commented, some functions, such as all of the overlay operators and ST_Split, require exact noding to perform as expected. Because of annoying floating-point differences, vertices from different geometries can end up very close to each other (on the order of <1e-12) without being exactly coincident, and then the split does not happen at that point.
Use ST_Snap to get exact noding of one geometry on another, which will help functions like ST_Split operate as expected.
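To see why an interpolated point can miss its own line, here is a small illustration in plain Python (not PostGIS). It assumes a made-up segment and fraction, computes the interpolated point in ordinary floating point, and then checks collinearity with exact rational arithmetic via fractions.Fraction, which converts each float to the exact value it stores.

```python
from fractions import Fraction


def exact_cross(a, b, p):
    """Exact cross product of (b - a) and (p - a); zero only if p is on line ab."""
    ax, ay = map(Fraction, a)
    bx, by = map(Fraction, b)
    px, py = map(Fraction, p)
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


# Hypothetical road segment and interpolation fraction.
a, b = (0.0, 0.0), (1.0, 3.0)
t = 0.1

# The point "on" the segment, as floating-point interpolation produces it.
p = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

offset = exact_cross(a, b, p)
on_line_exactly = (offset == 0)           # False: the point is slightly off
within_tolerance = abs(offset) < 1e-12    # True: but only by a tiny amount
```

The point is off the line by far less than any meaningful distance, yet an exact test still treats it as not touching, which matches the behaviour described above. Snapping with a small tolerance closes that gap before splitting.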

What SRID should I use for my application and how?

I'm using PostgreSQL with PostGIS. All my data has already decimal lat/long attached to it (i.e. -87.34554 33.12321) but to use PostGIS I need to convert it to a certain type of SRID.
The majority of my queries are looking for data inside a certain radius.
What SRID should I use? I created already a geometry column with SRID 4269.
In this example, the author is converting SRID 4269 to SRID 32661. I'm very confused about how and when to use these SRIDs. Any light on the subject would be truly appreciated.
As long as you never intend to reproject/transform the data to another coordinate system, it doesn't technically matter what SRID you use. However, assuming you don't want to throw away that important metadata, and you do want to transform the data at some point, you will want to ensure your assigned SRID matches the data, so PostGIS knows what to do when the time comes.
So why would you want to reproject from EPSG:4269? The answer is that certain types of queries (such as distance) make no sense in this 'unprojected' world. Your units are decimal degrees, and a straight measurement of x decimal degrees is a different real distance depending on where on the planet you are.
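A quick way to see this is to measure one degree of longitude at two different latitudes. The sketch below is plain Python, not PostGIS: it uses the haversine formula and assumes a spherical Earth with a mean radius of 6371008.8 m, so the numbers are approximations.

```python
import math

# Assumption: spherical Earth with the mean radius in metres.
EARTH_RADIUS_M = 6371008.8


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


# The same one-degree step in longitude, at the equator and at 60 degrees north.
one_degree_at_equator = haversine_m(0.0, 0.0, 1.0, 0.0)   # roughly 111 km
one_degree_at_60n = haversine_m(0.0, 60.0, 1.0, 60.0)     # roughly half of that
ratio = one_degree_at_60n / one_degree_at_equator
```

So a radius expressed as "0.1 degrees" covers about twice as much ground east to west at the equator as it does at 60 degrees north, which is why radius queries on raw degrees are unreliable.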
In your example above, someone is using EPSG:32661 because they believe it will give them better accuracy for the area they're working in. If your data is in a specific area of the globe, you can select a projection that's accurate for that area. If it spans the entire globe, you have to choose a projection that does 'ok' for your needs.
Now fortunately PostGIS has a few ways of making all this easier. For approximate distances you can just use the ST_Distance_Sphere function which, as you might guess, assumes the earth is a sphere. Or use the more accurate ST_Distance_Spheroid. (These were renamed ST_DistanceSphere and ST_DistanceSpheroid in PostGIS 2.2.) Using these, you don't need to reproject, and you will probably be fine for your distance queries except in edge cases. Newer versions of PostGIS also let you use geography columns.
tl;dr - use ST_Distance_Spheroid for your distance queries, store your data in geography columns, or transform it to a local projection (when storing, or on the fly, depending on your needs).
Take a look at this question: How do you know what SRID to use for a shp file?
The SRID is just a way of storing the WKT inside the database (you may have noticed that, although you store lat/long points, the preferred storage format is a long string of digits and capital letters).
The SRID or EPSG code can differ by country, state, and so on, although there are some very common ones, especially the two you mentioned. If you need specific information on which area uses which SRID, there is a database for that.
Inside your database, the spatial_ref_sys table lists the SRIDs that PostGIS knows about.