Identify crossroad nodes in OpenStreetMap data (.pbf)

Does anybody know if there is a way to separate out only the crossroad nodes that are included in a .pbf file? Is this information (whether a node is a crossroad or not) included in the file format?

Another option would be to use the new Atlas project.
As part of loading .osm.pbf files into in-memory Atlas files, it takes care of way sectioning on roads:
Load your .pbf file into an Atlas. You will then have an Atlas object that you can save to a file and reuse.
Use the Atlas APIs to access all the intersections.
In the end, each Atlas Node connected to more than 4 Edges on a two-way road, or to more than 2 Edges on a one-way road, would be a candidate, if I understand your question correctly.

I'm not aware of a ready-made solution for this task, but it should still be relatively easy to do.
For parsing the .pbf file, I recommend using an existing library like Osmosis or Osmium. That way, you only need to implement the actual semantics of your use case.
The nodes themselves don't have any special attributes that mark them as crossroads. So instead, you will have to look at the ways containing the nodes.
Some considerations when implementing this:
You need to check the way's tags to find out whether it's a road. The most relevant key for that is highway. The details depend on your specific use case – for example, you need to decide whether footways, forestry tracks, driveways, ... should count.
What matters is the number of connecting way segments at a node, not the number of ways. For example, a node that is part of two ways may be a crossroads (if at least one of the ways continues beyond that node), or may not (if both ways start/end at that node).
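The considerations above can be sketched in a few lines. This is a minimal illustration, not a complete solution: it assumes the ways have already been parsed (for example with Osmium or Osmosis) into (tags, node_ids) pairs, and the ROAD_VALUES set and the threshold of three segments are assumptions that you would adapt to your use case.

```python
from collections import Counter

# Assumption: which highway values count as roads depends on the use case.
ROAD_VALUES = {"motorway", "trunk", "primary", "secondary", "tertiary",
               "unclassified", "residential", "service"}

def find_crossroads(ways, min_segments=3):
    """Return the node IDs where at least `min_segments` road segments meet.

    `ways` is an iterable of (tags, node_ids) pairs, as a PBF parser would yield.
    """
    segment_ends = Counter()
    for tags, node_ids in ways:
        if tags.get("highway") not in ROAD_VALUES:
            continue  # not a road for our purposes
        last = len(node_ids) - 1
        for i, node_id in enumerate(node_ids):
            # An interior node touches two segments of this way, an end node one.
            segment_ends[node_id] += (i > 0) + (i < last)
    return {n for n, count in segment_ends.items() if count >= min_segments}

ways = [
    ({"highway": "residential"}, [1, 2, 3]),  # passes through node 2
    ({"highway": "residential"}, [2, 4]),     # starts at node 2: T-junction
    ({"highway": "tertiary"}, [3, 5]),        # merely continues from node 3
    ({"building": "yes"}, [2, 6, 7, 2]),      # not a road, ignored
]
print(find_crossroads(ways))
```

Node 3 is shared by two ways but only two segments meet there, so it is not reported, which is exactly the distinction between counting ways and counting segments.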


How to use VTK to efficiently write time-varying field data on a fixed mesh?

I am working on physics simulation research. I have a large fixed grid in one of my projects that does not vary with time. The fields on the grid, on the other hand, vary with time in the simulation. I need to use VTK to record the field data in each step for visualization (Paraview).
The method I am using is to write a separate *.vtu file to disk at each time step. This basically serves the purpose, but actually writes a lot of duplicate data (re-recording the geometry of the mesh at each step), which not only consumes more disk space, but also wastes time on encoding and parsing.
I would like to have a way to write the mesh information only once, and the rest of the time only new field data is written, while being able to guarantee the same visualization. Please let me know if VTK and Paraview provide such an interface and how to implement it.
Using a .pvtu file for each step and referring to the same .vtu file as its Piece should do the trick.
See this similar post on the ParaView discourse, and the pvtu doc
EDIT
This seems to be a side effect of the format; it is not supported by the writer.
The correct solution is to use another file format ...
Let me provide my own research findings for reference.
As Nico said, with the combination of pvtu/vtu files, we could theoretically implement a geometry structure stored in a separate vtu file, referenced by a pvtu file. Setting the NumberOfPieces attribute of the pvtu file to 1 would enable the construction of only one separate vtu file.
However, the VTK library does not expose a dedicated operation interface to control the writing process of vtu files. No matter how it is set, as long as the writer's input contains geometry structures, the writer will write geometry information to disk, and this process cannot be skipped through the exposed interface.
However, it is indeed possible to make multiple pvtu files point to the same vtu file by manually editing the Piece node in each pvtu file, and ParaView can recognize and visualize such a file group properly.
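The manual edit described above could be automated along these lines. This is only a sketch using Python's standard XML module, not a VTK API: the function name retarget_pieces, the file names, and the minimal .pvtu content are made up for illustration, and the only format detail relied on is that each Piece element carries a Source attribute.

```python
import xml.etree.ElementTree as ET

def retarget_pieces(pvtu_xml, shared_vtu):
    """Return the .pvtu XML text with every <Piece> pointing at `shared_vtu`."""
    root = ET.fromstring(pvtu_xml)
    for piece in root.iter("Piece"):
        piece.set("Source", shared_vtu)
    return ET.tostring(root, encoding="unicode")

# Hypothetical minimal .pvtu content for one time step.
step_pvtu = """<VTKFile type="PUnstructuredGrid" version="0.1" byte_order="LittleEndian">
  <PUnstructuredGrid GhostLevel="0">
    <PPoints>
      <PDataArray type="Float32" NumberOfComponents="3"/>
    </PPoints>
    <Piece Source="step_0007.vtu"/>
  </PUnstructuredGrid>
</VTKFile>"""

edited = retarget_pieces(step_pvtu, "mesh.vtu")
print(edited)
```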
I did not proceed to try adding arrays to the unstructured grid and using pvtu output.
So, I think the conclusion is:
If you don't want to dive into VTK's library code and XML implementation, then this approach doesn't make sense.
If you are willing to write a series of files, delete most of the vtu files, and then point the Piece nodes of all the pvtu files to the only surviving vtu file by editing the pvtu files, you can save a lot of disk space, but you will not shorten the write, read, and parse times.
If you implement an XML writer yourself, you can in theory meet all the requirements, but it requires a lot of coding work.

Query regarding OSM file structure

I have a question about the tags and values in OSM files. I came across files where tag names are inconsistent and some node ids/uids have no tag names, so we cannot tell what feature they represent without opening them in GIS software. For example, some node ids and uids have tags such as "source bing". Is there a way to identify what they represent without opening them in GIS software? Also, how does OSM recognize these features without proper tag names and values?
Thank you!
In the OSM data model, nodes can represent point features, but they can also be used to define the shape of ways (or, rarely, relations). OSM ways do not have location information on their own, but instead reference a list of node IDs. You can read more about this in the OSM wiki's article on nodes. The OSM wiki is also a good source on the meaning of the tags used in the OSM database.
Nodes that do not represent a point feature will usually carry no tags. However, you will sometimes find nodes with tags for internal use by OSM contributors, such as source information.
There are other reasons why a node may have no tags or unexpected tags, such as editing mistakes by an OSM contributor or newly invented tags. But these are comparatively less common.
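As a rough illustration of the distinction above, a node's tags can be sorted into three buckets without any GIS software. The set of keys treated as contributor-facing metadata here (METADATA_KEYS) is an assumption for the example, not an official OSM list; the OSM wiki remains the reference for what individual tags mean.

```python
# Assumption: keys treated as contributor-facing metadata rather than feature tags.
METADATA_KEYS = {"source", "created_by", "note", "fixme", "FIXME"}

def classify_node(tags):
    """Classify a node by its tags: 'untagged', 'metadata-only' or 'feature'."""
    if not tags:
        return "untagged"        # typically just a vertex of a way
    if all(key in METADATA_KEYS or key.startswith("source:") for key in tags):
        return "metadata-only"   # e.g. only source=bing
    return "feature"

print(classify_node({}))
print(classify_node({"source": "bing"}))
print(classify_node({"amenity": "cafe", "source": "survey"}))
```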

Meaning of the spatialite scheme generated by the spatialite_osm_map tool

I used the spatialite_osm_map tool to generate a spatialite database from an .osm.pbf file. After the process was finished, a series of tables were generated in the database as shown in the image.
I noticed that there were 3 groups of tables based on the prefixes of their names: ln_, pg_ and pt_. I also noticed that the rest of the name corresponded to a key defined in OpenStreetMap.
Can someone explain to me how the information is distributed in each of these groups and tables? I've searched for a site that explains the resulting schema after the conversion, but I've only found information on how to use the tool.
I think you have already identified the key points of this scheme.
Its main purpose is to offer the OSM data in a form that is more direct and intuitive for a GIS user. The data is split according to OSM tags (aerialway, aeroway, amenity, etc.; you can change the list of tags to be used if you don't need all of them) and according to geometry type (pt_* for points, ln_* for lines, and pg_* for polygons). These tables, which a GIS user can directly treat as "layers", can therefore be styled quickly with simple rules, for example in a desktop GIS application such as QGIS: green for pg_natural, blue for ln_waterway and pg_waterway, or just a click on the "pg_building" layer to toggle its visibility. This schema doesn't preserve all the objects from the OSM database, only those needed to build the tables for the requested tags.
Contrary to the original way of storing OSM objects, this kind of extraction loses the relationships between objects. For example, in OSM the same node can be used both as part of the relation describing an administrative boundary and as part of a road; here you will get a road line in ln_road and a polygon in pg_boundary, but you will lose the information that they may have partially shared nodes. Notably because of this last point, the size of such OSM extractions can be relatively large compared to the original file.
So I guess this kind of schema (one among several existing ways to transform OSM data) offers a useful abstraction for those who are not accustomed to the OSM data model, which uses Node, Way and Relation elements (e.g. in OSM, buildings can be represented as a closed way or as a relation; here you "simply" get polygons for these various buildings).
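To get an overview of such a database, the tables can be grouped by their prefix with plain SQLite. The sketch below uses an in-memory database with a few hypothetical table names as a stand-in for the generated file; with a real file you would pass its path to sqlite3.connect instead. Listing table names does not require the SpatiaLite extension, although reading the geometries would.

```python
import sqlite3
from collections import defaultdict

GEOMETRY_PREFIXES = {"pt": "points", "ln": "lines", "pg": "polygons"}

def group_tables(conn):
    """Group table names of the form <prefix>_<osm key> by geometry type."""
    groups = defaultdict(list)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    for (name,) in rows:
        prefix, _, osm_key = name.partition("_")
        if prefix in GEOMETRY_PREFIXES and osm_key:
            groups[GEOMETRY_PREFIXES[prefix]].append(osm_key)
    return dict(groups)

# Stand-in for the generated database (table names are illustrative).
conn = sqlite3.connect(":memory:")
for table in ("pt_amenity", "ln_highway", "ln_waterway", "pg_building", "pg_natural"):
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")
print(group_tables(conn))
```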

Routing network from OSM

I'm looking for a good tool to import map.osm into Postgres and then create routes that will be displayed by GeoServer. I need routes with some text information about the vertices (e.g. city, address, address number, and so on).
I found this:
osm2pgrouting - Import OSM data into pgRouting Database
osm2postgis - Import OSM data to PostGIS
osm2po - tool to convert OSM data into a routable format
osm4routing - OpenStreetMap data parser to turn them into a nodes-edges adapted for routing applications
I don't have much experience with GIS, so which tool is best for me? I tried osm2pgrouting, but the resulting tables do not contain any data about the vertices (only lat. and alt.). Thanks for your answers.
UPDATE App Info:
I will have a web client and an Android client where the user enters text values for the start and end nodes, and then gets a WMS layer from GeoServer with the vertices of the requested route, for example.
My result could be some edges and nodes like this:
sequence_num, edge_distance, and information about the edge vertices such as osm_id, some text value, lat, alt, etc.
I think you have a lot of work to do before you get to a complete solution, but here are some pointers. I suggest you break down your project into smaller chunks and ask specific questions on any bits you might get stuck on.
First, you need to import your data. Then you'll need some pre-processing / cleaning. Then you need your routing queries and, finally, a way to use the outputs (with this last part determining to some extent the previous steps).
Import OSM data
As I described in an answer to your previous question here, you can use OGR2OGR to import OSM data to Postgis. You can use other programs, as you mention above, but I guess you'll get much the same results. I think the difference between the OGR2OGR tables and the osm2postgis ones is that some of the columns in the latter appear in the other_tags column. However, the data is still there, you just need slightly different queries.
Preparing data
I'm assuming you'll use pgrouting for the routing, but whatever you use, you'll need a network suitable for routing (in short, the edges have a start and end node, and the end nodes must connect with other start nodes). Pgrouting has tools to create what you need and validate it. E.g. you create integer columns source and target and the function pgr_createtopology will populate the columns for you.
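To make the idea of a routable network concrete, here is a small conceptual sketch of what populating source and target columns amounts to. It is not how pgr_createtopology is implemented; the function name create_topology and the rounding-based matching of end points are assumptions made for the illustration.

```python
def create_topology(edges, precision=6):
    """Assign integer source/target vertex ids to edges from their end points.

    `edges` maps an edge id to a list of (lon, lat) points. End points whose
    coordinates agree after rounding to `precision` decimals share a vertex id.
    """
    vertex_ids = {}
    topology = {}

    def vertex_for(point):
        key = (round(point[0], precision), round(point[1], precision))
        return vertex_ids.setdefault(key, len(vertex_ids) + 1)

    for edge_id, points in edges.items():
        topology[edge_id] = (vertex_for(points[0]), vertex_for(points[-1]))
    return topology, vertex_ids

edges = {
    10: [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)],
    11: [(1.0, 0.0), (1.0, 1.0)],
    12: [(1.0, 0.0), (2.0, 0.0)],
}
topology, vertices = create_topology(edges)
print(topology)
```

Edges 11 and 12 both start where edge 10 ends, so all three share one vertex id there, which is what lets a routing algorithm move from one edge to the next.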
OGR2OGR gives you tables "lines", "points", "multipolygons", "multilinestrings". I suggest you read up on OSM to understand exactly what is in these tables, but, roughly speaking, the lines contain your roads and the multipolygons contain, amongst other things, buildings with e.g. addresses. The addresses are in a hstore column called "other_tags".
The lines do not contain addresses! (although they do contain street names). So, if you want to do address-to-address routing you need to do some preparation. You can skip this if you can live with the street names.
Create your network (e.g. if you're routing for cars, you'll want to throw out pedestrian routes and so on)
Extract the desired addresses (including coordinates)
Either snap the addresses to the nearest node, or otherwise relate the address to the nearest node
Pgrouting will return the edges in your route, so you need the above to relate back to your addresses.
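The snapping step can be illustrated outside the database with a brute-force search. This is only a sketch with made-up address labels and coordinates; in PostGIS you would do the same thing with a nearest-neighbour query over an indexed geometry column rather than looping over every vertex.

```python
import math

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def snap_addresses(addresses, vertices):
    """Map each address id to the id of its nearest network vertex (brute force)."""
    return {
        addr_id: min(vertices, key=lambda v: haversine_m(lon, lat, *vertices[v]))
        for addr_id, (lon, lat) in addresses.items()
    }

# Hypothetical data: vertex id -> (lon, lat), address label -> (lon, lat).
vertices = {1: (14.420, 50.087), 2: (14.430, 50.090)}
addresses = {"Main St 5": (14.421, 50.0872), "Park Ave 12": (14.4295, 50.0899)}
print(snap_addresses(addresses, vertices))
```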
Routing
Your app is going to send to your server (in an as-yet unspecified way) a pair of addresses or coordinates and you need postgis to return the route. With pgrouting, that's quite easy and there are plenty of examples out there, for example here. You will need to write queries that join the output to your address table to give you the desired output.
pgrouting creates a vertices table. You can get the nearest vertex with the following query:
select id from vertices_pgr
order by the_geom <-> st_setsrid(st_point(lon,lat),4326)
limit 1
Using the output
Using WMS from GeoServer is unlikely to be a good choice - you won't have the information on individual edges without a lot of messing about. You might consider GeoJSON, which can be read by e.g. OpenLayers or Leaflet, or which you can manipulate in JavaScript. Postgres has lots of useful functions for working with JSON and GeoJSON.
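As an illustration of the GeoJSON option, each edge of a route can become one LineString feature whose properties carry the per-edge information. The input shape and the property names (seq, distance_m, name) are assumptions for this sketch; in practice they would come from your routing query joined to the address table.

```python
import json

def route_to_geojson(edges):
    """Build a GeoJSON FeatureCollection with one LineString feature per route edge."""
    features = [
        {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [list(point) for point in edge["coords"]],
            },
            "properties": {
                "seq": edge["seq"],
                "distance_m": edge["distance_m"],
                "name": edge["name"],
            },
        }
        for edge in edges
    ]
    return {"type": "FeatureCollection", "features": features}

# Hypothetical route of two edges, coordinates as (lon, lat).
route = [
    {"seq": 1, "distance_m": 120.5, "name": "Main St",
     "coords": [(14.420, 50.087), (14.421, 50.088)]},
    {"seq": 2, "distance_m": 80.0, "name": "Park Ave",
     "coords": [(14.421, 50.088), (14.422, 50.088)]},
]
print(json.dumps(route_to_geojson(route), indent=2))
```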
Conclusion
That's quite a lot of work and probably new stuff if you have little GIS knowledge, and it, er, basically recreates what you'd get from Graphhopper! Are you sure that's not a better way to go?
If you do decide to go this (or a similar) route, break things down into manageable chunks! First, figure out exactly what you're trying to achieve, then work backwards from there. If you do decide to use OSM / pgrouting, then play with the data and pgrouting first so you understand how it works before trying address matching etc.
The tools you listed are only for producing data, but I think you actually need a routing engine.
Try Graphhopper: https://graphhopper.com
Using the web API (more likely what you need), you don't need to import the data into your database. This is the easiest solution. You will have no control over the input OpenStreetMap data, but this is fine if you don't have special requirements.
Importing the data and implementing/integrating a routing engine directly in your application would be much more complicated.

consistent hashing on Multiple machines

I've read the article: http://n00tc0d3r.blogspot.com/ about the idea for consistent hashing, but I'm confused about the method on multiple machines.
The basic process is:
Insert
Hash an input long URL into a single integer;
Locate a server on the ring and store the key--longUrl pair on that server;
Compute the shortened URL using base conversion (from base 10 to base 62) and return it to the user. (How does this step work? On a single machine, there is an auto-incremented id from which to calculate the shortened URL, but what value is used to calculate the shortened URL on multiple machines? There is no auto-incremented id.)
Retrieve
Convert the shortened URL back to the key using base conversion (from base 62 to base 10);
Locate the server containing that key and return the longUrl. (And how can we locate the server containing the key?)
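One possible reading of the steps above, sketched under assumptions: the key is a hash of the long URL reduced to a fixed range, the same key is what gets base-62 encoded, and the server is found from the key's position on a hash ring. The alphabet order, the use of MD5 as a stable hash, the server names, and the 7-symbol key range are all choices made for this illustration, not something the article specifies.

```python
import bisect
import hashlib
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols

def encode_base62(number):
    """Convert a non-negative integer to a base-62 string."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

def decode_base62(text):
    """Convert a base-62 string back to the integer it encodes."""
    number = 0
    for char in text:
        number = number * 62 + ALPHABET.index(char)
    return number

def stable_hash(value):
    """Deterministic integer hash (Python's built-in hash() is salted per process)."""
    return int(hashlib.md5(str(value).encode("utf-8")).hexdigest(), 16)

class Ring:
    """Minimal hash ring with one point per server."""

    def __init__(self, servers):
        self._points = sorted((stable_hash(name), name) for name in servers)
        self._hashes = [point for point, _ in self._points]

    def locate(self, key):
        """Return the first server clockwise from the key's position, wrapping around."""
        i = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._points)
        return self._points[i][1]

ring = Ring(["server-a", "server-b", "server-c"])
key = stable_hash("https://example.com/some/long/path") % 62**7  # at most 7 symbols
short = encode_base62(key)
# Insert and retrieve both work from the same key, so they reach the same server.
print(short, ring.locate(key), ring.locate(decode_base62(short)))
```

Under this reading no auto-incremented id is needed, but two different long URLs can produce the same key, so collisions have to be handled somewhere.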
I don't see any clear answer on that page for how the author intended it. I think this is basically an exercise for the reader. Here are some ideas:
Implement it as described, with hash-table style collision resolution. That is, when creating the URL, if the key already matches something, deal with that in some way. Rehashing or an arithmetic transformation (e.g. adding 1) are both possibilities. This means, naively, a theoretical worst case of having to hit a server n times trying to find an available key.
There are a lot of ways to take that basic idea and make it smarter, e.g. just search for another available key on the same server, for instance by rehashing iteratively until you find one that's on the server.
Allow servers to talk to each other, and coordinate on the autoincrement id.
This is probably not a great solution, but it might work well in some situations: give each server (or set of servers) a separate namespace, e.g. the first 16 bits select a server. On creation, randomly choose one. Then you just need to figure out how you want that namespace to map. The namespaces only really matter for who is allowed to create what IDs, so if you want to add nodes or rebalance later, it is no big deal.
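Two of the ideas above can be sketched briefly: collision resolution by stepping to the next key, and IDs whose leading bits name the server that issued them. Both are illustrations under assumptions. The dict stands in for a single key-value store, and the 32-bit counter width and the function names are invented for the example; in a real multi-server setup each probed key may live on a different server, which is the worst case mentioned above.

```python
def insert_with_probing(store, key, long_url, key_space=62**7):
    """Store long_url under key, stepping to key + 1 while the slot holds another URL.

    Assumes the store is not full; otherwise the loop would never end.
    """
    while key in store and store[key] != long_url:
        key = (key + 1) % key_space
    store[key] = long_url
    return key

SERVER_BITS = 16   # the leading 16 bits select a server
COUNTER_BITS = 32  # assumption: width of each server's local counter

def make_id(server_index, local_counter):
    """Combine a server index and that server's own counter into one ID."""
    if not 0 <= server_index < 2**SERVER_BITS:
        raise ValueError("server index out of range")
    if not 0 <= local_counter < 2**COUNTER_BITS:
        raise ValueError("counter out of range")
    return (server_index << COUNTER_BITS) | local_counter

def server_of(identifier):
    """Recover the server index from an ID created by make_id."""
    return identifier >> COUNTER_BITS

store = {}
first = insert_with_probing(store, 5, "https://example.com/a")
second = insert_with_probing(store, 5, "https://example.com/b")  # collides, moves on
print(first, second, server_of(make_id(3, 42)))
```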
Let me know if you want more elaboration. I think there are a lot of ways that this one could go. It is annoying that the author didn't elaborate on this point; my experience with these sorts of algorithms is that collision resolution and similar problems tend to be at the very heart of a practical implementation of a distributed system.