Export Postgres table data to shapefile to build OSRM cluster - postgresql

I have imported all shapefiles (data for all 50 US states) into a Postgres/PostGIS database using QGIS. I am trying to export that data to a single shapefile, but I am getting the error below:
Failed to write a shape object. File size cannot reach 4294967068 + 248. Error writing shape 31754843
How can I generate a single shapefile of this size, given that OSRM only takes a single .osrm file to deploy in the cluster?
I have also tried merging the individual shapefiles into a single one, but that is not working for me either.
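The error is the shapefile format itself refusing to grow past roughly 4 GB per component file, so no exporter can produce one shapefile that large. Below is a hedged sketch only: it swaps in GDAL's ogr2ogr for the QGIS export, and the connection string, the table name us_roads and the state column are assumptions. The data could either be written out per state, or into a container format such as GeoPackage that has no such size cap.

```python
# Sketch: export from PostGIS with ogr2ogr, working around the ~4 GB
# limit on .shp/.dbf components. Connection details, the table name
# "us_roads" and the "state" column are assumptions.
import subprocess

PG = "PG:host=localhost dbname=gisdb user=gis password=secret"

def export_state(state_code: str) -> None:
    """Write one shapefile per state so no single .shp hits the format limit."""
    subprocess.run(
        [
            "ogr2ogr",
            "-f", "ESRI Shapefile",
            f"{state_code}_roads.shp",
            PG,
            "-sql", f"SELECT * FROM us_roads WHERE state = '{state_code}'",
        ],
        check=True,
    )

def export_geopackage() -> None:
    """Alternative: a single GeoPackage file, which has no 4 GB cap."""
    subprocess.run(
        ["ogr2ogr", "-f", "GPKG", "us_roads.gpkg", PG, "us_roads"],
        check=True,
    )

if __name__ == "__main__":
    export_state("CA")
    export_geopackage()
```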

Related

How to clip a specific area from OSM in PostgreSQL

I uploaded an OSM network to my server. Now I'm working directly in Postgres (using DBeaver), and the data is in OSM format.
The imported file covers a lot of area outside of Montreal Island. This adds a lot of extra data and slows down processing; for example, running the inner join takes about 2 hours.
Also, when I open the OSM file in JOSM, it has a lot of extra information that I don't need.
Do you know of any query that would limit and clip the data to Montreal Island only, from within DBeaver?
I know I can do this in ArcGIS, but I prefer working on the OSM data directly, mainly because bringing a shapefile into Postgres is tricky and breaks the original file. I prefer to work directly on the original file rather than importing it from somewhere else.
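A hedged sketch of one way to do this entirely in Postgres. It assumes an osm2pgsql-style import (planet_osm_* tables with a way geometry column) and that the island boundary exists as a named polygon; adjust the names to your actual schema. The SQL can be pasted straight into DBeaver; psycopg2 is only used here to keep the sketch self-contained.

```python
# Sketch: clip imported OSM data down to Montreal Island inside Postgres.
# Table names follow an osm2pgsql-style import and the boundary name is
# an assumption; adapt to your schema.
import psycopg2

CLIP_SQL = """
-- Build (or reuse) the island boundary from the imported polygons.
CREATE TABLE IF NOT EXISTS montreal_boundary AS
SELECT ST_Union(way) AS geom
FROM planet_osm_polygon
WHERE name = 'Île de Montréal';   -- assumption: boundary polygon exists under this name

-- Keep only features that intersect the boundary; repeat per table
-- (planet_osm_point, planet_osm_polygon, ...).
DELETE FROM planet_osm_line AS l
USING montreal_boundary AS b
WHERE NOT ST_Intersects(l.way, b.geom);
"""

with psycopg2.connect("dbname=osm user=postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(CLIP_SQL)
```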

Migrate from OrientDB to AWS Neptune

I need to migrate a database from OrientDB to Neptune. I have an exported JSON file from OrientDB that contains the schema (classes) and the records; I now need to import this into Neptune. However, it seems that to import data into Neptune there must be a CSV file containing all the vertices and another file containing all the edges.
Are there any existing tools to help with this migration and with converting to the required files/format?
If you are able to export the data as GraphML, then you can use the GraphML2CSV tool. It will create a CSV file for the nodes and another for the edges with the appropriate header rows.
Keep in mind that GraphML is a lossy format (it cannot describe complex Java types the way GraphSON can), but you would not be able to import those into Neptune either.
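If the GraphML route is not available, the same CSV layout can be produced directly. The sketch below is a rough illustration only: the structure of the OrientDB JSON export (records with @class, @rid, in, out fields) and the property names are assumptions, but the headers match the ~id/~label vertex file and ~id/~from/~to/~label edge file that Neptune's bulk loader expects.

```python
# Sketch: turn an OrientDB JSON export into the two CSV files Neptune's
# bulk loader expects (one for vertices, one for edges). The shape of the
# exported JSON is an assumption; adapt the field access to your export.
import csv
import json

with open("orientdb_export.json") as f:
    records = json.load(f)["records"]

with open("vertices.csv", "w", newline="") as vf, \
     open("edges.csv", "w", newline="") as ef:
    vertices = csv.writer(vf)
    edges = csv.writer(ef)
    # Neptune's Gremlin CSV headers: vertices need ~id and ~label,
    # edges need ~id, ~from, ~to and ~label.
    vertices.writerow(["~id", "~label", "name:String"])
    edges.writerow(["~id", "~from", "~to", "~label"])

    for rec in records:
        if "in" in rec and "out" in rec:      # treat as an edge record
            edges.writerow([rec["@rid"], rec["out"], rec["in"], rec["@class"]])
        else:                                  # treat as a vertex record
            vertices.writerow([rec["@rid"], rec["@class"], rec.get("name", "")])
```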

Adding new data to the neo4j graph database

I am importing a dataset of about 46K nodes into Neo4j using the import option. This dataset is dynamic, i.e. new entries keep getting added to it now and then, so having to re-run the entire import would be a waste of resources. I tried using the Neo4j REST client for Python to send queries that create the new data points, but as the number of new data points grows, this takes longer than importing the original 46K nodes. Is there any alternative for adding these data points, or do I have to redo the entire import?
First of all - 46k is rather tiny.
The easiest way to import data into Neo4j is LOAD CSV together with PERIODIC COMMIT. http://neo4j.com/developer/guide-import-csv/ contains all the details.
Be sure to have indexes in place so that an incremental update can quickly find the records that need to change.
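A hedged sketch of that advice: the label :Point, property id, file path and credentials are all assumptions, and MERGE is what makes the same load safe to re-run whenever fresh rows arrive.

```python
# Sketch: incremental load via LOAD CSV with PERIODIC COMMIT plus an index,
# following the answer above. Label, property, file name and credentials
# are assumptions.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Create the index once so MERGE can look nodes up quickly.
    # (Neo4j 3.x syntax; newer versions use CREATE INDEX ... IF NOT EXISTS.)
    session.run("CREATE INDEX ON :Point(id)").consume()

    # PERIODIC COMMIT batches the transaction; MERGE only creates what is
    # new, so the same statement can be re-run for each incremental file.
    session.run(
        "USING PERIODIC COMMIT 10000 "
        "LOAD CSV WITH HEADERS FROM 'file:///new_points.csv' AS row "
        "MERGE (p:Point {id: row.id}) "
        "SET p.value = row.value"
    ).consume()

driver.close()
```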

Postgres table as source for PDAL reader

Using the Point Data Abstraction Library (PDAL), is it possible to use an existing PostGIS table as input to a pipeline?
I've got some older lidar data stored as XYZ points in PostGIS. I'd like to move them into pointcloud patches to improve query efficiency, without exporting and re-importing them.
Alternatively, can the same be done within PostGIS/pointcloud? Is there a way to 'chip' the input points into patches within the database?
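For the in-database alternative, pgpointcloud's PC_MakePoint and PC_Patch can do the chipping in plain SQL. This is a rough sketch under stated assumptions: a source table lidar_points with a PointZ geom column, pcid 1 already registered in pointcloud_formats with X/Y/Z dimensions, and a 50 m chip size.

```python
# Sketch of the in-database "chipping" alternative using pgpointcloud.
# Source table/column layout, pcid 1 and the 50 m grid size are assumptions.
import psycopg2

CHIP_SQL = """
-- Group loose points into ~50 m square chips and aggregate each group
-- into a pointcloud patch.
CREATE TABLE lidar_patches AS
SELECT PC_Patch(PC_MakePoint(1, ARRAY[ST_X(geom), ST_Y(geom), ST_Z(geom)])) AS pa
FROM lidar_points
GROUP BY floor(ST_X(geom) / 50), floor(ST_Y(geom) / 50);
"""

with psycopg2.connect("dbname=lidar user=postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(CHIP_SQL)
```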

Deploy a Predictive model created by R in production server

I have created a random forest predictive model using R and saved it on my local machine. Now I want to deploy it on a server running Hive and use this model to predict over the full data set in my Hive data warehouse. How do I deploy this model, and how do I run it against the full Hive data set?
Can someone help me and share code for running predictions with the model I created against the full Hive data set?
I am also wondering, if I have 10 million rows, what kind of batch size I should use. Even with a batch size of 100,000 it would be very time consuming. How are others deploying to production and predicting over data sets this large?
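The question asks for R, so the following is only an illustration of the chunked-scoring pattern, written in Python with pyhive; the host, table, columns and the score() helper are hypothetical stand-ins, not the actual saved model.

```python
# Illustration of the batching concern only: stream rows out of Hive in
# fixed-size chunks and hand each chunk to a scoring function.
from pyhive import hive

BATCH_SIZE = 100_000   # the batch size asked about in the question

def score(rows):
    """Hypothetical stand-in for calling the saved random forest model."""
    return [0.0 for _ in rows]

conn = hive.Connection(host="hive-server", port=10000, username="analyst")
cur = conn.cursor()
cur.execute("SELECT id, feature_1, feature_2 FROM warehouse.scoring_input")

while True:
    batch = cur.fetchmany(BATCH_SIZE)
    if not batch:
        break
    predictions = score(batch)
    # write predictions back to Hive / HDFS here

conn.close()
```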