Creating a centroid column from a geometry shape field produces AttributeError: 'NoneType' object has no attribute 'centroid'

I have imported a shapefile of Australian local government areas (LGAs) into Google Colab and successfully read it into GeoPandas. The resulting geodataframe
lga_df = gpd.read_file("LGA_2016_AUST.shp")
has a geometry field with a list of polygons. I am trying to find the centroid (lat/long point) of each polygon in order to use the LGA names as labels in the map I will create.
The code I am using to create the centroid field is
lga_df["center"] = lga_df["geometry"].centroid
lga_df_points = lga_df.copy()
lga_df_points.set_geometry("center", inplace = True)
The logic is: copy the original lga_df to a new df, then set the geometry column to the newly created center-points column (because a GeoPandas df can only have one active geometry column).
However, I am getting the following message
AttributeError: 'NoneType' object has no attribute 'centroid'.
I have also tried code using representative_point(), but that doesn't work either.
I believe I have imported all the necessary dependencies (see below):
import geopandas as gpd
from geopandas import GeoDataFrame
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from shapely.geometry import Point, LineString
import adjustText as aT
I have tried for a few hours but still get the same error. Can someone advise where I am going wrong?
Update: this problem has been partially solved. The geometry field had rows with no polygon coordinates, and I have removed those rows. However, another error is now thrown: "AttributeError: 'Series' object has no attribute 'centroid'".
I am thinking that for some reason Python is not recognising my geometry field as a geometry field but as a text/string field. How do I convert a geometry column that is stored as strings back to a true geometry column?
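If the column really does hold WKT strings, a minimal sketch of the conversion (assuming GeoPandas >= 0.9 for GeoSeries.from_wkt):
import geopandas as gpd

# drop rows whose geometry is missing, then rebuild true geometries from WKT text
lga_df = lga_df[lga_df["geometry"].notna()]
lga_df["geometry"] = gpd.GeoSeries.from_wkt(lga_df["geometry"])
lga_df = lga_df.set_geometry("geometry")  # mark it as the active geometry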

Try keeping the name of the geometry field:
lga_df_points = lga_df.copy()
lga_df_points["geometry"] = lga_df_points["geometry"].centroid
I don't know why this works, but it worked for me. I guess there is a bug when changing the name of the geometry field.
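A side note beyond the answer above: if the data is in a geographic (lat/long) CRS, centroids are more accurate when computed in a projected CRS first. A sketch assuming Australian Albers (EPSG:3577) suits this dataset:
# project to a planar CRS, compute centroids, then convert back
lga_proj = lga_df.to_crs(epsg=3577)
lga_df["center"] = lga_proj.geometry.centroid.to_crs(lga_df.crs)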

Related

How to parse a shape file to use in Foundry's map application?

I am ingesting data in the form of a shapefile, for example ice data from https://usicecenter.gov/Products.
How do I use these files in Foundry, in particular for displaying them on a map?
Easy! This is outlined in the documentation on using vector data in transforms.
Clean geospatial data in Foundry is:
Tabular, so the data can be used in Spark transforms
Formatted as either a valid GeoJSON or geohash, so Geospatial data can be used in the Foundry Ontology
Projected using the EPSG:4326 CRS, so that both sides of spatial joins use the same projection and Foundry maps will render features correctly.
Foundry provides a geospatial-tools pyspark library which makes the cleaning and conversion easy. Further details are in the documentation for data parsing and cleaning, but for this specific example we need to convert the shapefile into a dataframe and then project to EPSG:4326.
The EPSG can be determined from the .prj file, using the method outlined here. For the example of the ice shapefiles:
from osgeo import osr

with open(shapeprj_path, 'r') as f:
    prj_txt = f.read()
srs = osr.SpatialReference()
srs.ImportFromESRI([prj_txt])
print(str(srs.ExportToProj4()))
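As an aside, not part of the original answer: if GDAL is not available, pyproj can parse the same .prj text (a sketch assuming pyproj >= 2):
from pyproj import CRS

crs = CRS.from_user_input(prj_txt)  # parses the ESRI WKT from the .prj file
print(crs.to_epsg())                # may print None if no exact EPSG match exists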
The output is:
+proj=lcc +lat_0=40 +lon_0=-100 +lat_1=49 +lat_2=77 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs
This is used as the input_crs:
from transforms.api import transform, Input, Output
from geospatial_tools import geospatial
from geospatial_tools.parsers import shapefile_to_dataframe
from geospatial_tools.geom_transformations import normalize_projection
@geospatial()
@transform(
    output=Output("path/to/ice_data_parsed"),
    raw=Input("path/to/ice_data_raw"),
)
def compute(raw, output):
    gdf = shapefile_to_dataframe(raw)
    gdf = normalize_projection(
        input_df=gdf,
        geometry_column="geometry",
        input_crs="+proj=lcc +lat_0=40 +lon_0=-100 +lat_1=49 +lat_2=77 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs",
    )
    output.write_dataframe(gdf)
The output dataset can then be synced to the Ontology and used in the mapping applications.

Selecting a range of columns in an sklearn ColumnTransformer

I am encoding categorical data. Many columns need to be selected; I have typed them in individually and it works OK, but there is obviously a more elegant way.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

dataset = pd.read_csv('train.csv')
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [2,5,6,7,8,9,10,11,12,13,14,15,16,21,22,23,24,25,27,28,29,30,31,32,33,34,35,39,40,41,42,53,54,55,56,57,58,60,63,64,65,72,73,74,78,79])], remainder='passthrough')
x = np.array(ct.fit_transform(x))
I have tried using (23:34) and I have tried using slice, but neither works because the column selector is not that data type.
Which method should I use for selecting a range of columns?
Also, what datatype is the data at the point where I am selecting the columns?
I searched but was not able to find a solution to this exact question.
Finally, is this an efficient way to encode categorical data, or should I be looking at an alternative method?
Thanks!
You can use the following workaround:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

ct = ColumnTransformer(
    transformers=[
        ("ordinal_enc", OrdinalEncoder(), data.loc[:, "col1":"col100"].columns)
    ])
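Another option, not in the original answer: sklearn's make_column_selector can pick columns by dtype or name pattern, avoiding hard-coded indices entirely. Note that selectors need the raw DataFrame as input, not a NumPy array:
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.preprocessing import OneHotEncoder

ct = ColumnTransformer(
    transformers=[
        ("encoder", OneHotEncoder(), make_column_selector(dtype_include=object)),
    ],
    remainder="passthrough",
)
x = ct.fit_transform(dataset.iloc[:, :-1])  # pass the DataFrame, not .values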

Save and re-load a weighted graph from OSMnx for NetworkX

I am using OSMnx to get a graph and add a new edge attribute (w3) representing a custom weight for each edge. Then I can successfully find two different shortest paths between two points using NetworkX with 'length' and 'w3'. Everything works fine; this is my code:
G = ox.graph_from_place(PLACE, network_type='all_private', retain_all=True, simplify=True, truncate_by_edge=False)
# lu, lv, lk, lw3 are parallel lists of edge u, v, key and the custom weight
w3_dict = dict(zip(zip(lu, lv, lk), lw3))
nx.set_edge_attributes(G, w3_dict, "w3")
route_1 = nx.shortest_path(G, node_start, node_stop, weight='length')
route_2 = nx.shortest_path(G, node_start, node_stop, weight='w3')
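For context, a hypothetical sketch of how such parallel lists might be built (the weight formula here is made up):
lu, lv, lk, lw3 = [], [], [], []
for u, v, k, data in G.edges(keys=True, data=True):
    lu.append(u)
    lv.append(v)
    lk.append(k)
    lw3.append(data["length"] * 2.0)  # placeholder custom weight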
Now I would like to save G to disk and reopen it, to perform more navigation tasks later on. But after saving it with:
ox.save_graph_xml(G, filepath='DATA/network.osm')
and reopen it with:
G = ox.graph_from_xml('DATA/network.osm')
my custom attribute w3 has disappeared. I have followed the instructions in the docs but with no luck. It feels like I'm missing something really obvious, but I don't understand what it is.
Use the ox.save_graphml and ox.load_graphml functions to save/load full-featured OSMnx/NetworkX graphs to/from disk for later use. The save_graph_xml function exists only to allow serialization to the .osm file format for applications that require it, and it has many constraints it must conform to.
import networkx as nx
import osmnx as ox
ox.config(use_cache=True, log_console=True)
# get a graph, set 'w3' edge attribute
G = ox.graph_from_place('Piedmont, CA, USA', network_type='drive')
nx.set_edge_attributes(G, 100, 'w3')
# save graph to disk
ox.save_graphml(G, './data/graph.graphml')
# load graph from disk and confirm 'w3' edge attribute is there
G2 = ox.load_graphml('./data/graph.graphml')
nx.get_edge_attributes(G2, 'w3')
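One caveat worth noting as an aside: GraphML stores attribute values as strings, so a custom attribute may come back as str after loading. Recent OSMnx versions accept converters on load (a sketch assuming a version that supports the edge_dtypes parameter):
G2 = ox.load_graphml('./data/graph.graphml', edge_dtypes={'w3': float})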

Plot a graph with ipycytoscape (and networkx)

Following the instructions for ipycytoscape, I am not able to plot a graph.
According to https://github.com/QuantStack/ipycytoscape/blob/master/examples/Test%20NetworkX%20methods.ipynb, this should work:
import networkx as nx
import ipycytoscape
G2 = nx.Graph()
G2.add_nodes_from([*'ABCDEF'])
G2.add_edges_from([('A','B'),('B','C'),('C','D'),('E','F')])
print(G2.nodes)
print(G2.edges)
cytoscapeobj = ipycytoscape.CytoscapeWidget()
cytoscapeobj.graph.add_graph_from_networkx(nx_graph)
G2 is a networkx example graph and it looks OK, since print(G2) gives the networkx object back and G2.nodes and G2.edges can be printed.
The error:
ValueError: invalid literal for int() with base 10: 'A'
Why should a node be an integer?
More generally, what should one do if the starting data is a pandas dataframe with a million rows of edges, each edge being a string pair like ProcessA-ProcessB, ProcessC-ProcessD, etc.?
Also, looking at the examples, note that the list of nodes is composed of a data dictionary for every node, that data including an "id" per node and also an "Attribute". The surprise here is that the networkx Graph should have all those properties.
thanks
This problem was fixed. See attachment.
Please let me know if it's still happening. Feel free to open an issue: https://github.com/QuantStack/ipycytoscape/
I'm just playing around with ipycytoscape myself, so I could be way off-base, but shouldn't the line be:
cytoscapeobj.graph.add_graph_from_networkx(G2) # your graph name goes here
Trying to generate a cytoscape object built on a graph that doesn't exist might trigger a ValueError because it can't find any nodes.
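On the follow-up about starting from a dataframe of string edges (not addressed by the answers above): networkx can build a graph with string node ids directly. A hypothetical sketch with made-up column names:
import networkx as nx
import pandas as pd

edges = pd.DataFrame({'source': ['ProcessA', 'ProcessC'],
                      'target': ['ProcessB', 'ProcessD']})
G = nx.from_pandas_edgelist(edges, source='source', target='target')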

PostGIS shapefile import problems

Hi, I'm trying to import a shapefile from
http://www.nyc.gov/html/dcp/html/bytes/bytesarchive.shtml
into a PostGIS database. The above file creates MULTIPOLYGONs when I import using shp2pgsql.
Then I'm trying to simply determine whether lat/long points are contained in my multipolygons.
However, my SELECTs are not working, and when I print out the points of my the_geom column the values look very wrong.
select st_astext(geom) from (select (st_dumppoints(the_geom)).* from nybb where borocode =1) foo;
gives the result...
st_astext
------------------------------------------
POINT(1007193.83859999 257820.786899999)
POINT(1007209.40620001 257829.435100004)
POINT(1007244.8654 257833.326199993)
POINT(1007283.3496 257839.812399998)
POINT(1007299.3502 257851.488900006)
POINT(1007320.1081 257869.218500003)
POINT(1007356.64669999 257891.055800006)
POINT(1007385.6197 257901.432999998)
POINT(1007421.94509999 257894.084000006)
POINT(1007516.85959999 257890.406100005)
POINT(1007582.59110001 257884.7861)
POINT(1007639.02150001 257877.217199996)
POINT(1007701.29170001 257872.893099993)
...
For points in NYC, these values are very far off. What am I doing wrong?
The points are not off. The spatial data referred to is NOT in lat/long; this is why the numbers differ from what you expect. If you need them in long/lat, the data must be reprojected. See more here: http://postgis.refractions.net/news/20020108/
The projection of the data appears to be the NAD_1983_StatePlane_New_York_Long_Island_FIPS_3104_Feet coordinate system (according to the metadata; see below).
<spref>
<horizsys>
<planar>
<planci>
<plance Sync="TRUE">coordinate pair</plance>
<coordrep>
<absres Sync="TRUE">0.000000</absres>
<ordres Sync="TRUE">0.000000</ordres>
</coordrep>
<plandu Sync="TRUE">survey feet</plandu>
</planci>
<mapproj>
<mapprojn Sync="TRUE">Lambert Conformal Conic</mapprojn>
<lambertc>
<stdparll Sync="TRUE">40.666667</stdparll>
<stdparll Sync="TRUE">41.033333</stdparll>
<longcm Sync="TRUE">-74.000000</longcm>
<latprjo Sync="TRUE">40.166667</latprjo>
<feast Sync="TRUE">984250.000000</feast>
<fnorth Sync="TRUE">0.000000</fnorth>
</lambertc>
</mapproj>
</planar>
<geodetic>
<horizdn Sync="TRUE">North American Datum of 1983</horizdn>
<ellips Sync="TRUE">Geodetic Reference System 80</ellips>
<semiaxis Sync="TRUE">6378137.000000</semiaxis>
<denflat Sync="TRUE">298.257222</denflat>
</geodetic>
<cordsysn>
<geogcsn Sync="TRUE">GCS_North_American_1983</geogcsn>
<projcsn Sync="TRUE">NAD_1983_StatePlane_New_York_Long_Island_FIPS_3104_Feet</projcsn>
</cordsysn>
</horizsys>
</spref>
If you work much with spatial data, I suggest that you read more about map projections.
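For a quick sanity check in Python, the state-plane coordinates can be reprojected with pyproj; a sketch assuming the metadata's coordinate system corresponds to EPSG:2263 (NAD83 / New York Long Island, US survey feet):
from pyproj import Transformer

# assumed source CRS: EPSG:2263, based on the metadata above
transformer = Transformer.from_crs("EPSG:2263", "EPSG:4326", always_xy=True)
lon, lat = transformer.transform(1007193.83859999, 257820.786899999)
print(lon, lat)  # should land inside New York City if the assumption holds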
I think this is not an issue with PostGIS. I checked the input ESRI shapefile nybb.shp with the AvisMap Free Viewer, and the points look just as odd in the source file itself.
However, there is something interesting in the nybb.shp.xml metadata file:
<spdom>
<bounding>
<westbc Sync="TRUE">-74.257465</westbc>
<eastbc Sync="TRUE">-73.699450</eastbc>
<northbc Sync="TRUE">40.915808</northbc>
<southbc Sync="TRUE">40.495805</southbc>
</bounding>
<lboundng>
<leftbc Sync="TRUE">913090.770096</leftbc>
<rightbc Sync="TRUE">1067317.219904</rightbc>
<bottombc Sync="TRUE">120053.526313</bottombc>
<topbc Sync="TRUE">272932.050103</topbc>
</lboundng>
</spdom>
I am not familiar with that toolkit (ESRI ArcCatalog), but most probably you need to rescale or reproject your points after import using that metadata.
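Alternatively, the shapefile could be reprojected before loading it into PostGIS; a sketch using GeoPandas, again assuming the source CRS is EPSG:2263:
import geopandas as gpd

gdf = gpd.read_file("nybb.shp")   # CRS is read from the accompanying .prj file
gdf = gdf.to_crs(epsg=4326)       # reproject to lat/long
gdf.to_file("nybb_4326.shp")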