import geopandas as gpd
import matplotlib.pyplot as plt
import seaborn as sns
from cityseer.metrics import layers
from cityseer.tools import graphs, io
Statistics from geopandas data
Calculate building statistics from a geopandas GeoDataFrame.
To start, follow the same approach as shown in the network examples to create the network.
# read the street network and split any multi-part geometries into single parts
streets_gpd = gpd.read_file("data/madrid_streets/street_network.gpkg")
streets_gpd = streets_gpd.explode(ignore_index=True)
# build a networkX graph, decompose to max. 50m segments, and cast to the dual graph
G = io.nx_from_generic_geopandas(streets_gpd)
G = graphs.nx_decompose(G, 50)
G_dual = graphs.nx_to_dual(G)
nodes_gdf, _edges_gdf, network_structure = io.network_structure_from_nx(G_dual)
100%|██████████| 47155/47155 [00:06<00:00, 6967.44it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 1.
100%|██████████| 47129/47129 [00:00<00:00, 183089.20it/s]
INFO:cityseer.tools.graphs:Decomposing graph to maximum edge lengths of 50.
100%|██████████| 47129/47129 [00:21<00:00, 2236.45it/s]
INFO:cityseer.tools.graphs:Converting graph to dual.
INFO:cityseer.tools.graphs:Preparing dual nodes
100%|██████████| 137778/137778 [00:03<00:00, 42819.25it/s]
INFO:cityseer.tools.graphs:Preparing dual edges (splitting and welding geoms)
100%|██████████| 137778/137778 [02:11<00:00, 1051.62it/s]
INFO:cityseer.tools.io:Preparing node and edge arrays from networkX graph.
100%|██████████| 137778/137778 [00:02<00:00, 50751.69it/s]
100%|██████████| 137778/137778 [00:21<00:00, 6268.15it/s]
Read in the buildings dataset from the source GeoPackage (or Shapefile) with geopandas.
bldgs_gpd = gpd.read_file("data/madrid_buildings/madrid_bldgs.gpkg")
bldgs_gpd.head()
 | mean_height | area | perimeter | compactness | orientation | volume | floor_area_ratio | form_factor | corners | shape_index | fractal_dimension | geometry
---|---|---|---|---|---|---|---|---|---|---|---|---
0 | NaN | 187.418714 | 58.669276 | 0.491102 | 40.235999 | NaN | NaN | NaN | 4 | 0.700787 | 1.026350 | POLYGON ((448688.642 4492911, 448678.351 44928... |
1 | 7.0 | 39.082821 | 26.992208 | 0.472874 | 10.252128 | 273.579749 | 78.165643 | 5.410857 | 4 | 0.687658 | 1.041691 | POLYGON ((440862.665 4482604.017, 440862.64 44... |
2 | 7.0 | 39.373412 | 27.050303 | 0.475086 | 10.252128 | 275.613883 | 78.746824 | 5.400665 | 4 | 0.689265 | 1.040760 | POLYGON ((440862.681 4482608.269, 440862.665 4... |
3 | 7.5 | 37.933979 | 26.739878 | 0.464266 | 10.252129 | 284.504846 | 75.867959 | 5.513124 | 4 | 0.681371 | 1.045072 | POLYGON ((440862.705 4482612.365, 440862.681 4... |
4 | 7.0 | 39.013701 | 26.972641 | 0.472468 | 10.183618 | 273.095907 | 78.027402 | 5.412350 | 4 | 0.687363 | 1.041798 | POLYGON ((440880.29 4482607.963, 440880.274 44... |
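As a quick optional check before aggregating, the numeric columns that will be passed to the stats method can be summarised with pandas; the column names below are taken from the table above.

# optional: summary statistics for the building columns that will be aggregated
bldgs_gpd[["area", "perimeter", "compactness", "orientation", "shape_index"]].describe()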
Use the `layers.compute_stats` method to compute statistics for numeric columns in the GeoDataFrame; these are specified with the `stats_column_labels` argument. The statistics are computed over the network using network distances, and for the distance-weighted variants the contribution of each point is weighted by its network distance from the point of measure.
distances = [100, 200]
nodes_gdf, bldgs_gpd = layers.compute_stats(
    bldgs_gpd,
    stats_column_labels=[
        "area",
        "perimeter",
        "compactness",
        "orientation",
        "shape_index",
    ],
    nodes_gdf=nodes_gdf,
    network_structure=network_structure,
    distances=distances,
)
INFO:cityseer.metrics.layers:Computing statistics.
INFO:cityseer.metrics.layers:Assigning data to network.
100%|██████████| 135302/135302 [00:00<00:00, 761483.31it/s]
100%|██████████| 137778/137778 [02:06<00:00, 1089.38it/s]
INFO:cityseer.config:Metrics computed for:
INFO:cityseer.config:Distance: 100m, Beta: 0.04, Walking Time: 1.25 minutes.
INFO:cityseer.config:Distance: 200m, Beta: 0.02, Walking Time: 2.5 minutes.
/Users/gareth/dev/cityseer-examples/.venv/lib/python3.11/site-packages/geopandas/geodataframe.py:1819: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
super().__setitem__(key, value)
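A note on the log output above: with cityseer's default settings, each distance threshold maps to an exponential decay parameter of roughly beta = 4 / distance and to a walking time of distance / 80 metres per minute, which is how the reported 0.04 / 1.25 minutes and 0.02 / 2.5 minutes values arise. The following is only a back-of-envelope check of the logged values, not part of the workflow.

# rough reproduction of the logged Beta and Walking Time values
# (assumes cityseer's default decay cutoff and ~80 m/minute walking speed)
for d in [100, 200]:
    print(f"{d}m -> beta {4 / d:.2f}, walking time {d / 80} minutes")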
This will generate a set of columns containing `count`, `sum`, `min`, `max`, `mean`, and `var` aggregations, in unweighted (`nw`) and distance-weighted (`wt`) versions (where applicable), for each of the input distance thresholds.
nodes_gdf.columns
Index(['ns_node_idx', 'x', 'y', 'live', 'weight', 'primal_edge',
'primal_edge_node_a', 'primal_edge_node_b', 'primal_edge_idx',
'dual_node',
...
'cc_shape_index_sum_200_nw', 'cc_shape_index_sum_200_wt',
'cc_shape_index_mean_200_nw', 'cc_shape_index_mean_200_wt',
'cc_shape_index_count_200_nw', 'cc_shape_index_count_200_wt',
'cc_shape_index_var_200_nw', 'cc_shape_index_var_200_wt',
'cc_shape_index_max_200', 'cc_shape_index_min_200'],
dtype='object', length=110)
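For example, the aggregation columns generated for a single input label can be selected by prefix; the snippet below assumes the cc_<label>_<stat>_<distance>_<nw|wt> naming pattern visible in the index above.

# list the columns generated for the "area" label
area_cols = [col for col in nodes_gdf.columns if col.startswith("cc_area_")]
area_cols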
The resulting columns can be explored with conventional Python ecosystem tools such as `seaborn` and `matplotlib`.
sns.histplot(
    data=nodes_gdf,
    x="cc_orientation_mean_200_wt",
    bins=50,
)
fig, ax = plt.subplots(1, 1, figsize=(8, 8), facecolor="#1d1d1d")
nodes_gdf.plot(
    column="cc_orientation_mean_200_wt",
    cmap="Dark2",
    legend=False,
    vmin=0,
    vmax=45,
    ax=ax,
)
bldgs_gpd.plot(
    column="orientation",
    cmap="Dark2",
    legend=False,
    vmin=0,
    vmax=45,
    alpha=0.5,
    ax=ax,
)
ax.set_xlim(438500, 438500 + 3500)
ax.set_ylim(4472500, 4472500 + 3500)
ax.axis(False)
(np.float64(438500.0),
np.float64(442000.0),
np.float64(4472500.0),
np.float64(4476000.0))
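Finally, if the aggregated measures are needed downstream, the nodes GeoDataFrame can be written out with geopandas in the usual way. The path below is only an example, and any non-serialisable object columns (for instance secondary geometry columns such as primal_edge) may first need to be dropped or converted.

# example only: write the nodes and their computed statistics to a GeoPackage
nodes_gdf.drop(columns=["primal_edge"], errors="ignore").to_file(
    "temp/madrid_stats.gpkg", driver="GPKG"
)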