Statistics from OSM data

Calculate building statistics from osmnx data.

import matplotlib.pyplot as plt
import momepy
import pandas as pd
import seaborn as sns
from cityseer.metrics import layers
from cityseer.tools import graphs, io
from matplotlib import colors
from osmnx import features
/Users/gareth/dev/benchmark-urbanism/cityseer-examples/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm

To start, follow the same approach as shown in the network examples to create the network.

lng, lat = -0.13396079424572427, 51.51371088849723
buffer = 1500
poly_wgs, epsg_code = io.buffered_point_poly(lng, lat, buffer)
G = io.osm_graph_from_poly(poly_wgs)
G = graphs.nx_decompose(G, 50)
G_dual = graphs.nx_to_dual(G)
nodes_gdf, _edges_gdf, network_structure = io.network_structure_from_nx(G_dual)
WARNING:cityseer.tools.io:Merging node 12450391665 into 25544116 due to identical x, y coords.
WARNING:cityseer.tools.io:Unable to parse level info: -`;-4
WARNING:cityseer.tools.io:Unable to parse level info: -`;-4
WARNING:cityseer.tools.io:Unable to parse level info: -`;-4
INFO:cityseer.tools.graphs:Generating interpolated edge geometries.
INFO:cityseer.tools.io:Converting networkX graph to CRS code 32630.
INFO:cityseer.tools.io:Processing node x, y coordinates.
INFO:cityseer.tools.io:Processing edge geom coordinates, if present.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.util:Creating edges STR tree.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.graphs:Removing dangling nodes.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.util:Creating edges STR tree.
INFO:cityseer.tools.graphs:Splitting opposing edges.
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.util:Creating edges STR tree.
INFO:cityseer.tools.graphs:Splitting opposing edges.
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.util:Creating edges STR tree.
INFO:cityseer.tools.graphs:Splitting opposing edges.
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.util:Creating edges STR tree.
INFO:cityseer.tools.graphs:Splitting opposing edges.
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.util:Creating nodes STR tree
INFO:cityseer.tools.graphs:Consolidating nodes.
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.util:Creating nodes STR tree
INFO:cityseer.tools.graphs:Consolidating nodes.
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.util:Creating nodes STR tree
INFO:cityseer.tools.graphs:Consolidating nodes.
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.util:Creating nodes STR tree
INFO:cityseer.tools.graphs:Consolidating nodes.
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.util:Creating nodes STR tree
INFO:cityseer.tools.util:Creating edges STR tree.
INFO:cityseer.tools.graphs:Snapping gapped endings.
INFO:cityseer.tools.util:Creating edges STR tree.
INFO:cityseer.tools.graphs:Splitting opposing edges.
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.graphs:Removing dangling nodes.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.util:Creating edges STR tree.
INFO:cityseer.tools.graphs:Splitting opposing edges.
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.util:Creating nodes STR tree
INFO:cityseer.tools.graphs:Consolidating nodes.
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.util:Creating edges STR tree.
INFO:cityseer.tools.graphs:Splitting opposing edges.
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.util:Creating nodes STR tree
INFO:cityseer.tools.graphs:Consolidating nodes.
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
INFO:cityseer.tools.graphs:Ironing edges.
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 1.
INFO:cityseer.tools.graphs:Removing dangling nodes.
INFO:cityseer.tools.graphs:Removing filler nodes.
INFO:cityseer.tools.graphs:Decomposing graph to maximum edge lengths of 50.
INFO:cityseer.tools.graphs:Converting graph to dual.
INFO:cityseer.tools.graphs:Preparing dual nodes
INFO:cityseer.tools.graphs:Preparing dual edges (splitting and welding geoms)
INFO:cityseer.tools.io:Preparing node and edge arrays from networkX graph.
INFO:cityseer.graph:Edge R-tree built successfully with 8875 items.

Prepare the buildings GeoDataFrame by downloading the data from OpenStreetMap. The osmnx features_from_polygon works well for this purpose. In this instance, we are specifically targeting features that are labelled as an building.

It is important to convert the derivative GeoDataFrame to the same CRS as the network.

data_gdf = features.features_from_polygon(poly_wgs, tags={"building": True})
data_gdf = data_gdf.to_crs(G.graph["crs"])
data_gdf.tail()
geometry addr:city addr:postcode addr:street brand brand:wikidata brand:wikipedia building email name ... name:th name:tt name:ur name:vec name:vi name:war name:wuu name:yi name:yue building:flats
element id
way 1374498886 POLYGON ((699072.276 5711987.717, 699073.651 5... London WC1H 0PD Gordon Square NaN NaN NaN kiosk NaN Garden Kiosk ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1374681344 POLYGON ((700334.357 5710842.473, 700337.684 5... NaN NaN NaN NaN NaN NaN commercial NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1374681345 POLYGON ((700347.126 5710850.783, 700337.684 5... NaN NaN NaN NaN NaN NaN yes NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1374681346 POLYGON ((700335.868 5710827.778, 700339.655 5... NaN NaN NaN NaN NaN NaN commercial NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1374820282 POLYGON ((699547.129 5710318.314, 699559.064 5... NaN NaN NaN NaN NaN NaN yes NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 554 columns

Some preparatory data cleaning is typically necessary. This example extracts the particular rows and columns of interest for the subsequent steps of analysis.

bldg_way = data_gdf.loc["way"]
bldg_way = bldg_way.reset_index(level=0, drop=True)
bldg_way = bldg_way[["geometry"]]
bldg_way.head()
geometry
0 POLYGON ((699578.37 5711981.881, 699573.237 57...
1 POLYGON ((699509.329 5711958.943, 699502.917 5...
2 POLYGON ((699771.225 5711711.453, 699763.808 5...
3 POLYGON ((699408.814 5712013.541, 699453.773 5...
4 POLYGON ((698517.969 5709739.367, 698517.359 5...
bldg_rel = data_gdf.loc["relation"]
bldg_rel = bldg_rel.reset_index(level=0, drop=True)
bldg_rel = bldg_rel[["geometry"]]
bldg_rel.head()
geometry
0 POLYGON ((698284.165 5711770.625, 698310.212 5...
1 POLYGON ((699365.964 5709570.204, 699366.296 5...
2 POLYGON ((699421.233 5709690.565, 699431.714 5...
3 POLYGON ((699578.614 5709619.103, 699586.076 5...
4 POLYGON ((699708.798 5710453.42, 699717.241 57...
bldgs_gpd = pd.concat([bldg_way, bldg_rel], axis=0)
bldgs_gpd = bldgs_gpd.explode(ignore_index=True)
bldgs_gpd = bldgs_gpd.reset_index(drop=True)
bldgs_gpd.head()
geometry
0 POLYGON ((699578.37 5711981.881, 699573.237 57...
1 POLYGON ((699509.329 5711958.943, 699502.917 5...
2 POLYGON ((699771.225 5711711.453, 699763.808 5...
3 POLYGON ((699408.814 5712013.541, 699453.773 5...
4 POLYGON ((698517.969 5709739.367, 698517.359 5...
bldgs_gpd["area"] = bldgs_gpd.geometry.area
bldgs_gpd["perimeter"] = bldgs_gpd.geometry.length
bldgs_gpd["compactness"] = momepy.circular_compactness(bldgs_gpd.geometry)
bldgs_gpd["orientation"] = momepy.orientation(bldgs_gpd.geometry)
bldgs_gpd["shape_index"] = momepy.shape_index(bldgs_gpd.geometry)
bldgs_gpd.geometry.type.unique()
array(['Polygon'], dtype=object)

Use the layers.compute_stats method to compute statistics for numeric columns in the GeoDataFrame. These are specified with the stats_column_labels argument. These statistics are computed over the network using network distances. In the case of weighted variances, the contribution of any particular point is weighted by the distance from the point of measure.

distances = [100, 200]
nodes_gdf, bldgs_gpd = layers.compute_stats(
    bldgs_gpd,
    stats_column_labels=[
        "area",
        "perimeter",
        "compactness",
        "orientation",
        "shape_index",
    ],
    nodes_gdf=nodes_gdf,
    network_structure=network_structure,
    distances=distances,
)
INFO:cityseer.metrics.layers:Computing statistics.
INFO:cityseer.metrics.layers:Assigning data to network.
INFO:cityseer.data:Assigning 9768 data entries to network nodes (max_dist: 100).
INFO:cityseer.data:Collected 89624 potential node assignments from data entries.
INFO:cityseer.data:Finished assigning data. 89624 assignments added to 4849 nodes.
INFO:cityseer.graph:Barriers unset and R-tree cleared.
INFO:cityseer.config:Metrics computed for:
INFO:cityseer.config:Distance: 100m, Beta: 0.04, Walking Time: 1.25 minutes.
INFO:cityseer.config:Distance: 200m, Beta: 0.02, Walking Time: 2.5 minutes.

This will generate a set of columns containing count, sum, min, max, mean, and var, in unweighted nw and weighted wt versions (where applicable) for each of the input distance thresholds.

nodes_gdf.columns
Index(['ns_node_idx', 'x', 'y', 'live', 'weight', 'primal_edge',
       'primal_edge_node_a', 'primal_edge_node_b', 'primal_edge_idx',
       'dual_node',
       ...
       'cc_shape_index_sum_200_nw', 'cc_shape_index_sum_200_wt',
       'cc_shape_index_mean_200_nw', 'cc_shape_index_mean_200_wt',
       'cc_shape_index_count_200_nw', 'cc_shape_index_count_200_wt',
       'cc_shape_index_var_200_nw', 'cc_shape_index_var_200_wt',
       'cc_shape_index_max_200', 'cc_shape_index_min_200'],
      dtype='object', length=110)

The result in columns can be explored with conventional Python ecosystem tools such as seaborn and matplotlib.

sns.histplot(
    data=nodes_gdf,
    x="cc_area_mean_100_wt",
    bins=50,
)

fig, ax = plt.subplots(1, 1, figsize=(8, 8), facecolor="#1d1d1d")
nodes_gdf.plot(
    column="cc_area_mean_100_wt",
    cmap="hot",
    legend=False,
    vmin=0,
    vmax=1000,
    norm=colors.LogNorm(),  # Apply log normalization
    ax=ax,
)
bldgs_gpd.plot(
    column="area",
    cmap="hot",
    legend=False,
    vmin=0,
    vmax=1000,
    alpha=0.5,
    norm=colors.LogNorm(),  # Apply log normalization
    ax=ax,
)
ax.axis(False)
(np.float64(697102.5282020274),
 np.float64(700585.1041845226),
 np.float64(5709114.861675241),
 np.float64(5712536.396237949))