Statistics from OSM data

Calculate building statistics from osmnx data.

import matplotlib.pyplot as plt
import momepy
import pandas as pd
import seaborn as sns
from cityseer.metrics import layers
from cityseer.tools import graphs, io
from matplotlib import colors
from osmnx import features
/Users/gareth/dev/cityseer-examples/.venv/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm

To start, follow the same approach as shown in the network examples to create the network.

lng, lat = -0.13396079424572427, 51.51371088849723
buffer = 1500
poly_wgs, epsg_code = io.buffered_point_poly(lng, lat, buffer)
G = io.osm_graph_from_poly(poly_wgs)
G = graphs.nx_decompose(G, 50)
G_dual = graphs.nx_to_dual(G)
nodes_gdf, _edges_gdf, network_structure = io.network_structure_from_nx(G_dual)
WARNING:cityseer.tools.io:Merging node 12450391665 into 25544116 due to identical x, y coords.
WARNING:cityseer.tools.io:Unable to parse level info: -`;-4
WARNING:cityseer.tools.io:Unable to parse level info: -`;-4
WARNING:cityseer.tools.io:Unable to parse level info: -`;-4
INFO:cityseer.tools.graphs:Generating interpolated edge geometries.
100%|██████████| 24093/24093 [00:00<00:00, 52053.70it/s]
INFO:cityseer.tools.io:Converting networkX graph to CRS code 32630.
INFO:cityseer.tools.io:Processing node x, y coordinates.
100%|██████████| 22376/22376 [00:00<00:00, 824367.32it/s]
INFO:cityseer.tools.io:Processing edge geom coordinates, if present.
100%|██████████| 24093/24093 [00:01<00:00, 17619.79it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 22376/22376 [00:00<00:00, 23764.97it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 12155/12155 [00:00<00:00, 1504508.21it/s]
100%|██████████| 12155/12155 [00:01<00:00, 10579.17it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 10438/10438 [00:00<00:00, 64006.25it/s]
100%|██████████| 7386/7386 [00:00<00:00, 306039.25it/s]
INFO:cityseer.tools.graphs:Removing dangling nodes.
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 9004/9004 [00:00<00:00, 902121.52it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 6652/6652 [00:00<00:00, 1451714.98it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 5270/5270 [00:00<00:00, 90093.47it/s]
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 6661/6661 [00:00<00:00, 52171.12it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 6612/6612 [00:00<00:00, 1670043.24it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 5270/5270 [00:00<00:00, 38631.24it/s]
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 6668/6668 [00:00<00:00, 329042.43it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 6658/6658 [00:00<00:00, 70240.20it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 5270/5270 [00:00<00:00, 147680.84it/s]
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 6659/6659 [00:00<00:00, 492746.73it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 6659/6659 [00:00<00:00, 1442882.18it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 5270/5270 [00:00<00:00, 96024.53it/s]
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 6660/6660 [00:00<00:00, 428962.91it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 5270/5270 [00:00<00:00, 61706.08it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 5270/5270 [00:00<00:00, 6486.51it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 6608/6608 [00:00<00:00, 265646.49it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 5230/5230 [00:00<00:00, 437909.69it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 5158/5158 [00:00<00:00, 61966.97it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 5158/5158 [00:06<00:00, 789.33it/s] 
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 6060/6060 [00:00<00:00, 81597.58it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 4798/4798 [00:00<00:00, 622637.62it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 4762/4762 [00:00<00:00, 58029.97it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 4762/4762 [00:01<00:00, 3430.03it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 5851/5851 [00:00<00:00, 209652.49it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 4708/4708 [00:00<00:00, 924951.20it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 4694/4694 [00:00<00:00, 26154.54it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 4694/4694 [00:02<00:00, 2148.87it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 5710/5710 [00:00<00:00, 279914.40it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 4613/4613 [00:00<00:00, 767943.02it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 4598/4598 [00:00<00:00, 56333.59it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 5687/5687 [00:00<00:00, 1612452.30it/s]
INFO:cityseer.tools.graphs:Snapping gapped endings.
100%|██████████| 4598/4598 [00:00<00:00, 18176.86it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 5888/5888 [00:00<00:00, 1752985.66it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 4598/4598 [00:00<00:00, 16131.24it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 6766/6766 [00:00<00:00, 442655.76it/s]
INFO:cityseer.tools.graphs:Removing dangling nodes.
100%|██████████| 5088/5088 [00:00<00:00, 645277.54it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 4287/4287 [00:00<00:00, 32714.52it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 4808/4808 [00:00<00:00, 1453001.92it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 3131/3131 [00:00<00:00, 11904.13it/s]
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 4848/4848 [00:00<00:00, 128808.62it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 3131/3131 [00:00<00:00, 61487.45it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 3131/3131 [00:10<00:00, 295.17it/s] 
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 3874/3874 [00:00<00:00, 44290.41it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 3678/3678 [00:00<00:00, 1361695.66it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 2472/2472 [00:00<00:00, 12715.47it/s]
INFO:cityseer.tools.graphs:Squashing opposing nodes
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 3688/3688 [00:00<00:00, 401000.47it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 2472/2472 [00:00<00:00, 56345.58it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 2472/2472 [00:06<00:00, 363.22it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 25.
100%|██████████| 3165/3165 [00:00<00:00, 56212.72it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 2028/2028 [00:00<00:00, 106089.56it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 2930/2930 [00:00<00:00, 137809.62it/s]
INFO:cityseer.tools.graphs:Ironing edges.
100%|██████████| 2911/2911 [00:00<00:00, 22077.08it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 1.
100%|██████████| 2898/2898 [00:00<00:00, 440705.30it/s]
INFO:cityseer.tools.graphs:Removing dangling nodes.
100%|██████████| 1887/1887 [00:00<00:00, 709706.93it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1869/1869 [00:00<00:00, 305108.56it/s]
INFO:cityseer.tools.graphs:Decomposing graph to maximum edge lengths of 50.
100%|██████████| 2845/2845 [00:00<00:00, 3897.41it/s]
INFO:cityseer.tools.graphs:Converting graph to dual.
INFO:cityseer.tools.graphs:Preparing dual nodes
100%|██████████| 5001/5001 [00:00<00:00, 43693.58it/s]
INFO:cityseer.tools.graphs:Preparing dual edges (splitting and welding geoms)
100%|██████████| 5001/5001 [00:05<00:00, 882.45it/s] 
INFO:cityseer.tools.io:Preparing node and edge arrays from networkX graph.
100%|██████████| 5001/5001 [00:00<00:00, 57807.11it/s]
100%|██████████| 5001/5001 [00:00<00:00, 6341.88it/s]

Prepare the buildings GeoDataFrame by downloading the data from OpenStreetMap. The osmnx features_from_polygon works well for this purpose. In this instance, we are specifically targeting features that are labelled as an building.

It is important to convert the derivative GeoDataFrame to the same CRS as the network.

data_gdf = features.features_from_polygon(poly_wgs, tags={"building": True})
data_gdf = data_gdf.to_crs(G.graph["crs"])
data_gdf.tail()
geometry addr:city addr:postcode addr:street brand brand:wikidata brand:wikipedia building email name ... name:th name:tt name:ur name:vec name:vi name:war name:wuu name:yi name:yue building:flats
element id
way 1376078547 POLYGON ((700109.929 5711088.531, 700109.345 5... NaN NaN NaN NaN NaN NaN yes NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1376143786 POLYGON ((699629.646 5710126.422, 699636.667 5... NaN NaN NaN NaN NaN NaN yes NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1376223154 POLYGON ((698717.576 5711621.834, 698721.937 5... NaN NaN NaN NaN NaN NaN yes NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1376708981 POLYGON ((699087.434 5711425.131, 699087.697 5... NaN NaN NaN NaN NaN NaN yes NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1377426683 POLYGON ((699683.471 5710178.415, 699687.2 571... NaN NaN NaN NaN NaN NaN stall NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 554 columns

Some preparatory data cleaning is typically necessary. This example extracts the particular rows and columns of interest for the subsequent steps of analysis.

bldg_way = data_gdf.loc["way"]
bldg_way = bldg_way.reset_index(level=0, drop=True)
bldg_way = bldg_way[["geometry"]]
bldg_way.head()
geometry
0 POLYGON ((699578.37 5711981.881, 699573.237 57...
1 POLYGON ((699509.329 5711958.943, 699502.917 5...
2 POLYGON ((699771.225 5711711.453, 699763.808 5...
3 POLYGON ((699408.814 5712013.541, 699453.773 5...
4 POLYGON ((698517.969 5709739.367, 698517.359 5...
bldg_rel = data_gdf.loc["relation"]
bldg_rel = bldg_rel.reset_index(level=0, drop=True)
bldg_rel = bldg_rel[["geometry"]]
bldg_rel.head()
geometry
0 POLYGON ((698284.165 5711770.625, 698310.212 5...
1 POLYGON ((699365.964 5709570.204, 699366.296 5...
2 POLYGON ((699421.233 5709690.565, 699431.714 5...
3 POLYGON ((699578.614 5709619.103, 699586.076 5...
4 POLYGON ((699708.798 5710453.42, 699717.241 57...
bldgs_gpd = pd.concat([bldg_way, bldg_rel], axis=0)
bldgs_gpd = bldgs_gpd.explode(ignore_index=True)
bldgs_gpd = bldgs_gpd.reset_index(drop=True)
bldgs_gpd.head()
geometry
0 POLYGON ((699578.37 5711981.881, 699573.237 57...
1 POLYGON ((699509.329 5711958.943, 699502.917 5...
2 POLYGON ((699771.225 5711711.453, 699763.808 5...
3 POLYGON ((699408.814 5712013.541, 699453.773 5...
4 POLYGON ((698517.969 5709739.367, 698517.359 5...
bldgs_gpd["area"] = bldgs_gpd.geometry.area
bldgs_gpd["perimeter"] = bldgs_gpd.geometry.length
bldgs_gpd["compactness"] = momepy.circular_compactness(bldgs_gpd.geometry)
bldgs_gpd["orientation"] = momepy.orientation(bldgs_gpd.geometry)
bldgs_gpd["shape_index"] = momepy.shape_index(bldgs_gpd.geometry)
bldgs_gpd.geometry.type.unique()
array(['Polygon'], dtype=object)

Use the layers.compute_stats method to compute statistics for numeric columns in the GeoDataFrame. These are specified with the stats_column_labels argument. These statistics are computed over the network using network distances. In the case of weighted variances, the contribution of any particular point is weighted by the distance from the point of measure.

distances = [100, 200]
nodes_gdf, bldgs_gpd = layers.compute_stats(
    bldgs_gpd,
    stats_column_labels=[
        "area",
        "perimeter",
        "compactness",
        "orientation",
        "shape_index",
    ],
    nodes_gdf=nodes_gdf,
    network_structure=network_structure,
    distances=distances,
)
INFO:cityseer.metrics.layers:Computing statistics.
INFO:cityseer.metrics.layers:Assigning data to network.
100%|██████████| 9767/9767 [00:00<00:00, 1000922.77it/s]
100%|██████████| 5001/5001 [00:01<00:00, 4926.64it/s]
INFO:cityseer.config:Metrics computed for:
INFO:cityseer.config:Distance: 100m, Beta: 0.04, Walking Time: 1.25 minutes.
INFO:cityseer.config:Distance: 200m, Beta: 0.02, Walking Time: 2.5 minutes.
/Users/gareth/dev/cityseer-examples/.venv/lib/python3.11/site-packages/geopandas/geodataframe.py:1819: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  super().__setitem__(key, value)
/Users/gareth/dev/cityseer-examples/.venv/lib/python3.11/site-packages/geopandas/geodataframe.py:1819: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  super().__setitem__(key, value)
/Users/gareth/dev/cityseer-examples/.venv/lib/python3.11/site-packages/geopandas/geodataframe.py:1819: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  super().__setitem__(key, value)
/Users/gareth/dev/cityseer-examples/.venv/lib/python3.11/site-packages/geopandas/geodataframe.py:1819: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  super().__setitem__(key, value)
/Users/gareth/dev/cityseer-examples/.venv/lib/python3.11/site-packages/geopandas/geodataframe.py:1819: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  super().__setitem__(key, value)
/Users/gareth/dev/cityseer-examples/.venv/lib/python3.11/site-packages/geopandas/geodataframe.py:1819: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  super().__setitem__(key, value)

This will generate a set of columns containing count, sum, min, max, mean, and var, in unweighted nw and weighted wt versions (where applicable) for each of the input distance thresholds.

nodes_gdf.columns
Index(['ns_node_idx', 'x', 'y', 'live', 'weight', 'primal_edge',
       'primal_edge_node_a', 'primal_edge_node_b', 'primal_edge_idx',
       'dual_node',
       ...
       'cc_shape_index_sum_200_nw', 'cc_shape_index_sum_200_wt',
       'cc_shape_index_mean_200_nw', 'cc_shape_index_mean_200_wt',
       'cc_shape_index_count_200_nw', 'cc_shape_index_count_200_wt',
       'cc_shape_index_var_200_nw', 'cc_shape_index_var_200_wt',
       'cc_shape_index_max_200', 'cc_shape_index_min_200'],
      dtype='object', length=110)

The result in columns can be explored with conventional Python ecosystem tools such as seaborn and matplotlib.

sns.histplot(
    data=nodes_gdf,
    x="cc_area_mean_100_wt",
    bins=50,
)

fig, ax = plt.subplots(1, 1, figsize=(8, 8), facecolor="#1d1d1d")
nodes_gdf.plot(
    column="cc_area_mean_100_wt",
    cmap="hot",
    legend=False,
    vmin=0,
    vmax=1000,
    norm=colors.LogNorm(),  # Apply log normalization
    ax=ax,
)
bldgs_gpd.plot(
    column="area",
    cmap="hot",
    legend=False,
    vmin=0,
    vmax=1000,
    alpha=0.5,
    norm=colors.LogNorm(),  # Apply log normalization
    ax=ax,
)
ax.axis(False)
(np.float64(697129.6654319772),
 np.float64(700565.4282649271),
 np.float64(5709114.974765815),
 np.float64(5712534.021335888))