Graph Cleaning

Last update

March 23, 2024

Install and update cityseer if necessary.

# !pip install --upgrade cityseer

See the guide for a preamble.

Please also see the graph corrections guide for an alternative approach.

Downloading data

This example will make use of OSM data downloaded from the OSM API. To keep things interesting, let’s pick London Soho, which will be buffered and cleaned for a 1,250m radius.

from shapely import geometry
import utm

from cityseer.tools import graphs, plot, io

# Let's download data within a 1,250m buffer around London Soho:
lng, lat = -0.13396079424572427, 51.51371088849723
# lng, lat = 2.166981, 41.389526 -- Barcelona - which is a complex case
buffer = 1250
# creates a WGS shapely polygon
poly_wgs, _ = io.buffered_point_poly(lng, lat, buffer)
# use a WGS shapely polygon to download information from OSM
# this version will not simplify
G_raw = io.osm_graph_from_poly(poly_wgs, simplify=False)
# whereas this version does simplify
G_utm = io.osm_graph_from_poly(poly_wgs)

# select extents for clipping the plotting extents
easting, northing = utm.from_latlon(lat, lng)[:2]
buff = geometry.Point(easting, northing).buffer(1000)
min_x, min_y, max_x, max_y = buff.bounds


# reusable plot function
def simple_plot(_G, plot_geoms=True):
    # plot using the selected extents
    plot.plot_nx(
        _G,
        labels=False,
        plot_geoms=plot_geoms,
        node_size=4,
        edge_width=1,
        x_lim=(min_x, max_x),
        y_lim=(min_y, max_y),
        figsize=(6, 6),
        dpi=150,
    )
INFO:cityseer.tools.io:Converting networkX graph from EPSG code 4326 to EPSG code 32630.
INFO:cityseer.tools.io:Processing node x, y coordinates.
100%|██████████| 12102/12102 [00:00<00:00, 275377.28it/s]
INFO:cityseer.tools.io:Processing edge geom coordinates, if present.
100%|██████████| 13412/13412 [00:00<00:00, 894909.41it/s]
INFO:cityseer.tools.graphs:Generating interpolated edge geometries.
100%|██████████| 13412/13412 [00:00<00:00, 73978.03it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 12102/12102 [00:01<00:00, 7822.16it/s]
INFO:cityseer.tools.io:Converting networkX graph from EPSG code 4326 to EPSG code 32630.
INFO:cityseer.tools.io:Processing node x, y coordinates.
100%|██████████| 12102/12102 [00:00<00:00, 486906.03it/s]
INFO:cityseer.tools.io:Processing edge geom coordinates, if present.
100%|██████████| 13412/13412 [00:00<00:00, 845454.49it/s]
INFO:cityseer.tools.graphs:Generating interpolated edge geometries.
100%|██████████| 13412/13412 [00:00<00:00, 16189.03it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 12102/12102 [00:02<00:00, 5710.20it/s]
INFO:cityseer.tools.graphs:Removing dangling nodes.
100%|██████████| 4092/4092 [00:00<00:00, 229732.59it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 3298/3298 [00:00<00:00, 24764.21it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 2946/2946 [00:00<00:00, 65353.33it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 2946/2946 [00:01<00:00, 2219.34it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1983/1983 [00:00<00:00, 101890.30it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 3168/3168 [00:00<00:00, 5400.99it/s] 
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 2910/2910 [00:00<00:00, 617464.70it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 1958/1958 [00:01<00:00, 1476.44it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 3016/3016 [00:00<00:00, 131997.92it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 2064/2064 [00:00<00:00, 45089.94it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 2064/2064 [00:00<00:00, 2830.37it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1793/1793 [00:00<00:00, 20507.39it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 2747/2747 [00:00<00:00, 11471.26it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1710/1710 [00:00<00:00, 89981.68it/s]
INFO:cityseer.tools.graphs:Ironing edges.
100%|██████████| 2544/2544 [00:00<00:00, 2942.02it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 1.
100%|██████████| 2544/2544 [00:00<00:00, 139942.15it/s]
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 2544/2544 [00:00<00:00, 461698.30it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 1675/1675 [00:00<00:00, 2153.87it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 2550/2550 [00:00<00:00, 97839.08it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 1681/1681 [00:00<00:00, 65441.11it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 1681/1681 [00:00<00:00, 7709.81it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1663/1663 [00:00<00:00, 287373.42it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 2523/2523 [00:00<00:00, 68375.67it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1659/1659 [00:00<00:00, 409724.45it/s]
INFO:cityseer.tools.graphs:Ironing edges.
100%|██████████| 2506/2506 [00:01<00:00, 2281.32it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 1.
100%|██████████| 2506/2506 [00:00<00:00, 103703.04it/s]

The automated graph cleaning may give satisfactory results depending on the intended end-use. See the steps following beneath for an example of how to manually clean the graph where additional control is preferred.

print("The graph before simplification.")
simple_plot(G_raw, plot_geoms=False)

print("The graph after simplification")
simple_plot(G_utm, plot_geoms=True)
INFO:cityseer.tools.plot:Preparing graph nodes
INFO:cityseer.tools.plot:Preparing graph edges
100%|██████████| 5608/5608 [00:00<00:00, 14499.80it/s]
INFO:cityseer.tools.plot:Preparing graph nodes
INFO:cityseer.tools.plot:Preparing graph edges
100%|██████████| 2506/2506 [00:00<00:00, 11882.23it/s]
The graph before simplification.
The graph after simplification

Deducing the network topology

Once OSM data has been converted to a NetworkX MultiGraph, the tools.graphs module can be used to clean the network.

The convenience method used for this demonstration has already converted the graph from a geographic WGS to projected UTM coordinate system; however, if working with a graph which is otherwise in a WGS coordinate system then it must be converted to a projected coordinate system prior to further processing. This can be done with io.nx_wgs_to_utm.

Now that raw OSM data has been loaded into a NetworkX graph, the cityseer.tools.graph methods can be used to further clean and prepare the network prior to analysis.

At this stage, the raw OSM graph is going to look a bit messy. Note how that nodes have been used to represent the roadway geometry. These nodes need to be removed and will be abstracted into shapely LineString geometries assigned to the respective street edges. So doing, the geometric representation will be kept distinct from the network topology.

# remove dangling nodes: short dead-end stubs
# these are often found at entrances to buildings or parking lots
# The removed_disconnected flag will removed isolated network components
# i.e. disconnected portions of network that are not joined to the main street network
G = graphs.nx_remove_dangling_nodes(G_raw)
simple_plot(G)
INFO:cityseer.tools.graphs:Removing dangling nodes.
100%|██████████| 4092/4092 [00:00<00:00, 194692.21it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 3298/3298 [00:00<00:00, 26554.79it/s]
INFO:cityseer.tools.plot:Preparing graph nodes
INFO:cityseer.tools.plot:Preparing graph edges
100%|██████████| 4332/4332 [00:00<00:00, 18199.01it/s]

Refining the network

Things are already looked much better, but we still have areas with large concentrations of nodes at complex intersections and many parallel roadways, which will confound centrality methods. We’ll now try to remove as much of this as possible. These steps involve the consolidation of nodes to clean-up extraneous nodes, which may otherwise exaggerate the intensity or complexity of the network in certain situations.

In this case, we’re trying to get rid of parallel road segments so we’ll do this in three steps, though it should be noted that, depending on your use-case, Step 1 may already be sufficient:

Step 1: An initial pass to cleanup complex intersections will be performed with the graphs.nx_consolidate_nodes function. The arguments passed to the parameters allow for a number of different strategies, such as whether to ‘crawl’; whether to use intersection through routes for determining new centroids; and to set the direct or indirect neighbour policies according to which nodes and edges are consolidated. These are explained more fully in the documentation.

G1 = graphs.nx_consolidate_nodes(G, buffer_dist=12, crawl=True, contains_buffer_dist=50)
simple_plot(G1)
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 2946/2946 [00:00<00:00, 74437.61it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 2946/2946 [00:01<00:00, 2764.75it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1983/1983 [00:00<00:00, 158096.62it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 3168/3168 [00:00<00:00, 11433.26it/s]
INFO:cityseer.tools.plot:Preparing graph nodes
INFO:cityseer.tools.plot:Preparing graph edges
100%|██████████| 2910/2910 [00:00<00:00, 15955.78it/s]

Complex intersections have now been simplified. In Step 2, graphs.nx_split_opposing_geoms is used to intentionally split edges in near proximity to nodes located on an adjacent parallel roadway. This helps with the final pass of consolidation in Step 3.

G2 = graphs.nx_split_opposing_geoms(G1, buffer_dist=15, contains_buffer_dist=50)
simple_plot(G2)
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 2910/2910 [00:00<00:00, 290833.86it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 1958/1958 [00:01<00:00, 1360.16it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 3016/3016 [00:00<00:00, 143093.31it/s]
INFO:cityseer.tools.plot:Preparing graph nodes
INFO:cityseer.tools.plot:Preparing graph edges
100%|██████████| 3016/3016 [00:00<00:00, 12877.87it/s]

Rerun consolidation to clean up residual clusters of nodes.

G3 = graphs.nx_consolidate_nodes(G2, buffer_dist=15, contains_buffer_dist=50)
simple_plot(G3)
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 2064/2064 [00:00<00:00, 64375.64it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 2064/2064 [00:00<00:00, 3985.15it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1793/1793 [00:00<00:00, 44076.56it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 2747/2747 [00:00<00:00, 12130.66it/s]
INFO:cityseer.tools.plot:Preparing graph nodes
INFO:cityseer.tools.plot:Preparing graph edges
100%|██████████| 2579/2579 [00:00<00:00, 17996.66it/s]

Finally, the edges are “ironed” to straighten out artefacts introduced by automated cleaning, which will sometimes bend the ends of edge segments to the locations of new centroids.

G4 = graphs.nx_remove_filler_nodes(G3)
G5 = graphs.nx_iron_edges(G4)
simple_plot(G5)
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1710/1710 [00:00<00:00, 97937.54it/s]
INFO:cityseer.tools.graphs:Ironing edges.
100%|██████████| 2544/2544 [00:01<00:00, 2113.16it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 1.
100%|██████████| 2544/2544 [00:00<00:00, 185004.32it/s]
INFO:cityseer.tools.plot:Preparing graph nodes
INFO:cityseer.tools.plot:Preparing graph edges
100%|██████████| 2544/2544 [00:00<00:00, 22932.11it/s]

An extra round of consolidation can help.

G6 = graphs.nx_split_opposing_geoms(G5, buffer_dist=15, contains_buffer_dist=50)
G6 = graphs.nx_consolidate_nodes(G6, buffer_dist=15, contains_buffer_dist=50)
G6 = graphs.nx_remove_filler_nodes(G6)
G6 = graphs.nx_iron_edges(G6)
simple_plot(G6)
INFO:cityseer.tools.util:Creating edges STR tree.
100%|██████████| 2544/2544 [00:00<00:00, 386871.74it/s]
INFO:cityseer.tools.graphs:Splitting opposing edges.
100%|██████████| 1675/1675 [00:00<00:00, 2418.66it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 2550/2550 [00:00<00:00, 129152.13it/s]
INFO:cityseer.tools.util:Creating nodes STR tree
100%|██████████| 1681/1681 [00:00<00:00, 74867.27it/s]
INFO:cityseer.tools.graphs:Consolidating nodes.
100%|██████████| 1681/1681 [00:00<00:00, 8668.60it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1663/1663 [00:00<00:00, 250264.70it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 50.
100%|██████████| 2523/2523 [00:00<00:00, 102607.59it/s]
INFO:cityseer.tools.graphs:Removing filler nodes.
100%|██████████| 1659/1659 [00:00<00:00, 150421.55it/s]
INFO:cityseer.tools.graphs:Ironing edges.
100%|██████████| 2506/2506 [00:00<00:00, 2526.77it/s]
INFO:cityseer.tools.graphs:Merging parallel edges within buffer of 1.
100%|██████████| 2506/2506 [00:00<00:00, 185299.45it/s]
INFO:cityseer.tools.plot:Preparing graph nodes
INFO:cityseer.tools.plot:Preparing graph edges
100%|██████████| 2506/2506 [00:00<00:00, 17484.99it/s]