Clustering the network#

Note

If you have not yet set up Python on your computer, you can execute this tutorial in your browser via Google Colab. Click on the rocket in the top right corner and launch “Colab”. If that doesn’t work download the .ipynb file and import it in Google Colab.

Then install the following packages by executing the following command in a Jupyter cell at the top of the notebook.

!pip install cartopy folium geopandas pandas pypsa mapclassify matplotlib numpy shapely

In this notebook, we show how both PyPSA and PyPSA-Eur handle spatial clustering of networks.

Background#

Motivation#

  • Base network contains more than 6700 buses, of which around 4900 are based on real OSM substation polygons, 1800 are “virtual” buses (the latter needed to model lines that are directly connected to other lines without substations)

  • Solving an investment and operation model at this regional scale, up to hourly resolution and including sectors beyond electricity is computationally not feasible

  • Depending on the focus region of interest, both the PyPSA framework and PyPSA-Eur model enable a variety of established clustering algorithms

  • All components within a bus region will be aggregated and mapped, accordingly

Clustering options#

PyPSA & PyPSA-Eur

  • By total number of regions, i.e.:

  • By custom busmap, i.e., providing a dictionary that maps each individual bus to a bus region/group

PyPSA-Eur

  • By administrative regions, i.e. NUTS0-NUTS3 for EU member status and ADM0-ADM1 for non-EU member states.

Underneath, all clustering methods harness the built-in clustering by busmap. Any clustering method, including HAC, k-means, or by administrative regions, calculates individual busmaps

Exploring the clustering methods#

For execution of the notebook in Google Colab, comment out and run:

# !pip install cartopy folium geopandas pandas pypsa mapclassify matplotlib numpy shapely

Note that in PyPSA-Eur, all of these steps are built into the cluster_network Snakemake rule. For illustrative purposes, we go through the underlying functions, step by step in this notebook. First, we import the example PyPSA network file and built-in functions from the PyPSA package.

import pandas as pd
import pypsa
resources = "https://raw.githubusercontent.com/resilient-project/pypsa-workshop/main/pypsa-workshop/resources/"
n = pypsa.Network(resources + "base.nc")
INFO:pypsa.io:Retrieving network data from https://raw.githubusercontent.com/resilient-project/pypsa-workshop/main/pypsa-workshop/resources/base.nc
WARNING:pypsa.io:Importing network from PyPSA version v0.33.2 while current version is v0.34.1. Read the release notes at https://pypsa.readthedocs.io/en/latest/release_notes.html to prepare your network for import.
INFO:pypsa.io:Imported network base.nc has buses, carriers, lines, links, shapes, transformers

Clustering by K-Means#

In this first example, we make a clustering based on the K-Means algorithm. We prepare a busmap weighting all original buses, equally. We want to cluster Europe to a 150 nodes.

weighting = pd.Series(1, n.buses.index)
print(weighting.head())
Bus
way/150003587-400     1
way/150003587-220     1
way/174609198-220     1
way/249066440-400     1
way/1302364573-220    1
dtype: int64

We use the imported spatial clustering functions to build the K-Means-based busmap.

busmap_kmeans = n.cluster.busmap_by_kmeans(bus_weightings=weighting, n_clusters=90)

Let’s double-check if all buses have been mapped correctly in the busmap

print(busmap_kmeans.head())
# Unique reqions
print(f"\nUnique values/regions: {len(busmap_kmeans.unique())}")
Bus
way/150003587-400     71
way/150003587-220     71
way/174609198-220     71
way/249066440-400     42
way/1302364573-220    71
dtype: object

Unique values/regions: 90

Now, use built-in PyPSA functions to cluster the network using the calculated busmap

# nc_kmeans = n.cluster.cluster_by_busmap(busmap_kmeans)

What happened? Note that PyPSA by default will throw an error when trying to cluster buses where the carrier attribute does not agree. Specifically, AC and DC buses cannot be clustered by default. So let’s correct our busmap to keep DC buses separate. Note that PyPSA-Eur has built-in steps to mitigate this.

b_dc = n.buses.loc[n.buses.carrier == "DC"].index

Then we make them unique by giving them an individual suffix.

busmap_kmeans.loc[b_dc] = busmap_kmeans.loc[b_dc].astype(str) + "_DC"

Now let’s try clustering again. Note that our number of unique bus regions has increased from the above operation.

print(f"Number of bus regions: {len(busmap_kmeans.unique())}")
Number of bus regions: 119

PyPSA will go through all attributes of the modelled components and throw an error similar to the above, if the attributes do not agree. For illustrative purposes, we ignore the remaining attributes in this notebook.

n.buses = n.buses.reindex(columns=n.components["Bus"]["attrs"].index[1:])
n.lines = n.lines.reindex(columns=n.components["Line"]["attrs"].index[1:])
n.lines["type"] = "Example"
nc_kmeans = n.cluster.cluster_by_busmap(busmap_kmeans)

Let’s explore the clustered network.

nc_kmeans.plot.explore()
INFO:pypsa.plot.maps.interactive:Components rendered on the map: Bus, Line, Link.
INFO:pypsa.plot.maps.interactive:Components omitted as they are missing or not selected: Generator, Load, StorageUnit, Transformer.
Make this Notebook Trusted to load map: File -> Trust Notebook

… and visualise them side-by-side

import cartopy.crs as ccrs
import matplotlib.pyplot as plt

fig, (ax, ax1) = plt.subplots(
    1, 2, subplot_kw={"projection": ccrs.EqualEarth()}, figsize=(12, 12)
)
plot_kwrgs = dict(bus_sizes=5e-3, line_widths=0.7, link_widths=0.7)
n.plot.map(ax=ax, title="Original network", **plot_kwrgs)
nc_kmeans.plot.map(ax=ax1, title="Clustered network using K-Means", **plot_kwrgs)
fig.tight_layout()
_images/0e7880d3ffbbb00e6f02714f0ea2a2f71bff0697d3f7e20629cdb4600024b031.png

Clustering by bidding zones#

First, we import world bidding zones from a public github repository and import it as GeoDataFrame.

import geopandas as gpd
import numpy as np

world_bz = gpd.read_file(
    "https://raw.githubusercontent.com/electricitymaps/electricitymaps-contrib/v1.238.0/web/geo/world.geojson"
)

We only need Europe in this example.

countries = [
    "AL",
    "AT",
    "BA",
    "BE",
    "BG",
    "CH",
    "CZ",
    "DE",
    "DK",
    "EE",
    "ES",
    "FI",
    "FR",
    "GB",
    "GR",
    "HR",
    "HU",
    "IE",
    "IT",
    "LT",
    "LU",
    "LV",
    "ME",
    "MK",
    "NL",
    "NO",
    "PL",
    "PT",
    "RO",
    "RS",
    "SE",
    "SI",
    "SK",
    "XK",
]
europe_bz = world_bz[world_bz["countryKey"].isin(countries)]

# Remove geometries that are south and east of Portugal
europe_bz = europe_bz[
    (europe_bz.geometry.bounds["minx"] > -12) & (europe_bz.geometry.bounds["maxy"] > 33)
]

europe_bz.head()
zoneName countryKey countryName geometry
0 SE-SE4 SE Sweden MULTIPOLYGON (((12.37512 56.91149, 12.35227 56...
1 SE-SE3 SE Sweden MULTIPOLYGON (((18.56359 60.22731, 18.32958 60...
2 SE-SE2 SE Sweden MULTIPOLYGON (((20.3096 63.65475, 20.072 63.66...
3 SE-SE1 SE Sweden MULTIPOLYGON (((16.38861 67.04356, 16.40663 67...
7 AL AL Albania MULTIPOLYGON (((20.56881 41.87367, 20.50041 42...

Let’s store all bidding zones in a variable and give them a random order to prepare for plotting.

bidding_zones = europe_bz["zoneName"].unique()
np.random.shuffle(bidding_zones)
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

# Map each countrykey to a distinct color from the colormap
cmap = plt.colormaps["rainbow"]
norm = mcolors.Normalize(vmin=0, vmax=len(bidding_zones) - 1)

color_dict = {key: cmap(norm(i)) for i, key in enumerate(bidding_zones)}
europe_bz.loc[:, "color"] = europe_bz["zoneName"].map(color_dict)
europe_bz["color"] = europe_bz["color"].apply(
    lambda x: mcolors.to_hex(x)
)  # Hex conversion optional

europe_bz.plot(color=europe_bz["color"], edgecolor="black")
plt.show()
_images/b82a38ee9aed316620884e9f12bd4674ec6564c8694797bd5208191324827ae8.png

Next, we want to create a custom busmap based on the Polygons contained in the europe_bz GeoDataFrame. For this purpose, we create a GDF from n.buses first.

# Using sjoin_nearest
buses = gpd.GeoDataFrame(
    n.buses,
    geometry=gpd.points_from_xy(n.buses.x, n.buses.y),
    crs="EPSG:4326",
)

We convert both GDFs to a distance-based projection.

distance_crs = "EPSG:3857"
buses = buses.to_crs(distance_crs)
europe_bz = europe_bz.to_crs(distance_crs)

Now we can use GeoPandas sjoin_nearest() function to map each network bus to a corresponding bidding zone.

# Using sjoin_nearest map to europe_bz
buses_mapped = gpd.sjoin_nearest(
    buses[["geometry"]],
    europe_bz[["zoneName", "geometry"]],
    how="left",
    lsuffix="bus",
    rsuffix="bz",
)

busmap_bz = buses_mapped["zoneName"]

We use PyPSA’s clustering by busmap functionalities again to cluster by our custom busmap. This time, we want all buses to be clustered together, independent of their carrier attribute. AC and DC buses should be mapped to the same geographic regions.

n.buses["carrier"] = "AC"  # We turn all DC buses into AC

nc_bz = n.cluster.cluster_by_busmap(busmap_bz)

Again, we compare them side-by-side:

# set bounds
europe_bz_latlon = europe_bz.to_crs(epsg=4326)
bounds = europe_bz_latlon.total_bounds  # returns [minx, miny, maxx, maxy]

fig, (ax, ax1) = plt.subplots(
    1, 2, subplot_kw={"projection": ccrs.EqualEarth()}, figsize=(12, 12)
)

plot_kwrgs = dict(bus_sizes=5e-3, line_widths=0.7, link_widths=0.7)
n.plot(ax=ax, title="Original network", **plot_kwrgs)

# Adding the bidding zones
europe_bz.to_crs(epsg=8857).plot(ax=ax, color=europe_bz["color"], alpha=0.3)
europe_bz.to_crs(epsg=8857).plot(ax=ax1, color=europe_bz["color"], alpha=0.3)
nc_bz.plot(ax=ax1, title="Network clustered by bidding zones", **plot_kwrgs)

ax.set_extent([bounds[0], bounds[2], bounds[1], bounds[3]], crs=ccrs.PlateCarree())
ax1.set_extent([bounds[0], bounds[2], bounds[1], bounds[3]], crs=ccrs.PlateCarree())

fig.tight_layout()
/tmp/ipykernel_2162/1969779938.py:10: DeprecatedWarning:

plot is deprecated. Use `n.plot.map()` as a drop-in replacement instead.
/tmp/ipykernel_2162/1969779938.py:15: DeprecatedWarning:

plot is deprecated. Use `n.plot.map()` as a drop-in replacement instead.
_images/1cbda580262bf19d59faf18c6aa7d71c5edb39f9c0ad879207b2b53bc3f411ab.png

Clustering by administrative regions#

PyPSA-Eur has a built-in function to set administrative clustering level for each country, individually. To get the clustered network below, set:

config.yaml

scenario:
  clusters:
  - adm

clustering:
  mode: administrative
  administrative:
    level: 0
    DE: 1
    BE: 1
    AT: 1
    CH: 2

We import the network and administrative shape file for illustrative purposes

nc_adm = pypsa.Network(resources + "base_s_adm_elec_.nc")
admin_shapes = gpd.read_file(resources + "admin_shapes.geojson")
INFO:pypsa.io:Retrieving network data from https://raw.githubusercontent.com/resilient-project/pypsa-workshop/main/pypsa-workshop/resources/base_s_adm_elec_.nc
WARNING:pypsa.io:Importing network from PyPSA version v0.33.2 while current version is v0.34.1. Read the release notes at https://pypsa.readthedocs.io/en/latest/release_notes.html to prepare your network for import.
INFO:pypsa.io:Imported network base_s_adm_elec_.nc has buses, carriers, generators, lines, links, loads, storage_units, stores
# set bounds
bounds = admin_shapes.total_bounds  # returns [minx, miny, maxx, maxy]

fig, (ax, ax1) = plt.subplots(
    1, 2, subplot_kw={"projection": ccrs.EqualEarth()}, figsize=(12, 12)
)

plot_kwrgs = dict(bus_sizes=5e-3, line_widths=0.7, link_widths=0.7)
n.plot.map(ax=ax, title="Original network", **plot_kwrgs)

# Adding the bidding zones
admin_shapes.to_crs(epsg=8857).plot(ax=ax, alpha=0.1, edgecolor="black")
admin_shapes.to_crs(epsg=8857).plot(ax=ax1, alpha=0.1, edgecolor="black")
nc_adm.plot.map(ax=ax1, title="Network clustered by NUTS regions", **plot_kwrgs)

ax.set_extent([bounds[0], bounds[2], bounds[1], bounds[3]], crs=ccrs.PlateCarree())
ax1.set_extent([bounds[0], bounds[2], bounds[1], bounds[3]], crs=ccrs.PlateCarree())

fig.tight_layout()
_images/c9ebf4b6799a1666bad4303a5d57b08b8ca954217ad1ea884b72f10462cb31fe.png

As this is a solved network, we can also inspect solutions for each individual NUTS region by their unique identifier, e.g.

nc_adm.statistics.energy_balance(
    groupby="bus", comps=["Generator", "Load", "Link", "Line", "Store", "StorageUnit"]
).filter(like="DE").unstack().T.plot(kind="bar", stacked=True)
<Axes: xlabel='bus'>
_images/5f77bc121dc39f40c1bd37d181ebfc3217739e686de86e32463dbc2907b3ebaf.png

References#

  • Frysztacki, M.M., Recht, G. & Brown, T. A comparison of clustering methods for the spatial reduction of renewable electricity optimisation models of Europe. Energy Inform 5, 4. https://doi.org/10.1186/s42162-022-00187-7 (2022).

  • Hörsch, J., Hofmann, F., Schlachtberger, D. & Brown, T. PyPSA-Eur: An open optimisation model of the European transmission system. Energy Strategy Reviews 22, 207–215, https://doi.org/10.1016/j.esr.2018.08.012 (2018).