Aggregating subannual timeseries data

The pyam package offers many tools to facilitate processing of scenario data. In this notebook, we illustrate methods to aggregate timeseries data that is given at a sub-annual resolution using timeslices (seasons, representative days, etc.).

The features for working with subannual time resolution are still experimental. The functions illustrated in this tutorial are operational and tested, but other tools, such as the plotting library, may not yet work as expected with subannual data.

Overview

This notebook illustrates the following features:

  0. Import data from file and inspect the scenario

  1. Aggregate timeseries data given at a sub-annual time resolution to a yearly value


0. Import data from file and inspect the scenario

The stylized scenario used in this tutorial contains primary-energy timeseries for two subannual timeslices, summer and winter.

[1]:
from pyam import IamDataFrame

df = IamDataFrame(data="tutorial_data_subannual_time.csv")
[INFO] 12:04:53 - pyam.core: Reading file tutorial_data_subannual_time.csv
[2]:
df.timeseries()
[2]:
                                                                 2005  2010
model    scenario  region  variable             unit  subannual
model_a  scen_a    World   Primary Energy       EJ/y  summer      3.6   4.5
                                                      winter      8.4  10.5
                           Primary Energy|Coal  EJ/y  summer      2.7   3.0
                                                      winter      6.3   7.0

1. Aggregating timeseries across sub-annual timesteps

By default, the aggregate_time() function aggregates (by summation) the data from all sub-annual timesteps (given in the column subannual) to a yearly value.

The function returns an IamDataFrame, so we can use timeseries() to display the resulting data.

[3]:
df.aggregate_time("Primary Energy").timeseries()
[3]:
                                                                 2005  2010
model    scenario  region  variable             unit  subannual
model_a  scen_a    World   Primary Energy       EJ/y  year       12.0  15.0
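
To make the arithmetic behind the default behaviour explicit, the summation can be reproduced in plain Python. This is only an illustrative sketch using the Primary Energy values of the tutorial data; the `primary_energy` dictionary and the `sum_over_timeslices` helper are hypothetical names and not part of pyam, whose actual implementation may differ.

```python
from collections import defaultdict

# Primary Energy values (EJ/y) from the tutorial data,
# keyed by (subannual timeslice, year)
primary_energy = {
    ("summer", 2005): 3.6,
    ("summer", 2010): 4.5,
    ("winter", 2005): 8.4,
    ("winter", 2010): 10.5,
}


def sum_over_timeslices(data):
    """Sum the values of all sub-annual timeslices for each year."""
    totals = defaultdict(float)
    for (_timeslice, year), value in data.items():
        totals[year] += value
    return dict(totals)


yearly = sum_over_timeslices(primary_energy)
for year, total in sorted(yearly.items()):
    # Round to hide possible floating-point noise
    print(year, round(total, 6))
```

The rounded totals, 12.0 for 2005 and 15.0 for 2010, match the yearly values returned by aggregate_time() above.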

The function also supports appending the aggregated data directly to the original IamDataFrame (append=True). You can also pass a list of variables, or pass df.variable to perform the aggregation on all timeseries data.

You can also manually set the “target” sub-annual value and the components to be aggregated; for example, this can be used to aggregate hourly data to monthly values.

Because the aggregated values are appended to the original data and all variables are aggregated, the IamDataFrame contains more rows after the following cell than the result returned by the call above.

[4]:
df.aggregate_time(
    df.variable, value="year", components=["summer", "winter"], append=True
)
[5]:
df.timeseries()
[5]:
                                                                 2005  2010
model    scenario  region  variable             unit  subannual
model_a  scen_a    World   Primary Energy       EJ/y  summer      3.6   4.5
                                                      winter      8.4  10.5
                                                      year       12.0  15.0
                           Primary Energy|Coal  EJ/y  summer      2.7   3.0
                                                      winter      6.3   7.0
                                                      year        9.0  10.0
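
The manual choice of a target value and components, together with the append behaviour, can also be sketched in plain Python. The snippet below is an illustration under stated assumptions rather than pyam's implementation: the row layout `(variable, subannual, year, value)` and the helper `aggregate_components` are hypothetical, and only the numbers are taken from the tutorial data.

```python
from collections import defaultdict

# Tutorial data as rows of (variable, subannual, year, value in EJ/y)
rows = [
    ("Primary Energy", "summer", 2005, 3.6),
    ("Primary Energy", "summer", 2010, 4.5),
    ("Primary Energy", "winter", 2005, 8.4),
    ("Primary Energy", "winter", 2010, 10.5),
    ("Primary Energy|Coal", "summer", 2005, 2.7),
    ("Primary Energy|Coal", "summer", 2010, 3.0),
    ("Primary Energy|Coal", "winter", 2005, 6.3),
    ("Primary Energy|Coal", "winter", 2010, 7.0),
]


def aggregate_components(rows, target, components):
    """Sum the rows whose timeslice is in `components`, labelled as `target`."""
    totals = defaultdict(float)
    for variable, subannual, year, value in rows:
        if subannual in components:
            totals[(variable, year)] += value
    # Round to hide possible floating-point noise
    return [
        (variable, target, year, round(total, 6))
        for (variable, year), total in sorted(totals.items())
    ]


# Hypothetical counterpart of value="year", components=["summer", "winter"]
aggregated = aggregate_components(rows, "year", ["summer", "winter"])

# Mimic append=True by extending the original rows with the aggregates
combined = rows + aggregated

print(len(rows), len(combined))
for row in aggregated:
    print(row)
```

The combined list holds 12 rows: the 8 original ones plus one yearly total per variable and year, which mirrors the larger table shown above. Passing only a subset of timeslices as `components`, with a different `target` label, would aggregate just those timeslices.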