Aggregating subannual timeseries data

The pyam package offers many tools to facilitate processing of scenario data. In this notebook, we illustrate methods to aggregate timeseries data that is given at a sub-annual resolution using timeslices (seasons, representative days, etc.).

The features for working with subannual time resolution are still experimental. The functions illustrated in this tutorial are operational and tested, but other tools such as the plotting library may not (yet) work as expected with subannual data.

Overview

This notebook illustrates the following features:

Import data from file and inspect the scenario
Aggregate timeseries data given at a sub-annual time resolution to a yearly value

0. Import data from file and inspect the scenario

The stylized scenario used in this tutorial has primary-energy timeseries data for two subannual timeslices, summer and winter.

[1]:

from pyam import IamDataFrame

df = IamDataFrame(data="tutorial_data_subannual_time.csv")

[INFO] 16:22:48 - pyam.core: Reading file tutorial_data_subannual_time.csv

[2]:

df.timeseries()

[2]:

						2005	2010
model	scenario	region	variable	unit	subannual
model_a	scen_a	World	Primary Energy	EJ/y	summer	3.6	4.5
			Primary Energy	EJ/y	winter	8.4	10.5
			Primary Energy|Coal	EJ/y	summer	2.7	3.0
			Primary Energy|Coal	EJ/y	winter	6.3	7.0

1. Aggregating timeseries across sub-annual timesteps

By default, the aggregate_time() function aggregates (by summation) the data from all sub-annual timesteps (given in the column subannual) to a yearly value.

The function returns an IamDataFrame, so we can use timeseries() to display the resulting data. Note that the warning printed with the output states that aggregate_time() is deprecated in favor of aggregate_subannual().

[3]:

df.aggregate_time("Primary Energy").timeseries()

/tmp/ipykernel_2522/3711121035.py:1: DeprecationWarning: Method `aggregate_time()` is deprecated and will be removed in future versions. Please use `aggregate_subannual()` instead.
  df.aggregate_time("Primary Energy").timeseries()

[3]:

						2005	2010
model	scenario	region	variable	unit	subannual
model_a	scen_a	World	Primary Energy	EJ/y	NaN	12.0	15.0
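
To make the arithmetic behind this result explicit, here is a minimal plain-Python sketch of a summation across timeslices. It is not pyam's implementation: the dictionary layout and the helper name sum_over_subannual are assumptions made for illustration, and the numbers are copied from the "Primary Energy" rows of the first table.

```python
from collections import defaultdict

# Values copied from the "Primary Energy" rows of the input table.
# Key: (variable, subannual timeslice); value: {year: value in EJ/y}
data = {
    ("Primary Energy", "summer"): {2005: 3.6, 2010: 4.5},
    ("Primary Energy", "winter"): {2005: 8.4, 2010: 10.5},
}


def sum_over_subannual(data, variable):
    """Sum all timeslices of one variable, year by year (hypothetical helper)."""
    totals = defaultdict(float)
    for (var, _timeslice), series in data.items():
        if var != variable:
            continue
        for year, value in series.items():
            totals[year] += value
    return dict(totals)


yearly = sum_over_subannual(data, "Primary Energy")
# Round for display to hide floating-point noise
print({year: round(value, 6) for year, value in yearly.items()})
```

The rounded totals match the 12.0 and 15.0 shown in the output above.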

The function also supports directly appending the aggregated data to the original IamDataFrame. You can also pass a list of variables, or pass df.variable to perform the aggregation on all timeseries data.

You can also explicitly set the “target” sub-annual value and the components to be aggregated; for example, this can be used to aggregate hourly data to monthly values.
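
The idea of choosing components and a target label can be sketched in plain Python as follows. This is only an illustration of the concept, not pyam code: the helper aggregate_components, the timeslice labels ("Jan-h1", "Jan-h2", "Feb-h1") and all numbers are hypothetical.

```python
def aggregate_components(series_by_slice, components, value):
    """Sum the listed timeslices and label the result with `value`.

    Hypothetical helper; timeslices not listed in `components` are ignored.
    """
    missing = [c for c in components if c not in series_by_slice]
    if missing:
        raise KeyError(f"timeslices not found: {missing}")
    years = series_by_slice[components[0]].keys()
    total = {y: sum(series_by_slice[c][y] for c in components) for y in years}
    return {value: total}


# Hypothetical hourly-style timeslices (illustrative numbers only)
hourly = {
    "Jan-h1": {2005: 1.0, 2010: 2.0},
    "Jan-h2": {2005: 1.5, 2010: 2.5},
    "Feb-h1": {2005: 3.0, 2010: 4.0},
}

# Only the two January slices contribute to the "Jan" target
monthly = aggregate_components(hourly, components=["Jan-h1", "Jan-h2"], value="Jan")
print(monthly)
```

Here the "Feb-h1" slice is left out of the sum because it is not among the components.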

Because append=True adds the aggregated rows to the original IamDataFrame, displaying its timeseries afterwards shows a larger dataset than the result of the call above.

[4]:

df.aggregate_time(
    df.variable, value="year", components=["summer", "winter"], append=True
)

/tmp/ipykernel_2522/393143967.py:1: DeprecationWarning: Method `aggregate_time()` is deprecated and will be removed in future versions. Please use `aggregate_subannual()` instead.
  df.aggregate_time(

[5]:

df.timeseries()

[5]:

						2005	2010
model	scenario	region	variable	unit	subannual
model_a	scen_a	World	Primary Energy	EJ/y	summer	3.6	4.5
					winter	8.4	10.5
					year	12.0	15.0
			Primary Energy|Coal	EJ/y	summer	2.7	3.0
					winter	6.3	7.0
					year	9.0	10.0
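
As a cross-check of the appended table, the following plain-Python sketch rebuilds it from the four original rows. It only mimics the effect of append=True and is not how pyam works internally: the tuple layout and the helper with_yearly_rows are assumptions made for illustration.

```python
# Rows copied from the original table: (variable, subannual, {year: value})
rows = [
    ("Primary Energy", "summer", {2005: 3.6, 2010: 4.5}),
    ("Primary Energy", "winter", {2005: 8.4, 2010: 10.5}),
    ("Primary Energy|Coal", "summer", {2005: 2.7, 2010: 3.0}),
    ("Primary Energy|Coal", "winter", {2005: 6.3, 2010: 7.0}),
]


def with_yearly_rows(rows, value="year", components=("summer", "winter")):
    """Return the original rows plus one aggregated row per variable."""
    totals = {}
    for variable, timeslice, series in rows:
        if timeslice not in components:
            continue
        per_year = totals.setdefault(variable, {})
        for year, amount in series.items():
            per_year[year] = per_year.get(year, 0.0) + amount
    new_rows = [(variable, value, series) for variable, series in totals.items()]
    return rows + new_rows


combined = with_yearly_rows(rows)
for variable, timeslice, series in combined:
    print(variable, timeslice, series)

# Pick out the aggregated coal row to compare against the table
coal_year = next(
    s for v, t, s in combined if v == "Primary Energy|Coal" and t == "year"
)
```

The combined list has six rows, two of them labelled "year", which mirrors the structure of the table above.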