Aggregating subannual timeseries data
The pyam package offers many tools to facilitate processing of scenario data. In this notebook, we illustrate methods to aggregate timeseries data that is given at a sub-annual resolution using timeslices (seasons, representative days, etc.).
The features for working with subannual time resolution are still experimental. The functions illustrated in this tutorial are operational and tested, but other tools, such as the plotting library, may not yet work as expected with subannual data.
Overview
This notebook illustrates the following features:
Import data from file and inspect the scenario
Aggregate timeseries data given at a sub-annual time resolution to a yearly value
0. Import data from file and inspect the scenario
The stylized scenario used in this tutorial has primary-energy timeseries data for two subannual timeslices, summer and winter.
[1]:
from pyam import IamDataFrame
df = IamDataFrame(data="tutorial_data_subannual_time.csv")
[INFO] 12:12:46 - pyam.core: Reading file tutorial_data_subannual_time.csv
[2]:
df.timeseries()
[2]:
model | scenario | region | variable | unit | subannual | 2005 | 2010 |
---|---|---|---|---|---|---|---|
model_a | scen_a | World | Primary Energy | EJ/y | summer | 3.6 | 4.5 |
model_a | scen_a | World | Primary Energy | EJ/y | winter | 8.4 | 10.5 |
model_a | scen_a | World | Primary Energy\|Coal | EJ/y | summer | 2.7 | 3.0 |
model_a | scen_a | World | Primary Energy\|Coal | EJ/y | winter | 6.3 | 7.0 |
1. Aggregating timeseries across sub-annual timesteps
By default, the aggregate_time() function aggregates (by summation) the data from all sub-annual timesteps (given in the column subannual) to a year value.
The function returns an IamDataFrame, so we can use timeseries() to display the resulting data.
[3]:
df.aggregate_time("Primary Energy").timeseries()
[3]:
model | scenario | region | variable | unit | subannual | 2005 | 2010 |
---|---|---|---|---|---|---|---|
model_a | scen_a | World | Primary Energy | EJ/y | year | 12.0 | 15.0 |
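The summation behind this result can be reproduced by hand. The following is a minimal plain-Python sketch, not pyam's implementation: the nested-dictionary layout and the helper name `sum_timeslices` are assumptions made for illustration, and the values are copied from the timeseries table in cell [2].

```python
# Values for "Primary Energy" copied from the timeseries table above (EJ/y).
# The nested-dict layout is an illustrative assumption, not a pyam structure.
primary_energy = {
    "summer": {2005: 3.6, 2010: 4.5},
    "winter": {2005: 8.4, 2010: 10.5},
}


def sum_timeslices(data):
    """Sum the values of all timeslices for each year (hypothetical helper)."""
    totals = {}
    for values_by_year in data.values():
        for year, value in values_by_year.items():
            totals[year] = totals.get(year, 0.0) + value
    # Round to suppress floating-point noise from the summation
    return {year: round(total, 6) for year, total in totals.items()}


yearly = sum_timeslices(primary_energy)
print(yearly)  # {2005: 12.0, 2010: 15.0}
```

The totals match the year values returned by aggregate_time() for this variable.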
The function also supports directly appending the aggregated data to the original IamDataFrame. You can also pass a list of variables, or pass df.variable to perform the aggregation on all timeseries data.
A user can also manually set the “target” sub-annual value and the components to be aggregated; for example, this can then be used to process an aggregate of hourly data to monthly values.
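To illustrate the idea of a custom target and an explicit list of components, here is a plain-Python sketch that sums selected timeslices into a new label. The function `aggregate_components`, the daily labels, and the values are all hypothetical; only the keyword names `value` and `components` mirror the arguments of aggregate_time() used in this notebook.

```python
def aggregate_components(data, value, components):
    """Sum the listed component timeslices into one target timeslice.

    Hypothetical helper: `data` maps a timeslice label to a number,
    `value` is the label of the aggregate, and `components` lists the
    timeslices to include. Labels not listed are ignored.
    """
    missing = [c for c in components if c not in data]
    if missing:
        raise KeyError(f"Unknown timeslices: {missing}")
    return {value: sum(data[c] for c in components)}


# Made-up daily values; only the two January entries are aggregated
daily = {"01-01": 1.0, "01-02": 2.0, "02-01": 4.0}
january = aggregate_components(
    daily, value="January", components=["01-01", "01-02"]
)
print(january)  # {'January': 3.0}
```

Listing the components explicitly keeps unrelated timeslices (here the February entry) out of the aggregate.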
Because append=True adds the aggregated data to df in place, displaying df afterwards shows a larger dataset than the result of the call above.
[4]:
df.aggregate_time(
    df.variable, value="year", components=["summer", "winter"], append=True
)
[5]:
df.timeseries()
[5]:
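model | scenario | region | variable | unit | subannual | 2005 | 2010 |
---|---|---|---|---|---|---|---|
model_a | scen_a | World | Primary Energy | EJ/y | summer | 3.6 | 4.5 |
model_a | scen_a | World | Primary Energy | EJ/y | winter | 8.4 | 10.5 |
model_a | scen_a | World | Primary Energy | EJ/y | year | 12.0 | 15.0 |
model_a | scen_a | World | Primary Energy\|Coal | EJ/y | summer | 2.7 | 3.0 |
model_a | scen_a | World | Primary Energy\|Coal | EJ/y | winter | 6.3 | 7.0 |
model_a | scen_a | World | Primary Energy\|Coal | EJ/y | year | 9.0 | 10.0 |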
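A quick consistency check on the combined data is to confirm that each year value equals the sum of its components. The sketch below does this in plain Python on values copied from the table above; the tuple-keyed dictionary and the tolerance are illustrative assumptions and not part of pyam.

```python
import math

# (variable, subannual) -> {year: value}, copied from the table above (EJ/y)
data = {
    ("Primary Energy", "summer"): {2005: 3.6, 2010: 4.5},
    ("Primary Energy", "winter"): {2005: 8.4, 2010: 10.5},
    ("Primary Energy", "year"): {2005: 12.0, 2010: 15.0},
    ("Primary Energy|Coal", "summer"): {2005: 2.7, 2010: 3.0},
    ("Primary Energy|Coal", "winter"): {2005: 6.3, 2010: 7.0},
    ("Primary Energy|Coal", "year"): {2005: 9.0, 2010: 10.0},
}

consistent = True
for variable in ("Primary Energy", "Primary Energy|Coal"):
    for year, total in data[(variable, "year")].items():
        parts = data[(variable, "summer")][year] + data[(variable, "winter")][year]
        # Compare with a tolerance because the values are floats
        if not math.isclose(parts, total, rel_tol=1e-9):
            consistent = False
            print(f"Mismatch for {variable} in {year}: {parts} != {total}")

print(consistent)  # True
```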