Algebraic operations on timeseries data

The pyam package offers many tools to facilitate processing of scenario data. In this notebook, we illustrate algebraic operations on the timeseries data of an IamDataFrame: addition, subtraction, multiplication, and division.

The algebraic operations are (by default) “unit-aware”, meaning that pyam tries to handle units correctly. This is implemented via the iam-units package, an extension of the pint package.

The pint package natively handles conversion of standard (SI) units and commonly used equivalents (e.g., exajoule to terawatt-hours, EJ -> TWh), and it can parse combined units (e.g., exajoule per year, EJ/yr). To better support common use cases in energy-systems analysis and integrated-assessment scenarios, the default pint.UnitRegistry used by pyam is the iam-units registry (see IAMconsortium/units), which extends the pint defaults with a wide range of conversion factors commonly used in that domain.
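
As a rough check of the EJ -> TWh example, the conversion factor can be derived by hand from the SI definitions (1 EJ = 10^18 J, 1 TWh = 3.6 x 10^15 J). The sketch below uses plain Python rather than pint; the helper name ej_to_twh is hypothetical and not part of pyam, pint or iam-units.

```python
# Hand-derived conversion, independent of pint (illustration only).
JOULE_PER_EJ = 1e18           # exa = 10^18
JOULE_PER_TWH = 1e12 * 3600   # tera = 10^12, and 1 Wh = 3600 J


def ej_to_twh(value_ej):
    """Convert an energy amount from exajoule to terawatt-hours."""
    return value_ej * JOULE_PER_EJ / JOULE_PER_TWH


print(round(ej_to_twh(1), 2))  # 277.78
```

A unit registry automates this kind of bookkeeping, so that the factor does not have to be typed by hand in each calculation.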

Overview

Import data from file and inspect the scenario
A simple subtraction
Multiplying timeseries data with scalars
Calculating shares and dealing with units
Overriding unit handling
Working on other dimensions of timeseries data

See Also

The pyam package also supports aggregation and downscaling along the sectoral and regional dimensions including consistency checks. See the aggregation/downscaling tutorial notebook for more information.

0. Import data from file and inspect the scenario

The stylized scenario used in this tutorial has data for two regions (reg_a & reg_b) as well as the World aggregate, and four categories of variables: primary energy demand, emissions, carbon price, and population.

[1]:

from pyam import IamDataFrame

df = IamDataFrame(data="tutorial_data_aggregating_downscaling.csv")
df

[INFO] 18:43:51 - pyam.core: Reading file tutorial_data_aggregating_downscaling.csv

[1]:

<class 'pyam.core.IamDataFrame'>
Index:
 * model    : model_a (1)
 * scenario : scen_a (1)
Timeseries data coordinates:
   region   : World, reg_a, reg_b (3)
   variable : Emissions|CO2, Emissions|CO2|AFOLU, ... Primary Energy|Wind (9)
   unit     : EJ/yr, Mt CO2, USD/t CO2, million (4)
   year     : 2005, 2010 (2)

[2]:

df.variable

[2]:

['Emissions|CO2',
 'Emissions|CO2|AFOLU',
 'Emissions|CO2|Bunkers',
 'Emissions|CO2|Energy',
 'Population',
 'Price|Carbon',
 'Primary Energy',
 'Primary Energy|Coal',
 'Primary Energy|Wind']

1. A simple subtraction

We first display the existing variables Primary Energy and Primary Energy|Coal.

[3]:

df.filter(variable=["Primary Energy", "Primary Energy|Coal"]).timeseries()

[3]:

					2005	2010
model	scenario	region	variable	unit
model_a	scen_a	World	Primary Energy	EJ/yr	12.0	15.0
		World	Primary Energy|Coal	EJ/yr	9.0	10.0
		reg_a	Primary Energy	EJ/yr	8.0	9.0
		reg_a	Primary Energy|Coal	EJ/yr	6.0	6.0
		reg_b	Primary Energy	EJ/yr	4.0	6.0
		reg_b	Primary Energy|Coal	EJ/yr	3.0	4.0

Now, we subtract fossil fuels (coal) from the total to see non-fossil energy use, and display the timeseries in wide format.

All algebraic-operations functions follow the syntax:

df.<method>(a, b, c) => a <op> b = c

Note that in simple cases, pyam will try to keep the unit consistent during the operation.

[4]:

(
    df.subtract(
        "Primary Energy", "Primary Energy|Coal", "Primary Energy|Non-Fossil"
    ).timeseries()
)

[4]:

					2005	2010
model	scenario	region	variable	unit
model_a	scen_a	World	Primary Energy|Non-Fossil	EJ/yr	3.0	5.0
		reg_a	Primary Energy|Non-Fossil	EJ/yr	2.0	3.0
		reg_b	Primary Energy|Non-Fossil	EJ/yr	1.0	2.0
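
The rule a <op> b = c can be sketched in plain Python, using the World values from the tables above. This only illustrates the semantics; the nested-dict layout and the helper name subtract are assumptions made for this sketch and do not reflect how pyam stores or computes data internally.

```python
# World values copied from the tables above, keyed by variable and year.
world = {
    "Primary Energy": {2005: 12.0, 2010: 15.0},
    "Primary Energy|Coal": {2005: 9.0, 2010: 10.0},
}


def subtract(data, a, b, name):
    """Return {name: a - b}, computed year by year."""
    return {name: {year: data[a][year] - data[b][year] for year in data[a]}}


result = subtract(
    world, "Primary Energy", "Primary Energy|Coal", "Primary Energy|Non-Fossil"
)
print(result)
# {'Primary Energy|Non-Fossil': {2005: 3.0, 2010: 5.0}}
```

The result matches the World row of the output above: the first two arguments select existing items, and the third names the newly computed one.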

We can also merge the newly computed timeseries directly into the original IamDataFrame using the keyword argument append=True.

The new variable Primary Energy|Non-Fossil is then part of the variable list.

[5]:

(
    df.subtract(
        "Primary Energy",
        "Primary Energy|Coal",
        "Primary Energy|Non-Fossil",
        append=True,
    )
)

[6]:

df.variable

[6]:

['Emissions|CO2',
 'Emissions|CO2|AFOLU',
 'Emissions|CO2|Bunkers',
 'Emissions|CO2|Energy',
 'Population',
 'Price|Carbon',
 'Primary Energy',
 'Primary Energy|Coal',
 'Primary Energy|Non-Fossil',
 'Primary Energy|Wind']

2. Multiplying timeseries data with scalars

The algebraic operations work not only on items in the IamDataFrame; you can also pass scalars as arguments.

You will see that in more elaborate computations, pyam may change the notation of the units. In the example below, EJ/yr is changed to EJ / a. This is due to how the pint package works internally.

[7]:

df.multiply("Primary Energy", 3, "PE * 3").timeseries()

[7]:

					2005	2010
model	scenario	region	variable	unit
model_a	scen_a	World	PE * 3	EJ / a	36.0	45.0
		reg_a	PE * 3	EJ / a	24.0	27.0
		reg_b	PE * 3	EJ / a	12.0	18.0

You can also define a pint.Quantity from the iam-units registry and use this in the calculation. Note that pyam will (try to) correctly reduce the fraction.

[8]:

from iam_units import registry

q = registry.Quantity(3, "t / EJ")
df.multiply("Primary Energy", q, "custom variable").timeseries()

[8]:

					2005	2010
model	scenario	region	variable	unit
model_a	scen_a	World	custom variable	t / a	36.0	45.0
		reg_a	custom variable	t / a	24.0	27.0
		reg_b	custom variable	t / a	12.0	18.0
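
The idea of "reducing the fraction" can be sketched without pint by representing each unit as a mapping from symbol to exponent. Multiplying two units then adds the exponents, and symbols whose exponent becomes zero cancel out. The representation and the helper name multiply_units are assumptions for illustration only; pint uses its own internal data structures.

```python
def multiply_units(left, right):
    """Multiply two units given as {symbol: exponent}; cancelled symbols are dropped."""
    combined = dict(left)
    for symbol, exponent in right.items():
        combined[symbol] = combined.get(symbol, 0) + exponent
    return {symbol: exp for symbol, exp in combined.items() if exp != 0}


ej_per_year = {"EJ": 1, "a": -1}  # unit of "Primary Energy" (EJ / a)
t_per_ej = {"t": 1, "EJ": -1}     # unit of the quantity q (t / EJ)

print(multiply_units(ej_per_year, t_per_ej))
# {'a': -1, 't': 1}
```

The EJ terms cancel and t / a remains, which is the unit shown in the output above.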

3. Calculating shares and dealing with units

As a next step, we calculate the primary energy use per capita.

[9]:

(df.divide("Primary Energy", "Population", "Energy/Capita").timeseries())

[9]:

					2005	2010
model	scenario	region	variable	unit
model_a	scen_a	World	Energy/Capita	EJ / million / a	4.000000	3.0
		reg_a	Energy/Capita	EJ / million / a	5.333333	3.6
		reg_b	Energy/Capita	EJ / million / a	2.666667	2.4

As illustrated above, the notation of the units may be changed during the computation.

If you do not like the returned units, you can change that using the rename() function.

[10]:

(
    df.divide("Primary Energy", "Population", "Energy/Capita")
    .rename(unit={"EJ / million / a": "EJ/yr/million"})
    .timeseries()
)

[10]:

					2005	2010
model	scenario	region	variable	unit
model_a	scen_a	World	Energy/Capita	EJ/yr/million	4.000000	3.0
		reg_a	Energy/Capita	EJ/yr/million	5.333333	3.6
		reg_b	Energy/Capita	EJ/yr/million	2.666667	2.4

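Or you can use the convert_unit() function; see the unit conversion tutorial notebook for more information. The current unit must be given exactly as it appears in the data (here EJ / million / a). In the cell below, the string passed is EJ / a / million, which does not match; this appears to be why the output still shows the original unit and values.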

[11]:

(
    df.divide("Primary Energy", "Population", "Energy/Capita")
    .convert_unit("EJ / a / million", "GWh/yr")
    .timeseries()
)

[11]:

					2005	2010
model	scenario	region	variable	unit
model_a	scen_a	World	Energy/Capita	EJ / million / a	4.000000	3.0
		reg_a	Energy/Capita	EJ / million / a	5.333333	3.6
		reg_b	Energy/Capita	EJ / million / a	2.666667	2.4

4. Overriding unit handling

Even though pint is quite powerful, it does not always work as expected. For example, Mt CO2 is (strictly speaking) not a unit, but a species indicator CO2 combined with a unit.

For illustration, computing the emissions per capita will raise a pint.UndefinedUnitError.

We can override this behavior by setting ignore_units=True; in this case, the unit of the returned timeseries data will be set to unknown.

[12]:

(
    df.divide(
        "Emissions|CO2", "Population", "Emissions/Capita", ignore_units=True
    ).timeseries()
)

[12]:

					2005	2010
model	scenario	region	variable	unit
model_a	scen_a	World	Emissions/Capita	unknown	3.333333	2.8
		reg_a	Emissions/Capita	unknown	4.000000	3.2
		reg_b	Emissions/Capita	unknown	2.000000	1.6

You can also pass a string as the ignore_units keyword argument. Then, this string will be used as unit.

Since the unit of emissions is Mt CO2 and Population is given in million, the two factors of one million cancel, so the returned values are in tonnes of CO2 per capita.

[13]:

(
    df.divide(
        "Emissions|CO2", "Population", "Emissions/Capita", ignore_units="t CO2"
    ).timeseries()
)

[13]:

					2005	2010
model	scenario	region	variable	unit
model_a	scen_a	World	Emissions/Capita	t CO2	3.333333	2.8
		reg_a	Emissions/Capita	t CO2	4.000000	3.2
		reg_b	Emissions/Capita	t CO2	2.000000	1.6
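
The choice of t CO2 can be verified with a short calculation: the prefix mega (10^6) in the numerator cancels against the million (10^6) in the denominator. The sketch below uses plain Python with hypothetical input values that are not taken from the tutorial data; the helper name per_capita_tonnes is likewise an assumption.

```python
MEGA = 1e6     # "Mt" stands for 10^6 tonnes
MILLION = 1e6  # population is reported in millions of people


def per_capita_tonnes(emissions_mt, population_million):
    """Emissions per person in tonnes, given Mt and million people."""
    return (emissions_mt * MEGA) / (population_million * MILLION)


# Hypothetical inputs: 10 Mt CO2 emitted by 4 million people.
print(per_capita_tonnes(10.0, 4.0))  # 2.5
```

Because the two factors cancel, the numerical values returned with ignore_units can be read directly as tonnes per person, provided the input units are indeed Mt and million.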

5. Working on other dimensions of timeseries data

By default, algebraic operations in pyam work on the variable dimension. You can pass an axis keyword argument to perform computations along another dimension instead, for example between scenarios or regions.
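
What "working along an axis" means can be sketched in plain Python: the two named items are looked up on the chosen dimension, and values are paired wherever all other dimensions match. The tuple-keyed dictionary, the DIMENSIONS constant and the helper subtract_along are assumptions for this illustration and are not pyam code; the numbers are the Primary Energy values for reg_a and reg_b from section 1.

```python
# Primary Energy values from section 1, keyed by (region, variable, year).
DIMENSIONS = ("region", "variable", "year")
data = {
    ("reg_a", "Primary Energy", 2005): 8.0,
    ("reg_a", "Primary Energy", 2010): 9.0,
    ("reg_b", "Primary Energy", 2005): 4.0,
    ("reg_b", "Primary Energy", 2010): 6.0,
}


def subtract_along(data, a, b, name, axis):
    """Compute a - b for items a and b on `axis`, matching all other dimensions."""
    pos = DIMENSIONS.index(axis)
    result = {}
    for key, value in data.items():
        if key[pos] != a:
            continue
        # Build the key of the partner row: same coordinates, but item b on the axis.
        partner = key[:pos] + (b,) + key[pos + 1:]
        if partner in data:
            result[key[:pos] + (name,) + key[pos + 1:]] = value - data[partner]
    return result


diff = subtract_along(data, "reg_a", "reg_b", "reg_a - reg_b", axis="region")
print(diff)
# {('reg_a - reg_b', 'Primary Energy', 2005): 4.0, ('reg_a - reg_b', 'Primary Energy', 2010): 3.0}
```

In pyam, the analogous step is to pass axis="region" to the algebraic method, with the two region names as the first arguments; consult the API reference for the exact signature and for how units are handled in that case.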

Try it!

[ ]: