Performing unit conversions

Converting the units of timeseries data is one of the most tedious aspects of modelling and scenario analysis, and it is a frequent source of errors!

The pyam function convert_unit() can support and simplify this task. The function uses the Python package pint, which natively handles conversion of standard (SI) units and commonly used equivalents (e.g., exajoule to terawatt-hours, EJ -> TWh). The pint package can also parse combined units (e.g., exajoule per year, EJ/yr).

To better support common use cases when working with energy systems analysis and integrated-assessment scenarios, the default pint.UnitRegistry used by pyam loads the unit definitions collected at IAMconsortium/units. This repository provides a wide range of conversion factors in a pint-compatible format so that they can easily be used across multiple applications (pyam is just one of them).

If you have suggestions for additional units to be handled in pyam by default, please start an issue in the units repository - or make a pull request!

Overview

This notebook illustrates the following features:

  0. Define timeseries data and initialize an IamDataFrame

  1. Use the default pint unit conversion

  2. Use a unit & conversion factor from the IAMconsortium/units repository

  3. Use a custom conversion factor

  4. Use contexts to specify conversion metrics

  5. More advanced use cases with a unit registry

[1]:
import pandas as pd
import pyam
pyam - INFO: Running in a notebook, setting `pyam` logging level to `logging.INFO` and adding stderr handler

0. Define timeseries data and initialize an IamDataFrame

This tutorial uses a scenario similar to the data in the first-steps tutorial (available on GitHub and on Read the Docs). Please see that tutorial for the data reference and further information.

[2]:
# Two timeseries in the IAMC format: index columns followed by one column per year
UNIT_DF = pd.DataFrame(
    [
        ['MESSAGEix-GLOBIOM 1.0', 'CD-LINKS_NPi', 'World', 'Primary Energy', 'EJ/yr', 500.74, 636.79, 809.93, 1284.78],
        ['MESSAGEix-GLOBIOM 1.0', 'CD-LINKS_NPi', 'World', 'Emissions|CH4', 'Mt CH4/yr', 327.92, 354.35, 377.88, 403.98],
    ],
    columns=pyam.IAMC_IDX + [2010, 2030, 2050, 2100],
)

df = pyam.IamDataFrame(UNIT_DF)
df.timeseries()
[2]:
                                                                      2010    2030    2050     2100
model                 scenario     region variable       unit
MESSAGEix-GLOBIOM 1.0 CD-LINKS_NPi World  Emissions|CH4  Mt CH4/yr  327.92  354.35  377.88   403.98
                                          Primary Energy EJ/yr      500.74  636.79  809.93  1284.78

1. Use the default pint unit conversion

As a first step, we illustrate unit conversion between “standard formats”, i.e., units that pint knows by default.

In this particular case, we convert exajoules per year to petawatt-hours per year, EJ/yr -> PWh/yr. Note that the timeseries data for other units (CH4 emissions in this case) are not changed.

[3]:
df.convert_unit('EJ/yr', to='PWh/yr').timeseries()
[3]:
                                                                          2010        2030        2050        2100
model                 scenario     region variable       unit
MESSAGEix-GLOBIOM 1.0 CD-LINKS_NPi World  Emissions|CH4  Mt CH4/yr  327.920000  354.350000  377.880000  403.980000
                                          Primary Energy PWh/yr     139.094444  176.886111  224.980556  356.883333
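As a sanity check on the numbers above, the conversion can be reproduced by hand: one watt-hour is 3600 joules, so 1 PWh equals 3.6 EJ. The sketch below is plain Python and does not use pyam or pint; the variable names are illustrative only.

```python
# Manual check of the EJ/yr -> PWh/yr conversion (illustrative, no pint involved).
JOULE_PER_WATT_HOUR = 3600.0  # 1 Wh = 1 W * 3600 s

# SI prefixes: exa = 1e18, peta = 1e15
EJ_IN_JOULE = 1e18
PWH_IN_JOULE = 1e15 * JOULE_PER_WATT_HOUR  # 3.6e18 J

# Factor to multiply a value in EJ by to obtain PWh (about 0.2778)
ej_to_pwh = EJ_IN_JOULE / PWH_IN_JOULE

primary_energy_ej = {2010: 500.74, 2030: 636.79, 2050: 809.93, 2100: 1284.78}
primary_energy_pwh = {
    year: value * ej_to_pwh for year, value in primary_energy_ej.items()
}

for year, value in primary_energy_pwh.items():
    print(year, round(value, 6))
```

The per-year denominator cancels, so the same factor applies to EJ/yr -> PWh/yr.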

The pint package usually does a good job at parsing orders of magnitude (peta, giga, mega, milli, …) and their abbreviations (P, G, M, m, …) as well as common units (centimeter, inch, kilometer, mile). It also handles combined units like petawatt-hours per year with various spellings: PWh/yr, PWh / yr and petawatthour / year will all be treated as synonyms by the conversion. The only difference is the format of the unit in the resulting IamDataFrame.

Read the docs for more information!

[4]:
df.convert_unit('EJ/yr', to='petawatthour / year').timeseries()
[4]:
                                                                                    2010        2030        2050        2100
model                 scenario     region variable       unit
MESSAGEix-GLOBIOM 1.0 CD-LINKS_NPi World  Emissions|CH4  Mt CH4/yr            327.920000  354.350000  377.880000  403.980000
                                          Primary Energy petawatthour / year  139.094444  176.886111  224.980556  356.883333

2. Use a unit & conversion factor from the IAMconsortium/units repository

The pint package includes many standard units, but many units often encountered in the context of energy systems analysis and integrated assessment scenarios are not defined by default.

Therefore, the IAMconsortium/units repository provides a common location to define such units. The pyam package loads these definitions and uses them by default in any unit conversion.

One entry defined there is ‘tons of coal equivalent’ (tce) as a measure of energy (content). This is used in the next cell.

[5]:
df.convert_unit('EJ/yr', to='Gtce/yr').timeseries()
[5]:
                                                                          2010        2030        2050        2100
model                 scenario     region variable       unit
MESSAGEix-GLOBIOM 1.0 CD-LINKS_NPi World  Emissions|CH4  Mt CH4/yr  327.920000  354.350000  377.880000  403.980000
                                          Primary Energy Gtce/yr     17.085437   21.727515   27.635117   43.837178
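The printed values are consistent with an energy content of roughly 29.308 GJ per ton of coal equivalent. That figure is inferred here from the output above rather than quoted from the units repository, so treat it as an assumption. The helper name in this plain-Python sketch is hypothetical.

```python
# Back-of-the-envelope check of EJ/yr -> Gtce/yr (no pint involved).
# ASSUMPTION: 1 tce = 29.308 GJ, inferred from the output shown above.
GJ_PER_TCE = 29.308


def ej_to_gtce(value_ej):
    """Convert exajoules to gigatons of coal equivalent."""
    joule = value_ej * 1e18            # exa = 1e18
    tce = joule / (GJ_PER_TCE * 1e9)   # giga = 1e9, joule per tce
    return tce / 1e9                   # tce -> Gtce


for value in (500.74, 636.79, 809.93, 1284.78):
    print(round(ej_to_gtce(value), 6))
```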

3. Use a custom conversion factor

In some cases, a user needs to specify a custom unit that is not defined in any registry. The convert_unit() function supports this via the factor keyword argument: all values are multiplied by the given factor and the unit is renamed.

[6]:
df.convert_unit('EJ/yr', to='my_unit', factor=2.3).timeseries()
[6]:
                                                                        2010      2030      2050      2100
model                 scenario     region variable       unit
MESSAGEix-GLOBIOM 1.0 CD-LINKS_NPi World  Emissions|CH4  Mt CH4/yr   327.920   354.350   377.880   403.980
                                          Primary Energy my_unit    1151.702  1464.617  1862.839  2954.994
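The effect of the factor can be verified with simple arithmetic. This plain-Python sketch only mirrors the numbers in the table above and is not how pyam implements the conversion internally.

```python
# Applying a custom factor by hand: new value = old value * factor.
factor = 2.3  # same factor as in the cell above

primary_energy = {2010: 500.74, 2030: 636.79, 2050: 809.93, 2100: 1284.78}

# Round to three decimals to hide floating-point noise
converted = {year: round(value * factor, 3) for year, value in primary_energy.items()}

print(converted)
```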

4. Use contexts to specify conversion metrics

There are unit conversions where no “default” factor exists. One such case is calculating the CO2-equivalent of CH4 emissions (or other greenhouse gases), because the conversion depends on the species’ “global warming potential” and estimates for that potential are updated regularly in the literature.

To facilitate such use cases, pint provides “contexts” to allow specifying the appropriate metric. The IAMconsortium/units repository parametrizes multiple contexts for the conversion of greenhouse gases; see the emissions module for details.

Performing a unit conversion with context is illustrated below using the IPCC AR5-GWP100 factor; in this situation, not specifying a context would result in a pint.DimensionalityError.

[7]:
df.convert_unit('Mt CH4/yr', to='Mt CO2e/yr', context='gwp_AR5GWP100').timeseries()
[7]:
                                                                        2010     2030      2050      2100
model                 scenario     region variable       unit
MESSAGEix-GLOBIOM 1.0 CD-LINKS_NPi World  Emissions|CH4  Mt CO2e/yr  9181.76  9921.80  10580.64  11311.44
                                          Primary Energy EJ/yr        500.74   636.79    809.93   1284.78
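For reference, the result above corresponds to a 100-year global warming potential of 28 for methane, the AR5 value without climate-carbon feedbacks. The lookup table and function in this plain-Python sketch are hypothetical and only reproduce the arithmetic; the actual factors used by pyam come from the units repository.

```python
# Illustrative only: apply a GWP factor by hand.
# ASSUMPTION: GWP100 of CH4 in IPCC AR5 = 28, consistent with the output above.
GWP = {"AR5GWP100": {"CH4": 28.0}}


def to_co2e(value, species, metric):
    """Convert a mass of `species` to CO2-equivalent using the given metric."""
    return value * GWP[metric][species]


ch4_emissions = {2010: 327.92, 2030: 354.35, 2050: 377.88, 2100: 403.98}
co2e = {year: round(to_co2e(v, "CH4", "AR5GWP100"), 2) for year, v in ch4_emissions.items()}

print(co2e)
```

Choosing a different metric would change every value in the row, which is why no default factor exists for this conversion.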

When working with contexts, it is important to keep track of which metric was used. This can be done either in the metadata of the resulting data (file) or directly in the unit (or variable) of the timeseries. A simple workflow is illustrated below.

[8]:
gwp = 'AR5GWP100'
target = 'Mt CO2e/yr'
(
    df.convert_unit('Mt CH4/yr', to=target, context=f'gwp_{gwp}')
    .rename(unit={target: f'{target} ({gwp})'})
    .timeseries()
)
[8]:
                                                                                    2010     2030      2050      2100
model                 scenario     region variable       unit
MESSAGEix-GLOBIOM 1.0 CD-LINKS_NPi World  Emissions|CH4  Mt CO2e/yr (AR5GWP100)  9181.76  9921.80  10580.64  11311.44
                                          Primary Energy EJ/yr                    500.74   636.79    809.93   1284.78

5. More advanced use cases with a unit registry

For more advanced use cases, pyam supports two further features: first, it can sometimes be useful to work with the UnitRegistry used by default directly. This registry can be accessed via pint.get_application_registry().

[9]:
import pint
pint.get_application_registry()
[9]:
<pint.registry.UnitRegistry at 0x7fdf18b08510>

Second, it can be helpful to use one (or several) specific registries instead of the default application registry. The convert_unit() function therefore allows passing a registry as a keyword argument.

The unit definition below reproduces the custom conversion factor from the example in section 3.

[10]:
ureg = pint.UnitRegistry()
ureg.define('my_unit = 1 / 2.3 * EJ/yr')

df.convert_unit('EJ/yr', to='my_unit', registry=ureg).timeseries()
[10]:
                                                                        2010      2030      2050      2100
model                 scenario     region variable       unit
MESSAGEix-GLOBIOM 1.0 CD-LINKS_NPi World  Emissions|CH4  Mt CH4/yr   327.920   354.350   377.880   403.980
                                          Primary Energy my_unit    1151.702  1464.617  1862.839  2954.994
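Note the direction of the definition: the string 'my_unit = 1 / 2.3 * EJ/yr' states the size of one my_unit, so the factor for converting from EJ/yr to my_unit is its reciprocal, 2.3. A minimal plain-Python sketch of that relationship, with illustrative variable names:

```python
# Size of one my_unit expressed in EJ/yr, as in the registry definition above
my_unit_in_ej_per_yr = 1 / 2.3

# Converting a quantity means dividing by the size of the target unit,
# which is the same as multiplying by 2.3
value_ej_per_yr = 500.74
value_my_unit = value_ej_per_yr / my_unit_in_ej_per_yr

print(round(value_my_unit, 3))
```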