{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Algebraic operations on timeseries data\n", "\n", "The **pyam** package offers many tools to facilitate processing of scenario data.\n", "In this notebook, we illustrate algebraic operations on the timeseries data of an **IamDataFrame**:\n", "addition, subtraction, multiplication, and division.\n", "\n", "The algebraic operations are (by default) \"unit-aware\", meaning that **pyam** tries to handle units correctly.\n", "This is implemented via the [iam-units](https://github.com/IAMconsortium/units) package,\n", "an extension of [pint](https://pint.readthedocs.io) package.\n", "\n", "The **pint** package natively handles conversion of standard (SI) units and commonly used equivalents \n", "(e.g., exajoule to terawatt-hours, *EJ -> TWh*), and it can parse combined units\n", "(e.g., exajoule per year, *EJ/yr*).\n", "To better support common use cases when working with energy systems analysis and integrated-assessment scenarios,\n", "the default [pint.UnitRegistry](https://pint.readthedocs.io/en/stable/developers_reference.html#pint.UnitRegistry)\n", "used by **pyam** uses the **iam-units** registry (see [IAMconsortium/units](https://github.com/IAMconsortium/units)),\n", "which extends the pint-defaults with a wide range of conversion factors commonly used in that domain.\n", "\n", "## Overview\n", "\n", "0. Import data from file and inspect the scenario\n", "1. A simple subtraction\n", "2. Multiplying timeseries data with scalars\n", "3. Calculating shares and dealing with units\n", "4. Overriding unit handling\n", "5. Working on other dimensions of timeseries data\n", "\n", "
\n", "\n", "**See Also**\n", "\n", "The **pyam** package also supports aggregation and downscaling\n", "along the sectoral and regional dimensions including consistency checks.\n", "See the [aggregation/downscaling tutorial notebook](https://pyam-iamc.readthedocs.io/en/stable/tutorials/aggregating_downscaling_consistency.html)\n", "for more information.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0. Import data from file and inspect the scenario\n", "\n", "The stylized scenario used in this tutorial has data for two regions (`reg_a` & `reg_b`) as well as the `World` aggregate, and for categories of variables: primary energy demand, emissions, carbon price, and population." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pyam import IamDataFrame\n", "\n", "df = IamDataFrame(data=\"tutorial_data_aggregating_downscaling.csv\")\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df.variable" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. A simple subtraction\n", "\n", "We first display the existing variables *Primary Energy* and *Primary Energy|Coal*." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df.filter(variable=[\"Primary Energy\", \"Primary Energy|Coal\"]).timeseries()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we subtract fossil fuels (coal) from the total to see non-fossil energy use, and display the timeseries in wide format.\n", "\n", "All algebraic-operations functions follow the syntax:\n", "\n", "```\n", "df.(a, b, c) => a b = c\n", "```\n", "\n", "Note that in simple cases, **pyam** will try to keep the unit consistent during the operation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(\n", " df.subtract(\n", " \"Primary Energy\", \"Primary Energy|Coal\", \"Primary Energy|Non-Fossil\"\n", " ).timeseries()\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also directly merge newly computed timeseries directly into the original **IamDataFrame** using the keyword argument ``append=True``.\n", "\n", "The new variable *Primary Energy|Non-Fossil* is then part of the variable list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(\n", " df.subtract(\n", " \"Primary Energy\",\n", " \"Primary Energy|Coal\",\n", " \"Primary Energy|Non-Fossil\",\n", " append=True,\n", " )\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df.variable" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Multiplying timeseries data with scalars\n", "\n", "The algebraic operations do not only work on items in the **IamDataFrame**, but you can also pass scalars.\n", "\n", "You will see that in more elaborate computations, **pyam** may change the notation of the units.\n", "In the example below, *EJ/yr* is changed to *EJ / a*.\n", "This is due to how the **pint** package works internally." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df.multiply(\"Primary Energy\", 3, \"PE * 3\").timeseries()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also define a [pint.Quantity](https://pint.readthedocs.io/en/stable/developers_reference.html#pint.Quantity) from the **iam-units** registry\n", "and use this in the calculation. Note that **pyam** will (try to) correctly reduce the fraction." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from iam_units import registry\n", "\n", "q = registry.Quantity(3, \"t / EJ\")\n", "df.multiply(\"Primary Energy\", q, \"custom variable\").timeseries()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Calculating shares and dealing with units\n", "\n", "As a next step, we calculate the primary energy use per capita." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(df.divide(\"Primary Energy\", \"Population\", \"Energy/Capita\").timeseries())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As illustrated above, the notation of the units may be changed during the computation.\n", "\n", "If you do not like the returned units, you can change that using the [rename()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.rename) function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(\n", " df.divide(\"Primary Energy\", \"Population\", \"Energy/Capita\")\n", " .rename(unit={\"EJ / a / million\": \"EJ/yr/million\"})\n", " .timeseries()\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or you can use the [convert_unit()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.convert_unit) function;\n", "see the [unit conversion tutorial notebook](https://pyam-iamc.readthedocs.io/en/stable/tutorials/unit_conversion.html) for more information." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(\n", " df.divide(\"Primary Energy\", \"Population\", \"Energy/Capita\")\n", " .convert_unit(\"EJ / a / million\", \"GWh/yr\")\n", " .timeseries()\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Overriding unit handling\n", "\n", "Even though **pint** is quite powerful, it does not always work as expected.\n", "For example, *Mt CO2* is (strictly speaking) not a unit, but a species indicator *CO2* combined with a unit.\n", "\n", "For illustration, computing the emissions per capita will raise a [pint.UndefinedUnitError](https://pint.readthedocs.io/en/stable/developers_reference.html#pint.errors.UndefinedUnitError).\n", "\n", "We can override this behavior by setting ``ignore_units=True``; in this case, the unit of the returned timeseries data will be set to *unknown*." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(\n", " df.divide(\n", " \"Emissions|CO2\", \"Population\", \"Emissions/Capita\", ignore_units=True\n", " ).timeseries()\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also pass a string as the ``ignore_units`` keyword argument. Then, this string will be used as unit.\n", "\n", "Seeing that the unit of emissions is *Mt CO2* and Population is given in *million*, we know that the returned value should be given in *tons of CO2*." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(\n", " df.divide(\n", " \"Emissions|CO2\", \"Population\", \"Emissions/Capita\", ignore_units=\"t CO2\"\n", " ).timeseries()\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Working on other dimensions of timeseries data\n", "\n", "By default, algebraic operations in **pyam** will work on the *variable* dimension.\n", "But you can pass an ``axis`` keyword argument to, for example, perform computations\n", "between scenarios or regions.\n", "\n", "Try it!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.0" } }, "nbformat": 4, "nbformat_minor": 2 }