{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Algebraic operations on timeseries data\n",
    "\n",
    "The **pyam** package offers many tools to facilitate processing of scenario data.\n",
    "In this notebook, we illustrate algebraic operations on the timeseries data of an **IamDataFrame**:\n",
    "addition, subtraction, multiplication, and division.\n",
    "\n",
    "The algebraic operations are (by default) \"unit-aware\", meaning that **pyam** tries to handle units correctly.\n",
    "This is implemented via the [iam-units](https://github.com/IAMconsortium/units) package,\n",
    "an extension of [pint](https://pint.readthedocs.io) package.\n",
    "\n",
    "The **pint** package natively handles conversion of standard (SI) units and commonly used equivalents \n",
    "(e.g., exajoule to terawatt-hours, *EJ -> TWh*), and it can parse combined units\n",
    "(e.g., exajoule per year, *EJ/yr*).\n",
    "To better support common use cases when working with energy systems analysis and integrated-assessment scenarios,\n",
    "the default [pint.UnitRegistry](https://pint.readthedocs.io/en/stable/developers_reference.html#pint.UnitRegistry)\n",
    "used by **pyam** uses the **iam-units** registry (see [IAMconsortium/units](https://github.com/IAMconsortium/units)),\n",
    "which extends the pint-defaults with a wide range of conversion factors commonly used in that domain.\n",
    "\n",
    "## Overview\n",
    "\n",
    "0. Import data from file and inspect the scenario\n",
    "1. A simple subtraction\n",
    "2. Multiplying timeseries data with scalars\n",
    "3. Calculating shares and dealing with units\n",
    "4. Overriding unit handling\n",
    "5. Working on other dimensions of timeseries data\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "\n",
    "**See Also**\n",
    "\n",
    "The **pyam** package also supports aggregation and downscaling\n",
    "along the sectoral and regional dimensions including consistency checks.\n",
    "See the [aggregation/downscaling tutorial notebook](https://pyam-iamc.readthedocs.io/en/stable/tutorials/aggregating_downscaling_consistency.html)\n",
    "for more information.\n",
    "\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0. Import data from file and inspect the scenario\n",
    "\n",
    "The stylized scenario used in this tutorial has data for two regions (`reg_a` & `reg_b`) as well as the `World` aggregate, and for categories of variables: primary energy demand, emissions, carbon price, and population."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyam import IamDataFrame\n",
    "\n",
    "df = IamDataFrame(data=\"tutorial_data_aggregating_downscaling.csv\")\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.variable"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. A simple subtraction\n",
    "\n",
    "We first display the existing variables *Primary Energy* and *Primary Energy|Coal*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.filter(variable=[\"Primary Energy\", \"Primary Energy|Coal\"]).timeseries()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we subtract fossil fuels (coal) from the total to see non-fossil energy use, and display the timeseries in wide format.\n",
    "\n",
    "All algebraic-operations functions follow the syntax:\n",
    "\n",
    "```\n",
    "df.<method>(a, b, c) => a <op> b = c\n",
    "```\n",
    "\n",
    "Note that in simple cases, **pyam** will try to keep the unit consistent during the operation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    df.subtract(\n",
    "        \"Primary Energy\", \"Primary Energy|Coal\", \"Primary Energy|Non-Fossil\"\n",
    "    ).timeseries()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also directly merge newly computed timeseries directly into the original **IamDataFrame** using the keyword argument ``append=True``.\n",
    "\n",
    "The new variable *Primary Energy|Non-Fossil* is then part of the variable list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    df.subtract(\n",
    "        \"Primary Energy\",\n",
    "        \"Primary Energy|Coal\",\n",
    "        \"Primary Energy|Non-Fossil\",\n",
    "        append=True,\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.variable"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Multiplying timeseries data with scalars\n",
    "\n",
    "The algebraic operations do not only work on items in the **IamDataFrame**, but you can also pass scalars.\n",
    "\n",
    "You will see that in more elaborate computations, **pyam** may change the notation of the units.\n",
    "In the example below, *EJ/yr* is changed to *EJ / a*.\n",
    "This is due to how the **pint** package works internally."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.multiply(\"Primary Energy\", 3, \"PE * 3\").timeseries()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also define a [pint.Quantity](https://pint.readthedocs.io/en/stable/developers_reference.html#pint.Quantity) from the **iam-units** registry\n",
    "and use this in the calculation. Note that **pyam** will (try to) correctly reduce the fraction."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from iam_units import registry\n",
    "\n",
    "q = registry.Quantity(3, \"t / EJ\")\n",
    "df.multiply(\"Primary Energy\", q, \"custom variable\").timeseries()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Calculating shares and dealing with units\n",
    "\n",
    "As a next step, we calculate the primary energy use per capita."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(df.divide(\"Primary Energy\", \"Population\", \"Energy/Capita\").timeseries())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As illustrated above, the notation of the units may be changed during the computation.\n",
    "\n",
    "If you do not like the returned units, you can change that using the [rename()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.rename) function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    df.divide(\"Primary Energy\", \"Population\", \"Energy/Capita\")\n",
    "    .rename(unit={\"EJ / a / million\": \"EJ/yr/million\"})\n",
    "    .timeseries()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or you can use the [convert_unit()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.convert_unit) function;\n",
    "see the [unit conversion tutorial notebook](https://pyam-iamc.readthedocs.io/en/stable/tutorials/unit_conversion.html) for more information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    df.divide(\"Primary Energy\", \"Population\", \"Energy/Capita\")\n",
    "    .convert_unit(\"EJ / a / million\", \"GWh/yr\")\n",
    "    .timeseries()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Overriding unit handling\n",
    "\n",
    "Even though **pint** is quite powerful, it does not always work as expected.\n",
    "For example, *Mt CO2* is (strictly speaking) not a unit, but a species indicator *CO2* combined with a unit.\n",
    "\n",
    "For illustration, computing the emissions per capita will raise a [pint.UndefinedUnitError](https://pint.readthedocs.io/en/stable/developers_reference.html#pint.errors.UndefinedUnitError).\n",
    "\n",
    "We can override this behavior by setting ``ignore_units=True``; in this case, the unit of the returned timeseries data will be set to *unknown*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    df.divide(\n",
    "        \"Emissions|CO2\", \"Population\", \"Emissions/Capita\", ignore_units=True\n",
    "    ).timeseries()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also pass a string as the ``ignore_units`` keyword argument. Then, this string will be used as unit.\n",
    "\n",
    "Seeing that the unit of emissions is *Mt CO2* and Population is given in *million*, we know that the returned value should be given in *tons of CO2*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    df.divide(\n",
    "        \"Emissions|CO2\", \"Population\", \"Emissions/Capita\", ignore_units=\"t CO2\"\n",
    "    ).timeseries()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Working on other dimensions of timeseries data\n",
    "\n",
    "By default, algebraic operations in **pyam** will work on the *variable* dimension.\n",
    "But you can pass an ``axis`` keyword argument to, for example, perform computations\n",
    "between scenarios or regions.\n",
    "\n",
    "Try it!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}