Integration with R

This tutorial illustrates how to use pyam features in an R workflow using the reticulate package.

The source code of the tutorial is available in the folder docs/R_tutorials of the pyam GitHub repository.

Installation and requirements

Assuming you have a working and current R installation, you can follow the steps below to install the requirements to run this tutorial:

  1. Install Python using Anaconda
    This is the recommended approach for users who aren’t already experienced with Python…
  2. Install pyam (read the docs for more information)

    conda install -c conda-forge pyam
    
  3. To use R in a Jupyter notebook as in this tutorial, install Jupyter and the IRkernel via conda

    conda install jupyter r-irkernel
    
  4. In R, install the reticulate package from CRAN

    install.packages("reticulate")
    
This tutorial illustrates how to get started; it is not full-fledged documentation of a pyam-R interface…
The integration between Python and R using reticulate is a bit fickle: having multiple Python installations or environments, or outdated versions of reticulate or pandas, can cause issues.
Please read the docs if you experience any problems!

This notebook was run with R version 4.0 and Python 3.8 on macOS.


Developer note: Running this notebook on CI and RTD is currently not supported.
For the time being, this notebook is not executed by nbsphinx and has to be saved with its output.

Setting up the session

[1]:
require(reticulate)

Loading required package: reticulate

The next cell instructs reticulate to use the conda ‘base’ environment.
Read the docs if you want to use an alternative installation or environment!

[2]:
use_condaenv()
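
Because several Python installations can coexist on one machine, it can help to confirm which one reticulate actually picked up. The following is a minimal sketch, not part of the original notebook, using the reticulate functions py_config() and py_module_available(). It assumes it is run after the use_condaenv() call above, since py_config() initializes Python and the binding cannot be changed afterwards within the same R session.

```r
library(reticulate)

# Report the Python binary, version and NumPy location that reticulate is bound to
py_config()

# TRUE if the active Python can import pyam, FALSE otherwise
pyam_found <- py_module_available("pyam")
print(pyam_found)
```

If the last line prints FALSE, pyam is most likely installed in a different environment than the one reticulate is using.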

Now, we import pyam to the R session.

[3]:
pyam <- import("pyam")

Passing an R dataframe to pyam

The first cell of this section creates a simple timeseries data table following the IAMC format as an R dataframe.

[4]:
data <- data.frame(
  model = c("model_a", "model_a", "model_a"),
  scenario = c("scen_a", "scen_a", "scen_b"),
  region = c("World", "World", "World"),
  variable = c("Primary Energy", "Primary Energy|Coal", "Primary Energy"),
  unit = c("EJ/yr", "EJ/yr", "EJ/yr"),
  "2005" = c(1, 0.5, 2),
  "2010" = c(6, 3, 7)
)
data

A data.frame: 3 × 7
model    scenario  region  variable             unit   X2005  X2010
<chr>    <chr>     <chr>   <chr>                <chr>  <dbl>  <dbl>
model_a  scen_a    World   Primary Energy       EJ/yr    1.0      6
model_a  scen_a    World   Primary Energy|Coal  EJ/yr    0.5      3
model_a  scen_b    World   Primary Energy       EJ/yr    2.0      7
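
The X2005 and X2010 column headers in the output above come from R, not from pyam: by default, data.frame() repairs column names that are not valid R identifiers (check.names = TRUE), which adds an X prefix to names starting with a digit. The base-R sketch below only illustrates that naming behaviour. Judging by the years shown in the IamDataFrame overview in the next cell, pyam appears to read the prefixed names as years, so the tutorial does not need check.names = FALSE.

```r
# Default behaviour: names that are not valid R identifiers are repaired
default_names <- names(
  data.frame("2005" = c(1, 0.5, 2), "2010" = c(6, 3, 7))
)

# check.names = FALSE keeps the column names exactly as given
raw_names <- names(
  data.frame("2005" = c(1, 0.5, 2), "2010" = c(6, 3, 7), check.names = FALSE)
)

print(default_names)  # "X2005" "X2010"
print(raw_names)      # "2005" "2010"
```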

The following cell casts the data to a pyam.IamDataFrame.

When migrating code from Python to R, keep in mind that you have to use $ instead of . to access an object’s attributes and call its methods!

[5]:
df <- pyam$IamDataFrame(data)
df

<class 'pyam.core.IamDataFrame'>
Index:
 * model    : model_a (1)
 * scenario : scen_a, scen_b (2)
Timeseries data coordinates:
   region   : World (1)
   variable : Primary Energy, Primary Energy|Coal (2)
   unit     : EJ/yr (1)
   year     : 2005, 2010 (2)
Meta indicators:
   exclude (bool) False (1)

Ending a cell of a Jupyter notebook with the IamDataFrame instance shows the same overview as if you were in a Python environment!
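
To make the $ versus . rule concrete, the sketch below puts the Python spelling in a comment above each R call. It is an illustration rather than part of the original notebook. It assumes a working reticulate and pyam setup as described above, and a pyam version that provides the variable attribute and the filter() method. The data frame repeats the example data from this section so that the block can be run on its own.

```r
library(reticulate)
pyam <- import("pyam")

# Same example data as above (scalar values are recycled across rows)
data <- data.frame(
  model = "model_a",
  scenario = c("scen_a", "scen_a", "scen_b"),
  region = "World",
  variable = c("Primary Energy", "Primary Energy|Coal", "Primary Energy"),
  unit = "EJ/yr",
  "2005" = c(1, 0.5, 2),
  "2010" = c(6, 3, 7)
)
df <- pyam$IamDataFrame(data)

# Python: df.variable
df$variable

# Python: df.filter(variable="Primary Energy|Coal")
coal <- df$filter(variable = "Primary Energy|Coal")
coal$data
```

Keyword arguments are passed as named R arguments, so Python's variable="..." becomes variable = "..." in R.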

The last cell of this section retrieves the timeseries data from the IamDataFrame in long form and returns it again as an R data.frame.

[6]:
df$data

A data.frame: 6 × 7
model    scenario  region  variable             unit   year   value
<chr>    <chr>     <chr>   <chr>                <chr>  <dbl>  <dbl>
model_a  scen_a    World   Primary Energy       EJ/yr   2005    1.0
model_a  scen_a    World   Primary Energy       EJ/yr   2010    6.0
model_a  scen_a    World   Primary Energy|Coal  EJ/yr   2005    0.5
model_a  scen_a    World   Primary Energy|Coal  EJ/yr   2010    3.0
model_a  scen_b    World   Primary Energy       EJ/yr   2005    2.0
model_a  scen_b    World   Primary Energy       EJ/yr   2010    7.0

Calling a pyam method

For illustration, we will use the interpolate() method on the IamDataFrame instance.

One key difference between R and Python is the default numeric type:
whereas Python interprets the literal 2007 as an integer, R treats the same input as a double (a float).
In the example below, we therefore have to explicitly instruct R to pass an integer to the pyam function.

[7]:
x <- df$interpolate(as.integer(2007), inplace = TRUE)
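
The type difference can be checked in base R without touching Python. The sketch below is an added illustration: it shows that a bare numeric literal is a double, and that the L suffix produces the same integer as as.integer().

```r
# A bare numeric literal is stored as a double
type_plain <- typeof(2007)
print(type_plain)   # "double"

# The L suffix creates an integer literal
type_suffix <- typeof(2007L)
print(type_suffix)  # "integer"

# as.integer(2007) and 2007L are the same value and type
same <- identical(as.integer(2007), 2007L)
print(same)         # TRUE
```

Since reticulate converts a length-one R integer to a Python int, writing 2007L in place of as.integer(2007) in the cell above should be an equivalent, shorter spelling.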

[8]:
df$data

A data.frame: 9 × 7
model    scenario  region  variable             unit   year   value
<chr>    <chr>     <chr>   <chr>                <chr>  <dbl>  <dbl>
model_a  scen_a    World   Primary Energy       EJ/yr   2005    1.0
model_a  scen_a    World   Primary Energy       EJ/yr   2007    3.0
model_a  scen_a    World   Primary Energy       EJ/yr   2010    6.0
model_a  scen_a    World   Primary Energy|Coal  EJ/yr   2005    0.5
model_a  scen_a    World   Primary Energy|Coal  EJ/yr   2007    1.5
model_a  scen_a    World   Primary Energy|Coal  EJ/yr   2010    3.0
model_a  scen_b    World   Primary Energy       EJ/yr   2005    2.0
model_a  scen_b    World   Primary Energy       EJ/yr   2007    4.0
model_a  scen_b    World   Primary Energy       EJ/yr   2010    7.0
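
The new 2007 values in the output above are consistent with linear interpolation between the 2005 and 2010 data points. The base-R sketch below reproduces them with approx(), which interpolates linearly by default. It is only a plausibility check of the numbers shown, not a description of how pyam computes them internally.

```r
# 2005 and 2010 values of the three timeseries, in the order shown above
value_2005 <- c(1, 0.5, 2)
value_2010 <- c(6, 3, 7)

# Linearly interpolate each timeseries at the year 2007
value_2007 <- mapply(
  function(y0, y1) approx(x = c(2005, 2010), y = c(y0, y1), xout = 2007)$y,
  value_2005,
  value_2010
)

print(value_2007)  # 3.0 1.5 4.0
```

For the first series, this is 1 + (6 - 1) * (2007 - 2005) / (2010 - 2005) = 3.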

Query data from an IIASA data resource

The IIASA Energy, Climate, and Environment Program hosts a suite of Scenario Explorer instances and related infrastructure to support analysis of integrated-assessment pathways in IPCC reports and model comparison projects. High-profile use cases include the IAMC 1.5°C Scenario Explorer hosted by IIASA supporting the IPCC Special Report on Global Warming of 1.5°C (SR15) and the Horizon 2020 project CD-LINKS.

The pyam package can retrieve scenario data directly from any Scenario Explorer database instance hosted by IIASA for use in your processing and analysis workflows.

[9]:
df <- pyam$read_iiasa(
    "iamc15",
    model = "MESSAGEix*",
    variable = c("Emissions|CO2", "Primary Energy|Coal"),
    region = "World",
    meta = c("category")
)

[10]:
df

<class 'pyam.core.IamDataFrame'>
Index:
 * model    : MESSAGEix-GLOBIOM 1.0 (1)
 * scenario : CD-LINKS_INDCi, CD-LINKS_NPi, CD-LINKS_NPi2020_1000, ... LowEnergyDemand (7)
Timeseries data coordinates:
   region   : World (1)
   variable : Emissions|CO2, Primary Energy|Coal (2)
   unit     : EJ/yr, Mt CO2/yr (2)
   year     : 2000, 2005, 2010, 2020, 2030, 2040, 2050, 2060, ... 2100 (12)
Meta indicators:
   version (object) 1 (1)
   carbon price|Avg NPV (2030-2100) (object) 0.9123249940000001, ... 30.1503013256 (7)
   year of netzero CO2 emissions (object) nan, 2065, 2078, 2053, 2060.0 (5)
   carbon price|2100 (NPV) (object) 0.0426419715, 0.012143570900000001, ... 14.2051784281 (7)
   cumulative CO2 emissions (2016 to peak warming, Gt CO2) (object) ... (7)
   ...

See the pyam-IIASA-database tutorial or the API documentation for more information and a complete list of features!

Next steps

New to Jupyter notebooks?
Read this page for helpful tips and tricks when working with Jupyter notebooks.

Questions?

Take a look at the other tutorials to see the scope of pyam features - then join our mailing list!

Problems?

If you encounter any functions or methods that don’t work as expected, please check whether there are already any issues in our GitHub repo.
If not, start a new one!