Integration with R

This tutorial illustrates how to use pyam features in an R workflow using the reticulate package.

The source code of the tutorial is available in the folder doc/source/R_tutorials of the pyam GitHub repository.

Installation and requirements

Assuming you have a working and current R installation, you can follow the steps below to install the requirements to run this tutorial:

  1. Install Python using Anaconda
    This is the recommended approach for users who aren’t already experienced with Python…
  2. Install pyam (read the docs for more information)
    conda install -c conda-forge pyam
  3. To use R in a Jupyter notebook as in this tutorial, install Jupyter and the IRkernel via conda
    conda install jupyter r-irkernel
  4. In R, install the reticulate package from CRAN
    install.packages("reticulate")
This tutorial is an illustration of how to get started, not a full-fledged documentation of a pyam-R interface…
The integration between Python and R using reticulate can be fragile: having multiple Python installations or environments, or outdated versions of reticulate or pandas, can cause issues.
Please read the docs if you experience any problems!
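
When the setup misbehaves, a reasonable first check is to ask reticulate which Python interpreter it has bound to and whether pyam can be imported from it. The sketch below is not part of the original notebook; it assumes only that reticulate is installed, and it uses the reticulate helpers `py_config()` and `py_module_available()`.

```r
library(reticulate)

# Report the Python interpreter that reticulate has discovered and bound to
config <- py_config()
print(config$python)

# TRUE if pyam can be imported from that interpreter, FALSE otherwise
pyam_found <- py_module_available("pyam")
print(pyam_found)
```

Both calls initialise Python, and reticulate binds to a single interpreter per R session. If the reported interpreter is not the intended one, restart R and select the environment before running anything else that touches Python.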

This notebook was run with R version 4.0 and Python 3.8 on Mac OS.


Note for developers: Running this notebook on CI and RTD is currently not supported.
For the time being, this notebook is not executed by nbsphinx and has to be saved with output.

Setting up the session

[1]:
require(reticulate)
Loading required package: reticulate

The next line instructs reticulate to use the conda ‘base’ environment.
Read the docs in case you want to use an alternative installation or environment!
[2]:
use_condaenv()

Now, we import pyam to the R session.

[3]:
pyam <- import("pyam")

Passing an R dataframe to pyam

The first cell of this section creates a simple timeseries data table following the IAMC format as an R dataframe.

[4]:
data <- data.frame(
  model=c('model_a', 'model_a', 'model_a'),
  scenario=c('scen_a', 'scen_a', 'scen_b'),
  region=c('World', 'World', 'World'),
  variable=c('Primary Energy', 'Primary Energy|Coal', 'Primary Energy'),
  unit=c('EJ/yr', 'EJ/yr', 'EJ/yr'),
  "2005"=c(1, 0.5, 2),
  "2010"=c(6, 3, 7)
)
data
A data.frame: 3 × 7
model    scenario  region  variable             unit   X2005  X2010
<chr>    <chr>     <chr>   <chr>                <chr>  <dbl>  <dbl>
model_a  scen_a    World   Primary Energy       EJ/yr    1.0      6
model_a  scen_a    World   Primary Energy|Coal  EJ/yr    0.5      3
model_a  scen_b    World   Primary Energy       EJ/yr    2.0      7
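
The `X2005` and `X2010` headers in the printed table come from base R rather than from pyam: by default, `data.frame()` rewrites column names into syntactically valid R names. The sketch below is background only and not part of the original notebook; it uses nothing beyond base R and the `check.names` argument of `data.frame()`.

```r
# Default behaviour: names that are not valid R identifiers are rewritten
default_names <- names(
  data.frame("2005" = c(1, 0.5, 2), "2010" = c(6, 3, 7))
)
print(default_names)  # "X2005" "X2010"

# With check.names = FALSE the names are kept exactly as written
kept_names <- names(
  data.frame("2005" = c(1, 0.5, 2), "2010" = c(6, 3, 7), check.names = FALSE)
)
print(kept_names)  # "2005" "2010"
```

The IamDataFrame summary in this tutorial lists the years 2005 and 2010, so the default naming did not prevent the cast in the setup used here.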

The following cell casts the data to a pyam.IamDataFrame.

When migrating code from Python to R, keep in mind that
you have to use $ instead of . to call an object’s methods!
[5]:
df <- pyam$IamDataFrame(data)
df
<class 'pyam.core.IamDataFrame'>
Index dimensions:
 * model    : model_a (1)
 * scenario : scen_a, scen_b (2)
Timeseries data coordinates:
   region   : World (1)
   variable : Primary Energy, Primary Energy|Coal (2)
   unit     : EJ/yr (1)
   year     : 2005, 2010 (2)
Meta indicators:
   exclude (bool) False (1)

Ending a cell of a Jupyter notebook with the IamDataFrame instance shows the same overview as if you were in a Python environment!
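
The `$` convention applies to any Python object reached through reticulate, not only to pyam. As a small self-contained illustration, the sketch below uses Python's standard `os` module instead of pyam (an assumption made so that the example runs without any data); each R line is preceded by its Python equivalent as a comment.

```r
library(reticulate)

# Python: import os
os <- import("os")

# Python: os.path.basename("doc/source/R_tutorials")
folder <- os$path$basename("doc/source/R_tutorials")
print(folder)  # "R_tutorials"
```

Every `.` in the Python expression becomes a `$` in R, including the one between a module and its submodule.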

The last cell of this section retrieves the timeseries data from the IamDataFrame in long form and returns it again as an R data.frame.

[6]:
df$data
A data.frame: 6 × 7
model    scenario  region  variable             unit   year  value
<chr>    <chr>     <chr>   <chr>                <chr>  <dbl>  <dbl>
model_a  scen_a    World   Primary Energy       EJ/yr  2005    1.0
model_a  scen_a    World   Primary Energy       EJ/yr  2010    6.0
model_a  scen_a    World   Primary Energy|Coal  EJ/yr  2005    0.5
model_a  scen_a    World   Primary Energy|Coal  EJ/yr  2010    3.0
model_a  scen_b    World   Primary Energy       EJ/yr  2005    2.0
model_a  scen_b    World   Primary Energy       EJ/yr  2010    7.0

Calling a pyam method

For illustration, we will use the interpolate() method on the IamDataFrame instance.

One key difference between R and Python is the default numeric type:
whereas Python interprets the literal 2007 as an integer, R treats the same input as a double (a floating-point number).
In the example below, we have to explicitly instruct R to pass an integer to the pyam function.
[7]:
x <- df$interpolate(as.integer(2007), inplace=TRUE)
[8]:
df$data
A data.frame: 9 × 7
model    scenario  region  variable             unit   year  value
<chr>    <chr>     <chr>   <chr>                <chr>  <dbl>  <dbl>
model_a  scen_a    World   Primary Energy       EJ/yr  2005    1.0
model_a  scen_a    World   Primary Energy       EJ/yr  2007    3.0
model_a  scen_a    World   Primary Energy       EJ/yr  2010    6.0
model_a  scen_a    World   Primary Energy|Coal  EJ/yr  2005    0.5
model_a  scen_a    World   Primary Energy|Coal  EJ/yr  2007    1.5
model_a  scen_a    World   Primary Energy|Coal  EJ/yr  2010    3.0
model_a  scen_b    World   Primary Energy       EJ/yr  2005    2.0
model_a  scen_b    World   Primary Energy       EJ/yr  2007    4.0
model_a  scen_b    World   Primary Energy       EJ/yr  2010    7.0
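
The type difference behind the `as.integer()` call can be checked in plain R. The sketch below uses only base R; the `L` suffix shown at the end is standard R syntax for an integer literal and is offered as an alternative spelling, not as something the original notebook uses.

```r
# A bare number in R is stored as a double
year_default <- 2007
print(class(year_default))  # "numeric"
print(is.integer(year_default))  # FALSE

# Explicit conversion, as in the interpolate() call of this tutorial
year_integer <- as.integer(2007)
print(class(year_integer))  # "integer"

# The L suffix creates the same integer value directly
year_literal <- 2007L
print(identical(year_integer, year_literal))  # TRUE
```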

Next steps

New to Jupyter notebooks?
Read this page for helpful tips and tricks when working with Jupyter notebooks.

Questions?

Take a look at the other tutorials to see the scope of pyam features, then join our mailing list!

Problems?

If you encounter any functions or methods that don’t work as expected, please check whether there are already any issues in our GitHub repo.
If not, start a new one!