Integration with R¶

This tutorial illustrates how to use pyam features in an R workflow using the reticulate package.


The source code of the tutorial is available in the folder doc/source/R_tutorials of the pyam GitHub repository.

Installation and requirements¶

Assuming you have a working and current R installation, you can follow the steps below to install the requirements to run this tutorial:

Install Python using Anaconda

This is the recommended approach for users that aren’t already experienced with Python…
Install pyam (read the docs for more information)

conda install -c conda-forge pyam
To use R in a Jupyter notebook as in this tutorial, install Jupyter and the IRkernel via conda

conda install jupyter r-irkernel
In R, install the reticulate package from CRAN

install.packages("reticulate")

This tutorial is an illustration of how to get started, not full-fledged documentation of a pyam-R interface…
The integration between Python and R using reticulate is a bit fickle: having multiple Python installations/environments or outdated versions of reticulate/pandas can cause issues.
Please read the docs if you experience any problems!

This notebook was run with R version 4.0 and Python 3.8 on macOS.

Developer's note: Running this notebook on CI and Read the Docs (RTD) is currently not supported.

For the time being, this notebook is not executed by nbsphinx and has to be saved with output.

Setting up the session¶

[1]:

require(reticulate)

Loading required package: reticulate

The next line instructs reticulate to use the conda ‘base’ environment.

Read the docs in case you want to use an alternative installation or environment!

[2]:

use_condaenv()

Now, we import pyam to the R session.

[3]:

pyam <- import("pyam")

Passing an R dataframe to pyam¶

The first cell of this section creates a simple timeseries data table following the IAMC format as an R dataframe.

[4]:

data <- data.frame(
  model=c('model_a', 'model_a', 'model_a'),
  scenario=c('scen_a', 'scen_a', 'scen_b'),
  region=c('World', 'World', 'World'),
  variable=c('Primary Energy', 'Primary Energy|Coal', 'Primary Energy'),
  unit=c('EJ/yr', 'EJ/yr', 'EJ/yr'),
  "2005"=c(1, 0.5, 2),
  "2010"=c(6, 3, 7)
)
data

A data.frame: 3 × 7
model	scenario	region	variable	unit	X2005	X2010
<chr>	<chr>	<chr>	<chr>	<chr>	<dbl>	<dbl>
model_a	scen_a	World	Primary Energy	EJ/yr	1.0	6
model_a	scen_a	World	Primary Energy|Coal	EJ/yr	0.5	3
model_a	scen_b	World	Primary Energy	EJ/yr	2.0	7

The following cell casts the data to a pyam.IamDataFrame.

When migrating code from Python to R, keep in mind that you have to use $ instead of . to call an object’s methods!

[5]:

df <- pyam$IamDataFrame(data)
df

<class 'pyam.core.IamDataFrame'>
Index dimensions:
 * model    : model_a (1)
 * scenario : scen_a, scen_b (2)
Timeseries data coordinates:
   region   : World (1)
   variable : Primary Energy, Primary Energy|Coal (2)
   unit     : EJ/yr (1)
   year     : 2005, 2010 (2)
Meta indicators:
   exclude (bool) False (1)

Ending a cell of a Jupyter notebook with the IamDataFrame instance shows the same overview as if you were in a Python environment!

The last cell of this section retrieves the timeseries data from the IamDataFrame in long form and returns it again as an R data.frame.

[6]:

df$data

A data.frame: 6 × 7
model	scenario	region	variable	unit	year	value
<chr>	<chr>	<chr>	<chr>	<chr>	<dbl>	<dbl>
model_a	scen_a	World	Primary Energy	EJ/yr	2005	1.0
model_a	scen_a	World	Primary Energy	EJ/yr	2010	6.0
model_a	scen_a	World	Primary Energy|Coal	EJ/yr	2005	0.5
model_a	scen_a	World	Primary Energy|Coal	EJ/yr	2010	3.0
model_a	scen_b	World	Primary Energy	EJ/yr	2005	2.0
model_a	scen_b	World	Primary Energy	EJ/yr	2010	7.0
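
The relationship between the wide table passed in and the long table returned can be sketched in plain Python, without pyam or pandas. This is only an illustration of the reshaping; the helper to_long and the constant ID_COLS are hypothetical names and do not reflect how pyam implements it internally.

```python
# Illustrative sketch of wide-to-long reshaping (not pyam's implementation).
# The rows mirror the example table used in this tutorial.
ID_COLS = ("model", "scenario", "region", "variable", "unit")

wide = [
    {"model": "model_a", "scenario": "scen_a", "region": "World",
     "variable": "Primary Energy", "unit": "EJ/yr", 2005: 1.0, 2010: 6.0},
    {"model": "model_a", "scenario": "scen_a", "region": "World",
     "variable": "Primary Energy|Coal", "unit": "EJ/yr", 2005: 0.5, 2010: 3.0},
    {"model": "model_a", "scenario": "scen_b", "region": "World",
     "variable": "Primary Energy", "unit": "EJ/yr", 2005: 2.0, 2010: 7.0},
]


def to_long(rows):
    """Return one record per (timeseries, year) pair."""
    records = []
    for row in rows:
        ids = {col: row[col] for col in ID_COLS}
        # Every key that is not an identifier column is treated as a year.
        years = sorted(key for key in row if key not in ID_COLS)
        for year in years:
            records.append({**ids, "year": year, "value": row[year]})
    return records


long = to_long(wide)
print(len(long))  # 3 timeseries x 2 years = 6 records
```

Each wide row with two year columns becomes two long records, which is why the 3 × 7 input table comes back as a 6 × 7 table.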

Calling a pyam method¶

For illustration, we will use the interpolate() method on the IamDataFrame instance.

One key difference between R and Python is the default numeric type:
whereas Python interprets 2007 as an integer, R treats the same input as a double (a floating-point number).
In the example below, we therefore have to explicitly instruct R to pass an integer to the pyam function, using as.integer(2007) (the integer literal 2007L would work as well).

[7]:

x <- df$interpolate(as.integer(2007), inplace=TRUE)
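
To make the type difference concrete, here is a minimal sketch in plain Python of what arrives on the Python side of the call. The function check_year is hypothetical and only stands in for a strict integer check; it is not pyam's actual code.

```python
# Hypothetical validator, used only to illustrate int vs. float arguments.
def check_year(year):
    """Accept only integer years, as a strict Python API might."""
    if isinstance(year, bool) or not isinstance(year, int):
        raise TypeError(f"year must be an integer, got {type(year).__name__}")
    return year


r_default = 2007.0    # a bare R `2007` is a double, so Python sees a float
r_as_integer = 2007   # R's `as.integer(2007)` is passed on as a Python int

print(type(r_default).__name__)     # float
print(type(r_as_integer).__name__)  # int

check_year(r_as_integer)  # passes silently

try:
    check_year(r_default)
except TypeError as err:
    message = str(err)
    print(message)  # year must be an integer, got float
```

Whether a given pyam method rejects or silently accepts a float depends on the method and the pyam version, so passing an explicit integer from R is the safer habit.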

[8]:

df$data

A data.frame: 9 × 7
model	scenario	region	variable	unit	year	value
<chr>	<chr>	<chr>	<chr>	<chr>	<dbl>	<dbl>
model_a	scen_a	World	Primary Energy	EJ/yr	2005	1.0
model_a	scen_a	World	Primary Energy	EJ/yr	2007	3.0
model_a	scen_a	World	Primary Energy	EJ/yr	2010	6.0
model_a	scen_a	World	Primary Energy|Coal	EJ/yr	2005	0.5
model_a	scen_a	World	Primary Energy|Coal	EJ/yr	2007	1.5
model_a	scen_a	World	Primary Energy|Coal	EJ/yr	2010	3.0
model_a	scen_b	World	Primary Energy	EJ/yr	2005	2.0
model_a	scen_b	World	Primary Energy	EJ/yr	2007	4.0
model_a	scen_b	World	Primary Energy	EJ/yr	2010	7.0
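
The new 2007 values in the table are consistent with linear interpolation between the 2005 and 2010 data points. The following plain-Python sketch reproduces them by hand; interpolate_linear is a hypothetical helper written for this check, not a pyam function.

```python
# Hand calculation of the 2007 values, assuming linear interpolation
# between the two neighbouring years (hypothetical helper, not pyam code).
def interpolate_linear(t, t0, v0, t1, v1):
    """Linearly interpolate between (t0, v0) and (t1, v1) at time t."""
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# (scenario, variable) -> (value in 2005, value in 2010), from the tutorial data
series = {
    ("scen_a", "Primary Energy"): (1.0, 6.0),
    ("scen_a", "Primary Energy|Coal"): (0.5, 3.0),
    ("scen_b", "Primary Energy"): (2.0, 7.0),
}

interpolated = {
    key: interpolate_linear(2007, 2005, v0, 2010, v1)
    for key, (v0, v1) in series.items()
}

for key, value in interpolated.items():
    print(key, value)
# ('scen_a', 'Primary Energy') 3.0
# ('scen_a', 'Primary Energy|Coal') 1.5
# ('scen_b', 'Primary Energy') 4.0
```

For example, Primary Energy in scen_a rises by 5 EJ/yr over five years, so two years after 2005 it has gained 2 EJ/yr and stands at 3 EJ/yr.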

Next steps¶

New to Jupyter notebooks?

Read this page for helpful tips and tricks when working with Jupyter notebooks.

Questions?¶

Take a look at the other tutorials to see the scope of pyam features, then join our mailing list!

Problems?¶

If you encounter any functions or methods that don’t work as expected, please check whether there are already any issues in our GitHub repo.

If not, start a new one!