Working with Percentiles (and Quantiles) of Distributions

We often want to summarize how a variable is distributed across an ensemble of scenarios, for example by its median or other quantiles. The pyam function compute.quantiles() computes these directly from an IamDataFrame.

0. Define timeseries data and initialize an IamDataFrame

This tutorial uses scenario data similar to that of the first-steps tutorial (available on GitHub and on Read the Docs). Refer to that tutorial for the data reference and further information.

[1]:

from pyam import IamDataFrame

df = IamDataFrame(data="tutorial_data.csv")
df.timeseries().head()

[INFO] 16:21:43 - pyam.core: Reading file tutorial_data.csv

[1]:

model	scenario	region	variable	unit	2010	2020	2030	2040	2050	2060	2070	2080	2090	2100
AIM/CGE 2.1	CD-LINKS_INDCi	R5ASIA	Emissions|CO2	Mt CO2/yr	11231.0880	14359.2801	14873.5967	15238.9081	15180.1854	15513.1760	16003.2060	16343.3124	17097.8681	17722.1245
			Primary Energy	EJ/yr	145.7409	191.0565	216.2135	234.2793	245.9771	258.3201	268.7644	275.0764	283.1479	288.6838
			Primary Energy|Biomass	EJ/yr	23.6647	24.0751	25.9262	27.3646	29.6938	29.8102	30.1178	30.0109	29.6166	29.5846
			Primary Energy|Fossil	EJ/yr	116.1932	155.0735	168.2376	179.0562	185.2168	195.6202	203.4916	207.3614	214.7828	217.7714
			Primary Energy|Non-Biomass Renewables	EJ/yr	4.5139	9.2641	17.0767	22.0967	25.3211	26.6589	27.9490	29.3259	29.3942	30.6799

1. Let’s see how many scenarios define CO2 emissions

[2]:

df.filter(variable="Emissions|CO2", region="World").scenario

[2]:

['1.0',
 'CD-LINKS_INDCi',
 'CD-LINKS_NPi',
 'CD-LINKS_NPi2020_1000',
 'CD-LINKS_NPi2020_1600',
 'CD-LINKS_NPi2020_400',
 'CD-LINKS_NoPolicy',
 'Faster Transition Scenario']

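2. Get the median

The median is the 0.5 quantile (equivalently, the 50th percentile). Passing [0.5] to compute.quantiles() returns it as a single timeseries, which we plot on top of the full range of scenario data.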
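To make the quantile/percentile relationship concrete, here is a minimal standard-library sketch that computes quantiles across scenarios separately for each year. The emissions values and the helper `linear_quantile` are hypothetical, and the linear interpolation rule is an assumption chosen for illustration; it is not necessarily the rule pyam applies internally.

```python
from statistics import median


def linear_quantile(values, q):
    """Return the q-quantile (0 <= q <= 1) of values using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)  # fractional index into the sorted data
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical emissions (Mt CO2/yr) of five scenarios in two years
emissions = {
    2030: [30.0, 35.0, 40.0, 45.0, 50.0],
    2050: [10.0, 20.0, 30.0, 40.0, 60.0],
}

for year, values in emissions.items():
    q50 = linear_quantile(values, 0.5)
    # the 0.5 quantile (50th percentile) coincides with the median
    assert q50 == median(values)
    print(year, "median:", q50, "0.8 quantile:", linear_quantile(values, 0.8))
```

The quantile is taken across scenarios for one year at a time, so the result is again a timeseries with one value per year.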

[3]:

from matplotlib import pyplot as plt

fig, ax = plt.subplots()

# plot the background field of scenario data
(
    df.filter(variable="Emissions|CO2", region="World").plot.line(
        color="variable", alpha=0, fill_between=True, ax=ax
    )
)

# plot just the median
(
    df.filter(variable="Emissions|CO2", region="World")
    .compute.quantiles([0.5])
    .plot.line(ax=ax)
)

[3]:

<Axes: title={'center': 'model: Quantiles - scenario: 0.5 - region: World - variable: Emissions|CO2'}, xlabel='Year', ylabel='Mt CO2/yr'>

3. Get arbitrary quantiles

[4]:

fig, ax = plt.subplots()

# plot the background field of scenario data
(
    df.filter(variable="Emissions|CO2", region="World").plot.line(
        color="variable", alpha=0, fill_between=True, ax=ax
    )
)

# plot quantiles
(
    df.filter(variable="Emissions|CO2", region="World")
    .compute.quantiles([0.2, 0.6, 0.8])
    .plot.line(ax=ax)
)

[4]:

<Axes: title={'center': 'model: Quantiles - region: World - variable: Emissions|CO2'}, xlabel='Year', ylabel='Mt CO2/yr'>

4. Weighted Quantiles

Weighted quantiles are also supported, using the wquantiles package. By default, the weights are keyed to model/scenario combinations; this changes only if the level argument is passed to compute.quantiles().
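As a rough illustration of what weighting does, the sketch below implements one common definition of a weighted quantile: the smallest value at which the cumulative weight reaches the fraction q of the total weight. The function `weighted_quantile` and the sample numbers are hypothetical, and this step-wise definition is an assumption for illustration; the wquantiles package may interpolate differently and return slightly different values.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight reaches q times the total weight."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")

    pairs = sorted(zip(values, weights))  # sort by value, keeping weights attached
    threshold = q * sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]  # guard against floating-point round-off


# Hypothetical values of four scenarios in a single year
values = [10.0, 20.0, 30.0, 40.0]

equal = weighted_quantile(values, [1, 1, 1, 1], 0.5)
skewed = weighted_quantile(values, [1, 1, 1, 5], 0.5)
print("equal weights:", equal)
print("heavy weight on the last scenario:", skewed)
```

Giving one scenario a much larger weight pulls the quantile towards that scenario's value, which is the effect the weights argument has on the plotted lines.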

[5]:

import numpy as np

# assign a random weight to each model/scenario combination (for illustration only)
weights = df.meta.assign(weight=np.random.rand(len(df.meta)))
weights.head()

[5]:

		weight
model	scenario
AIM/CGE 2.1	CD-LINKS_INDCi	0.890174
	CD-LINKS_NPi	0.077682
	CD-LINKS_NPi2020_1000	0.364202
	CD-LINKS_NPi2020_1600	0.305781
	CD-LINKS_NPi2020_400	0.191420
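The weights above are drawn at random, so they differ on every run. If reproducible numbers are needed, a seeded generator can be used instead. The sketch below builds such a weight series on a hypothetical (model, scenario) index that stands in for the index of df.meta; the model and scenario names are placeholders, not entries from the tutorial data.

```python
import numpy as np
import pandas as pd

# Hypothetical model/scenario pairs standing in for the index of df.meta
index = pd.MultiIndex.from_tuples(
    [
        ("model_a", "scen_1"),
        ("model_a", "scen_2"),
        ("model_b", "scen_1"),
    ],
    names=["model", "scenario"],
)

# A seeded generator yields the same random weights on every run
rng = np.random.default_rng(seed=42)
weights = pd.Series(rng.random(len(index)), index=index, name="weight")

print(weights)
```

A series like this has the same shape as weights["weight"] in the tutorial: one value per model/scenario combination.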

[6]:

fig, ax = plt.subplots()

# plot the background field of scenario data
(
    df.filter(variable="Emissions|CO2", region="World").plot.line(
        color="variable", alpha=0, fill_between=True, ax=ax
    )
)

# plot weighted quantiles
(
    df.filter(variable="Emissions|CO2", region="World")
    .compute.quantiles([0.2, 0.6, 0.8], weights=weights["weight"])
    .plot.line(ax=ax)
)

[6]:

<Axes: title={'center': 'model: Weighted Quantiles - region: World - variable: Emissions|CO2'}, xlabel='Year', ylabel='Mt CO2/yr'>
