Working with Percentiles (and Quantiles) of Distributions
We often want to inspect distributional properties of scenario data, such as the median or the spread across scenarios. The pyam function compute.quantiles() can help!
0. Define timeseries data and initialize an IamDataFrame
This tutorial uses scenario data similar to that used in the first-steps tutorial (available on GitHub and on Read the Docs).
Please see that tutorial for the data reference and further information.
[1]:
from pyam import IamDataFrame
df = IamDataFrame(data="tutorial_data.csv")
df.timeseries().head()
[INFO] 13:06:29 - pyam.core: Reading file tutorial_data.csv
[1]:
model | scenario | region | variable | unit | 2010 | 2020 | 2030 | 2040 | 2050 | 2060 | 2070 | 2080 | 2090 | 2100
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
AIM/CGE 2.1 | CD-LINKS_INDCi | R5ASIA | Emissions\|CO2 | Mt CO2/yr | 11231.0880 | 14359.2801 | 14873.5967 | 15238.9081 | 15180.1854 | 15513.1760 | 16003.2060 | 16343.3124 | 17097.8681 | 17722.1245
AIM/CGE 2.1 | CD-LINKS_INDCi | R5ASIA | Primary Energy | EJ/yr | 145.7409 | 191.0565 | 216.2135 | 234.2793 | 245.9771 | 258.3201 | 268.7644 | 275.0764 | 283.1479 | 288.6838
AIM/CGE 2.1 | CD-LINKS_INDCi | R5ASIA | Primary Energy\|Biomass | EJ/yr | 23.6647 | 24.0751 | 25.9262 | 27.3646 | 29.6938 | 29.8102 | 30.1178 | 30.0109 | 29.6166 | 29.5846
AIM/CGE 2.1 | CD-LINKS_INDCi | R5ASIA | Primary Energy\|Fossil | EJ/yr | 116.1932 | 155.0735 | 168.2376 | 179.0562 | 185.2168 | 195.6202 | 203.4916 | 207.3614 | 214.7828 | 217.7714
AIM/CGE 2.1 | CD-LINKS_INDCi | R5ASIA | Primary Energy\|Non-Biomass Renewables | EJ/yr | 4.5139 | 9.2641 | 17.0767 | 22.0967 | 25.3211 | 26.6589 | 27.9490 | 29.3259 | 29.3942 | 30.6799
1. Let’s see which scenarios define CO2 emissions
[2]:
df.filter(variable="Emissions|CO2", region="World").scenario
[2]:
['1.0',
'CD-LINKS_INDCi',
'CD-LINKS_NPi',
'CD-LINKS_NPi2020_1000',
'CD-LINKS_NPi2020_1600',
'CD-LINKS_NPi2020_400',
'CD-LINKS_NoPolicy',
'Faster Transition Scenario']
2. Get the median
The median is the 0.5 quantile (equivalently, the 50th percentile). Let’s take a look!
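As a quick illustration of the relationship between the median, quantiles, and percentiles, here is a minimal sketch using only the Python standard library. The emission values and the helper `percentile_to_quantile` are hypothetical and not part of pyam; the sketch only shows that the median of a set of values coincides with its 0.5 quantile.

```python
import statistics

# Hypothetical CO2 emissions (Mt CO2/yr) of five scenarios in a single year
emissions = [31000.0, 35500.0, 28750.0, 40200.0, 33100.0]


def percentile_to_quantile(percentile):
    """Convert a percentile (0-100) to the equivalent quantile (0-1)."""
    return percentile / 100


# The median of the values ...
median = statistics.median(emissions)

# ... equals the single cut point that splits the data into two halves,
# i.e. the 0.5 quantile (method="inclusive" treats min and max as the
# 0th and 100th percentiles)
half_quantile = statistics.quantiles(emissions, n=2, method="inclusive")[0]

print(median, half_quantile, percentile_to_quantile(50))
```

In pyam, the same idea is applied to every time step across all scenarios in the IamDataFrame.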
[3]:
from matplotlib import pyplot as plt
fig, ax = plt.subplots()
# plot the background field of scenario data
(
    df.filter(variable="Emissions|CO2", region="World").plot.line(
        color="variable", alpha=0, fill_between=True, ax=ax
    )
)
# plot just the median
(
    df.filter(variable="Emissions|CO2", region="World")
    .compute.quantiles([0.5])
    .plot.line(ax=ax)
)
[3]:
<Axes: title={'center': 'model: Quantiles - scenario: 0.5 - region: World - variable: Emissions|CO2'}, xlabel='Year', ylabel='Mt CO2/yr'>
3. Get arbitrary quantiles
[4]:
fig, ax = plt.subplots()
# plot the background field of scenario data
(
    df.filter(variable="Emissions|CO2", region="World").plot.line(
        color="variable", alpha=0, fill_between=True, ax=ax
    )
)
# plot quantiles
(
    df.filter(variable="Emissions|CO2", region="World")
    .compute.quantiles([0.2, 0.6, 0.8])
    .plot.line(ax=ax)
)
[4]:
<Axes: title={'center': 'model: Quantiles - region: World - variable: Emissions|CO2'}, xlabel='Year', ylabel='Mt CO2/yr'>
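To make the computation behind the plot above more concrete, the following sketch evaluates the same three quantiles across scenarios for each year, using plain Python. The scenario names, years, and values are made up, and `linear_quantile` is a hypothetical helper that assumes linear interpolation between sorted values; the interpolation rule pyam uses internally may differ, so treat this as an illustration of the idea rather than a reimplementation.

```python
# Hypothetical timeseries: one list of values per scenario, one entry per year
years = [2020, 2030, 2040]
scenarios = {
    "A": [10.0, 12.0, 14.0],
    "B": [20.0, 18.0, 16.0],
    "C": [30.0, 25.0, 20.0],
    "D": [40.0, 35.0, 30.0],
    "E": [50.0, 45.0, 40.0],
}


def linear_quantile(values, q):
    """Quantile q (0-1) of values, interpolating linearly between neighbours."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)  # floor, since position is non-negative
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# For every quantile, compute one value per year across all scenarios
result = {
    q: [
        linear_quantile([series[i] for series in scenarios.values()], q)
        for i in range(len(years))
    ]
    for q in [0.2, 0.6, 0.8]
}

for q, values in result.items():
    print(q, values)
```

Each quantile yields one timeseries, which is why the plot shows one line per requested quantile.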
4. Weighted Quantiles
Weighted quantiles are also supported via the wquantiles package. The weights are keyed to model/scenario combinations (unless the level argument is provided to compute.quantiles()).
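To show what weighting changes, here is a minimal sketch of one common definition of a weighted quantile: sort the values, accumulate their weights, and return the first value whose cumulative weight share reaches the requested quantile. The function `weighted_quantile` and the sample data are hypothetical, and this step-function definition is an assumption; the wquantiles package may interpolate between values, so its results can differ.

```python
def weighted_quantile(values, weights, q):
    """Return the smallest value whose cumulative weight share reaches q."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    pairs = sorted(zip(values, weights))  # sort by value, keep weights attached
    total = sum(weight for _, weight in pairs)
    threshold = q * total
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]  # guard against floating-point round-off


values = [10.0, 20.0, 30.0]

# With equal weights, the weighted median is the ordinary middle value
equal = weighted_quantile(values, [1.0, 1.0, 1.0], 0.5)

# A heavy weight on the largest value pulls the median towards it
skewed = weighted_quantile(values, [1.0, 1.0, 8.0], 0.5)

print(equal, skewed)
```

A scenario with a large weight therefore counts as if it appeared many times in the ensemble.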
[5]:
import numpy as np
# assign a random weight to each model/scenario combination (for illustration only)
weights = df.meta.assign(weight=np.random.rand(len(df.meta)))
weights.head()
[5]:
model | scenario | weight
---|---|---
AIM/CGE 2.1 | CD-LINKS_INDCi | 0.917125
AIM/CGE 2.1 | CD-LINKS_NPi | 0.834336
AIM/CGE 2.1 | CD-LINKS_NPi2020_1000 | 0.102644
AIM/CGE 2.1 | CD-LINKS_NPi2020_1600 | 0.505005
AIM/CGE 2.1 | CD-LINKS_NPi2020_400 | 0.573783
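Because the weights above are drawn at random without a seed, the values (and the resulting weighted quantiles) change on every run. The sketch below shows one way to make such illustrative weights reproducible, using the standard-library `random` module. The helper `make_weights` is hypothetical; with NumPy, a seeded generator from `np.random.default_rng(seed)` plays the same role.

```python
import random

# (model, scenario) keys, mirroring the first rows of the index shown above
keys = [
    ("AIM/CGE 2.1", "CD-LINKS_INDCi"),
    ("AIM/CGE 2.1", "CD-LINKS_NPi"),
    ("AIM/CGE 2.1", "CD-LINKS_NPi2020_1000"),
]


def make_weights(keys, seed):
    """Draw one weight in [0, 1) per key from a seeded random generator."""
    rng = random.Random(seed)
    return {key: rng.random() for key in keys}


# The same seed always produces the same weights
weights_a = make_weights(keys, seed=0)
weights_b = make_weights(keys, seed=0)

print(weights_a == weights_b)
```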
[6]:
fig, ax = plt.subplots()
# plot the background field of scenario data
(
    df.filter(variable="Emissions|CO2", region="World").plot.line(
        color="variable", alpha=0, fill_between=True, ax=ax
    )
)
# plot weighted quantiles
(
    df.filter(variable="Emissions|CO2", region="World")
    .compute.quantiles([0.2, 0.6, 0.8], weights=weights["weight"])
    .plot.line(ax=ax)
)
[6]:
<Axes: title={'center': 'model: Weighted Quantiles - region: World - variable: Emissions|CO2'}, xlabel='Year', ylabel='Mt CO2/yr'>