The Statistics class

This class is a wrapper for generating descriptive summary statistics of timeseries data from a scenario ensemble. Internally, it uses pandas.DataFrame.describe() and hides the tedious work of filtering, grouping and merging dataframes.
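To make concrete what kind of work the wrapper takes over, here is a minimal pandas-only sketch of merging data with a meta indicator, grouping, and describing. It does not use pyam and is not the library's implementation; the scenario names, the `category` column and the values are hypothetical.

```python
import pandas as pd

# Hypothetical meta indicators: one row per scenario.
meta = pd.DataFrame(
    {"category": ["low", "low", "high", "high"]},
    index=pd.Index(["scen_a", "scen_b", "scen_c", "scen_d"], name="scenario"),
)

# Hypothetical timeseries values for a single year, indexed by scenario.
values = pd.Series([1.0, 3.0, 10.0, 20.0], index=meta.index, name="value")

# Merge the data with the meta indicator, then group and describe.
merged = meta.join(values)
summary = merged.groupby("category")["value"].describe(
    percentiles=[0.25, 0.5, 0.75]
)

# Put the groups in an explicit order, similar in spirit to passing
# groupby as a dictionary of {column: list}.
summary = summary.reindex(["low", "high"])
print(summary)
```

Doing this by hand for several filters and variables quickly becomes repetitive, which is the bookkeeping the class is meant to hide.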

class pyam.Statistics(df, groupby=None, filters=None, rows=False, percentiles=[0.25, 0.5, 0.75])

This class generates descriptive statistics of timeseries data

Parameters
df : IamDataFrame

an IamDataFrame from which to retrieve meta indicators for grouping or filtering

groupby : str or dict

a column of df.meta to be used for groupby or a dictionary of {column: list}, where list is used for ordering

filters : list of tuples

arguments for filtering and describing, either (index, dict) or ((index[0], index[1]), dict); when also using groupby, index must have length 2.

percentiles : list-like of numbers, optional

The percentiles to get from pandas.DataFrame.describe(). All should fall between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.
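Because percentiles is passed on to pandas.DataFrame.describe(), its effect can be previewed with pandas alone. The sketch below uses hypothetical data (the numbers 0 to 100) and asks for the 5th, 50th and 95th percentiles instead of the default quartiles.

```python
import pandas as pd

# Hypothetical data: 101 evenly spaced values from 0 to 100.
values = pd.Series(range(101), dtype=float)

# Request the 5th, 50th and 95th percentiles; each entry lies between 0 and 1.
summary = values.describe(percentiles=[0.05, 0.5, 0.95])

# The percentile rows are labelled with percent signs.
print(summary[["5%", "50%", "95%"]])
```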

Methods

add(data, header[, row, subheader])

Filter 'data' by arguments of this Statistics instance, apply pandas.DataFrame.describe() and format the statistics

reindex([copy])

Reindex the summary statistics dataframe

summarize([center, fullrange, ...])

Format the compiled statistics to a concise string output

add(data, header, row=None, subheader=None)

Filter ‘data’ by arguments of this Statistics instance, apply pandas.DataFrame.describe() and format the statistics.

Parameters
data : pandas.DataFrame or pandas.Series

data for which summary statistics should be computed

header : str

column name for descriptive statistics

row : str

row name for descriptive statistics (required if Statistics(rows=True))

subheader : str, optional

column name (level=1) if data is an unnamed pandas.Series
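The header and subheader parameters suggest a two-level column index, with header at level 0 and subheader at level 1. The following pandas-only sketch shows how such a layout can be built for an unnamed Series. It is an illustration under that assumption, not pyam's own code, and the labels and values are hypothetical.

```python
import pandas as pd

# An unnamed Series, so it carries no label for the level-1 column name.
series = pd.Series([1.0, 2.0, 3.0, 4.0])

# Hypothetical labels for the two column levels.
header, subheader = "Emissions (2050)", "all"

# Describe the data, then place the result under a (header, subheader) column.
described = series.describe().to_frame()
described.columns = pd.MultiIndex.from_tuples([(header, subheader)])
print(described)
```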

reindex(copy=True)

Reindex the summary statistics dataframe

summarize(center='mean', fullrange=None, interquartile=None, custom_format='{:.2f}')

Format the compiled statistics to a concise string output

Parameters
center : str, default 'mean'

what to return as the ‘center’ of the summary: 'mean', '50%' or 'median'

fullrange : bool, default None

return the full range of the data if True, or if fullrange, interquartile and format_spec are all None

interquartile : bool, default None

return interquartile range if True

custom_format : formatting specifications
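The interplay of center, fullrange, interquartile and custom_format can be illustrated with a small stand-alone function. This is a sketch under stated assumptions: `format_summary` is a hypothetical helper, the layout of the string (parentheses for the full range, square brackets for the interquartile range) is chosen for illustration and may differ from what summarize() produces, and the fallback to the full range when both flags are None is one reading of the parameter description above.

```python
import pandas as pd


def format_summary(stats, center="mean", fullrange=None,
                   interquartile=None, custom_format="{:.2f}"):
    """Hypothetical helper: format one describe() result as a short string."""
    # Assumption: fall back to the full range if neither range is requested.
    if fullrange is None and interquartile is None:
        fullrange = True

    # 'median' is stored by describe() under the label '50%'.
    key = "50%" if center == "median" else center
    text = custom_format.format(stats[key])

    if fullrange:
        text += " ({}, {})".format(
            custom_format.format(stats["min"]),
            custom_format.format(stats["max"]),
        )
    if interquartile:
        text += " [{}, {}]".format(
            custom_format.format(stats["25%"]),
            custom_format.format(stats["75%"]),
        )
    return text


# Hypothetical data for the demonstration.
stats = pd.Series([1.0, 2.0, 3.0, 4.0]).describe()
print(format_summary(stats))
print(format_summary(stats, center="median", interquartile=True))
```

Here custom_format is applied to every number with str.format, so any valid format specification such as '{:.1f}' can be used in place of the default.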