The Statistics class

This class provides a wrapper for generating descriptive summary statistics of timeseries data from a scenario ensemble. Internally, it uses the pandas.DataFrame.describe() function and hides the tedious work of filtering, grouping, and merging dataframes.

class pyam.Statistics(df, groupby=None, filters=None, rows=False, percentiles=[0.25, 0.5, 0.75])

This class generates descriptive statistics of timeseries data

Parameters:
df : IamDataFrame

an IamDataFrame from which to retrieve meta indicators for grouping or filtering

groupby : str or dict

a column of df.meta to be used for groupby or a dictionary of {column: list}, where list is used for ordering

filters : list of tuples

arguments for filtering and describing, either (index, dict) or ((index[0], index[1]), dict); when also using groupby, index must have length 2.

percentiles : list-like of numbers, optional

The percentiles to get from pandas.DataFrame.describe(). All should fall between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.
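The grouping and describing that the class hides can be sketched with plain pandas. This is only an illustration of the underlying idea, not pyam's implementation: the `meta` table, the `category` column, and the values are all hypothetical stand-ins for `df.meta` and real timeseries data.

```python
import pandas as pd

# Hypothetical meta table: one row per (model, scenario), similar to df.meta
index = pd.MultiIndex.from_tuples(
    [("m1", "s1"), ("m1", "s2"), ("m2", "s1"), ("m2", "s2")],
    names=["model", "scenario"],
)
meta = pd.DataFrame({"category": ["1.5C", "1.5C", "2C", "2C"]}, index=index)

# Hypothetical timeseries values for a single year, on the same index
data = pd.Series([1.0, 3.0, 10.0, 20.0], index=index, name=2030)

# Group by the meta column and describe with explicit percentiles
summary = data.groupby(meta["category"]).describe(percentiles=[0.25, 0.5, 0.75])

# A {column: list} groupby supplies an ordering; reindex mimics that here
summary = summary.reindex(["2C", "1.5C"])
print(summary)
```

Each row of `summary` holds the count, mean, standard deviation, minimum, the requested percentiles, and the maximum for one group.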

Methods

add(data, header[, row, subheader])

Filter 'data' by the arguments of this Statistics instance, apply pandas.DataFrame.describe(), and format the statistics

reindex([copy])

Reindex the summary statistics dataframe

summarize([center, fullrange, ...])

Format the compiled statistics to a concise string output

add(data, header, row=None, subheader=None)

Filter 'data' by the arguments of this Statistics instance, apply pandas.DataFrame.describe(), and format the statistics.

Parameters:
data : pandas.DataFrame or pandas.Series

data for which summary statistics should be computed

header : str

column name for descriptive statistics

row : str

row name for descriptive statistics (required if Statistics(rows=True))

subheader : str, optional

column name (level=1) if data is an unnamed pandas.Series
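The roles of header and subheader can be pictured as the two levels of a column MultiIndex. The pandas-only sketch below is an assumption about the resulting layout, not pyam's code; the labels "Emissions", "Temperature", and "value" are hypothetical.

```python
import pandas as pd

# A DataFrame already has column names (here: years) to serve as level 1
data = pd.DataFrame({2030: [1.0, 3.0], 2050: [2.0, 6.0]})
described = data.describe(percentiles=[0.25, 0.5, 0.75])

# Put the statistics under a common header (level 0 of the columns)
labelled = pd.concat({"Emissions": described}, axis=1)

# An unnamed Series has no column name, so a subheader must be supplied
series = pd.Series([1.0, 3.0])
series_described = series.describe().to_frame(name="value")
series_labelled = pd.concat({"Temperature": series_described}, axis=1)

print(labelled.columns.tolist())
print(series_labelled.columns.tolist())
```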

reindex(copy=True)

Reindex the summary statistics dataframe

summarize(center='mean', fullrange=None, interquartile=None, custom_format='{:.2f}')

Format the compiled statistics to a concise string output

Parameters:
center : str, default 'mean'

what to return as ‘center’ of the summary: mean, 50%, median

fullrange : bool, default None

return the full range of the data if True, or if fullrange, interquartile and format_spec are all None

interquartile : bool, default None

return interquartile range if True

custom_format : formatting specifications, default '{:.2f}'
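How these options might combine into one string can be sketched in pure Python. The helper `format_summary` is hypothetical, and the output layout (parentheses for the full range, square brackets for the interquartile range) is an assumption made for illustration rather than a statement of pyam's exact output.

```python
def format_summary(stats, center="mean", fullrange=None,
                   interquartile=None, custom_format="{:.2f}"):
    """Format one set of describe()-style statistics as a short string."""
    # Default to the full range when neither range option is given
    if fullrange is None and interquartile is None:
        fullrange = True

    # 'median' and '50%' refer to the same statistic
    key = "50%" if center == "median" else center
    out = custom_format.format(stats[key])

    if interquartile:
        low = custom_format.format(stats["25%"])
        high = custom_format.format(stats["75%"])
        out += " [{}, {}]".format(low, high)
    if fullrange:
        low = custom_format.format(stats["min"])
        high = custom_format.format(stats["max"])
        out += " ({}, {})".format(low, high)
    return out


# Hypothetical statistics, keyed like the output of describe()
stats = {"mean": 2.0, "min": 1.0, "25%": 1.5, "50%": 2.0, "75%": 2.5, "max": 3.0}
print(format_summary(stats))
print(format_summary(stats, center="median", interquartile=True))
```

Passing a different format string, such as '{:.0f}', changes only the number formatting and leaves the layout unchanged.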