The Statistics class
This class provides a wrapper for generating descriptive summary statistics
of timeseries data from a scenario ensemble.
Internally, it uses the pandas.DataFrame.describe() function
and hides the tedious work of filtering, grouping, and merging dataframes.
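To make concrete what kind of work the class wraps, here is a minimal pandas-only sketch of the manual workflow: joining a categorical indicator onto the data, grouping by it, and calling describe(). The scenario names, the `category` column, and the values are all made up for illustration, and this is not meant to mirror pyam's internal implementation.

```python
import pandas as pd

# Hypothetical data: one value per scenario (e.g. a single year of a timeseries)
scenarios = pd.Index(["scen_a", "scen_b", "scen_c", "scen_d"], name="scenario")
data = pd.Series([10.0, 12.0, 20.0, 24.0], index=scenarios, name="value")

# Hypothetical meta indicators, one row per scenario
meta = pd.DataFrame({"category": ["low", "low", "high", "high"]}, index=scenarios)

# Join the indicator onto the data, then group and describe by hand
merged = meta.join(data)
summary = merged.groupby("category")["value"].describe()

print(summary)
```

Each additional indicator or filter adds another join and groupby step of this kind, which is the bookkeeping the Statistics class is described as hiding.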
- class pyam.Statistics(df, groupby=None, filters=None, rows=False, percentiles=[0.25, 0.5, 0.75])
This class generates descriptive statistics of timeseries data.
- Parameters:
- df : IamDataFrame
an IamDataFrame from which to retrieve meta indicators for grouping or filtering
- groupby : str or dict
a column of df.meta to be used for groupby, or a dictionary of {column: list}, where list is used for ordering
- filters : list of tuples
arguments for filtering and describing, either (index, dict) or ((index[0], index[1]), dict); when also using groupby, index must have length 2.
- percentiles : list-like of numbers, optional
The percentiles to get from pandas.DataFrame.describe(). All should fall between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.
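Since percentiles is described as being passed on to pandas.DataFrame.describe(), the effect of the argument can be tried directly in pandas. The sketch below uses a made-up series of five values and compares the default percentiles with a custom pair; it only exercises pandas and makes no claim about how Statistics stores the result.

```python
import pandas as pd

# Hypothetical sample values
values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Default percentiles, matching the default of the Statistics class
default_stats = values.describe(percentiles=[0.25, 0.5, 0.75])

# Custom percentiles: each fraction in [0, 1] becomes a row labelled "5%", "95%", ...
custom_stats = values.describe(percentiles=[0.05, 0.95])

print(default_stats)
print(custom_stats)
```

The percentile rows appear alongside count, mean, std, min and max, so widening or narrowing the list changes which quantile rows are available later on.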
Methods
- add(data, header[, row, subheader]): Filter 'data' by arguments of this Statistics instance, apply pandas.DataFrame.describe() and format the statistics
- reindex([copy]): Reindex the summary statistics dataframe
- summarize([center, fullrange, ...]): Format the compiled statistics to a concise string output
- add(data, header, row=None, subheader=None)
Filter 'data' by arguments of this Statistics instance, apply pandas.DataFrame.describe(), and format the statistics.
- Parameters:
- data : pandas.DataFrame or pandas.Series
data for which summary statistics should be computed
- header : str
column name for descriptive statistics
- row : str
row name for descriptive statistics (required if Statistics(rows=True))
- subheader : str, optional
column name (level=1) if data is an unnamed pandas.Series
- summarize(center='mean', fullrange=None, interquartile=None, custom_format='{:.2f}')
Format the compiled statistics to a concise string output.
- Parameters:
- center : str, default mean
what to return as 'center' of the summary: mean, 50%, median
- fullrange : bool, default None
return full range of data if True, or if fullrange, interquartile and format_spec are all None
- interquartile : bool, default None
return interquartile range if True
- custom_format : formatting specifications
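The custom_format default, '{:.2f}', has the shape of a Python str.format() template. As a rough illustration of how such a template turns numbers into a compact summary string, here is a standard-library sketch. The helper `format_summary` and the "center (min, max)" layout are assumptions made for this example; they are not taken from pyam, and the actual output of summarize() may be laid out differently.

```python
from statistics import mean


def format_summary(values, custom_format="{:.2f}"):
    """Hypothetical helper: render values as 'center (min, max)'.

    The layout is an assumption for illustration, not pyam's format.
    """
    center = custom_format.format(mean(values))
    low = custom_format.format(min(values))
    high = custom_format.format(max(values))
    return f"{center} ({low}, {high})"


two_decimals = format_summary([1.0, 2.5, 4.0])
one_decimal = format_summary([1.0, 2.5, 4.0], custom_format="{:.1f}")

print(two_decimals)
print(one_decimal)
```

Passing a different template, such as '{:.1f}', changes only the number of decimals shown, not the underlying statistics.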