OverfittingAnalysis

class qf_lib.analysis.backtests_overfitting.overfitting_analysis.OverfittingAnalysis(multiple_returns_timeseries: qf_lib.containers.dataframe.simple_returns_dataframe.SimpleReturnsDataFrame, ranking_function: Callable, num_of_slices: int = 14)
Bases: object
Class providing statistics and analysis for checking whether a backtest is overfitted. It is based on the algorithms described in “The Probability of Backtest Overfitting” by Bailey, Borwein, López de Prado and Zhu.
Methods
calculate_expected_return
() Returns the expected return of best strategies in the OutOfSample.
Returns the probability of backtest overfitting.
Returns the probability of loss for the best strategy.
calculate_relative_rank_logits
(strategies_names) Computes relative ranks for the strategies named in the strategies_names list and afterwards calculates the logits.
form_different_is_and_oos_sets
(multiple_returns_timeseries) Splits slices into two groups of equal sizes for all possible combinations.
get_best_strategies_is_oos_qualities
() Returns a dataframe with two columns, OOS and IS; each row contains the quality value of the best IS strategy both in sample and out of sample.
rank_strategies
(df, ascending) Rank strategies using the ranking function.
Attributes
is_ranking
List of QFDataFrames, each of which contains 2 columns: quality and rank, and is indexed by the strategies names.
oos_ranking
List of QFDataFrames, each of which contains 2 columns: quality and rank, and is indexed by the strategies names.
best_is_strategies_names
List of strategies with the maximum rank.

best_is_strategies_names
List of strategies with the maximum rank. If multiple values equal the maximum, the first strategy with that rank is returned.

calculate_expected_return
() Returns the expected return of the best strategies in the OutOfSample period.

calculate_relative_rank_logits
(strategies_names: List) Computes relative ranks for the strategies named in the strategies_names list and afterwards calculates the logits. High logit values imply consistency between IS and OOS performance, which indicates a low level of backtest overfitting.
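
The logit step can be illustrated with a minimal standalone sketch. It does not call qf_lib; it assumes the convention from the Bailey et al. paper, where the relative rank is the OOS rank of the best IS strategy divided by the number of strategies plus one, and ranks run from 1 (worst) upwards. The function name `relative_rank_logit` and the sample ranks are hypothetical, and counting non-positive logits as a sign of overfitting is an assumption of this sketch.

```python
import math


def relative_rank_logit(oos_rank: int, num_strategies: int) -> float:
    """Logit of the relative OOS rank of the best IS strategy.

    Ranks run from 1 (worst) to num_strategies (best). Dividing by
    num_strategies + 1 keeps the relative rank strictly inside (0, 1).
    """
    relative_rank = oos_rank / (num_strategies + 1)
    return math.log(relative_rank / (1.0 - relative_rank))


# Hypothetical OOS ranks of the best IS strategy across four IS/OOS
# combinations, with 9 strategies in total.
oos_ranks = [9, 7, 2, 5]
logits = [relative_rank_logit(rank, 9) for rank in oos_ranks]

# Share of combinations in which the best IS strategy ended up at or
# below the OOS median (logit <= 0); a high share hints at overfitting.
share_non_positive = sum(1 for value in logits if value <= 0) / len(logits)
```

A strategy that stays near the top out of sample yields a large positive logit, while one that drops below the median yields a negative one.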

form_different_is_and_oos_sets
(multiple_returns_timeseries: qf_lib.containers.dataframe.qf_dataframe.QFDataFrame) → Tuple Splits slices into two groups of equal sizes for all possible combinations.
Returns a list of tuples. The 1st element of each tuple contains the InSample set and the 2nd one contains the OutOfSample set (both in the form of QFDataFrames). Each tuple contains one of the possible combinations of slices forming the IS and OOS sets. E.g. if there are 4 slices: A, B, C, D, then one of the possible combinations is IS: A, B and OOS: C, D. This example will be one of the elements of the result list. A and B will be concatenated into a single timeseries AB, and likewise C and D into a single timeseries CD.
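
The splitting described above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: each slice is a plain list of returns for a single strategy rather than a QFDataFrame, and the function name `form_is_and_oos_sets` is hypothetical.

```python
from itertools import combinations


def form_is_and_oos_sets(slices):
    """Return every (in_sample, out_of_sample) pair with equally sized halves."""
    if len(slices) % 2 != 0:
        raise ValueError("the number of slices must be even")
    half = len(slices) // 2
    indices = range(len(slices))
    result = []
    for is_indices in combinations(indices, half):
        chosen = set(is_indices)
        # Concatenate the chosen slices in their original order (IS set),
        # and the remaining slices in their original order (OOS set).
        in_sample = [x for i in indices if i in chosen for x in slices[i]]
        out_of_sample = [x for i in indices if i not in chosen for x in slices[i]]
        result.append((in_sample, out_of_sample))
    return result


# Four hypothetical slices A, B, C, D of one strategy's returns.
slices = [[0.01, 0.02], [-0.01, 0.00], [0.03, -0.02], [0.01, 0.01]]
pairs = form_is_and_oos_sets(slices)
```

With 4 slices there are 6 ways to choose 2 of them for the IS set, so `pairs` has 6 elements; the first one is IS: AB and OOS: CD.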

get_best_strategies_is_oos_qualities
()
 Returns
dataframe with two columns, OOS and IS. Each row contains the quality value of the best IS strategy both in sample (where it is the best strategy) and out of sample (the quality of that same strategy, chosen as the best one in the in-sample period of the given combination).
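
As a rough illustration of how one row of such a result could be formed, the sketch below picks the strategy with the highest in-sample quality and reports its quality in both periods. It works on plain dictionaries instead of QFDataFrames; the function name `best_is_strategy_qualities` and the quality values are hypothetical.

```python
def best_is_strategy_qualities(is_quality, oos_quality):
    """Return (IS quality, OOS quality) of the strategy that is best in sample."""
    # max() returns the first key with the maximal value, so ties resolve
    # to the strategy that appears first.
    best = max(is_quality, key=is_quality.get)
    return is_quality[best], oos_quality[best]


# Hypothetical quality values (e.g. Sharpe ratios) for two IS/OOS combinations.
combinations_qualities = [
    ({"s1": 1.2, "s2": 0.8, "s3": 0.5}, {"s1": 0.3, "s2": 0.9, "s3": 0.1}),
    ({"s1": 0.4, "s2": 1.1, "s3": 0.7}, {"s1": 0.6, "s2": -0.2, "s3": 0.5}),
]
rows = [
    best_is_strategy_qualities(is_q, oos_q)
    for is_q, oos_q in combinations_qualities
]
```

Each element of `rows` pairs the in-sample quality of the winner with the quality the same strategy achieved out of sample, which makes any performance decay directly visible.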

is_ranking
: Optional[List[QFDataFrame]] List of QFDataFrames, each of which contains 2 columns: quality and rank, and is indexed by the strategies names. Looking at one of these data frames we can learn which strategy (identified by its name) had the highest performance (“rank”) and what that performance was (“quality”) in the InSample period.

oos_ranking
: Optional[List[QFDataFrame]] List of QFDataFrames, each of which contains 2 columns: quality and rank, and is indexed by the strategies names. Looking at one of these data frames we can learn which strategy (identified by its name) had the highest performance (“rank”) and what that performance was (“quality”) in the OutOfSample period.

rank_strategies
(df: qf_lib.containers.dataframe.simple_returns_dataframe.SimpleReturnsDataFrame, ascending: bool = True) → qf_lib.containers.dataframe.qf_dataframe.QFDataFrame Rank strategies using the ranking function. The worst strategy should be marked as 1.
 Parameters
df (SimpleReturnsDataFrame): dataframe containing the returns of different strategies in its columns
ascending (bool): if True, the smaller the measure, the worse the strategy
 Returns
data frame indexed by strategy names with two columns: quality (containing the quality measure computed for the given strategy, e.g. Sharpe ratio) and rank.
 Return type
QFDataFrame
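
The ranking convention (worst strategy marked as 1) can be sketched without qf_lib as follows. This is an assumption-laden simplification: strategies are held in a dictionary of return lists, the mean return stands in for the ranking function, and ties are not handled specially. The standalone `rank_strategies` below is a hypothetical stand-in, not the library method.

```python
from statistics import mean


def rank_strategies(returns_by_strategy, ranking_function, ascending=True):
    """Map each strategy name to (quality, rank); the worst strategy gets rank 1."""
    quality = {
        name: ranking_function(returns)
        for name, returns in returns_by_strategy.items()
    }
    # ascending=True: a smaller quality is worse, so sort from smallest to
    # largest. ascending=False: a larger quality is worse, so reverse the order.
    ordered = sorted(quality, key=quality.get, reverse=not ascending)
    return {name: (quality[name], rank) for rank, name in enumerate(ordered, start=1)}


# Hypothetical returns of three strategies.
returns_by_strategy = {
    "s1": [0.01, 0.02, -0.01],
    "s2": [0.00, -0.01, -0.02],
    "s3": [0.03, 0.01, 0.02],
}
ranking = rank_strategies(returns_by_strategy, mean)
```

Here `s2` has the lowest mean return and receives rank 1, while `s3` receives rank 3. Passing `ascending=False` would suit a measure where smaller is better, such as a drawdown.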