Analysing Backtest Results#

After running a backtest, QF-Lib generates a rich set of output files and in-memory objects that let you understand exactly how your strategy performed. This tutorial covers:

What output files are produced and what each contains.
How to compute key performance metrics from any price series using TimeseriesAnalysis.
How to generate the different tearsheet documents - with benchmark, without benchmark, and comparative.
How to analyse individual trades through TradeAnalysisSheet.

Note

All code samples on this page are based on the demo data provider and can be run directly from the demo_scripts/common/ folder of the repository.

Understanding the output directory#

When you run a backtest with BacktestTradingSessionBuilder, every result is written to a directory under output/backtesting/<backtest_name>/. The location of the output root is set by the output_directory key in your Settings JSON file.

The default output contains the following files:

File	Contents
`Transactions.csv`	One row per fill: timestamp, name of the asset, contract symbol (ticker), security type (stock, future etc), contract size (important for e.g. futures), quantity, price and commission.
`Config.yaml`	A record of every parameter set on the `BacktestTradingSessionBuilder` (frequency, commission model, slippage model, position sizer, etc.).
`Timeseries.xlsx`	Daily end-of-day portfolio value time series (and optional leverage sheet).
`Portfolio Analysis Sheet.pdf`	Charts showing asset-level performance, position count, and exposure over time.
`Tearsheet.pdf`	Summary with performance metrics and the equity curve.
`Trades Analysis Sheet.pdf`	Trade-level statistics: win rate, average trade duration, best/worst trade return, etc.

These files are generated automatically by BacktestMonitor. If you want to suppress output during a fast parameter sweep, pass BacktestMonitorSettings.no_stats() to session_builder.set_monitor_settings().

Performance metrics reference#

The tearsheet and console output both print a standard table of risk/return metrics. The table below explains each one:

Metric	Definition
Total Return	Cumulative return over the full period: `(final_value / initial_value) - 1`.
Annualised Return (CAGR)	Compound Annual Growth Rate - the geometric mean annual return.
Annualised Volatility	Standard deviation of daily returns scaled to annual frequency.
Annualised Upside / Downside Vol.	Volatility calculated using only positive / negative daily returns respectively.
Sharpe Ratio	Difference between the portfolio returns and the risk-free return, divided by the standard deviation of the portfolio returns.
Sortino Ratio	Difference between the portfolio returns and the risk-free return, divided by the standard deviation of the negative portfolio returns.
Omega Ratio	Probability-weighted ratio of gains to losses above a threshold return.
Calmar Ratio	Ratio comparing average annual returns to maximum drawdown.
Gain to Pain Ratio	Sum of all returns divided by the absolute sum of all negative returns.
5% CVaR	Conditional Value-at-Risk (expected shortfall) at the 5% level - average return on the worst 5% of days.
Max Drawdown	Largest peak-to-trough decline in portfolio value.
Avg Drawdown	Mean of all individual drawdown depths.
Avg Drawdown Duration	Mean number of days spent below a previous high-water mark.
Best / Worst Return	Largest single-day gain and loss.
Avg Positive / Negative Return	Mean return on days with gains / losses.
Skewness	Third moment of the return distribution. Positive skew means a long right tail (rare large gains).

TimeseriesAnalysis#

TimeseriesAnalysis is a standalone utility that computes all metrics above from any PricesSeries or ReturnsSeries. You do not need to run a backtest first.

Analysing a single series#

from qf_lib.analysis.timeseries_analysis.timeseries_analysis import TimeseriesAnalysis
from qf_lib.common.enums.frequency import Frequency
from qf_lib.common.enums.price_field import PriceField
from qf_lib.common.utils.dateutils.string_to_date import str_to_date

from demo_scripts.common.utils.dummy_ticker import DummyTicker
from demo_scripts.demo_configuration.demo_data_provider import daily_data_provider

start_date = str_to_date('2016-01-01')
end_date = str_to_date('2017-12-31')

# Get a price series for ticker AAA
prices = daily_data_provider.get_price(
    tickers=DummyTicker('AAA'),
    fields=PriceField.Close,
    start_date=start_date,
    end_date=end_date,
)

ta = TimeseriesAnalysis(prices, Frequency.DAILY)

# Pretty-print the full metrics table to the console
print(TimeseriesAnalysis.values_in_table(ta))

Output of the code snippet above:

                                          AAA
Start Date                         2016-01-04
End Date                           2017-12-29
Total Return                            13.53 %
Annualised Return                        6.59 %
Annualised Volatility                   10.65 %
Annualised Upside Vol.                   6.59 %
Annualised Downside Vol.                 7.13 %
Sharpe Ratio                             0.60
Omega Ratio                              1.11
Calmar Ratio                             0.57
Gain to Pain Ratio                       0.54
Sortino Ratio                            0.92
5% CVaR                                 -1.49 %
Annualised 5% CVaR                     -21.23 %
Max Drawdown                            11.47 %
Avg Drawdown                             4.01 %
Avg Drawdown Duration                   32.81 days
Best Return                              2.76 %
Worst Return                            -2.56 %
Avg Positive Return                      0.52 %
Avg Negative Return                     -0.51 %
Skewness                                -0.14
No. of daily samples                      520

Analysing multiple series at once#

If you have prices for several tickers in a QFDataFrame, pass it to table_for_df to get a side-by-side comparison table:

from qf_lib.analysis.timeseries_analysis.timeseries_analysis import TimeseriesAnalysis
from qf_lib.common.enums.price_field import PriceField
from qf_lib.common.utils.dateutils.string_to_date import str_to_date

from demo_scripts.common.utils.dummy_ticker import DummyTicker
from demo_scripts.demo_configuration.demo_data_provider import daily_data_provider

start_date = str_to_date('2016-01-01')
end_date = str_to_date('2017-12-31')

tickers = [DummyTicker('AAA'), DummyTicker('BBB'), DummyTicker('CCC'), DummyTicker('DDD')]

prices_df = daily_data_provider.get_price(
    tickers=tickers,
    fields=PriceField.Close,
    start_date=start_date,
    end_date=end_date,
)
prices_df.ffill(inplace=True)  # forward-fill any gaps

print(TimeseriesAnalysis.table_for_df(prices_df))

The result is a DataFrame with one row per metric and one column per ticker - convenient for programmatic comparison.Output of the code snippet above:

Analysed period: 2016-01-04 - 2017-12-29, using daily data
Name            total_ret        cagr         vol      up_vol    down_vol      sharpe       omega      calmar   gain/pain     sortino        cvar     cvar_an      max_dd      avg_dd  avg_dd_dur    best_ret   worst_ret avg_pos_ret avg_neg_ret    skewness     #observ
AAA                 13.53        6.59       10.65        6.59        7.13        0.60        1.11        0.57        0.54        0.92       -1.49      -21.23       11.47        4.01       32.81        2.76       -2.56        0.52       -0.51       -0.14         520
BBB                 36.32       16.87        9.96        6.85        5.44        1.56        1.29        1.40        1.86        3.10       -1.17      -17.01       12.04        3.29       24.36        2.24       -1.72        0.51       -0.47        0.31         520
CCC                -35.35      -19.70       12.55        7.86        8.15       -1.75        0.77       -0.47       -0.73       -2.42       -1.80      -25.03       41.95       26.53      180.50        4.07       -2.82        0.58       -0.64        0.13         520
DDD                 34.16       15.93       10.30        7.15        6.13        1.44        1.27        2.02        1.44        2.60       -1.25      -18.09        7.90        2.63       23.10        2.52       -2.65        0.52       -0.47        0.25         520

Accessing individual metrics#

You can also read individual attributes from a TimeseriesAnalysis object directly:

ta = TimeseriesAnalysis(prices, Frequency.DAILY)

print("Sharpe:       ", ta.sharpe_ratio)
print("Max Drawdown: ", ta.max_drawdown)
print("CAGR:         ", ta.cagr)
print("Sortino:      ", ta.sortino_ratio)

Generating tearsheets#

Tearsheets can be generated independently of a backtest, from any two price series (strategy and benchmark). There are three types:

Class	When to use
`TearsheetWithoutBenchmark`	You only have a strategy series and no benchmark to compare against.
`TearsheetWithBenchmark`	You want to show the strategy alongside a benchmark with relative performance charts.
`TearsheetComparative`	You want to compare two strategies or a strategy vs benchmark side by side.

All three classes follow the same pattern: instantiate, call build_document(), then save().

Tearsheet without benchmark#

from qf_lib.analysis.tearsheets.tearsheet_without_benchmark import TearsheetWithoutBenchmark
from qf_lib.common.enums.price_field import PriceField
from qf_lib.common.utils.dateutils.string_to_date import str_to_date
from qf_lib.documents_utils.document_exporting.pdf_exporter import PDFExporter

from demo_scripts.common.utils.dummy_ticker import DummyTicker
from demo_scripts.demo_configuration.demo_data_provider import daily_data_provider
from demo_scripts.demo_configuration.demo_settings import get_demo_settings

start_date = str_to_date('2013-01-01')
end_date = str_to_date('2017-12-31')

strategy = daily_data_provider.get_price(
    tickers=DummyTicker('AAA'),
    fields=PriceField.Close,
    start_date=start_date,
    end_date=end_date,
)
strategy.name = "My Strategy"

settings = get_demo_settings()
pdf_exporter = PDFExporter(settings)

tearsheet = TearsheetWithoutBenchmark(
    settings, pdf_exporter, strategy, title="Strategy Tearsheet"
)
tearsheet.build_document()
tearsheet.save(file_name="Strategy Tearsheet")

Tearsheet with benchmark#

Add a benchmark series and an optional live_date to mark the boundary between in-sample and out-of-sample periods:

from qf_lib.analysis.tearsheets.tearsheet_with_benchmark import TearsheetWithBenchmark

live_date = str_to_date('2015-01-01')  # strategy went live on this date

benchmark = daily_data_provider.get_price(
    tickers=DummyTicker('BBB'),
    fields=PriceField.Close,
    start_date=start_date,
    end_date=end_date,
)
benchmark.name = "Benchmark"

tearsheet = TearsheetWithBenchmark(
    settings, pdf_exporter, strategy, benchmark,
    live_date=live_date,
    title="Strategy vs Benchmark",
)
tearsheet.build_document()
tearsheet.save(file_name="Strategy vs Benchmark")

The tearsheet will draw a vertical line at live_date to separate the backtested period from live trading. If you omit live_date (or pass None), no line is drawn.

Comparative tearsheet#

Use TearsheetComparative to overlay two equity curves on the same page:

from qf_lib.analysis.tearsheets.tearsheet_comparative import TearsheetComparative

tearsheet = TearsheetComparative(
    settings, pdf_exporter, strategy, benchmark,
    title="Strategy vs Benchmark Comparative",
)
tearsheet.build_document()
tearsheet.save(file_name="Comparative Tearsheet")

Trade analysis#

A transaction is a single fill (one buy or sell execution). A trade is a completed round-trip in one asset: all fills from opening a position until it is flat again.

BacktestPosition stores that round-trip inside the backtester. Each closed position records:

start_time / end_time - when the position was opened and fully closed
total_pnl and total_commission() - currency P&L including fees
direction() - 1 for long, -1 for short
ticker() - the instrument

TradesGenerator can build Trade objects in two equivalent ways:

From closed positions - portfolio.closed_positions() after a backtest (one trade per closed position).
From transactions - any sequence of Transaction objects, including rows loaded from Transactions.csv.

Both paths use the same trade logic; unit tests verify that create_trades_from_transactions matches create_trades_from_backtest_positions for the same fills.

From closed `BacktestPosition` objects (in memory)#

After ts.start_trading(), use closed positions and the portfolio NAV series so each trade gets a percentage P&L (P&L divided by portfolio value at position open):

from qf_lib.analysis.trade_analysis.trade_analysis_sheet import TradeAnalysisSheet
from qf_lib.analysis.trade_analysis.trades_generator import TradesGenerator

trades_generator = TradesGenerator()
closed_positions = ts.portfolio.closed_positions()
portfolio_tms = ts.portfolio.portfolio_eod_series()

trades = trades_generator.create_trades_from_backtest_positions(
    closed_positions, portfolio_tms
)

if trades:
    nr_of_assets = len({t.ticker.name for t in trades})
    start = portfolio_tms.index[0]
    end = portfolio_tms.index[-1]

    sheet = TradeAnalysisSheet(
        settings, pdf_exporter,
        nr_of_assets_traded=nr_of_assets,
        trades=trades,
        start_date=start,
        end_date=end,
        title="Trade Analysis",
    )
    sheet.build_document()
    sheet.save("output/backtesting/my_run")

TradeAnalysisSheet produces a PDF with win rate, average trade duration, best/worst trade, return distributions, and related statistics.

From `Transactions.csv` (after the backtest)#

BacktestMonitor writes Transactions.csv when issue_transaction_log is enabled (default in BacktestMonitorSettings). Columns: Timestamp, Asset Name, Contract symbol, Security type, Contract size, Quantity, Price, Commission.

Use this when you want to rebuild trade analysis without re-running the backtest - for example from an archived output folder.

import os

import pandas as pd

from demo_scripts.common.utils.dummy_ticker import DummyTicker
from demo_scripts.demo_configuration.demo_settings import get_demo_settings
from qf_lib.analysis.trade_analysis.trade_analysis_sheet import TradeAnalysisSheet
from qf_lib.analysis.trade_analysis.trades_generator import TradesGenerator
from qf_lib.backtesting.portfolio.transaction import Transaction
from qf_lib.containers.series.qf_series import QFSeries
from qf_lib.documents_utils.document_exporting.pdf_exporter import PDFExporter

def load_transactions_csv(path: str):
    """Map monitor CSV rows to Transaction objects (demo: DummyTicker per contract symbol)."""
    df = pd.read_csv(path)
    transactions = []
    for _, row in df.iterrows():
        # For Bloomberg or other providers, replace DummyTicker with your Ticker subclass.
        ticker = DummyTicker(row["Contract symbol"])
        transactions.append(Transaction(
            pd.to_datetime(row["Timestamp"]),
            ticker,
            row["Quantity"],
            row["Price"],
            row["Commission"],
        ))
    return transactions

def load_portfolio_eod_xlsx(path: str) -> QFSeries:
    """Load NAV from Timeseries.xlsx written by BacktestMonitor (first data column)."""
    df = pd.read_excel(path, index_col=0, parse_dates=True)
    series = QFSeries(df.iloc[:, 0])
    series.name = "Portfolio"
    return series

def main():
    # Folder produced by BacktestMonitor, e.g. output/backtesting/<backtest_name>/
    report_dir = "output/backtesting/my_backtest"
    transactions_path = os.path.join(report_dir, "2025_01_15-1200 Transactions.csv")
    timeseries_path = os.path.join(report_dir, "2025_01_15-1200 Timeseries.xlsx")

    settings = get_demo_settings()
    pdf_exporter = PDFExporter(settings)

    transactions = load_transactions_csv(transactions_path)
    portfolio_tms = load_portfolio_eod_xlsx(timeseries_path)

    trades_generator = TradesGenerator()
    trades = trades_generator.create_trades_from_transactions(transactions, portfolio_tms)

    if not trades:
        print("No closed round-trip trades found in the CSV.")
        return

    nr_of_assets = len({t.ticker.name for t in trades})
    start_date = min(t.start_time for t in trades)
    end_date = max(t.end_time for t in trades)

    sheet = TradeAnalysisSheet(
        settings, pdf_exporter,
        nr_of_assets_traded=nr_of_assets,
        trades=trades,
        start_date=start_date,
        end_date=end_date,
        title="Trade Analysis from CSV",
    )
    sheet.build_document()
    sheet.save(report_dir)

if __name__ == "__main__":
    main()

Note

Match the actual filenames in your output folder (timestamp prefix varies per run).
load_transactions_csv must construct the same Ticker types you used in the backtest. The demo uses DummyTicker from demo_scripts; for live data use e.g. BloombergTicker and map Security type / Contract size from the CSV.
If you omit portfolio_tms, trades are still created but percentage_pnl may be empty and some charts in the sheet will be limited.

Reading trade results programmatically#

Each Trade exposes:

for trade in trades:
    print(f"Ticker:   {trade.ticker}")
    print(f"P&L:      {trade.pnl:.2f}")
    if trade.percentage_pnl is not None:
        print(f"P&L (%):  {trade.percentage_pnl:.2%}")
    print(f"Direction: {trade.direction}")
    print(f"Open:     {trade.start_time}  →  Close: {trade.end_time}")

Example output:

Ticker:   DummyTicker('AAA')
P&L:      399319.04
P&L (%):  4.01%
Direction: 1.0
Open:     2010-01-01 13:30:00  →  Close: 2010-02-09 13:30:00

Ticker:   DummyTicker('AAA')
P&L:      -35780.43
P&L (%):  -0.34%
Direction: 1.0
Open:     2010-03-25 13:30:00  →  Close: 2010-04-09 13:30:00

Analysing Backtest Results#

Understanding the output directory#

Performance metrics reference#

TimeseriesAnalysis#

Analysing a single series#

Analysing multiple series at once#

Accessing individual metrics#

Generating tearsheets#

Tearsheet without benchmark#

Tearsheet with benchmark#

Comparative tearsheet#

Trade analysis#

From closed BacktestPosition objects (in memory)#

From Transactions.csv (after the backtest)#

Reading trade results programmatically#

From closed `BacktestPosition` objects (in memory)#

From `Transactions.csv` (after the backtest)#