QFDataFrame

class qf_lib.containers.dataframe.qf_dataframe.QFDataFrame(data=None, index: Axes | None = None, columns: Axes | None = None, dtype: Dtype | None = None, copy: bool | None = None)

Bases: DataFrame, TimeIndexedContainer

Base class for all data frames (2-D matrix-like objects) used in the project. All columns within the dataframe cover the same date range and have the same frequency. All columns hold the same type of data (e.g. log-returns or prices).

Methods:

exponential_average([lambda_coeff])

Calculates the exponential average of a dataframe.

get_frequency()

Attempts to infer the frequency of each column in this dataframe.

min_max_normalized([original_min_values, ...])

Normalizes the data using min-max scaling: it maps all the data to the [0;1] range, so that 0 corresponds to the minimal value in the original series and 1 corresponds to the maximal value.

rolling_time_window(window_length, step, func)

Runs a given function on each rolling window in the dataframe.

rolling_window(window_size, func[, step, ...])

Looks at a number of windows of size window_size and transforms the data in those windows based on the specified func.

to_log_returns()

Converts dataframe to the dataframe of logarithmic returns.

to_prices([initial_prices, ...])

Converts a dataframe to the dataframe of prices.

to_simple_returns()

Converts dataframe to the dataframe of simple returns.

total_cumulative_return()

Calculates total cumulative return for each column.

exponential_average(lambda_coeff: float = 0.94) → QFDataFrame

Calculates the exponential average of a dataframe.

Parameters:

lambda_coeff – lambda coefficient

Returns:

smoothed version (exponential average) of the data frame

Return type:

QFDataFrame
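The smoothing recursion is not spelled out above. The following is a minimal pure-Python sketch for a single column, assuming the common form s[0] = x[0], s[t] = lambda_coeff * s[t-1] + (1 - lambda_coeff) * x[t]. The function name exponential_average_sketch and the seeding with the first observation are assumptions made for illustration, not the library's confirmed implementation.

```python
from typing import List


def exponential_average_sketch(values: List[float], lambda_coeff: float = 0.94) -> List[float]:
    """Smooth one column with an assumed exponential-average recursion."""
    if not values:
        return []
    # Assumption: the first smoothed value is the first observation itself.
    smoothed = [values[0]]
    for x in values[1:]:
        # A higher lambda_coeff puts more weight on the previous smoothed value.
        smoothed.append(lambda_coeff * smoothed[-1] + (1.0 - lambda_coeff) * x)
    return smoothed


column = [1.0, 2.0, 3.0]
smoothed_column = exponential_average_sketch(column, lambda_coeff=0.5)
```

Under this assumed recursion the output has the same length as the input, and a lambda_coeff close to 1 produces a slowly reacting, heavily smoothed series.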

get_frequency() → Mapping[str, Frequency]

Attempts to infer the frequency of each column in this dataframe. The analysis uses pandas’ infer_freq, as well as a heuristic to reduce the number of Irregular results.

See the implementation of the Frequency.infer_freq function for more information.

min_max_normalized(original_min_values: Optional[Sequence[float]] = None, original_max_values: Optional[Sequence[float]] = None) → QFDataFrame

Normalizes the data using min-max scaling: it maps all the data to the [0;1] range, so that 0 corresponds to the minimal value in the original series and 1 corresponds to the maximal value. It is also possible to specify the values which should correspond to 0 and 1 after the normalization. This is useful when the same normalization parameters have to be applied to different data.

Parameters:
  • original_min_values – values which should correspond to 0 after applying the normalization (one value for each column)

  • original_max_values – values which should correspond to 1 after applying the normalization (one value for each column)

Returns:

dataframe of normalized values

Return type:

QFDataFrame
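As an illustration of the scaling described above, here is a small pure-Python sketch for one column using the standard formula (x - min) / (max - min). The helper name min_max_normalize_column and the ValueError for a constant column are assumptions made for the example; the library's behaviour in that edge case is not documented here.

```python
from typing import List, Optional


def min_max_normalize_column(
    values: List[float],
    original_min: Optional[float] = None,
    original_max: Optional[float] = None,
) -> List[float]:
    """Scale one column so that `lo` maps to 0 and `hi` maps to 1."""
    lo = min(values) if original_min is None else original_min
    hi = max(values) if original_max is None else original_max
    if hi == lo:
        # Assumption: a zero-width range is treated as an error in this sketch.
        raise ValueError("min and max must differ")
    return [(v - lo) / (hi - lo) for v in values]


train = [10.0, 15.0, 20.0]
new_data = [12.0, 25.0]

train_scaled = min_max_normalize_column(train)
# Reuse the bounds taken from `train` so both data sets share one scale.
new_scaled = min_max_normalize_column(new_data, original_min=10.0, original_max=20.0)
```

When explicit bounds are supplied, values outside them fall outside [0;1]; here 25.0 lies above the supplied maximum of 20.0, so it maps to a value greater than 1.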

rolling_time_window(window_length: int, step: int, func: Callable[[Union[QFDataFrame, ndarray]], QFSeries]) → Union[None, QFSeries, QFDataFrame]

Runs a given function on each rolling window in the dataframe. The content of each rolling window is itself a QFDataFrame, so the function to be applied must accept a QFDataFrame as its argument.

The function may return either a QFSeries (then the output of rolling_time_window will be QFDataFrame) or a scalar value (then the output of rolling_time_window will be QFSeries).

The rolling window is moved along the time index (rows).

Parameters:
  • window_length – number of rows included in each rolling window

  • step – number of rows by which the rolling window is moved between iterations

  • func – function to apply on each rolling window. If it returns a QFSeries then the output of rolling_time_window() will be a QFDataFrame; if it returns a scalar value, the return value of rolling_time_window() will be a QFSeries

Returns:

None (if the result of running the rolling window was empty) or QFSeries (if the function applied returned scalar value for each window) or QFDataFrame (if the function applied returned QFSeries for each window)

Return type:

None, QFSeries, QFDataFrame
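The row-wise windowing can be sketched in plain Python as follows, for the case where func returns a scalar. The name rolling_time_window_sketch is hypothetical, and labelling each result with the last index value of its window is an assumption of this sketch rather than documented behaviour.

```python
from typing import Callable, Dict, Hashable, Optional, Sequence

Rows = Sequence[Sequence[float]]


def rolling_time_window_sketch(
    index: Sequence[Hashable],
    rows: Rows,
    window_length: int,
    step: int,
    func: Callable[[Rows], float],
) -> Optional[Dict[Hashable, float]]:
    """Apply `func` to consecutive blocks of `window_length` rows."""
    results: Dict[Hashable, float] = {}
    for start in range(0, len(rows) - window_length + 1, step):
        end = start + window_length
        # Assumption: each result is labelled with the window's last index value.
        results[index[end - 1]] = func(rows[start:end])
    # Mirror the documented contract: None when no window could be formed.
    return results or None


index = ["d1", "d2", "d3", "d4", "d5"]
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]

# A scalar-returning function: the sum of every value in the window.
window_sums = rolling_time_window_sketch(
    index, rows, window_length=3, step=2,
    func=lambda window: sum(sum(row) for row in window),
)
```

With five rows, a window of three rows and a step of two, only two full windows fit (rows 1 to 3 and rows 3 to 5), so the result holds two values.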

rolling_window(window_size: int, func: Callable[[Union[QFSeries, ndarray]], float], step: int = 1, optimised: bool = False) → QFDataFrame

Looks at a number of windows of size window_size and transforms the data in those windows based on the specified func. This is performed for each column inside this data frame.

The window indices are stepped at a rate specified by step.

Warning: The other parameter is only present to keep consistency with QFSeries’ rolling_window function; it should always be None.

Parameters:
  • window_size – The size of the window to look at specified as the number of data points.

  • func – The function to call during each iteration. When other is None, this function should take one QFSeries and return a value (usually a number, such as a float). Otherwise, this function should take two QFSeries arguments and return a value.

  • step – The number of data points to step through after each iteration, i.e. how far the window moves in each iteration.

  • optimised – Whether the more efficient pandas algorithm should be used for the rolling window application. This has two limitations: step must be 1, and func receives an ndarray that contains only the values, without the index.

Returns:

data frame containing the transformed data

Return type:

QFDataFrame
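A minimal pure-Python sketch of the column-wise behaviour described above, with each column held as a plain list. The name rolling_window_sketch is hypothetical, and the sketch covers only the non-optimised, single-argument case.

```python
from typing import Callable, Dict, List, Sequence


def rolling_window_sketch(
    columns: Dict[str, List[float]],
    window_size: int,
    func: Callable[[Sequence[float]], float],
    step: int = 1,
) -> Dict[str, List[float]]:
    """Apply `func` to every window of `window_size` points, column by column."""
    transformed: Dict[str, List[float]] = {}
    for name, values in columns.items():
        transformed[name] = [
            func(values[start:start + window_size])
            # Move the window start forward by `step` points each iteration.
            for start in range(0, len(values) - window_size + 1, step)
        ]
    return transformed


frame = {"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]}
rolling_max = rolling_window_sketch(frame, window_size=2, func=max)
rolling_max_step2 = rolling_window_sketch(frame, window_size=2, func=max, step=2)
```

Each column is processed independently, so func never sees data from more than one column at a time.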

to_log_returns() → LogReturnsDataFrame

Converts dataframe to the dataframe of logarithmic returns. The first date of the price series will not be present in the returns dataframe, because no return can be calculated for it.

Returns:

dataframe of log returns

Return type:

LogReturnsDataFrame
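For reference, the conversion from prices to logarithmic returns for one column can be sketched as below, using the standard definition r[t] = ln(p[t] / p[t-1]). The function name log_returns_from_prices is an illustrative assumption.

```python
import math
from typing import List


def log_returns_from_prices(prices: List[float]) -> List[float]:
    """Return ln(p[t] / p[t-1]) for every consecutive pair of prices."""
    # Pairing each price with its successor drops the first date from the result.
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


price_column = [100.0, 110.0, 99.0]
log_returns = log_returns_from_prices(price_column)
```

The result has one element fewer than the input, and log returns add up over time: their sum equals the log of the last price divided by the first.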

to_prices(initial_prices: Sequence[float] = None, suggested_initial_date: Union[datetime, int, float] = None, frequency: Frequency = None) → PricesDataFrame

Converts a dataframe to the dataframe of prices. The returned prices dataframe has one extra date at the beginning compared with the returns dataframe. This “initial date” exists because no return can be calculated for the first date of a price series, so that date is missing from the returns; the opposite conversion has to add it back. The spacing between the initial date and the remaining dates is either inferred from the returns dataframe or calculated from the frequency passed as the optional argument.

Parameters:
  • initial_prices – initial price for all timeseries. If no prices are specified, then they will be assumed to be 1. If only one value is passed (instead of a list with values for each column), then the initial price will be the same for each series contained within the dataframe.

  • suggested_initial_date – the first date or initial value for the prices series. It will not necessarily end up as the first date of the price series (e.g. if the method is run on a PricesDataFrame, it is not used).

  • frequency – the frequency of the returns’ timeseries. It is used to infer the initial date for the prices series.

Returns:

dataframe of prices

Return type:

PricesDataFrame
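The sketch below shows the idea for a single column of simple returns, in plain Python. The name prices_from_simple_returns is hypothetical, and inferring the initial date by stepping back one spacing from the first return date is an assumption that stands in for the library's actual date inference.

```python
from datetime import date
from typing import List, Sequence, Tuple


def prices_from_simple_returns(
    dates: Sequence[date],
    returns: Sequence[float],
    initial_price: float = 1.0,
) -> Tuple[List[date], List[float]]:
    """Rebuild a price series, prepending one extra "initial date"."""
    if len(dates) != len(returns) or len(dates) < 2:
        raise ValueError("need matching dates/returns with at least two points")
    # Assumption: the initial date sits one regular spacing before the first date.
    spacing = dates[1] - dates[0]
    price_dates = [dates[0] - spacing] + list(dates)
    prices = [initial_price]
    for r in returns:
        # Each price is the previous price grown by the period's simple return.
        prices.append(prices[-1] * (1.0 + r))
    return price_dates, prices


return_dates = [date(2024, 1, 2), date(2024, 1, 3)]
price_dates, rebuilt_prices = prices_from_simple_returns(
    return_dates, [0.10, -0.10], initial_price=100.0
)
```

Two returns produce three prices: the initial price on the added initial date, followed by one price per return date.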

to_simple_returns() → SimpleReturnsDataFrame

Converts dataframe to the dataframe of simple returns. The first date of the price series will not be present in the returns dataframe, because no return can be calculated for it.

Returns:

dataframe of simple returns

Return type:

SimpleReturnsDataFrame
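For comparison with logarithmic returns, simple returns for one column follow the standard definition r[t] = p[t] / p[t-1] - 1. The sketch below uses a hypothetical helper name, simple_returns_from_prices.

```python
from typing import List


def simple_returns_from_prices(prices: List[float]) -> List[float]:
    """Return p[t] / p[t-1] - 1 for every consecutive pair of prices."""
    # As with log returns, the first date has no predecessor and is dropped.
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


observed_prices = [100.0, 110.0, 99.0]
simple_returns = simple_returns_from_prices(observed_prices)
```

Unlike log returns, simple returns compound multiplicatively: a 10% gain followed by a 10% loss leaves the price 1% below its starting level.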

total_cumulative_return() → QFSeries

Calculates total cumulative return for each column.

Returns:

Series containing total cumulative return for each column of the original DataFrame.

Return type:

QFSeries
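A minimal sketch of the per-column result, assuming the usual definition of total cumulative return as the last price divided by the first price, minus 1. The name total_cumulative_return_sketch and the representation of columns as plain lists of prices are assumptions for illustration.

```python
from typing import Dict, List


def total_cumulative_return_sketch(price_columns: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute last / first - 1 for every column of prices."""
    return {
        name: prices[-1] / prices[0] - 1.0
        for name, prices in price_columns.items()
    }


price_frame = {"a": [100.0, 110.0, 121.0], "b": [50.0, 45.0]}
totals = total_cumulative_return_sketch(price_frame)
```

The result has one value per column, which matches the documented return type of a series indexed by the original column names.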