QFDataFrame

class qf_lib.containers.dataframe.qf_dataframe.QFDataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
Bases: pandas.core.frame.DataFrame, qf_lib.containers.time_indexed_container.TimeIndexedContainer
Base class for all data frames (2D matrix-like objects) used in the project. All columns within the dataframe contain values for the same date range and have the same frequency. All columns are of the same type (e.g. log-returns or prices).
Methods
exponential_average(lambda_coeff) – Calculates the exponential average of a dataframe.
get_frequency() – Attempts to infer the frequency of each column in this dataframe.
min_max_normalized(original_min_values, …) – Normalizes the data using min-max scaling: it maps all the data to the [0, 1] range, so that 0 corresponds to the minimal value in the original series and 1 corresponds to the maximal value.
rolling_time_window(window_length, step, …) – Runs a given function on each rolling window in the dataframe.
rolling_window(window_size, func, …) – Looks at a number of windows of size window_size and transforms the data in those windows based on the specified func.
to_log_returns() – Converts dataframe to the dataframe of logarithmic returns.
to_prices(initial_prices, …) – Converts a dataframe to the dataframe of prices.
Converts dataframe to the dataframe of simple returns.
Calculates total cumulative return for each column.

exponential_average(lambda_coeff: float = 0.94) → qf_lib.containers.dataframe.qf_dataframe.QFDataFrame
Calculates the exponential average of a dataframe.
 Parameters
lambda_coeff – lambda coefficient
 Returns
smoothed version (exponential average) of the data frame
 Return type
QFDataFrame
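
The smoothing can be illustrated with a plain-Python sketch. It assumes the common recursive definition s[0] = x[0], s[t] = lambda_coeff * s[t-1] + (1 - lambda_coeff) * x[t]; whether qf_lib uses exactly this recursion and seeding is an assumption, and the dict-of-lists input is only a stand-in for a dataframe's columns.

```python
from typing import Dict, List


def exponential_average(columns: Dict[str, List[float]],
                        lambda_coeff: float = 0.94) -> Dict[str, List[float]]:
    """Smooth every column independently (illustrative, not qf_lib's code)."""
    smoothed: Dict[str, List[float]] = {}
    for name, values in columns.items():
        out: List[float] = []
        for x in values:
            if not out:
                # Assumed seeding: the first smoothed value equals the first observation.
                out.append(x)
            else:
                out.append(lambda_coeff * out[-1] + (1.0 - lambda_coeff) * x)
        smoothed[name] = out
    return smoothed
```

A lambda_coeff close to 1 gives more weight to the previous smoothed value, so the output reacts more slowly to new observations.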

get_frequency() → Mapping[str, qf_lib.common.enums.frequency.Frequency]
Attempts to infer the frequency of each column in this dataframe. The analysis uses pandas’ infer_freq, as well as a heuristic to reduce the number of Irregular results. See the implementation of the Frequency.infer_freq function for more information.

min_max_normalized(original_min_values: Sequence[float] = None, original_max_values: Sequence[float] = None) → qf_lib.containers.dataframe.qf_dataframe.QFDataFrame
Normalizes the data using min-max scaling: it maps all the data to the [0, 1] range, so that 0 corresponds to the minimal value in the original series and 1 corresponds to the maximal value. It is also possible to specify the values which should correspond to 0 and 1 after applying the normalization. This is useful when the same normalization parameters are used to normalize different data.
 Parameters
original_min_values – values which should correspond to 0 after applying the normalization (one value for each column)
original_max_values – values which should correspond to 1 after applying the normalization (one value for each column)
 Returns
dataframe of normalized values
 Return type
QFDataFrame
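
The arithmetic behind min-max scaling for a single column can be sketched as follows. This is an illustration of the formula (x - min) / (max - min), not qf_lib's implementation; the single-column list input and the ValueError for a constant column are assumptions made for the example.

```python
from typing import List, Optional, Sequence


def min_max_normalized(column: Sequence[float],
                       original_min: Optional[float] = None,
                       original_max: Optional[float] = None) -> List[float]:
    """Map values so that original_min -> 0 and original_max -> 1."""
    lo = min(column) if original_min is None else original_min
    hi = max(column) if original_max is None else original_max
    if hi == lo:
        # Assumed behaviour for this sketch: a zero-width range cannot be scaled.
        raise ValueError("min and max must differ")
    return [(x - lo) / (hi - lo) for x in column]
```

Passing the minimum and maximum explicitly allows a second data set to be scaled with the parameters of the first; values outside that range then fall outside [0, 1].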

rolling_time_window(window_length: int, step: int, func: Callable[[Union[QFDataFrame, numpy.ndarray]], QFSeries]) → Union[None, QFSeries, QFDataFrame]
Runs a given function on each rolling window in the dataframe. The content of each rolling window is itself a QFDataFrame, so the function to be applied should accept a QFDataFrame as an argument.
The function may return either a QFSeries (then the output of rolling_time_window will be QFDataFrame) or a scalar value (then the output of rolling_time_window will be QFSeries).
The rolling window is moved along the time index (rows).
 Parameters
window_length – number of rows included in each rolling window
step – number of rows by which the rolling window is moved
func – function to apply to each rolling window. If it returns a QFSeries, the output of rolling_time_window() will be a QFDataFrame; if it returns a scalar value, the output of rolling_time_window() will be a QFSeries
 Returns
None (if the result of running the rolling window was empty), a QFSeries (if the applied function returned a scalar value for each window) or a QFDataFrame (if the applied function returned a QFSeries for each window)
 Return type
None, QFSeries, QFDataFrame
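
The window mechanics described above can be sketched in plain Python for the scalar-returning case. Rows are modelled as a list of lists and the result as a plain list; both are stand-ins chosen for this example, and how qf_lib labels each result in the output index is not shown here.

```python
from typing import Callable, List, Optional, Sequence

Row = Sequence[float]


def rolling_time_window(rows: Sequence[Row], window_length: int, step: int,
                        func: Callable[[Sequence[Row]], float]) -> Optional[List[float]]:
    """Apply func to each block of window_length consecutive rows, moving by step rows."""
    last_start = len(rows) - window_length
    results = [func(rows[start:start + window_length])
               for start in range(0, last_start + 1, step)]
    # Mirror the documented behaviour: an empty result is reported as None.
    return results or None
```

Each call to func sees all columns of the window at once, which is what distinguishes this method from a per-column rolling calculation.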

rolling_window(window_size: int, func: Callable[[Union[QFSeries, numpy.ndarray]], float], step: int = 1, optimised: bool = False) → QFDataFrame
Looks at a number of windows of size window_size and transforms the data in those windows based on the specified func. This is performed for each column inside this data frame. The window indices are stepped at a rate specified by step.
Warning: The other parameter is only present to keep consistency with QFSeries’ rolling_window function; it should always be None.
 Parameters
window_size – The size of the window to look at, specified as the number of data points.
func – The function to call during each iteration. When other is None, this function should take one QFSeries and return a value (usually a number such as a float). Otherwise, this function should take two QFSeries arguments and return a value.
step – The number of data points to step through after each iteration, i.e. how far the window moves in each iteration.
optimised – Whether the more efficient pandas algorithm should be used for the rolling window application. Note: this has some limitations: step must be 1, and func will receive an ndarray parameter which contains only values and no index.
 Returns
data frame containing the transformed data
 Return type
QFDataFrame
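
A per-column version of the rolling calculation can be sketched as below. It is a plain-Python illustration of the window_size and step semantics only; the dict-of-lists container is a stand-in for a dataframe, and the optimised code path is not modelled.

```python
from typing import Callable, Dict, List, Sequence


def rolling_window(columns: Dict[str, Sequence[float]], window_size: int,
                   func: Callable[[Sequence[float]], float],
                   step: int = 1) -> Dict[str, List[float]]:
    """Apply func to every window of window_size points, separately for each column."""
    transformed: Dict[str, List[float]] = {}
    for name, values in columns.items():
        last_start = len(values) - window_size
        transformed[name] = [func(values[start:start + window_size])
                             for start in range(0, last_start + 1, step)]
    return transformed
```

Here func only ever sees the values of one column at a time, so the columns are transformed independently of each other.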

to_log_returns() → LogReturnsDataFrame
Converts dataframe to the dataframe of logarithmic returns. The first date of the prices will not be present in the returns dataframe.
 Returns
dataframe of log returns
 Return type
LogReturnsDataFrame
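
For a single price column, the conversion and the loss of the first date can be sketched as follows. The sketch uses the standard definition of a log return, ln(p[t] / p[t-1]); the parallel lists of dates and prices are a simplification chosen for the example.

```python
import math
from typing import List, Sequence, Tuple


def to_log_returns(dates: Sequence[str],
                   prices: Sequence[float]) -> Tuple[List[str], List[float]]:
    """Return the dates and log returns; the first date has no return and is dropped."""
    returns = [math.log(prices[i] / prices[i - 1]) for i in range(1, len(prices))]
    return list(dates[1:]), returns
```

Three prices therefore yield two returns, dated with the second and third dates of the price series.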

to_prices(initial_prices: Sequence[float] = None, suggested_initial_date: Union[datetime.datetime, int, float] = None, frequency: qf_lib.common.enums.frequency.Frequency = None) → PricesDataFrame
Converts a dataframe to the dataframe of prices. The returned dataframe of prices will have an extra date at the beginning (compared with the returns dataframe). The difference between the extra date and the rest of the dates can be inferred from the returns dataframe or calculated using the frequency passed as the optional argument. The additional date at the beginning (the so-called “initial date”) is needed because the return for the first date of a price timeseries cannot be calculated, so it is missing from the returns. Thus, during the opposite conversion, an extra date is added at the beginning.
 Parameters
initial_prices – initial price for each timeseries. If no prices are specified, they are assumed to be 1. If only one value is passed (instead of a list with a value for each column), the initial price will be the same for each series contained within the dataframe.
suggested_initial_date – the first date or initial value for the prices series. It will not necessarily be the first date of the price series (e.g. if the method is run on a PricesDataFrame, it won’t be used).
frequency – the frequency of the returns timeseries. It is used to infer the initial date for the prices series.
 Returns
dataframe of prices
 Return type
PricesDataFrame
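
The role of the extra initial date can be sketched for one column of returns. The sketch assumes simple returns (the method is also documented for other return types), an initial price of 1, and an initial date obtained by stepping back one interval from the first return date. The step parameter is a hypothetical stand-in for the frequency argument, and the inference from the first two dates is an assumption, not qf_lib's actual logic.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Sequence, Tuple


def to_prices(dates: Sequence[datetime], simple_returns: Sequence[float],
              initial_price: float = 1.0,
              step: Optional[timedelta] = None) -> Tuple[List[datetime], List[float]]:
    """Rebuild prices from simple returns, prepending the initial date and price."""
    if step is None:
        # Assumed inference: reuse the spacing of the first two return dates.
        step = dates[1] - dates[0]
    price_dates = [dates[0] - step] + list(dates)
    prices = [initial_price]
    for r in simple_returns:
        prices.append(prices[-1] * (1.0 + r))
    return price_dates, prices
```

The output has one more element than the input, which is the counterpart of the date lost when prices are converted to returns.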
