QFDataFrame
- class qf_lib.containers.dataframe.qf_dataframe.QFDataFrame(data=None, index: Axes | None = None, columns: Axes | None = None, dtype: Dtype | None = None, copy: bool | None = None)
Bases: DataFrame, TimeIndexedContainer
Base class for all data frames (2-D matrix-like objects) used in the project. All columns in the dataframe contain values for the same date range and have the same frequency. All columns hold the same type of values (e.g. log-returns or prices).
Methods:
exponential_average([lambda_coeff]) – Calculates the exponential average of a dataframe.
get_frequency() – Attempts to infer the frequency of each column in this dataframe.
min_max_normalized([original_min_values, ...]) – Normalizes the data using min-max scaling: it maps all the data to the [0, 1] range, so that 0 corresponds to the minimal value in the original series and 1 corresponds to the maximal value.
rolling_time_window(window_length, step, func) – Runs a given function on each rolling window in the dataframe.
rolling_window(window_size, func[, step, ...]) – Looks at a number of windows of size window_size and transforms the data in those windows based on the specified func.
to_log_returns() – Converts dataframe to the dataframe of logarithmic returns.
to_prices([initial_prices, ...]) – Converts a dataframe to the dataframe of prices.
to_simple_returns() – Converts dataframe to the dataframe of simple returns.
Calculates total cumulative return for each column.
- exponential_average(lambda_coeff: float = 0.94) → QFDataFrame
Calculates the exponential average of a dataframe.
- Parameters:
lambda_coeff – lambda coefficient
- Returns:
smoothed version (exponential average) of the data frame
- Return type:
QFDataFrame
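As a rough illustration of exponential smoothing with a decay factor, the plain-Python sketch below smooths a single column. The recursion it uses (each smoothed value is lambda times the previous smoothed value plus one minus lambda times the current observation, seeded with the first observation) is a common convention and is an assumption here; the documentation above does not state which formula qf_lib applies. The helper name `exponential_average_sketch` is hypothetical.

```python
from typing import List, Sequence


def exponential_average_sketch(values: Sequence[float], lambda_coeff: float = 0.94) -> List[float]:
    """Exponentially smooth one column (assumed recursion, not qf_lib's own code)."""
    if not values:
        return []
    # Seeding with the first observation is an assumption of this sketch.
    smoothed = [float(values[0])]
    for value in values[1:]:
        # A higher lambda_coeff keeps more of the previous smoothed value.
        smoothed.append(lambda_coeff * smoothed[-1] + (1.0 - lambda_coeff) * value)
    return smoothed


column = [1.0, 2.0, 3.0]
print(exponential_average_sketch(column, lambda_coeff=0.5))  # [1.0, 1.5, 2.25]
```

With the default of 0.94 the output reacts slowly to new observations; values closer to 0 track the raw data more tightly.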
- get_frequency() → Mapping[str, Frequency]
Attempts to infer the frequency of each column in this dataframe. The analysis uses pandas’ infer_freq, as well as a heuristic to reduce the number of Irregular results. See the implementation of the Frequency.infer_freq function for more information.
- min_max_normalized(original_min_values: Optional[Sequence[float]] = None, original_max_values: Optional[Sequence[float]] = None) → QFDataFrame
Normalizes the data using min-max scaling: it maps all the data to the [0, 1] range, so that 0 corresponds to the minimal value in the original series and 1 corresponds to the maximal value. It is also possible to specify the values which should correspond to 0 and 1 after applying the normalization. This is useful when the same normalization parameters are used to normalize different data.
- Parameters:
original_min_values – values which should correspond to 0 after applying the normalization (one value for each column)
original_max_values – values which should correspond to 1 after applying the normalization (one value for each column)
- Returns:
dataframe of normalized values
- Return type:
QFDataFrame
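The scaling described above amounts to (value - min) / (max - min) per column. The sketch below shows this for one column in plain Python, including the case where the minimum and maximum come from other data. The function name `min_max_normalize_column` and the choice to raise an error for a constant column are assumptions of this sketch, not documented qf_lib behaviour.

```python
from typing import List, Optional, Sequence


def min_max_normalize_column(
    values: Sequence[float],
    original_min: Optional[float] = None,
    original_max: Optional[float] = None,
) -> List[float]:
    """Min-max scale one column; optional bounds play the role of original_min/max_values."""
    lo = min(values) if original_min is None else original_min
    hi = max(values) if original_max is None else original_max
    span = hi - lo
    if span == 0:
        # How a constant column is handled is a choice made for this sketch only.
        raise ValueError("minimum and maximum are equal; cannot scale")
    return [(value - lo) / span for value in values]


reference = [10.0, 15.0, 20.0]
other = [12.0, 25.0]
print(min_max_normalize_column(reference))  # [0.0, 0.5, 1.0]
# Reusing the reference bounds: values outside them fall outside [0, 1].
print(min_max_normalize_column(other, original_min=10.0, original_max=20.0))  # [0.2, 1.5]
```

Passing explicit bounds keeps two datasets on the same scale, at the cost that the second dataset is no longer guaranteed to stay within [0, 1].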
- rolling_time_window(window_length: int, step: int, func: Callable[[Union[QFDataFrame, ndarray]], QFSeries]) → Union[None, QFSeries, QFDataFrame]
Runs a given function on each rolling window in the dataframe. The content of each rolling window is itself a QFDataFrame, so the function to be applied should accept a QFDataFrame as an argument.
The function may return either a QFSeries (then the output of rolling_time_window will be QFDataFrame) or a scalar value (then the output of rolling_time_window will be QFSeries).
The rolling window is moved along the time index (rows).
- Parameters:
window_length – number of rows which should be taken into rolling window
step – number of rows by which rolling window should be moved
func – function to apply on each rolling window. If it returns a QFSeries then the output of rolling_time_window() will be a QFDataFrame; if it returns a scalar value, the return value of rolling_time_window() will be a QFSeries
- Returns:
None (if the result of running the rolling window was empty) or QFSeries (if the function applied returned scalar value for each window) or QFDataFrame (if the function applied returned QFSeries for each window)
- Return type:
None, QFSeries, QFDataFrame
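To make the roles of window_length and step concrete, here is a plain-Python sketch that slides a window over the rows of a small table and applies a function returning a scalar. It models rows as lists rather than a QFDataFrame, starts the first window at the first row, and ignores which date would label each result; all of these are simplifying assumptions, and the names `rolling_time_window_sketch` and `mean_of_cells` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

Row = Sequence[float]


def rolling_time_window_sketch(
    rows: Sequence[Row],
    window_length: int,
    step: int,
    func: Callable[[Sequence[Row]], float],
) -> Optional[List[float]]:
    """Apply func to each block of window_length consecutive rows, moving by step rows."""
    results = []
    for start in range(0, len(rows) - window_length + 1, step):
        window = rows[start:start + window_length]  # all columns, a slice of rows
        results.append(func(window))
    # Mirrors the documented behaviour of returning None for an empty result.
    return results or None


def mean_of_cells(window: Sequence[Row]) -> float:
    """Scalar-valued function: the mean of every cell in the window."""
    cells = [cell for row in window for cell in row]
    return sum(cells) / len(cells)


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
print(rolling_time_window_sketch(rows, window_length=3, step=2, func=mean_of_cells))  # [11.0, 22.0]
print(rolling_time_window_sketch(rows[:2], window_length=3, step=2, func=mean_of_cells))  # None
```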
- rolling_window(window_size: int, func: Callable[[Union[QFSeries, ndarray]], float], step: int = 1, optimised: bool = False) → QFDataFrame
Looks at a number of windows of size window_size and transforms the data in those windows based on the specified func. This is performed for each column inside this data frame. The window indices are stepped at a rate specified by step.
Warning: The other parameter is only present to keep consistency with QFSeries’ rolling_window function; it should always be None.
- Parameters:
window_size – The size of the window to look at, specified as the number of data points.
func – The function to call during each iteration. When other is None, this function should take one QFSeries and return a value (usually a number such as a float). Otherwise, this function should take two QFSeries arguments and return a value.
step – The number of data points to step through after each iteration, i.e. how far the window moves in each iteration.
optimised – Whether the more efficient pandas algorithm should be used for the rolling window application. Note: this has some limitations: step must be 1, and func will receive an ndarray parameter which contains only values and no index.
- Returns:
data frame containing the transformed data
- Return type:
QFDataFrame
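The per-column behaviour can be pictured with the plain-Python sketch below, which treats a data frame as a dict of columns and applies func to each window of each column independently. Starting the first window at the first data point is an assumption, the sketch does not model the optimised path or the index of the result, and `rolling_window_sketch` is a hypothetical name.

```python
from typing import Callable, Dict, List, Sequence


def rolling_window_sketch(
    columns: Dict[str, Sequence[float]],
    window_size: int,
    func: Callable[[Sequence[float]], float],
    step: int = 1,
) -> Dict[str, List[float]]:
    """Apply func to every window of window_size points, separately for each column."""
    result = {}
    for name, values in columns.items():
        starts = range(0, len(values) - window_size + 1, step)
        result[name] = [func(values[start:start + window_size]) for start in starts]
    return result


frame = {"A": [1.0, 2.0, 3.0, 4.0], "B": [4.0, 3.0, 2.0, 1.0]}
print(rolling_window_sketch(frame, window_size=2, func=max))
# {'A': [2.0, 3.0, 4.0], 'B': [4.0, 3.0, 2.0]}
print(rolling_window_sketch(frame, window_size=2, func=max, step=2))
# {'A': [2.0, 4.0], 'B': [4.0, 2.0]}
```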
- to_log_returns() → LogReturnsDataFrame
Converts dataframe to the dataframe of logarithmic returns. The first date of the prices will not be present in the returns dataframe.
- Returns:
dataframe of log returns
- Return type:
LogReturnsDataFrame
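The reason the first date disappears is visible in the arithmetic: a log return is ln(price / previous price), and the first price has no predecessor. A minimal plain-Python sketch for one column of prices follows; `to_log_returns_sketch` is a hypothetical helper, not the library method.

```python
import math
from typing import List, Sequence


def to_log_returns_sketch(prices: Sequence[float]) -> List[float]:
    """Log return for every price except the first, which has no previous price."""
    return [math.log(current / previous) for previous, current in zip(prices, prices[1:])]


prices = [100.0, 110.0, 99.0]
log_returns = to_log_returns_sketch(prices)
print(len(prices), len(log_returns))  # 3 2
# Log returns are additive: their sum equals the log of the total price change.
print(math.isclose(sum(log_returns), math.log(99.0 / 100.0)))  # True
```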
- to_prices(initial_prices: Sequence[float] = None, suggested_initial_date: Union[datetime, int, float] = None, frequency: Frequency = None) → PricesDataFrame
Converts a dataframe to the dataframe of prices. The returned dataframe of prices will have an extra date at the beginning (compared with the returns’ dataframe). The interval between the extra date and the following dates can be inferred from the returns’ dataframe or calculated using the frequency passed as the optional argument. The additional date at the beginning (the so-called “initial date”) exists because the return for the first date of a prices timeseries cannot be calculated, so it is missing. Thus, during the opposite conversion, an extra date is added at the beginning.
- Parameters:
initial_prices – initial price for all timeseries. If no prices are specified, then they will be assumed to be 1. If only one value is passed (instead of a list with values for each column), then the initial price will be the same for each series contained within the dataframe.
suggested_initial_date – the first date or initial value for the prices series. It will not necessarily be the first date of the price series (e.g. if the method is run on a PricesDataFrame, it will not be used).
frequency – the frequency of the returns’ timeseries. It is used to infer the initial date for the prices series.
- Returns:
dataframe of prices
- Return type:
PricesDataFrame
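The extra initial date can be illustrated with a plain-Python sketch that rebuilds one price series from simple returns. It assumes simple (not logarithmic) returns, a default initial price of 1 as described above, and a spacing expressed as a `timedelta` that is inferred from the first two return dates when not supplied. The real method works with qf_lib's Frequency type, and `to_prices_sketch` is a hypothetical name.

```python
from datetime import date, timedelta
from typing import List, Optional, Sequence, Tuple


def to_prices_sketch(
    dates: Sequence[date],
    simple_returns: Sequence[float],
    initial_price: float = 1.0,
    frequency: Optional[timedelta] = None,
) -> Tuple[List[date], List[float]]:
    """Compound simple returns into prices, prepending one extra 'initial date'."""
    if frequency is None:
        # Infer the spacing from the returns' own dates (needs at least two dates).
        frequency = dates[1] - dates[0]
    price_dates = [dates[0] - frequency] + list(dates)
    prices = [initial_price]
    for simple_return in simple_returns:
        prices.append(prices[-1] * (1.0 + simple_return))
    return price_dates, prices


dates = [date(2024, 1, 2), date(2024, 1, 3), date(2024, 1, 4)]
returns = [0.5, -0.5, 1.0]
price_dates, prices = to_prices_sketch(dates, returns)
print(price_dates[0])  # 2024-01-01
print(prices)          # [1.0, 1.5, 0.75, 1.5]
```

The result has one more row than the input, which is the mirror image of the row lost when prices are converted to returns.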
- to_simple_returns() → SimpleReturnsDataFrame
Converts dataframe to the dataframe of simple returns. The first date of the prices will not be present in the returns dataframe.
- Returns:
dataframe of simple returns
- Return type:
SimpleReturnsDataFrame