DataModel

class qf_lib.common.utils.factorization.data_models.data_model.DataModel(data_model_input: DataModelInput)[source]

Bases: object

Class grouping the results of factorization.

Parameters: data_model_input – data from which the model is built

Attributes:

`AUTOCORR_MAX_LAG`	(int) Maximal lag used when testing the fit for autocorrelation; the tested lags are 1, ..., AUTOCORR_MAX_LAG.
`AUTOCORR_SIGNIFICANCE_LEVEL`	(float) Significance level of the test for autocorrelation of the fit.
`autocorrelation`	Extension of Durbin-Watson test to add many lags (1-5).
`coefficients`	Vector of coefficients [beta1, beta2, ...].
`condition_number`	Condition number of a matrix measures the sensitivity of the solution of a system of linear equations to errors in the data.
`cooks_distance_tms`	Cook's distance.
`durbin_watson_test`	Used to test if linear regression residuals are uncorrelated.
`factors_performance_attribution_ret`	Vector containing annualised performance attribution of each factor.
`fit_model`	Structure with a result of multilinear regression (based on all data points and using OLS to calculate coefficients).
`fit_tms_analysis`	TimeseriesAnalysis class based on returns of the fit.
`fitted_tms`	Fitted (predicted) response values based on input data.
`fund_tms_analysis`	TimeseriesAnalysis class based on returns of the analysed fund.
`heteroskedasticity`	p-value of the test of the hypothesis that the error variance doesn't depend on the input data (regressors).
`in_sample_and_out_sample_returns`	Returns of a fit based on in-sample coefficients.
`intercept`	Constant alpha (y = beta * x + constant).
`ols_influence`	Class for calculating outliers and influence measures for OLS result.
`oos_start_date`	Date on which the Out-Of-Sample period started (In-Sample vs Out-Of-Sample test).
`r_squared_of_each_predictor`	Concerns about collinearity can be ignored if the R-squared of the model is higher than the R-squared of each predictor.
`risk_contribution`	Vector containing normalised risk contribution of each factor.
`unexplained_performance_attribution_ret`	Scalar with annualised return unexplained by factors.

AUTOCORR_MAX_LAG = 3: (int) Maximal lag used when testing the fit for autocorrelation; the tested lags are 1, …, AUTOCORR_MAX_LAG.

AUTOCORR_SIGNIFICANCE_LEVEL = 0.05: (float) Significance level of the test for autocorrelation of the fit.

autocorrelation: Extension of Durbin-Watson test to add many lags (1-5). 0 - not autocorrelated, 1 - autocorrelated.

coefficients: Vector of coefficients [beta1, beta2, …].

condition_number: Condition number of a matrix measures the sensitivity of the solution of a system of linear equations to errors in the data.

cooks_distance_tms: Cooks distance. Used for checking the influence of outliers for the model.

durbin_watson_test: Used to test if linear regression residuals are uncorrelated. Small p-values indicate correlation among residuals.

factors_performance_attribution_ret: Vector containing annualised performance attribution of each factor.

fit_model: Structure with a result of multilinear regression (based on all data points and using OLS to calculate coefficients).

fit_tms_analysis: TimeseriesAnalysis class based on returns of the fit.

fitted_tms: Fitted (predicted) response values based on input data.

fund_tms_analysis: TimeseriesAnalysis class based on returns of the analysed fund.

heteroskedasticity: Probability of a hypothesis that the error variance doesn’t depend on input data (regressors).

in_sample_and_out_sample_returns: Returns of a fit based on in-sample coefficients. Vector with in-sample and out-of-sample simple returns. Its length is equal to length of fitted returns.

intercept: Constant alpha (y = beta * x + constant).

ols_influence: Class for calculating outliers and influence measures for OLS result.

oos_start_date: Date on which the Out-Of-Sample period started (In-Sample vs Out-Of-Sample test).

r_squared_of_each_predictor: Concerns about collinearity can be ignored if rSquare is higher than rSquare of each predictor.

risk_contribution: Vector containing normalised risk contribution of each factor.

unexplained_performance_attribution_ret: Scalar with annualised return unexplained by factors.