CSVDataProvider

class qf_lib.data_providers.csv.csv_data_provider.CSVDataProvider(path: str, tickers: Union[qf_lib.common.tickers.tickers.Ticker, Sequence[qf_lib.common.tickers.tickers.Ticker]], index_col: str, field_to_price_field_dict: Optional[Dict[str, qf_lib.common.enums.price_field.PriceField]] = None, fields: Optional[Union[str, List[str]]] = None, start_date: Optional[datetime.datetime] = None, end_date: Optional[datetime.datetime] = None, frequency: Optional[qf_lib.common.enums.frequency.Frequency] = <Frequency.DAILY: 252>, dateformat: Optional[str] = None, ticker_col: Optional[str] = None)

Bases: qf_lib.data_providers.preset_data_provider.PresetDataProvider

Generic data provider that loads CSV files. All files should follow a certain naming convention (see Notes). The data provider also requires a mapping between the header names in the file and the corresponding price fields, given as a dictionary in which each key is a column name from the file and each value is the corresponding PriceField. This mapping is required in order to use the get_price method. For example, a file with the header:

Time,Open Price,Close Price, …

should be mapped as follows: {'Open Price': PriceField.Open, 'Close Price': PriceField.Close, …} so that the get_price method, which expects PriceFields as its fields, works correctly.
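To make the idea of the mapping concrete, here is a minimal standard-library sketch of what such a dictionary expresses. It does not use qf_lib: the PriceField enum below is a local stand-in defined only for this illustration, and the sample rows are invented.

```python
import csv
import io
from enum import Enum


class PriceField(Enum):
    # Local stand-in for illustration only; this is NOT qf_lib's PriceField.
    Open = "Open"
    Close = "Close"


# Invented sample data with a header similar to the example.
raw = io.StringIO(
    "Time,Open Price,Close Price\n"
    "2021-01-04,100.0,101.5\n"
    "2021-01-05,101.5,99.0\n"
)

# Key: column name in the file. Value: the price field it represents.
field_to_price_field = {
    "Open Price": PriceField.Open,
    "Close Price": PriceField.Close,
}

rows = []
for record in csv.DictReader(raw):
    # Keep the index column and re-key the mapped columns by price field.
    row = {"Time": record["Time"]}
    for column, price_field in field_to_price_field.items():
        row[price_field] = float(record[column])
    rows.append(row)

print(rows[0][PriceField.Close])
```

The point of the sketch is only that string column names and price fields are two ways of addressing the same columns; how the data provider stores the mapped columns internally is not covered here.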

Parameters
  • path (str) – either the path to the directory containing the CSV files or, when ticker_col is used and only one file should be loaded, the path to that specific file

  • tickers (Ticker, Sequence[Ticker]) – one ticker or a list of tickers for which the price data is loaded

  • index_col (str) – label of the dates / timestamps column, which is later used to index the data

  • field_to_price_field_dict (Optional[Dict[str, PriceField]]) – mapping of header names to fields. The key is a column name, and the value is the corresponding field. It is required if we want to map str fields to PriceFields and use the get_price method. Please note that mapped fields are still available in the get_history method under their initial str names. All str fields specified as keys should also be specified in fields

  • fields (Optional[Union[str, List[str]]]) – fields that should be loaded. By default all fields (columns) are loaded. Based on field_to_price_field_dict, additional columns are created and made available in the get_price method through the PriceFields mapping.

  • start_date (Optional[datetime]) – first date to be loaded

  • end_date (Optional[datetime]) – last date to be loaded

  • frequency (Optional[Frequency]) – frequency of the data. The parameter is optional and defaults to Frequency.DAILY.

  • dateformat (Optional[str]) – the strftime format string used to parse dates, e.g. "%d/%m/%Y". The parameter is optional; if it is not provided, the data provider tries to infer the date format from the data. Defaults to None.

  • ticker_col (Optional[str]) – column name with the tickers
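
As a small illustration of why an explicit dateformat can matter, the sketch below parses the same string with two different format strings using the standard library's datetime.strptime. It only demonstrates the format codes; how the data provider applies the string internally is not shown here, and the sample date is invented.

```python
from datetime import datetime

# The example format from the parameter description: day first.
dateformat = "%d/%m/%Y"

# With a day-first format the string is read as 4 January 2021.
parsed = datetime.strptime("04/01/2021", dateformat)

# The same string read with a month-first format becomes 1 April 2021,
# which is the kind of ambiguity an explicit format removes.
ambiguous = datetime.strptime("04/01/2021", "%m/%d/%Y")

print(parsed.date(), ambiguous.date())
```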

Notes

  • FutureTickers are not supported by this data provider.

  • By default, the data for each ticker should be in a separate file named after the ticker's string representation (in most cases this is simply its name; use the Ticker.as_string() function to check the string representation of a given ticker). However, you can also load a single file that contains the data for all specified tickers, with the ticker given in one column, row by row, as in the demo example files daily_data.csv and intraday_data.csv. To do so, specify the name of the ticker column in ticker_col and set path to the file.

  • When ticker_col is used, path must point to the specific file (loading is not based on ticker names, as it is in the default approach).

  • Providing the field_to_price_field_dict mapping lets you use the get_price method, which can aggregate intraday data (get_history currently does not support intraday data aggregation).
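
The two file layouts described in these notes can be contrasted with a short standard-library sketch. Everything in it is an assumption made for illustration: the column names, the tickers AAA and BBB, the values, and the ".csv" file extension are invented, and the code does not call qf_lib.

```python
import csv
import io
from collections import defaultdict

# Invented sample in the single-file layout: one row per ticker and date.
single_file = io.StringIO(
    "Dates,Ticker,Close Price\n"
    "2021-01-04,AAA,10.0\n"
    "2021-01-04,BBB,20.0\n"
    "2021-01-05,AAA,11.0\n"
    "2021-01-05,BBB,19.5\n"
)

ticker_col = "Ticker"  # hypothetical name of the column holding the tickers

# Group the rows by ticker, dropping the ticker column from each row.
per_ticker = defaultdict(list)
for record in csv.DictReader(single_file):
    ticker = record.pop(ticker_col)
    per_ticker[ticker].append(record)

# In the default layout, each group would instead live in its own file
# named after the ticker (the ".csv" extension here is an assumption).
file_names = sorted(f"{ticker}.csv" for ticker in per_ticker)

print(file_names)
```

Both layouts carry the same information; the single-file layout simply needs the extra column so that each row can be attributed to a ticker.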