CSVDataProvider

class qf_lib.data_providers.csv.csv_data_provider.CSVDataProvider(path: str, tickers: Union[Ticker, Sequence[Ticker]], index_col: str, field_to_price_field_dict: Optional[Dict[str, PriceField]] = None, fields: Optional[Union[str, List[str]]] = None, start_date: Optional[datetime] = None, end_date: Optional[datetime] = None, frequency: Optional[Frequency] = Frequency.DAILY, dateformat: Optional[str] = None, ticker_col: Optional[str] = None)

Bases: PresetDataProvider

Generic data provider that loads CSV files. The files must follow a naming convention (see Notes). The provider also needs a mapping between the header names in the file and the corresponding price fields, given as a dictionary whose keys are column names in the CSV file and whose values are the corresponding PriceField members. This mapping is required in order to use the get_price method.

Parameters:

path (str) – either the path to the directory containing the CSV files or, when ticker_col is used and only one file should be loaded, the path to that specific file
tickers (Union[Ticker, Sequence[Ticker]]) – one ticker or a list of tickers for which the price data will be loaded
index_col (str) – label of the dates / timestamps column, which is used to index the data. It does not need to be repeated in fields.
field_to_price_field_dict (Optional[Dict[str, PriceField]]) – mapping of header names to PriceField. It is required in order to call the get_price method, which uses the PriceField enum. In the mapping, the key is a column name and the value is the corresponding PriceField. For example, if the header for the open price is called 'Open price', use the mapping {'Open price': PriceField.Open}. Preferably map all of open, high, low and close to the corresponding price fields.
fields (Optional[Union[str, List[str]]]) – the columns that will be loaded into the CSVDataProvider from the given file. These fields will be available in the get_history method. By default all fields (columns) are loaded.
start_date (Optional[datetime]) – first date to be loaded
end_date (Optional[datetime]) – last date to be loaded
frequency (Optional[Frequency]) – frequency of the data. The parameter is optional and defaults to Frequency.DAILY.
dateformat (Optional[str]) – the strftime format string used to parse the dates, e.g. "%d/%m/%Y". The parameter is optional; if it is not provided, the data provider will try to infer the date format from the data. By default None.
ticker_col (Optional[str]) – name of the column containing the tickers
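To illustrate how index_col, fields, dateformat, start_date and end_date interact, here is a minimal standard-library sketch. It is not the qf-lib implementation: the load_rows helper, the sample column names and the inclusive treatment of both date bounds are assumptions made for illustration only.

```python
import csv
import io
from datetime import datetime

# Hypothetical file content; the column names are illustrative only.
SAMPLE = """Open time,Open price,Close price,Comment
31/12/2017,10.0,10.5,before range
02/01/2018,10.5,11.0,in range
03/01/2018,11.0,10.8,in range
"""


def load_rows(text, index_col, fields, dateformat, start_date, end_date):
    """Return {timestamp: {field: value}} for rows within [start_date, end_date].

    Assumption: both bounds are treated as inclusive.
    """
    rows = {}
    for record in csv.DictReader(io.StringIO(text)):
        # dateformat is an strftime/strptime pattern such as "%d/%m/%Y".
        timestamp = datetime.strptime(record[index_col], dateformat)
        if start_date <= timestamp <= end_date:
            # Only the requested columns are kept; index_col is not repeated.
            rows[timestamp] = {name: float(record[name]) for name in fields}
    return rows


rows = load_rows(
    SAMPLE,
    index_col="Open time",
    fields=["Open price", "Close price"],
    dateformat="%d/%m/%Y",
    start_date=datetime(2018, 1, 1),
    end_date=datetime(2018, 12, 31),
)
```

Here rows is keyed by the parsed timestamps, the 'Comment' column is dropped because it is not listed in fields, and the 2017 row is filtered out by start_date.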

Notes

FutureTickers are not supported by this data provider.
By default, the data for each ticker should be in a separate file named after the ticker's string representation (in most cases simply its name; use Ticker.as_string() to check the string representation of a given ticker). Alternatively, you can load a single file containing the data for all tickers, with the ticker given in one column, row by row, as in the demo example files daily_data.csv and intraday_data.csv. To do so, set ticker_col to the name of the ticker column and set path to that file.
When using ticker_col, path must point to a specific file (loading is not based on ticker names, as it is in the default approach).
Providing the field_to_price_field_dict mapping lets you use the get_price method, which can aggregate intraday data (currently, get_history does not support intraday data aggregation).
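The single-file layout described in the Notes can be pictured with a small standard-library sketch. The sample data, the column names and the split_by_ticker helper are hypothetical; the real demo files and the way qf-lib groups rows internally may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical combined file: one row per (date, ticker) pair.
COMBINED = """Dates,Ticker,Close
2021-01-04,AAA,10.0
2021-01-04,BBB,20.0
2021-01-05,AAA,10.5
2021-01-05,BBB,19.5
"""


def split_by_ticker(text, ticker_col):
    """Group the rows of a combined CSV by the value in ticker_col."""
    groups = defaultdict(list)
    for record in csv.DictReader(io.StringIO(text)):
        # Remove the ticker column from the row and use it as the group key.
        ticker = record.pop(ticker_col)
        groups[ticker].append(record)
    return dict(groups)


per_ticker = split_by_ticker(COMBINED, ticker_col="Ticker")
```

Each value of per_ticker then has the same shape a per-ticker file would have in the default layout, which is why the ticker name no longer has to appear in the file name.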

Example

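from qf_lib.common.enums.frequency import Frequency
from qf_lib.common.enums.price_field import PriceField
from qf_lib.common.utils.dateutils.string_to_date import str_to_date
from qf_lib.data_providers.csv.csv_data_provider import CSVDataProvider

start_date = str_to_date("2018-01-01")
end_date = str_to_date("2022-01-01")

index_column = 'Open time'
field_to_price_field_dict = {
    'Open': PriceField.Open,
    'High': PriceField.High,
    'Low': PriceField.Low,
    'Close': PriceField.Close,
    'Volume': PriceField.Volume,
}

# Create your ticker here. ticker.as_string() should match the file name, unless you specify ticker_col.
tickers = ...
path = "C:data_dir"
# start_date, end_date and frequency are passed by keyword because fields precedes them in the signature.
data_provider = CSVDataProvider(path, tickers, index_column, field_to_price_field_dict,
                                start_date=start_date, end_date=end_date, frequency=Frequency.MIN_1)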
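The example above loads one-minute bars, and the Notes mention that get_price can aggregate intraday data. As a conceptual sketch of what aggregating OHLCV bars to a lower frequency usually means (first open, highest high, lowest low, last close, summed volume), the plain-Python snippet below may help. The aggregate_daily helper and the sample bars are hypothetical, and qf-lib's actual aggregation rules (for example session boundaries or bar labelling) may differ.

```python
from datetime import datetime

# Hypothetical one-minute bars: (timestamp, open, high, low, close, volume).
minute_bars = [
    (datetime(2021, 1, 4, 9, 30), 100.0, 101.0, 99.5, 100.5, 1000),
    (datetime(2021, 1, 4, 9, 31), 100.5, 102.0, 100.0, 101.5, 1500),
    (datetime(2021, 1, 5, 9, 30), 101.0, 101.2, 100.1, 100.4, 800),
]


def aggregate_daily(bars):
    """Collapse intraday bars into one OHLCV bar per calendar day."""
    daily = {}
    # Sorting the tuples orders them by timestamp, so the first bar seen
    # for a day supplies the open and the last one supplies the close.
    for timestamp, open_, high, low, close, volume in sorted(bars):
        day = timestamp.date()
        if day not in daily:
            daily[day] = {"Open": open_, "High": high, "Low": low,
                          "Close": close, "Volume": volume}
        else:
            bar = daily[day]
            bar["High"] = max(bar["High"], high)
            bar["Low"] = min(bar["Low"], low)
            bar["Close"] = close
            bar["Volume"] += volume
    return daily


daily_bars = aggregate_daily(minute_bars)
```

The keys of each aggregated bar mirror the column names mapped in field_to_price_field_dict, which is the information an aggregation step needs in order to know which column plays which role.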