Data Management - supertypeai/sectors-kb GitHub Wiki

Data Retention for each symbol

  • idx_financials_annual: last 6 years, counted back from the latest annual date
  • idx_financials_quarterly: last 7 quarters, counted back from the latest quarter date
  • idx_daily_data: last 5 years, counted back from today's date
  • idx_historical_mcap: last 5 years, counted back from today's date
  • idx_dividend: last 5 years, counted back from the latest dividend date
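
The retention windows above can be turned into cutoff dates. The following is a minimal sketch, not the project's actual implementation: the helper names (`shift_months_back`, `retention_cutoff`, `is_retained`) are hypothetical, and it assumes that a row is kept only when its date is strictly after the cutoff, so that "last 7 quarters" from 2023-09-30 keeps exactly seven quarter-end dates.

```python
import calendar
from datetime import date

# Retention windows expressed in months, taken from the list above.
RETENTION_MONTHS = {
    "idx_financials_annual": 6 * 12,    # 6 years from the latest annual date
    "idx_financials_quarterly": 7 * 3,  # 7 quarters from the latest quarter date
    "idx_daily_data": 5 * 12,           # 5 years from today's date
    "idx_historical_mcap": 5 * 12,      # 5 years from today's date
    "idx_dividend": 5 * 12,             # 5 years from the latest dividend date
}


def shift_months_back(anchor: date, months: int) -> date:
    """Return the date `months` months before `anchor`.

    A month-end anchor stays on a month-end; any other day is clamped
    to the length of the target month.
    """
    index = anchor.year * 12 + (anchor.month - 1) - months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    anchor_last_day = calendar.monthrange(anchor.year, anchor.month)[1]
    day = last_day if anchor.day == anchor_last_day else min(anchor.day, last_day)
    return date(year, month, day)


def retention_cutoff(table: str, anchor: date) -> date:
    """Cutoff date for `table`, given its anchor date (latest date or today)."""
    return shift_months_back(anchor, RETENTION_MONTHS[table])


def is_retained(row_date: date, cutoff: date) -> bool:
    """Assumption: rows dated strictly after the cutoff are kept."""
    return row_date > cutoff
```

For example, `retention_cutoff("idx_financials_quarterly", date(2023, 9, 30))` gives `date(2021, 12, 31)`, so the oldest quarter kept under this assumption is 2022-03-31.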

Data Rules

  • idx_calc_metrics_quarter: For TTM, a symbol's latest quarterly date must be no more than 2 quarters behind the latest quarter date. Example: if the latest quarter is 2023-09-30 (Q3 2023), a symbol must have data up to at least 2023-03-31 (Q1 2023) to be relevant.
  • idx_calc_metrics_quarter: The calculation is derived from the annual financial report when the latest annual report date is greater than or equal to the latest quarterly report date.
  • idx_financials_annual and idx_financials_quarterly: Each symbol can have only one data source (either YF or WSJ). For a wsj_format of 3 (insurance) or 4 (banking), WSJ is the preferred data source because some important metrics are not available in the YF API. For a wsj_format of 1 (general) or -1 (symbols not available in WSJ), YF is the preferred data source because retrieving data through its API is more convenient than web scraping. Refer to this section for more details on the format explanation.
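
The three rules above can be sketched as small predicates. This is an illustrative sketch only: the function names are hypothetical, quarters are compared by calendar quarter, and any wsj_format value not listed above raises an error because this page does not say which source applies to it.

```python
from datetime import date


def quarter_index(d: date) -> int:
    """Map a date to a running calendar-quarter number (Q1 = months 1-3)."""
    return d.year * 4 + (d.month - 1) // 3


def is_ttm_relevant(symbol_latest_quarter: date, latest_quarter: date) -> bool:
    """True when the symbol is at most 2 quarters behind the latest quarter."""
    return quarter_index(latest_quarter) - quarter_index(symbol_latest_quarter) <= 2


def use_annual_report(latest_annual: date, latest_quarterly: date) -> bool:
    """True when metrics should be derived from the annual report."""
    return latest_annual >= latest_quarterly


def preferred_source(wsj_format: int) -> str:
    """Pick the single data source for a symbol from its wsj_format."""
    if wsj_format in (3, 4):   # insurance, banking
        return "WSJ"
    if wsj_format in (1, -1):  # general, or not available in WSJ
        return "YF"
    # Assumption: other formats are not covered by the rule, so fail loudly.
    raise ValueError(f"no preferred source documented for wsj_format={wsj_format}")
```

With the example from the first rule, `is_ttm_relevant(date(2023, 3, 31), date(2023, 9, 30))` is true, while a symbol whose latest quarter is 2022-12-31 is not relevant.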

Data Validations

  • On inserting data into idx_financials_annual and idx_financials_quarterly, delete rows that fall outside the retention windows above.
  • On inserting data into idx_daily_data, delete rows that fall outside the retention window above.
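
The insert-then-delete validation for idx_daily_data could look roughly like the sketch below. It uses an in-memory SQLite database purely for illustration; the actual database, the column names (`symbol`, `date`, `close`), the schema, and the function names are all assumptions, as is the choice to delete rows dated on or before the cutoff.

```python
import sqlite3
from datetime import date

# Hypothetical schema; dates are stored as ISO 8601 text so they sort correctly.
DEMO_SCHEMA = "CREATE TABLE idx_daily_data (symbol TEXT, date TEXT, close REAL)"


def five_years_before(today: date) -> date:
    """Return the same calendar day five years earlier."""
    try:
        return today.replace(year=today.year - 5)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year
        return today.replace(year=today.year - 5, day=28)


def insert_daily_and_prune(conn: sqlite3.Connection, rows, today: date) -> None:
    """Insert (symbol, date, close) rows, then drop rows outside the window."""
    cutoff = five_years_before(today).isoformat()
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany(
            "INSERT INTO idx_daily_data (symbol, date, close) VALUES (?, ?, ?)",
            rows,
        )
        # Assumption: rows dated on or before the cutoff are outside the window.
        conn.execute("DELETE FROM idx_daily_data WHERE date <= ?", (cutoff,))
```

Running the insert and the delete in a single transaction means a failed insert never leaves the table pruned but not updated. The same pattern would apply to idx_financials_annual and idx_financials_quarterly, with the cutoff computed per symbol from its latest report date instead of from today's date.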