06 ‐ Infrastructure & Deployment - IMGitH/dvmax GitHub Wiki

Development Options

  • Google Colab (12 hr, T4 GPU optional, 16GB RAM)
  • Kaggle Kernels (9 hr, CPU only, no card needed)
  • Local dev: Polars + Parquet

Deployment Options

  • Periodic local script or cron job (e.g. quarterly)
  • Lightweight API via Flask or FastAPI
  • Use Docker for consistent environments

Data Format & Storage

  • Store all structured datasets in Parquet:
/data
  /prices/
    prices_raw.parquet
    prices_weekly_agg.parquet
  /fundamentals/
    fundamentals_q.parquet
  /dividends/
    dividend_history_q.parquet
  /macro/
    interest_rates.parquet
  /features/
    stock_features_YYYYQX.parquet

Tools

  • Polars + PyArrow for pipelines
  • DuckDB (optional) for SQL queries on Parquet
  • GitHub Actions + Docker for CI

Refresh Schedule

  • Weekly: prices (for returns, vol, drawdowns)
  • Quarterly: fundamentals, dividends, macro