Fast and Scalable Dynamic Quantile Models - rstats-gsoc/gsoc2025 GitHub Wiki

This page seeks experienced mentors to collaborate on improving and extending the exDQLM package, a Bayesian framework for dynamic quantile regression with an emphasis on scalability, Variational Bayes inference, and computational efficiency via C++ acceleration.


Project Background

Time series modeling is essential for accurate forecasting in areas such as climate science, economics, hydrology, and finance. Traditional forecasting methods, primarily focused on mean-based inference, often fail to capture the critical dynamics at the tails of distributions. Quantile regression addresses this by modeling conditional quantiles directly, so that the tails as well as the center of the response distribution can be estimated.
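To make the idea concrete, the sketch below shows the check (pinball) loss that underlies quantile regression, together with a brute-force search showing that the sample value minimizing the total check loss is a sample quantile. The function names check_loss, total_check_loss, and sample_quantile are illustrative only and are not part of exDQLM.

```cpp
#include <cassert>
#include <vector>

// Check (pinball) loss rho_p(u) = u * (p - 1{u < 0}) for a quantile level p in (0, 1).
double check_loss(double u, double p) {
    assert(p > 0.0 && p < 1.0);
    return u * (p - (u < 0.0 ? 1.0 : 0.0));
}

// Total check loss of a candidate quantile q over the sample y.
double total_check_loss(const std::vector<double>& y, double q, double p) {
    double total = 0.0;
    for (double yi : y) total += check_loss(yi - q, p);
    return total;
}

// Brute-force search over the sample points for the smallest total check loss.
// A minimizer of the total check loss always exists among the sample values.
double sample_quantile(const std::vector<double>& y, double p) {
    assert(!y.empty());
    double best = y[0];
    double best_loss = total_check_loss(y, best, p);
    for (double candidate : y) {
        const double loss = total_check_loss(y, candidate, p);
        if (loss < best_loss) {
            best_loss = loss;
            best = candidate;
        }
    }
    return best;
}
```

With the sample {1, 2, 3, 4, 100}, the search returns 3 for p = 0.5 and 100 for p = 0.9, whereas the sample mean (22) describes neither the center nor the upper tail well.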

Bayesian dynamic quantile regression extends these benefits by integrating Bayesian inference within a state-space modeling context, offering probabilistic forecasts that adapt dynamically over time. The current exDQLM package employs the Extended Asymmetric Laplace (exAL) distribution combined with Dynamic Linear Models (DLMs) to deliver scalable Bayesian quantile inference, but significant improvements and optimizations are needed.
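As a rough sketch of the model class described above, a dynamic quantile linear model with exAL errors can be written in state-space form as follows. The notation (F_t, G_t, W_t, sigma, gamma) follows common DLM conventions and is an assumption here; Barata et al. (2022) give the authoritative specification.

```latex
\begin{aligned}
y_t      &= F_t^\top \theta_t + \epsilon_t, & \epsilon_t &\sim \mathrm{exAL}_{p_0}(0, \sigma, \gamma), \\
\theta_t &= G_t \theta_{t-1} + w_t,         & w_t        &\sim \mathrm{N}(0, W_t).
\end{aligned}
```

Under this formulation the p_0 quantile of y_t given the state is F_t' theta_t, so the state vector theta_t tracks the chosen quantile over time rather than the mean.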


Comparison with Existing Packages

The table below summarizes current R packages and demonstrates how the proposed improvements to exDQLM address existing gaps:

| Feature | quantreg | dynquant | SPQR | qrjoint | bayesQR | lqr | exDQLM |
|---|---|---|---|---|---|---|---|
| Frequentist Approach | ✔️ | ✔️ | ✖️ | ✔️ | ✖️ | ✔️ | ✔️ |
| Bayesian Inference | ✖️ | ✖️ | ✔️ | ✔️ | ✔️ | ✖️ | ✔️ |
| Time-Dependent Data | ✖️ | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ | ✔️ |
| Covariates | ✔️ | ✖️ | ✔️ | ✖️ | ✔️ | ✔️ | ✔️ |
| Scalability (Large Data) | ✖️ | ✔️ | ✔️ | ✖️ | ✔️ | ✔️ | ✔️ |
| Non-Linear Regression | ✔️ | ✖️ | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ |
| Multivariate Response | ✔️ | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ | ✔️ |
| Missing Data Handling | ✖️ | ✖️ | ✖️ | ✖️ | ✖️ | ✔️ | ✔️ |
| Non-Crossing Quantiles | ✖️ | ✖️ | ✔️ | ✔️ | ✖️ | ✖️ | ✔️ |

Proposed Enhancements to exDQLM

This project aims to substantially enhance exDQLM by addressing the following key areas:

| Area | Description |
|---|---|
| Variational Bayes (VB) Inference | Implement VB inference using Laplace/Delta approximations for non-conjugate priors (Wang & Blei, 2012), ensuring fast and scalable posterior updates. |
| C++ Acceleration | Refactor core routines (e.g., Kalman filtering and smoothing) in C++ using Rcpp and RcppParallel for computational efficiency. |
| Modular Model Structure | Decompose the model into customizable components (trend, seasonal, and regression), each with its own evolution and observation matrices. |
| Posterior Predictive Quantile Synthesis (PPQS) | Ensure coherent non-crossing quantiles by synthesizing multiple quantile forecasts into a unified posterior predictive distribution. |
| Multivariate Extensions | Extend the model to handle multivariate time series, enabling joint quantile estimation across several response variables. |
| Hyperparameter Specification | Clearly specify informative priors using inverse gamma and truncated location–scale Student’s t-distributions (Yan & Kottas, 2017). |
| Discount Factor Matrix | Implement adaptive evolution covariance updates using discount factors, enhancing the model’s flexibility in dynamic settings. |
| Comprehensive Testing & Documentation | Provide thorough unit tests, vignettes, and documentation using tools such as testthat and roxygen2. |
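As an illustration of the discount-factor idea listed above, the following minimal sketch treats the scalar case, assuming the standard DLM discounting rule in which the prior variance is inflated as R_t = G C_{t-1} G / delta, equivalently W_t = (1 - delta) / delta * G C_{t-1} G. The names DiscountStep and discount_evolution are hypothetical; a package implementation would work with matrices and possibly a separate discount factor per model component.

```cpp
#include <cassert>

// Result of one discounted evolution step (scalar state).
struct DiscountStep {
    double W;  // implied evolution variance W_t
    double R;  // prior state variance R_t = G * C_prev * G + W_t
};

// Discount-factor evolution for a scalar state:
//   W_t = (1 - delta) / delta * G * C_{t-1} * G,  so that  R_t = G * C_{t-1} * G / delta.
// delta = 1 gives a static state (W_t = 0); smaller delta lets the state adapt faster.
DiscountStep discount_evolution(double C_prev, double G, double delta) {
    assert(delta > 0.0 && delta <= 1.0);
    assert(C_prev >= 0.0);
    const double P = G * C_prev * G;             // variance carried over without discounting
    const double W = (1.0 - delta) / delta * P;  // evolution variance implied by delta
    return {W, P + W};
}
```

For example, with C_prev = 2, G = 1, and delta = 0.8 the step yields W = 0.5 and R = 2.5, i.e. 20% of the prior variance comes from the evolution noise.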

Expected Impact of the Project

The improved exDQLM package will significantly benefit various research areas and applications:

  • Environmental Modeling: Enhanced predictions of extreme climate and hydrological events.
  • Financial Forecasting: More robust estimation of tail risks and Value-at-Risk (VaR).
  • Health Economics: Improved modeling of uncertain outcomes and costs in healthcare.

Call for Mentors

We are actively seeking experienced mentors who can support this ambitious project. If you are interested in mentoring, please add your details below or contact the contributor directly.

| Mentor Name | Role (Evaluating/Co-Mentor) | Email | Institution |
|---|---|---|---|
| Raquel Barata | Evaluating Mentor | [email protected] | Industry |
| Rebecca Killick | Co-Mentor | [email protected] | Lancaster University, UK |

Qualification Tests for Contributors

Contributors interested in working on this project should complete at least one of these tests:

Test Tasks

Easy: Kalman Filtering and Smoothing in C++

  • Implement univariate Kalman Filtering (KF) and Kalman Smoothing (KS) algorithms in C++.
  • Develop an R interface via Rcpp and RcppArmadillo.
  • Use the implementation to fit a Normal Dynamic Linear Model (DLM).
  • Compare results with the R package dlm to validate correctness.
  • Optimize computation using robust matrix factorization techniques (Cholesky, QR, SVD).
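For orientation, here is a minimal sketch of the kind of routine the Easy test asks for, restricted to a scalar local-level model (y_t = theta_t + v_t, theta_t = theta_{t-1} + w_t) with known variances V and W. It is plain standard C++ with no Rcpp binding, and the names KalmanResult and local_level_filter_smooth are hypothetical. A full solution would handle general F and G matrices and use the matrix factorizations listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct KalmanResult {
    std::vector<double> a, R;  // one-step-ahead prior mean and variance of the state
    std::vector<double> m, C;  // filtered mean and variance
    std::vector<double> s, S;  // smoothed mean and variance
};

// Kalman filter and Rauch-Tung-Striebel smoother for the local-level model
//   y_t = theta_t + v_t, v_t ~ N(0, V);  theta_t = theta_{t-1} + w_t, w_t ~ N(0, W),
// with prior theta_0 ~ N(m0, C0).
KalmanResult local_level_filter_smooth(const std::vector<double>& y,
                                       double V, double W,
                                       double m0, double C0) {
    assert(V > 0.0 && W >= 0.0 && C0 >= 0.0 && C0 + W > 0.0);
    const std::size_t n = y.size();
    KalmanResult out;
    out.a.resize(n); out.R.resize(n);
    out.m.resize(n); out.C.resize(n);
    out.s.resize(n); out.S.resize(n);

    // Forward pass: predict, then update with the observation.
    double m_prev = m0, C_prev = C0;
    for (std::size_t t = 0; t < n; ++t) {
        out.a[t] = m_prev;               // prior mean
        out.R[t] = C_prev + W;           // prior variance
        const double Q = out.R[t] + V;   // one-step forecast variance
        const double K = out.R[t] / Q;   // Kalman gain
        out.m[t] = out.a[t] + K * (y[t] - out.a[t]);
        out.C[t] = out.R[t] * (1.0 - K);
        m_prev = out.m[t];
        C_prev = out.C[t];
    }
    if (n == 0) return out;

    // Backward pass: smoothed moments, starting from the last filtered values.
    out.s[n - 1] = out.m[n - 1];
    out.S[n - 1] = out.C[n - 1];
    for (std::size_t t = n - 1; t-- > 0;) {
        const double B = out.C[t] / out.R[t + 1];
        out.s[t] = out.m[t] + B * (out.s[t + 1] - out.a[t + 1]);
        out.S[t] = out.C[t] + B * B * (out.S[t + 1] - out.R[t + 1]);
    }
    return out;
}
```

With y = {1, 2}, V = W = 1, m0 = 0, and C0 = 1, the final filtered mean and variance are 1.5 and 0.625, and the smoothed values at the first time point are 1.0 and 0.5. The same model can be set up in the dlm package (dlmModPoly with order 1, then dlmFilter and dlmSmooth) to cross-check such numbers.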

Medium: Implementation of the exAL Distribution in C++

  • Implement the Extended Asymmetric Laplace (exAL) distribution in C++ following Yan & Kottas (2017).
  • Implement and expose the following functions to R using Rcpp:
    • dexal(x, p0, mu, sigma, gamma, log = FALSE): Density function.
    • pexal(q, p0, mu, sigma, gamma, lower.tail = TRUE, log.p = FALSE): CDF function.
    • qexal(p, p0, mu, sigma, gamma, lower.tail = TRUE, log.p = FALSE): Quantile function.
    • rexal(n, p0, mu, sigma, gamma): Random sampling function.
  • Ensure numerical stability and efficiency of the C++ implementation.
  • Validate the implementation against analytical properties of the distribution, for example that mu is the p0 quantile.
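A possible starting point is the special case gamma = 0, in which (following Yan & Kottas, 2017) the exAL reduces to the ordinary asymmetric Laplace (AL) distribution and every function has a closed form. The sketch below covers only that special case in plain C++; the names dal, pal, and qal are hypothetical stand-ins and do not implement the full dexal, pexal, and qexal interface listed above.

```cpp
#include <cassert>
#include <cmath>

// Asymmetric Laplace density with quantile level p0, location mu, and scale sigma:
//   f(x) = p0 * (1 - p0) / sigma * exp(-rho_{p0}((x - mu) / sigma)),
// where rho_p(u) = u * (p - 1{u < 0}) is the check loss. Computed on the log
// scale to limit underflow in the tails.
double dal(double x, double p0, double mu, double sigma, bool log_density = false) {
    assert(p0 > 0.0 && p0 < 1.0 && sigma > 0.0);
    const double u = (x - mu) / sigma;
    const double rho = u * (p0 - (u < 0.0 ? 1.0 : 0.0));
    const double log_f = std::log(p0) + std::log1p(-p0) - std::log(sigma) - rho;
    return log_density ? log_f : std::exp(log_f);
}

// Asymmetric Laplace CDF; by construction the CDF at mu equals p0.
double pal(double q, double p0, double mu, double sigma) {
    assert(p0 > 0.0 && p0 < 1.0 && sigma > 0.0);
    const double u = (q - mu) / sigma;
    if (u <= 0.0) return p0 * std::exp((1.0 - p0) * u);
    return 1.0 - (1.0 - p0) * std::exp(-p0 * u);
}

// Asymmetric Laplace quantile function (inverse of pal) for p in (0, 1).
double qal(double p, double p0, double mu, double sigma) {
    assert(p > 0.0 && p < 1.0);
    assert(p0 > 0.0 && p0 < 1.0 && sigma > 0.0);
    if (p <= p0) return mu + sigma * std::log(p / p0) / (1.0 - p0);
    return mu - sigma * std::log((1.0 - p) / (1.0 - p0)) / p0;
}
```

Two analytical checks follow directly from the formulas: pal(mu, p0, mu, sigma) equals p0, and qal(pal(x, ...), ...) recovers x on both sides of mu. The same style of check can be reused once the gamma != 0 case is implemented.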

Hard: Bayesian Dynamic Quantile Regression via VB

  • Implement a univariate Bayesian dynamic quantile regression model using the exAL distribution in R via Variational Bayes (VB), following Barata et al. (2022).
  • Use Laplace/Delta approximation for VB inference of non-conjugate parameters, following Wang & Blei (2012).
  • Compare model performance and inference results against the R package dynquant.
  • Validate posterior convergence, quantile estimates, and parameter inference accuracy.
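The Laplace step mentioned in the Hard test can be illustrated in one dimension: locate the mode of a log-density by Newton's method and use the negative inverse second derivative at the mode as the Gaussian variance. The sketch below is a generic Laplace approximation, not the specific update of Wang & Blei (2012) or Barata et al. (2022); the names LaplaceApprox and laplace_approx_1d are hypothetical, and the caller is assumed to supply analytic first and second derivatives of a locally concave log-density.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Gaussian approximation N(mean, variance) to a one-dimensional density.
struct LaplaceApprox {
    double mean;      // mode of the log-density
    double variance;  // -1 / (second derivative of the log-density at the mode)
    bool converged;   // whether Newton's method met the tolerance
};

// Newton iterations on the log-density, given its gradient and second derivative.
LaplaceApprox laplace_approx_1d(const std::function<double(double)>& grad,
                                const std::function<double(double)>& hess,
                                double x0, int max_iter = 100, double tol = 1e-10) {
    double x = x0;
    bool converged = false;
    for (int i = 0; i < max_iter; ++i) {
        const double h = hess(x);
        assert(h < 0.0);  // the log-density must be locally concave
        const double step = grad(x) / h;
        x -= step;        // Newton update: x <- x - f'(x) / f''(x)
        if (std::fabs(step) < tol) {
            converged = true;
            break;
        }
    }
    return {x, -1.0 / hess(x), converged};
}
```

As a check with a known answer, the Gamma(5, 2) kernel has log-density 4 log(x) - 2x, gradient 4/x - 2, and second derivative -4/x^2; starting from x0 = 1 the routine converges to mean 2 and variance 1, which match the exact mode (a - 1)/b and curvature-based variance (a - 1)/b^2.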

Current Contributor Test Solutions

| Contributor Name | GitHub Profile | Test Status |
|---|---|---|
| Antonio Aguirre | AntonioAPDL | Medium Test Completed; Hard Test In Progress |

Additional Resources & Documentation


Contact Information

Feel free to contact Antonio directly to express interest, offer mentorship, or discuss potential collaboration.