ML4EO ‐ MVP - jejjohnson/research_journal_v2 GitHub Wiki
Interpolation
Ideas
- Patch-Based GPR
- EOF Priors with Probabilistic PCA
- GP Priors with Conditional Flows
- L2 Data Priors with Conditional Flows
- L3 Data Priors with Conditional Flows
- Deep Equilibrium Models
- Dynamical Emulators
Gaussian Process
The upside of interpolation methods is that they are physically consistent in the sense that points that are closer together should have similar values. The downside is that they are not causally consistent, especially with respect to time. In addition, coordinate-based methods have trouble capturing multiscale activity because this requires a very large number of samples, which quickly becomes expensive.
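To make the "closer points should be similar" idea concrete, here is a minimal sketch of GP regression from scratch in NumPy. It assumes a zero-mean GP with an RBF kernel on 1D coordinates; the function names, hyperparameter values, and the synthetic sine data are illustrative choices, not taken from the journal.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 1D coordinates."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise=1e-2, length_scale=1.0):
    """Posterior mean and marginal variance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, length_scale)
    K_ss_diag = np.diag(rbf_kernel(x_test, x_test, length_scale))
    # Cholesky solve instead of an explicit inverse for numerical stability
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = K_ss_diag - np.sum(v**2, axis=0)
    return mean, var


# Hypothetical toy data: scattered samples of a sine curve
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 10.0, size=30))
y_train = np.sin(x_train)
x_test = np.linspace(0.0, 10.0, 50)
mean, var = gp_predict(x_train, y_train, x_test, noise=1e-4)
```

The Cholesky factorization of the full kernel matrix costs O(N^3) in the number of samples, which is the expense referred to above and the motivation for the patch-based and sparse variants.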
EOF
We use the classic DINEOF algorithm to perform gap-filling on missing data.
Resources
- EUMETSAT Training
- Linear Algebra with xarray - einstats
- Computational Linear Algebra - FastAI
- Randomized SVD - scikit-learn | Cola - TBD | rSVD - JAX
- Truncated SVD - scikit-learn
- PCA - scikit-learn
- Many Matrix Factorization & Decomposition Algorithms - scikit-learn
- Scalable Eigenvalue Decomposition on GPU - Cola - Docs
- Adjoint Matrix Linear Operator - Cola - Adjoint | Cola - Linear Operator Method
- Overview - POD, DMD, etc - libROM
Missing Data Challenges
- Losses with masks - MVN - BayesNewton | numpyro
- Initialization with missing data
- Land-Ocean Masks - xeofs - nan types | xeofs - Sanitizer
Baseline
- Classic DINEOF algorithm
- Iterative Updates
- Scalable Iterative Eigenvalue Decomposition
- Covariance Matrix Regularizers (Laplacian)
- Equilibrium Model Formulation
Examples
- Simplest Example (no fast eigenvalue solver) - tieof
- Simple Example (no fast eigenvalue solver) - PyPlume
Strategy
- Parameter Estimation w/ Probabilistic PCA
- State Estimation w/ PPCA AutoEncoder
- Latent State Estimation w/ PPCA Decoder
Field Initialization
- Mean
- Partial Convolutions - Astropy
Weight Initialization
- scikit-learn - PCA
Tutorials - GP
- Locality - K-Nearest Neighbours (Unstructured, Semantic) vs Radius Neighbours (Structured)
- Weighted Distances - Inverse, RBF
- Scale - Algorithm (KD-Tree, Ball-Tree, R-Tree, PyNNDescent)
- Scale - Hardware (Parallel CPU, GPU)
- Kernel Density Estimation
- GPs from scratch
- GPs with a PPL
- Scale - Algorithm (Subset, Approximate Kernel - inversion, logdet), Hardware (GPU)
- Reduced Points - Sparse GP (Fixed vs Variable)
- Locality at Scale - Patches - Split-Apply-Combine (Patch Size, Stride, DataLoader, Weighted Stitching)
- MegaScale Patching from the cloud
- Patches - MegaScale Combination with memory issues
- Patches vs Neighbours
- Patch-Based - GP & SparseGP Interpolation
- Linear Regression
- Basis Function - Polynomial, RBF, Spherical Harmonics
- Neural Fields
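The locality and weighted-distance items above can be sketched as a k-nearest-neighbour interpolator with inverse-distance weights. This is a brute-force NumPy version under assumed names (`knn_idw`, `k`, `power`, `eps`); at scale the neighbour search would be replaced by one of the tree-based or approximate methods listed.

```python
import numpy as np


def knn_idw(coords_obs, values_obs, coords_query, k=5, power=2.0, eps=1e-12):
    """Interpolate at query points from the k nearest observations."""
    # Brute-force pairwise distances, shape (n_query, n_obs)
    diff = coords_query[:, None, :] - coords_obs[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    # Indices of the k smallest distances per query (unordered within the k)
    idx = np.argpartition(dist, k - 1, axis=1)[:, :k]
    d_k = np.take_along_axis(dist, idx, axis=1)
    # Inverse-distance weights; eps avoids division by zero at exact matches
    weights = 1.0 / (d_k**power + eps)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights * values_obs[idx]).sum(axis=1)


# Hypothetical scattered observations on the unit square
rng = np.random.default_rng(0)
coords_obs = rng.uniform(0.0, 1.0, size=(100, 2))
values_obs = np.sin(coords_obs[:, 0]) + np.cos(coords_obs[:, 1])
coords_query = rng.uniform(0.0, 1.0, size=(20, 2))
values_query = knn_idw(coords_obs, values_obs, coords_query, k=5)
```

Swapping the weight line for an RBF weight, for example `np.exp(-0.5 * (d_k / length_scale) ** 2)` with a user-chosen `length_scale`, gives the other weighting listed above.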
Tutorials
In this tutorial, we will look at feature extractors as a way to fill in the gaps. We will start with the simplest method, PCA/EOFs/POD, which is a parametric linear method, and apply it to missing data. Afterwards, we will enhance this method by using non-linear representations.
- PCA From Scratch - EOFs Perspective - SVD or Eigs | A Tutorial on the Proper Orthogonal Decomposition
- Scalable PCA From Scratch - Jax + GPU + Randomized PCA | Randomized Eigs
- PCA w/ scikit-learn - API, Scale
- PCA as a minimization problem
- PCA w/ Observation Operator (Missing Data) - DEQ vs BiLevelOpt | Amortization Tutorial
- DINEOF w/ Missing Data
- AE with missing data - CNN AE Denoiser - Keras | MAE - keras
- PPCA from scratch - PPCA w/ EM | has PPCA w/ EM - Tipping | PPCA w/ EM (clear)
- PPCA w/ Missing Data - PPCA + EM
- PPCA with Numpyro - MLE, MAP, VI, MCMC
- State Estimation w/ PPCA AE
- Latent State Estimation w/ PPCA Decoder
- VAE from scratch
- VAE generalization - Conditional Flow Model w/ Stochastic Transformations
- State Estimation w/ Conditional Flow
- Latent State Estimation w/ Conditional Flow
- Patch-Based Conditional Flow
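As a starting point for the first item in this list, here is a hedged sketch of PCA from scratch computed two ways: from the SVD of the centred data and from the eigendecomposition of the covariance matrix. The function names and the synthetic data are assumptions for illustration. Rows are samples (time steps) and columns are features (grid points), so the components play the role of EOFs.

```python
import numpy as np


def pca_svd(X, n_components):
    """PCA via SVD of the centred data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                     # spatial patterns (EOFs)
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    scores = U[:, :n_components] * s[:n_components]    # time coefficients (PCs)
    return components, explained_variance, scores


def pca_eig(X, n_components):
    """PCA via eigendecomposition of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    evals, evecs = np.linalg.eigh(cov)                 # ascending eigenvalues
    order = np.argsort(evals)[::-1][:n_components]
    return evecs[:, order].T, evals[order]


# Hypothetical data with well-separated variances per direction
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])

comps_svd, var_svd, scores = pca_svd(X, n_components=3)
comps_eig, var_eig = pca_eig(X, n_components=3)
```

Both routes give the same variances and the same components up to a sign flip. The SVD route avoids forming the covariance matrix, which matters when the number of grid points is large.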
Tutorials - Dynamical Models
In this tutorial, we will work with dynamical models. We will look at the anatomy of a state space model to understand all of the pieces. Then we will look at the Dynamic Mode Decomposition as the simplest starting point, followed by more non-linear structures that take inspiration from numerical methods.
- Anatomy of a State Space Model - Initial Condition, Dynamical Model, Measurement Model
- DMD From Scratch - Video
- Scaling -> Randomized SVD, Scalable Eigenvalue Decomposition
- DMD as a Minimization Problem - OptDMD
- DMD w/ Missing Observations
- Markovian Gaussian Process
- Conditional Flows
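For the "DMD From Scratch" item, a minimal sketch of exact DMD in NumPy is given below. It assumes a snapshot matrix of shape (n_features, n_times) sampled at a fixed time step; the function name `dmd`, the `rank` parameter, and the toy damped rotation are hypothetical. A full SVD is used, which is where the randomized SVD mentioned in the scaling item would slot in.

```python
import numpy as np


def dmd(snapshots, rank):
    """Exact DMD: eigenvalues and modes of the best-fit linear map Y ~ A X."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, V = U[:, :rank], s[:rank], Vh[:rank].conj().T
    # Reduced operator A_tilde = U^H Y V S^{-1}; dividing by s scales the columns
    A_tilde = U.conj().T @ Y @ V / s
    eigvals, W = np.linalg.eig(A_tilde)
    modes = (Y @ V / s) @ W
    return eigvals, modes


# Hypothetical toy system: a damped 2D rotation observed through 10 features
theta = 0.3
A = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
rng = np.random.default_rng(0)
Q = rng.normal(size=(10, 2))          # linear measurement map
latent = [np.array([1.0, 0.0])]       # initial condition
for _ in range(49):
    latent.append(A @ latent[-1])     # dynamical model
snapshots = Q @ np.array(latent).T    # measurement model, shape (10, 50)

eigvals, modes = dmd(snapshots, rank=2)
```

The toy setup mirrors the anatomy listed above: an initial condition, a linear dynamical model, and a linear measurement model. The recovered eigenvalues should have the damping factor as their modulus and the rotation angle as their phase.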