Star Stuff: ImageMM (MFDeconvolution) - setiastro/setiastrosuitepro GitHub Wiki
Status: Stable core, streaming/tiled variant merged
Context: This page documents how our MFDeconv in Seti Astro Suite / SASpro evolved from the original “simple RL over many aligned frames” into the current, robust, GPU-aware, SR-aware, and Mac-safe implementation.
Related docs:
-
📄 Our whitepaper: Seti Astro Multi-Frame Deconvolution in the Presence of Outliers and Heterogeneous PSFs – (put your URL here)
https://www.setiastro.com/docs/mfdeconv-whitepaper -
📄 Original paper (Sukurdeep et al., 2025): Robust Multi-Frame Deconvolution for Seeing-Limited Astrophotography –
https://iopscience.iop.org/article/10.3847/1538-3881/adfb72/pdf -
💻 SAS / SASpro implementation (repo) – (link to your GitHub / internal repo page)
https://github.com/setiastro/setiastrosuitepro
(see:pro/mfdeconv.py,pro/star_alignment.py,legacy/image_manager.py,pro/runtime_torch.py)
Our MFDeconv is designed for large, aligned astro stacks (tens to hundreds of frames) where:
-
Each frame can have a slightly different PSF (focus drift, seeing, guiding, filter changes).
-
Some frames contain impulsive outliers (planes, satellites, sensor glitches).
-
We want to optionally solve on a super-resolution (SR) grid and back-project corrections.
-
We want it to run on GPU when available (CUDA / MPS / DirectML) but not crash Macs.
-
We want it to stream and tile so that laptops with 16 GB can still finish the run.
This is basically the pipeline described in Sukurdeep et al. (2025), but:
-
we tightened the seeding (robust median / σ-clipped running mean),
-
we generalized to FITS and XISF with tile-level reads,
-
we added an early-stop based on actual update magnitudes,
-
we added a GPU → CPU fallback when the device goes busy mid-run.
-
Sukurdeep et al. (2025) describe a multi-frame RL-style method where each frame contributes a PSF-weighted correction, and robustness is obtained by downweighting large residuals (Huber / Tukey class).
→ Link:https://doi.org/10.xxxx/sukurdeep2025-mfdeconv(placeholder) -
Our whitepaper rephrases that in Seti Astro terms:
-
“frame” = calibrated + registered light
-
“PSF” = per-frame PSF measured during alignment or synthesized from FWHM
-
“robustness” = σ-clipped or Huber-weighted residuals
-
“streaming” = read just enough of each frame to update the global estimate
→ Link:https://www.setiastro.com/docs/mfdeconv-whitepaper(placeholder)
-
-
Implementation note: our code lives in SASpro right next to the stacking, star-alignment, and PSF tools so we can reuse:
-
per-frame PSFs from alignment
-
star masks from SEP
-
variance maps from background estimation
-
-
Scan stack
-
Find common intersection size across FITS/XISF (
_common_hw_from_paths(...)). -
Prime XISF cache if any XISF present.
-
-
Build / lift PSFs
-
Build per-frame PSFs and their adjoints (flipped PSFs).
-
If SR is requested (r > 1): lift PSFs to SR grid (solve or naïve upsample), force odd size.
-
-
Seed (robust)
-
If
seed_mode="median"→ use tiled median over all frames, FITS/XISF-aware, channel-aware. -
Else → use σ-clipped bootstrap streaming: average first B, estimate global MAD, masked-mean, then Welford updates.
-
-
SR lift (optional)
-
If r > 1, upsample seed to SR grid using sum-preserving upsample.
-
-
Iterate (RL/MM style) for
iters:-
For each batch of frames:
-
Load frame(s) as CHW (float32), plus mask and variance if present.
-
Predict with that frame’s PSF on the SR grid.
-
Downsample prediction to native if r > 1.
-
Form residual = (observed − predicted).
-
Build robust weight map = f(residual, mask, variance, huber_delta).
-
Upsample weighted residuals back to SR grid if r > 1.
-
Back-project with adjoint PSF → accumulate num and den.
-
-
Do multiplicative update
x ← x * clip(num/den, 1/kappa, kappa)
and relax:
x ← (1 − α) x + α x_next
-
-
Early stop
-
Track EMA of |update−1| and EMA of relative change.
-
If both small for ≥2 iters and
it ≥ min_iters→ stop.
-
-
Save FITS with full MFDeconv metadata.
### 3a. Robustness: Huber, δ (delta), and κ (kappa)
This is the little “why is it not blowing up?” section.
Our MFDeconv lets you pick ρ (the loss) as either:
-
rho="l2"→ plain least-squares (fast, but very sensitive to bad pixels) -
rho="huber"→ Huber loss (what we recommend)
Huber is a loss that is quadratic near 0 (like L2) and linear in the tails (like L1). For a residual ( r ):
[
\text{Huber}_\delta(r) =
\begin{cases}
\frac{1}{2} r^2, & |r| \le \delta
\delta \bigl(|r| - \frac{1}{2}\delta\bigr), & |r| > \delta
\end{cases}
]
- Small residuals → treated like normal L2 → good for detail.
- Big residuals → downweighted → good for planes, hot pixels, cosmic rays, bad dither.
From this loss we get a weight ( \frac{\psi(r)}{r} ) that we apply to each pixel’s correction. That’s the thing you see in the code as psi_over_r.
-
δ is the threshold where we switch from “this is a normal residual” to “this is probably an outlier.”
-
In our code you can:
- set a fixed
huber_delta = 0.01(example), or - set auto by passing negative delta (what the code does in the GPU path): it measures the batch RMS/MAD and picks a δ that matches the actual noise of this stack.
- set a fixed
Smaller δ → more aggressive rejection. Bigger δ → more forgiving.
So:
- If you have a lot of impulsive junk → smaller δ (or keep auto, it will track it).
- If your data is very clean and you want it to behave like classic RL → bigger δ or even
rho="l2".
After we build the usual RL/MM ratio
[ \text{ratio} = \frac{\text{num}}{\text{den} + \varepsilon} ]
we do not trust it blindly. We clip it:
upd = clip(ratio, 1/κ, κ)
x = x * upd
- κ > 1 limits how much a single iteration can brighten/darken any pixel.
- Typical values: 2.0 (our default), sometimes 1.5 if you want it really calm.
Lower κ → safer, slower, fewer artifacts, but more iterations. Higher κ → more aggressive, faster local contrast recovery, but easier to ring/blow out.
This is independent of Huber. Huber says “how much do I trust this pixel’s residual?” κ says “how much do I let any pixel change this iteration?”
- Big residual + Huber → residual gets downweighted → less influence → stable.
- Big residual + L2 → residual stays big → can dominate → unstable unless κ is low.
- Auto-delta Huber + κ≈2 → good general-purpose setting for real astro stacks.
-
Noisy, widefield, satellites
rho="huber"-
huber_delta = -2.5(auto from MAD, 2.5×) -
kappa = 1.8→ very safe.
-
Good data, want faster convergence
rho="huber"-
huber_delta = -3.5(more forgiving) -
kappa = 2.0 – 2.5→ sharper per-iter.
-
Lab / synthetic / no outliers
rho="l2"-
kappa = 2.5→ classic RL feel.
-
Tiled median (
_seed_median_streaming(...)):-
Reads tiles
(256×256)over all frames -
Works on luma or per-channel
-
Can offload the per-tile median to torch on GPU
-
Uses FITS/XISF tile readers
-
-
Bootstrap + σ-clipped Welford (
_seed_bootstrap_streaming(...)):-
Average first B frames
-
Estimate global MAD → build mask
-
Masked mean over B
-
Then Welford μ–σ updates with σ-clip on the rest
-
Why: planes/sats/hot columns in early frames used to poison the seed. This fixes it.
We introduced a generic helper:
_mem_model(
grid_hw=(H, W),
r=sr_factor,
ksize=psf_size,
channels=C,
mem_target_mb=mem_budget,
prefer_tiles=False,
)
This picks:
-
batch_frames (how many frames to process per iter)
-
tiles (HxW) + halo size
When low_mem=True:
-
disable CPU→GPU prefetch
-
clamp LRU cache to 2
-
use bigger SEP tiles for varmaps
-
shrink star-mask canvas
👉 This is purely to bound peak memory on 16 GB Macs and lower-tier Windows laptops.
On Apple Silicon we can’t rely on the “1 big grouped conv with G=B*C” trick. So we added the guarded version:
-
CUDA / CPU: run one grouped conv
-
MPS: loop per-sample, per-channel → slower but correct → no random crashes
This is the part that fixes “it works on Windows (CUDA) but not on Mac.”
If, during an iteration, we hit:
-
CUDA OOM that can’t be fixed by halving batch, or
-
CUDA “device busy / unavailable” type error,
then we:
-
Switch this iteration to CPU,
-
Recompute accumulators in NumPy,
-
From that point on, keep running on CPU.
That way the user doesn’t lose a 30-minute run because Windows decided to hiccup the GPU.
Old “quick” SR paths just upscaled at the end.
Current path:
-
PSFs are lifted to SR
-
Predictions are made on SR
-
Residuals are formed on native
-
Adjoint is done on SR
-
Final image is already SR sized
So SR isn’t cosmetic — it’s in the actual deconvolution math.
Let:
-
( x ) = current SR estimate (C×H×W)
-
( y_i ) = i-th observed frame (C×h×w)
-
( H_i ) = forward operator = (PSF conv on SR) + (downsample if SR>1)
-
( H_i^\top ) = adjoint = (upsample if SR>1) + (PSFᵀ conv on SR)
-
( w_i ) = robust weight map (includes mask & variance)
Then per iter we do:
-
Forward:
[
\hat y_i = H_i x
] -
Residual:
[
r_i = y_i - \hat y_i
] -
Robust weight (Huber):
[
w_i = \frac{\psi(r_i)}{r_i} \cdot m_i \cdot \frac{1}{\sigma_i^2 + \varepsilon}
]
where ( m_i ) is the mask (0…1) and ( \sigma_i^2 ) is the variance map. -
Backproject:
[
\text{num} \mathrel{+}= H_i^\top ( w_i \odot y_i )
]
[
\text{den} \mathrel{+}= H_i^\top ( w_i \odot \hat y_i )
] -
Multiplicative update:
[
x \leftarrow x \cdot \text{clip}\left(\frac{\text{num}}{\text{den} + \varepsilon}, \frac{1}{\kappa}, \kappa \right)
] -
Relax:
[
x \leftarrow (1-\alpha) x_{\text{old}} + \alpha x
]
This is the same structure as Sukurdeep et al., except:
-
we explicitly model masks,
-
we explicitly model per-frame variances,
-
we allow SR in both forward and adjoint.
-
We support:
-
2-D FITS → interpreted as mono
-
3-D FITS (CHW or HWC) → we detect channel axis
-
XISF → we go through
_xisf_cached_array(...)and slice tiles
-
-
Tile readers:
-
_read_tile_fits(...)– FITS-only, memmaped -
_read_tile_fits_any(...)– FITS or XISF, figure out axis order, return HW / CHW / HWC -
_infer_channels_from_tile(...)– reads a 1×1 tile to decide C=1 or 3
-
Why this is here: this is what makes the tiled median seed possible.
When MFDeconv finishes, we write a FITS with at least:
-
MFDECONV = T -
MF_COLOR = 'luma' | 'rgb' | ... -
MF_RHO = 'huber' | 'l2' -
MF_HDEL = <float>(Huber delta; <0 means “auto from RMS”) -
MF_MASK = T/F(auto star masks used) -
MF_VAR = T/F(auto variance maps used) -
MF_SR = <int>(super-resolution factor, 1 = native) -
MF_SRSIG = <float>(if SR>1) -
MF_SRIT = <int>(if SR>1) -
MF_ITMAX = <int>(requested iters) -
MF_ITERS = <int>(actual iters used, post early-stop) -
MF_ESTOP = T/F(early stop triggered)
This is so SAS can later show “how was this built?” in the UI.
-
Core logic (what you pasted):
-
pro/mfdeconv.py(or your current name) -
possibly split helpers in
legacy/image_manager.pyfor FITS/XISF
-
-
Upstream producers (what feeds it):
-
pro/star_alignment.py→ PSFs, transforms -
pro/stacking_suite.py→ calibrated + registered frames -
pro/runtime_torch.py→ uniform torch import on Windows/macOS/Linux
-
-
Downstream consumers (what reads the result):
-
main doc manager (
pro/doc_manager.py) -
image viewers (Qt) – show the MFDeconv FITS
-
future: drizzle / SR inspect
-
| Paper / Doc | What we took | What we added |
|---|---|---|
| Sukurdeep et al. (2025) | Per-frame PSF, robust residual weighting, adjoint back-projection | SR-coherent loop, per-frame variance maps, SAS FITS/XISF realities |
| Our MFDeconv Whitepaper | SAS-specific terms (light, PSF from alignment, SEP masks), streaming/tiled story | GPU→CPU fallback, MPS-safe grouped conv, early-stop heuristics |
| SAS Implementation | all of the above | I/O adapters, QProgress feedback, per-iter saves, QSettings-friendly options |
-
2025-03 – integrated low-mem branch (tiled median, smaller SEP tiles, LRU clamp)
-
2025-03 – added MPS-safe grouped conv (fixes mac)
-
2025-03 – added GPU→CPU fallback (keeps run alive)
-
2025-03 – unified SR path (PSF, forward, adjoint all on SR grid)
-
2025-04 – documented in this wiki
-
Whitepaper (Seti Astro):
https://www.setiastro.com/docs/mfdeconv-whitepaper -
Sukurdeep et al. 2025 (original):
https://iopscience.iop.org/article/10.3847/1538-3881/adfb72/pdf -
SAS / SASpro implementation:
https://github.com/setiastro/setiastrosuitepro/tree/main/pro
(look formfdeconv,runtime_torch, and FITS/XISF helpers)
If you drop this into the wiki as-is, just replace the three placeholder links above with the real ones and it’ll read clean.