Star Stuff: ImageMM (MFDeconvolution) - setiastro/setiastrosuitepro GitHub Wiki

Multi-Frame Deconvolution (MFDeconv) – Seti Astro Suite Implementation

Status: Stable core, streaming/tiled variant merged
Context: This page documents how our MFDeconv in Seti Astro Suite / SASpro evolved from the original “simple RL over many aligned frames” into the current, robust, GPU-aware, SR-aware, and Mac-safe implementation.

Related docs:

  • 📄 Our whitepaper: Seti Astro Multi-Frame Deconvolution in the Presence of Outliers and Heterogeneous PSFs
    https://www.setiastro.com/docs/mfdeconv-whitepaper

  • 📄 Original paper (Sukurdeep et al., 2025): Robust Multi-Frame Deconvolution for Seeing-Limited Astrophotography
    https://iopscience.iop.org/article/10.3847/1538-3881/adfb72/pdf

  • 💻 SAS / SASpro implementation (repo)
    https://github.com/setiastro/setiastrosuitepro
    (see: pro/mfdeconv.py, pro/star_alignment.py, legacy/image_manager.py, pro/runtime_torch.py)


1. Overview

Our MFDeconv is designed for large, aligned astro stacks (tens to hundreds of frames) where:

  1. Each frame can have a slightly different PSF (focus drift, seeing, guiding, filter changes).

  2. Some frames contain impulsive outliers (planes, satellites, sensor glitches).

  3. We want to optionally solve on a super-resolution (SR) grid and back-project corrections.

  4. We want it to run on GPU when available (CUDA / MPS / DirectML) but not crash Macs.

  5. We want it to stream and tile so that laptops with 16 GB can still finish the run.

This is basically the pipeline described in Sukurdeep et al. (2025), but:

  • we tightened the seeding (robust median / σ-clipped running mean),

  • we generalized to FITS and XISF with tile-level reads,

  • we added an early-stop based on actual update magnitudes,

  • we added a GPU → CPU fallback when the device goes busy mid-run.


2. Background and References

  • Sukurdeep et al. (2025) describe a multi-frame RL-style method where each frame contributes a PSF-weighted correction, and robustness is obtained by downweighting large residuals (Huber / Tukey class).
    → Link: https://iopscience.iop.org/article/10.3847/1538-3881/adfb72/pdf

  • Our whitepaper rephrases that in Seti Astro terms:

    • “frame” = calibrated + registered light

    • “PSF” = per-frame PSF measured during alignment or synthesized from FWHM

    • “robustness” = σ-clipped or Huber-weighted residuals

    • “streaming” = read just enough of each frame to update the global estimate
      → Link: https://www.setiastro.com/docs/mfdeconv-whitepaper

  • Implementation note: our code lives in SASpro right next to the stacking, star-alignment, and PSF tools so we can reuse:

    • per-frame PSFs from alignment

    • star masks from SEP

    • variance maps from background estimation


3. High-Level Algorithm (what we actually run)

  1. Scan stack

    • Find common intersection size across FITS/XISF (_common_hw_from_paths(...)).

    • Prime XISF cache if any XISF present.

  2. Build / lift PSFs

    • Build per-frame PSFs and their adjoints (flipped PSFs).

    • If SR is requested (r > 1): lift PSFs to SR grid (solve or naïve upsample), force odd size.

  3. Seed (robust)

    • If seed_mode="median" → use tiled median over all frames, FITS/XISF-aware, channel-aware.

    • Else → use σ-clipped bootstrap streaming: average first B, estimate global MAD, masked-mean, then Welford updates.

  4. SR lift (optional)

    • If r > 1, upsample seed to SR grid using sum-preserving upsample.

  5. Iterate (RL/MM style) for iters:

    • For each batch of frames:

      • Load frame(s) as CHW (float32), plus mask and variance if present.

      • Predict with that frame’s PSF on the SR grid.

      • Downsample prediction to native if r > 1.

      • Form residual = (observed − predicted).

      • Build robust weight map = f(residual, mask, variance, huber_delta).

      • Upsample weighted residuals back to SR grid if r > 1.

      • Back-project with adjoint PSF → accumulate num and den.

    • Do multiplicative update
      x ← x * clip(num/den, 1/kappa, kappa)
      and relax:
      x ← (1 − α) x + α x_next

  6. Early stop

    • Track EMA of |update−1| and EMA of relative change.

    • If both small for ≥2 iters and it ≥ min_iters → stop.

  7. Save FITS with full MFDeconv metadata.
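The early-stop bookkeeping in step 6 can be sketched as a small helper (an illustrative sketch with hypothetical names and tolerances, not the SASpro code):

```python
def ema_update(prev, value, beta=0.8):
    """Exponential moving average; beta is the smoothing factor."""
    return value if prev is None else beta * prev + (1.0 - beta) * value

class EarlyStop:
    """Stop when both EMAs stay below tolerance for >= patience iters."""
    def __init__(self, tol=1e-3, patience=2, min_iters=5):
        self.tol, self.patience, self.min_iters = tol, patience, min_iters
        self.ema_upd = None   # EMA of mean |update - 1|
        self.ema_rel = None   # EMA of relative change ||x_new - x|| / ||x||
        self.quiet = 0        # consecutive "quiet" iterations

    def should_stop(self, it, mean_abs_upd_minus_1, rel_change):
        self.ema_upd = ema_update(self.ema_upd, mean_abs_upd_minus_1)
        self.ema_rel = ema_update(self.ema_rel, rel_change)
        quiet_now = self.ema_upd < self.tol and self.ema_rel < self.tol
        self.quiet = self.quiet + 1 if quiet_now else 0
        return it >= self.min_iters and self.quiet >= self.patience
```

Tracking the update magnitude (and not just the image delta) is what lets the loop stop once the multiplicative ratios collapse to ≈1 everywhere.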


3a. Robustness: Huber, δ (delta), and κ (kappa)

This is the little “why is it not blowing up?” section.

Our MFDeconv lets you pick ρ (the loss) as either:

  • rho="l2" → plain least-squares (fast, but very sensitive to bad pixels)
  • rho="huber" → Huber loss (what we recommend)

What is Huber?

Huber is a loss that is quadratic near 0 (like L2) and linear in the tails (like L1). For a residual ( r ):

[ \text{Huber}_\delta(r) = \begin{cases} \frac{1}{2} r^2, & |r| \le \delta \\ \delta \bigl(|r| - \frac{1}{2}\delta\bigr), & |r| > \delta \end{cases} ]

  • Small residuals → treated like normal L2 → good for detail.
  • Big residuals → downweighted → good for planes, hot pixels, cosmic rays, bad dither.

From this loss we get a weight ( \frac{\psi(r)}{r} ) that we apply to each pixel’s correction. That’s the thing you see in the code as psi_over_r.
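In NumPy terms, that weight looks roughly like this (a minimal sketch; the in-repo psi_over_r also folds in the mask and variance map):

```python
import numpy as np

def psi_over_r_huber(r, delta):
    """Huber influence over residual: 1 where |r| <= delta, delta/|r| outside.

    Small residuals keep full (L2) weight; large residuals are downweighted
    toward an L1-like linear penalty.
    """
    a = np.abs(r)
    # max(a, eps) avoids division by zero; min with 1 handles both branches
    return np.minimum(1.0, delta / np.maximum(a, 1e-12))
```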

What is δ (delta)?

  • δ is the threshold where we switch from “this is a normal residual” to “this is probably an outlier.”

  • In our code you can:

    • set a fixed huber_delta = 0.01 (example), or
    • set auto by passing negative delta (what the code does in the GPU path): it measures the batch RMS/MAD and picks a δ that matches the actual noise of this stack.

Smaller δ → more aggressive rejection. Bigger δ → more forgiving.

So:

  • If you have a lot of impulsive junk → smaller δ (or keep auto, it will track it).
  • If your data is very clean and you want it to behave like classic RL → bigger δ or even rho="l2".
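The auto-δ idea can be sketched as follows (hedged: the real GPU path measures batch RMS/MAD; the 1.4826 factor is the standard MAD→σ conversion for Gaussian noise):

```python
import numpy as np

def auto_huber_delta(residuals, k=2.5):
    """Pick delta = k * sigma, with sigma estimated robustly via MAD.

    In the scheme described above, a negative huber_delta such as -2.5
    is read as 'auto, with k = 2.5'.
    """
    med = np.median(residuals)
    mad = np.median(np.abs(residuals - med))
    sigma = 1.4826 * mad  # MAD -> sigma under Gaussian noise
    return k * sigma
```

Because δ tracks the stack's own noise floor, the same setting behaves sensibly on both narrowband and broadband data.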

What is κ (kappa) in our update?

After we build the usual RL/MM ratio

[ \text{ratio} = \frac{\text{num}}{\text{den} + \varepsilon} ]

we do not trust it blindly. We clip it:

upd = clip(ratio, 1/κ, κ)
x   = x * upd

  • κ > 1 limits how much a single iteration can brighten/darken any pixel.
  • Typical values: 2.0 (our default), sometimes 1.5 if you want it really calm.

Lower κ → safer, slower, fewer artifacts, but more iterations. Higher κ → more aggressive, faster local contrast recovery, but easier to ring/blow out.

This is independent of Huber. Huber says “how much do I trust this pixel’s residual?” κ says “how much do I let any pixel change this iteration?”

How do these interact?

  • Big residual + Huber → residual gets downweighted → less influence → stable.
  • Big residual + L2 → residual stays big → can dominate → unstable unless κ is low.
  • Auto-delta Huber + κ≈2 → good general-purpose setting for real astro stacks.

Practical tuning (for the wiki)

  • Noisy, widefield, satellites

    • rho="huber"
    • huber_delta = -2.5 (auto from MAD, 2.5×)
    • kappa = 1.8 → very safe.
  • Good data, want faster convergence

    • rho="huber"
    • huber_delta = -3.5 (more forgiving)
    • kappa = 2.0 – 2.5 → sharper per-iter.
  • Lab / synthetic / no outliers

    • rho="l2"
    • kappa = 2.5 → classic RL feel.

4. Key Additions vs “older normal” MFDeconv

4.1 Robust seeding (two paths)

  • Tiled median (_seed_median_streaming(...)):

    • Reads tiles (256×256) over all frames

    • Works on luma or per-channel

    • Can offload the per-tile median to torch on GPU

    • Uses FITS/XISF tile readers

  • Bootstrap + σ-clipped Welford (_seed_bootstrap_streaming(...)):

    • Average first B frames

    • Estimate global MAD → build mask

    • Masked mean over B

    • Then Welford μ–σ updates with σ-clip on the rest

Why: planes/sats/hot columns in early frames used to poison the seed. This fixes it.
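The σ-clipped Welford step can be sketched per pixel like this (hypothetical names; the real code vectorizes over tiles):

```python
import numpy as np

def welford_clip_update(mean, m2, count, x, nsigma=3.0):
    """One sigma-clipped Welford step on per-pixel running mean/variance.

    Pixels where |x - mean| exceeds nsigma * running sigma are skipped,
    so a satellite trail in one frame cannot drag the seed.
    """
    var = np.where(count > 1, m2 / np.maximum(count - 1, 1), np.inf)
    keep = np.abs(x - mean) <= nsigma * np.sqrt(var)
    n = count + keep
    delta = np.where(keep, x - mean, 0.0)
    mean = mean + np.where(keep, delta / np.maximum(n, 1), 0.0)
    delta2 = np.where(keep, x - mean, 0.0)
    m2 = m2 + delta * delta2
    return mean, m2, n
```

Until two frames have been accepted the variance is treated as infinite, so the clip only engages once there is something to clip against.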


4.2 Memory / streaming model

We introduced a generic helper:

_mem_model(
    grid_hw=(H, W),
    r=sr_factor,
    ksize=psf_size,
    channels=C,
    mem_target_mb=mem_budget,
    prefer_tiles=False,
)

This picks:

  • batch_frames (how many frames to process per iter)

  • tiles (HxW) + halo size

When low_mem=True:

  • disable CPU→GPU prefetch

  • clamp LRU cache to 2

  • use bigger SEP tiles for varmaps

  • shrink star-mask canvas

👉 This is purely to bound peak memory on 16 GB Macs and lower-tier Windows laptops.
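For intuition, here is a toy version of how a memory model might pick batch_frames from the budget (illustrative only; the real _mem_model also chooses tile sizes and halos):

```python
def pick_batch_frames(h, w, channels, r, mem_target_mb, bytes_per_px=4,
                      copies=4, max_batch=16):
    """Estimate how many frames fit in the memory budget.

    'copies' is a fudge factor for the working buffers each frame needs
    (frame, prediction, residual, weights) on the SR grid; all constants
    here are assumptions for illustration.
    """
    sr_px = (h * r) * (w * r) * channels
    per_frame_mb = sr_px * bytes_per_px * copies / (1024 ** 2)
    return max(1, min(max_batch, int(mem_target_mb / max(per_frame_mb, 1e-9))))
```

The key property is graceful degradation: a smaller budget shrinks the batch (and ultimately forces tiling) rather than failing outright.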


4.3 MPS-safe grouped convolution

On Apple Silicon we can’t rely on the “1 big grouped conv with G=B*C” trick. So we added the guarded version:

  • CUDA / CPU: run one grouped conv

  • MPS: loop per-sample, per-channel → slower but correct → no random crashes

This is the part that fixes “it works on Windows (CUDA) but not on Mac.”
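A hedged sketch of the guard, assuming PyTorch's F.conv2d (the function name here is hypothetical): the fast path folds batch and channel into groups = B*C, the MPS path loops per plane, and both compute the same thing.

```python
import torch
import torch.nn.functional as F

def per_frame_conv(x, psfs):
    """Convolve each (sample, channel) plane with its own PSF.

    x:    (B, C, H, W) tensor; psfs: (B, C, k, k) tensor with odd k.
    One grouped conv with groups = B*C on CUDA/CPU; an explicit loop on
    MPS, where large group counts have been unreliable. The loop is
    slower but numerically identical.
    """
    B, C, H, W = x.shape
    k = psfs.shape[-1]
    pad = k // 2
    if x.device.type != "mps":
        # fold batch+channel into the channel axis, one kernel per group
        out = F.conv2d(x.reshape(1, B * C, H, W),
                       psfs.reshape(B * C, 1, k, k),
                       padding=pad, groups=B * C)
        return out.reshape(B, C, H, W)
    # MPS-safe fallback: per-sample, per-channel single convs
    out = torch.empty_like(x)
    for b in range(B):
        for c in range(C):
            out[b, c] = F.conv2d(x[b:b + 1, c:c + 1],
                                 psfs[b, c].view(1, 1, k, k),
                                 padding=pad)[0, 0]
    return out
```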


4.4 GPU → CPU fallback

If, during an iteration, we hit:

  • CUDA OOM that can’t be fixed by halving batch, or

  • CUDA “device busy / unavailable” type error,

then we:

  1. Switch this iteration to CPU,

  2. Recompute accumulators in NumPy,

  3. From that point on, keep running on CPU.

That way the user doesn’t lose a 30-minute run because Windows decided to hiccup the GPU.
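Schematically, the fallback wraps each iteration like this (hypothetical names; as noted above, the real code also retries with a halved batch before abandoning the GPU):

```python
def run_iteration_with_fallback(iter_fn, state, use_gpu=True):
    """Try the GPU path; on a 'device gone' style failure, redo the
    iteration on CPU and report that we should stay on CPU."""
    if not use_gpu:
        return iter_fn(state, device="cpu"), False
    try:
        return iter_fn(state, device="gpu"), True
    except RuntimeError as e:
        msg = str(e).lower()
        if "out of memory" in msg or "busy" in msg or "unavailable" in msg:
            # recompute this iteration's accumulators on CPU, keep CPU after
            return iter_fn(state, device="cpu"), False
        raise  # anything else is a real bug; don't mask it
```

The returned flag feeds the next iteration's use_gpu, so a single hiccup demotes the whole remaining run to CPU instead of killing it.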


4.5 SR-coherent loop

Old “quick” SR paths just upscaled at the end.
Current path:

  1. PSFs are lifted to SR

  2. Predictions are made on SR

  3. Residuals are formed on native

  4. Adjoint is done on SR

  5. Final image is already SR sized

So SR isn’t cosmetic — it’s in the actual deconvolution math.
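The SR grid hinges on a flux-preserving resampling pair; a minimal NumPy sketch of the convention:

```python
import numpy as np

def upsample_sum_preserving(x, r):
    """Replicate each pixel into an r x r block, dividing by r^2 so the
    total flux is unchanged."""
    return np.kron(x, np.ones((r, r))) / (r * r)

def downsample_sum(x, r):
    """Sum (not average) r x r blocks: the adjoint-consistent inverse."""
    h, w = x.shape
    return x.reshape(h // r, r, w // r, r).sum(axis=(1, 3))
```

Pairing a sum-preserving upsample with a summing downsample keeps the forward and adjoint operators consistent, which is what the multiplicative update's fixed point depends on.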


5. Math Sketch (Seti Astro wording)

Let:

  • ( x ) = current SR estimate (C×H×W)

  • ( y_i ) = i-th observed frame (C×h×w)

  • ( H_i ) = forward operator = (PSF conv on SR) + (downsample if SR>1)

  • ( H_i^\top ) = adjoint = (upsample if SR>1) + (PSFᵀ conv on SR)

  • ( w_i ) = robust weight map (includes mask & variance)

Then per iter we do:

  1. Forward:
    [
    \hat y_i = H_i x
    ]

  2. Residual:
    [
    r_i = y_i - \hat y_i
    ]

  3. Robust weight (Huber):
    [
    w_i = \frac{\psi(r_i)}{r_i} \cdot m_i \cdot \frac{1}{\sigma_i^2 + \varepsilon}
    ]
    where ( m_i ) is the mask (0…1) and ( \sigma_i^2 ) is the variance map.

  4. Backproject:
    [
    \text{num} \mathrel{+}= H_i^\top ( w_i \odot y_i )
    ]
    [
    \text{den} \mathrel{+}= H_i^\top ( w_i \odot \hat y_i )
    ]

  5. Multiplicative update:
    [
    x \leftarrow x \cdot \text{clip}\left(\frac{\text{num}}{\text{den} + \varepsilon}, \frac{1}{\kappa}, \kappa \right)
    ]

  6. Relax:
    [
    x \leftarrow (1-\alpha) x_{\text{old}} + \alpha x
    ]

This is the same structure as Sukurdeep et al., except:

  • we explicitly model masks,

  • we explicitly model per-frame variances,

  • we allow SR in both forward and adjoint.
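For the native-resolution (r = 1), single-channel case, one iteration of the update above might look like this in NumPy (a simplified illustration using circular FFT convolution and a full-frame centered PSF, not the SASpro code):

```python
import numpy as np
from numpy.fft import fft2, ifft2

def conv_fft(x, psf):
    """Circular convolution; psf is full-frame and centered (sketch only)."""
    return np.real(ifft2(fft2(x) * fft2(np.fft.ifftshift(psf))))

def mf_iteration(x, frames, psfs, delta, kappa=2.0, alpha=1.0, eps=1e-8):
    """One robust multiplicative update over all frames (r = 1, mono)."""
    num = np.zeros_like(x)
    den = np.zeros_like(x)
    for y, psf in zip(frames, psfs):
        pred = conv_fft(x, psf)                             # forward: H_i x
        r = y - pred                                        # residual
        a = np.abs(r)
        w = np.minimum(1.0, delta / np.maximum(a, 1e-12))   # Huber psi(r)/r
        psf_adj = psf[::-1, ::-1]                           # flipped PSF = adjoint
        num += conv_fft(w * y, psf_adj)                     # H_i^T (w ⊙ y)
        den += conv_fft(w * pred, psf_adj)                  # H_i^T (w ⊙ ŷ)
    upd = np.clip(num / (den + eps), 1.0 / kappa, kappa)    # multiplicative step
    return (1 - alpha) * x + alpha * x * upd                # relaxation
```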


6. File I/O specifics (FITS / XISF)

  • We support:

    • 2-D FITS → interpreted as mono

    • 3-D FITS (CHW or HWC) → we detect channel axis

    • XISF → we go through _xisf_cached_array(...) and slice tiles

  • Tile readers:

    • _read_tile_fits(...) – FITS-only, memmaped

    • _read_tile_fits_any(...) – FITS or XISF, figure out axis order, return HW / CHW / HWC

    • _infer_channels_from_tile(...) – reads a 1×1 tile to decide C=1 or 3

Why this is here: this is what makes the tiled median seed possible.
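The tile traversal these readers enable is straightforward; a sketch (edge tiles are clipped to the frame):

```python
def iter_tiles(h, w, tile=256):
    """Yield (y0, y1, x0, x1) covering an h x w frame in tile x tile blocks."""
    for y0 in range(0, h, tile):
        for x0 in range(0, w, tile):
            yield y0, min(y0 + tile, h), x0, min(x0 + tile, w)
```

Each bounds tuple is handed to a tile reader, so no frame is ever fully resident during the seed.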


7. Headers we write

When MFDeconv finishes, we write a FITS with at least:

  • MFDECONV = T

  • MF_COLOR = 'luma' | 'rgb' | ...

  • MF_RHO = 'huber' | 'l2'

  • MF_HDEL = <float> (Huber delta; <0 means “auto from RMS”)

  • MF_MASK = T/F (auto star masks used)

  • MF_VAR = T/F (auto variance maps used)

  • MF_SR = <int> (super-resolution factor, 1 = native)

  • MF_SRSIG = <float> (if SR>1)

  • MF_SRIT = <int> (if SR>1)

  • MF_ITMAX = <int> (requested iters)

  • MF_ITERS = <int> (actual iters used, post early-stop)

  • MF_ESTOP = T/F (early stop triggered)

This is so SAS can later show “how was this built?” in the UI.
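Assembling those cards before handing them to the FITS writer might look like this (a pure-Python sketch; opts and its keys are hypothetical stand-ins for the run's settings):

```python
def mfdeconv_header_cards(opts, iters_used, early_stopped):
    """Build the MFDeconv provenance cards listed above.

    SR-only cards are emitted only when sr_factor > 1, matching the
    header table; everything else is written unconditionally.
    """
    cards = {
        "MFDECONV": True,
        "MF_COLOR": opts["color_mode"],
        "MF_RHO": opts["rho"],
        "MF_HDEL": opts["huber_delta"],   # < 0 means "auto from RMS"
        "MF_MASK": opts["auto_masks"],
        "MF_VAR": opts["auto_varmaps"],
        "MF_SR": opts["sr_factor"],
        "MF_ITMAX": opts["max_iters"],
        "MF_ITERS": iters_used,
        "MF_ESTOP": early_stopped,
    }
    if opts["sr_factor"] > 1:
        cards["MF_SRSIG"] = opts["sr_sigma"]
        cards["MF_SRIT"] = opts["sr_iters"]
    return cards
```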


8. Where this lives in Seti Astro Suite

  • Core logic:

    • pro/mfdeconv.py

    • possibly split helpers in legacy/image_manager.py for FITS/XISF

  • Upstream producers (what feeds it):

    • pro/star_alignment.py → PSFs, transforms

    • pro/stacking_suite.py → calibrated + registered frames

    • pro/runtime_torch.py → uniform torch import on Windows/macOS/Linux

  • Downstream consumers (what reads the result):

    • main doc manager (pro/doc_manager.py)

    • image viewers (Qt) – show the MFDeconv FITS

    • future: drizzle / SR inspect


9. How it maps to the papers

| Paper / Doc | What we took | What we added |
| --- | --- | --- |
| Sukurdeep et al. (2025) | Per-frame PSF, robust residual weighting, adjoint back-projection | SR-coherent loop, per-frame variance maps, SAS FITS/XISF realities |
| Our MFDeconv Whitepaper | SAS-specific terms (light, PSF from alignment, SEP masks), streaming/tiled story | GPU→CPU fallback, MPS-safe grouped conv, early-stop heuristics |
| SAS Implementation | all of the above | I/O adapters, QProgress feedback, per-iter saves, QSettings-friendly options |

10. Changelog (short)

  • 2025-03 – integrated low-mem branch (tiled median, smaller SEP tiles, LRU clamp)

  • 2025-03 – added MPS-safe grouped conv (fixes mac)

  • 2025-03 – added GPU→CPU fallback (keeps run alive)

  • 2025-03 – unified SR path (PSF, forward, adjoint all on SR grid)

  • 2025-04 – documented in this wiki


11. Links

  • Whitepaper (Seti Astro):
    https://www.setiastro.com/docs/mfdeconv-whitepaper

  • Sukurdeep et al. 2025 (original):
    https://iopscience.iop.org/article/10.3847/1538-3881/adfb72/pdf

  • SAS / SASpro implementation:
    https://github.com/setiastro/setiastrosuitepro/tree/main/pro
    (look for mfdeconv, runtime_torch, and FITS/XISF helpers)


