# ELBDM Spectral Solver - hyschive/gamer-fork GitHub Wiki

# Spectral Solver for the Schrödinger Equation in GAMER

## Setup and Configuration

### Compilation Options

**Wave Scheme Selection**: To enable the spectral solver, set`WAVE_SCHEME`

to`WAVE_GRAMFE`

in the Makefile or configure with`--wave_scheme=WAVE_GRAMFE`

for the ELBDM model.**Spectral Scheme Options**: Choose between`GRAMFE_MATMUL`

(faster for`PATCH_SIZE=8`

) and`GRAMFE_FFT`

(faster for larger patch sizes) by setting`GRAMFE_SCHEME`

in the Makefile or configure with`--gramfe_scheme`

accordingly.**Additional Libraries**: The CPU version requires FFTW 3 for single precision and FFTW 2/3 for double precision. The GPU version of`GRAMFE_FFT`

needs the cuFFTDx library. Set`FFTW2_PATH`

,`FFTW3_PATH`

, and`CUFFTDX_PATH`

in the Makefile or configuration files`*.config`

.

### Precision and Performance Options

**Floating Point Precision**: Controlled by`GRAMFE_ENABLE_SINGLE_PRECISION`

. The default is double precision. Single precision is not recommended due to stability issues.**Extension Parameters**: Set via`GRAMFE_ORDER`

,`GRAMFE_NDELTA`

, and`GRAMFE_ND`

in`Macro.h`

. New extensions can be added in`GramExtensionTables.h`

.

## Advantages and Disadvantages

### Advantages

**Higher Accuracy**: Achieves greater accuracy than the inbuilt finite-difference scheme.**Efficient FFT Performance**: Optimized for factorizations into small prime numbers.

### Disadvantages

**Computational Cost**: Slightly larger computational cost compared to the finite-difference scheme.**GPU Solver Limitations**: GPU solver is slower on non-HPC GPUs for required double-precision operations.**No Mass Conservation**: Scheme is inherently not conservative. (Mass can move from physical into extended domain.)**Stability only Empirical**: Requires filter because of spectral blocking.

## Test problems

- Observe differences in accuracy for all inbuilt test problems that output L1 error
- Check test problem
`RestrictionMismatch`

to see how spectral solver can reduce artifacts- It features a traveling Gaussian wave refined around the center.
- The following plots generated with the default visualisation scripts of the test problem highlight the mismatch between the wave function on level 0 and level 1 before and after time evolution.
- Using the finite-difference scheme on all levels, there is a large mismatch between the evolution of the wave function on different levels.
- Using the base-level spectral method together with the finite-difference scheme on level 1 reduces the mismatch slightly.
- Using the base-level spectral method together with the local spectral scheme on level 1 significantly reduces artifacts.

## Implementation Details

### Overview

- The spectral solver is based on the Gram-Fourier extension algorithm, offering a local spectral approach to the Schrödinger equation.

### Gram-Fourier extension algorithm

- Given a non-periodic function on $[0, L]$, how to find a periodic extension on $[0, L + d]$?
- Optimise linear, under-determined least squares problem via SVD $A = U \Sigma V^T$.
- This approach gives excellent periodic extension up to arbitrary precision with the right choice of hyperparameters. (Figure taken from Mark Lyon's PhD thesis, DOI: 10.7907/3FFW-GK56)
**Drawback**: Cost for SVD of $n\times m$ matrix $\mathcal{O}(\min(m, n)^2 \cdot \max(m, n))$.- How to make SVD extensions efficient?
- Key idea: Use
**linearity**of optimisation problem. - Expand a non-periodic function in terms of a suitable basis at the left and right boundaries.
- Compute SVD extensions (orange) of basis functions (blue) (left: constant polynomials, center: linear polynomial, right: quadratic polynomial).
- Express periodic extension as a suitable linear combination of SVD basis functions at the left and right boundaries. (Figures taken from Mark Lyon's PhD thesis, DOI: 10.7907/3FFW-GK56)

### Steps of the Algorithm

**Boundary Polynomial Expansion**: Expand real and imaginary parts at boundaries (not the same as GAMER's ghost boundaries).**Periodic Extension**: Continue polynomials using precomputed periodic extensions. This extension is unphysical and doesn't use neighboring patch information.**FFT Evolution**: Use FFT on the extended domain to evolve the wave function, which becomes periodic on this domain.**Avoidance of Gibbs Phenomenon**: Smooth extensions largely prevent the Gibbs phenomenon.**Discarding Extended Domain**: After computation, the extended domain is discarded.

## Differences between GRAMFE_MATMUL and GRAMFE_FFT Schemes

### Overview

Understanding the differences between the `GRAMFE_MATMUL`

and `GRAMFE_FFT`

schemes in GAMER is crucial for users to choose the appropriate method for their specific simulation needs. Both schemes are based on the Gram-Fourier extension algorithm but differ in their computational approach and performance characteristics.

### GRAMFE_MATMUL Scheme

#### Description

**Matrix Multiplication Approach**: In the`GRAMFE_MATMUL`

scheme, the entire operation sequence - GramFE extension, FFT, time evolution, IFFT, and discarding of extension data - is precomputed and represented as a single matrix (which is possible because of linearity).**Application to Input Vectors**: This precomputed matrix can be directly applied to input vectors, eliminating the need to compute each operation at runtime.

#### Advantages

**Reduced Runtime Computation**: By precomputing the entire operation sequence, the`GRAMFE_MATMUL`

scheme minimizes runtime computational overhead.**Efficiency for Small Patch Sizes**: This scheme is particularly efficient for smaller data sizes where the cost of matrix multiplication is lower.

#### Disadvantages

**Memory Usage**: The precomputed matrix, especially for large patch sizes, can consume significant memory resources.**Precomputation Time**: There is an upfront cost in time to compute and store the matrix, which might be substantial depending on the accuracy used.

### GRAMFE_FFT Scheme

#### Description

**Runtime Computation**: The`GRAMFE_FFT`

scheme performs the GramFE extension, FFT, time evolution, IFFT, and discarding of extension data operations at runtime.

#### Advantages

**Scalability for Large Patch Sizes**: This scheme is more scalable for larger patch sizes, as it does not require storing large precomputed matrices.**Better Round-off Errors**: Slightly better mass conservation properties than matrix multiplication scheme. This gap may be overcome by using higher precision for the precomputation of the time evolution matrix in the matrix multiplication scheme.

#### Disadvantages

**Increased Runtime Operations**: Each step of the operation sequence is computed at runtime, which can increase the computational overhead, especially for larger simulations.**Dependence on FFT Performance**: The efficiency of the scheme is closely tied to the performance of the FFT algorithm used.

### Conclusion

**Choice of Scheme**: By default`GRAMFE_MATMUL`

should be used.`GRAMFE_FFT`

should be considered for larger patch sizes and debugging.

## Performance Metrics

- The performance ratio
`Perf_FD / Perf_Spectral`

indicates the efficiency compared to the finite-difference scheme. This varies based on patch size (e.g., 4.3 for`PATCH_SIZE=8`

and 2.6 for`PATCH_SIZE=16`

for`GRAMFE_FFT`

; comparable performance of`GRAMFE_MATMUL`

and`WAVE_FD`

for`PATCH_SIZE=8`

).

## Issues and Considerations

**Stability with Single Precision**: Using single precision can lead to instability in the scheme.**Patch Size Impact**: The choice of patch size directly affects the performance and efficiency of the FFT operations.**Not conservative**: Scheme is not conservative and there are no fluxes for the flux-fixup operation.

## Possible improvements

**Stability and Accuracy**: Fine-tune the parameters of the GramFE extension, the boundary size as well as the filter parameters to achieve a good compromise between stability, accuracy and the size of the ghost zone.**Mass Conservation**: Solve continuity equation (possibly using Gram-FE and DFT) to ensure mass conservation and compute estimate for fluxes.**Density & Phase**: Compute extension for density & phase or another set of more suitable variables (e.g. density * sin(phase/N) and density * cos(phase/N) with N>1 to avoid phase discontinuity at vortices) for higher accuracy.

### High Precision Requirements for Evolution Matrix Calculation

#### Current Precision and Limitations

The calculation of the evolution matrix in the `GRAMFE_MATMUL`

scheme currently requires at least 128-bit precision to maintain an error below single precision. The extension matrix has a high condition number, typically around 10^15 or higher, which necessitates this high precision.

#### Proposed Enhancements

**Upgrade to 256-bit Type**: Ideally, the`gramfe_evo_float`

data type should be upgraded to a 256-bit type in future iterations. This would significantly improve the accuracy of the evolution matrix calculation.**Impact on Extension Tables**: Correspondingly, the`GramFE_ExtensionTables`

would need to be updated to include more significant digits to leverage the increased precision effectively.**Current Usage of Long Double Type**: Some routines currently use the`long double`

type instead of the`quadmath`

version. This choice is based on the requirement that these routines do not need more than double precision. In light of the proposed shift to a 256-bit type for`gramfe_evo_float`

, a review of the data types used in these routines may be warranted to ensure consistency and optimal performance across the scheme.