esdc_toolz - jejjohnson/ml4eo GitHub Wiki
Nomenclature
Recipe -
Pipelines - a sequence
Benchmark - A fixed
Steps
How to effectively use the xarray.open_mfdataset function with a custom preprocess function.
- Test Function on one file
- Fix Coordinates (Space, Time, Sorted)
- Fix Units
- Do Reductions
- Make Preprocessing Function
- Apply Preprocessing Function to multiple files
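The steps above can be sketched as follows. This is a minimal sketch, not a definitive recipe: the variable name t2m, the raw coordinate names latitude/longitude, the Kelvin units, and the make_file helper are all assumptions made for illustration, and the synthetic datasets stand in for real files so the example runs without any data on disk.

```python
import numpy as np
import pandas as pd
import xarray as xr


def preprocess(ds: xr.Dataset) -> xr.Dataset:
    """Clean one file's dataset before it is combined with the others."""
    # Fix coordinates: standardize names (assumed raw names: latitude/longitude).
    names = {"latitude": "lat", "longitude": "lon"}
    ds = ds.rename({k: v for k, v in names.items() if k in ds.coords})
    # Wrap longitudes from [0, 360) to [-180, 180), then sort every axis.
    ds = ds.assign_coords(lon=((ds.lon + 180) % 360) - 180)
    ds = ds.sortby(["lat", "lon", "time"])
    # Fix units: Kelvin -> degrees Celsius (assumed variable name "t2m").
    if ds["t2m"].attrs.get("units") == "K":
        attrs = {**ds["t2m"].attrs, "units": "degC"}
        ds["t2m"] = (ds["t2m"] - 273.15).assign_attrs(attrs)
    # Do reductions: a daily mean shrinks each file before combining.
    return ds.resample(time="1D").mean(keep_attrs=True)


def make_file(start: str) -> xr.Dataset:
    """Hypothetical stand-in for one raw file: two days of 12-hourly data."""
    time = pd.date_range(start, periods=4, freq="12h")
    data = np.full((4, 3, 4), 280.0)
    return xr.Dataset(
        {"t2m": (("time", "latitude", "longitude"), data, {"units": "K"})},
        coords={
            "time": time,
            "latitude": [10.0, 0.0, -10.0],
            "longitude": [0.0, 90.0, 180.0, 270.0],
        },
    )


files = [make_file("2020-01-01"), make_file("2020-01-03")]

# Test the function on one file first.
single = preprocess(files[0])

# Apply the same function to multiple files and combine along time.
combined = xr.concat([preprocess(ds) for ds in files], dim="time")

# With real files the same function is passed straight to open_mfdataset
# (the glob is a hypothetical path; this call also needs dask installed):
# ds = xr.open_mfdataset("data/*.nc", preprocess=preprocess, combine="by_coords")
```

Keeping the reduction inside preprocess means each file is shrunk before the combine step, which keeps the combined dataset small.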
Core Operations
These are mainly native xarray functions that already exist and cover the basic operations.
- Validate Coordinates - Lat, Lon, Time (Names, Attributes, Ranges, Bounds)
- Selection/Subset/Slice - Region, Period
- Coordinate Reference System
- Resample - Frequency
- Coarsen Reductions - Spatial Scale, construct, reduce
- Rolling Transformations - construct, reduce
- Groupby Reductions
- Weighted Reductions
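Most of the operations above map onto a single xarray call each. The sketch below runs them on a synthetic field; the grid, the variable name sst, and the window sizes are assumptions chosen only to keep the example small. Coordinate reference system handling is left out because it depends on extra packages.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily field on a regular 1-degree grid (all values are made up).
rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.normal(size=(60, 10, 12)),
    coords={
        "time": pd.date_range("2020-01-01", periods=60, freq="D"),
        "lat": np.linspace(-4.5, 4.5, 10),
        "lon": np.linspace(0.5, 11.5, 12),
    },
    dims=("time", "lat", "lon"),
    name="sst",
)

# Validate coordinates: names, ranges, and ordering.
coords_ok = bool(
    {"time", "lat", "lon"} <= set(da.dims)
    and float(da.lat.min()) >= -90
    and float(da.lat.max()) <= 90
    and da.indexes["time"].is_monotonic_increasing
)

# Selection / subset / slice: a region and a period.
subset = da.sel(
    lat=slice(-2, 2), lon=slice(0, 6), time=slice("2020-01-10", "2020-01-31")
)

# Resample: change the temporal frequency (daily -> 7-day means).
weekly = da.resample(time="7D").mean()

# Coarsen reductions: 2x2 block means halve the spatial resolution.
coarse = da.coarsen(lat=2, lon=2).mean()
# construct exposes the blocks as new dimensions instead of reducing them.
blocks = da.coarsen(lat=2, lon=2).construct(
    lat=("lat_coarse", "lat_fine"), lon=("lon_coarse", "lon_fine")
)

# Rolling transformations: 5-day centered moving average.
smooth = da.rolling(time=5, center=True).mean()

# Groupby reductions: mean per calendar month.
monthly = da.groupby("time.month").mean()

# Weighted reductions: area-weighted mean using cos(latitude) weights.
weights = np.cos(np.deg2rad(da.lat))
area_mean = da.weighted(weights).mean(dim=("lat", "lon"))
```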
Higher Level Tasks
- Calculate Physical Quantities - Radiance, Reflectance, Kinetic Energy
- Discretization - Histogram (Counts, Max, Mean)
- Climatology - Frequency
- Anomalies - Filtering + Climatology
- Coordinate Encoders - Time, Space, Wavelength
- Reprojection
- Interpolate - Unstructured, Curvilinear, Rectilinear, Regular, Target (Regrid), Lower Res (Resample/Coarsen)
- Interpolate NaNs - Astrophysics (Conv + NaNs), pyinterp (LOESS, Gauss-Seidel), SciPy (Unstructured, Rectilinear)
- Filtering - Channel, Space, Time
- Masking - RegionMask
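A few of the tasks above (climatology, anomalies, discretization, a time encoder, and masking) can be built from the core operations alone. The following is a rough sketch on synthetic monthly data. The monthly climatology frequency, the sine/cosine time encoding, and the rectangular mask (a simple stand-in for a RegionMask mask) are assumptions for illustration, not the only reasonable choices.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two years of synthetic monthly data: a seasonal cycle plus a small trend.
time = pd.date_range("2019-01-01", periods=24, freq="MS")
seasonal = 10.0 * np.sin(2 * np.pi * (time.month.values - 1) / 12)
trend = np.linspace(0.0, 1.0, 24)
da = xr.DataArray(
    (seasonal + trend)[:, None, None] * np.ones((24, 2, 3)),
    coords={"time": time, "lat": [0.0, 1.0], "lon": [0.0, 1.0, 2.0]},
    dims=("time", "lat", "lon"),
)

# Climatology: mean for each calendar month over all years.
climatology = da.groupby("time.month").mean("time")

# Anomalies: subtract the matching monthly climatology from each time step.
anomalies = da.groupby("time.month") - climatology

# Discretization: histogram counts of the anomaly values.
counts, edges = np.histogram(anomalies.values.ravel(), bins=10)

# Coordinate encoder (time): map month-of-year onto the unit circle so that
# December and January end up next to each other.
angle = 2 * np.pi * (da["time"].dt.month - 1) / 12
encoded = da.assign_coords(month_sin=np.sin(angle), month_cos=np.cos(angle))

# Masking: keep a rectangular region and set everything else to NaN.
mask = (anomalies.lon >= 1.0) & (anomalies.lat <= 0.0)
masked = anomalies.where(mask)
```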
Machine Learning Pre-Processing
- Running Standardization - Channel, Space, Time
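One possible reading of running standardization is to standardize each value with the mean and standard deviation of a trailing window along one dimension. The sketch below takes that interpretation along time, with a global per-channel standardization for comparison. The function name running_standardize, the 30-step window, and the eps guard are assumptions.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic (time, channel) data with a non-zero mean and non-unit variance.
rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.normal(loc=5.0, scale=2.0, size=(100, 4)),
    coords={
        "time": pd.date_range("2020-01-01", periods=100, freq="D"),
        "channel": ["b1", "b2", "b3", "b4"],
    },
    dims=("time", "channel"),
)


def running_standardize(da, dim="time", window=30, eps=1e-8):
    """Standardize with statistics from a trailing window along `dim`."""
    # min_periods=2: positions with fewer than two samples become NaN.
    roll = da.rolling({dim: window}, min_periods=2)
    return (da - roll.mean()) / (roll.std() + eps)


# Running (windowed) standardization along time, per channel.
z_running = running_standardize(da, dim="time", window=30)

# Global standardization over the whole time axis, per channel.
z_global = (da - da.mean("time")) / da.std("time")
```

The same function can be pointed at a spatial dimension by changing dim, provided the array has that dimension.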
Statistics
- Power Spectrum Stats
- Pixel-Based Stats
- MultiScale Pixel-Based Stats
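As a rough illustration of the three kinds of statistics, the sketch below compares a synthetic prediction against a synthetic truth field. RMSE as the pixel-based metric, block averaging as the change of scale, and a 1D spectrum along longitude are assumptions. The helper names rmse and power_spectrum are hypothetical.

```python
import numpy as np
import xarray as xr

# Synthetic truth field and a noisy prediction of it.
rng = np.random.default_rng(0)
truth = xr.DataArray(
    rng.normal(size=(16, 32)),
    coords={"lat": np.arange(16), "lon": np.arange(32)},
    dims=("lat", "lon"),
)
pred = truth + 0.1 * rng.normal(size=(16, 32))


def rmse(a, b):
    """Root-mean-square error over all pixels."""
    return float(np.sqrt(((a - b) ** 2).mean()))


def power_spectrum(da):
    """1D power spectrum along lon, averaged over lat (expects lat/lon dims)."""
    values = da.transpose("lat", "lon").values
    spec = np.abs(np.fft.rfft(values, axis=1)) ** 2 / values.shape[1]
    return spec.mean(axis=0)


# Pixel-based stats: one number computed at the native resolution.
pixel_rmse = rmse(pred, truth)

# Multi-scale pixel-based stats: the same metric after block averaging.
multiscale_rmse = {
    factor: rmse(
        pred.coarsen(lat=factor, lon=factor).mean(),
        truth.coarsen(lat=factor, lon=factor).mean(),
    )
    for factor in (2, 4)
}

# Power spectrum stats: compare the spectrum of the error with the truth.
freqs = np.fft.rfftfreq(truth.sizes["lon"], d=1.0)
psd_truth = power_spectrum(truth)
psd_error = power_spectrum(pred - truth)
```

Because the synthetic error here is uncorrelated noise, averaging over larger blocks lowers the RMSE. With spatially correlated errors the multi-scale curve would fall off more slowly.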