Getting Started: Training and Forecasting with PARADIS
Welcome! This guide walks you through setting up your environment, training the model, and generating forecasts using PARADIS.
Pre-requisites
The required Python packages can be installed with pip install -r requirements.txt
Obtaining a compatible dataset
Download the original dataset from WeatherBench 2:
cd scripts
bash download_dataset.sh OUTPUT_DIR
where OUTPUT_DIR is the destination directory. Then, from the repository root, preprocess the downloaded data:
python scripts/preprocess_weatherbench_data.py -i /path/to/ERA5/5.625deg_wb2 -o /path/to/ERA5/5.625deg
Running PARADIS
For training and forecasting, the configuration file, located in the config/ directory, provides the default list of hyperparameters and options. A short description of each parameter is provided in that file.
Training
An example script that launches a low-resolution training run:
# Define the dataset path
root_dir=PATH/TO/DATASET
python train.py \
dataset.root_dir="${root_dir}" \
dataset.n_time_inputs=2 \
compute.batch_size=32 \
compute.use_amp=True \
compute.num_devices=1 \
compute.num_workers=10 \
training.log_every_n_steps=10 \
training.print_losses=False \
training.max_epochs=30 \
training.dataset.start_date=2010-01-01 \
training.dataset.end_date=2015-12-31 \
training.validation_dataset.start_date=2020-01-01 \
training.validation_dataset.end_date=2020-12-31 \
training.optimizer.lr=3e-3 \
training.scheduler.wsd.warmup=0.1 \
training.scheduler.wsd.decay=0.2 \
training.loss_function.type=reversed_huber \
normalization.standard=true
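Two of the overrides above name techniques that may be unfamiliar: a warmup-stable-decay (WSD) learning-rate schedule and a reversed Huber (berHu) loss. The sketch below illustrates both in plain Python. It is not the PARADIS implementation: it assumes that warmup and decay are fractions of the total number of training steps, that both ramps are linear, and that the loss uses a fixed threshold c. The function names wsd_lr and reversed_huber, and every parameter other than the values 3e-3, 0.1, and 0.2 taken from the command, are hypothetical.

```python
def wsd_lr(step, total_steps, peak_lr=3e-3, warmup=0.1, decay=0.2, min_lr=0.0):
    """Learning rate at `step` (0-based) for a warmup-stable-decay schedule.

    Assumption: `warmup` and `decay` are fractions of `total_steps`,
    and both the warmup and the decay are linear ramps.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    warmup_steps = int(round(warmup * total_steps))
    decay_steps = int(round(decay * total_steps))
    decay_start = total_steps - decay_steps

    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        # Stable phase: hold the peak learning rate.
        return peak_lr
    # Linear decay from peak_lr towards min_lr.
    frac = (step - decay_start) / max(decay_steps, 1)
    return peak_lr + (min_lr - peak_lr) * frac


def reversed_huber(error, c=1.0):
    """Reversed Huber (berHu) penalty for a single residual.

    Linear (L1) for |error| <= c and quadratic beyond it, the mirror image
    of the standard Huber loss. The fixed threshold `c` is an assumption.
    """
    a = abs(error)
    if a <= c:
        return a
    return (a * a + c * c) / (2.0 * c)


if __name__ == "__main__":
    total = 100  # hypothetical number of optimizer steps
    for s in (0, 9, 50, 80, 90, 99):
        print(f"step {s:3d}: lr = {wsd_lr(s, total):.2e}")
    for e in (0.5, 1.0, 3.0):
        print(f"residual {e}: berHu = {reversed_huber(e)}")
```

The config file in config/ remains the reference for what training.scheduler.wsd.* and training.loss_function.type actually control.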
For faster training at low resolutions, you may set the options training.dataset.preload=True and training.validation_dataset.preload=True to keep the dataset in CPU memory and avoid frequent disk reads.
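Before enabling preloading, it can help to estimate whether the selected date range fits in RAM. The following back-of-the-envelope sketch is purely illustrative: the helper preload_size_gib is hypothetical, and the defaults (a 32 x 64 grid for 5.625 degrees, 6-hourly samples, 100 channels, 4-byte floats) are assumptions that should be replaced with the values of your own preprocessed dataset.

```python
from datetime import date


def preload_size_gib(start, end, n_lat=32, n_lon=64, n_channels=100,
                     steps_per_day=4, bytes_per_value=4):
    """Rough in-memory size, in GiB, of a dataset spanning [start, end].

    All defaults are assumptions for illustration, not values read
    from the PARADIS configuration.
    """
    n_days = (date.fromisoformat(end) - date.fromisoformat(start)).days + 1
    n_values = n_days * steps_per_day * n_channels * n_lat * n_lon
    return n_values * bytes_per_value / 2**30


if __name__ == "__main__":
    # Date range of the training example above.
    size = preload_size_gib("2010-01-01", "2015-12-31")
    print(f"Estimated training set size: {size:.1f} GiB")
```

The estimate ignores any copies made by data-loader worker processes, so the real footprint may be larger.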
Forecasting
The following script generates a forecast for the year 2020 with the model trained above. Results are stored at results/forecast.zarr. A checkpoint from the training section is required.
root_dir=PATH/TO/DATASET
checkpoint_path=PATH/TO/CHECKPOINT
python forecast.py \
dataset.root_dir="${root_dir}" \
dataset.n_time_inputs=2 \
model.forecast_steps=1 \
compute.use_amp=True \
compute.num_devices=1 \
compute.num_workers=5 \
forecast.enable=true \
forecast.start_date=2020-01-01T00:00:00 \
forecast.end_date=2020-12-31T00:00:00 \
forecast.output_file='results/forecast.zarr' \
training.dataset.start_date=2020-01-01 \
training.dataset.end_date=2021-01-02 \
init.checkpoint_path="${checkpoint_path}"
Visualizing Results & Post-Processing
A notebook with detailed steps on computing the root-mean-square error (RMSE) is available in the scripts directory.
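For readers who want a feel for the metric before opening the notebook, here is a minimal NumPy sketch of RMSE on gridded fields. It is not the notebook's code: the function name rmse is hypothetical, the optional cosine-of-latitude weighting is an assumption borrowed from common practice for global grids, and the data are synthetic rather than read from results/forecast.zarr.

```python
import numpy as np


def rmse(forecast, truth, lat_deg=None):
    """Root-mean-square error between two arrays shaped (..., lat, lon).

    If `lat_deg` (latitudes in degrees, one per grid row) is given, each
    row is weighted by cos(latitude), normalized so the weights average
    to 1. This weighting is an assumption, not something the source
    specifies.
    """
    forecast = np.asarray(forecast, dtype=float)
    truth = np.asarray(truth, dtype=float)
    sq_err = (forecast - truth) ** 2
    if lat_deg is None:
        return float(np.sqrt(sq_err.mean()))
    w = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    w = w / w.mean()
    # w[:, None] broadcasts the per-latitude weight across longitudes.
    return float(np.sqrt((sq_err * w[:, None]).mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 32 x 64 fields; the latitude values are assumed.
    lats = np.linspace(-87.1875, 87.1875, 32)
    truth = rng.normal(size=(32, 64))
    forecast = truth + 0.1 * rng.normal(size=(32, 64))
    print("unweighted RMSE:", rmse(forecast, truth))
    print("latitude-weighted RMSE:", rmse(forecast, truth, lats))
```

On a regular latitude-longitude grid, cells near the poles cover less area than cells near the equator, which is why a weighted variant is often preferred for global scores.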