TROPO CalVal Acceptance Testing Instructions - nasa/opera-sds-pge GitHub Wiki
This page contains instructions for performing Acceptance Testing for the TROPO CalVal delivery from the OPERA-ADT team. These instructions pertain to the latest version of the CalVal release, currently SAS v0.2. These instructions assume the user has access to the JPL FN-Artifactory and has Docker installed on their local machine.
The image is currently hosted on JPL FN-Artifactory, which requires JPL VPN access and JPL credentials. You may also need to be added to the gov.nasa.jpl.opera.adt organization.
Once you have access, the container tarball delivery is available under general/gov/nasa/jpl/opera/adt/tropo/r2/calval/dockerimg_tropo_calval_0.2.tar.
The "golden dataset" is available under general/gov/nasa/jpl/opera/adt/tropo/r2/calval/delivery_data_tropo_calval.tar.
Optional documentation is available under general/gov/nasa/jpl/opera/adt/tropo/r2/calval/documents/.
Download the container tarball and sample data files to a location on your local machine. This location is referred to throughout these instructions as ${TROPO_DIR}.
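Before continuing, it can help to confirm that both tarballs named above actually landed in ${TROPO_DIR}. The following is a minimal sketch; the function name check_tropo_downloads is a hypothetical helper and is not part of the delivery.

```shell
# Hypothetical helper (not part of the delivery): verify that both
# delivery tarballs are present in the given directory.
check_tropo_downloads() {
  dl_dir="${1:?usage: check_tropo_downloads <TROPO_DIR>}"
  rc=0
  for f in dockerimg_tropo_calval_0.2.tar delivery_data_tropo_calval.tar; do
    if [ -f "$dl_dir/$f" ]; then
      echo "found:   $dl_dir/$f"
    else
      echo "missing: $dl_dir/$f" >&2
      rc=1
    fi
  done
  # Non-zero return means at least one tarball is missing.
  return "$rc"
}
```

Usage: check_tropo_downloads "${TROPO_DIR}"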
The first step in running the TROPO image is to load it into Docker via the following command:
docker load -i ${TROPO_DIR}/dockerimg_tropo_calval_0.2.tar
This should add the Docker image to your local repository with the name cae-artifactory-fn.jpl.nasa.gov:16001/gov/nasa/jpl/opera/adt/opera/tropo and the tag calval_0.2. Retag the image as opera/tropo:calval_0.2 so that later docker commands are shorter.
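The retag can be done with docker tag, which gives the loaded image an additional name without copying it. A minimal sketch follows, using the source and target names stated above; the wrapper name retag_tropo_image is an assumption for illustration only.

```shell
# Image name and tag as reported after `docker load`, per the text above.
SRC_IMAGE="cae-artifactory-fn.jpl.nasa.gov:16001/gov/nasa/jpl/opera/adt/opera/tropo:calval_0.2"
# Shorter alias used by the docker commands on the rest of this page.
DST_IMAGE="opera/tropo:calval_0.2"

# Hypothetical wrapper: add the short name as a second tag on the image.
retag_tropo_image() {
  docker tag "$SRC_IMAGE" "$DST_IMAGE"
}
```

Run retag_tropo_image once after docker load. Afterwards, docker images opera/tropo should list the calval_0.2 tag.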
Once the delivery_data_tropo_calval.tar file is downloaded to your local machine, unpack it to ${TROPO_DIR}/delivery_data_tropo:
cd ${TROPO_DIR}
tar xvf delivery_data_tropo_calval.tar
This will create the following directories within ${TROPO_DIR}/delivery_data_tropo:
- input_data/
- golden_output/
- output/
- configs/
where input_data contains sample input data, golden_output contains the reference output product used for validation, output acts as a scratch directory and will ultimately contain the SAS-produced product, and configs contains the runconfig.yaml.
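The layout described above can be checked with a short script before running the SAS. This is a sketch under the assumption that the tarball unpacked as described; check_tropo_layout is a hypothetical helper name.

```shell
# Hypothetical helper: confirm the four directories described above exist
# under the unpacked delivery directory, along with the runconfig.
check_tropo_layout() {
  base="${1:?usage: check_tropo_layout <unpacked delivery dir>}"
  rc=0
  for d in input_data golden_output output configs; do
    [ -d "$base/$d" ] || { echo "missing directory: $base/$d" >&2; rc=1; }
  done
  [ -f "$base/configs/runconfig.yaml" ] || {
    echo "missing file: $base/configs/runconfig.yaml" >&2
    rc=1
  }
  return "$rc"
}
```

Usage: check_tropo_layout "${TROPO_DIR}/delivery_data_tropo"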
The runconfig.yaml contains two types of configurable parameters that change the workflow behavior without changing the product result: worker_settings and output_options.
Configurable worker_settings pertain to dask settings:
- n_workers: the number of processes spawned by Python for dask parallel processing of the input data. When the SAS is run on a larger machine with more CPUs, a larger value of n_workers will result in a faster runtime.
- threads_per_worker: the number of threads that each dask worker uses during the SAS run.
- max_memory: the maximum amount of memory (in GiB) allocated to each dask worker. A larger value of block_shape or n_workers can result in more memory usage by the SAS.
- block_shape: the chunk size of each block processed in parallel with dask, specified as a tuple of chunk sizes along rows and columns. The chunk size can affect SAS performance.
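Because max_memory applies to each worker, a rough upper bound on memory available to the dask workers is n_workers multiplied by max_memory. The sketch below only does that arithmetic. The function name and the example values are illustrative assumptions, not values from the delivered runconfig.yaml, and actual SAS memory use may differ.

```shell
# Rough upper bound on dask worker memory in GiB: n_workers * max_memory.
# Hypothetical helper; pass in whatever values your runconfig uses.
worker_memory_budget() {
  awk -v n="$1" -v m="$2" 'BEGIN { printf "%.1f\n", n * m }'
}

# Example with made-up values: 4 workers capped at 8 GiB each.
worker_memory_budget 4 8   # prints 32.0
```

Comparing this figure with the RAM of the host is a quick way to spot a configuration that is likely to oversubscribe memory.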
Configurable output_options pertain to the output product itself:
- date_fmt: the date format used in the output filename.
- output_heights: an array of height levels that determines the number of heights in the output product. Reducing the number of heights in the array can lower peak RAM consumption.
- compression_kwargs: compression parameters, given as a dict:
  - compression_flag: turns compression on or off
  - zlib: compression algorithm
  - complevel: compression level for zlib, ranging from 0 (no compression) to 9 (maximum compression)
  - shuffle: HDF5 shuffle filter, which de-interlaces a block of data before zlib compression by reordering the bytes
NOTE: no config changes were required in order to successfully run on opera-dev-pge.
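To see which values the delivered runconfig.yaml actually sets for these parameters, the file can be searched for the parameter names listed above. This is a sketch that assumes each parameter appears on its own line in the usual YAML "key: value" form; show_tunables is a hypothetical helper name.

```shell
# Hypothetical helper: print, with line numbers, every line of a runconfig
# that sets one of the tunable parameters named in the lists above.
show_tunables() {
  grep -n -E '^[[:space:]]*(n_workers|threads_per_worker|max_memory|block_shape|date_fmt|output_heights|compression_kwargs|compression_flag|zlib|complevel|shuffle)[[:space:]]*:' "$1"
}
```

Usage: show_tunables configs/runconfig.yaml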
The SAS requires read/write permissions for the ${TROPO_DIR} directory along with all subdirectories contained within:
cd ${TROPO_DIR}
chmod -R 777 * .
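In order to execute the SAS, the ${TROPO_DIR}/delivery_data_tropo directory will be mounted into the container instance as a Docker volume.
Change into the ${TROPO_DIR}/delivery_data_tropo directory, so that configs/, input_data/, golden_output/, and output/ sit directly under the current directory:
cd ${TROPO_DIR}/delivery_data_tropo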
We're now ready to execute the TROPO CalVal delivery. Run the following command to kick off execution with the test assets and copy the output products to the mounted volume:
NOTE: Docker requires absolute paths for volume arguments, so assuming you are running from within the directory to be mounted, $(pwd) can be used as a shortcut.
docker run --rm -u $(id -u):$(id -g) \
-v $(pwd):/home/ops \
opera/tropo:calval_0.2 \
opera_tropo run configs/runconfig.yaml
You should see the following almost instantaneously:
mkdir -p failed for path /.config/matplotlib: [Errno 13] Permission denied: '/.config'
Matplotlib created a temporary cache directory at /tmp/matplotlib-y_a34pd5 because there was an issue with the default path (/.config/matplotlib); it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
╭─Herbie─────────────────────────────────────────────╮
│ WARNING: Unable to create config file              │
│ /.config/herbie/config.toml                        │
│ Herbie will use standard default settings.         │
│ Consider setting env variable HERBIE_CONFIG_PATH.  │
╰────────────────────────────────────────────────────╯
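Both warnings name an environment variable that can be set to avoid them, and the run proceeds with or without them. If you prefer a quieter start, one option is to pass those variables with docker run -e. This is only a sketch: the /tmp paths are assumptions about writable locations inside the container, the wrapper name is hypothetical, and whether HERBIE_CONFIG_PATH expects a directory in this image has not been verified here.

```shell
# Hypothetical wrapper around the run command above, adding two -e flags.
# /tmp/matplotlib and /tmp/herbie are assumed to be writable inside the
# container; adjust them if your environment differs.
run_tropo_quiet() {
  docker run --rm -u "$(id -u):$(id -g)" \
    -v "$(pwd)":/home/ops \
    -e MPLCONFIGDIR=/tmp/matplotlib \
    -e HERBIE_CONFIG_PATH=/tmp/herbie \
    opera/tropo:calval_0.2 \
    opera_tropo run configs/runconfig.yaml
}
```

Call run_tropo_quiet from the directory that is mounted into the container, exactly as for the plain docker run command.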
The container will output progress messages as it runs:
[INFO|run|L83] 2025-04-08T20:43:37+0000: Calculating TROPO delay
[INFO|checks|L47] 2025-04-08T20:43:39+0000: Performing checkup of input file
[INFO|run|L167] 2025-04-08T20:44:28+0000: Estimating ZTD delay for 20240215T12.
Culminating in something like the following:
[INFO|main|L77] 2025-04-08T20:59:57+0000: Output file: output/OPERA_L4_TROPO-ZENITH_20240215T120000Z_20250324T211825Z_HRES_v0.2.nc
[INFO|browse_image|L82] 2025-04-08T21:00:16+0000: Creating browse image of ZTD at 800m altitude.
[INFO|main|L81] 2025-04-08T21:00:18+0000: Output browse image: output/OPERA_L4_TROPO-ZENITH_20240215T120000Z_20250324T211825Z_HRES_v0.2.png
[INFO|main|L83] 2025-04-08T21:00:18+0000: Product type: OPERA_TROPO
[INFO|main|L84] 2025-04-08T21:00:18+0000: Product version: 0.2
[INFO|main|L86] 2025-04-08T21:00:18+0000: Maximum memory usage: 4.44 GB
[INFO|main|L87] 2025-04-08T21:00:18+0000: RAIDER version: 0.5.3
[INFO|main|L88] 2025-04-08T21:00:18+0000: Current running opera_tropo version: 0.2.post1.dev6+g05fdd86.d20250324
NOTE: Execution time was roughly 18 minutes.
The SAS will have created various log files within the mounted ${TROPO_DIR}/delivery_data_tropo volume. The output netCDF file, the quick-look PNG, and a log file are written to ${TROPO_DIR}/delivery_data_tropo/output/:
-rw-r--r-- 1 marlis cloud-user 2.1G Apr 8 20:59 OPERA_L4_TROPO-ZENITH_20240215T120000Z_20250324T211825Z_HRES_v0.2.nc
-rw-r--r-- 1 marlis cloud-user 4.2M Apr 8 21:00 OPERA_L4_TROPO-ZENITH_20240215T120000Z_20250324T211825Z_HRES_v0.2.png
-rw-r--r-- 1 marlis cloud-user 2.4K Apr 8 21:00 run_tropo.log
NOTE: The 20250324T211825Z portion of the sample output filename is the production time, and will be different for each execution of the container. All other portions of the file name should match.
For this Acceptance Test, ensure the same set of expected files has been created on the local machine.
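Since only the production-time field of the filenames varies, the check for expected files can be scripted with a wildcard in that position. The following is a sketch; check_tropo_outputs is a hypothetical helper, and the patterns are derived from the sample filenames shown above.

```shell
# Hypothetical helper: confirm the output directory holds at least one
# product (.nc), one browse image (.png), and the run log. The
# production-time field of the filename is matched with a wildcard.
check_tropo_outputs() {
  out_dir="${1:-output}"
  rc=0
  for pattern in 'OPERA_L4_TROPO-ZENITH_20240215T120000Z_*_HRES_v0.2.nc' \
                 'OPERA_L4_TROPO-ZENITH_20240215T120000Z_*_HRES_v0.2.png' \
                 'run_tropo.log'; do
    # Count regular files directly inside out_dir that match the pattern.
    n=$(find "$out_dir" -maxdepth 1 -type f -name "$pattern" | wc -l | tr -d ' ')
    if [ "$n" -ge 1 ]; then
      echo "ok:      $pattern ($n file(s))"
    else
      echo "missing: $pattern" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Usage: check_tropo_outputs "${TROPO_DIR}/delivery_data_tropo/output"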
To ensure the tropo container is executing as expected, we can also run its built-in Python script, which compares the TROPO product generated by the SAS with the expected "golden dataset" product. Provide the paths to the "golden dataset" product and the SAS-produced product in the following command:
docker run --rm -u $(id -u):$(id -g) --volume $(pwd):/home/ops opera/tropo:calval_0.2 opera_tropo validate golden_output/golden_output_20240215T12.nc output/OPERA_L4_TROPO-ZENITH_20240215T120000Z_20250324T211825Z_HRES_v0.2.nc
NOTE: Since the output product filename contains production time information (as described above), the path to the output product will differ from this example.
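To avoid editing the product filename by hand, the newest product in output/ can be looked up and passed to the validate command. This is a sketch; both function names are hypothetical, and it assumes you run it from the directory that is mounted into the container so that the relative paths resolve inside it.

```shell
# Hypothetical helper: print the most recently modified product in the
# given output directory (default: output), as a relative path.
latest_tropo_product() {
  ls -t "${1:-output}"/OPERA_L4_TROPO-ZENITH_*_HRES_v0.2.nc 2>/dev/null | head -n 1
}

# Hypothetical wrapper: run the validate command shown above against the
# newest product instead of a hard-coded filename.
validate_latest() {
  product=$(latest_tropo_product output)
  [ -n "$product" ] || { echo "no product found in output/" >&2; return 1; }
  docker run --rm -u "$(id -u):$(id -g)" --volume "$(pwd)":/home/ops \
    opera/tropo:calval_0.2 \
    opera_tropo validate golden_output/golden_output_20240215T12.nc "$product"
}
```

If several runs have written to output/, the product with the latest modification time is the one validated.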
For this Acceptance Test, the validation script is expected to pass. A passing report should include information that the SAS produced product matches the "golden dataset" product:
[INFO|validate|L76] 2025-04-08T21:17:34+0000: Test Dataset dimensions
[INFO|validate|L80] 2025-04-08T21:17:34+0000: Test Dataset Variable names
[INFO|validate|L84] 2025-04-08T21:17:34+0000: Test Dataset Global attributes
[INFO|validate|L90] 2025-04-08T21:17:34+0000: Test Dataset Coordinate attributes
[INFO|validate|L96] 2025-04-08T21:17:34+0000: Test Dataset Variable Data values
[INFO|validate|L99] 2025-04-08T21:19:45+0000: ✅ Datasets golden_output_20240215T12.nc and OPERA_L4_TROPO-ZENITH_20240215T120000Z_20250324T211825Z_HRES_v0.2.nc match!
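For scripted acceptance runs, the final "match!" line shown above can be used as the pass criterion. The sketch below checks captured validation output for that line; validation_passed is a hypothetical helper, and whether the validate command also signals failure through its exit code is not stated here, so only the log text is inspected.

```shell
# Hypothetical helper: read validation output on stdin and succeed only if
# it contains the final "Datasets ... and ... match!" line.
validation_passed() {
  grep -q 'Datasets .* and .* match!'
}
```

One way to use it is to redirect the output of the validate command to a file (including stderr, since the log stream used by the SAS is not documented here) and then run: validation_passed < validate.log && echo PASS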