TROPO Interface Acceptance Testing Instructions - nasa/opera-sds-pge GitHub Wiki

This page contains instructions for performing Acceptance Testing for the TROPO Interface delivery from the OPERA-ADT team. These instructions pertain to the latest version of the Interface release, currently SAS v0.1. These instructions assume the user has access to the JPL FN-Artifactory and has Docker installed on their local machine.

Acquiring the TROPO Interface Docker Image

The image is currently hosted on JPL FN-Artifactory, which requires JPL VPN access and JPL credentials. You may also need to be added to the gov.nasa.jpl.opera.adt organization.

Once you have access, the container tarball delivery is available under general/gov/nasa/jpl/opera/adt/tropo/r1/interface/dockerimg_tropo_interface_0.1.tar.

The "golden dataset" is available under general/gov/nasa/jpl/opera/adt/tropo/r1/interface/delivery_data_tropo.zip.

Optional documentation is available under general/gov/nasa/jpl/opera/adt/tropo/r1/interface/documents/.

Download the container tarball and sample data files to a location on your local machine. This location will be referred to throughout these instructions as ${TROPO_DIR}.

Loading the image into Docker

The first step in running the TROPO image is to load it into Docker via the following command:

docker load -i ${TROPO_DIR}/dockerimg_tropo_interface_0.1.tar

This should add the Docker image to your local repository with the name opera/tropo and the tag interface_0.1.

Preparing the test data

Once the delivery_data_tropo.zip file is downloaded to your local machine, unpack it to ${TROPO_DIR}/delivery_data_tropo:

mkdir ${TROPO_DIR}/delivery_data_tropo
cd ${TROPO_DIR}
unzip delivery_data_tropo.zip -d delivery_data_tropo

This will create the following directories within ${TROPO_DIR}/delivery_data_tropo:

  • input_data/
  • golden_output/
  • output/
  • configs/

where input_data contains the sample input data, golden_output contains the reference output product used for validation, output acts as a scratch directory and will ultimately contain the SAS-produced product, and configs contains runconfig.yaml.

The runconfig.yaml contains two types of configurable parameters that change the workflow behavior without changing the product result: worker_settings and output_options.

Configurable worker_settings pertain to dask settings:

  • n_workers: controls the number of processes spawned by Python for the dask parallel processing of the input data. When the SAS is run on a larger machine with more CPUs, a larger value of n_workers will result in a faster runtime.
  • threads_per_worker: the number of threads used by each dask worker during the SAS run.
  • max_memory: the maximum memory (in GiB) allocated to each dask worker. Larger values of block_shape or n_workers can increase the memory usage of the SAS.
  • block_shape: the chunk size of the blocks processed in parallel with dask, specified as a (rows, columns) tuple. The chunk size may affect SAS performance.
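To illustrate how block_shape determines the amount of parallel work, the rough sketch below (plain Python, not part of the SAS; the grid dimensions are made up for illustration) counts how many blocks a 2-D grid would be split into:

```python
import math

def count_blocks(grid_shape, block_shape):
    """Number of dask tasks produced when a 2-D grid is split into
    blocks of block_shape (rows, columns); partial edge blocks count."""
    rows, cols = grid_shape
    brows, bcols = block_shape
    return math.ceil(rows / brows) * math.ceil(cols / bcols)

# Hypothetical 3600 x 7200 grid split into 512 x 512 blocks:
print(count_blocks((3600, 7200), (512, 512)))  # 8 * 15 = 120 blocks
```

Smaller blocks mean more tasks (and scheduling overhead) but a lower per-worker memory footprint, which is why block_shape and max_memory interact.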

Configurable output_options pertain to the output product itself:

  • date_fmt: the date format used in the output filename
  • output_heights: an array of height levels determining the number of heights in the output product. Reducing the number of heights in the array can lower peak RAM consumption.
  • compression_kwargs: compression parameters as a dict (compression_flag: turns compression on or off; zlib: the compression algorithm; complevel: the zlib compression level, ranging from 0 (no compression) to 9 (maximum compression); shuffle: the HDF5 shuffle filter, which de-interlaces a block of data by reordering its bytes before zlib compression)
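
The exact schema is defined by the runconfig.yaml shipped in configs/; the fragment below is only a hypothetical sketch of how the parameters described above might be grouped, with illustrative values that are not the delivery defaults:

```yaml
worker_settings:
  n_workers: 4             # processes spawned for dask parallel processing
  threads_per_worker: 2    # threads used by each dask worker
  max_memory: 8            # GiB allocated to each dask worker
  block_shape: [256, 256]  # (rows, columns) chunk size

output_options:
  date_fmt: "%Y%m%dT%H%M%S"
  output_heights: [0, 100, 500, 1000]
  compression_kwargs:
    compression_flag: true
    zlib: true
    complevel: 4
    shuffle: true
```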

NOTE: no config changes were required in order to successfully run on opera-dev-pge.

The SAS requires read/write permissions for the ${TROPO_DIR}/delivery_data_tropo directory along with all subdirectories contained within:

cd ${TROPO_DIR}/delivery_data_tropo
chmod -R 777 .

In order to execute the SAS, the ${TROPO_DIR}/delivery_data_tropo directory will be mounted into the container instance as a Docker volume.

Executing the TROPO container on the sample test data

Change directory into the ${TROPO_DIR} directory.

cd ${TROPO_DIR}/

We're now ready to execute the TROPO Interface delivery. Run the following command to kick off execution with the test assets and copy the output products to the mounted volume:

NOTE: Docker requires absolute paths for volume arguments, so assuming you are running from within ${TROPO_DIR}, the $(pwd) command can be utilized as a shortcut.

docker run --rm -u $(id -u):$(id -g) \
  -v $(pwd)/delivery_data_tropo:/home/ops \
  opera/tropo:interface_0.1 \
  opera_tropo run configs/runconfig.yaml

You should see the following almost instantaneously:

mkdir -p failed for path /.config/matplotlib: [Errno 13] Permission denied: '/.config'
Matplotlib created a temporary cache directory at /tmp/matplotlib-gxgdgzew because there was an issue with the default path (/.config/matplotlib); it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
 ╭─▌▌Herbie─────────────────────────────────────────────╮
 │ WARNING: Unable to create config file                │
 │            /.config/herbie/config.toml               │
 │ Herbie will use standard default settings.           │
 │ Consider setting env variable HERBIE_CONFIG_PATH.    │
 ╰──────────────────────────────────────────────────────╯

The container will output progress messages as it runs:

[INFO|run|L75] 2025-02-13T21:46:19+0000: Calculating TROPO delay
[INFO|run|L96] 2025-02-13T21:46:21+0000: Dask server link: http://127.0.0.1:8787/status
[INFO|run|L108] 2025-02-13T21:46:21+0000: Rechunking input_data/20190613/D06130600061306001.zz.nc
[INFO|run|L151] 2025-02-13T21:46:32+0000: Estimating ZTD delay, mem usage 15581.98 GB

Culminating in something like the following:

[INFO|loggin_setup|L72] 2025-02-13T22:08:08+0000: Total elapsed time for opera_tropo.run.tropo: 21.82 minutes (1309.28 seconds)
[INFO|main|L65] 2025-02-13T22:08:08+0000: Output file: output/OPERA_L4_TROPO_20190613T060000Z_20250208T180402Z_HRES_0.1_v0.1.nc
[INFO|main|L68] 2025-02-13T22:08:08+0000: Output browse image: output/OPERA_L4_TROPO_20190613T060000Z_20250208T180402Z_HRES_0.1_v0.1.png
[INFO|main|L71] 2025-02-13T22:08:18+0000: Product type: OPERA_TROPO
[INFO|main|L72] 2025-02-13T22:08:18+0000: Product version: 0.1
[INFO|main|L74] 2025-02-13T22:08:18+0000: Maximum memory usage: 49.81 GB
[INFO|main|L75] 2025-02-13T22:08:18+0000: Config file RAIDER version: 0.5.3
[INFO|main|L76] 2025-02-13T22:08:18+0000: Current running opera_tropo version: 0.1.post1.dev0+ga208ab4.d20250210
[INFO|loggin_setup|L72] 2025-02-13T22:08:18+0000: Total elapsed time for opera_tropo.main.run: 22.01 minutes (1320.34 seconds)

NOTE: Execution time was roughly 22 minutes.

The SAS will have created various log files within the mounted ${TROPO_DIR}/delivery_data_tropo volume, along with the output netCDF file and quick-look PNG in ${TROPO_DIR}/delivery_data_tropo/output/:

-rwxrwxrwx 1 marlis cloud-user 2.6G Feb 13 22:08 OPERA_L4_TROPO_20190613T060000Z_20250208T180402Z_HRES_0.1_v0.1.nc
-rwxrwxrwx 1 marlis cloud-user 4.5M Feb 13 22:08 OPERA_L4_TROPO_20190613T060000Z_20250208T180402Z_HRES_0.1_v0.1.png

NOTE: The 20250208T180402Z portion of the sample output filename is the production time, and will be different for each execution of the container. All other portions of the file name should match.
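
Since only the production-time field varies between runs, one quick way to confirm the rest of the filename is as expected is a regex check like the sketch below. The pattern is inferred from the sample filename above, not from an official naming specification:

```python
import re

# Fields inferred from the sample name:
# OPERA_L4_TROPO_<acquisition time>_<production time>_HRES_0.1_v0.1.nc
PATTERN = re.compile(
    r"OPERA_L4_TROPO_(?P<acq>\d{8}T\d{6}Z)_(?P<prod>\d{8}T\d{6}Z)"
    r"_HRES_0\.1_v0\.1\.nc"
)

def matches_ignoring_production_time(name, expected_acq="20190613T060000Z"):
    """True if the filename fits the pattern and has the expected
    acquisition time; the production time is allowed to differ."""
    m = PATTERN.fullmatch(name)
    return bool(m) and m.group("acq") == expected_acq

print(matches_ignoring_production_time(
    "OPERA_L4_TROPO_20190613T060000Z_20250208T180402Z_HRES_0.1_v0.1.nc"))  # True
```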

For this Acceptance Test, ensure the same set of expected files has been created on the local machine.

Executing the Quality Assurance test within the TROPO container

To ensure the TROPO container is executing as expected, we can also run its built-in Python script, which compares the SAS-produced product against the expected "golden dataset" product. Provide the paths to the "golden dataset" product and the SAS-produced product using the following command:

docker run --rm -u $(id -u):$(id -g) \
  -v $(pwd)/delivery_data_tropo:/home/ops \
  opera/tropo:interface_0.1 \
  opera_tropo validate golden_output/golden_output_20190613T06.nc \
    output/OPERA_L4_TROPO_20190613T060000Z_20250208T180402Z_HRES_0.1_v0.1.nc

NOTE: Since the output product filename contains production time information (as described above), the path to the output product will differ from this example.

For this Acceptance Test, the validation script is expected to pass. A passing report should include information that the SAS produced product matches the "golden dataset" product:

[INFO|validate|L29] 2025-02-13T22:50:43+0000: Test Dataset dimensions
[INFO|validate|L33] 2025-02-13T22:50:43+0000: Test Dataset Variable names
[INFO|validate|L37] 2025-02-13T22:50:43+0000: Test Dataset Global attributes
[INFO|validate|L41] 2025-02-13T22:50:43+0000: Test Dataset Coordinate attributes
[INFO|validate|L47] 2025-02-13T22:50:43+0000: Test Dataset Variable Data values
[INFO|validate|L50] 2025-02-13T22:52:16+0000: ✅ Datasets golden_output_20190613T06.nc and OPERA_L4_TROPO_20190613T060000Z_20250208T180402Z_HRES_0.1_v0.1.nc match!