Release Lifecycle Notes

artdaq Release Testing and Acceptance

One of the key features of any DAQ system is stability. To ensure that artdaq performs as stably as possible, an exhaustive battery of tests should be performed on each release.
The artdaq team may release testing or integration releases from time to time, clearly labeled as such. Users accept that using a release that is not marked “production” on the Release Notes page may result in data loss, lower stability, or other undesired effects.

artdaq releases are tested in three stages: unit testing, integration testing, and acceptance testing.

Unit Testing

These tests validate the basic low-level functionality of artdaq classes. They are generally performed before a release is ever tagged, and are re-run every time the release is built.

TODO: Calculate code coverage of unit tests and improve/add tests as necessary

A. Make sure that artdaq builds without errors or warnings
B. Make sure that all artdaq packages pass their built-in test suites
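
As a concrete illustration, a minimal sketch of this stage in an mrb development area (such as one created by quick-mrb-start.sh) might look like the following. The products path and log-file names are placeholders, and the exact setup steps depend on how the area was created; this is a sketch, not the official procedure.

```bash
# Sketch of the unit-test stage, assuming an existing mrb development
# area. The products path and log-file names are placeholders.
source /products/setup              # placeholder: site products area
source localProducts_*/setup        # set up the local products area
mrbsetenv                           # configure the build environment
mrb b 2>&1 | tee build.log          # A: build; check build.log for errors/warnings
mrb t 2>&1 | tee test.log           # B: run each package's built-in test suite
```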

Integration Testing

These tests are performed to validate some level of DAQ functionality, but the system is run in “ideal” conditions, and stress tests are not performed at this stage.
A release that does not pass or has not passed these tests may be labeled as “testing”.

A. Check that quick-mrb-start.sh functions properly when run without parameters
B. Perform transfer_driver tests (see the transfer_driver tests section below; a command sketch also follows this list):
  1. Large fragments (100 MB) x 10,000; record the rate for Shmem, TCP, and MPI
  2. Small fragments (5 KB) x 1,000,000; record the rate for Shmem, TCP, and MPI (originally 1,000 fragments)
C. Perform artdaqDriver tests (a timing sketch follows this list):
  1. test1: 10,000 1 MB events; record the time
  2. test2: 1,000,000 1 KB events; record the time
  3. test3: 10,000 1 MB events without disk writing; record the time
  4. test4: 10,000 1 MB events with binary disk writing to /dev/null; record the time (new for v2_03_00; run for v2_02_01)
D. Run quick-mrb-start.sh --run-demo (data-file checks are sketched after this list):
  1. Make sure the demo runs as expected
  2. Make sure that the output data file is created
    a. Run rawEventDump.fcl over the data file
    b. Run toyDump.fcl over the data file
  3. Store the data file in Redmine as a version reference
E. Run the DAQInterface example configurations:
  1. Make sure each example runs as expected
  2. Make sure the output data file is created
  3. Run verification FCL jobs on the data file
F. Test the version reference data files from Redmine; note if a version incompatibility exists
G. Test the previous version of artdaq with the current reference data files; note if the data files are not backwards-compatible (a sketch follows this list):
  1. Run quick-mrb-start.sh --tag [previous version tag] in a new directory
  2. See the compatibility test notes
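
For item B, a driver loop along these lines could run the transfer tests. The transfer_driver executable ships with artdaq, but the FHiCL file names below are placeholders for configurations selecting the Shmem, TCP, or MPI transfer plugin and setting the fragment size and count; consult the transfer_driver tests section for the actual invocation, including any rank arguments it may require.

```bash
# Hypothetical driver loop for the transfer_driver tests (item B).
# The .fcl names are placeholders, not files that ship with artdaq.
for transport in shmem tcp mpi; do
  echo "--- ${transport}: 100 MB fragments x 10,000 ---"
  transfer_driver transfer_driver_${transport}_large.fcl
  echo "--- ${transport}: 5 KB fragments x 1,000,000 ---"
  transfer_driver transfer_driver_${transport}_small.fcl
done
```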
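
Similarly, the artdaqDriver timing tests in item C could be recorded with GNU time. artdaqDriver and its -c option come from artdaq-demo, while the test1 through test4 FHiCL names are placeholders for configurations matching the event counts and sizes listed above.

```bash
# Hypothetical timing loop for the artdaqDriver tests (item C).
# The .fcl names are placeholders for the four test configurations.
for cfg in test1 test2 test3 test4; do
  echo "--- ${cfg} ---"
  /usr/bin/time -v artdaqDriver -c ${cfg}.fcl   # record elapsed time
done
```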
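
The data-file checks in item D.2 can be run with art directly; rawEventDump.fcl and toyDump.fcl are the configurations named above, and -c/-s are standard art options, but the data-file name below is a placeholder.

```bash
# Data-file checks from item D.2; the file name is a placeholder.
art -c rawEventDump.fcl -s artdaqdemo_r000101_sr01.root
art -c toyDump.fcl      -s artdaqdemo_r000101_sr01.root
```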
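
Finally, a rough sketch of the backwards-compatibility check in item G, assuming quick-mrb-start.sh's --tag option installs the requested release as described above; the tag and file path are placeholders.

```bash
# Hypothetical backwards-compatibility check (item G).
mkdir previous-release && cd previous-release
./quick-mrb-start.sh --tag v2_02_01     # placeholder: previous release tag
# After setting up the previous release's environment, try to read the
# current release's reference data file with the older code:
art -c rawEventDump.fcl -s /path/to/current-version-reference.root
```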

Acceptance Testing

These tests are performed to verify the performance of the integrated artdaq release in conditions as similar as possible to actual experiments using artdaq.
Various stresses will be placed on the system to ensure that it continues to perform well when subject to CPU, disk, network, and memory constraints.
The request, routing, and filtering systems should all be thoroughly tested as well.
A release that does not pass or has not passed these tests may be labeled as “integration”.

A. CPU-bound performance tests (currently using the protodune_mock_system_with_delay configuration on ironwork with 5 BRs and 5 EBs):
  1. Perform single-run tests with a long duration; ensure that the system remains stable for at least 1 hour
  2. Perform multi-run tests with short durations; ensure that the system remains stable through at least 120 runs (the current configuration uses 3 runs per system instance, with DAQInterface remaining running throughout; see the sketch after this list)
B. Large system tests (currently using protodune_mock_system_with_delay on the mu2edaq cluster)
C. Large protoDUNE-like system (120 BRs, 16 EBs, across all available mu2edaq nodes)
D. TODO: Add more tests
E. Deployment tests (all available experiments):
  1. Install the release in a testing area on the experiment's computing, and run the experiment's DAQ through the new release
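
One possible shape for the multi-run stability test in item A.2 is sketched below, assuming a DAQInterface instance is already running and that just_do_it.sh (from artdaq-daqinterface) accepts a run duration in seconds; verify the usage against your DAQInterface version before relying on this, and note that the per-instance run grouping described above is simplified away here.

```bash
# Hypothetical multi-run stability loop (item A.2): 120 short runs with
# DAQInterface left running throughout. just_do_it.sh is assumed to take
# a run duration in seconds; adjust to your DAQInterface version.
for run in $(seq 1 120); do
  echo "=== Stability run ${run}/120 ==="
  just_do_it.sh 300 || { echo "Run ${run} failed" >&2; break; }
done
```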

Unlike the previous stages of testing, passing or failing the acceptance tests is to some degree a value judgment that the group must make before giving a release the “production” label.
Any issues identified during Acceptance testing that do not result in a release failing should be documented in Redmine and ideally resolved by the next release.
