Instructions for setting up a software area for fddaq‐v4.x.y development - DUNE-DAQ/daqconf GitHub Wiki

Instructions for setting up a Far Detector software area for v4.x.y development based on a recent nightly build

24-Apr-2024 - Work in progress (the steps below have been verified to work)

Reference information:

Here are the suggested steps:

  1. create a new software area based on the latest nightly build (see step 1.4 for the exact dbt-create command to use)

    1. The steps for this are based on the latest instructions for daq-buildtools

    2. As always, you should verify that your computer has access to /cvmfs/dunedaq.opensciencegrid.org

    3. If you are using one of the np04daq computers, and need to clone packages, add the following lines to your .gitconfig file (once you do this, there will be no need to activate the web proxy each time you log in, and this means that you won't forget to disable it...):

      [http]
        proxy = http://np04-web-proxy.cern.ch:3128
        sslVerify = false
      
    4. Here are the steps for creating the new software area:

      cd <directory_above_where_you_want_the_new_software_area>
      source /cvmfs/dunedaq.opensciencegrid.org/setup_dunedaq.sh
      setup_dbt latest
      dbt-create -n NFD_PROD4_240424_A9 <work_dir>  # NFD_PROD4_240424_C8 for SL7/CS8
      cd <work_dir>
      
    5. Please note that if you are following these instructions on a computer on which the DUNE-DAQ software has never been run before, there are several system packages that may need to be installed on that computer. These are mentioned in this script. To check whether a particular one is already installed, you can use a command like yum list libzstd and check whether the package is listed under Installed Packages.
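The CVMFS check mentioned in step 1.2 can be scripted. The following is a minimal sketch and is not part of daq-buildtools: the function name `cvmfs_available` is hypothetical, and the check only confirms that the repository directory can be listed from the current host.

```python
import os

CVMFS_REPO = "/cvmfs/dunedaq.opensciencegrid.org"  # path taken from the steps above


def cvmfs_available(repo_path: str = CVMFS_REPO) -> bool:
    """Return True if repo_path is a directory whose contents can be listed."""
    try:
        # Listing the directory also triggers a mount attempt on hosts
        # where CVMFS is mounted on demand (assumption: autofs-style setup).
        os.listdir(repo_path)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    status = "accessible" if cvmfs_available() else "NOT accessible"
    print(f"{CVMFS_REPO} is {status}")
```

If the check reports that the repository is not accessible, the `source .../setup_dunedaq.sh` command in step 1.4 cannot work on that host.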

  2. add any desired repositories to the /sourcecode area. An example is provided here.

    1. decide if you want the very latest code, or a more stable set of packages that has been verified to work.

      Run this command to select the very latest code

      export use_very_latest_dunedaq_code=1
      

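      or this one to select a more stable set of packages. (Note that the clone block below tests only use_very_latest_dunedaq_code, so the verified commits are checked out whenever that variable is unset or empty.)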

      export use_recent_verified_dunedaq_code=1
      
    2. clone the repositories (the following block has some extra directory checking; it can all be copy/pasted into your shell window)

      # change directory to the "sourcecode" subdir, if possible and needed
      if [[ -d "sourcecode" ]]; then
          cd sourcecode
      fi
      # double-check that we're in the correct subdir
      current_subdir=$(basename "${PWD}")
      if [[ "$current_subdir" != "sourcecode" ]]; then
          echo ""
          echo "*** Current working directory is not \"sourcecode\", skipping repo clones"
      else
          # finally, do the repo clone(s)
          git clone https://github.com/DUNE-DAQ/daqconf.git -b production/v4
          git clone https://github.com/DUNE-DAQ/daqsystemtest.git -b production/v4
          git clone https://github.com/DUNE-DAQ/dfmodules.git -b production/v4
          git clone https://github.com/DUNE-DAQ/fddaqconf.git -b production/v4
          if [[ "$use_very_latest_dunedaq_code" == "" ]]; then
              cd daqconf ; git checkout 397b444b78; cd ..
              cd daqsystemtest ; git checkout 695dfa94c56; cd ..
              cd dfmodules ; git checkout bd61f366c; cd ..
              cd fddaqconf ; git checkout 0575bb1ade; cd ..
          fi
          cd ..
      fi
      
      
  3. set up the work area and build the software

    dbt-workarea-env
    dbt-build -j 20
    dbt-workarea-env
    
    
  4. prepare a daqconf.json file, such as the one shown here. This sample includes a playback data file that is appropriate for the WIBEth data type. (Please note the additional comments on this sample file that are included below!)

    {
      "boot": {
        "use_connectivity_service": true,
        "start_connectivity_service": true,
        "connectivity_service_host": "localhost",
        "connectivity_service_port": 15432
      }, 
      "daq_common": {
        "data_rate_slowdown_factor": 1
      },
      "detector": {
        "clock_speed_hz": 62500000,
        "offline_data_stream": "cosmics"
      },
      "readout": {
        "use_fake_cards": true,
        "default_data_file": "asset://?label=WIBEth&subsystem=readout"
      },
      "trigger": {
        "ttcm_input_map": [{"signal": 1, "tc_type_name": "kTiming",
                            "time_before": 1000, "time_after": 1000}]
      },
      "hsi": {
        "random_trigger_rate_hz": 1.0
      }
    }

    A few notes on the sample file shown above:

    • The "use/start_connectivity_service" parameters aren't strictly needed, since their default value is "true". Ditto, the "connectivity_service_host/port". However, all of these are included so that people can use them for reference.
    • A port offset is applied to the "connectivity_service_port" by nanorc, so we don't all need to use different port numbers, as long as we use different partition numbers when running nanorc (e.g. 'nanorc --partition-number 2 ...').
    • If you want to use an existing, externally-started Connectivity Service instance, such as the one on the np04 cluster, you would set "use_connectivity_service" to true, and "start_connectivity_service" to false.
    Another option (the initial config, but with the ConnSvc disabled)
    {
      "boot": {
        "use_connectivity_service": false,
        "start_connectivity_service": false
      }, 
      "daq_common": {
        "data_rate_slowdown_factor": 1
      },
      "detector": {
        "clock_speed_hz": 62500000
      },
      "readout": {
        "use_fake_cards": true,
        "default_data_file": "asset://?label=WIBEth&subsystem=readout"
      },
      "trigger": {
        "ttcm_input_map": [{"signal": 1, "tc_type_name": "kTiming",
                            "time_before": 1000, "time_after": 1000}]
      },
      "hsi": {
        "random_trigger_rate_hz": 1.0
      }
    }
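The two variants above differ only in their "boot" section. As a minimal sketch, the Python snippet below builds either variant from one function. The helper name `make_daqconf` is hypothetical, and every key and value is copied from the second sample above (so "offline_data_stream" is omitted); nothing here is part of the daqconf package.

```python
import json


def make_daqconf(use_connsvc: bool = True, start_connsvc: bool = True) -> dict:
    """Build the sample daqconf dictionary with the chosen ConnSvc settings."""
    boot = {
        "use_connectivity_service": use_connsvc,
        "start_connectivity_service": start_connsvc,
    }
    if use_connsvc:
        # Host and port values are the ones shown in the first sample.
        boot["connectivity_service_host"] = "localhost"
        boot["connectivity_service_port"] = 15432
    return {
        "boot": boot,
        "daq_common": {"data_rate_slowdown_factor": 1},
        "detector": {"clock_speed_hz": 62500000},
        "readout": {
            "use_fake_cards": True,
            "default_data_file": "asset://?label=WIBEth&subsystem=readout",
        },
        "trigger": {
            "ttcm_input_map": [
                {"signal": 1, "tc_type_name": "kTiming",
                 "time_before": 1000, "time_after": 1000}
            ]
        },
        "hsi": {"random_trigger_rate_hz": 1.0},
    }


if __name__ == "__main__":
    # Variant with the Connectivity Service disabled; redirect to daqconf.json.
    print(json.dumps(make_daqconf(use_connsvc=False, start_connsvc=False), indent=2))
```

Calling `make_daqconf(use_connsvc=True, start_connsvc=False)` gives the combination described above for an externally started Connectivity Service; the host and port would still need to match that instance.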
  5. prepare a data-readout map file (e.g. my_dro_map.json), listing the detector streams (true or fake) that you want to run with. This sample specifies parameter values that are appropriate for WIBEth data:

    [
        {
            "src_id": 100,
            "geo_id": {
                "det_id": 3,
                "crate_id": 1,
                "slot_id": 0,
                "stream_id": 0
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 0,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.0",
                "rx_mac": "00:00:00:00:00:00",
                "rx_ip": "0.0.0.0",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        },
        {
            "src_id": 101,
            "geo_id": {
                "det_id": 3,
                "crate_id": 1,
                "slot_id": 0,
                "stream_id": 1
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 0,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.0",
                "rx_mac": "00:00:00:00:00:00",
                "rx_ip": "0.0.0.0",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        }
    ]
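Before generating a configuration, it can help to sanity-check a hand-edited map file. The sketch below assumes that every entry should have a distinct src_id and a distinct geo_id combination; this is not stated above, although it holds for each sample on this page. `check_dro_map` is a hypothetical helper, not a daqconf function.

```python
def check_dro_map(entries: list) -> list:
    """Return a list of human-readable problems found in a dro_map list."""
    problems = []
    seen_src, seen_geo = set(), set()
    geo_fields = ("det_id", "crate_id", "slot_id", "stream_id")
    for i, entry in enumerate(entries):
        src = entry.get("src_id")
        geo = entry.get("geo_id", {})
        geo_key = tuple(geo.get(k) for k in geo_fields)
        if src in seen_src:
            problems.append(f"entry {i}: duplicate src_id {src}")
        if geo_key in seen_geo:
            problems.append(f"entry {i}: duplicate geo_id {geo_key}")
        if None in geo_key:
            problems.append(f"entry {i}: incomplete geo_id")
        seen_src.add(src)
        seen_geo.add(geo_key)
    return problems


if __name__ == "__main__":
    # Two entries reduced to the fields that the check inspects.
    sample = [
        {"src_id": 100, "geo_id": {"det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 0}},
        {"src_id": 101, "geo_id": {"det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 1}},
    ]
    print(check_dro_map(sample))  # prints: []
```

To check a real file, load it with `json.load` and pass the resulting list to the function.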
  6. Generate a configuration, e.g.:

    fddaqconf_gen -c ./daqconf.json --detector-readout-map-file ./my_dro_map.json my_test_config
    
  7. run the system with nanorc, using a command of the form nanorc --partition-number <num> <config name> <partition name> boot conf start_run --run-type TEST <run number> wait 60 stop_run scrap terminate

    • e.g. nanorc --partition-number 2 my_test_config ${USER}-test boot conf start_run --run-type TEST 111 wait 60 stop_run scrap terminate
    • or, you can simply invoke nanorc --partition-number 2 my_test_config ${USER}-test by itself and input the commands individually
  8. When you return to working with the software area after logging out, the steps that you'll need to redo are the following:

    • cd <work_dir>
    • source ./env.sh
    • dbt-build # if needed
    • dbt-workarea-env # if needed
  9. For reference, here are additional sample daqconf.json and dro_map.json files that illustrate various types of running with WIBEth data. These can be mixed and matched with the samples above to generate demo systems of various levels of complexity.

    Sample daqconf.json for running with TPG
    {
      "boot": {
        "use_connectivity_service": true,
        "start_connectivity_service": true,
        "connectivity_service_host": "localhost",
        "connectivity_service_port": 15432
      }, 
      "daq_common": {
        "data_rate_slowdown_factor": 1
      },
      "detector": {
        "clock_speed_hz": 62500000,
        "offline_data_stream": "cosmics"
      },
      "readout": {
        "use_fake_cards": true,
        "default_data_file": "asset://?checksum=dd156b4895f1b06a06b6ff38e37bd798",
        "generate_periodic_adc_pattern": true,
        "emulated_TP_rate_per_ch": 1,
        "enable_tpg": true,
        "tpg_threshold": 500,
        "tpg_algorithm": "SimpleThreshold"
      },
      "trigger": {
        "ttcm_input_map": [{"signal": 1, "tc_type_name": "kTiming",
                            "time_before": 1000, "time_after": 1000}],
        "trigger_activity_plugin": ["TriggerActivityMakerPrescalePlugin"],
        "trigger_activity_config": [ {"prescale": 25} ],
        "trigger_candidate_plugin": ["TriggerCandidateMakerPrescalePlugin"],
        "trigger_candidate_config": [ {"prescale": 100} ],
        "mlt_merge_overlapping_tcs": false
      },
      "dataflow": {
        "apps": [ { "app_name": "dataflow0" } ],
        "enable_tpset_writing": true,
        "token_count": 20
      },
      "hsi": {
        "random_trigger_rate_hz": 1.0
      }
    }
    Sample dro_map.json for three Readout Apps (3 separate processes), each with two streams of data (i.e. two DataLinkHandler modules)
    [
        {
            "src_id": 100,
            "geo_id": {
                "det_id": 3,
                "crate_id": 1,
                "slot_id": 0,
                "stream_id": 0
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 0,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.0",
                "rx_mac": "00:00:00:00:00:00",
                "rx_ip": "0.0.0.0",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        },
        {
            "src_id": 101,
            "geo_id": {
                "det_id": 3,
                "crate_id": 1,
                "slot_id": 0,
                "stream_id": 1
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 0,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.0",
                "rx_mac": "00:00:00:00:00:00",
                "rx_ip": "0.0.0.0",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        },
        {
            "src_id": 102,
            "geo_id": {
                "det_id": 3,
                "crate_id": 2,
                "slot_id": 0,
                "stream_id": 0
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 1,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.1",
                "rx_mac": "00:00:00:00:00:01",
                "rx_ip": "0.0.0.1",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        },
        {
            "src_id": 103,
            "geo_id": {
                "det_id": 3,
                "crate_id": 2,
                "slot_id": 0,
                "stream_id": 1
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 1,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.1",
                "rx_mac": "00:00:00:00:00:01",
                "rx_ip": "0.0.0.1",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        },
        {
            "src_id": 104,
            "geo_id": {
                "det_id": 3,
                "crate_id": 3,
                "slot_id": 0,
                "stream_id": 0
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 2,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.2",
                "rx_mac": "00:00:00:00:00:02",
                "rx_ip": "0.0.0.2",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        },
        {
            "src_id": 105,
            "geo_id": {
                "det_id": 3,
                "crate_id": 3,
                "slot_id": 0,
                "stream_id": 1
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 2,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.2",
                "rx_mac": "00:00:00:00:00:02",
                "rx_ip": "0.0.0.2",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        }
    ]
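The six entries above follow a regular pattern: consecutive src_id values, one crate_id per readout app, and per-app values for rx_iface, rx_pcie_dev, rx_mac and rx_ip. A minimal Python sketch that reproduces this pattern is shown below. `make_eth_dro_map` is a hypothetical helper, the addresses are the same dummy values as in the sample, and extrapolating the pattern beyond the three apps shown is an assumption.

```python
import json


def make_eth_dro_map(n_apps: int = 3, streams_per_app: int = 2,
                     det_id: int = 3, first_src_id: int = 100) -> list:
    """Generate dro_map entries following the numbering pattern of the sample above."""
    entries = []
    for app in range(n_apps):
        for stream in range(streams_per_app):
            entries.append({
                "src_id": first_src_id + app * streams_per_app + stream,
                "geo_id": {"det_id": det_id, "crate_id": app + 1,
                           "slot_id": 0, "stream_id": stream},
                "kind": "eth",
                "parameters": {
                    "protocol": "udp",
                    "mode": "fix_rate",
                    # Receiver-side values change per app, as in the sample.
                    "rx_iface": app,
                    "rx_host": "localhost",
                    "rx_pcie_dev": f"0000:00:00.{app}",
                    "rx_mac": f"00:00:00:00:00:{app:02x}",
                    "rx_ip": f"0.0.0.{app}",
                    # Transmitter-side values are identical for every entry.
                    "tx_host": "localhost",
                    "tx_mac": "00:00:00:00:00:00",
                    "tx_ip": "0.0.0.0",
                },
            })
    return entries


if __name__ == "__main__":
    print(json.dumps(make_eth_dro_map(), indent=4))
```

With the default arguments the output should match the sample entry for entry; redirect it to a file to use it with fddaqconf_gen.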
    Sample daqconf.json for running with several trigger sources (FakeHSI, TPG, and RandomTriggerCandidateMaker)
    {
      "boot": {
        "use_connectivity_service": true,
        "start_connectivity_service": true,
        "connectivity_service_host": "localhost",
        "connectivity_service_port": 15432
      }, 
      "daq_common": {
        "data_rate_slowdown_factor": 1
      },
      "detector": {
        "clock_speed_hz": 62500000
      },
      "readout": {
        "use_fake_cards": true,
        "default_data_file": "asset://?checksum=dd156b4895f1b06a06b6ff38e37bd798",
        "generate_periodic_adc_pattern": true,
        "emulated_TP_rate_per_ch": 1,
        "enable_tpg": true,
        "tpg_threshold": 500,
        "tpg_algorithm": "SimpleThreshold"
      },
      "trigger": {
        "ttcm_input_map": [{"signal": 1, "tc_type_name": "kTiming",
                            "time_before": 1000, "time_after": 1000}],
        "trigger_activity_plugin": ["TriggerActivityMakerPrescalePlugin"],
        "trigger_activity_config": [ {"prescale": 25} ],
        "trigger_candidate_plugin": ["TriggerCandidateMakerPrescalePlugin"],
        "trigger_candidate_config": [ {"prescale": 100} ],
        "mlt_merge_overlapping_tcs": false,
        "use_random_maker": true,
        "rtcm_timestamp_method": "kTimeSync",
        "rtcm_time_distribution": "kUniform",
        "rtcm_trigger_interval_ticks": 62500000,
        "mlt_use_readout_map": true,
          "mlt_td_readout_map": [
              {
                  "candidate_type": 4,
                  "time_before": 100,
                  "time_after": 200
              },
              {
                  "candidate_type": 5,
                  "time_before": 300,
                  "time_after": 400
              }
          ]
      },
      "dataflow": {
        "apps": [ { "app_name": "dataflow0" } ],
        "enable_tpset_writing": true,
        "token_count": 20
      },
      "hsi": {
        "random_trigger_rate_hz": 1.0
      }
    }
    Sample daqconf.json for running with 16 Dataflow Apps
    {
      "boot": {
        "use_connectivity_service": true,
        "start_connectivity_service": true,
        "connectivity_service_host": "localhost",
        "connectivity_service_port": 15432
      }, 
      "daq_common": {
        "data_rate_slowdown_factor": 1
      },
      "detector": {
        "clock_speed_hz": 62500000
      },
      "readout": {
        "use_fake_cards": true,
        "default_data_file": "asset://?label=WIBEth&subsystem=readout"
      },
      "trigger": {
        "ttcm_input_map": [{"signal": 1, "tc_type_name": "kTiming",
                            "time_before": 1000, "time_after": 1000}]
      },
      "dataflow": {
          "apps":
          [
              { "app_name": "dataflow0" },
              { "app_name": "dataflow1" },
              { "app_name": "dataflow2" },
              { "app_name": "dataflow3" },
              { "app_name": "dataflow4" },
              { "app_name": "dataflow5" },
              { "app_name": "dataflow6" },
              { "app_name": "dataflow7" },
              { "app_name": "dataflow8" },
              { "app_name": "dataflow9" },
              { "app_name": "dataflow10" },
              { "app_name": "dataflow11" },
              { "app_name": "dataflow12" },
              { "app_name": "dataflow13" },
              { "app_name": "dataflow14" },
              { "app_name": "dataflow15" }
          ]
      },
      "hsi": {
        "random_trigger_rate_hz": 8.0
      }
    }
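The list of 16 Dataflow Apps above is easy to mistype by hand. As a small sketch, the "dataflow" section can be generated instead; `dataflow_section` is a hypothetical helper, and the dataflow<N> naming simply follows the sample.

```python
import json


def dataflow_section(n_apps: int) -> dict:
    """Build a "dataflow" section with apps named dataflow0 .. dataflow<n_apps-1>."""
    return {"apps": [{"app_name": f"dataflow{i}"} for i in range(n_apps)]}


if __name__ == "__main__":
    print(json.dumps({"dataflow": dataflow_section(16)}, indent=2))
```

The printed fragment can be pasted into a daqconf.json file in place of the hand-written "dataflow" section.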
  10. For further reference, here are daqconf.json and dro_map.json files for emulated DuneWIB electronics

    Sample daqconf.json for DuneWIB
    {
      "boot": {
        "use_connectivity_service": true,
        "start_connectivity_service": true,
        "connectivity_service_host": "localhost",
        "connectivity_service_port": 15432
      }, 
      "daq_common": {
        "data_rate_slowdown_factor": 10
      },
      "detector": {
        "clock_speed_hz": 62500000
      },
      "readout": {
        "use_fake_cards": true,
        "data_files": [
          {"detector_id": 3, "data_file": "asset://?label=DuneWIB&subsystem=readout"}
        ]
      },
      "trigger": {
        "ttcm_input_map": [{"signal": 1, "tc_type_name": "kTiming",
                            "time_before": 1000, "time_after": 1000}]
      },
      "hsi": {
        "random_trigger_rate_hz": 1.0
      }
    }
    Another option, with DuneWIB, Trigger Primitive generation enabled, and multiple Dataflow apps
    {
      "boot": {
        "use_connectivity_service": true,
        "start_connectivity_service": true,
        "connectivity_service_host": "localhost",
        "connectivity_service_port": 15432
      }, 
      "dataflow": {
        "enable_tpset_writing": true,
        "apps": [
           { "app_name": "dataflow0" },
           { "app_name": "dataflow1" }
        ]
      },
      "daq_common": {
        "data_rate_slowdown_factor": 10
      },
      "detector": {
        "clock_speed_hz": 62500000
      },
      "readout": {
        "enable_tpg": true,
        "tpg_threshold": 500,
        "use_fake_cards": true,
        "data_files": [
          {"detector_id": 3, "data_file": "asset://?label=DuneWIB&subsystem=readout"}
        ]
      },
      "trigger": {
        "trigger_activity_plugin": ["TriggerActivityMakerPrescalePlugin"],
        "trigger_activity_config": [ {"prescale": 1000} ],
        "trigger_candidate_plugin": ["TriggerCandidateMakerPrescalePlugin"],
        "trigger_candidate_config": [ {"prescale": 100} ],
        "ttcm_input_map": [{"signal": 1, "tc_type_name": "kTiming",
                            "time_before": 1000, "time_after": 1000}]
      },
      "hsi": {
        "random_trigger_rate_hz": 1.0
      }
    }
    Sample dro_map.json for DuneWIB
    [
        {
            "src_id": 100,
            "geo_id": {
                "det_id": 3,
                "crate_id": 1,
                "slot_id": 0,
                "stream_id": 0
            },
            "kind": "flx",
            "parameters": {
                "protocol": "full",
                "mode": "fix_rate",
                "host": "localhost",
                "card": 0,
                "slr": 0,
                "link": 0
            }
        },
        {
            "src_id": 101,
            "geo_id": {
                "det_id": 3,
                "crate_id": 1,
                "slot_id": 0,
                "stream_id": 1
            },
            "kind": "flx",
            "parameters": {
                "protocol": "full",
                "mode": "fix_rate",
                "host": "localhost",
                "card": 0,
                "slr": 0,
                "link": 1
            }
        }
    ]

    An example hardware map file from the Vertical Drift Coldbox can be found here.

  11. For reference, here are daqconf.json and dro_map.json files for VD TDE (vertical drift, top detector electronics)

    Sample daqconf.json for VD TDE
    {
      "boot": {
        "use_connectivity_service": true,
        "start_connectivity_service": true,
        "connectivity_service_host": "localhost",
        "connectivity_service_port": 15432
      },
      "daq_common": {
        "data_rate_slowdown_factor": 1
      },
      "detector": {
        "clock_speed_hz": 62500000
      },
      "readout": {
        "use_fake_cards": true,
        "default_data_file": "asset://?checksum=759e5351436bead208cf4963932d6327"
      },
      "trigger": {
        "ttcm_input_map": [{"signal": 1, "tc_type_name": "kTiming",
                            "time_before": 1000, "time_after": 1000}]
      },
      "hsi": {
        "random_trigger_rate_hz": 1.0
      }
    }
    Sample dro_map.json for VD TDE
    [
        {
            "src_id": 100,
            "geo_id": {
                "det_id": 11,
                "crate_id": 1,
                "slot_id": 0,
                "stream_id": 0
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 0,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.0",
                "rx_mac": "00:00:00:00:00:00",
                "rx_ip": "0.0.0.0",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        },
        {
            "src_id": 101,
            "geo_id": {
                "det_id": 11,
                "crate_id": 1,
                "slot_id": 1,
                "stream_id": 1
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 0,
                "rx_host": "localhost",
                "rx_pcie_dev": "0000:00:00.0",
                "rx_mac": "00:00:00:00:00:00",
                "rx_ip": "0.0.0.0",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0"
            }
        }
    ]

Notes about the use of localhost in daqconf.json and dro_map.json files

Starting with dunedaq-v4.0.0, when we specify a hostname of "localhost" in a daqconf.json or dro_map.json file, that hostname is resolved at configuration time, using the name of the host on which the configuration is generated. This is handled by the code in the daqconf package, and it is done to prevent problems in situations in which some of the hosts are fully specified and some are simply listed as localhost. Such a mixed system can be problematic since the meaning of "localhost" will be different depending on when, and on which host, it is resolved. To prevent such problems, localhost is now fully resolved at configuration time.

This has ramifications that should be noted, however. Previously, when localhost-only system configurations were run with nanorc, the DAQ processes would be started on the host on which nanorc was run. With the new functionality, however, the DAQ processes that have a hostname of "localhost" will always be run on the computer on which the configuration was generated, independent of where nanorc is run.
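The effect of resolving "localhost" at configuration time can be illustrated with a short sketch. This is not the code used in the daqconf package: `resolve_localhost` is a hypothetical function, and the use of `socket.gethostname()` is an assumption about how the generating host might be identified.

```python
import copy
import socket


def resolve_localhost(dro_map: list, hostname: str = "") -> list:
    """Return a copy of dro_map with "localhost" host values replaced by hostname."""
    # Default to the name of the host on which this code runs, i.e. the
    # host on which the configuration would be generated.
    hostname = hostname or socket.gethostname()
    resolved = copy.deepcopy(dro_map)
    for entry in resolved:
        params = entry.get("parameters", {})
        for key in ("rx_host", "tx_host", "host"):
            if params.get(key) == "localhost":
                params[key] = hostname
    return resolved


if __name__ == "__main__":
    sample = [{"parameters": {"rx_host": "localhost", "tx_host": "localhost"}}]
    print(resolve_localhost(sample))
```

Once the substitution has been made, the generated configuration no longer contains "localhost", which is why the processes start on the generating host regardless of where nanorc is run.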

Useful DBT and Spack commands for software areas

The following is current as of the daq-buildtools v7.5.0 release (latest as of Jan-17-2024). These and other commands can be found in this section of the daq-buildtools documentation:

  • dbt-info release: tells you if it's a far detector or near detector release, what its name is (e.g. FD23-11-06), what the name of the base release is, and where the release is located in cvmfs.
  • dbt-info package <dunedaq_package_name>: tells you info about the DUNE DAQ package whose name you provide it (git commit hash of its code, etc.). Passing "all" as the package name gives you info for all the DUNE DAQ packages.
  • dbt-info sourcecode: will tell you the branch each of the repos in your work area is on, as well as whether the code on the branch has been edited (indicated by an *)
  • dbt-info external <external_package_name>: the same as the package option, except that you use it when you want info on an external package (e.g., boost) rather than a DUNE DAQ package
  • spack info fddaq # prints out the packages that are included in the fddaq bundle for the current software area
  • spack info dunedaq # prints out the packages that are included in the dunedaq (common) bundle for the current software area

Instructions for using the HDF5LIBS_TestDumpRecord utility

This utility can be used to print out information from the HDF5 raw data files. To invoke it use

  • HDF5LIBS_TestDumpRecord <filename>

Getting an overview of the HDF5 file structure

h5dump-shared -H <filename>

Dumping the binary content of a certain block from HDF5 file

This is another use of the h5dump-shared utility. This case uses the following command-line arguments:

  • the HDF5 path of the block we want to dump (-d <HDF5_path>)
  • the output binary file name (-o <output_file>)
  • the HDF5 file to be dumped

An example is:

h5dump-shared -d /TriggerRecord00001.0000/RawData/Detector_Readout_0x00000000_WIB -bLE -o dataset1.bin swtest_run002252_0000_dataflow0_datawriter_0_20221102T192809.hdf5

Once you have the binary file, you can examine it with tools like Linux od (octal dump), for example

od -x dataset1.bin
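As an alternative to od, the binary file can be inspected from Python. The sketch below prints the content as 32-bit little-endian words in hexadecimal. The word size is an assumption chosen for illustration, and the little-endian interpretation assumes the file was written with the -bLE option used in the example command above; `dump_words` is a hypothetical helper.

```python
import struct
import sys


def dump_words(data: bytes, words_per_line: int = 4) -> list:
    """Format data as little-endian 32-bit words in hex, one string per output line."""
    n_words = len(data) // 4  # trailing bytes that do not fill a word are ignored
    words = struct.unpack(f"<{n_words}I", data[: n_words * 4])
    lines = []
    for i in range(0, n_words, words_per_line):
        chunk = " ".join(f"{w:08x}" for w in words[i:i + words_per_line])
        lines.append(f"{i * 4:08x}  {chunk}")  # byte offset in hex, then the words
    return lines


if __name__ == "__main__":
    if len(sys.argv) > 1:  # e.g. python dump_words.py dataset1.bin
        with open(sys.argv[1], "rb") as f:
            print("\n".join(dump_words(f.read())))
```

Unlike od -x, which groups the data into 2-byte units, this view shows whole 32-bit words, which can make it easier to spot header fields.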

Sample integration tests

There are several integration tests available in the integtest directory of the daqsystemtest package. To run them, we suggest adding the daqsystemtest package to your software area (if not already done), cd sourcecode/daqsystemtest/integtest, and cat the README file to view the suggestions listed within it. (Those suggestions are along the lines of running a test with a command like pytest -s minimal_system_quick_test.py --nanorc-option partition-number <your_fav_num_1-9>.)

Monitoring the system

When running with nanorc, metrics reports appear in the info_*.json files that are produced (e.g. info_dataflow_<portno>.json). We can collate these, grouped by metric name, using python -m opmonlib.info_file_collator info_*.json (default output file is opmon_collated.json).

It is also possible to monitor the system using a graphical interface.

Notes on nanorc port offsets, including automatically-started ConnectivityService instances

From Pierre on 05-Apr-2023:

Steps to enable and view TRACE debug messages

Here are suggested steps for enabling and viewing debug messages in the TRACE memory buffer:

  • set up your software area, if needed (e.g. cd <work_dir>; source ./env.sh ; dbt-workarea-env)
  • export TRACE_FILE=$DBT_AREA_ROOT/log/${USER}_dunedaq.trace
    • this tells TRACE which file on disk to use for its memory buffer, and in this way, enables TRACE in your shell environment and in subsequent runs of the system with nanorc.
  • run the application using the nanorc commands described above
    • this populates the list of already-enabled TRACE levels so that you can view them in the next step
  • run tlvls
    • this command outputs a list of all the TRACE names that are currently known, and which levels are enabled for each name
    • TRACE names allow us to group related messages, and these names typically correspond to the name of the C++ source file
    • the bitmasks that are relevant for the TRACE memory buffer are the ones in the "maskM" column
  • enable levels with tonM -n <TRACE NAME> <level>
    • for example, tonM -n DataWriter DEBUG+5 (where "5" is the level that you see in the TLOG_DEBUG statement in the C++ code)
  • re-run tlvls to confirm that the expected level is now set
  • re-run the application
  • view the TRACE messages using tshow | tdelta -ct 1 | more
    • note that the messages are displayed in reverse time order

A couple of additional notes:

  • For debug statements in our code that look like TLOG_DEBUG(5) << "test, test";, we would enable the output of those messages using a shell command like tonM -n <TRACE_NAME> DEBUG+5. A couple of notes on this...
    • when we look at the output of the bitmasks with the tlvls command, bit #5 is going to be offset by the number of bits that TRACE and ERS reserve for ERROR, WARNING, INFO, etc. messages. At the moment, the offset appears to be 8, so the setting of bit "DEBUG+5" corresponds to setting bit #13.
    • when we view the messages with tshow, one of the columns in its output shows the level associated with the message (the column heading is abbreviated as "lvl"). Debug messages are prefaced with the letter "D", and they include the number that was specified in the C++ code. So, for our example of level 5, we would see "D05" in the tshow output for the "test, test" messages.
  • There are many other TRACE 'commands' that allow you to enable and disable messages. For example,
    • tonMg <level> enables the specified level for all TRACE names (the "g" means global in this context)
    • toffM -n <TRACE NAME> <level> disables the specified level for the specified TRACE name
    • toffMg <level> disables the specified level for all TRACE names
    • tlvlM -n <TRACE name> <mask> explicitly sets (and un-sets) the levels specified in the bitmask
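The relationship between a DEBUG+N level and the bit shown by tlvls can be written out as a small calculation. This sketch takes the offset of 8 from the note above; as that note says, the value is only what the offset appears to be at the moment, so it is kept as a parameter. The function names are hypothetical.

```python
TRACE_DEBUG_OFFSET = 8  # offset quoted in the notes above; may differ between TRACE versions


def debug_bit(level: int, offset: int = TRACE_DEBUG_OFFSET) -> int:
    """Bit number in the tlvls mask that corresponds to DEBUG+<level>."""
    return offset + level


def debug_mask(level: int, offset: int = TRACE_DEBUG_OFFSET) -> int:
    """Bitmask with only the DEBUG+<level> bit set."""
    return 1 << debug_bit(level, offset)


def is_enabled(mask: int, level: int, offset: int = TRACE_DEBUG_OFFSET) -> bool:
    """True if the DEBUG+<level> bit is set in a mask value, e.g. one read from the maskM column."""
    return bool(mask & debug_mask(level, offset))


if __name__ == "__main__":
    print(debug_bit(5), hex(debug_mask(5)))  # prints: 13 0x2000
```

For the TLOG_DEBUG(5) example, this means that bit 13 (hex 0x2000) should be set in the maskM value once tonM -n <TRACE_NAME> DEBUG+5 has been run.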