Instructions for setting up a development software area - DUNE-DAQ/daqconf Wiki

23-Jun-2022 - Work in progress (the following steps have been verified to work)

Here are some reference links: NP04 computer inventory, Working Group task lists, DUNE DAQ Software Style Guide, v3.0.0 tag collector, Github Project(Beta) View of v3.0.0 Issues, v3.0.0 Test Spreadsheet, software package tagging instructions.

  1. create a new software area based on the 23-Jun nightly build (see step 1.4 for the exact dbt-create command to use)
    1. The steps for this are based on the latest instructions for daq-buildtools
    2. As always, you should verify that your computer has access to /cvmfs/
    3. If you are using one of the np04daq computers, enable the web proxy:
      source ~np04daq/bin/
    4. Here are the steps for creating the new software area:
      cd <directory_above_where_you_want_the_new_software_area>
      source /cvmfs/
      setup_dbt latest-spack
      dbt-create -c -n N22-06-23 <work_dir>  # works for both C7 and CS8
      cd <work_dir>
    5. Please note that if you are following these instructions on a computer on which the DUNE-DAQ software has never been run before, there are several system packages that may need to be installed on that computer. These are mentioned in this script. To check whether a particular one is already installed, you can use a command like yum list libzstd and check whether the package is listed under Installed Packages.
  2. add any desired repositories to the /sourcecode area. An example is provided here.
    1. clone the repositories (the following block has some extra directory checking; it can all be copy/pasted into your shell window)
      # change directory to the "sourcecode" subdir, if possible and needed
      if [[ -d "sourcecode" ]]; then
          cd sourcecode
      fi
      # double-check that we're in the correct subdir
      current_subdir=$(basename "${PWD}")
      if [[ "$current_subdir" != "sourcecode" ]]; then
          echo ""
          echo "*** Current working directory is not \"sourcecode\", skipping repo clones"
      else
          # finally, do the repo clone(s)
          git clone -b develop
          git clone -b develop
          cd ..
      fi
  3. set up the work area and build the software
    git clone -b develop
    cd nanorc; pip install -U .; cd ..
    dbt-build -j 20
  4. download a raw data file (CERNBox link) and put it into ./ (if you put the data anywhere else you'll need to specify that location when you run the confgen scripts below).
    • e.g. curl -o frames.bin -O
  5. daqconf_multiru_gen --host-ru localhost <other_options> <subdir name to use for generated config files>
    • e.g. daqconf_multiru_gen --host-ru localhost --latency-buffer-size 200000 -d ./frames.bin -o . -s 10 mdapp_5proc
  6. nanorc <config name> boot ${USER}-test init conf start <run number> wait 60 stop scrap terminate
    • e.g. nanorc mdapp_5proc boot ${USER}-test init conf start 111 wait 60 stop scrap terminate
    • or, you can simply invoke nanorc mdapp_5proc by itself and input the commands individually
    • 🔺Please Note:🔺 On the np04 DAQ cluster, the HTTP proxy must be disabled in order to get nanorc to run correctly. This can be done with source ~np04daq/bin/ -u (which runs unset HTTPS_PROXY; unset HTTP_PROXY; unset https_proxy; unset http_proxy).
    • 🔺Also Please Note:🔺 On lxplus, you may need to use the "--kerberos" option to nanorc in order to get the DAQ applications to boot (e.g. nanorc --kerberos <other options and arguments>).
  7. Where supported, you can generate a configuration with multiple processes by adding command-line options:
    1. multiple readout processes: --host-ru
      • daqconf_multiru_gen --host-ru localhost --host-ru localhost [...] <other_options> mdapp_Nproc
      • nanorc mdapp_Nproc ...
    2. multiple dataflow processes: --host-df
      • daqconf_multiru_gen --host-df localhost --host-df localhost [...] <other_options> mdapp_Nproc
      • nanorc mdapp_Nproc ...
  8. When you return to working with the software area after logging out, the steps that you'll need to redo are the following:
    • cd <work_dir>
    • source ./
    • dbt-workarea-env
    • dbt-build # if needed
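
For step 7 above, typing the same --host-ru or --host-df option many times is error-prone. The following is a minimal sketch of how the repeated options could be generated, assuming all processes run on the same host. The helper name repeat_host_option and the config name mdapp_3proc are hypothetical and are not part of daqconf. The sketch only prints the resulting command, so it can be inspected before it is run.

```shell
# Build a string that repeats "<option> <host>" <count> times.
# repeat_host_option is a hypothetical helper, not part of daqconf.
repeat_host_option() {
    option="$1"   # e.g. --host-ru or --host-df
    host="$2"     # e.g. localhost
    count="$3"    # number of processes wanted
    result=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Append with a separating space, except for the first entry.
        result="${result:+$result }$option $host"
        i=$((i + 1))
    done
    printf '%s\n' "$result"
}

# Print (rather than run) a confgen command for three readout processes.
echo "daqconf_multiru_gen $(repeat_host_option --host-ru localhost 3) mdapp_3proc"
```

Any other options from step 5 would still need to be added before the config name.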

Instructions for using the script

This script can be used to print out information from the HDF5 raw data files. To invoke it use

  • -f <filename> -p all

To see the list of available command-line options to the script use

  • -h

Dumping the binary content of a certain block from HDF5 file

A dedicated script, h5dump-shared, writes the selected block to a binary file. It takes the following inputs:

  • the path of the block to dump (-d)
  • the binary output format (-b; LE, as used in the example, selects little-endian)
  • the output binary file name (-o)
  • the HDF5 file to be dumped

An example is:

h5dump-shared -d TriggerRecord00001/TPC/APA000/Link00 -bLE -o old.bin swtest_run000500_0000_eflumerf_20210512T133557.hdf5 

Sample integration tests

There are a few integration tests available in the integtest directory of the dfmodules package. To run them, we suggest adding the dfmodules package to your software area, rebuilding the area, changing to the sourcecode/dfmodules/integtest directory, and reading the README file there. (Its suggestions are along the lines of downloading an appropriate input data file and running a test with a command like pytest -s --frame-file $PWD/frames.bin.)

Monitoring the system

When running with nanorc, metrics reports appear in the info_*.json files that are produced (e.g. info_dataflow_<portno>.json). We can collate these, grouped by metric name, using python -m opmonlib.info_file_collator info_*.json (default output file is opmon_collated.json).
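
When no info_*.json files are present, the shell passes the unexpanded pattern to the collator. The following sketch wraps the collation command from the paragraph above with a check for that case. The function name collate_opmon_info is hypothetical, and the sketch assumes the files sit directly in the given directory.

```shell
# Run the opmonlib collator only if at least one info_*.json file exists.
# collate_opmon_info is a hypothetical wrapper, not part of opmonlib.
collate_opmon_info() {
    dir="${1:-.}"
    # Without a match the glob stays literal, so check that the first entry exists.
    set -- "$dir"/info_*.json
    if [ ! -e "$1" ]; then
        echo "no info_*.json files found in $dir" >&2
        return 1
    fi
    python -m opmonlib.info_file_collator "$@"
}

# Usage (from the directory in which nanorc was run):
#   collate_opmon_info .
```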

It is also possible to monitor the system using a graphical interface.

Steps to enable and view TRACE debug messages

Here are suggested steps for enabling and viewing debug messages in the TRACE memory buffer:

  • set up your software area, if needed (e.g. cd <work_dir>; source ./ ; dbt-workarea-env)
  • export TRACE_FILE=$DBT_AREA_ROOT/log/${USER}_dunedaq.trace
    • this tells TRACE which file on disk to use for its memory buffer, and in this way, enables TRACE in your shell environment and in subsequent runs of the system with nanorc.
  • run the application using the nanorc commands described above
    • this populates the list of already-enabled TRACE levels so that you can view them in the next step
  • run tlvls
    • this command outputs a list of all the TRACE names that are currently known, and which levels are enabled for each name
    • TRACE names allow us to group related messages, and these names typically correspond to the name of the C++ source file
    • the bitmasks that are relevant for the TRACE memory buffer are the ones in the "maskM" column
  • enable levels with tonM -n <TRACE NAME> <level>
    • for example, tonM -n DataWriter DEBUG+5 (where "5" is the level that you see in the TLOG_DEBUG statement in the C++ code)
  • re-run tlvls to confirm that the expected level is now set
  • re-run the application
  • view the TRACE messages using tshow | tdelta -ct 1 | more
    • note that the messages are displayed in reverse time order

A couple of additional notes:

  • For debug statements in our code that look like TLOG_DEBUG(5) << "test, test";, we would enable the output of those messages using a shell command like tonM -n <TRACE_NAME> DEBUG+5. Two details are worth knowing:
    • when we look at the output of the bitmasks with the tlvls command, bit #5 is going to be offset by the number of bits that TRACE and ERS reserve for ERROR, WARNING, INFO, etc. messages. At the moment, the offset appears to be 8, so the setting of bit "DEBUG+5" corresponds to setting bit #13.
    • when we view the messages with tshow, one of the columns in its output shows the level associated with the message (the column heading is abbreviated as "lvl"). Debug messages are prefaced with the letter "D", and they include the number that was specified in the C++ code. So, for our example of level 5, we would see "D05" in the tshow output for the "test, test" messages.
  • There are many other TRACE 'commands' that allow you to enable and disable messages. For example,
    • tonMg <level> enables the specified level for all TRACE names (the "g" means global in this context)
    • toffM -n <TRACE NAME> <level> disables the specified level for the specified TRACE name
    • toffMg <level> disables the specified level for all TRACE names
    • tlvlM -n <TRACE name> <mask> explicitly sets (and un-sets) the levels specified in the bitmask
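
The bit arithmetic described in the notes above can be sketched in a few lines of shell. The offset of 8 is the value the notes report "at the moment" and is treated here as an assumption. The helper names debug_level_to_bit and debug_level_to_mask are hypothetical and are not TRACE commands.

```shell
# Number of low bits reserved ahead of DEBUG+0; assumed to be 8, as noted above.
TRACE_DEBUG_OFFSET=8

# Bit position that corresponds to DEBUG+<level> (hypothetical helper).
debug_level_to_bit() {
    echo $(( TRACE_DEBUG_OFFSET + $1 ))
}

# Bitmask with only that bit set, printed in hexadecimal (hypothetical helper).
debug_level_to_mask() {
    printf '0x%x\n' $(( 1 << (TRACE_DEBUG_OFFSET + $1) ))
}

debug_level_to_bit 5    # prints 13
debug_level_to_mask 5   # prints 0x2000
```

A value like this may help when reading the "maskM" column of the tlvls output, provided the offset assumption holds for the TRACE version in use.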

Persistent trouble delivering commands to processes on the NP04 DAQ cluster

This is likely because the HTTP proxy environment variables have not been unset. Please try

  • source ~np04daq/bin/ -u (which runs unset HTTPS_PROXY; unset HTTP_PROXY; unset https_proxy; unset http_proxy)

and remember to re-set these env vars when you next want to interact with systems on the wider web, e.g. GitHub.
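
To check whether the proxy is still enabled in the current shell, a sketch along the following lines may help. It only touches the four variables named above. The function names show_proxy_vars and clear_proxy_vars are hypothetical and do not replace the np04daq script.

```shell
# Report which of the four proxy variables are currently set (hypothetical helper).
show_proxy_vars() {
    for name in HTTPS_PROXY HTTP_PROXY https_proxy http_proxy; do
        # Indirect lookup of the variable named in $name; empty if unset.
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            echo "$name is set"
        fi
    done
}

# Unset the same four variables that the text above lists (hypothetical helper).
clear_proxy_vars() {
    unset HTTPS_PROXY HTTP_PROXY https_proxy http_proxy
}

clear_proxy_vars
show_proxy_vars   # prints nothing once all four variables are unset
```

Because the functions change the environment of the current shell, they need to be defined (or sourced) in the same shell from which nanorc is started.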

⚠️ ** Fallback** ⚠️