Instructions for setting up a v3.1.0 software area

28-Jul-2022 - Work in progress (the following steps have been verified to work)

Reference links:

  1. create a new software area based on the most recent dunedaq-v3.1.0 candidate release (see step 1.4 for the exact dbt-create command to use)
    1. The steps for this are based on the latest instructions for daq-buildtools
    2. As always, you should verify that your computer has access to /cvmfs/dunedaq.opensciencegrid.org
    3. If you are using one of the np04daq computers, enable the web proxy:
      source ~np04daq/bin/web_proxy.sh
      
    4. Here are the steps for creating the new software area:
      cd <directory_above_where_you_want_the_new_v3.1_software_area>
      source /cvmfs/dunedaq.opensciencegrid.org/setup_dunedaq.sh
      setup_dbt dunedaq-v3.1.0
      dbt-create -c dunedaq-v3.1.0 <work_dir>
      cd <work_dir>
      
    5. Please note that if you are following these instructions on a computer on which the MiniDAQ has never been run before, there are several system packages that may need to be installed on that computer. These are mentioned in this script. To check whether a particular one is already installed, you can use a command like yum list libzstd and check whether the package is listed under Installed Packages.
  2. add any desired repositories to the sourcecode subdirectory of the work area. An example is provided below.
    1. clone the repositories (the following block has some extra directory checking; it can all be copy/pasted into your shell window)
      # change directory to the "sourcecode" subdir, if possible and needed
      if [[ -d "sourcecode" ]]; then
          cd sourcecode
      fi
      # double-check that we're in the correct subdir
      current_subdir=$(basename "${PWD}")
      if [[ "$current_subdir" != "sourcecode" ]]; then
          echo ""
          echo "*** Current working directory is not \"sourcecode\", skipping repo clones"
      else
          # finally, do the repo clone(s)
          git clone https://github.com/DUNE-DAQ/daqconf.git -b dunedaq-v3.1.0
          cd ..
      fi
      
      
  3. set up the work area, install the correct version of nanorc, and build the software
    dbt-workarea-env
    dbt-build
    
    
  4. download a raw data file (CERNBox link) and put it into ./ (if you put the data anywhere else you'll need to specify that location when you run the confgen scripts below).
    • e.g. curl -o frames.bin https://cernbox.cern.ch/index.php/s/0XzhExSIMQJUsp0/download
  5. daqconf_multiru_gen --host-ru localhost <other_options> <subdir name to use for generated config files>
    • e.g. daqconf_multiru_gen --host-ru localhost --latency-buffer-size 200000 -d ./frames.bin -o . -s 10 mdapp_5proc
  6. nanorc <config name> ${USER}-test boot conf start_run <run number> wait 60 stop_run scrap terminate
    • e.g. nanorc mdapp_5proc ${USER}-test boot conf start_run 111 wait 60 stop_run scrap terminate
    • or, you can simply invoke nanorc mdapp_5proc by itself and input the commands individually
    • 🔺Please Note:🔺 On the np04 DAQ cluster, the HTTP proxy must be disabled in order to get nanorc to run correctly. This can be done with source ~np04daq/bin/web_proxy.sh -u (which runs unset HTTPS_PROXY; unset HTTP_PROXY; unset https_proxy; unset http_proxy).
    • 🔺Also Please Note:🔺 On lxplus, you may need to use the "--kerberos" option to nanorc in order to get the DAQ applications to boot (e.g. nanorc --kerberos <other options and arguments>).
  7. You can generate a configuration with multiple processes where supported by adding command-line options:
    1. multiple readout processes: --host-ru
      • daqconf_multiru_gen --host-ru localhost --host-ru localhost [...] <other_options> mdapp_Nproc
      • nanorc mdapp_Nproc ...
    2. multiple dataflow processes: --host-df
      • daqconf_multiru_gen --host-df localhost --host-df localhost [...] <other_options> mdapp_Nproc
      • nanorc mdapp_Nproc ...
  8. When you return to working with the software area after logging out, the steps that you'll need to redo are the following:
    • cd <work_dir>
    • source ./dbt-env.sh
    • dbt-workarea-env
    • dbt-build # if needed
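
For step 7, the repeated --host-ru or --host-df options can be generated rather than typed by hand. The following is a minimal sketch, not part of daqconf: the helper name repeat_host_flag and the config name mdapp_3proc are illustrative assumptions, and the command is only echoed so that it can be inspected first.

```shell
# Hypothetical helper (not part of daqconf): print "<flag> <host>" repeated <count> times.
# The body runs in a subshell so its variables do not leak into the calling shell.
repeat_host_flag() (
    flag=$1; host=$2; count=$3
    out=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Append "<flag> <host>", separated by a space after the first entry.
        out="${out:+$out }$flag $host"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
)

# Example: three readout processes on localhost. Only echoed here; add your
# other options and drop the "echo" to run it in a set-up work area.
echo daqconf_multiru_gen $(repeat_host_flag --host-ru localhost 3) mdapp_3proc
```

The same helper works for dataflow processes by passing --host-df instead of --host-ru.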

Instructions for using the hdf5_dump.py script

This script can be used to print out information from the HDF5 raw data files. To invoke it use

  • hdf5_dump.py -f <filename> -p all

To see the list of available command-line options to the script use

  • hdf5_dump.py -h

Dumping the binary content of a certain block from an HDF5 file

A dedicated tool, h5dump-shared, writes the selected block to a binary file. It takes the following inputs:

  • -d: the path of the block to dump
  • -o: the name of the output binary file
  • -b: the binary output format (-bLE in the example below selects little-endian)
  • the HDF5 file to read, given as the final argument

An example is:

h5dump-shared -d TriggerRecord00001/TPC/APA000/Link00 -bLE -o old.bin swtest_run000500_0000_eflumerf_20210512T133557.hdf5 
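
When dumping several blocks, the path passed to -d can be built from its numeric parts. This is a sketch under an assumption: the field widths (five digits for the trigger record, three for the APA, two for the link) are read off the single example above and may not hold for every file. The helper name dataset_path is hypothetical.

```shell
# Hypothetical helper: build a block path in the style of the example above.
# Assumption: zero-padded widths of 5 (record), 3 (APA) and 2 (link) digits.
# Pass plain decimal numbers without leading zeros (e.g. 8, not 08).
dataset_path() {
    printf 'TriggerRecord%05d/TPC/APA%03d/Link%02d\n' "$1" "$2" "$3"
}

# Example: print the path for trigger record 1, APA 0, link 0.
dataset_path 1 0 0
```

The result can then be passed along, e.g. h5dump-shared -d "$(dataset_path 1 0 0)" -bLE -o old.bin <filename>.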

Sample integration tests

There are a few integration tests available in the integtest directory of the dfmodules package. To run them, we suggest that you add the dfmodules package to your software area, rebuild the area, cd to sourcecode/dfmodules/integtest, and read the README file there (e.g. with cat) for its suggestions. Those suggestions are along the lines of downloading an appropriate input data file and running a test with a command like pytest -s minimal_system_quick_test.py --frame-file $PWD/frames.bin.

Monitoring the system

When running with nanorc, metrics reports appear in the info_*.json files that are produced (e.g. info_dataflow_<portno>.json). We can collate these, grouped by metric name, using python -m opmonlib.info_file_collator info_*.json (default output file is opmon_collated.json).

It is also possible to monitor the system using a graphical interface.

Steps to enable and view TRACE debug messages

Here are suggested steps for enabling and viewing debug messages in the TRACE memory buffer:

  • set up your software area, if needed (e.g. cd <work_dir>; source ./dbt-env.sh ; dbt-workarea-env)
  • export TRACE_FILE=$DBT_AREA_ROOT/log/${USER}_dunedaq.trace
    • this tells TRACE which file on disk to use for its memory buffer, and in this way, enables TRACE in your shell environment and in subsequent runs of the system with nanorc.
  • run the application using the nanorc commands described above
    • this populates the list of already-enabled TRACE levels so that you can view them in the next step
  • run tlvls
    • this command outputs a list of all the TRACE names that are currently known, and which levels are enabled for each name
    • TRACE names allow us to group related messages, and these names typically correspond to the name of the C++ source file
    • the bitmasks that are relevant for the TRACE memory buffer are the ones in the "maskM" column
  • enable levels with tonM -n <TRACE NAME> <level>
    • for example, tonM -n DataWriter DEBUG+5 (where "5" is the level that you see in the TLOG_DEBUG statement in the C++ code)
  • re-run tlvls to confirm that the expected level is now set
  • re-run the application
  • view the TRACE messages using tshow | tdelta -ct 1 | more
    • note that the messages are displayed in reverse time order

A couple of additional notes:

  • For debug statements in our code that look like TLOG_DEBUG(5) << "test, test";, we would enable the output of those messages using a shell command like tonM -n <TRACE_NAME> DEBUG+5. A couple of notes on this...
    • when we look at the output of the bitmasks with the tlvls command, bit #5 is going to be offset by the number of bits that TRACE and ERS reserve for ERROR, WARNING, INFO, etc. messages. At the moment, the offset appears to be 8, so the setting of bit "DEBUG+5" corresponds to setting bit #13.
    • when we view the messages with tshow, one of the columns in its output shows the level associated with the message (the column heading is abbreviated as "lvl"). Debug messages are prefaced with the letter "D", and they include the number that was specified in the C++ code. So, for our example of level 5, we would see "D05" in the tshow output for the "test, test" messages.
  • There are many other TRACE 'commands' that allow you to enable and disable messages. For example,
    • tonMg <level> enables the specified level for all TRACE names (the "g" means global in this context)
    • toffM -n <TRACE NAME> <level> disables the specified level for the specified TRACE name
    • toffMg <level> disables the specified level for all TRACE names
    • tlvlM -n <TRACE name> <mask> explicitly sets (and un-sets) the levels specified in the bitmask
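
The mapping from a DEBUG level to a bit in the tlvls output can be worked out with shell arithmetic. This sketch assumes the offset of 8 mentioned above, which the text itself describes as only apparent, so it may differ in other releases. The function names are hypothetical.

```shell
# Offset between DEBUG+0 and bit 0 of the mask. Assumption taken from the note
# above ("the offset appears to be 8"); adjust it if tlvls shows otherwise.
TRACE_DEBUG_OFFSET=8

# Print the bit number that corresponds to DEBUG+<level>.
debug_bit() {
    echo $(( $1 + TRACE_DEBUG_OFFSET ))
}

# Print the single-bit mask (in hex) that corresponds to DEBUG+<level>.
debug_mask() {
    printf '0x%x\n' $(( 1 << ($1 + TRACE_DEBUG_OFFSET) ))
}

debug_bit 5    # prints 13
debug_mask 5   # prints 0x2000
```

With these values, an enabled DEBUG+5 level should show up as bit 13 (0x2000) in the "maskM" column for the chosen TRACE name.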

Persistent trouble delivering commands to processes on the NP04 DAQ cluster

This is likely because the HTTP proxy environment variables have not been unset. Please try

  • source ~np04daq/bin/web_proxy.sh -u (which runs unset HTTPS_PROXY; unset HTTP_PROXY; unset https_proxy; unset http_proxy)

and remember to re-set these env vars when you next want to interact with systems on the wider web, e.g. GitHub.
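
To check whether any of the four proxy variables named above is still set before running nanorc, a small check like the following may help. It is a sketch only: the function name proxy_vars_set is hypothetical, and a variable that is set to an empty string is treated as not set.

```shell
# Hypothetical helper: print the name of each proxy variable that currently
# holds a non-empty value. Prints nothing when all four are unset or empty.
# The body runs in a subshell so its loop variables do not leak.
proxy_vars_set() (
    for name in HTTPS_PROXY HTTP_PROXY https_proxy http_proxy; do
        # Indirect lookup of the variable called $name (fixed list, so eval is safe).
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            printf '%s\n' "$name"
        fi
    done
)

# Example: report the state of the current shell.
if [ -n "$(proxy_vars_set)" ]; then
    echo "proxy variables still set:" $(proxy_vars_set)
else
    echo "no proxy variables set"
fi
```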
