Setting up a fddaq‐v4.1.1 software area - DUNE-DAQ/daqconf GitHub Wiki
19-Sep-2023
Reference information:
- the fddaq-v4.1.1 Far Detector software release is based on the dunedaq-v4.1.1 release of the common DAQ software packages.
More reference information:
- the list of the FD software package versions that are included in the release is available here
- the list of the common software package versions that are included in the release is available here
- suggested Spack commands to learn about the characteristics of an existing software area are available as part of the daq-buildtools documentation here
- the Tag Collector spreadsheet that was used for this release is here
- the test-tracking spreadsheet that was used for the fddaq-v4.1.0 release is here
The steps for creating and using the software area:
- create a new software area based on the v4.1.1 release build (see step 1.iv for the exact dbt-create command to use)
  - The steps for this are based on the latest instructions for daq-buildtools
  - As always, you should verify that your computer has access to /cvmfs/dunedaq.opensciencegrid.org
  - If you are using one of the np04daq computers and need to clone packages, add the following lines to your .gitconfig file (this avoids activating the proxy globally, so there is nothing to forget to disable afterwards):
[http]
    proxy = http://np04-web-proxy.cern.ch:3128
    sslVerify = false
  - Here are the steps for creating the new software area:
cd <directory_above_where_you_want_the_new_software_area>
source /cvmfs/dunedaq.opensciencegrid.org/setup_dunedaq.sh
setup_dbt fddaq-v4.1.1
dbt-create fddaq-v4.1.1 <work_dir>  # use optional "-c" argument to clone pyvenv in work area
cd <work_dir>
  - Please note that if you are following these instructions on a computer on which the DUNE-DAQ software has never been run before, there are several system packages that may need to be installed on that computer. These are mentioned in this script. To check whether a particular one is already installed, you can use a command like
yum list libzstd
and check whether the package is listed under Installed Packages.
- add any desired repositories to the /sourcecode area. An example is provided here.
  - clone the repositories (the following block has some extra directory checking; it can all be copy/pasted into your shell window)
# change directory to the "sourcecode" subdir, if possible and needed
if [[ -d "sourcecode" ]]; then
    cd sourcecode
fi
# double-check that we're in the correct subdir
current_subdir=`echo ${PWD} | xargs basename`
if [[ "$current_subdir" != "sourcecode" ]]; then
    echo ""
    echo "*** Current working directory is not \"sourcecode\", skipping repo clones"
else
    # finally, do the repo clone(s)
    git clone https://github.com/DUNE-DAQ/daqconf.git -b dunedaq-v4.1.1
    git clone https://github.com/DUNE-DAQ/daq-systemtest.git -b dunedaq-v4.1.1
    git clone https://github.com/DUNE-DAQ/dfmodules.git -b dunedaq-v4.1.1
    cd ..
fi
- set up the work area and build the software
dbt-workarea-env
dbt-build -j 20
dbt-workarea-env
- prepare a daqconf.json file, such as the one shown here. This sample includes parameter values that select the WIBEth data type. (Please note the additional comments on this sample file that are included below!)
{
  "boot": {
    "use_connectivity_service": true,
    "start_connectivity_service": true,
    "connectivity_service_host": "localhost",
    "connectivity_service_port": 15432
  },
  "daq_common": { "data_rate_slowdown_factor": 1 },
  "detector": { "clock_speed_hz": 62500000 },
  "readout": {
    "use_fake_cards": true,
    "default_data_file": "asset://?label=WIBEth&subsystem=readout"
  },
  "trigger": {
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": { "random_trigger_rate_hz": 1.0 }
}
A few notes on the sample file shown above:
- The "use/start_connectivity_service" parameters aren't strictly needed, since their default value is "true". The same applies to "connectivity_service_host/port". However, all of these are included so that people can use them for reference.
- A port offset is applied to the "connectivity_service_port" by nanorc, so we don't all need to use different numbers, as long as we use different partition numbers when running nanorc (e.g. 'nanorc --partition-number 2 ...').
- If you want to use an existing, externally-started Connectivity Service instance, such as the one on the np04 cluster, you would set "use_connectivity_service" to true, and "start_connectivity_service" to false.
Another option (the initial config, but with the Connectivity Service disabled)
{
  "boot": {
    "use_connectivity_service": false,
    "start_connectivity_service": false
  },
  "daq_common": { "data_rate_slowdown_factor": 1 },
  "detector": { "clock_speed_hz": 62500000 },
  "readout": {
    "use_fake_cards": true,
    "default_data_file": "asset://?label=WIBEth&subsystem=readout"
  },
  "trigger": {
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": { "random_trigger_rate_hz": 1.0 }
}
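The two variants above differ only in the "boot" section, so they can be produced from one template. The following is a minimal sketch, not part of daqconf: the function names `make_daqconf` and `write_daqconf` are hypothetical, and the only keys and values used are the ones shown in the samples above.

```python
import json


def make_daqconf(use_connsvc=True, start_connsvc=True):
    """Return a dict mirroring the sample daqconf.json files shown above.

    Hypothetical helper for illustration; only keys from the samples are used.
    """
    boot = {
        "use_connectivity_service": use_connsvc,
        "start_connectivity_service": start_connsvc,
    }
    if use_connsvc:
        # The samples only list host/port when the Connectivity Service is used.
        boot["connectivity_service_host"] = "localhost"
        boot["connectivity_service_port"] = 15432
    return {
        "boot": boot,
        "daq_common": {"data_rate_slowdown_factor": 1},
        "detector": {"clock_speed_hz": 62500000},
        "readout": {
            "use_fake_cards": True,
            "default_data_file": "asset://?label=WIBEth&subsystem=readout",
        },
        "trigger": {
            "trigger_window_before_ticks": 1000,
            "trigger_window_after_ticks": 1000,
        },
        "hsi": {"random_trigger_rate_hz": 1.0},
    }


def write_daqconf(path, **kwargs):
    """Write the generated configuration to `path` as indented JSON."""
    with open(path, "w") as f:
        json.dump(make_daqconf(**kwargs), f, indent=2)
        f.write("\n")
```

For example, `write_daqconf("daqconf.json", use_connsvc=True, start_connsvc=False)` would correspond to the case of an externally-started Connectivity Service instance.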
- prepare a data-readout map file (e.g. my_dro_map.json), listing the detector streams (real or fake) that you want to run with, e.g.:
[
  {
    "src_id": 100,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 0 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp",
      "mode": "fix_rate",
      "rx_iface": 0,
      "rx_host": "localhost",
      "rx_mac": "00:00:00:00:00:00",
      "rx_ip": "0.0.0.0",
      "tx_host": "localhost",
      "tx_mac": "00:00:00:00:00:00",
      "tx_ip": "0.0.0.0"
    }
  },
  {
    "src_id": 101,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 1 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp",
      "mode": "fix_rate",
      "rx_iface": 0,
      "rx_host": "localhost",
      "rx_mac": "00:00:00:00:00:00",
      "rx_ip": "0.0.0.0",
      "tx_host": "localhost",
      "tx_mac": "00:00:00:00:00:00",
      "tx_ip": "0.0.0.0"
    }
  }
]
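Since the entries in the map above differ only in "src_id" and "stream_id", a map with more emulated streams can be generated rather than typed by hand. This is a sketch under stated assumptions: `make_eth_dro_map` is a hypothetical helper (not a daqconf tool), it copies the field values from the sample above, and whether a given number of streams per slot is accepted by the release has not been verified here.

```python
import json


def make_eth_dro_map(n_streams, det_id=3, crate_id=1, slot_id=0, first_src_id=100):
    """Build a list of 'eth' readout-map entries like the two-stream sample above.

    Hypothetical helper: src_id and stream_id increase together, all other
    values are copied from the sample file.
    """
    entries = []
    for stream_id in range(n_streams):
        entries.append({
            "src_id": first_src_id + stream_id,
            "geo_id": {
                "det_id": det_id,
                "crate_id": crate_id,
                "slot_id": slot_id,
                "stream_id": stream_id,
            },
            "kind": "eth",
            "parameters": {
                "protocol": "udp",
                "mode": "fix_rate",
                "rx_iface": 0,
                "rx_host": "localhost",
                "rx_mac": "00:00:00:00:00:00",
                "rx_ip": "0.0.0.0",
                "tx_host": "localhost",
                "tx_mac": "00:00:00:00:00:00",
                "tx_ip": "0.0.0.0",
            },
        })
    return entries


def write_dro_map(path, entries):
    """Write the map to `path` as indented JSON."""
    with open(path, "w") as f:
        json.dump(entries, f, indent=2)
        f.write("\n")
```

Calling `write_dro_map("my_dro_map.json", make_eth_dro_map(2))` should reproduce the two-stream sample shown above.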
- Generate a configuration, e.g.:
daqconf_multiru_gen -c ./daqconf.json --detector-readout-map-file ./my_dro_map.json my_test_config
- nanorc <config name> <partition name> boot conf start_run <run number> wait 60 stop_run scrap terminate
  - e.g. nanorc my_test_config ${USER}-test boot conf start_run 111 wait 60 stop_run scrap terminate
  - or, you can simply invoke nanorc my_test_config ${USER}-test by itself and input the commands individually
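When scripting repeated test runs, it can help to assemble the nanorc command line shown above from its parts. The sketch below only builds the argument list; `nanorc_command` is a hypothetical helper, and the only nanorc arguments it uses are the ones that appear on this page (the positional config and partition names, the command sequence, and `--partition-number`).

```python
import shlex


def nanorc_command(config, partition, run_number, wait_s=60, partition_number=None):
    """Return the nanorc invocation from this page as an argument list.

    Hypothetical helper for illustration; it does not run anything.
    """
    cmd = ["nanorc"]
    if partition_number is not None:
        # Option placed before the positional arguments, as in
        # 'nanorc --partition-number 2 ...' quoted earlier on this page.
        cmd += ["--partition-number", str(partition_number)]
    cmd += [
        config, partition,
        "boot", "conf",
        "start_run", str(run_number),
        "wait", str(wait_s),
        "stop_run", "scrap", "terminate",
    ]
    return cmd


# Print a copy/paste-ready command line.
print(shlex.join(nanorc_command("my_test_config", "myuser-test", 111)))
```

The resulting list can be passed to `subprocess.run` from within a shell in which the software area has already been set up.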
- When you return to working with the software area after logging out, the steps that you'll need to redo are the following:
  - cd <work_dir>
  - source ./env.sh
  - dbt-build  # if needed
  - dbt-workarea-env  # if needed
- For reference, here are daqconf.json and dro_map.json files for emulated DuneWIB electronics
Sample daqconf.json for DuneWIB
{
  "boot": {
    "use_connectivity_service": true,
    "start_connectivity_service": true,
    "connectivity_service_host": "localhost",
    "connectivity_service_port": 15432
  },
  "daq_common": { "data_rate_slowdown_factor": 10 },
  "detector": { "clock_speed_hz": 62500000 },
  "readout": {
    "use_fake_cards": true,
    "data_files": [
      { "detector_id": 3, "data_file": "asset://?label=DuneWIB&subsystem=readout" }
    ]
  },
  "trigger": {
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": { "random_trigger_rate_hz": 1.0 }
}
Another option, with DuneWIB, Trigger Primitive generation enabled, and multiple Dataflow apps
{
  "boot": {
    "use_connectivity_service": true,
    "start_connectivity_service": true,
    "connectivity_service_host": "localhost",
    "connectivity_service_port": 15432
  },
  "dataflow": {
    "enable_tpset_writing": true,
    "apps": [
      { "app_name": "dataflow0" },
      { "app_name": "dataflow1" }
    ]
  },
  "daq_common": { "data_rate_slowdown_factor": 10 },
  "detector": { "clock_speed_hz": 62500000 },
  "readout": {
    "enable_tpg": true,
    "tpg_threshold": 500,
    "use_fake_cards": true,
    "data_files": [
      { "detector_id": 3, "data_file": "asset://?label=DuneWIB&subsystem=readout" }
    ]
  },
  "trigger": {
    "trigger_activity_config": { "prescale": 1000 },
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": { "random_trigger_rate_hz": 1.0 }
}
Sample dro_map.json for DuneWIB
[
  {
    "src_id": 100,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 0 },
    "kind": "flx",
    "parameters": {
      "protocol": "full",
      "mode": "fix_rate",
      "host": "localhost",
      "card": 0,
      "slr": 0,
      "link": 0
    }
  },
  {
    "src_id": 101,
    "geo_id": { "det_id": 3, "crate_id": 1, "slot_id": 0, "stream_id": 1 },
    "kind": "flx",
    "parameters": {
      "protocol": "full",
      "mode": "fix_rate",
      "host": "localhost",
      "card": 0,
      "slr": 0,
      "link": 1
    }
  }
]
An example hardware map file from the Vertical Drift Coldbox can be found here.
- For reference, here are daqconf.json and dro_map.json files for VD TDE (vertical drift, top detector electronics)
Sample daqconf.json for VD TDE
{
  "boot": {
    "use_connectivity_service": true,
    "start_connectivity_service": true,
    "connectivity_service_host": "localhost",
    "connectivity_service_port": 15432
  },
  "daq_common": { "data_rate_slowdown_factor": 1 },
  "detector": { "clock_speed_hz": 62500000 },
  "readout": {
    "use_fake_cards": true,
    "default_data_file": "asset://?checksum=759e5351436bead208cf4963932d6327"
  },
  "trigger": {
    "trigger_window_before_ticks": 1000,
    "trigger_window_after_ticks": 1000
  },
  "hsi": { "random_trigger_rate_hz": 1.0 }
}
Sample dro_map.json for VD TDE
[
  {
    "src_id": 100,
    "geo_id": { "det_id": 11, "crate_id": 1, "slot_id": 0, "stream_id": 0 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp",
      "mode": "fix_rate",
      "rx_iface": 0,
      "rx_host": "localhost",
      "rx_mac": "00:00:00:00:00:00",
      "rx_ip": "0.0.0.0",
      "tx_host": "localhost",
      "tx_mac": "00:00:00:00:00:00",
      "tx_ip": "0.0.0.0"
    }
  },
  {
    "src_id": 101,
    "geo_id": { "det_id": 11, "crate_id": 1, "slot_id": 1, "stream_id": 1 },
    "kind": "eth",
    "parameters": {
      "protocol": "udp",
      "mode": "fix_rate",
      "rx_iface": 0,
      "rx_host": "localhost",
      "rx_mac": "00:00:00:00:00:00",
      "rx_ip": "0.0.0.0",
      "tx_host": "localhost",
      "tx_mac": "00:00:00:00:00:00",
      "tx_ip": "0.0.0.0"
    }
  }
]
Starting with dunedaq-v4.0.0, when we specify a hostname of "localhost" in a daqconf.json or dro_map.json file, that hostname is resolved at configuration time, using the name of the host on which the configuration is generated. This is handled by the code in the daqconf package, and it is done to prevent problems in situations in which some of the hosts are fully specified and some are simply listed as localhost. Such a mixed system can be problematic, since the meaning of "localhost" differs depending on when, and on which host, it is resolved. To prevent such problems, localhost is now fully resolved at configuration time.
This has ramifications that should be noted, however. Previously, when localhost-only system configurations were run with nanorc, the DAQ processes would be started on the host on which nanorc was run. With the new functionality, the DAQ processes that had a hostname of "localhost" will always be run on the computer on which the configuration was generated, independent of where nanorc is run.
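The effect of resolving "localhost" at configuration time can be illustrated with a small sketch. This is not the daqconf implementation: `resolve_localhost` is a hypothetical function, the host names are made up, and the use of `socket.gethostname()` as the source of the generating host's name is an assumption for illustration.

```python
import socket


def resolve_localhost(hostname, config_host=None):
    """Replace 'localhost' with the name of the host generating the config.

    Hypothetical illustration of the behavior described above; fully
    specified host names pass through unchanged.
    """
    if config_host is None:
        # Assumption: the generating host is identified by its own hostname.
        config_host = socket.gethostname()
    return config_host if hostname == "localhost" else hostname


# A mixed configuration: one fully specified host (made-up name) and one localhost.
hosts = ["localhost", "some-readout-host"]
resolved = [resolve_localhost(h, config_host="config-gen-host") for h in hosts]
print(resolved)
```

After this step no entry depends on where nanorc is later started, which is why processes listed as "localhost" end up on the machine where the configuration was generated.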
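The HDF5LIBS_TestDumpRecord utility can be used to print out information from the HDF5 raw data files. To invoke it, use
HDF5LIBS_TestDumpRecord <filename>
To print only the header information of an HDF5 file (its structure, without the data), use
h5dump-shared -H <filename>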
This is another use of the h5dump-shared utility, in which the binary content of a single block is written to a file. This case uses the following command-line arguments:
- the HDF5 path of the block we want to dump (-d <path>)
- the binary output format (-bLE in the example below, which selects little-endian)
- the output binary file name (-o <output_file>)
- the HDF5 file to be dumped
An example is:
h5dump-shared -d /TriggerRecord00001.0000/RawData/Detector_Readout_0x00000000_WIB -bLE -o dataset1.bin swtest_run002252_0000_dataflow0_datawriter_0_20221102T192809.hdf5
Once you have the binary file, you can examine it with tools like Linux od (octal dump), for example
od -x dataset1.bin
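If the dumped file needs to be inspected from a script rather than with od, the same kind of view can be produced in a few lines. This is a rough sketch, assuming little-endian 16-bit words (which is what `od -x` shows on a little-endian host); `hexdump_words` is a hypothetical helper and its output is only intended to resemble that of od, not to match it in every detail.

```python
import struct


def hexdump_words(data, words_per_line=8):
    """Format bytes as little-endian 16-bit hex words with octal offsets."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input so it splits into 2-byte words
    words = struct.unpack("<%dH" % (len(data) // 2), data)
    lines = []
    for i in range(0, len(words), words_per_line):
        offset = i * 2  # byte offset of the first word on this line
        chunk = " ".join("%04x" % w for w in words[i:i + words_per_line])
        lines.append("%07o %s" % (offset, chunk))
    return lines


# Small in-memory example; for a real file, read it with open(path, "rb").
for line in hexdump_words(bytes(range(20))):
    print(line)
```

For the file produced above, the input would be the contents of dataset1.bin read in binary mode.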
There are a few integration tests available in the integtest directory of the dfmodules package. To run them, we suggest adding the dfmodules package to your software area, rebuilding your area, changing to the sourcecode/dfmodules/integtest directory, and reading the README file there for suggestions. (Those suggestions are along the lines of running a test with a command like pytest -s minimal_system_quick_test.py --nanorc-option partition-number <your_fav_num_1-9>.)
When running with nanorc, metrics reports appear in the info_*.json files that are produced (e.g. info_dataflow_<portno>.json). We can collate these, grouped by metric name, using python -m opmonlib.info_file_collator info_*.json (the default output file is opmon_collated.json).
It is also possible to monitor the system using a graphical interface.
Port offsets applied by the run-control tools (from Pierre on 05-Apr-2023):
- for nano04rc: port_offset = 0 + partition_number * 500 https://github.com/DUNE-DAQ/nanorc/blob/develop/src/nanorc/__main_np04__.py#LL77C6-L77C45
- for nanorc: port_offset = 0 + partition_number * 500 https://github.com/DUNE-DAQ/nanorc/blob/develop/src/nanorc/cli.py#L108
- for nanotimingrc: port_offset = 300 + partition_number * 500 https://github.com/DUNE-DAQ/nanorc/blob/develop/src/nanorc/__main_timing__.py#L69
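The three formulas above can be captured in a few lines, which makes it easy to see which ports different partition numbers would use. In this sketch the base offsets and the factor of 500 come from the list above; the function names are hypothetical, and the assumption that the offset is simply added to the configured "connectivity_service_port" is ours and should be checked against the linked nanorc code.

```python
# Base offsets per tool and the per-partition stride, as listed above.
BASE_OFFSETS = {"nano04rc": 0, "nanorc": 0, "nanotimingrc": 300}
PARTITION_STRIDE = 500


def port_offset(tool, partition_number):
    """port_offset = base + partition_number * 500, per the list above."""
    return BASE_OFFSETS[tool] + partition_number * PARTITION_STRIDE


def effective_port(configured_port, tool, partition_number):
    """ASSUMPTION: the offset is added directly to the configured port."""
    return configured_port + port_offset(tool, partition_number)


for pn in range(3):
    print(pn, port_offset("nanorc", pn), port_offset("nanotimingrc", pn))
```

Under that assumption, running nanorc with partition number 2 and the sample port 15432 would give 16432.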
Here are suggested steps for enabling and viewing debug messages in the TRACE memory buffer:
- set up your software area, if needed (e.g. cd <work_dir>; source ./dbt-env.sh ; dbt-workarea-env)
- export TRACE_FILE=$DBT_AREA_ROOT/log/${USER}_dunedaq.trace
  - this tells TRACE which file on disk to use for its memory buffer, and in this way, enables TRACE in your shell environment and in subsequent runs of the system with nanorc.
- run the application using the nanorc commands described above
  - this populates the list of already-enabled TRACE levels so that you can view them in the next step
- run tlvls
  - this command outputs a list of all the TRACE names that are currently known, and which levels are enabled for each name
  - TRACE names allow us to group related messages, and these names typically correspond to the name of the C++ source file
  - the bitmasks that are relevant for the TRACE memory buffer are the ones in the "maskM" column
- enable levels with tonM -n <TRACE NAME> <level>
  - for example, tonM -n DataWriter DEBUG+5 (where "5" is the level that you see in the TLOG_DEBUG statement in the C++ code)
- re-run tlvls to confirm that the expected level is now set
- re-run the application
- view the TRACE messages using tshow | tdelta -ct 1 | more
  - note that the messages are displayed in reverse time order
A couple of additional notes:
- For debug statements in our code that look like TLOG_DEBUG(5) << "test, test"; we would enable the output of those messages using a shell command like tonM -n <TRACE_NAME> DEBUG+5. A couple of notes on this...
  - when we look at the output of the bitmasks with the tlvls command, bit #5 is going to be offset by the number of bits that TRACE and ERS reserve for ERROR, WARNING, INFO, etc. messages. At the moment, the offset appears to be 8, so the setting of bit "DEBUG+5" corresponds to setting bit #13.
  - when we view the messages with tshow, one of the columns in its output shows the level associated with the message (the column heading is abbreviated as "lvl"). Debug messages are prefaced with the letter "D", and they include the number that was specified in the C++ code. So, for our example of level 5, we would see "D05" in the tshow output for the "test, test" messages.
- There are many other TRACE 'commands' that allow you to enable and disable messages. For example,
  - tonMg <level> enables the specified level for all TRACE names (the "g" means global in this context)
  - toffM -n <TRACE NAME> <level> disables the specified level for the specified TRACE name
  - toffMg <level> disables the specified level for all TRACE names
  - tlvlM -n <TRACE name> <mask> explicitly sets (and un-sets) the levels specified in the bitmask
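The relationship between a TLOG_DEBUG level, the bit shown by tlvls, and the label shown by tshow is simple arithmetic, sketched below. The offset of 8 is taken from the note above ("at the moment, the offset appears to be 8") and may change between TRACE versions; the function names are hypothetical and this code does not call TRACE itself.

```python
# Number of low bits reserved for ERROR, WARNING, INFO, etc.
# ASSUMPTION: 8, as observed in the note above; not guaranteed by TRACE.
DEBUG_BIT_OFFSET = 8


def debug_bit(level):
    """Bit number in the tlvls mask that corresponds to DEBUG+<level>."""
    return DEBUG_BIT_OFFSET + level


def debug_mask(*levels):
    """Bitmask with the bits for the given debug levels set."""
    mask = 0
    for level in levels:
        mask |= 1 << debug_bit(level)
    return mask


def tshow_label(level):
    """Label in the 'lvl' column of tshow for a debug level, e.g. 'D05'."""
    return "D%02d" % level


print(debug_bit(5), hex(debug_mask(5)), tshow_label(5))
```

A mask computed this way is the kind of value that would be expected in the "maskM" column after enabling the corresponding levels, subject to the offset assumption.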