Setting up a fddaq‐v5.2.0 development area - DUNE-DAQ/daqconf GitHub Wiki
Instructions for setting up a Far Detector software area for v5.2.0 development based on a recent nightly build
07-Oct-2024 - 🔺 Work in progress! 🔺 Steps 1-8 in the first section have been verified to work. The Reference Information, and the remaining sections, will be (re)verified soon.
Reference information:
- general development: software development workflow, DUNE DAQ Software Style Guide
- suggested Spack commands to learn about the characteristics of an existing software area are available here
- an introduction to the "assets" system, which we use to store files that are not code, is here
- testing: NP04 computer inventory
- other: Working Group task lists, List of DUNE-DAQ GitHub teams and repos
- Main grafana dashboard
- Tag Collector
- OKS System Description
- DBE Editor Documentation
Here are the suggested steps:
1. create a new software area based on the latest nightly build (see step 1.iv for the exact `dbt-create` command to use)
   - The steps for this are based on the latest instructions for daq-buildtools.
   - As always, you should verify that your computer has access to /cvmfs/dunedaq.opensciencegrid.org.
   - If you are using one of the np04daq computers and need to clone packages, add the following lines to your .gitconfig file (once you do this, there will be no need to activate the web proxy each time you log in, and this means that you won't forget to disable it...):

     ```
     [http]
       proxy = http://np04-web-proxy.cern.ch:3128
       sslVerify = false
     ```
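     You can confirm that git picked up these settings with standard `git config` queries (shown here only as a quick sanity check):

     ```sh
     # verify the proxy configuration that git will use on this machine
     git config --get http.proxy      # expect: http://np04-web-proxy.cern.ch:3128
     git config --get http.sslVerify  # expect: false
     ```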
   - Here are the steps for creating the new software area:

     ```sh
     cd <directory_above_where_you_want_the_new_software_area>
     source /cvmfs/dunedaq.opensciencegrid.org/setup_dunedaq.sh
     setup_dbt latest_v5
     dbt-create -n NFD_DEV_241007_A9 [work_dir_name]  # work_dir_name is optional
     cd <work_dir_name if you specified one, or NFD_DEV_241007_A9 otherwise>
     ```

   - Please note that if you are following these instructions on a computer on which the DUNE-DAQ software has never been run before, there are several system packages that may need to be installed on that computer. These are mentioned in this script. To check whether a particular one is already installed, you can use a command like `yum list libzstd` and check whether the package is listed under "Installed Packages".
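     For example, a small loop like the following makes it easy to check several packages at once (the package names here are only examples; check the ones listed in the script mentioned above):

     ```sh
     # report which of these system packages are already installed
     for pkg in libzstd libxml2 numactl; do
       echo "--- $pkg ---"
       yum list installed "$pkg" 2>/dev/null || echo "not installed"
     done
     ```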
2. add any desired repositories to the /sourcecode area. Some examples are provided in this section.

   If you want to be able to modify the `test-session` configuration below, or if you are updating the `appmodel` schema, you will need to clone the `appmodel` package. In order to run the unit tests mentioned below, you will need to clone the `dfmodules` package. To just run the integration tests or the `test-session` as defined in the release, you will not need to clone any packages.

   - decide if you want the very latest code, or a more stable set of packages that has been verified to work. Run this command to select the very latest code:

     ```sh
     export use_very_latest_dunedaq_code=1
     ```

     or this one to select a more stable set of packages:

     ```sh
     export use_recent_verified_dunedaq_code=1
     ```

   - clone the repositories (the following block has some extra directory checking; it can all be copy/pasted into your shell window)

     ```sh
     # change directory to the "sourcecode" subdir, if possible and needed
     if [[ -d "sourcecode" ]]; then
         cd sourcecode
     fi

     # double-check that we're in the correct subdir
     current_subdir=`echo ${PWD} | xargs basename`
     if [[ "$current_subdir" != "sourcecode" ]]; then
         echo ""
         echo "*** Current working directory is not \"sourcecode\", skipping repo clones"
     else
         # finally, do the repo clone(s)
         # We always get appmodel so that we can look at the configurations
         # If you want to run the dfmodules unit tests, clone dfmodules as well
         # appmodel and dfmodules are used as examples
         git clone https://github.com/DUNE-DAQ/appmodel.git -b develop
         git clone https://github.com/DUNE-DAQ/dfmodules.git -b develop
         git clone https://github.com/DUNE-DAQ/daqsystemtest.git -b develop
         if [[ "$use_very_latest_dunedaq_code" == "" ]]; then
             cd appmodel; git checkout 8eec2c9197; cd ..
             cd dfmodules; git checkout 8678c63293; cd ..
             cd daqsystemtest; git checkout 42af9d7ffb3; cd ..
         fi
         cd ..
     fi
     ```
3. setup the work area and build the software. NB: even if you haven't checked out any packages, running `dbt-build` is necessary to install the rte script passed to the applications started by `drunc`.

   ```sh
   source env.sh
   dbt-build -j 20
   dbt-workarea-env
   ```
4. The `appmodel` repository contains a sample configuration for a small test system. It can be exercised using the following steps:

   ```sh
   # from your Linux shell command line...
   drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config

   # from within the drunc shell...
   # Note that it is best to use a different run number each time that you "start".
   boot
   conf
   start 101
   enable-triggers
   # wait for a few seconds
   disable-triggers
   drain-dataflow
   stop-trigger-sources
   stop
   scrap
   terminate
   exit

   # Or, you can run everything in one Linux shell command:
   drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config boot wait 5 conf wait 3 start 101 enable-triggers wait 10 disable-triggers drain-dataflow stop-trigger-sources stop scrap terminate

   # after you exit drunc, you should wait for ~30 seconds for controller processes to exit before starting another session
   ```
5. `dfmodules` contains unit tests which have been updated to use OKS; they can be run with `dbt-unittest-summary.sh`.
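   If you want to look at a single failing test more closely, the individual test executables can also be run directly. This is just a sketch, assuming the usual `dbt` build layout in which each package's unit tests are built under `build/<package>/unittest` (the exact executable names depend on the package):

   ```sh
   # see which dfmodules unit-test executables were built
   ls $DBT_AREA_ROOT/build/dfmodules/unittest/
   # run one of them directly (name is illustrative)
   $DBT_AREA_ROOT/build/dfmodules/unittest/<some_test_executable>
   ```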
6. Integration tests can be run straight from the release with

   ```sh
   pytest -s $DUNE_DAQ_RELEASE_SOURCE/daqsystemtest/integtest/minimal_system_quick_test.py
   ```
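   If you cloned `daqsystemtest` into your `sourcecode` area in step 2, the analogous command runs the same test from your local copy instead (useful when you are modifying the test itself):

   ```sh
   pytest -s sourcecode/daqsystemtest/integtest/minimal_system_quick_test.py
   ```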
7. If developing `drunc` or `druncschema`, after these are cloned, run `pip install` in the corresponding `sourcecode` subdirectories, then run `dbt-workarea-env` in the root of the working directory.
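   A minimal sketch of that sequence for `drunc` (`druncschema` is handled the same way; the options shown are just one reasonable choice):

   ```sh
   cd sourcecode/drunc
   pip install .   # install the cloned package into the work-area python environment
   cd ../..        # back to the root of the working directory
   dbt-workarea-env
   ```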
8. When you return to working with the software area after logging out, the steps that you'll need to redo are the following:

   ```sh
   cd <work_dir>
   source ./env.sh
   dbt-build         # if needed
   dbt-workarea-env  # if needed
   ```
Look here.
- The current example configuration is here: https://github.com/DUNE-DAQ/appmodel/blob/develop/test/config/test-session.data.xml
- It consists of several files in both `appmodel/test/config` (configuration-specific) and `appmodel/config/appmodel` (somewhat more generic):
  - `test-session.data.xml` is the main entry point for this configuration; it includes the other data files and defines the high-level objects like the Session, the DetectorConfig, and some of the Service entries.
  - `hosts.data.xml` defines the hosts and processing resources.
  - `wiecconfs.data.xml` defines the "front end" electronics configuration.
  - `ru-segment.data.xml` defines the readout applications (ru-01, ru-02, ru-03).
  - `df-segment.data.xml` defines the dataflow applications (df-01, df-02, df-03 and dfo-01) and their module configurations.
  - `trigger-segment.data.xml` defines the trigger applications (tc-maker-1, mlt and hsi-to-tc-app) and their module configurations.
  - `data-store-params.data.xml` defines the output file configuration for the dataflow apps.
  - `fsm.data.xml` defines the state machine and the supported transition commands.
  - `connections.data.xml` defines the Network and Queue connection rules used to generate appropriate endpoints in the SmartDaqApplications.
  - `moduleconfs.data.xml` contains DAQ Module configuration objects for readout, dataflow, and trigger.
- These files are placed in the `install/appmodel/share/test/config` directory upon build (via the `daq_install()` CMake command, based on their location in `appmodel/test/config`), where OKS can find them.
- To create a new configuration, we are currently limited to copying files from an existing configuration, but tools are in development to generate the readout map from an existing one.
- All OKS configurations are loaded from files relative to the current working directory and the paths listed in `DUNEDAQ_DB_PATH`. If you are preparing a configuration in a separate directory, you may want to prepend it to the list in `DUNEDAQ_DB_PATH`, as in the sketch below. By default it is set by `dbt` to include the install directory of any packages you have in `sourcecode` that have a config or schema directory with .xml files, followed by the packages from the release.
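  For example, here is a minimal sketch of preparing a new configuration by copying an existing one and making it findable by OKS (the directory names are just examples, and the copy assumes you cloned `appmodel` as above; otherwise copy from the release area):

  ```sh
  # copy an existing configuration as a starting point for a new one
  mkdir -p ~/my-configs/my-test-session
  cp sourcecode/appmodel/test/config/*.data.xml ~/my-configs/my-test-session/
  # ...edit the copied files as needed, then prepend the directory to the OKS search path
  export DUNEDAQ_DB_PATH=~/my-configs/my-test-session:$DUNEDAQ_DB_PATH
  echo $DUNEDAQ_DB_PATH | tr ':' '\n'   # inspect the resulting search path
  ```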
You can test the SmartDaqApplication generation of modules and connections using
listApps test-session test/config/test-session.data.xml generate_modules_test test-session <app_name> test/config/test-session.data.xml # <app_name> is one of the apps listed by listApps
- The data and schema XML files can be edited manually or by running the graphical dbe editors `dbe_main` and `schemaeditor`. To enable use of the dbe editors, you must first run the command `spack load dbe`, then `dbe_main -f /path/to/file` for editing data files or `schemaeditor -f <path to file>` for editing schema files.

  NB: Spack loading the `dbe` package updates your environment in ways that may affect the running of other commands. It is recommended to do this in a separate shell or window.
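  For example, a sketch of the separate-shell workflow (the file path is only an example; point it at the configuration you want to edit):

  ```sh
  # in a new terminal window, set up the same work area...
  cd <work_dir>
  source ./env.sh
  # ...then load the dbe package and launch the editor
  spack load dbe
  dbe_main -f sourcecode/appmodel/test/config/test-session.data.xml
  ```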
- Log into the EHN1 DAQ cluster.
- Setup a work area with a fresh nightly (NFD_DEV_240701_A9++) as above.
- Clone the EHN1 config repository and source its `setup_db_path.sh` script.
- Start `drunc` as above, but `boot sessions/crp4-session.data.xml crp4-oks-session`.
- You will be able to look at monitoring metrics and errors using grafana (v5 dashboards).
- Setup your area:

  ```sh
  # optionally you can have listrev locally by doing
  cd sourcecode                                                  # optional
  git clone https://github.com/DUNE-DAQ/listrev.git -b develop   # optional
  cd -                                                           # optional
  dbt-workarea-env
  dbt-build
  ```
- Launch `drunc-unified-shell` as in the `test-session` above.
- In the drunc shell, boot the `lr-session`: `boot config/lrSession.data.xml lr-session`
- Start a run, wait a while and stop the run
- `grep Exiting log_*lr-session_listrev*` will show the reported statistics.
- The example is targeted at 100 Hz, so the expected number of messages seen by ReversedListValidator should be at least 100 times the run duration. There should be three lists in each message (from the three generators), so it should report 300 times the run duration for the number of lists.
- Messages are round-robined to the two reversers, so each should see 50*run_duration messages and 150*run_duration lists. They should have approximately equal values for the reported counters.
- Generators should generate 100*run_duration lists and send all (or almost all) of them.
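As a concrete illustration of the arithmetic above (the 60-second run duration is just an example value):

```sh
# expected listrev counters for a run at the nominal 100 Hz rate
run_duration=60   # seconds (example)
echo "validator messages  >= $(( 100 * run_duration ))"   # 6000
echo "validator lists     >= $(( 300 * run_duration ))"   # 18000
echo "per-reverser msgs   ~  $((  50 * run_duration ))"   # 3000
echo "per-reverser lists  ~  $(( 150 * run_duration ))"   # 9000
echo "per-generator lists ~  $(( 100 * run_duration ))"   # 6000
```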
- `dbt-info release` prints out the release type and name, and the base release name (version)
- `dbt-info package <dunedaq_package_name>` prints out the package version and commit hash used by the release
- `dbt-info sourcecode` prints out the branch names of source repos under `sourcecode`, and marks those with local changes with "*"
- `spack find --loaded -N <external_package_name>`, e.g. `spack find --loaded -N boost`, prints out the version of the specified external package that is in use in the current software area
- `spack info fddaq` prints out the packages that are included in the `fddaq` bundle for the current software area
- `spack info dunedaq` prints out the packages that are included in the `dunedaq` (common) bundle for the current software area
Also see here.
When running with drunc, metrics reports appear in the `info_*.json` files that are produced, one for each application (e.g. `info_df-01.json`). We can collate these, grouped by metric name, using `python -m opmonlib.info_file_collator info_*.json` (default output file is `opmon_collated.json`).
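For example (the collation command is the one above; `jq` is used here only as a convenient way to peek at the result, and the metric names you see will depend on your session):

```sh
# collate the per-application metrics files into a single JSON file
python -m opmonlib.info_file_collator info_*.json
# list the top-level keys (metric groups) in the collated output
jq 'keys' opmon_collated.json
```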
It is also possible to monitor the system using a graphical interface.
Here are suggested steps for enabling and viewing debug messages in the TRACE memory buffer:
- set up your software area, if needed (e.g. `cd <work_dir>; source ./env.sh; dbt-workarea-env`)
- `export TRACE_FILE=$DBT_AREA_ROOT/log/${USER}_dunedaq.trace`
  - this tells TRACE which file on disk to use for its memory buffer, and in this way, enables TRACE in your shell environment and in subsequent runs of the system with `drunc`.
- edit the OKS session that you are using to add the TRACE_FILE env var to the `drunc` environment:
  - the lines marked with `### add ... ###` comments in the following snippet of a config.data.xml file should be added (with the appropriate value for TRACE_FILE for your software area)

    Session snippet:

    ```xml
    <obj class="Session" id="mdapp-basic-session">
     <attr name="use_connectivity_server" type="bool" val="1"/>
     <attr name="connectivity_service_interval_ms" type="u32" val="2000"/>
     <attr name="data_request_timeout_ms" type="u32" val="1000"/>
     <attr name="data_rate_slowdown_factor" type="u32" val="1"/>
     <rel name="environment">
      <ref class="Variable" id="session-env-session-name-0"/>
      <ref class="Variable" id="session-env-session-name-1"/>
      <ref class="Variable" id="session-env-ers-verb"/>
      <ref class="Variable" id="session-env-ers-info"/>
      <ref class="Variable" id="session-env-ers-warning"/>
      <ref class="Variable" id="session-env-ers-error"/>
      <ref class="Variable" id="session-env-ers-fatal"/>
      <ref class="Variable" id="session-env-connectivity-server"/>
      <ref class="Variable" id="session-env-connectivity-port"/>
      <ref class="Variable" id="daqapp-cli-configuration"/>
      <ref class="Variable" id="session-env-trace_file"/>  <!-- ### add a line like this ### -->
     </rel>
     <rel name="disabled">
     </rel>
     <rel name="segment" class="Segment" id="root-segment"/>
     <rel name="infrastructure_applications">
      <ref class="ConnectionService" id="local-connection-server"/>
      <ref class="OpMonService" id="local-opmon-application-service"/>
     </rel>
     <rel name="detector_configuration" class="DetectorConfig" id="dummy-detector"/>
    </obj>
    ...
    <obj class="Variable" id="session-env-ers-warning">
     <attr name="name" type="string" val="DUNEDAQ_ERS_WARNING"/>
     <attr name="value" type="string" val="erstrace,throttle,lstdout"/>
    </obj>
    <!-- ### add these 4 lines, also ### -->
    <obj class="Variable" id="session-env-trace_file">
     <attr name="name" type="string" val="TRACE_FILE"/>
     <attr name="value" type="string" val="/home/nfs/biery/dunedaq/16SepFDv5Test_0858/log/biery_dunedaq.trace"/>
    </obj>
    <obj class="Variable" id="session-env-session-name-0">
     <attr name="name" type="string" val="DUNEDAQ_PARTITION"/>
     <attr name="value" type="string" val="mdapp-basic-session"/>
    </obj>
    ```
- run the application using the `drunc` commands described above
  - this populates the list of already-enabled TRACE levels so that you can view them in the next step
- run `tlvls`
  - this command outputs a list of all the TRACE names that are currently known, and which levels are enabled for each name
  - TRACE names allow us to group related messages, and these names typically correspond to the name of the C++ source file
  - the bitmasks that are relevant for the TRACE memory buffer are the ones in the "maskM" column
- enable levels with `tonM -n <TRACE NAME> <level>`
  - for example, `tonM -n DataWriter DEBUG+5` (where "5" is the level that you see in the `TLOG_DEBUG` statement in the C++ code)
- re-run `tlvls` to confirm that the expected level is now set
- re-run the application
- view the TRACE messages using `tshow | tdelta -ct 1 | more`
  - note that the messages are displayed in reverse time order
A couple of additional notes:
- For debug statements in our code that look like `TLOG_DEBUG(5) << "test, test";`, we would enable the output of those messages using a shell command like `tonM -n <TRACE_NAME> DEBUG+5`. A couple of notes on this...
  - when we look at the output of the bitmasks with the `tlvls` command, bit #5 is going to be offset by the number of bits that TRACE and ERS reserve for ERROR, WARNING, INFO, etc. messages. At the moment, the offset appears to be 8, so the setting of bit "DEBUG+5" corresponds to setting bit #13.
  - when we view the messages with `tshow`, one of the columns in its output shows the level associated with the message (the column heading is abbreviated as "lvl"). Debug messages are prefaced with the letter "D", and they include the number that was specified in the C++ code. So, for our example of level 5, we would see "D05" in the `tshow` output for the "test, test" messages.
- There are many other TRACE 'commands' that allow you to enable and disable messages. For example,
  - `tonMg <level>` enables the specified level for all TRACE names (the "g" means global in this context)
  - `toffM -n <TRACE NAME> <level>` disables the specified level for the specified TRACE name
  - `toffMg <level>` disables the specified level for all TRACE names
  - `tlvlM -n <TRACE name> <mask>` explicitly sets (and un-sets) the levels specified in the bitmask
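Putting the pieces together, here is a sketch of the whole debug-message workflow for the `TLOG_DEBUG(5)` example above ("DataWriter" is just an example TRACE name, and the `grep D05` filter simply relies on the "D05" level tag described above):

```sh
# enable TRACE and the debug level of interest
export TRACE_FILE=$DBT_AREA_ROOT/log/${USER}_dunedaq.trace
tonM -n DataWriter DEBUG+5     # enable DEBUG level 5 for the DataWriter name
tlvls | grep DataWriter        # confirm the corresponding bit is now set in the maskM column

# ...re-run the application with drunc...

# view only the level-5 debug messages (most recent first)
tshow | tdelta -ct 1 | grep D05 | more
```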