Message Logging and TRACEing - art-daq/artdaq GitHub Wiki
{{TOC}}
Many of the issues/concepts mentioned here also apply to metrics.
{{collapse(A few notes introductory from Kurt)
There are a number of issues described at the link below, but one of the
primary ones is to gracefully handle the trade-off between generating
lots of messages and maintaining the performance of the code. To help in
this area, we have tools and recommendations for including messages in
artdaq code that provide the following features:
- Suppression of Debug messages during normal running
- Logging of messages to a circular memory buffer
- Throttling of high rates of identical messages
- Dynamic enabling/disabling of messages in running system
As of 04-Jan-2018, this page doesn’t have many examples of using these facilities. Those might be useful additions. For example, it might be good to describe
- what one would do to add a few levels of debug statements to a piece of code, use those during development, and leave them in the code (disabled) for production running
- how one would enable the messages found in someone else’s code while
the system is running.
}}
Artdaq programs and modules will run in an environment where the Message
Facility has been
initialized by artdaq::configureMessageFacility. Basically, this happens
first.
When modules include artdaq/DAQdata/Globals.hh, they get TRACE wrappers
around calls to the
message facility fundamentals. This header includes tracemf.h;
non-artdaq code may include tracemf.h or trace.h directly.
#include "artdaq/DAQdata/Globals.hh"
#define TRACE_NAME "BoardReaderName" # recommended/optional default name for TLOG(lvl) and TRACE(lvl,...)
OR, less commonly,
#include "tracemf.h"
#define TRACE_NAME "BoardReaderName" # recommended/optional default name for TLOG(lvl) and TRACE(lvl,...)
A note on TRACE_NAME definition is [[TRACE_NAME
definition|here]].
In this environment, we recommend writing logging/tracing/debugging
messages using one of the two
major styles:
TLOG_ERROR("facility") << message << TLOG_ENDL; # see Limitations section below
OR
TLOG(<level:3-63>) << message;
In the above, “ERROR” can be replaced with WARNING, INFO, or DEBUG.
So, the full list of available macros is:
Streamer style:
TLOG_ERROR("facility") # non-maskable
TLOG_WARNING("facility") # non-maskable
TLOG_INFO("facility") # non-maskable in debug build only
TLOG_DEBUG("facility") # non-maskable in debug build only
TLOG(lvl [,"facility"]) # intended for higher/dynamically enablable debug levels, facility optional
Printf style:
TRACE(lvl,msg,...) # uses TRACE_NAME as facility or "TRACE" is TRACE_NAME not defined
TRACEN("facility",lvl,msg,...)
The “lvl” (or “level”) can be a number in the range 0 to 63 (but usually in the range 4 to 63 as 0,1,2,3 are ERROR,WARNGING,INFO,DEBUG respectively).
Using #define TRACE_NAME in header files is discouraged as it can
cause
a conflict with a #define TRACE_NAME in the non-header file from which
they are included.
When “TRACES” are used in header files (in template or inline code), we
recommend
using the variants which include the name/facility in them:
TLOG(<level:3-63>,"my_template_class") << message;
or
TRACEN("my_template_class",lvl,msg,...)
TRACEs can dynamically be configured to send messages to a circular
memory buffer where they
can be displayed using the “tshow” display utility.
If the messages go only to memory, argument formatting is delayed, which
significantly reduce the
impact (CPU time) of the tracing.
Additionally, memory TRACEing does not involve any system calls (getting time is a virtual syscall). This allows for maximal debug statements with minimal impact on application performance. With the write system call that is involved with file, console or network destinations, program flow can be interrupted/changed and backpressure to the application may result.
Process ID (pid), thread ID, CPU (core), and microsecond TOD information are stored in the trace memory.
Configuring the memory buffer and the enabling of the additional debug/logging levels can the done via
- additional artdaq trace fhicl configuration
- environment variables
- command line utility assuming memory buffer tracing has been activated
For more details on TRACEing, see the trace: wiki page.
When the TRACE UPS product is setup, and the appropriate TRACE_* env.
var. is set, the “tshow” command can be used to display the TRACE
entries. tshow is often used with the tdelta unix filter to add delta
time (delta microseconds) and to convert the time number into a more
human readable format.
Example tshow output:
export TRACE_FILE=/tmp/trace_buffer_daquser
The first time a thread executes a TLOG_*( name ), a name-to-tid
lookup occurs. The resulting tid is stored for subsequent executions.
This means the following code:
std::string name=std::string("label")+x;
TLOG_DEBUG(name) << "x=" << x;
executed 3 times with values of x equal to 1, 2, and 3, will produce
messages with names as follows:
1st execution: name: “label1”, message: x=1
2nd execution: name: “label1”, message: x=2
3rd execution: name: “label1”, message: x=3
For the impatient, the quickest way to do this is to set the debug_cout parameter to “true” in the FHiCL file that you are using (Step 1, Option 1 in the ‘More details’ below). This should get you 95% of the way there.
{{collapse(More details on enabling debug messages in artdaqDriver)
The first step is to enable debug messages at the MessageFacility level,
and there are a couple of ways to do this. The first way is easier, but
the second allows you to add timestamps to the messages, which are
sometimes helpful.
Option 1: set the debug_cout parameter to “true” in the FHiCL file that you use to run artdaqDriver. (This parameter should exist at the top level in your FHiCL document. Here is an example.)
Option 2: create a temporary MessageFacility FHiCL file with severity set to DEBUG and tell artdaqDriver to include that FHiCL. To do this, create a file with the following contents:
console: {
type: "cout"
threshold: "DEBUG" # if "INFO" then DEBUG message are not sent
noTimeStamps: false
}
Then, set the ARTDAQ_LOG_FHICL environmental variable to point to that file:
export ARTDAQ_LOG_FHICL=`pwd`/<filename>
The second step is to examine, and possibly change, the TRACE settings for the computer that you are running on. To do this, you can use the following steps:
- ‘export TRACE_NAME=TRACE’ # this enables TRACE commands for the shell that you are using
- ‘ttids’ # this displays the current TRACE levels that are enabled for all of the TRACE names that have been defined
- you can enable or disable the printout of various debug messages by
changing the “level-S” bit masks. (“S” in this case means “slow”
messaging channel, i.e. the console).
- for example, ‘tlvlS -n _SharedMemoryEventManager 0x0’ will turn off all messages to the console from the SharedMemoryEventManager (run ‘ttids’ again to see the effect of the tlvlS command)
- a different way to change the level masks for all TRACE names is to ‘export TRACE_LVLS=’ before running artdaqDriver; either: a) export TRACE_LVLS=; artdaqDriver -c <fhicl_config_file>, or b) TRACE_LVLS= artdaqDriver -c <fhicl_config_file>
- more information on controlling TRACE output is available in the
TRACE User’s
Guide
}}
First, there are 2 groups of “Debug” messages:
# TLOG’s with “DEBUG” e.g.
TLOG(TLVL_DEBUG) or possibly TLOG(3)
- TLOG’s with the level higher than 3 i.e 4-63
It is highly likely, that the TRACEing environment that the artdaq
processes are run in, by default, will not have the either of the 2
Debug groups enabled.
In this case, DEBUG messages (specifically group 1) will not be seen in
the logs/console, unless running a debug build (not the prof build).
If you want to see DEBUG messages when running prof builds, you have to
have the “trace level” associated TLVL_DEBUG set (on), which can be
done:
- by setting environment variable TRACE_LVLS=–1 or 0xf or any 64 bit number which has the fourth bit (b3) set, or
- by having the trace environment enabled and having the fourth bit in the appropriate “tlvls slow mask” set (see the “More details” section above).
In some cases, we want to see all of the messages, even though some fraction of them might naturally be suppressed by the rate-limiting code that is enabled by default. To temporarily turn off the rate limitation, we can use the following command:
- ‘export TRACE_LIMIT_MS=0,0,0’ # the first zero effectively turns off rate limiting
- (the three values are the ‘limit count’, the ‘ON milliseconds’, and the ‘OFF milliseconds’)
To re-enable the default behavior, use
- ‘export TRACE_LIMIT_MS=10,500,1500’
PLEASE NOTE that changing these settings affects all applications that use the current TRACE environment, including any in the running DAQ. As such, there are probably better ways to modify our environment when using a more offline-focused application, such as an event dump program. In that case, it is probably better to unset the TRACE env vars that are set in the current environment (e.g. ‘unset TRACE_NAME’) and thereby avoid affecting other applications.