Engage Logging - rallytac/pub GitHub Wiki

Engage Logging

Engage's logging subsystem is a high-performance component that allows developers and administrators alike to review activity of an Engage-based process for administratives and troubleshooting purposes.

Output Targets

The logger has two primary output targets: the operating system's standard console as well as Syslog or the Syslog-like logging subsystem offered by the OS.

Linux

On most Linux systems, Syslog output (and very often console output too) ends up in /var/log or some other place where logging is configured for.

Windows

On Windows systems, Engage outputs to the console as well as the Windows Debugger. You can use a tool like Microsoft's DebugView tool to monitor this output in realtime.

Mac OSX

On OSX, Engage outputs to the operating system's Console subsystem.

Android

Android's output targets are very similar to Linux described above (Android is, after all, a Linux system). In addition, Engage's output can be monitored with the Android Logcat tool.

iOS

... work is underway ...

Log Levels

Each log message is classified by a logging level. There are 5 levels:

Value Name Indicator Comments
0 Fatal F A fatal error has occurred - processing cannot continue. Fatals must be addressed right away.
1 Error E An error has occurred - processing will continue. Errors should be addressed as soon as possible.
2 Warning W This is a warning that should be addressed soon.
3 Informational I The message is informational in nature.
4 Debug D Debugging.

In normal production circumstances you should have your logging level set to 2 so that only warnings and other problems end up in the log.

If you're troubleshooting issues, setting the level to 3 gives you more information with minimal impact to the normal operation of the software.

If you need to really get into the weeds on a problem, set your level to 4 but be aware that this produces an enormous amount of logging that will likely impact performance.

Message Tags

Log messages are tagged with a short string usually indicating where the message was being generated. Most often these tags denote a software component inside the Engage software such as InboundProcessor (the thing that handles inbound network traffic), Group (an instance of an Engage group), Engine (the core Engine), and so on. There are hundreds of these components and really only useful to C++ developers working on the software.

Format

Each log line is formatted as follows:

date time [pid/object] indicator/tag: message

  • date : Date in YYYY:MM:DD format in Universal Coordinated Time (UTC).
  • time : Time in hh:mm:ss:lll format (UTC) where lll is milliseconds.
  • pid : Operating system process identifier (version 1.193 onward).
  • object : Internal object memory pointer.
  • indicator : Indicator of the logging level as described above.
  • tag: The message tag as described above.
  • message: The message itself.

For example: Here's a log line indicating an error (E) reported by the Leaf component regarding a failure to connnect to myrp.rallytac.com:7443.

2020:12:06 11:50:01.970 [0x70beb024f0-                ] E/Leaf: {0x70d06f7a00} connect() failed for myrp.rallytac.com:7443

This log line (2 milliseconds later) is debug in nature (D) from Leaf indicating that it will wait for 1369 milliseconds before it retries that connection.

2020:12:06 11:50:01.972 [0x70beb024f0-                ] D/Leaf: {0x70d06f7a00} pause for 1369 before next connection attempt

Processing & Analysis

With this knowledge in hand, you can now perform processing of log output such as forwarding certain levels or tagged components on to centralized logging systems. This is usually done with an agent of some sort provided by these logging systems.

Also, using tools such as grep you can analyze output from logs and realtime streams with some pretty straightforward regular expressions. For example: let's say we have a log file with thousands of lines.

If we only want to see fatals (type F), our grep command-line would like something like this.

$ grep -E "(\] [F]/)" the_big_log_file.log

Here we're telling grep to only show us lines that contain "] F/" (quotes NOT included of course). Note the \ before the ]. This is important because ] means something special to grep. By preceding it with \ we're telling grep to treat ] as something to be searched for and not part of the regular expression syntax.

But, maybe we want to see errors as well. So, just add E to the list of items we give to grep.

$ grep -E "(\] [F,E]/)" the_big_log_file.log

How about ANY log line output by Engage (assuming the log contains logging from other applications too). There's a good bet that no (or few) other apps will output logging in a similar format (at least with the pattern of ] <something>/):

So, let's include F, E, W, I, and D indicators as follows:

$ grep -E "(\] [F,E,W,I,D]/)" the_big_log_file.log

Don't forget -E on the grep command-line. This tells grep that the search argument is a extended regular expression.

Now, often you might find that your log contains the output from multiple instances of the Engage-based process - and you only want to see the output from a particular instance.

For example: Here's a log snippet from Android which captured output from two instances of an application running the Engage Engine. Each instance has a process ID (PID) which, here, are 3879 and 24168. If our regular expression is simply "(\] [E]/)", we're going to end up with errors from both processes. And that makes things more complex for us.

$ grep -E "(\] [E]/)" the_big_log_file.log

12-06 11:18:31.916  3879 29983 E Group   : 2020:12:06 11:18:31.916 [0x70ba6d04f0-                ] E/Group: {0x70c1a79e00} not RTP
12-06 11:20:51.994  3879 30003 E Leaf    : 2020:12:06 11:20:51.994 [0x70b09d74f0-                ] E/Leaf: {0x70b8dcc200} connect() failed for myrp.rallytac.com:7443
12-06 11:44:24.142  3879 29983 E Group   : 2020:12:06 11:44:24.142 [0x70ba6d04f0-                ] E/Group: {0x70c178af00} not RTP
12-06 11:45:00.698  3879 29983 E Group   : 2020:12:06 11:45:00.699 [0x70ba6d04f0-                ] E/Group: {0x70c178af00} not RTP
12-06 11:47:28.235  3879 29983 E Group   : 2020:12:06 11:47:28.236 [0x70ba6d04f0-                ] E/Group: {0x70c178af00} not RTP
12-06 11:48:55.072 24168 24195 E Engine  : 2020:12:06 11:48:55.072 [0x70c1ac14f0-                ] E/Engine: directory '/tmp/com.rallytac.engage/engine/timelines' does not exist
12-06 11:50:01.970 24168 24344 E Leaf    : 2020:12:06 11:50:01.970 [0x70beb024f0-                ] E/Leaf: {0x70d06f7a00} connect() failed for myrp.rallytac.com:7443
12-06 11:55:30.028 24168 24324 E Group   : 2020:12:06 11:55:30.028 [0x70b90054f0-                ] E/Group: {0x70bd8bac00} not RTP
12-06 11:56:03.847 24168 24324 E Group   : 2020:12:06 11:56:03.847 [0x70b90054f0-                ] E/Group: {0x70bd8bac00} not RTP

In this particular output from Android, each line has some info provided by Android, followed by the output from Engage.

12-06 11:18:31.916  3879 29983 E Group   : 2020:12:06 11:18:31.916 [0x70ba6d04f0-                ] E/Group: {0x70c1a79e00} not RTP
----------------------------------------   ---------------------------------------------------------------------------------------
            Android Piece                                                  Engage Piece

You'll notice that the process ID of 3879 is the third column of information of the Android piece. What's useful is that 3879 is surrounded by spaces so we can use that to filter our output. Ideally we'd do with with some sort of AND operator in our regular expression but, at least in the case of most grep implementations, there is no AND operator.

But, we can still simulate an AND by being a little tricky with our expression. We do this by telling grep that we want 3879 (surrounded by at least one space), following by any number of occurrences of the Engage logging (\] [E]/) we're looking for.

So, if we're looking for only Engage errors from process 3879, our expression is as follows:

"\ 3879\ .*(\] [E]/)"

Translated, this means: "Filter for lines that contain 3879 with at least one space (\ ) on either side, and (.) (it's not really AND) any number of occurences (*) of our Engage search expression."

Here goes:

$ grep -E "\ 3879\ .*(\] [E]/)" the_big_log_file.log

12-06 11:18:31.916  3879 29983 E Group   : 2020:12:06 11:18:31.916 [0x70ba6d04f0-                ] E/Group: {0x70c1a79e00} not RTP
12-06 11:20:51.994  3879 30003 E Leaf    : 2020:12:06 11:20:51.994 [0x70b09d74f0-                ] E/Leaf: {0x70b8dcc200} connect() failed for myrp.rallytac.com:7443
12-06 11:44:24.142  3879 29983 E Group   : 2020:12:06 11:44:24.142 [0x70ba6d04f0-                ] E/Group: {0x70c178af00} not RTP
12-06 11:45:00.698  3879 29983 E Group   : 2020:12:06 11:45:00.699 [0x70ba6d04f0-                ] E/Group: {0x70c178af00} not RTP
12-06 11:47:28.235  3879 29983 E Group   : 2020:12:06 11:47:28.236 [0x70ba6d04f0-                ] E/Group: {0x70c178af00} not RTP

If we want to see errors for process 24168, have that process ID in the command-line:

$ grep -E "\ 24168\ .*(\] [E]/)" the_big_log_file.log

12-06 11:48:55.072 24168 24195 E Engine  : 2020:12:06 11:48:55.072 [0x70c1ac14f0-                ] E/Engine: directory '/tmp/com.rallytac.engage/engine/timelines' does not exist
12-06 11:50:01.970 24168 24344 E Leaf    : 2020:12:06 11:50:01.970 [0x70beb024f0-                ] E/Leaf: {0x70d06f7a00} connect() failed for myrp.rallytac.com:7443
12-06 11:55:30.028 24168 24324 E Group   : 2020:12:06 11:55:30.028 [0x70b90054f0-                ] E/Group: {0x70bd8bac00} not RTP
12-06 11:56:03.847 24168 24324 E Group   : 2020:12:06 11:56:03.847 [0x70b90054f0-                ] E/Group: {0x70bd8bac00} not RTP

That period (.) in the expression, while not really an AND, gives us the semblance of AND. Don't forget to include it. Also, make sure that things like spaces have \ preceding them so that grep knows to treat that space as a literal to search for.

⚠️ **GitHub.com Fallback** ⚠️