Structured logging

Overview

Structured logging allows forwarding Haskell data values to JSON log files, provided that the data type has a ToObject instance.

To enable this, one needs to:

  • pass a suitable data value to a suitable trace
  • enable structured logging in configuration

Code requirements

The basic prerequisite is, as mentioned above, having a data type with a ToObject instance.
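
As a sketch (modelled on the Pet type used in the example linked below; the field names are assumptions), such a data type could look like this, relying on ToObject's default implementation, which reuses ToJSON:

     {-# LANGUAGE DeriveGeneric #-}

     import Data.Aeson (ToJSON)
     import Data.Text (Text)
     import GHC.Generics (Generic)

     import Cardano.BM.Data.Tracer (ToObject)

     -- A toy type to be logged; the field names are made up.
     data Pet = Pet { name :: Text, age :: Int }
       deriving (Show, Generic)

     instance ToJSON Pet

     -- ToObject has a default implementation in terms of ToJSON,
     -- so an empty instance suffices to get started.
     instance ToObject Pet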

Once that is satisfied, the actual logging is as simple as:

     traceWith (toLogObject' StructuredLogging MaximalVerbosity tr) (Pet "bella" 8)

...where the base trace tr has the type (Data.String.IsString a, ToObject a, Transformable a IO Pet) => Trace m a, which can be satisfied by Trace IO Text.
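
The Transformable instance is what connects Pet values to that base trace. Here is a minimal sketch, following the shape of the instances in the linked example -- trStructured performs the ToObject-based serialisation, while nullTracer silently drops the message for other formats:

     import Control.Tracer (nullTracer)

     import Cardano.BM.Data.Tracer (Transformable (..),
                                    TracingFormatting (StructuredLogging),
                                    trStructured)

     instance Transformable Text IO Pet where
       -- serialise via ToObject when structured logging was requested
       trTransformer StructuredLogging verb tr = trStructured verb tr
       -- drop the message for any other formatting choice
       trTransformer _ _verb _tr = nullTracer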

A complete example is available at https://github.com/input-output-hk/iohk-monitoring-framework/blob/master/examples/complex/Main.lhs#L394.

Kind-tagging the serialised messages

The ToObject instance defines the entirety of the structured payload, so unless you explicitly include a message designator field in the serialisation, you'll have trouble interpreting the structured log.

A commonly used designator field name is "kind".
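
Continuing the Pet sketch from above, a hand-written instance (replacing the default one) could tag each message like so -- mkObject is a small helper exported by Cardano.BM.Data.Tracer, and the field layout is otherwise an assumption:

     import Data.Aeson (Value (String), (.=))

     import Cardano.BM.Data.Tracer (ToObject (..), mkObject)

     instance ToObject Pet where
       toObject _verb (Pet n a) =
         mkObject [ "kind" .= String "Pet"
                  , "name" .= n
                  , "age"  .= a
                  ]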

Configuration

Let's assume that we want a JSON log file with a structured payload -- that's easier to query than a text file.

The configuration, then, consists of two parts -- 1) scribe setup and 2) scribe selection:

setupScribes:
  - scKind: FileSK
    scName: "log.json"
    scFormat: ScJson
defaultScribes:
  - - FileSK
    - "log.json"

Note that the second part can also be achieved using mapScribes, which allows a fine-grained choice of what gets logged in a structured manner.
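
A hypothetical mapScribes stanza could look as follows -- the example.pets namespace is made up, and the scribe is referenced by its kind and name joined with "::":

options:
  mapScribes:
    example.pets:
      - "FileSK::log.json"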

Querying structured logs

In this section we'll be using the jq command (https://stedolan.github.io/jq/manual/), so you'll need that installed.

Let's assume that you've got a JSON log file, as per the above description.

Basic filtering

JSON-encoded messages in the logfile fall into two types -- those with a data field, and the rest. The data field contains the structured values, and is only present in messages emitted by code set up as described above.
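
An abridged, illustrative record of the first type might look like this (the exact set of top-level fields depends on your setup):

{
  "at": "2020-04-01T12:00:00.00Z",
  "ns": ["cardano.node"],
  "data": {
    "kind": "TraceNoLedgerView",
    "slot": 23287
  },
  "sev": "Info",
  "thread": "49"
}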

Given that knowledge, and provided you've followed the "Kind-tagging the serialised messages" section above, you can easily select messages of a particular kind:

$ jq '.data | select(.kind == "TraceNoLedgerView" and .slot >= 23287)' logs/node-0.json
{
  "kind": "TraceNoLedgerView",
  "slot": 23287
}
{
  "kind": "TraceNoLedgerView",
  "slot": 23288
}

This jq program is a pipeline (the | passes the output of one filter to the next), and it means the following:

  1. take every JSON record from the logfile
  2. strip everything but the data payload
  3. select only those records whose kind equals TraceNoLedgerView AND whose slot satisfies the condition.

If you're instead interested in the common top-level fields, you can skip the "narrowing" .data step and qualify the field paths instead:

jq 'select(.data.kind == "TraceNoLedgerView" and .data.slot >= 23287 and .thread == "49")' logs/node-0.json

Converting to CSV

jq can be asked to emit CSV from the resulting data stream, one row per array (which means you have to convert each record to an array first):

jq '.data | select(.kind == "TraceMempoolValidateTxs")' logs/node-0.json | jq -r '[ .delta_t, .txsAccepted, .txsRejected, .txsRemoved ] | @csv' > test.csv
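
The two jq invocations can also be fused into a single program; with jq's null-input mode (-n) and the inputs builtin it's even possible to prepend a header row once (column names as in the command above):

jq -rn '(["delta_t","txsAccepted","txsRejected","txsRemoved"] | @csv),
        (inputs | .data
                | select(.kind == "TraceMempoolValidateTxs")
                | [ .delta_t, .txsAccepted, .txsRejected, .txsRemoved ] | @csv)' \
   logs/node-0.json > test.csv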