Metrics API - zig-o11y/opentelemetry-sdk GitHub Wiki

The Metrics API in the Zig OpenTelemetry SDK, located in the src/api/metrics module, provides a robust and type-safe interface for instrumenting applications to collect telemetry data.

It is designed to align with the OpenTelemetry Metrics API Specification. It facilitates the creation, recording, and management of metrics such as counters, gauges, and histograms.

This document outlines the main components of the Metrics API, their implementation details, and the rationale behind the architectural decisions, emphasizing how they reflect the OpenTelemetry specification while leveraging Zig’s systems programming capabilities.

Overview

The metrics module defines instrumentation decoupled from the SDK’s processing and exporting mechanisms. This separation adheres to the OpenTelemetry specification’s modular design, allowing developers to instrument code without direct dependencies on the underlying metric collection or export logic.

The primary components of the Metrics API include:

  • MeterProvider: The entry point for creating instruments.

  • Instrument: Container type for instruments such as Counter, UpDownCounter, Gauge, Histogram, and others for recording measurements.

  • Measurement: A structure capturing metric data points for processing.

Each component is implemented to balance the flexibility required by OpenTelemetry with Zig’s emphasis on compile-time safety and performance. The following sections detail these components, their design rationale, and their alignment with the specification.

MeterProvider

This component is the entry point for the creation of instruments. Through the getMeter method, we can get a Meter.

It stores a collection of Meters, which are used to create and manage instruments. A MeterProvider instance must be shut down by calling its shutdown() method, which releases all the resources attached to the MeterProvider and to the Meters it owns.

Users may create one or many MeterProviders, but to limit resource usage it is preferable to create a single one and pass it around the program, or to pass around the Meters it owns. A MeterProvider should be registered with one or more MetricReaders (part of the SDK), which collect the metric data points and export them in various ways.
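The lifecycle described above can be pictured with a small, self-contained toy model. It does not use the SDK: ToyMeter and ToyMeterProvider are hypothetical types, the fixed capacity of 8 meters is arbitrary, and returning the same Meter for the same name is an assumption made for illustration. Only the method names getMeter and shutdown come from the text; their real signatures may differ.

```zig
const std = @import("std");

// Hypothetical stand-in for a Meter, identified only by its name.
const ToyMeter = struct { name: []const u8 };

// Hypothetical provider that owns its meters in a fixed-size buffer.
const ToyMeterProvider = struct {
    meters: [8]ToyMeter = undefined,
    count: usize = 0,

    // Returns the meter registered under `name`, creating it on first use
    // (an assumption of this sketch).
    fn getMeter(self: *ToyMeterProvider, name: []const u8) error{TooManyMeters}!*ToyMeter {
        for (self.meters[0..self.count]) |*m| {
            if (std.mem.eql(u8, m.name, name)) return m;
        }
        if (self.count == self.meters.len) return error.TooManyMeters;
        self.meters[self.count] = .{ .name = name };
        self.count += 1;
        return &self.meters[self.count - 1];
    }

    // Drops every meter owned by the provider.
    fn shutdown(self: *ToyMeterProvider) void {
        self.count = 0;
    }
};
```

In calling code, creating the provider once and writing `defer provider.shutdown();` right after it keeps the cleanup next to the creation, and the pointers returned by getMeter can then be passed to the parts of the program that need them.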

A Meter keeps a collection of Instrument values, which lets it fetch the data points collected by its instruments when data needs to be exported. When data points are fetched, they are normalized by applying an "Aggregation" function, which keeps each time series, identified by a unique set of attributes (key/value metadata), distinct.
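To make the aggregation step concrete, here is a minimal sketch of a sum aggregation. It is not the SDK's implementation: Point, Series, and sumByAttributes are hypothetical names, an attribute set is reduced to a single string label, and the output is a caller-provided buffer so that no allocator is needed.

```zig
const std = @import("std");

// Hypothetical raw data point: one value plus a single label standing in
// for a full attribute set.
const Point = struct { label: []const u8, value: i64 };

// Hypothetical aggregated time series, one per distinct label.
const Series = struct { label: []const u8, sum: i64 };

// Sum aggregation: points sharing a label are folded into one series,
// points with different labels stay in separate series.
// Returns the number of series written into `out`.
fn sumByAttributes(points: []const Point, out: []Series) usize {
    var count: usize = 0;
    outer: for (points) |p| {
        for (out[0..count]) |*s| {
            if (std.mem.eql(u8, s.label, p.label)) {
                s.sum += p.value;
                continue :outer;
            }
        }
        // Assumes `out` is large enough to hold every distinct label.
        out[count] = .{ .label = p.label, .sum = p.value };
        count += 1;
    }
    return count;
}
```

Given three points labelled "GET", "POST", and "GET", the function writes two series, one per label, which is the "distinct time series" property described above.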

Instrument

The Instrument container abstracts the connection between the different types of instruments and the Meter that stores them. A Meter must store every kind of supported instrument (synchronous and asynchronous), so the Instrument struct wraps all the possible instrument implementations in a single concrete type.

In instrument.zig, we store all implementations of instruments. They are all based on the same concept of recording contiguous measurements and keeping them in memory until a collection cycle happens. The purpose of Instrument is to dispatch the collection of data points across all instruments (using polymorphism) via the getInstrumentsData method.
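One common way to get this kind of dispatch in Zig is a tagged union. The sketch below assumes that approach purely for illustration; the actual layout in instrument.zig may differ. ToyCounter, ToyGauge, ToyInstrument, and pointCount are hypothetical names, and pointCount merely stands in for a collection method such as getInstrumentsData.

```zig
const std = @import("std");

// Hypothetical stand-ins for two instrument implementations.
const ToyCounter = struct {
    total: i64 = 0,
    recorded: usize = 0,

    fn add(self: *ToyCounter, delta: i64) void {
        self.total += delta;
        self.recorded += 1;
    }

    fn pointCount(self: ToyCounter) usize {
        return self.recorded;
    }
};

const ToyGauge = struct {
    last: ?f64 = null,

    fn record(self: *ToyGauge, value: f64) void {
        self.last = value;
    }

    fn pointCount(self: ToyGauge) usize {
        return if (self.last == null) 0 else 1;
    }
};

// One concrete type that can hold any supported instrument, so a single
// container can store them side by side.
const ToyInstrument = union(enum) {
    counter: ToyCounter,
    gauge: ToyGauge,

    // `inline else` generates one branch per variant, so the call is
    // dispatched to the matching implementation.
    fn pointCount(self: ToyInstrument) usize {
        return switch (self) {
            inline else => |impl| impl.pointCount(),
        };
    }
};
```

A caller can then loop over a slice of ToyInstrument values and call pointCount on each one without knowing which variant it holds.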

Each instrument implementation is generic over the type of the metric's data points. For example, Gauge(T) accepts u16, u32, u64 (and their signed counterparts) as well as f32 and f64 measurements. In practice, a user of the API can build a Gauge(u16) that records values between 0 and 2^16-1. This stores the data points in less memory when the user knows that a given metric will never need a wider type.
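The generic pattern can be sketched with a function that takes a type and returns a struct. SizedGauge below is a hypothetical, simplified stand-in for Gauge(T): it keeps only the last recorded value, and the compile-time check on T is an assumption about how unsupported types could be rejected, not a description of the SDK's code.

```zig
const std = @import("std");

// Hypothetical generic gauge: T is fixed at compile time, so a
// SizedGauge(u16) stores a 2-byte value where a SizedGauge(u64) stores
// an 8-byte one.
fn SizedGauge(comptime T: type) type {
    // Reject unsupported value types at compile time.
    switch (T) {
        u16, u32, u64, i16, i32, i64, f32, f64 => {},
        else => @compileError("unsupported gauge value type: " ++ @typeName(T)),
    }
    return struct {
        const Self = @This();

        last: ?T = null,

        fn record(self: *Self, value: T) void {
            self.last = value;
        }
    };
}
```

With this shape, `SizedGauge(u16)` accepts 65535 as its largest value, and passing a value that does not fit in a u16 is a type error rather than a silent truncation.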

Measurements

In measurements.zig we define the structs to keep track of the raw data points and their metadata as recorded by instruments.

DataPoint(T) is a generic type that holds the value: T of the data point being recorded along with its metadata: timestamps, attributes (key/value pairs that identify a data point), and possibly more. DataPoint values are stored in the instruments' state between collection cycles and represent the values of the metrics before they are normalized or aggregated.
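A possible shape for such a type is sketched below. Only the `value: T` field is taken from the description above; ToyDataPoint, Attr, and the field names time_unix_nano and attributes are assumptions for illustration and are not copied from measurements.zig.

```zig
const std = @import("std");

// Hypothetical attribute: one key/value pair, restricted to string values.
const Attr = struct { key: []const u8, value: []const u8 };

// Hypothetical generic data point carrying a value and its metadata.
fn ToyDataPoint(comptime T: type) type {
    return struct {
        value: T,
        // Nanoseconds since the Unix epoch; null when no timestamp was recorded.
        time_unix_nano: ?u64 = null,
        // Empty by default, for data points recorded without attributes.
        attributes: []const Attr = &.{},
    };
}
```

Because the metadata fields have defaults, a data point can be built from a value alone, for example `ToyDataPoint(f64){ .value = 0.5 }`.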

The OTLP protocol used to send data over the network restricts the concrete number types to i64 and f64, so these are the preferred conversion targets when data points are extracted from instruments.
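The widening conversion can be sketched as follows. Number and toNumber are hypothetical names, and returning null for a u64 value that does not fit in an i64 is one possible policy chosen for this sketch; the SDK may handle that case differently.

```zig
const std = @import("std");

// Hypothetical export-side number with the two representations named above.
const Number = union(enum) { int: i64, double: f64 };

// Widens a supported measurement type to i64 or f64.
// Returns null when a u64 value does not fit in an i64.
fn toNumber(comptime T: type, value: T) ?Number {
    switch (T) {
        // These integer types always fit in an i64.
        u16, u32, i16, i32, i64 => return Number{ .int = value },
        // A u64 may exceed the i64 range, so the cast is checked.
        u64 => {
            const v = std.math.cast(i64, value) orelse return null;
            return Number{ .int = v };
        },
        f32, f64 => return Number{ .double = value },
        else => @compileError("unsupported measurement type: " ++ @typeName(T)),
    }
}
```

Since T is known at compile time, only the branch matching T is compiled for each call, so the conversion adds no runtime type check.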

The Measurements struct represents a collection of data points exported from an instrument, which is in turn attached to a Meter. This type is used to move data from the collection components to the exporting components of the SDK (MetricReader, MetricExporter).