Operations Monitoring - osama1998H/Moca GitHub Wiki
Health checks, Prometheus metrics, OpenTelemetry tracing, profiling, and CLI diagnostics.
Moca now exposes an observability surface across the server and CLI:
- health endpoints for liveness and readiness
- Prometheus metrics from a dedicated registry
- OpenTelemetry tracing over OTLP/gRPC
- moca doctor for project, infrastructure, and per-site diagnostics
- moca dev bench for live microbenchmarks
- moca dev profile for pprof capture and SVG flamegraphs
- GET /health
- GET /health/ready
- GET /health/live
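A minimal sketch of polling the three health endpoints from a script. The base URL and the rule that any 2xx status means healthy are assumptions, since this page does not state the response codes. The fetch parameter is a hypothetical hook so the logic can be exercised without a running server.

```python
import urllib.error
import urllib.request

# Paths listed on this page.
HEALTH_PATHS = ("/health", "/health/ready", "/health/live")


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses still carry a status code.
        return exc.code
    except (urllib.error.URLError, OSError):
        return None


def probe(base_url, fetch=http_status):
    """Map each health path to True when it answers with a 2xx status.

    Treating 2xx as healthy is an assumption, not documented Moca behavior.
    """
    results = {}
    for path in HEALTH_PATHS:
        status = fetch(base_url.rstrip("/") + path)
        results[path] = status is not None and 200 <= status < 300
    return results
```

Calling probe("http://localhost:<port>") with the port your server listens on returns a dictionary keyed by path, which is easy to feed into a deploy script or a readiness gate.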
Prometheus metrics are enabled by default and served from observability.metrics.path, which defaults to /metrics.
The server registers 13 framework metrics covering:
- HTTP request count and latency
- document operation count and latency
- cache hits and misses
- queue throughput and duration
- Kafka publish counts and consumer lag
- active WebSocket connections
- database query duration and active pool usage
Only Moca's custom registry is exposed on the metrics endpoint; default Go process metrics are not added automatically.
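As a rough illustration of consuming the endpoint, the sketch below fetches the Prometheus text output and parses it into a dictionary. It handles the common sample line form only and ignores optional timestamps. The metric names in the usage note are hypothetical, because this page does not list the 13 metric names.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition into {sample_name_with_labels: float}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is whatever follows the last space; timestamps are not handled.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def fetch_metrics(url, timeout=2.0):
    """Download the metrics page at url and parse it."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

For example, a line such as demo_requests_total{method="GET"} 42 (a made-up name) becomes the key 'demo_requests_total{method="GET"}' with the value 42.0.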
When development.enable_pprof: true is set, the server exposes:
- GET /debug/pprof/
- GET /debug/pprof/cmdline
- GET /debug/pprof/profile
- GET /debug/pprof/symbol
- GET /debug/pprof/trace
development:
  enable_pprof: true
observability:
  metrics:
    enabled: true
    path: /metrics
  tracing:
    enabled: true
    exporter: otlp
    endpoint: localhost:4317
    insecure: true
    sample_rate: 1.0

Notes:
- observability.metrics.enabled defaults to enabled when omitted
- observability.metrics.path defaults to /metrics
- tracing is disabled unless observability.tracing.enabled: true
- tracing defaults to exporter: otlp, endpoint: localhost:4317, insecure: true, and sample_rate: 1.0
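The default rules in the notes can be sketched as a small merge over a parsed observability mapping. This only illustrates the documented semantics and is not Moca's actual config loader. The function name effective_observability is hypothetical.

```python
import copy

# Defaults as described in the notes on this page.
DEFAULTS = {
    "metrics": {"enabled": True, "path": "/metrics"},
    "tracing": {
        "enabled": False,
        "exporter": "otlp",
        "endpoint": "localhost:4317",
        "insecure": True,
        "sample_rate": 1.0,
    },
}


def effective_observability(config):
    """Overlay a parsed `observability` mapping on the documented defaults."""
    merged = copy.deepcopy(DEFAULTS)
    for section, values in (config or {}).items():
        # Keys the user set win; everything else keeps its default.
        merged.setdefault(section, {}).update(values or {})
    return merged
```

With only tracing.enabled set to true, the result still carries the OTLP exporter, the localhost:4317 endpoint, and a sample rate of 1.0, while metrics stay enabled on /metrics.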
moca doctor runs project, infrastructure, and optional per-site checks. Depending on flags and configuration it can validate:
- config parsing
- installed Moca version
- local disk space
- PostgreSQL
- Redis
- Kafka
- Meilisearch
- object storage
- site schema existence
- site search indexes
- queue backlog / DLQ status
Useful flags:
- --site <site> adds site-specific checks
- --verbose shows details such as latency and version info
- --json returns machine-readable output
- --fix exists for future auto-remediation hooks but does not yet repair the current check set
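For automation, the documented flags can be assembled into an argument list before handing it to a process runner. This is a small sketch that uses only the flags listed above. The helper name doctor_command is hypothetical, and the shape of the --json output is not described on this page, so the sketch does not try to interpret it.

```python
def doctor_command(site=None, verbose=False, as_json=False):
    """Build an argv list for `moca doctor` from the flags documented above."""
    argv = ["moca", "doctor"]
    if site:
        # Adds the site-specific checks.
        argv += ["--site", site]
    if verbose:
        argv.append("--verbose")
    if as_json:
        argv.append("--json")
    return argv
```

The resulting list can be passed to subprocess.run(..., capture_output=True, text=True). Inspect the real JSON output before relying on any particular field in it.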
moca monitor metrics fetches the raw Prometheus text output from http://localhost:<port>/metrics. This is the quickest way to verify that the default metrics endpoint is reachable locally.
moca dev bench runs synthetic latency benchmarks against a live site's PostgreSQL and Redis connections:
- read
- write
- query
- cache
- all (default)
Results include min/max/mean plus p50, p95, p99, ops/sec, and error count.
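To make those result fields concrete, here is a sketch that derives the same summary from a list of latency samples in milliseconds. It uses the nearest-rank percentile definition and computes ops/sec as if operations ran back to back on one worker. Both choices are assumptions, and Moca's own calculation may differ.

```python
import math
import statistics


def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]


def summarize(latencies_ms, errors=0):
    """Summarize latency samples using the fields the bench results list."""
    ordered = sorted(latencies_ms)
    total_seconds = sum(ordered) / 1000.0
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        # Assumes sequential execution: operations divided by total busy time.
        "ops_per_sec": len(ordered) / total_seconds if total_seconds else float("inf"),
        "errors": errors,
    }
```

With few samples, p95 and p99 collapse onto the slowest observations, which is worth keeping in mind when reading a short benchmark run.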
moca dev profile downloads a pprof profile from the running server and either:
- saves the raw .pb.gz, or
- generates an SVG flamegraph with go tool pprof
cpu and block profiles honor --duration; mem, goroutine, and mutex are snapshots.
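The timed versus snapshot distinction can be sketched as a URL builder against the pprof endpoints. The path names below are the standard ones served by Go's net/http/pprof package. Whether moca dev profile maps its profile types to exactly these paths is an assumption, as is the 30-second default.

```python
from urllib.parse import urlencode

# Standard net/http/pprof profile names; the mapping from Moca's profile
# types to these paths is assumed, not taken from Moca's source.
PPROF_PATHS = {
    "cpu": "profile",
    "mem": "heap",
    "goroutine": "goroutine",
    "mutex": "mutex",
    "block": "block",
}

# Profile types that honor a duration, per the text above.
TIMED = {"cpu", "block"}


def pprof_url(base_url, kind, duration_seconds=30):
    """Return the pprof URL for kind, adding ?seconds= only for timed profiles."""
    url = f"{base_url.rstrip('/')}/debug/pprof/{PPROF_PATHS[kind]}"
    if kind in TIMED:
        url += "?" + urlencode({"seconds": int(duration_seconds)})
    return url
```

A client fetching a timed profile needs an HTTP timeout longer than the requested duration, since the server only responds once sampling has finished.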
The repository's docker-compose.yml now includes Jaeger for local trace inspection:
- OTLP gRPC ingest: localhost:4317
- Jaeger UI: http://localhost:16686
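Before enabling tracing, it can help to confirm that both Jaeger ports are accepting connections. The sketch below performs a plain TCP connect only. It does not speak OTLP or verify that spans are ingested. The helper names are hypothetical.

```python
import socket


def split_endpoint(endpoint):
    """Split "host:port" into (host, port)."""
    host, _, port = endpoint.rpartition(":")
    return host, int(port)


def is_listening(endpoint, timeout=1.0):
    """Return True if a TCP connection to endpoint succeeds within timeout."""
    try:
        with socket.create_connection(split_endpoint(endpoint), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Checking is_listening("localhost:4317") and is_listening("localhost:16686") after starting the compose stack gives a quick signal that the OTLP ingest port and the UI port are both up.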
This is intended for development and test environments; production deployment guidance still lives under the deployment docs.
- moca monitor live is still a placeholder command and does not launch a TUI dashboard
- moca monitor metrics assumes the default /metrics path and does not follow a custom observability.metrics.path
- the wiki does not yet ship Grafana dashboard templates or alert rules
- tracing is available, but exporters other than OTLP are not documented because the current server wiring targets OTLP only