Observability

ZaneOps ships with two optional observability stacks you can deploy on demand:

  • OpenTelemetry tracing: trace the HTTP requests served by ZaneOps down to the SQL and Redis queries they trigger, in Grafana.
  • Temporal UI: inspect the Temporal workflows behind deployments, scheduled jobs and other background tasks.

Run the commands below from your installation directory (e.g. /var/www/zaneops), the same place where you run make deploy.

  1. Deploy the tracing stack (zane-otel):

    Terminal window
    make deploy-otel
  2. Uncomment these variables in your .env file:

    .env
    OTEL_TRACES_ENABLED="true"
    OTEL_EXPORTER_OTLP_ENDPOINT="http://zane.tempo:4317"

    Optionally set the Grafana admin credentials (defaults are admin / change-me-please):

    .env
    GRAFANA_ADMIN_USER="admin"
    GRAFANA_ADMIN_PASSWORD="a-strong-password"
  3. Re-deploy ZaneOps so the app and workers start emitting traces:

    Terminal window
    make deploy
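Before re-deploying, it can help to confirm that the two variables from step 2 are really uncommented. The following is a minimal sketch, not part of ZaneOps: the helper names env_value and otel_configured are hypothetical, and it assumes the .env file uses plain KEY="value" lines as shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with ZaneOps) to sanity-check a .env file.

# Print the value of an uncommented KEY="value" line in an env file.
# Usage: env_value FILE KEY
env_value() {
    # Only lines that start with KEY= match, so lines starting with '#'
    # are ignored. Take the last match and strip surrounding double quotes.
    sed -n "s/^$2=//p" "$1" | tail -n 1 | sed 's/^"//; s/"$//'
}

# Return 0 when tracing is enabled and an OTLP endpoint is set.
# Usage: otel_configured FILE
otel_configured() {
    [ "$(env_value "$1" OTEL_TRACES_ENABLED)" = "true" ] &&
        [ -n "$(env_value "$1" OTEL_EXPORTER_OTLP_ENDPOINT)" ]
}
```

With these defined, something like `otel_configured .env && make deploy` would only re-deploy once both variables are active.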

To stop tracing: run make stop-otel, comment out the OTEL_* variables again, and run make deploy.

To deploy the Temporal UI, run:

Terminal window
make deploy-temporal-ui

This deploys the zane-temporal stack and connects to the existing Temporal server, so no .env change is required. To stop: make stop-temporal-ui.

The tracing stack instruments the HTTP requests made to ZaneOps. For each request you get a span tree that includes the SQL queries and Redis (Valkey) queries it ran, both on their own and scoped to the request that triggered them. This makes it easy to spot slow endpoints, N+1 queries or unexpected cache access.

List of traced HTTP requests in Grafana Tempo

The tracing stack consists of two services, deployed as the zane-otel stack on the existing zane overlay network:

Service        Image             Role
zane-tempo     grafana/tempo     Receives and stores traces (OTLP gRPC on :4317, HTTP on :4318)
zane-grafana   grafana/grafana   UI to explore traces, pre-provisioned with Tempo and Loki datasources

The Temporal UI lets you inspect the workflows and activities that ZaneOps runs in the background (deployments, scheduled jobs, cleanup tasks…). It’s useful to follow a deployment step by step or to understand why a background task failed.

Screenshot of the Temporal UI dashboard
Screenshot of the Temporal UI dashboard showing a workflow for a deployment

The Temporal UI is a single service, deployed as the zane-temporal stack:

Service       Image           Role
temporal-ui   temporalio/ui   Web UI connected to the Temporal server at zane.temporal:7233

By default, neither Grafana nor the Temporal UI publishes a port: they’re only reachable from inside the zane overlay network. This is intentional, since these dashboards expose sensitive internals and should not be open to the public internet.

How you access them depends on how long you plan to keep them around:

Dashboard           Stack           Service alias      Internal port   Default published port
Grafana (tracing)   zane-otel       zane-grafana       3000            3004
Temporal UI         zane-temporal   zane.temporal_ui   8080            8080

If you only need the dashboard for a short debugging session, uncomment the published port in the corresponding stack file and re-deploy the stack.

For Grafana, edit docker-stack.prod-otel.yaml:

docker-stack.prod-otel.yaml
zane-grafana:
  # ...
  ports:
    - "3004:3000"

Then re-deploy and open http://<SERVER_IP>:3004:

Terminal window
make deploy-otel
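Once the stack is back up, a quick way to confirm the published port is reachable is to query Grafana's /api/health endpoint. This is a rough sketch under assumptions: grafana_url and check_grafana are hypothetical helper names, the port defaults to the 3004 mapping shown above, and curl must be installed on the machine you run it from.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with ZaneOps).

# Build the base URL for the published Grafana port.
# Usage: grafana_url HOST [PORT]   (PORT defaults to 3004)
grafana_url() {
    printf 'http://%s:%s' "$1" "${2:-3004}"
}

# Query Grafana's health endpoint; exits non-zero when the request fails
# or the server answers with an HTTP error status.
# Usage: check_grafana HOST [PORT]
check_grafana() {
    curl -fsS --max-time 5 "$(grafana_url "$@")/api/health"
}
```

For example, `check_grafana <SERVER_IP>` should print a small JSON health document when Grafana is reachable on port 3004.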