Observability & SLOs
End-to-end observability: dashboards, metrics, logs and traces tied to service level objectives (SLOs) that reflect what the business cares about. Tooling includes Prometheus, Grafana, Loki/ELK, Tempo/Jaeger and OpenTelemetry.
- Service health: golden signals, SLI/SLO design and error budgets
- Dashboards for product and platform teams with drill-downs
- Alerting strategy that keeps alerts actionable and reduces noise
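
To make the error-budget and alerting bullets concrete, here is a minimal sketch of the arithmetic behind them, assuming a request-based availability SLI (successful requests divided by total requests). The function names and the example numbers are hypothetical and chosen only for illustration. The 14.4 threshold is a commonly cited fast-burn value for a 30-day window (2% of the budget spent in one hour), not a setting taken from this offering.

```python
def error_budget(target: float) -> float:
    """Fraction of requests allowed to fail for an SLO target such as 0.999."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return 1.0 - target


def budget_remaining(target: float, total: int, failed: int) -> float:
    """Share of the error budget still unspent.

    1.0 means untouched, 0.0 means fully spent, negative means overspent.
    """
    if total == 0:
        return 1.0  # no traffic, so nothing has been spent
    allowed_failures = error_budget(target) * total
    return 1.0 - failed / allowed_failures


def burn_rate(target: float, total: int, failed: int) -> float:
    """Observed error rate divided by the budgeted error rate.

    A sustained burn rate of 1.0 spends the budget exactly at the end
    of the SLO window; higher values spend it proportionally faster.
    """
    if total == 0:
        return 0.0
    return (failed / total) / error_budget(target)


def should_page(long_window_rate: float, short_window_rate: float,
                threshold: float = 14.4) -> bool:
    """Page only when both a long and a short window exceed the threshold.

    Requiring both windows is one way to cut noise: the long window shows
    the burn is significant, the short window shows it is still happening.
    The default threshold is an assumption, not a recommendation.
    """
    return long_window_rate >= threshold and short_window_rate >= threshold


if __name__ == "__main__":
    # Hypothetical numbers: 99.9% target, 1,000,000 requests, 250 failures.
    print(f"budget remaining: {budget_remaining(0.999, 1_000_000, 250):.2%}")
    print(f"burn rate: {burn_rate(0.999, 10_000, 144):.1f}")
```

In practice the request counts would come from the metrics backend (for example Prometheus queries over the two windows), and the paging decision would usually live in alerting rules rather than application code; the sketch only shows the calculation those rules encode.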