RootScout automatically traces service dependencies, correlates telemetry and code changes, and surfaces the root cause of incidents — so your on-call team can fix instead of search.
The Problem
Every minute your on-call engineer spends reading logs across 20 services is a minute your users are experiencing downtime. Traditional approaches are slow, noisy, and error-prone.
When an alert fires, engineers must manually:
RootScout does the legwork instantly:
How It Works
Think of a building on fire. The old approach sends a detective to search every room. RootScout hands the detective a blueprint first — so they go straight to the source.
Services emit OpenTelemetry traces, metrics, and logs to RootScout's OTLP endpoints in real time.
Spans are parsed to automatically construct a live service dependency graph — nodes, edges, and health status.
When an alert fires, a breadth-first search (BFS) over the dependency graph selects only the services related to the alerting one, so no irrelevant telemetry is sent to the LLM.
Recent GitHub PRs and commits are automatically attached — correlating deployments with the outage window.
A structured prompt is sent to an LLM. The model reasons step by step and returns a JSON RCA report.
The RCA is posted to Slack with root cause, confidence, and recommended actions.
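The graph, traversal, and correlation steps above can be sketched in a few lines. This is a minimal illustration, not RootScout's actual implementation: the span and commit shapes (`service`, `parent_service`, `merged_at`), the function names, the two-hop depth limit, and the two-hour lookback are all assumptions, and plain dictionaries stand in for the graph so the sketch has no dependencies.

```python
from collections import deque
from datetime import datetime, timedelta

# Hypothetical, simplified spans: each names its service and the service that called it.
spans = [
    {"service": "checkout", "parent_service": "gateway"},
    {"service": "payments", "parent_service": "checkout"},
    {"service": "inventory", "parent_service": "checkout"},
    {"service": "search", "parent_service": "gateway"},
]


def build_graph(spans):
    """Return an adjacency map of caller -> set of called services."""
    graph = {}
    for span in spans:
        caller, callee = span.get("parent_service"), span["service"]
        graph.setdefault(callee, set())
        if caller and caller != callee:
            graph.setdefault(caller, set()).add(callee)
    return graph


def related_services(graph, start, max_depth=2):
    """Breadth-first search downstream from the alerting service, up to max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen


def commits_in_window(commits, scope, alert_time, lookback=timedelta(hours=2)):
    """Keep commits that touch an in-scope service and landed shortly before the alert."""
    return [
        c for c in commits
        if c["service"] in scope and alert_time - lookback <= c["merged_at"] <= alert_time
    ]


graph = build_graph(spans)
scope = related_services(graph, "checkout")
alert_time = datetime(2024, 1, 1, 12, 0)
commits = [
    {"service": "payments", "merged_at": datetime(2024, 1, 1, 11, 30)},
    {"service": "search", "merged_at": datetime(2024, 1, 1, 11, 45)},
    {"service": "inventory", "merged_at": datetime(2024, 1, 1, 8, 0)},
]
suspects = commits_in_window(commits, scope, alert_time)
```

Here an alert on `checkout` scopes the analysis to `checkout`, `payments`, and `inventory`; `search` is never reached, and only the recent `payments` commit survives the time filter.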
Features
RootScout combines industry-standard observability protocols with AI analysis into a single, production-ready platform.
The service dependency graph is built automatically from OTel traces using NetworkX and tracks service health, latency, and error rates in real time.
Native ingestion of OpenTelemetry traces, metrics, and logs. Drop-in compatible with any OTel-instrumented stack.
Supports Google Gemini and Anthropic Claude. Structured Chain-of-Thought prompting with JSON-formatted RCA output.
Webhook-based ingestion of GitHub push events and pull requests. Recent deployments are automatically correlated with incidents.
Real-time Slack alerts with severity indicators. Incidents are posted automatically to your configured channel with actionable context.
Production-ready REST API with OTLP collector endpoints, GitHub webhooks, and background processing. Docker-ready.
Graph-scoped BFS traversal means the LLM only sees relevant services — reducing hallucinations and API costs.
JSON output with root cause service, confidence score, reasoning chain, and recommended actions.
Benchmark against 10 synthetic scenarios with ground-truth root causes and a rigorous three-axis scoring rubric.
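To make the structured RCA output concrete, here is one way a consumer might parse and validate such a report before acting on it. The field names (`root_cause_service`, `confidence`, `reasoning`, `recommended_actions`) and the 0 to 1 confidence range are assumptions chosen to mirror the description above; the real schema may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class RCAReport:
    """Hypothetical shape of an RCA report; field names are illustrative."""
    root_cause_service: str
    confidence: float
    reasoning: list
    recommended_actions: list


def parse_rca(raw: str) -> RCAReport:
    """Parse a JSON RCA report and reject missing fields or an out-of-range confidence."""
    data = json.loads(raw)
    report = RCAReport(
        root_cause_service=str(data["root_cause_service"]),  # KeyError if absent
        confidence=float(data["confidence"]),
        reasoning=list(data["reasoning"]),
        recommended_actions=list(data["recommended_actions"]),
    )
    if not 0.0 <= report.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return report


sample = json.dumps({
    "root_cause_service": "payments",
    "confidence": 0.82,
    "reasoning": ["payments error rate rose first", "a payments deploy preceded the alert"],
    "recommended_actions": ["roll back the latest payments deploy"],
})
report = parse_rca(sample)
```

Validating at the boundary means a malformed or overconfident model response fails loudly instead of reaching the alert channel.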
Sample Slack alert delivered by RootScout:
Evaluation
RootScout ships with a rigorous evaluation suite — synthetic benchmarks with known root causes scored across three independent axes.
| Dataset | Strengths | Limitations | Best Model | Component match score | RCA cosine similarity score |
|---|---|---|---|---|---|
| OpenRCA (Microsoft) | Emulates real-life production incidents | Missing codebase | Claude Opus 4.6 | 45% | 18% |
| RCAEvals | Includes both telemetry and codebase, allowing deeper RCA analysis | Doesn't emulate real-life incidents well | Claude Opus 4.6 | 56% | 28% |
| Synthetic data | Easy to generate; can test different scenarios | Doesn't emulate real-life incidents well | Claude Opus 4.6 | 100% | 91% |
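For readers unfamiliar with the two score columns, the sketch below shows one plausible way to compute them. It is an assumption-laden illustration, not the benchmark's actual scorer: the page does not say how RCA text is vectorized, so simple term counts stand in for whatever representation (most likely embeddings) is really used, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def component_match(predicted, expected):
    """Fraction of scenarios where the predicted root-cause service equals the ground truth."""
    pairs = list(zip(predicted, expected))
    if not pairs:
        return 0.0
    hits = sum(p.strip().lower() == e.strip().lower() for p, e in pairs)
    return hits / len(pairs)


def cosine_similarity(text_a, text_b):
    """Cosine similarity between term-count vectors of two RCA explanations."""
    a = Counter(re.findall(r"[a-z0-9]+", text_a.lower()))
    b = Counter(re.findall(r"[a-z0-9]+", text_b.lower()))
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


match = component_match(["payments", "search"], ["Payments", "gateway"])
same = cosine_similarity("connection pool exhausted", "connection pool exhausted")
disjoint = cosine_similarity("connection pool exhausted", "bad deploy")
```

The component score only asks whether the right service was named, while the similarity score compares the explanation text, which is why the two columns can diverge sharply for the same dataset.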
Integrations
Built on open standards. If you already emit OTel, you're 90% of the way there.
Demo
Watch how RootScout automatically identifies the root cause of a live incident — from alert to resolution.