Case Study · Financial Data

Cutting postmortem analysis from 11 hours to 45 minutes per incident

A financial data provider reduced the senior SRE time spent on each postmortem from 11 hours to under 45 minutes by replacing manual log triage with LLM-generated root cause analysis at full log coverage.

11 hrs → 45 min

Per-incident postmortem time

88% → 0%

Recurring incidents from missed root causes

$2.1M

Annual senior engineering time recovered

100%

Log coverage for forensic completeness

34

Previously unidentified root cause patterns found in first 60 days

Postmortem quality score improvement

The Organisation

Global Financial Data Provider · Financial Data

The Challenge

Every production incident triggered a mandatory postmortem in which senior SREs manually reviewed log data, reconstructed timelines, and identified root causes. Because 88% of logs were sampled away, the process was forensically incomplete: teams were drawing conclusions from partial evidence. The average postmortem took 11 hours of senior engineering time, and incidents often recurred because the first analysis had missed the root cause.
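To make the cost of sampling concrete, the sketch below estimates how often the evidence for a root cause disappears entirely. It assumes uniform, independent per-line sampling at a 12% keep rate, which is an illustrative simplification; the case study does not describe how the sampling was actually performed, and the function name is hypothetical.

```python
def miss_probability(keep_rate: float, relevant_lines: int) -> float:
    """Chance that none of the relevant log lines survive sampling.

    Assumes each line is kept independently with probability keep_rate
    (an assumption for illustration, not the provider's actual sampler).
    """
    if not 0.0 <= keep_rate <= 1.0:
        raise ValueError("keep_rate must be between 0 and 1")
    if relevant_lines < 0:
        raise ValueError("relevant_lines must be non-negative")
    # Every relevant line must be dropped for the root cause to vanish.
    return (1.0 - keep_rate) ** relevant_lines


if __name__ == "__main__":
    # 0.12 keep rate corresponds to 88% of logs sampled away.
    for k in (1, 3, 5, 10):
        p = miss_probability(0.12, k)
        print(f"{k:>2} relevant lines: {p:.1%} chance all are dropped")
```

Under these assumptions, a root cause that shows up in only a handful of log lines is more likely than not to be missing from the sampled data altogether.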

The Approach

LLM analysis over the full, unsampled log stream was connected to the postmortem workflow. For each incident, the model generated a complete causal timeline, identified contributing factors, and cross-referenced similar historical patterns. This reduced the manual review phase from discovery to verification.
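The case study does not describe the implementation, so the following is only a hypothetical sketch of one step such a workflow might include: assembling a time-ordered view of the logs leading up to an incident, which a model or a reviewer could then reason over. The `LogEvent` fields, the function names, and the choice of a fixed lookback window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class LogEvent:
    # Hypothetical minimal log record; real schemas will differ.
    timestamp: datetime
    service: str
    level: str
    message: str


def build_timeline(
    events: Iterable[LogEvent],
    incident_start: datetime,
    lookback: timedelta,
) -> list[LogEvent]:
    """Return events inside the lookback window, oldest first."""
    window_start = incident_start - lookback
    in_window = [e for e in events if window_start <= e.timestamp <= incident_start]
    return sorted(in_window, key=lambda e: e.timestamp)


def earliest_error(timeline: list[LogEvent]) -> Optional[LogEvent]:
    """Return the first ERROR or FATAL event in an ordered timeline, if any."""
    return next((e for e in timeline if e.level in {"ERROR", "FATAL"}), None)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0, 0)  # illustrative incident time
    events = [
        LogEvent(start - timedelta(minutes=5), "api", "ERROR", "upstream timeout"),
        LogEvent(start - timedelta(minutes=10), "db", "ERROR", "connection pool exhausted"),
        LogEvent(start - timedelta(minutes=20), "db", "WARN", "slow query"),
    ]
    timeline = build_timeline(events, start, timedelta(minutes=30))
    for event in timeline:
        print(event.timestamp.isoformat(), event.service, event.level, event.message)
    print("Earliest error:", earliest_error(timeline))
```

With sampled logs, any of these events could be absent and the earliest visible error would point at a symptom rather than the cause, which is why the verification step depends on full coverage.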

"We were drawing conclusions from 12% of the evidence. Some of our most persistent recurring incidents turned out to have had obvious root causes — they were just in the data we'd been dropping."

Principal Site Reliability Engineer

Key Finding

In the first 60 days of full coverage, 34 previously unidentified root cause patterns were surfaced across historical incident data. Seven of these explained recurring incident classes that the team had been managing symptomatically for over a year. Fixing the underlying causes eliminated 23% of total incident volume.

Results at a Glance
Per-incident postmortem time: 11 hrs → 45 min
Recurring incidents from missed root causes: 88% → 0%
Annual senior engineering time recovered: $2.1M
Log coverage for forensic completeness: 100%
Previously unidentified root cause patterns found in first 60 days: 34
Postmortem quality score improvement