GreenSlope

SLOs and burn rates

An SLO is a promise about how reliable a service is. A burn rate is how fast you're breaking that promise. Burn-rate alerts are how GreenSlope decides to page you.

If you've never run SLOs before, this page is enough to start. If you've run them before, skim for GreenSlope-specific defaults.

SLO: an objective with a time window

A Service Level Objective (SLO) is a target expressed over a time window. Examples:

Two things make an SLO an SLO, not just a threshold:

  1. It has a window (7, 14, 28 days — 28 is the GreenSlope default).
  2. It has an error budget: the amount of "not meeting the objective" you can afford before you're officially off-target.

For 99.9% over 28 days, the budget is (1 − 0.999) × 28 days = 0.028 days, or about 40 minutes 19 seconds. That's how much total unavailability the service can accumulate over the 28-day window and still meet the SLO.
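The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names `error_budget_seconds` and `format_budget` are made up for this example and are not part of any GreenSlope API.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def error_budget_seconds(target: float, window_days: int) -> float:
    """Return the allowed 'bad' time, in seconds, for a target over a window.

    `target` is a fraction (0.999 for 99.9%); `window_days` is the SLO window.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return (1.0 - target) * window_days * SECONDS_PER_DAY


def format_budget(seconds: float) -> str:
    """Render a budget as whole minutes and seconds."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s"


budget = error_budget_seconds(0.999, 28)
print(format_budget(budget))  # 40 min 19 s
```

Tightening the target by one "nine" (99.99%) divides the budget by ten, which is why the window and the target always have to be stated together.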

Burn rate: how fast you're spending the budget

If your error budget is about 40 minutes over 28 days, a burn rate of 1× means you're spending it at exactly the allowed rate: you'd exhaust the budget exactly at the end of the window.

A burn rate above 1× means you're spending faster than allowed. A burn rate of 14× means you'll exhaust the full 28-day budget in 2 days. That's the alerting signal.
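As a rough sketch, burn rate is commonly computed as the observed error rate divided by the error rate the SLO allows (1 − target), and the time to exhaust a full budget is the window divided by the burn rate. The helper names below are hypothetical, chosen for this example, and do not describe how GreenSlope computes these values internally.

```python
def burn_rate(observed_error_rate: float, target: float) -> float:
    """Observed error rate as a multiple of the rate the SLO allows."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    allowed_error_rate = 1.0 - target
    return observed_error_rate / allowed_error_rate


def days_to_exhaustion(rate: float, window_days: float) -> float:
    """Days until a full budget is gone if this burn rate is sustained."""
    if rate <= 0.0:
        return float("inf")  # nothing is being spent
    return window_days / rate


# 1.4% of requests failing against a 99.9% target is a burn rate of about 14x.
rate = burn_rate(0.014, 0.999)
print(round(rate, 1))

# A sustained 14x burn uses up a 28-day budget in 2 days.
print(days_to_exhaustion(14.0, 28))
```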

Multi-window burn-rate alerting (Google's default, ours too)

A single threshold on error rate is a bad alert. Too noisy if tight; too slow to fire if loose. GreenSlope defaults to the multi-window burn-rate approach:

Alert severity   Short window   Long window   Burn rate
Page (sev-1)     5 min          1 h           ≥ 14.4×
Page (sev-2)     30 min         6 h           ≥ 6×
Ticket           2 h            24 h          ≥ 3×
Ticket           6 h            72 h          ≥ 1×

An alert fires only when both windows cross the threshold simultaneously. The short window catches fast-moving incidents; the long window filters spikes that recover on their own. Together they page when it matters and shut up when it doesn't.
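The both-windows rule can be sketched as follows. This is a minimal illustration under assumptions: the `BurnRateRule` type, the window labels such as "5m", and the `firing_rules` function are invented for the example, and the burn rates per window are assumed to have been computed elsewhere. Only the thresholds and window pairs come from the table above.

```python
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class BurnRateRule:
    severity: str
    short_window: str  # label of the short lookback window
    long_window: str   # label of the long lookback window
    threshold: float   # minimum burn rate required in BOTH windows


# Values copied from the default alert table; the labels are illustrative.
DEFAULT_RULES = [
    BurnRateRule("page (sev-1)", "5m", "1h", 14.4),
    BurnRateRule("page (sev-2)", "30m", "6h", 6.0),
    BurnRateRule("ticket", "2h", "24h", 3.0),
    BurnRateRule("ticket", "6h", "72h", 1.0),
]


def firing_rules(
    burn_rates: Mapping[str, float],
    rules: List[BurnRateRule] = DEFAULT_RULES,
) -> List[BurnRateRule]:
    """Return the rules whose short AND long windows both meet the threshold."""
    return [
        rule
        for rule in rules
        if burn_rates[rule.short_window] >= rule.threshold
        and burn_rates[rule.long_window] >= rule.threshold
    ]


# A brief spike: the short windows are hot, the long windows are not.
spike = {"5m": 20.0, "30m": 9.0, "1h": 2.0, "2h": 1.5,
         "6h": 0.8, "24h": 0.5, "72h": 0.3}
print([r.severity for r in firing_rules(spike)])  # []

# A sustained incident: both the 5 min and 1 h windows exceed 14.4x.
sustained = {"5m": 20.0, "30m": 18.0, "1h": 16.0, "2h": 10.0,
             "6h": 4.0, "24h": 1.2, "72h": 0.5}
print([r.severity for r in firing_rules(sustained)])  # ['page (sev-1)']
```

In the spike case the 5-minute burn rate alone would have paged under a single-threshold alert; requiring the 1-hour window to agree is what suppresses it.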

Default SLOs GreenSlope creates

When you add a service, we create two default SLOs you can keep or replace:

Both use the multi-window burn-rate alert table above. If the defaults don't match your workload, define your own in the dashboard under Services → SLOs.

What isn't an SLO

Two things teams commonly call SLOs that aren't:

If you can't write it down as "N% of X, over Y days", you don't have an SLO yet.
