Status — GreenSlope

Status

Live · updated 14 seconds ago

Real-time health of every GreenSlope service, region, and dependency. Probes run every 30 seconds from seven geographies. We don't silently fix things — if a probe fails, it shows up here within a minute, and the incident note stays up for a year.

Overall uptime

99.987%

Trailing 90 days · all services

Ingest p95

84 ms

Cross-region median, last hour

Open incidents

0

Last resolved 11 days ago

Next maintenance

May 3

03:00–03:30 UTC · EU-2 failover

§ 01 Service status

Last 90 days · UTC

Every service we run, with 90 days of history.

Service · Status · Uptime · Latency · Last 90 days

GreenSlope Dashboard
app.greenslope.io
Web app — deploy tracking, incident loop, postmortems, admin.
Operational · Uptime 99.992% · Latency 142 ms
Ingestion API
ingest.greenslope.io
SDK event ingest across EU, UK, and US regions. The front door.
Operational · Uptime 99.998% · Latency 84 ms
Public REST API
api.greenslope.io
Read/write API for integrations, CI/CD, and the CLI.
Operational · Uptime 99.989% · Latency 112 ms
Auth & SSO
auth.greenslope.io
Login, session, SAML/SCIM. Shared by dashboard and API surfaces.
Operational · Uptime 99.995% · Latency 58 ms
Root-cause agent
agent.greenslope.io
Private inference endpoint that drafts root-cause analyses.
Operational · Uptime 99.971% · Latency 4.1 s
Alert & notification delivery
Slack · email · webhooks · PagerDuty
Outbound incident delivery. Retried across four providers.
Operational · Uptime 99.984% · Latency 1.3 s
Integration workers
GitHub · Linear · Jira · Datadog · Sentry
Inbound webhooks and outbound sync for linked systems.
Operational · Uptime 99.96% · Latency 640 ms
Documentation site
docs.greenslope.io
This is a static site on Cloudflare Pages.
Operational · Uptime 100.000% · Latency 19 ms
Marketing site
greenslope.io
The page you are on right now, served from the edge.
Operational · Uptime 99.999% · Latency 22 ms

Legend: Operational · Degraded · Outage · Maintenance
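To read the uptime column as time, a percentage over the 90-day window can be converted into minutes of downtime. The sketch below assumes uptime is measured over wall-clock minutes for the whole window; the function name is illustrative and not part of any GreenSlope tooling.

```python
WINDOW_DAYS = 90  # the trailing window used on this page


def downtime_minutes(uptime_percent, window_days=WINDOW_DAYS):
    """Minutes of downtime implied by an uptime percentage over the window."""
    window_minutes = window_days * 24 * 60  # 129,600 minutes in 90 days
    return (100.0 - uptime_percent) / 100.0 * window_minutes


if __name__ == "__main__":
    # Uptime figures copied from the service table above.
    for name, uptime in [
        ("GreenSlope Dashboard", 99.992),
        ("Ingestion API", 99.998),
        ("Integration workers", 99.96),
    ]:
        print(f"{name}: about {downtime_minutes(uptime):.1f} min of downtime")
```

Under that assumption, 99.96% over 90 days works out to roughly 52 minutes, and 99.998% to under 3 minutes.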

§ 02 Ingest p95 latency by region

24h · 30s buckets

How fast your events get acknowledged, per region.

Probed from seven geographies into the nearest ingest endpoint. The small bump at −12h was a planned CDN cutover in EU-2; p95 stayed safely under the 300 ms SLO.

US-East: 84 ms
EU-Central: 79 ms
UK-London: 73 ms
AP-Singapore: 136 ms
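For readers who want to reproduce this kind of view from their own probe data, here is a minimal sketch of computing p95 per 30-second bucket and flagging buckets over the 300 ms SLO. It assumes samples arrive as (unix seconds, latency in ms) pairs and uses the nearest-rank percentile; how GreenSlope computes its percentiles is not stated on this page, so treat the method and all names as assumptions.

```python
import math
from collections import defaultdict

BUCKET_SECONDS = 30   # bucket width used by the chart
SLO_MS = 300.0        # ingest p95 SLO quoted above


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def p95_by_bucket(samples):
    """Group (unix_seconds, latency_ms) samples into buckets; return {bucket_start: p95}."""
    buckets = defaultdict(list)
    for ts, latency_ms in samples:
        start = int(ts) // BUCKET_SECONDS * BUCKET_SECONDS
        buckets[start].append(latency_ms)
    return {start: p95(vals) for start, vals in sorted(buckets.items())}


def slo_breaches(samples, slo_ms=SLO_MS):
    """Bucket start times whose p95 exceeds the SLO."""
    return [start for start, value in p95_by_bucket(samples).items() if value > slo_ms]
```

Nearest-rank is chosen here because it always returns an observed value; interpolating methods give slightly different numbers on small buckets.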

§ 03 Recent incidents

Last 90 days

Every incident, with timestamps and what we changed.

Nothing vanishes from this list. Post-mortems for anything major live in the changelog within five working days.

INC-0192 · Minor · Root-cause agent

Elevated latency on Root-cause agent (EU-1)

2026-04-10 14:12 UTC

Duration · 35 min

Summary

Queue backpressure on the EU-1 inference endpoint pushed p95 completion time from 4.2 s to 11 s. No completions were dropped; all drafts landed in Slack, just late.

14:47 UTC
Resolved
Queue has drained, p95 back under SLO for 20 minutes. Post-incident review scheduled for 2026-04-14.
14:29 UTC
Identified
Root cause is a slow upstream response from a single model replica. Traffic has been shifted to healthy replicas; p95 recovering.
14:12 UTC
Investigating
Elevated latency on the Root-cause agent in EU-1. Drafts are still being generated, but are arriving in Slack 2–6x slower than usual.

INC-0191 · Major · Ingestion API

Dropped ingest from one US-East load balancer

2026-03-27 02:41 UTC

Duration · 27 min

Summary

A misconfigured health check on one of four US-East load balancers caused it to return 5xx for ~1.8% of ingest traffic. SDK retries absorbed the failures, so you likely didn't notice, but we classified it as major because it affected the write path.

03:08 UTC
Resolved
Load balancer has been removed from rotation and replaced. All events in the retry window have been processed.
02:54 UTC
Identified
One of four US-East LBs is failing health checks intermittently. Isolating the instance.
02:41 UTC
Investigating
Ingest error rate in US-East is elevated. Investigating.
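The "SDK retries absorbed the failures" behaviour in this incident can be illustrated with a generic retry loop using exponential backoff and jitter. This is a sketch only: the GreenSlope SDK's actual retry policy is not documented on this page, and `send_with_retry`, `TransientError`, and the default values are hypothetical.

```python
import random
import time


class TransientError(Exception):
    """Stands in for a 5xx response from an ingest endpoint (hypothetical)."""


def send_with_retry(send, event, max_attempts=5, base_delay=0.2, sleep=time.sleep):
    """Call send(event), retrying on TransientError with jittered exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(event)
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Full jitter: wait a random time up to base_delay * 2^(attempt - 1).
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
```

With a small per-request failure rate on one load balancer, a retry that lands on a healthy instance usually succeeds, which is why a fault like this can be invisible to callers.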

MNT-0039 · Maintenance · GreenSlope Dashboard

Scheduled Postgres major version upgrade (EU-2)

2026-03-15 02:00 UTC

Duration · 24 min (of 60 planned)

Summary

Primary database for EU-2 upgraded from Postgres 16 to 17. Read replicas served dashboard traffic throughout. Writes were queued and replayed after failover.

02:24 UTC
Completed
Upgrade complete, writes draining from queue, all dashboards serving. Finished ahead of the 60-minute window.
02:00 UTC
Started
Planned maintenance window begins. Dashboard is read-only on EU-2; US and UK regions unaffected.

INC-0190 · Minor · Alert & notification delivery

Slack delivery delay on notification fan-out

2026-03-03 19:22 UTC

Duration · 29 min

Summary

Slack rate-limited our workspace-scoped app for 29 minutes during a large spike. Messages were queued and delivered in order; duplicate suppression still worked.

19:51 UTC
Resolved
Backlog cleared. We've raised our token rotation cadence with Slack and expanded per-channel parallelism.
19:34 UTC
Monitoring
Rate limits have eased and the queue is draining. Messages are arriving in order, delayed by up to 12 minutes.
19:22 UTC
Investigating
Slack delivery delayed. Customers report incident notifications arriving 5–10 minutes late.
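The queue-and-deliver-in-order behaviour described in this incident can be sketched as a FIFO outbox with duplicate suppression that stops draining as soon as the provider rate-limits it. This is an illustration under assumptions, not GreenSlope's delivery code: `OrderedOutbox`, `RateLimited`, and the deduplication key are all hypothetical.

```python
from collections import deque


class RateLimited(Exception):
    """Raised by the deliver callback when the provider asks us to back off."""


class OrderedOutbox:
    """FIFO queue that drops duplicate keys and never reorders messages."""

    def __init__(self, deliver):
        self._deliver = deliver
        self._queue = deque()
        self._seen = set()

    def enqueue(self, key, message):
        """Queue a message once per key; return False if the key is a duplicate."""
        if key in self._seen:
            return False
        self._seen.add(key)
        self._queue.append((key, message))
        return True

    def drain(self):
        """Deliver from the head until the queue is empty or we are rate-limited."""
        sent = 0
        while self._queue:
            _, message = self._queue[0]
            try:
                self._deliver(message)
            except RateLimited:
                break  # keep the head in place so ordering is preserved
            self._queue.popleft()
            sent += 1
        return sent
```

Leaving the head message in place on a rate limit is what keeps delivery in order: later messages cannot overtake one that has not gone out yet.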

INC-0189 · Major · Auth & SSO

Dashboard login intermittent (global)

2026-02-18 08:03 UTC

Duration · 38 min

Summary

A bad deploy of the session service returned 500 for ~6% of logins. Rolled back in 38 minutes. Existing sessions were not affected.

08:41 UTC
Resolved
Rolled back to the previous revision. All logins succeeding. A full post-mortem is published at /changelog/postmortem-0189.
08:14 UTC
Identified
Caused by an incorrect cache TTL in the new session service build. Rolling back.
08:03 UTC
Investigating
Elevated 5xx on auth.greenslope.io — some customers cannot sign in.

§ 04 Scheduled maintenance

All times UTC

What's changing, and when.

We announce planned work at least 7 days in advance. Enterprise customers can pin a maintenance window to their region via the admin settings.

May 3 · EU-2 failover drill · read-only for up to 30 min · 03:00–03:30 UTC
May 12 · Kafka 3.9 → 3.10 rolling upgrade · ingest unaffected, staggered across regions · 02:00–06:00 UTC
May 20 · Dashboard session key rotation · users re-authenticate, SSO unaffected · 04:00–04:10 UTC
Jun 1 · Deprecation · /v1/events.legacy · SDKs >= 4.2 unaffected · Removed end of day

Get notified

Subscribe once. Hear every time something changes.

Email, RSS, Slack, webhooks, or plain Atom — pick any combination. We notify on status transitions only, never on noise. You can scope subscriptions to the services you actually depend on.
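"Status transitions only" can be pictured as a filter over probe results that emits an event only when a service's status differs from its previous one. The sketch below is an assumption about the general technique, not a description of GreenSlope's notifier; the function name and the tuple shapes are illustrative.

```python
def status_transitions(observations):
    """Yield (service, previous, current) only when a service's status changes.

    observations: iterable of (service, status) pairs in time order.
    The first observation of a service sets its baseline and emits nothing.
    """
    last = {}
    for service, status in observations:
        previous = last.get(service)
        if previous is not None and previous != status:
            yield service, previous, status
        last[service] = status
```

Repeated "Operational" probes produce no events under this scheme, so subscribers hear about a change into Degraded and the change back, and nothing in between.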

All systems operational.
