GreenSlope

Status

Live · updated 14 seconds ago

Real-time health of every GreenSlope service, region, and dependency. Probes run every 30 seconds from seven geographies. We don't silently fix things — if a probe fails, it shows up here within a minute, and the incident note stays up for a year.

Overall uptime

99.987%

Trailing 90 days · all services

Ingest p95

84 ms

Cross-region median, last hour

Open incidents

0

Last resolved 11 days ago

Next maintenance

May 3

03:00–03:30 UTC · EU-2 failover

§ 01 Service status
Last 90 days · UTC

Every service we run, with 90 days of history.

  • GreenSlope Dashboard

    app.greenslope.io

    Web app — deploy tracking, incident loop, postmortems, admin.

    Operational

    Uptime: 99.992%

    Latency: 142 ms

    [History chart: 90 days ago to today]
  • Ingestion API

    ingest.greenslope.io

    SDK event ingest across EU, UK, and US regions. The front door.

    Operational

    Uptime: 99.998%

    Latency: 84 ms

    [History chart: 90 days ago to today]
  • Public REST API

    api.greenslope.io

    Read/write API for integrations, CI/CD, and the CLI.

    Operational

    Uptime: 99.989%

    Latency: 112 ms

    [History chart: 90 days ago to today]
  • Auth & SSO

    auth.greenslope.io

    Login, session, SAML/SCIM. Shared by dashboard and API surfaces.

    Operational

    Uptime: 99.995%

    Latency: 58 ms

    [History chart: 90 days ago to today]
  • Root-cause agent

    agent.greenslope.io

    Private inference endpoint that drafts root-cause analyses.

    Operational

    Uptime: 99.971%

    Latency: 4.1 s

    [History chart: 90 days ago to today]
  • Alert & notification delivery

    Slack · email · webhooks · PagerDuty

    Outbound incident delivery. Retried across four providers.

    Operational

    Uptime: 99.984%

    Latency: 1.3 s

    [History chart: 90 days ago to today]
  • Integration workers

    GitHub · Linear · Jira · Datadog · Sentry

    Inbound webhooks and outbound sync for linked systems.

    Operational

    Uptime: 99.96%

    Latency: 640 ms

    [History chart: 90 days ago to today]
  • Documentation site

    docs.greenslope.io

    This is a static site on Cloudflare Pages.

    Operational

    Uptime: 100.000%

    Latency: 19 ms

    [History chart: 90 days ago to today]
  • Marketing site

    greenslope.io

    The page you are on right now, served from the edge.

    Operational

    Uptime: 99.999%

    Latency: 22 ms

    [History chart: 90 days ago to today]
Legend: Operational · Degraded · Outage · Maintenance
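
To make the uptime percentages above easier to read, here is a minimal sketch that converts an uptime figure into the downtime it implies over the 90-day window. It assumes uptime is measured against wall-clock minutes; this page does not say exactly how uptime is computed, so treat the function name and the method as illustrative only.

```python
def downtime_minutes(uptime_percent: float, window_days: int = 90) -> float:
    """Minutes of downtime implied by an uptime percentage over a window.

    Assumption: uptime is a fraction of wall-clock time in the window.
    """
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


# Uptime figures copied from the service list above.
services = {
    "GreenSlope Dashboard": 99.992,
    "Ingestion API": 99.998,
    "Root-cause agent": 99.971,
    "Documentation site": 100.0,
}

for name, pct in services.items():
    print(f"{name}: {downtime_minutes(pct):.1f} min over 90 days")
```

Under that assumption, each 0.001 percentage point of uptime over 90 days corresponds to roughly 1.3 minutes.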
§ 02 Ingest p95 latency by region
24h · 30s buckets

How fast your events get acknowledged, per region.

Probed from seven geographies into the nearest ingest endpoint. The small bump at −12h was a planned CDN cutover in EU-2 and stayed well under the 300 ms SLO.

  • US-East
    84 ms
  • EU-Central
    79 ms
  • UK-London
    73 ms
  • AP-Singapore
    136 ms
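
For readers who want to reproduce a chart like this from their own probe data, the sketch below groups latency samples into 30-second buckets, takes a nearest-rank p95 per bucket, and flags buckets above the 300 ms SLO. The sample format and function names are assumptions made for illustration; this page does not describe which percentile method GreenSlope actually uses.

```python
from collections import defaultdict

SLO_MS = 300         # SLO quoted in this section
BUCKET_SECONDS = 30  # bucket width quoted in the section header


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def p95_by_bucket(samples):
    """samples: iterable of (unix_seconds, latency_ms) pairs.

    Returns {bucket_start_seconds: p95_ms}, ordered by bucket start.
    """
    buckets = defaultdict(list)
    for ts, latency_ms in samples:
        start = int(ts) // BUCKET_SECONDS * BUCKET_SECONDS
        buckets[start].append(latency_ms)
    return {start: p95(vals) for start, vals in sorted(buckets.items())}


def slo_breaches(samples):
    """Bucket start times whose p95 exceeds the SLO."""
    return [start for start, value in p95_by_bucket(samples).items()
            if value > SLO_MS]
```

Nearest-rank is used here because it always returns an observed sample; an interpolating percentile would give slightly different numbers on small buckets.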
§ 03 Recent incidents
Last 90 days

Every incident, with timestamps and what we changed.

Nothing vanishes from this list. Post-mortems for major incidents are published in the changelog within five working days.

INC-0192 · Minor · Root-cause agent

Elevated latency on Root-cause agent (EU-1)

2026-04-10 14:12 UTC

Duration · 35 min

Summary

Queue backpressure on the EU-1 inference endpoint pushed p95 completion time from 4.2 s to 11 s. No completions were dropped; all drafts landed in Slack, just late.

  1. 14:47 UTC

    Resolved

    Queue has drained, p95 back under SLO for 20 minutes. Post-incident review scheduled for 2026-04-14.

  2. 14:29 UTC

    Identified

    Root cause is a slow upstream response from a single model replica. Traffic has been shifted to healthy replicas; p95 recovering.

  3. 14:12 UTC

    Investigating

    Elevated latency on the Root-cause agent in EU-1. Drafts are still being generated, but are arriving in Slack 2–6x slower than usual.

INC-0191 · Major · Ingestion API

Dropped ingest from one US-East load balancer

2026-03-27 02:41 UTC

Duration · 27 min

Summary

A misconfigured health check on one of four US-East load balancers led to 5xx responses for ~1.8% of ingest traffic. SDK retries absorbed the failures, so you likely didn't notice, but we classified it as major because it affected the write path.

  1. 03:08 UTC

    Resolved

    Load balancer has been removed from rotation and replaced. All events in the retry window have been processed.

  2. 02:54 UTC

    Identified

    One of four US-East LBs is failing health checks intermittently. Isolating the instance.

  3. 02:41 UTC

    Investigating

    Ingest error rate in US-East is elevated. Investigating.
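
The summary for this incident says SDK retries absorbed the failures. As a rough illustration of how client-side retries can mask a small 5xx rate, here is a sketch of retry with exponential backoff and jitter. The `send` callable, the attempt count, and the delays are all hypothetical; the actual retry policy of the GreenSlope SDKs is not described on this page.

```python
import random
import time


def send_with_retries(send, event, max_attempts=5, base_delay=0.2,
                      sleep=time.sleep):
    """Call send(event) until it returns a non-5xx status or attempts run out.

    `send` is a hypothetical callable that returns an HTTP status code.
    Returns the last status code seen.
    """
    status = None
    for attempt in range(max_attempts):
        status = send(event)
        if status < 500:
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter before the next attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    return status
```

If each attempt fails independently with probability 1.8%, two consecutive failures happen about 0.03% of the time, which is why a modest retry budget can be enough to hide an error rate of this size from callers.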

MNT-0039 · Maintenance · GreenSlope Dashboard

Scheduled Postgres major version upgrade (EU-2)

2026-03-15 02:00 UTC

Duration · 24 min (of 60 planned)

Summary

Primary database for EU-2 upgraded from Postgres 16 to 17. Read replicas served dashboard traffic throughout. Writes were queued and replayed after failover.

  1. 02:24 UTC

    Completed

    Upgrade complete, writes draining from queue, all dashboards serving. Finished ahead of the 60-minute window.

  2. 02:00 UTC

    Started

    Planned maintenance window begins. Dashboard is read-only on EU-2; US and UK regions unaffected.

INC-0190 · Minor · Alert & notification delivery

Slack delivery delay on notification fan-out

2026-03-03 19:22 UTC

Duration · 29 min

Summary

Slack rate-limited our workspace-scoped app for 29 minutes during a large spike. Messages were queued and delivered in order; duplicate suppression still worked.

  1. 19:51 UTC

    Resolved

    Backlog cleared. We've raised our token rotation cadence with Slack and expanded per-channel parallelism.

  2. 19:34 UTC

    Monitoring

Rate limits have eased and the queue is draining. Messages are arriving in order, delayed by up to 12 minutes.

  3. 19:22 UTC

    Investigating

    Slack delivery delayed. Customers report incident notifications arriving 5–10 minutes late.
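
The behaviour described in this incident (messages queued, delivered in order, duplicates suppressed while the provider rate-limits) can be sketched as a small FIFO outbox. The class name, the dedupe key, and the `deliver` callback are hypothetical and only illustrate the idea; this is not GreenSlope's delivery code.

```python
from collections import deque


class OrderedOutbox:
    """FIFO outbox that drops duplicates and pauses when rate-limited."""

    def __init__(self):
        self._queue = deque()
        self._seen = set()

    def enqueue(self, dedupe_key, message):
        """Queue a message once per dedupe key. Returns False for duplicates."""
        if dedupe_key in self._seen:
            return False
        self._seen.add(dedupe_key)
        self._queue.append((dedupe_key, message))
        return True

    def drain(self, deliver):
        """Deliver queued messages in order.

        `deliver(message)` returns True on success and False when the
        provider is rate-limiting. On False we stop and leave the head of
        the queue in place, so a later drain resumes in the same order.
        """
        delivered = []
        while self._queue:
            _key, message = self._queue[0]
            if not deliver(message):
                break
            self._queue.popleft()
            delivered.append(message)
        return delivered
```

Stopping at the first rejected message trades throughput for ordering: nothing behind the head is sent until the head goes out.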

INC-0189 · Major · Auth & SSO

Dashboard login intermittent (global)

2026-02-18 08:03 UTC

Duration · 38 min

Summary

A bad deploy of the session service returned 500 for ~6% of logins. Rolled back in 38 minutes. Existing sessions were not affected.

  1. 08:41 UTC

    Resolved

    Rolled back to the previous revision. All logins succeeding. A full post-mortem is published at /changelog/postmortem-0189.

  2. 08:14 UTC

    Identified

    Caused by an incorrect cache TTL in the new session service build. Rolling back.

  3. 08:03 UTC

    Investigating

    Elevated 5xx on auth.greenslope.io — some customers cannot sign in.

§ 04 Scheduled maintenance
All times UTC

What's changing, and when.

We announce planned work at least 7 days in advance. Enterprise customers can pin a maintenance window to their region via the admin settings.

  • May 3 · EU-2 failover drill · read-only for up to 30 min · 03:00–03:30 UTC
  • May 12 · Kafka 3.9 → 3.10 rolling upgrade · ingest unaffected, staggered across regions · 02:00–06:00 UTC
  • May 20 · Dashboard session key rotation · users re-authenticate, SSO unaffected · 04:00–04:10 UTC
  • Jun 1 · Deprecation · /v1/events.legacy · SDKs >= 4.2 unaffected · Removed end of day
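
If you want to check whether the Jun 1 deprecation touches you, a version comparison along these lines may help. It assumes SDK versions are plain dotted numbers such as 4.2 or 4.10.1 with no pre-release suffix; the function names are illustrative and not part of any GreenSlope SDK.

```python
MIN_UNAFFECTED = (4, 2)  # "SDKs >= 4.2 unaffected", from the list above


def parse_version(text):
    """Parse 'MAJOR.MINOR[.PATCH]' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_unaffected(sdk_version):
    """True if the SDK version is at or above the unaffected threshold."""
    return parse_version(sdk_version) >= MIN_UNAFFECTED
```

Comparing integer tuples rather than strings matters here: as strings, "4.10" sorts before "4.2", which would wrongly flag a newer SDK.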

Get notified

Subscribe once. Hear every time something changes.

Email, RSS, Slack, webhooks, or plain Atom — pick any combination. We notify on status transitions only, never on noise. You can scope subscriptions to the services you actually depend on.
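
To show what "status transitions only" means in practice, here is a small sketch that turns a stream of status observations into notifications only when a service's status actually changes, with optional scoping to the services you care about. The observation format and function name are assumptions for illustration; the real subscription payloads are not documented on this page.

```python
def transitions(observations, watched=None):
    """Yield (service, old_status, new_status) when a status changes.

    observations: iterable of (service, status) pairs in time order.
    watched: optional set of service names to scope the subscription to.
    """
    last = {}
    for service, status in observations:
        if watched is not None and service not in watched:
            continue
        previous = last.get(service)
        if previous is not None and previous != status:
            yield service, previous, status
        last[service] = status
```

Repeated "Operational" readings produce nothing; only a change such as Operational to Degraded, or back again, produces an event.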