Troubleshooting
Pages here cover the things most likely to go wrong during setup and the terms the in-product Doctor page uses that aren't self-explanatory.
If your issue isn't here, email support@greenslope.io. Every support answer that isn't already in docs is a docs bug — we add the answer here and close the ticket.
First trace doesn't arrive
After finishing a quickstart, the dashboard's Live traces view is empty. Work through these in order.
1. Is the ingest key set?
echo $GREENSLOPE_INGEST_KEY
# Should print gs_ing_live_…
If it's unset, the SDK may silently send requests without the header and we return 401. In Node, enable the OTel debug logger to see:
import { diag, DiagConsoleLogger, DiagLogLevel } from "@opentelemetry/api"
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG)
Look for 401 responses from ingest.greenslope.io.
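If you would rather fail fast than hunt for 401s in debug output, a startup check along these lines may help. This is a minimal sketch, not part of any Greenslope SDK: `checkIngestKey` is a hypothetical helper, and the only prefix it knows is the `gs_ing_live_` shown in the expected output above, so other valid prefixes (if any exist) would be rejected.

```typescript
// Hypothetical helper, not a Greenslope API.
type KeyCheck = { ok: true } | { ok: false; reason: string };

// Assumption: taken from the expected echo output; other prefixes may be valid.
const EXPECTED_PREFIX = "gs_ing_live_";

function checkIngestKey(value: string | undefined): KeyCheck {
  if (value === undefined || value.trim() === "") {
    return { ok: false, reason: "GREENSLOPE_INGEST_KEY is unset or empty" };
  }
  if (value !== value.trim()) {
    // Stray whitespace often comes from copy-paste or .env files.
    return { ok: false, reason: "key has leading or trailing whitespace" };
  }
  if (!value.startsWith(EXPECTED_PREFIX)) {
    return { ok: false, reason: "key does not start with " + EXPECTED_PREFIX };
  }
  return { ok: true };
}
```

In a Node process you would pass `process.env.GREENSLOPE_INGEST_KEY` to `checkIngestKey` before starting the SDK and throw if `ok` is false.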
2. Is the URL right?
The path is /v1/otel/v1/traces. The double /v1 is deliberate; don't collapse it. Wrong values and their failures:
| URL | What happens |
|---|---|
| .../v1/traces | 404. Missing outer /v1/otel. |
| .../v1/otel | 404. Missing inner /v1/traces. |
| .../v1/otel/v1/logs | 404. Logs aren't accepted in V1. |
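The table above can be turned into a quick config check. The sketch below is illustrative only: `checkTracesEndpoint` is a hypothetical helper that compares the end of the configured URL against the three failure cases listed, and it does no normalisation (a trailing slash or query string falls through to "unrecognised path").

```typescript
// Hypothetical helper; the rules mirror the table above and nothing else.
const EXPECTED_PATH = "/v1/otel/v1/traces";

function checkTracesEndpoint(url: string): string {
  if (url.endsWith(EXPECTED_PATH)) return "ok";
  if (url.endsWith("/v1/otel/v1/logs")) return "404: logs aren't accepted in V1";
  if (url.endsWith("/v1/otel")) return "404: missing inner /v1/traces";
  // Checked last: the correct path also ends in /v1/traces.
  if (url.endsWith("/v1/traces")) return "404: missing outer /v1/otel";
  return "unrecognised path";
}
```

The order of the checks matters: the correct path is tested first so that the shorter `/v1/traces` suffix only matches URLs that lack the `/v1/otel` prefix.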
3. Did the SDK start before your app imports frameworks?
In Node, if express is imported before the OTel SDK starts, auto-instrumentation can't patch it and you get no auto-instrumented spans. Symptom: the process runs fine, but only manual spans show up.
Fix: preload the SDK with node -r ./otel.js server.js (CommonJS) or node --import ./otel.js server.js (ESM).
See the Node quickstart step 4.
4. Is a firewall blocking outbound 443?
Less common. Check:
curl -I https://ingest.greenslope.io/v1/otel/v1/traces
# Expect: HTTP/2 405 (POST-only endpoint, 405 on HEAD is correct)
If you get a timeout or 5xx from curl, your network is blocking egress. Our Security page lists the hostnames that need to be allowlisted.
missing release data banner on the Doctor page
Shown when one of the three required resource attributes isn't present on recent spans. Common causes:
- greenslope.release.id not set. You set service.version and forgot the release ID. They're different; see Release attribution.
- Release ID isn't stable. You're generating a random value at process start. Use a git SHA.
- Different release IDs across services that deploy together. Services that ship from the same commit should share a release ID.
Fix the instrumentation, then wait 5–10 minutes for new spans to replace the old ones. The banner clears once they do.
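One way to keep the release ID stable is to derive it from the commit at build time instead of generating it at process start. The sketch below assumes a full 40-character SHA-1 commit hash; that length requirement is this example's assumption, not a documented Greenslope rule. Both `releaseAttributes` and `shareReleaseId` are hypothetical helpers.

```typescript
// Assumption: a full 40-character hex SHA. Short SHAs would be rejected here.
const FULL_SHA = /^[0-9a-f]{40}$/;

// Builds the release attribute from a git commit SHA (e.g. injected by CI).
function releaseAttributes(gitSha: string): Record<string, string> {
  const sha = gitSha.trim().toLowerCase();
  if (!FULL_SHA.test(sha)) {
    throw new Error('expected a 40-character git SHA, got "' + gitSha + '"');
  }
  return { "greenslope.release.id": sha };
}

// True when every service that deploys together reports the same release ID.
function shareReleaseId(releaseIds: string[]): boolean {
  return new Set(releaseIds).size <= 1;
}
```

The returned object would be merged into the resource attributes you already pass to the OTel SDK, alongside service.version.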
Alert fires on every release
Your regression detector threshold is too sensitive, or your SLO baseline hasn't stabilised (for the first 72 hours after service creation, the p95 baseline is still being learned).
- First 72 hours: expected. The banner on the Doctor page will say learning baseline, and regression alerts are suppressed during this window.
- After 72 hours: loosen the regression threshold in Settings → Services → Regression detection. Default is 2× baseline error rate over a 10-minute window; raise to 3× if you run a noisy tail.
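To see what the multiplier does in practice, here is a small worked example. It is illustrative arithmetic only: how the detector actually evaluates the 10-minute window isn't described here, and the strict "greater than" comparison is an assumption.

```typescript
// Error rate above which a regression alert would fire, for a given multiplier.
function regressionThreshold(baselineErrorRate: number, multiplier: number): number {
  return baselineErrorRate * multiplier;
}

// Assumption: the alert fires when the observed rate strictly exceeds the threshold.
function wouldAlert(observed: number, baseline: number, multiplier: number = 2): boolean {
  return observed > regressionThreshold(baseline, multiplier);
}

// Baseline error rate 1%, a release window that briefly hits 2.5%.
const alertsAtDefault = wouldAlert(0.025, 0.01);   // threshold 2%: fires
const alertsAtThreeX = wouldAlert(0.025, 0.01, 3); // threshold 3%: stays quiet
```

With a 1% baseline, a release that briefly reaches 2.5% crosses the default 2× threshold but not a 3× one, which is why a higher multiplier suits services with a noisy tail.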
Slack alerts go to the wrong channel
Routes are top-to-bottom, first-match-wins. A more general rule above a more specific one will swallow alerts before the specific rule ever sees them.
Check the route list at Settings → Alerts → Routes. Reorder so the narrowest rule is first. A good default order:
- severity = sev1
- Per-service rules
- Environment-based rules (e.g. environment = staging)
- Fallback *
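The effect of ordering is easier to see in a small model of first-match-wins routing. This is a sketch of the rule described above, not the product's implementation; the field names, channel names, and the `routeAlert` function are all hypothetical.

```typescript
// Hypothetical shapes for illustration.
interface AlertInfo { severity: string; service: string; environment: string; }
interface RouteRule { channel: string; match: Partial<AlertInfo>; } // empty match = fallback *

function routeAlert(routes: RouteRule[], info: AlertInfo): string | undefined {
  for (const route of routes) {
    const keys = Object.keys(route.match) as (keyof AlertInfo)[];
    if (keys.every((k) => route.match[k] === info[k])) {
      return route.channel; // first match wins; later routes are never consulted
    }
  }
  return undefined;
}

const sev1Rule: RouteRule = { channel: "#oncall-sev1", match: { severity: "sev1" } };
const stagingRule: RouteRule = { channel: "#staging-noise", match: { environment: "staging" } };
const fallbackRule: RouteRule = { channel: "#alerts", match: {} };

const sev1InStaging: AlertInfo = { severity: "sev1", service: "checkout", environment: "staging" };

// General rule first: the sev1 alert is swallowed by the staging rule.
const misrouted = routeAlert([stagingRule, sev1Rule, fallbackRule], sev1InStaging); // "#staging-noise"
// Narrowest rule first: the sev1 alert reaches the on-call channel.
const routed = routeAlert([sev1Rule, stagingRule, fallbackRule], sev1InStaging);    // "#oncall-sev1"
```

The same alert lands in two different channels depending only on rule order, which is the failure mode to look for in the route list.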
Common Doctor-page terms
- Burn rate — how fast you're spending your error budget. See SLOs and burn rates.
- Baseline — the p95 latency and error rate measured during the service's first 72 hours, used as the comparison floor for regressions.
- Suspected release — the release ID whose introduction correlates with the current error spike within the change-event window.
- Quiet window — a period where we expect spans but haven't seen any; often a sign the SDK stopped reporting.
- Learning baseline — first 72 hours after service creation, regression detection paused.
Related