When was Grafana OnCall OSS archived and what broke?

The grafana/oncall OSS repository was archived on March 24, 2026, and is now read-only with security fixes limited to CVEs scoring CVSS 7.0 or higher. On the same date the Grafana Cloud Connection was disabled, so mobile push, SMS, and phone notifications that relied on it stopped working for self-hosted users. Existing schedules and escalation chains can keep running, but the cloud-backed paging path is gone.

Is Aurora a replacement for Grafana OnCall?

No. Grafana OnCall did alert routing, scheduling, and escalation, while Aurora is an open-source AI SRE that investigates incidents. They are complementary. You still need a routing layer such as Keep, ntfy with Twilio, or Grafana Cloud IRM, and Aurora sits on top of it to autonomously find root cause across AWS, Azure, GCP, OVH, Scaleway, and Kubernetes.

Does Keep have native on-call scheduling like Grafana OnCall?

Not natively. Keep (MIT, around 11.9k GitHub stars) is a strong open-source alert management and correlation platform, but in a public discussion a Keep team member who also created the original OnCall confirms that Keep is a toolbox for alerts and that phone and SMS escalation still need a third-party service. Many teams pair Keep with Twilio or a minimal calling tool for the actual paging.

How do I restore push, SMS, and phone notifications after the archival?

For mobile push, self-host ntfy, an open-source service dual licensed under Apache 2.0 and GPLv2 with around 29.7k stars. For voice calls and SMS, the Grafana archival notice itself recommends bringing your own Twilio account credentials. Twilio is billed by usage and publishes no flat per-seat rate for this use, so budget by message and call volume.

What does Aurora cost and can it run air-gapped?

Aurora is free and open source under Apache 2.0, so there is no per-seat or per-investigation charge. Your only cost is the infrastructure to host it plus LLM API usage, and that drops to zero if you run local models with Ollama. It is self-hosted and air-gapped capable, which suits teams that must keep incident data on their own infrastructure.

Grafana OnCall Alternative: Open Source On-Call and AI SRE After the 2026 Archival

How much does Grafana Cloud IRM cost as the official successor?

Grafana Cloud IRM is free for 3 active IRM users, then roughly 20 USD per active IRM user, with a 19 USD monthly platform fee on the Pro tier and a custom enterprise tier. An active IRM user is one included in schedules or escalation chains, so inactive accounts do not add cost. Migration from OSS uses Terraform or the OnCall API, moving integrations, then escalation chains, then routes, then schedules.

Key Takeaways

The grafana/oncall OSS repository was archived on March 24, 2026, and is now read-only, with security patches limited to CVEs scoring CVSS 7.0 or higher, per the Grafana OnCall OSS archival notice.

The Cloud Connection that powered mobile push, SMS, and phone notifications for OSS users was disabled on the same date, so self-hosted users must move push to ntfy, Pushover, or Gotify and bring their own Twilio account for calls and SMS.

Grafana OnCall handled alert routing, scheduling, and escalation, not investigation, so any replacement only restores the routing layer. The paid successor is Grafana Cloud IRM, free for 3 active users and then roughly 20 USD per active IRM user.

Keep (MIT, around 11.9k GitHub stars) is the strongest open-source routing and correlation layer, though its maintainers confirm that on-call scheduling is not a native feature and that Keep is typically paired with a calling tool for paging.

Aurora (Apache 2.0) is a different layer: an open-source AI SRE that autonomously investigates incidents across AWS, Azure, GCP, OVH, Scaleway, and Kubernetes, and it sits on top of whatever routing you choose rather than replacing it.

If you self-hosted Grafana OnCall OSS, the March 2026 archival forced a decision. This guide confirms exactly what broke, lays out the open-source options for rebuilding the routing layer, and then explains where an AI investigation layer like Aurora fits. The honest framing up front: OnCall did routing and Aurora does investigation, so this is not a one-for-one swap.

What happened to Grafana OnCall OSS?

Grafana OnCall OSS was archived on March 24, 2026, and the repository is now read-only with no further feature development. According to the Grafana OnCall OSS archival notice, the project entered maintenance mode on March 11, 2025, and was fully archived just over a year later. Grafana Labs explained the timeline and reasoning in its maintenance-mode announcement, which coincided with the launch of Grafana Cloud IRM, the unified cloud product that merged OnCall and Incident.

Two concrete things broke for self-hosted users on the archival date:

Cloud-dependent notifications stopped. Mobile app push notifications, plus SMS and phone calls that relied on the Grafana Cloud Connection, are no longer supported for OSS users, per the archival notice. Your schedules and escalation chains keep running, but the paging channels that routed through Grafana Cloud went dark.
Security coverage narrowed. The codebase now receives fixes only for critical bugs and for CVEs with a CVSS score of 7.0 or higher. Everything else is frozen.

The repository itself remains open source under AGPLv3 with roughly 3.9k stars, so you can keep running or even fork the archived code. But you would own all maintenance, and the cloud-backed notification path is gone for good.

What did Grafana OnCall actually do?

Grafana OnCall was an on-call management tool: scheduling, alert routing, escalation chains, and notification delivery. The Grafana OnCall OSS page describes it as offering calendar-based on-call schedules, automatic escalation chains with flexible routing to reach the right person during an outage, alert grouping to cut noise, and notifications over Slack, Telegram, voice, and SMS.

What it did not do was investigate. OnCall answered 'who gets paged and how do we escalate,' not 'why did this break and what do we fix.' That distinction matters when you pick a replacement, because the routing layer and the investigation layer are separate jobs. If you are rethinking the whole stack, our guide to open-source incident management covers how the pieces fit together.

What are the open-source replacements for the routing layer?

The honest answer is that no single open-source project is a drop-in clone of OnCall plus its cloud notifications, so most teams assemble two or three pieces: a routing and correlation engine, a notification transport, and optionally a managed path for phone and SMS. Here is how the realistic options compare.

Option	What it covers	License and cost	On-call scheduling	Notes
Keep	Alert management, correlation, de-noising, workflow automation	MIT, self-hosted free	Not native; workflows can route and escalate, calling needs a third party	Around 11.9k stars; strongest open-source single-pane-of-glass for alerts
ntfy	Push notification transport via HTTP pub-sub	Dual Apache 2.0 and GPLv2, self-hosted free	None; it is a delivery channel	Around 29.7k stars; replaces lost OSS push, self-hostable
Twilio (BYO account)	Voice calls and SMS for escalation	Commercial, usage-based; no flat per-seat figure published for this use	None; it is a delivery channel	The path the archival notice itself recommends for OSS calls and SMS
Grafana Cloud IRM	Scheduling, routing, escalation, incident response, built-in paging	Free for 3 active users, then about 20 USD per active IRM user plus a 19 USD monthly platform fee on Pro	Native and managed	The official paid successor; gives back cloud push, SMS, and voice

Keep: the open-source alert routing and correlation layer

Keep is the closest open-source project to a routing-layer replacement. It is MIT licensed, self-hostable for free, and carries roughly 11.9k GitHub stars. Keep ingests alerts from many sources, deduplicates and correlates them, and runs declarative YAML workflows that feel like GitHub Actions for your monitoring tools, including conditional routing by team, environment, or business hours.
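To make "conditional routing by team, environment, or business hours" concrete, here is a minimal sketch of that decision written in plain Python. It is not Keep's workflow YAML, and every name in it (Alert, in_business_hours, choose_channel, the channel strings, the 09:00 to 17:00 weekday window) is a hypothetical chosen for illustration. Timestamps are assumed to already be in the team's local time.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert shape; real payloads depend on your monitoring source."""
    team: str
    environment: str
    fired_at: datetime  # assumed to be in the team's local time


def in_business_hours(moment: datetime,
                      start: time = time(9, 0),
                      end: time = time(17, 0)) -> bool:
    """True on Monday to Friday between `start` (inclusive) and `end` (exclusive)."""
    return moment.weekday() < 5 and start <= moment.time() < end


def choose_channel(alert: Alert) -> str:
    """Pick a delivery channel from environment and time of day."""
    if alert.environment != "production":
        # Non-production alerts never page anyone in this sketch.
        return f"ticket:{alert.team}"
    if in_business_hours(alert.fired_at):
        # During working hours the team channel is assumed to be watched.
        return f"chat:{alert.team}"
    # Production alert outside working hours: hand off to the paging tool.
    return f"page:{alert.team}"
```

In a real deployment the same three branches would be expressed as workflow conditions in the routing tool, with the "page" branch calling whatever third-party paging service you pair with it.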

Be precise about one thing: Keep does not ship native on-call scheduling and escalation the way OnCall did. In a public discussion of Keep as a Grafana OnCall alternative, a Keep team member who also created the original OnCall states that Keep is a toolbox for alerts focused on de-noising, correlation, and enrichment, and that for phone and SMS escalation you still need a third-party service. A common pattern they suggest is Keep for the single pane of glass plus a minimal calling tool for the actual paging.

ntfy and Twilio: restoring the notification channels

ntfy is a self-hosted, open-source push notification service, dual licensed under Apache 2.0 and GPLv2 with around 29.7k stars. It directly replaces the mobile push you lost when the Cloud Connection was disabled, with no third-party dependency. For voice and SMS, the Grafana archival notice itself points OSS users to bring their own Twilio credentials. Twilio is commercial and billed by usage, and there is no flat per-seat rate published for this specific routing use, so budget by message and call volume.
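As a rough sketch of what the replacement channels look like on the wire, the following Python builds (but does not send) one ntfy publish request and one Twilio SMS request using only the standard library. It assumes ntfy's plain HTTP publish interface (POST the message body to the topic URL, with Title, Priority, and Tags headers) and Twilio's Messages REST resource with HTTP basic auth. The server URL, topic name, phone numbers, and credentials are placeholders, and the function names are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_ntfy_request(server: str, topic: str, message: str,
                       title: str) -> urllib.request.Request:
    """Build a POST that publishes `message` to a topic on a self-hosted ntfy server."""
    url = f"{server.rstrip('/')}/{urllib.parse.quote(topic)}"
    headers = {
        "Title": title,            # keep header values ASCII
        "Priority": "urgent",      # highest ntfy priority level
        "Tags": "rotating_light",  # rendered as an emoji by ntfy clients
    }
    return urllib.request.Request(
        url, data=message.encode("utf-8"), headers=headers, method="POST"
    )


def build_twilio_sms_request(account_sid: str, auth_token: str,
                             from_number: str, to_number: str,
                             body: str) -> urllib.request.Request:
    """Build a POST to Twilio's Messages resource using your own account credentials."""
    url = f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Messages.json"
    form = urllib.parse.urlencode(
        {"To": to_number, "From": from_number, "Body": body}
    ).encode("ascii")
    credentials = base64.b64encode(
        f"{account_sid}:{auth_token}".encode("utf-8")
    ).decode("ascii")
    headers = {
        "Authorization": f"Basic {credentials}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(url, data=form, headers=headers, method="POST")
```

Passing either request to urllib.request.urlopen would actually deliver it. Keeping the build step separate from the send step makes the payloads easy to inspect and test before anything reaches a phone.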

Grafana Cloud IRM: the paid official path

If you would rather not assemble pieces, Grafana Cloud IRM is the maintained successor that unifies on-call scheduling, alert routing, escalation, and incident response with built-in multi-channel paging. It is free for 3 active IRM users, then roughly 20 USD per active IRM user with a 19 USD monthly platform fee on Pro, and an enterprise tier with a custom annual commitment. An active IRM user is one included in schedules or escalation chains, so you pay for engineers who actually go on call. Migration from OSS uses Terraform or the OnCall API, and the migration guide recommends moving resources in order: integrations, then escalation chains, then routes, then schedules.
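For budgeting, the pricing above can be turned into a back-of-the-envelope estimate. This sketch rests on assumptions the figures do not spell out: that the first 3 active users are free, that only users beyond those 3 are billed at the rough 20 USD rate, that the rate is monthly, and that the 19 USD Pro platform fee applies only once you exceed the free allowance. Check the current Grafana pricing page before relying on it; the function name is hypothetical.

```python
FREE_ACTIVE_USERS = 3       # from the article: free for 3 active IRM users
USD_PER_ACTIVE_USER = 20    # approximate figure; assumed to be per month
PRO_PLATFORM_FEE_USD = 19   # monthly platform fee on the Pro tier


def estimate_irm_monthly_usd(active_users: int) -> int:
    """Rough monthly estimate under the assumptions stated above."""
    if active_users < 0:
        raise ValueError("active_users must be non-negative")
    # Only users in schedules or escalation chains count as active.
    billable = max(0, active_users - FREE_ACTIVE_USERS)
    if billable == 0:
        return 0  # assumed to stay on the free tier
    return PRO_PLATFORM_FEE_USD + billable * USD_PER_ACTIVE_USER
```

Under these assumptions, a team with 10 engineers in schedules or escalation chains comes to 19 + 7 × 20 = 159 USD per month, while a team of 3 stays at zero.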

Where does Aurora fit, and does it replace OnCall?

No, Aurora does not replace Grafana OnCall, and it does not pretend to. Aurora is an open-source, Apache 2.0 AI SRE that handles the investigation layer, the part OnCall never touched. You still need a routing layer underneath it, whether that is Keep, ntfy and Twilio, Grafana Cloud IRM, PagerDuty, or anything else that can fire a webhook.

Here is the division of labor. Your routing layer decides who gets paged and escalates if they miss it. Aurora, triggered by the same alert, autonomously investigates why the incident is happening while the engineer is still reading the page.

Dimension	Grafana OnCall OSS (archived)	Aurora
Primary job	Alert routing, scheduling, escalation	Autonomous incident investigation and root cause analysis
License	AGPLv3, archived March 2026	Apache 2.0, actively developed
Deployment	Self-hosted, cloud notifications now disabled	Self-hosted, air-gapped capable, BYO-LLM
Multi-cloud reach	Routes alerts, does not query clouds	Queries AWS, Azure, GCP, OVH, Scaleway, and Kubernetes directly
Investigation vs correlation	Neither; it is routing	Multi-step agentic investigation with a Memgraph blast-radius graph
Write or execute actions	Sends notifications only	Runs kubectl, aws, az, and gcloud in sandboxed Kubernetes pods, human-gated for destructive steps
Pricing model	Free but unmaintained	Free, self-hosted; LLM cost only, and zero with local models
Self-host and air-gap	Self-host, no air-gap story for cloud paging	Self-host and air-gapped, with local inference via Ollama

What Aurora does after an alert arrives: its LangGraph-orchestrated agents query your cloud and Kubernetes APIs, execute read commands in sandboxed pods, build a Memgraph knowledge graph to estimate blast radius, generate a root-cause analysis and a postmortem you can export to Confluence, Notion, or SharePoint, and suggest code fixes or open a pull request. Destructive actions are always gated on human approval. Aurora ingests alerts via webhook from eleven monitoring connectors (PagerDuty, Datadog, Grafana, New Relic, OpsGenie, Netdata, Dynatrace, Coroot, ThousandEyes, BigPanda, and incident.io), plus a Slack bot, so your migrated routing layer can hand off cleanly. For teams that need to keep incident data on their own infrastructure, our self-hosted AI SRE guide covers the air-gapped deployment in detail.

Aurora's differentiator against Kubernetes-only assistants is that it spans multiple clouds and actually executes investigation commands rather than only diagnosing. Against closed SaaS investigation tools, its differentiator is being open source, self-hosted, free, and vendor-neutral. If you are weighing the broader category, see how we frame adding AI investigation to a paging tool and how investigation works across clouds in multi-cloud incident management.

Which should you choose?

Choose based on which layer you are rebuilding. If you only need to restore routing, Keep plus ntfy plus a Twilio account is the most complete open-source assembly, and Grafana Cloud IRM is the lowest-effort paid path that gives back managed paging. None of those investigate incidents.

If your real pain after the archival is that investigation was always manual, add Aurora on top of whichever routing layer you land on. A practical stack is Keep for correlation and routing, ntfy or Twilio for delivery, and Aurora subscribed to the same alerts for autonomous root-cause analysis. The routing tools answer who and when. Aurora answers why and how to fix it, and it stays free and self-hosted while doing so.
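The "subscribed to the same alerts" idea can be sketched as a small fan-out: one incoming alert is forwarded to both the routing layer and the investigation layer, and a failure on one side does not block the other. This is an illustration only. The target names and URLs are placeholders, the function names are hypothetical, and the real webhook paths and payload formats for Keep and Aurora should come from their own documentation.

```python
import json
import urllib.request
from typing import Callable, Dict, Mapping


def post_json(url: str, payload: Mapping[str, object]) -> int:
    """POST `payload` as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def fan_out(alert: Mapping[str, object],
            targets: Mapping[str, str],
            send: Callable[[str, Mapping[str, object]], int] = post_json) -> Dict[str, str]:
    """Deliver the same alert to every target and report the outcome per target."""
    results: Dict[str, str] = {}
    for name, url in targets.items():
        try:
            results[name] = f"delivered ({send(url, alert)})"
        except OSError as error:
            # Network errors from urllib are OSError subclasses; record and continue
            # so paging still happens even if the investigation side is unreachable.
            results[name] = f"failed: {error}"
    return results
```

Making send injectable keeps the sketch testable without a network. In practice many monitoring tools can post to two webhooks directly, in which case no fan-out service is needed at all.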

