HolmesGPT vs K8sGPT: A 2026 Head-to-Head Comparison for SRE Teams

HolmesGPT vs K8sGPT compared on scope, runtimes, LLM backends, MCP, operator mode, governance, and licensing. Every fact cited to GitHub or official docs.

By Noah Casarotto-Dinning, CEO at Arvo AI

Key Takeaways

This is a strict comparison of two open-source projects that are often grouped together because both attach AI to Kubernetes work, both are CNCF Sandbox projects, and both are licensed under Apache 2.0. Beyond that, they target different problems with different runtimes, different backends, and different governance. Every claim below is cited to a primary source: the project's GitHub repository, its official docs site, or a CNCF page. Quotes are taken directly from those sources, not paraphrased from third-party blog posts.

A note on bias. Arvo builds Aurora, a separate open-source AI SRE listed alongside HolmesGPT and K8sGPT in our three-way comparison. This page intentionally excludes Aurora from the main comparison except for a small section at the end.

We call the rubric used below the Open-Source AI SRE Decision Matrix. It has six axes, each evaluated against the project's own primary documentation rather than third-party claims: stated scope, execution model, continuous operation, LLM provider breadth, Model Context Protocol direction (host vs consume), and project governance. Apart from a few reference rows (license, stars, language, latest release), every row in the comparison table that follows maps back to one of these six axes.

What is HolmesGPT?

HolmesGPT describes itself as an "Open-source AI agent for investigating production incidents and finding root causes". Repository statistics on the project's about box in May 2026 show 2.5k stars, 347 forks, and Python at 84.5 percent of the codebase (github.com/HolmesGPT/holmesgpt).

Scope is cross-infrastructure: "Open-source SRE agent for investigating production incidents across any infrastructure - Kubernetes, VMs, cloud services, databases, and more" (holmesgpt.dev). The same point is made on the project repository: "No Kubernetes required: Works with any infrastructure - VMs, bare metal, cloud services, or containers" (github.com/HolmesGPT/holmesgpt).

Governance has two layers: the originating contributors and the legal entity that holds the project. Origin attribution: "Originally created by Robusta.Dev, with major contributions from Microsoft". The project's legal entity is named on the docs site: "HolmesGPT a Series of LF Projects, LLC". CNCF acceptance is documented as "October 8, 2025 at the Sandbox maturity level" (cncf.io/projects/holmesgpt).

The latest release at time of writing is v0.30.1 on 20 May 2026 per the Releases page. The release notes for v0.30.1 mention Loki raw response handling on parse failure, a GitLab MCP entry in the datasource catalog, a Bash echo allowlist fix, and user_email persistence on chat requests.

What is K8sGPT?

K8sGPT describes itself as "a tool for scanning your Kubernetes clusters, diagnosing, and triaging issues in simple English. It has SRE experience codified into its analyzers and helps to pull out the most relevant information to enrich it with AI". Repository statistics on the project's about box in May 2026 show 7.8k stars, 996 forks, and Go at 98.9 percent of the codebase (github.com/k8sgpt-ai/k8sgpt).

Scope is explicitly Kubernetes. The project makes no claim of non-Kubernetes runtime support. The marketing site at k8sgpt.ai carries the tagline "K8sGPT - Giving Kubernetes Superpowers to Everyone."

Governance is community-led. The 7 June 2024 CNCF blog post (Dotan Horovits) states: "unlike many popular projects, there is no company behind this project, and no business plan behind it" (CNCF Blog). CNCF acceptance is documented as "December 19, 2023 at the Sandbox maturity level" (cncf.io/projects/k8sgpt).

The latest release at time of writing is v0.4.33 on 13 May 2026 per the Releases page. Recent feature releases include v0.4.27 (mcp v2, 18 December 2025), v0.4.32 (Azure API type support and custom HTTP header, 22 April 2026), and v0.4.33 (analyze previous logs for restarted containers, 13 May 2026).

At a glance

Dimension | HolmesGPT | K8sGPT
License | Apache 2.0 | Apache 2.0
CNCF status | Sandbox, 8 October 2025 | Sandbox, 19 December 2023
Stars (May 2026) | 2.5k | 7.8k
Primary language | Python (84.5%) | Go (98.9%)
Stated scope | "Any infrastructure - VMs, bare metal, cloud services, or containers" | Kubernetes clusters
Operating model | Multi-step investigation agent + optional 24/7 Operator Mode | Scanner CLI + k8sgpt-operator for continuous in-cluster runs
Default permission model | "Read-only access and respects RBAC permissions" | Diagnoses; anonymises sensitive data before AI calls
Write capability | Can open GitHub PRs via the GitHub MCP integration in Operator Mode | None documented
MCP support | MCP-based integrations for AWS, Azure, GCP, GitHub, GitLab, Jenkins, Kubernetes Remediation, Sentry, Splunk, MariaDB, Prefect | Hosts an MCP server exposing 12 tools and 3 resources for Kubernetes operations
LLM providers | Anthropic, OpenAI, Azure AI Foundry, AWS Bedrock, Google Vertex AI, Gemini, GitHub Copilot, GitHub Models, Ollama, OpenRouter, OpenAI-Compatible, Robusta AI | Anthropic, OpenAI, Azure OpenAI, AWS Bedrock (and Bedrock Converse), Amazon SageMaker, Google Vertex AI, Google GenAI, Cohere, Groq, HuggingFace, IBM watsonx, Oracle OCI GenAI, Ollama, LocalAI, Custom REST
Latest release at writing | v0.30.1, 20 May 2026 | v0.4.33, 13 May 2026
Founding entity | Originally Robusta.dev, major Microsoft contributions | Community-led, no commercial backer per June 2024 CNCF blog

What is the scope difference between HolmesGPT and K8sGPT?

This is the load-bearing axis on the Open-Source AI SRE Decision Matrix, and the easiest one for teams to get wrong.

K8sGPT is, by stated scope, a Kubernetes diagnostics tool. The pkg/analyzer folder ships analysers for around 29 Kubernetes resource types as of May 2026, with a documented "default" subset (Pod, PVC, ReplicaSet, Service, Event, Ingress, StatefulSet, Deployment, Job, CronJob, Node, MutatingWebhook, ValidatingWebhook, ConfigMap) and an extended set covering HPA, PDB, NetworkPolicy, Gateway, GatewayClass, HTTPRoute, Log, Storage, Security, plus OLM-related resources (CatalogSource, ClusterServiceVersion, Subscription, etc.). Every analyser is scoped to a Kubernetes resource type. A team running on bare VMs, on managed cloud services without Kubernetes, or on a mainframe is not the K8sGPT audience.

HolmesGPT rebuts the Kubernetes-only assumption directly: "No Kubernetes required: Works with any infrastructure - VMs, bare metal, cloud services, or containers". Its data-source catalogue, visible in the docs navigation, covers VM-era systems alongside Kubernetes-era ones: Bash, ClickHouse, MariaDB (via MCP), Confluence, Sentry, plus Kubernetes resources and Helm. The Operator Mode page also frames non-Kubernetes scope: "While the operator itself runs in Kubernetes, health checks can query any data source Holmes is connected to - VMs, cloud services, databases, SaaS platforms".

For SRE teams whose estate is entirely Kubernetes, this difference is academic. For teams that still run managed databases outside Kubernetes (Amazon RDS, Cloud SQL, Amazon Aurora), VM workloads, or third-party SaaS at incident-critical positions in the stack, K8sGPT cannot reach those resources without integration glue, and HolmesGPT can.

Can HolmesGPT or K8sGPT execute commands against my cluster?

Both projects default to read-oriented behaviour. The phrasing differs.

HolmesGPT is explicit: "By design, HolmesGPT has read-only access and respects RBAC permissions. It is safe to run in production environments". The Operator Mode page describes how the read-only default is preserved while a separate write path opens: "Connect the GitHub MCP server so Holmes can open PRs to fix the problems it finds - not just report them". Writes do not happen against the cluster; they happen against the user's Git repository, where humans approve the change.
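
One way to sanity-check a read-only claim in your own cluster is to audit the verbs granted to whichever service account the agent runs under. The sketch below is illustrative and not taken from either project: the rule dictionaries mirror the shape of the `rules` list in a Kubernetes RBAC role, while the role contents and the helper name `write_verbs_granted` are hypothetical.

```python
# Illustrative audit helper; not part of HolmesGPT or K8sGPT.
# Kubernetes RBAC read verbs. Anything else (including "*") counts as a write.
READ_ONLY_VERBS = {"get", "list", "watch"}


def write_verbs_granted(rules):
    """Return the sorted non-read verbs granted by a list of RBAC-style rules."""
    granted = set()
    for rule in rules:
        granted.update(rule.get("verbs", []))
    return sorted(granted - READ_ONLY_VERBS)


# Hypothetical rules, shaped like the `rules` list of a ClusterRole.
read_only_role = [
    {"apiGroups": [""], "resources": ["pods", "events"], "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["apps"], "resources": ["deployments"], "verbs": ["get", "list"]},
]
risky_role = read_only_role + [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["delete"]},
]

print(write_verbs_granted(read_only_role))  # []
print(write_verbs_granted(risky_role))      # ['delete']
```

An empty result means the role grants nothing beyond get, list, and watch. A wildcard verb is reported as a write, which is the conservative reading.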

K8sGPT does not use the phrase "read-only" in its repository documentation, but its operational profile is similar: the tool scans cluster state through Kubernetes APIs and feeds analyser output to an LLM. Anonymisation happens before the LLM call: "the data is anonymized before being sent to the AI Backend... k8sgpt retrieves sensitive data (Kubernetes object names, labels, etc.). This data is masked when sent to the AI backend". The same primary source also notes that anonymisation "does not currently apply to events" and that certain fields (Describe, ObjectStatus, Replicas, ContainerStatus, Event Message, ReplicaStatus, Count) are not masked. The trade-off is openly disclosed.
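
The general masking technique can be sketched in a few lines: swap each sensitive string for an opaque token before the prompt leaves the cluster, then swap the tokens back in the response. This is a minimal illustration of the idea, not K8sGPT's implementation; the `Masker` class, the token format, and the sample prompt are all assumptions made for the example.

```python
import re
import secrets


class Masker:
    """Swap sensitive strings for opaque tokens and restore them later."""

    def __init__(self):
        self._token_for = {}  # original value -> token
        self._value_for = {}  # token -> original value

    def _token(self, value):
        # Reuse the same token for a repeated value so the text stays consistent.
        if value not in self._token_for:
            token = "masked-" + secrets.token_hex(4)
            self._token_for[value] = token
            self._value_for[token] = value
        return self._token_for[value]

    def mask(self, text, sensitive_values):
        # Longest values first, so a short name never splits a longer one.
        values = sorted({v for v in sensitive_values if v}, key=len, reverse=True)
        if not values:
            return text
        pattern = re.compile("|".join(re.escape(v) for v in values))
        return pattern.sub(lambda m: self._token(m.group(0)), text)

    def unmask(self, text):
        if not self._value_for:
            return text
        pattern = re.compile("|".join(re.escape(t) for t in self._value_for))
        return pattern.sub(lambda m: self._value_for[m.group(0)], text)


masker = Masker()
prompt = "Pod payments-api in namespace prod is in CrashLoopBackOff"
masked = masker.mask(prompt, ["payments-api", "prod"])
print(masked)                 # object names replaced by masked-<hex> tokens
print(masker.unmask(masked))  # original prompt restored
```

Anything not passed in as a sensitive value goes through unchanged, which is the same kind of gap the K8sGPT documentation discloses for events and the unmasked fields.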

How does continuous operation differ between the two operators?

Both projects have an in-cluster operator, and again the framing differs.

HolmesGPT's Operator Mode is a 24/7 background agent: "HolmesGPT runs in the background 24/7, spots problems before your customers notice, and messages you in Slack with the fix" (holmesgpt.dev/latest/operator). The docs note its architecture: "a lightweight kopf-based controller handles CRD orchestration and scheduling, while stateless Holmes API servers execute the actual checks." The same page carries an explicit "Holmes Operator - Alpha Release" warning, and includes a cost caution: "Begin with infrequent schedules (e.g., hourly or daily) and monitor usage before scaling up."

K8sGPT's operator (a separate repo, k8sgpt-ai/k8sgpt-operator) is a continuous scanner: "This Operator is designed to enable K8sGPT within a Kubernetes cluster... It will allow you to create a custom resource that defines the behaviour and scope of a managed K8sGPT workload." The default reconciliation interval is 30 seconds. Output goes to in-cluster Result CRDs, with optional Slack, Mattermost, and CloudEvents sinks. Prometheus and Grafana integration is exposed through ServiceMonitor and dashboard parameters (k8sgpt-operator docs).

Architecturally: HolmesGPT's Operator Mode is trigger-driven and incident-shaped (run on alert, run on schedule). K8sGPT's operator is poll-shaped (scan every 30 seconds, surface anomalies).
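
The cadence gap is easier to see as arithmetic. The sketch below only counts how often a fixed-interval loop wakes up in a day. It assumes nothing about what each cycle costs, and a reconciliation cycle is not necessarily an LLM call. The function name `cycles_per_day` is hypothetical.

```python
from datetime import timedelta


def cycles_per_day(interval):
    """How many times a fixed-interval loop fires in 24 hours."""
    return timedelta(days=1) // interval


print(cycles_per_day(timedelta(seconds=30)))  # 2880 (a 30-second reconcile loop)
print(cycles_per_day(timedelta(hours=1)))     # 24   (an hourly schedule)
print(cycles_per_day(timedelta(days=1)))      # 1    (a daily schedule)
```

A 30-second loop wakes 120 times for every hourly check, which is why the per-cycle cost of whatever runs inside the loop matters more than the headline feature.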

Which LLM providers does each tool support?

Both projects support multiple LLM backends. The lists do not overlap fully.

K8sGPT's source code at pkg/ai/iai.go registers 17 backends as of May 2026: openai, anthropic, localai, ollama, azureopenai, cohereai, amazonbedrock, amazonbedrockconverse, amazonsagemaker, googleai, noopai, huggingface, googlevertexai, ocigenai, customrest, ibmwatsonxai, groq.

HolmesGPT's docs site navigation enumerates: Anthropic, AWS Bedrock, Azure AI Foundry, Gemini, GitHub Copilot, GitHub Models, Google Vertex AI, Ollama, OpenRouter, OpenAI, OpenAI-Compatible, Robusta AI.

The two lists overlap on the headline providers (Anthropic, OpenAI, Bedrock, Google Vertex AI, Ollama, plus Azure, which K8sGPT reaches through Azure OpenAI and HolmesGPT through Azure AI Foundry) and diverge at the edges. K8sGPT's edge list leans enterprise: IBM watsonx, Oracle OCI GenAI, Cohere, Groq, HuggingFace, Amazon SageMaker, and a generic Custom REST endpoint. HolmesGPT's edge list leans developer-tooling: GitHub Copilot, GitHub Models, OpenRouter, Robusta AI, and an OpenAI-Compatible (LiteLLM proxy) catch-all. In practice the choice usually follows from the LLM the security team has already approved, not from this list.
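
The overlap can be checked mechanically with set operations. The sketch below uses the provider names exactly as they appear in the comparison table, folding "Bedrock Converse" into AWS Bedrock. It matches on exact names only, so differently named entries such as Azure OpenAI and Azure AI Foundry count as distinct here.

```python
# Provider names as listed in the comparison table above.
HOLMESGPT = {
    "Anthropic", "OpenAI", "Azure AI Foundry", "AWS Bedrock", "Google Vertex AI",
    "Gemini", "GitHub Copilot", "GitHub Models", "Ollama", "OpenRouter",
    "OpenAI-Compatible", "Robusta AI",
}
K8SGPT = {
    "Anthropic", "OpenAI", "Azure OpenAI", "AWS Bedrock", "Amazon SageMaker",
    "Google Vertex AI", "Google GenAI", "Cohere", "Groq", "HuggingFace",
    "IBM watsonx", "Oracle OCI GenAI", "Ollama", "LocalAI", "Custom REST",
}

shared = HOLMESGPT & K8SGPT
only_holmesgpt = HOLMESGPT - K8SGPT
only_k8sgpt = K8SGPT - HOLMESGPT

print(sorted(shared))
# ['AWS Bedrock', 'Anthropic', 'Google Vertex AI', 'Ollama', 'OpenAI']
print(len(only_holmesgpt), len(only_k8sgpt))  # 7 10
```

Exact-name matching understates the practical overlap, so treat the five shared names as a floor and check the approved provider against each project's docs directly.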

How does each tool handle Model Context Protocol?

Both projects support MCP, and again the shape differs.

K8sGPT hosts an MCP server that the project ships: "K8sGPT provides a Model Context Protocol server that exposes Kubernetes operations as standardized tools for AI assistants." The server exposes "12 tools for cluster analysis, resource management, and debugging" and "3 resources for cluster information access," with "Stateless HTTP mode for one-off invocations" and "Full integration with Claude Desktop and other MCP clients." The MCP v2 feature landed in release v0.4.27 on 18 December 2025.

HolmesGPT consumes MCP servers as data sources rather than hosting one. The data-sources catalogue lists MCP-labelled integrations for AWS, Azure, GitHub, GitLab, Jenkins, GCP, Kubernetes Remediation, MariaDB, Prefect, Sentry, and Splunk. The docs navigation makes the consumption pattern explicit through entries like "MCP Servers" and "OAuth MCP Servers."

The implication: K8sGPT publishes cluster operations for Claude Desktop and other MCP clients to consume. HolmesGPT subscribes to MCP-published tools across third-party systems. Teams building MCP-shaped workflows will pick the direction that matches their existing investment.
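
On the wire, the host-versus-consume distinction comes down to who sends which JSON-RPC 2.0 message: an MCP client discovers tools with `tools/list` and invokes one with `tools/call`, and the server answers. The sketch below only builds those request payloads and sends nothing. The tool name `analyze` and its `namespace` argument are hypothetical placeholders, not entries from K8sGPT's tool list.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request of the kind an MCP client sends."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Discover what the server offers, then call one tool by name.
list_tools = jsonrpc_request("tools/list")
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "analyze", "arguments": {"namespace": "default"}},  # hypothetical tool
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

A real session begins with an `initialize` handshake before either request. In this framing K8sGPT sits on the receiving end of these messages, while HolmesGPT is the side that sends them to third-party servers.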

Who governs each project, and how does that change the trust story?

The CNCF Sandbox label is identical on both projects. The economic shape behind each is not.

HolmesGPT is held under "HolmesGPT a Series of LF Projects, LLC", with origin attribution: "Originally created by Robusta.Dev, with major contributions from Microsoft". Robusta sells a managed SaaS product that integrates HolmesGPT, and Slack and Microsoft Teams integrations are flagged "Available via Robusta". This is a sponsored-open-source pattern.

K8sGPT is community-led. The June 2024 CNCF blog states: "unlike many popular projects, there is no company behind this project, and no business plan behind it" (CNCF Blog, 7 June 2024). The same post names production users: "Companies like Kubermatic, SpectroCloud, and Nethopper have enthusiastically embraced K8sGPT capabilities."

Neither shape is structurally better. Sponsored open source ships polish and integrations faster; community open source is harder to commercially deprecate. Match the governance to the team's risk model.

Release cadence and recent feature deltas

HolmesGPT shipped v0.30.1 on 20 May 2026, with notes for the release covering Loki raw-response handling on parse failure, a GitLab MCP datasource entry, a Bash echo allowlist fix, user_email persistence on chat requests, and documentation refinements (release tag).

K8sGPT's recent releases include v0.4.33 ("analyze previous logs for restarted containers," 13 May 2026), v0.4.32 ("add Azure API Type Support and add Custom HTTP Header," 22 April 2026), and v0.4.27 ("mcp v2," 18 December 2025) (Releases).

Both projects ship monthly or near-monthly. Neither has demonstrated a multi-month pause in the period documented.

What HolmesGPT and K8sGPT are NOT

Five misreadings of this comparison show up repeatedly in vendor briefings and procurement memos. Naming them in advance saves a procurement cycle.

  • Neither is an alerting platform. Alerts originate in Prometheus Alertmanager, Grafana, Datadog, CloudWatch, or PagerDuty. HolmesGPT fetches alerts from "AlertManager, PagerDuty, OpsGenie, or Jira"; K8sGPT integrates downstream of Prometheus alert rules. Buying either tool does not solve "we have too many or too few alerts."
  • Neither is a full AIOps platform. AIOps is a 2017-era category built on statistical correlation and noise reduction. Both tools sit downstream of that layer: once an alert lands, the agent investigates. Teams running BigPanda, Moogsoft, Dynatrace Davis, or PagerDuty Intelligent Alert Grouping should not expect either project to replace those products.
  • Neither is a managed SaaS by default. Both are open-source projects requiring self-hosting. Robusta sells a managed product around HolmesGPT, which is the closest commercial offering. K8sGPT has no commercial entity behind it per the June 2024 CNCF blog. A team that needs a vendor SOC 2 report against the open-source binary itself will not find one.
  • K8sGPT is not a multi-cloud reasoning tool. Its analysers map one-to-one to Kubernetes resource types. A managed RDS, a Datadog dashboard, or an OVH Bare Metal instance is invisible to K8sGPT's analysers.
  • HolmesGPT is not a deterministic rules engine. Its agent loop uses LLM tool-calling, so investigation paths are non-deterministic and depend on the LLM provider and prompt context. Teams that need bit-for-bit reproducible incident analysis should set expectations by the agent pattern, not by the behaviour of a runbook executor.

When should I choose HolmesGPT vs K8sGPT?

Pick HolmesGPT when:

  • The estate spans more than Kubernetes (VMs, managed databases, SaaS platforms at incident-critical positions).
  • The LLM choice is GitHub Copilot, GitHub Models, OpenRouter, or Robusta AI (HolmesGPT-specific).
  • The team wants a 24/7 background agent that can post to Slack and open GitHub PRs through MCP integration. Note that Operator Mode is marked as an Alpha release at time of writing.
  • The team values an explicit, project-documented "read-only access and respects RBAC" guarantee.
  • A managed SaaS option (via Robusta) is acceptable or attractive.

Pick K8sGPT when:

  • The estate is Kubernetes-first or Kubernetes-only.
  • The team wants a Go binary that runs as a CLI and an in-cluster operator out of the box.
  • The LLM choice is IBM watsonx, Oracle OCI GenAI, Cohere, Groq, HuggingFace, or Amazon SageMaker (K8sGPT-specific).
  • The team plans to publish cluster operations to MCP clients (Claude Desktop, custom tooling) rather than to consume external MCP services.
  • The team wants documented anonymisation of cluster object names and labels before LLM calls.
  • The team prefers a community-governed project with no commercial entity behind it.

The two are not directly substitutable for most teams. They are adjacent tools that can plausibly run alongside one another in a Kubernetes-heavy estate.
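
The two checklists can be turned into a rough tally for a team's own situation. This is a sketch under stated assumptions, not a scoring method from either project: the `TeamProfile` fields, the equal weighting of each signal, and the `tally` function are all hypothetical, and only some of the bullets above are encoded.

```python
from dataclasses import dataclass
from typing import Optional

# Provider names taken from the "Pick ... when" lists above.
HOLMESGPT_ONLY_LLMS = {"GitHub Copilot", "GitHub Models", "OpenRouter", "Robusta AI"}
K8SGPT_ONLY_LLMS = {
    "IBM watsonx", "Oracle OCI GenAI", "Cohere", "Groq", "HuggingFace", "Amazon SageMaker",
}


@dataclass
class TeamProfile:
    kubernetes_only: bool
    llm_provider: str
    mcp_direction: Optional[str] = None  # "publish", "consume", or None
    wants_managed_saas: bool = False
    prefers_community_governance: bool = False


def tally(profile):
    """Count one equally weighted vote per matching signal (an assumption)."""
    votes = {"HolmesGPT": 0, "K8sGPT": 0}
    votes["K8sGPT" if profile.kubernetes_only else "HolmesGPT"] += 1
    if profile.llm_provider in HOLMESGPT_ONLY_LLMS:
        votes["HolmesGPT"] += 1
    elif profile.llm_provider in K8SGPT_ONLY_LLMS:
        votes["K8sGPT"] += 1
    if profile.mcp_direction == "consume":
        votes["HolmesGPT"] += 1
    elif profile.mcp_direction == "publish":
        votes["K8sGPT"] += 1
    if profile.wants_managed_saas:
        votes["HolmesGPT"] += 1
    if profile.prefers_community_governance:
        votes["K8sGPT"] += 1
    return votes


mixed_estate = TeamProfile(kubernetes_only=False, llm_provider="OpenRouter", mcp_direction="consume")
cluster_team = TeamProfile(
    kubernetes_only=True, llm_provider="Groq", mcp_direction="publish",
    prefers_community_governance=True,
)
print(tally(mixed_estate))  # {'HolmesGPT': 3, 'K8sGPT': 0}
print(tally(cluster_team))  # {'HolmesGPT': 0, 'K8sGPT': 4}
```

A close tally is a hint that the two tools may belong side by side and is no reason to force a single pick.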

Where Aurora fits

Aurora by Arvo AI is a separate Apache 2.0 project at github.com/Arvo-AI/aurora. Compared to the two projects above, Aurora ships multi-cloud investigation (AWS, Azure, GCP, OVH, Scaleway, Kubernetes), a Memgraph-backed infrastructure dependency graph, hybrid (BM25 plus vector) RAG over runbooks and postmortems via Weaviate, and sandboxed kubectl execution into an isolated "untrusted" namespace with a four-layer command-safety pipeline (input rail, SigmaHQ signature match, per-org policy, LLM safety judge).

A team can run all three. The most common pattern in 2026 design-partner conversations is K8sGPT for continuous in-cluster posture, HolmesGPT or Aurora for incident investigation, and Aurora for the multi-cloud and remediation-staging path that K8sGPT does not target. For the full three-way comparison see Open-Source AI SRE: Aurora vs HolmesGPT vs K8sGPT.
