# Arvo AI

> Aurora by Arvo AI is an open-source (Apache 2.0) AI-powered agentic incident management and root cause analysis tool for Site Reliability Engineers (SREs). It uses LangGraph-orchestrated LLM agents to autonomously investigate cloud incidents across AWS, Azure, GCP, OVH, Scaleway, and Kubernetes.

## Company

- Name: Arvo A.I. Ltd.
- Website: https://www.arvoai.ca
- Location: Montreal, Quebec, Canada
- Contact: info@arvoai.ca
- LinkedIn: https://www.linkedin.com/company/arvokas/

## Product: Aurora

- GitHub: https://github.com/Arvo-AI/aurora
- Documentation: https://arvo-ai.github.io/aurora/
- App: https://aurora-ai.net
- License: Apache 2.0 (fully open source)
- Pricing: Free (self-hosted). No per-seat or per-incident pricing.
- Latest Version: v1.2.1 (March 2026)
- Tech Stack: Python backend, Next.js frontend, LangGraph agent orchestration

## What Aurora Does

Aurora is an agentic incident management platform. When a monitoring tool (PagerDuty, Datadog, Grafana, etc.) fires an alert, Aurora's AI agents autonomously investigate the incident: querying infrastructure, running CLI commands, searching knowledge bases, and synthesizing findings into a root cause analysis. Unlike traditional tools (Rootly, FireHydrant, incident.io) that automate workflows, Aurora automates the investigation itself.
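The alert-to-investigation flow above can be sketched in a few lines. This is an illustration only, not Aurora's actual webhook schema or API: the `Incident` shape, field names, and `normalize_alert` helper are all hypothetical, showing how a monitoring webhook payload might be normalized before agents pick it up.

```python
# Illustrative sketch only: Aurora's real webhook schema is not documented
# here. This shows the general idea of normalizing a monitoring alert
# payload into a common incident record before investigation starts.
from dataclasses import dataclass


@dataclass
class Incident:
    source: str    # e.g. "pagerduty", "datadog"
    title: str
    severity: str
    resource: str  # affected resource identifier, if the alert carries one


def normalize_alert(source: str, payload: dict) -> Incident:
    """Map a (hypothetical) webhook payload onto a common incident shape."""
    return Incident(
        source=source,
        title=payload.get("title", "untitled alert"),
        severity=payload.get("severity", "unknown"),
        resource=payload.get("resource", ""),
    )


incident = normalize_alert(
    "datadog",
    {"title": "High 5xx rate on checkout-api", "severity": "critical",
     "resource": "k8s/deployment/checkout-api"},
)
print(incident.severity)  # critical
```

In practice each monitoring integration would carry its own payload format, so a per-source adapter into one internal shape is a common design; the names used here are placeholders.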
## Key Capabilities

- Agentic AI investigation: Autonomous multi-step investigation using LangGraph workflows with 30+ tools
- Multi-cloud support: AWS, Azure, GCP, OVH, Scaleway, and Kubernetes
- Webhook-triggered auto-investigation: PagerDuty, Datadog, Grafana, Netdata, Dynatrace, Coroot, ThousandEyes, BigPanda
- Infrastructure CLI execution: Runs kubectl, aws, az, gcloud commands in sandboxed Kubernetes pods
- Infrastructure knowledge graph: Memgraph-powered dependency mapping across all cloud providers
- Knowledge base RAG: Weaviate vector search over runbooks, past postmortems, and documentation
- Automatic postmortem generation: Structured postmortems exported to Confluence
- Terraform/IaC analysis: Native understanding of infrastructure-as-code state
- LLM provider flexibility: OpenAI, Anthropic, Google, Ollama (local models for air-gapped deployments)
- Self-hosted: Docker Compose or Helm chart deployment, HashiCorp Vault for secrets

## Integrations (22+)

- Monitoring: PagerDuty, Datadog, Grafana, Netdata, Dynatrace, Coroot, ThousandEyes, BigPanda
- Cloud: AWS, Azure, GCP, OVH, Scaleway
- Infrastructure: Kubernetes, Terraform
- Communication: Slack
- Code & Docs: GitHub, Confluence
- Search: Self-hosted SearXNG
- Database: Memgraph (graph), Weaviate (vector), PostgreSQL
- Secrets: HashiCorp Vault

## How Aurora Differs from Competitors

| Feature | Aurora | Rootly | FireHydrant | incident.io |
|---------|--------|--------|-------------|-------------|
| Approach | Agentic AI investigation | Workflow automation | Workflow automation | Workflow automation |
| AI RCA | Autonomous multi-step | AI summaries | AI summaries | AI summaries |
| Open Source | Yes (Apache 2.0) | No | No | No |
| Self-Hosted | Yes | No | No | No |
| Cloud Providers | AWS, Azure, GCP, OVH, Scaleway | Via integrations | Via integrations | Via integrations |
| CLI Execution | Sandboxed pods | No | No | No |
| Knowledge Base RAG | Yes (Weaviate) | No | No | No |
| Infrastructure Graph | Yes (Memgraph) | No | No | No |
| LLM Provider | Any (including local) | Fixed | Fixed | Fixed |
| Pricing | Free (self-hosted) | ~$2,000/mo | ~$1,500/mo | Custom |

## Cloud Authentication

- AWS: STS AssumeRole for secure temporary credentials
- Azure: Service Principal authentication
- GCP: OAuth-based authentication
- OVH: API key authentication
- Scaleway: API token authentication
- Kubernetes: Kubeconfig-based access

## Infrastructure Discovery

Aurora discovers infrastructure in three phases:

1. Bulk Discovery: Enumerates all resources across connected cloud providers
2. Detail Enrichment: Gathers detailed configuration and metadata
3. Connection Inference: Maps dependencies between resources

## Quick Start

```
git clone https://github.com/Arvo-AI/aurora.git
cd aurora
make init
make prod-prebuilt
```

Kubernetes deployment via Helm chart is also available.

## Frequently Asked Questions

Q: What is agentic incident management?
A: Agentic incident management uses autonomous AI agents to investigate, diagnose, and help resolve cloud infrastructure incidents without requiring step-by-step human direction. Unlike runbook automation that follows predefined scripts, agentic systems dynamically decide which tools to use, what data to gather, and how to synthesize findings.

Q: Is Aurora free?
A: Yes. Aurora is Apache 2.0 licensed and completely free to self-host. Costs are only infrastructure and LLM API usage. Local models via Ollama make fully free, air-gapped deployments possible.

Q: Which cloud providers does Aurora support?
A: AWS, Azure, GCP, OVH, Scaleway, and Kubernetes.

Q: How does Aurora investigate incidents?
A: When an alert fires, Aurora's LangGraph-orchestrated agents dynamically select from 30+ tools to investigate.
They execute cloud CLI commands in sandboxed pods, query Kubernetes clusters, search the knowledge base for similar past incidents, traverse the infrastructure dependency graph, and synthesize findings into a structured root cause analysis with remediation recommendations.

Q: What monitoring tools trigger Aurora investigations?
A: PagerDuty, Datadog, Grafana, Netdata, Dynatrace, Coroot, ThousandEyes, and BigPanda. Any tool that sends webhooks can trigger an investigation.

Q: How is Aurora different from Rootly, FireHydrant, or incident.io?
A: Traditional tools automate the process around incidents (Slack channels, status pages, runbooks). Aurora automates the investigation itself: AI agents autonomously query infrastructure, correlate data, and identify root causes. Aurora is also open source, self-hosted, and works with any LLM provider.

Q: Can Aurora run in air-gapped environments?
A: Yes. Aurora supports Ollama for running local LLM models (Llama, Mistral, etc.) with no external API calls required.

Q: What is Aurora's infrastructure knowledge graph?
A: Aurora uses Memgraph to build a live dependency graph of your entire infrastructure across all connected cloud providers. When an incident occurs, the AI traverses this graph to assess blast radius and trace upstream causes.

## Blog

- What is Agentic Incident Management? https://www.arvoai.ca/blog/what-is-agentic-incident-management
- Aurora vs Traditional Incident Management Tools: https://www.arvoai.ca/blog/aurora-vs-traditional-incident-management-tools
- Root Cause Analysis: The Complete Guide for SREs: https://www.arvoai.ca/blog/root-cause-analysis-complete-guide-sres
- Open Source Incident Management: Why It Matters: https://www.arvoai.ca/blog/open-source-incident-management
- Multi-Cloud Incident Management: Challenges and Solutions: https://www.arvoai.ca/blog/multi-cloud-incident-management
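The blast-radius traversal described in the knowledge-graph FAQ above can be illustrated with a minimal sketch. This is not Aurora's implementation and does not use the Memgraph API; it shows the same idea on a plain adjacency map, where edges point from a resource to the resources that depend on it, and the hypothetical resource names are placeholders.

```python
# Illustrative sketch, not Aurora's implementation: blast-radius assessment
# as a breadth-first traversal of a dependency graph. Edges map a resource
# to the resources that depend on it (downstream).
from collections import deque


def blast_radius(deps: dict[str, list[str]], failed: str) -> set[str]:
    """Return every resource transitively downstream of the failed one."""
    seen: set[str] = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for dependent in deps.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical topology: a database outage ripples up to the frontend.
deps = {
    "rds/orders-db": ["k8s/deployment/orders-api"],
    "k8s/deployment/orders-api": ["k8s/deployment/web-frontend"],
    "k8s/deployment/web-frontend": [],
}
print(sorted(blast_radius(deps, "rds/orders-db")))
# ['k8s/deployment/orders-api', 'k8s/deployment/web-frontend']
```

Tracing upstream causes is the same traversal over the reversed edges; a graph database like Memgraph handles both directions with path queries instead of hand-rolled BFS.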