Get a Head Start on Fixing Alerts with AI Investigation

Jul 29, 2023 | Programming

HolmesGPT – The Open Source On-Call DevOps Agent


HolmesGPT is the only AI assistant that investigates incidents the way a human does: it looks at an alert and keeps fetching missing data until it finds the root cause. It can be powered by OpenAI, Azure AI, AWS Bedrock, or any tool-calling LLM of your choice, including open-source models.

What Can HolmesGPT Do?

  • Investigate Incidents (AIOps) from various sources including PagerDuty, OpsGenie, Prometheus, and Jira.
  • Bidirectional Integrations to see investigation results inside your existing ticketing/incident management system.
  • Automated Triage: Use HolmesGPT as a first responder to flag critical alerts and prioritize them.
  • Alert Enrichment: Automatically add context to alerts such as logs and microservice health information for faster root cause determination.
  • Identify Cloud Problems by asking HolmesGPT about unhealthy infrastructure.
  • Runbook Automation in Plain English: Speed up responses by investigating according to provided runbooks.


Examples

Kubernetes Troubleshooting

bash
holmes ask "what pods are unhealthy in my cluster and why?"

Prometheus Alert RCA (Root Cause Analysis)

Investigate Prometheus alerts right from Slack with the official Robusta integration.


Key Features

  • Connects to Existing Observability Data: Find new correlations without the need to gather new data.
  • Compliance Friendly: Run on-premises or in the cloud with your own LLM.
  • Transparent Results: A log of the AI’s actions gives insight into how it reached its conclusions.
  • Extensible Data Sources: Connect custom data by providing your own tool definitions.
  • Runbook Automation: Optionally provide runbooks to automate the AI’s investigation process.
  • Integrates with Existing Workflows: Connect Slack and Jira for seamless results delivery.

Installation

Prerequisite: Get an API key for a supported LLM.

Installation Methods:

  • Brew (Mac/Linux)
            sh
            brew tap robusta-dev/homebrew-holmesgpt
            brew install holmesgpt
            holmes --help
            
            
  • Prebuilt Docker Containers
            bash
            docker run -it --net=host \
            -v ~/.holmes:/root/.holmes \
            -v ~/.aws:/root/.aws \
            -v $HOME/.kube/config:/root/.kube/config \
            us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes ask "what pods are unhealthy and why?"
            
            
  • Cutting Edge (Pip and Pipx)
            bash
            pipx install https://github.com/robusta-dev/holmesgpt/archive/refs/heads/master.zip
            holmes version
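
Before picking one of the methods above, it can help to check which of the required tools are already installed. The following is a minimal sketch and not part of HolmesGPT: `available_installers` is a hypothetical helper name, and it relies only on the standard `command -v` shell builtin to list which of the given commands are on the PATH.

```shell
# Print each given command name that resolves to something runnable on PATH.
# `available_installers` is a hypothetical helper, not a HolmesGPT command.
available_installers() {
  local tool
  for tool in "$@"; do
    # `command -v` exits 0 only when the name can be resolved.
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Check the three tools used by the installation methods in this section.
available_installers brew docker pipx
```

If the call prints nothing, none of the three tools is installed yet, so install one of them first and then follow the matching method.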
            
            

Troubleshooting

As with any software, installation and usage can sometimes come with issues. Here are some troubleshooting tips:

  • Ensure that you have all necessary dependencies installed, especially if running from source.
  • If your AI model is not responding, verify that your API key is active and correctly set up in your environment variables.
  • Double-check that your network configurations allow for proper API calls to the service you’ve chosen.
  • If any integration (like Slack, Jira, etc.) fails, ensure that your tokens and API keys are correct and that you’ve set the correct environment variables.
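
The environment-variable checks above can be scripted. The sketch below is an illustration rather than anything shipped with HolmesGPT: `missing_env_vars` is a hypothetical helper, and `OPENAI_API_KEY` is used only as an example on the assumption that OpenAI is the chosen provider. Other providers and integrations expect different variable names, so substitute the ones from your own setup.

```shell
# Print the name of every listed environment variable that is unset or empty.
# Returns non-zero if at least one is missing or not a valid variable name.
# `missing_env_vars` is a hypothetical helper, not a HolmesGPT command.
missing_env_vars() {
  local name value status=0
  for name in "$@"; do
    # Accept only plain identifiers so the eval below is safe.
    case "$name" in
      ''|[0-9]*|*[!A-Za-z0-9_]*)
        printf 'invalid variable name: %s\n' "$name" >&2
        status=1
        continue
        ;;
    esac
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
      status=1
    fi
  done
  return "$status"
}

# Assumed example: OpenAI as the provider. Adjust the names for your setup.
missing_env_vars OPENAI_API_KEY || echo "Set the variables listed above, then retry." >&2
```

Running this before `holmes` separates configuration problems from network or provider problems, since a missing key is reported by name instead of surfacing as a failed API call.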


Customizing HolmesGPT

HolmesGPT can be tailored further to investigate specific issues by adding:

  • Custom Tools: More data capabilities through YAML-defined tools.
  • Custom Runbooks: Explicit instructions to enhance investigation efficiency.

Conclusion

With HolmesGPT, teams can automate the first pass of alert investigation and shorten the time it takes to identify a root cause. Its human-like reasoning and its ability to plug into existing workflows such as Slack and Jira make it a useful addition for DevOps teams.

