An SRE agent benchmark inspired by SWE-bench, focused on real-world Site Reliability Engineering tasks: incident response, infra changes, observability triage, and reliability improvements across Kubernetes environments.

agentkube/SRE-bench

SRE-bench

We are building an SRE agent benchmark inspired by SWE-bench: an open, reproducible framework for evaluating agents on Kubernetes tasks such as incident response, infra changes, observability triage, and reliability improvements. The repo will host modular scenarios (fault injectors, manifests, observability specs), an evaluation harness, and baseline agents.

The goal is to measure practical agent capabilities such as time-to-diagnose, safe remediation rate, mean time to recovery (MTTR), and explainability.
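
As a rough illustration of how the numeric metrics named above could be aggregated, here is a minimal Python sketch. The `RunRecord` schema, its field names, the `summarize` helper, and the scenario IDs are all hypothetical and are not taken from this repository's harness. The sketch also assumes that "safe remediation rate" means the fraction of all runs that recovered without side effects. Explainability is left out because it does not reduce to a simple timestamp calculation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunRecord:
    """One agent attempt at one scenario (hypothetical schema, not from the repo)."""
    scenario_id: str
    fault_injected_at: float        # seconds since scenario start
    diagnosed_at: Optional[float]   # None if the agent never named a root cause
    recovered_at: Optional[float]   # None if the service never recovered
    remediation_was_safe: bool      # assumption: no unrelated workloads disrupted


def mean(values: List[float]) -> Optional[float]:
    """Arithmetic mean, or None when there is nothing to average."""
    return sum(values) / len(values) if values else None


def summarize(runs: List[RunRecord]) -> dict:
    """Aggregate time-to-diagnose, MTTR, and safe remediation rate over runs."""
    # Only runs that reached a diagnosis contribute to time-to-diagnose.
    ttd = [r.diagnosed_at - r.fault_injected_at
           for r in runs if r.diagnosed_at is not None]
    # Only runs that actually recovered contribute to MTTR.
    ttr = [r.recovered_at - r.fault_injected_at
           for r in runs if r.recovered_at is not None]
    # A run counts as a safe remediation only if it recovered AND was safe.
    safe = [r for r in runs
            if r.recovered_at is not None and r.remediation_was_safe]
    return {
        "mean_time_to_diagnose_s": mean(ttd),
        "mttr_s": mean(ttr),
        "safe_remediation_rate": len(safe) / len(runs) if runs else None,
    }


# Illustrative data only; these scenario IDs are made up.
runs = [
    RunRecord("pod-crashloop", 10.0, 70.0, 130.0, True),
    RunRecord("dns-outage", 5.0, 95.0, None, False),
    RunRecord("oom-kill", 0.0, 30.0, 60.0, False),
]
summary = summarize(runs)
print(summary)
```

Averaging only over runs that reached a diagnosis or recovery flatters agents that fail often, so a real harness would probably report failure counts alongside these means.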

Purpose

This repository also serves as:

  • Benchmarking platform for evaluating SRE agent performance
  • Agentkube POC environment for testing autonomous Kubernetes agents
  • Community-driven scenario library to which users can contribute diverse scenarios for testing their own agents

See scenario documentation for available test cases and contribution guidelines.
