About
This skill brings Site Reliability Engineering (SRE) practices to Claude Code so that developers can define and track service performance. It guides users through establishing Service Level Indicators (SLIs), setting Service Level Objectives (SLOs), and managing error budgets to balance innovation against reliability. It includes Prometheus recording rules and Grafana dashboard structures that help teams implement automated monitoring and multi-window burn rate alerting, which reduces alert noise and can improve incident response times.
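
To make the error budget and multi-window burn rate ideas concrete, here is a minimal Python sketch of the arithmetic. It is not taken from the skill itself: the names `Window`, `error_budget`, `burn_rate`, and `should_page` are hypothetical, and the 14.4 threshold is an assumption borrowed from the commonly cited "2% of a 30-day budget consumed in 1 hour" paging rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Good/total event counts observed over one lookback window (hypothetical type)."""
    good: int
    total: int

    def error_rate(self) -> float:
        # Fraction of bad events in the window; treated as 0.0 when there is no traffic.
        if self.total == 0:
            return 0.0
        return 1.0 - self.good / self.total


def error_budget(slo_target: float) -> float:
    """Allowed fraction of bad events, e.g. about 0.001 for a 99.9% SLO."""
    return 1.0 - slo_target


def burn_rate(window: Window, slo_target: float) -> float:
    """How many times faster than 'exactly on budget' the errors are arriving."""
    return window.error_rate() / error_budget(slo_target)


def should_page(long_w: Window, short_w: Window, slo_target: float,
                threshold: float = 14.4) -> bool:
    # Multi-window rule: the long window shows the burn is significant, and the
    # short window shows it is still happening. Both must exceed the threshold.
    # The default threshold is an assumed value, not one defined by the skill.
    return (burn_rate(long_w, slo_target) > threshold
            and burn_rate(short_w, slo_target) > threshold)


if __name__ == "__main__":
    slo = 0.999  # assumed 99.9% availability target
    last_hour = Window(good=98_000, total=100_000)    # 2% errors over the long window
    last_5m_bad = Window(good=970, total=1_000)       # still failing in the short window
    last_5m_ok = Window(good=9_990, total=10_000)     # recovered in the short window

    print("burn rate (1h):", round(burn_rate(last_hour, slo), 2))
    print("page while still failing:", should_page(last_hour, last_5m_bad, slo))
    print("page after recovery:", should_page(last_hour, last_5m_ok, slo))
```

Requiring both windows to exceed the threshold is what cuts noise: a brief spike that has already recovered no longer pages, while a sustained burn still does. In a Prometheus setup the same ratios would typically come from recording rules rather than application code.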