About
This skill provides a framework for applying Site Reliability Engineering (SRE) practices within Claude Code. It helps developers define availability and latency SLIs, set realistic SLO targets, and calculate error budgets that balance innovation with service stability. Ready-to-use Prometheus recording rules and multi-window burn rate alerts help teams automate observability and make data-driven decisions about deployment freezes and reliability improvements.
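
To make the error budget and burn rate ideas concrete, here is a minimal Python sketch of the arithmetic behind them. It is an illustration only, not the skill's own implementation: the function names are hypothetical, the 30-day window is an assumed default, and the 14.4 threshold is the commonly cited fast-burn value (2% of a 30-day budget consumed in one hour), not something specified by this skill.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests (or time) allowed to fail under the SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Error budget expressed as minutes of full downtime in the window.

    The 30-day default is an assumption made for this example.
    """
    return error_budget(slo_target) * window_days * 24 * 60


def burn_rate(observed_error_ratio: float, slo_target: float) -> float:
    """How fast the budget is being spent.

    A value of 1.0 means the budget runs out exactly at the end of the
    SLO window; higher values exhaust it proportionally sooner.
    """
    return observed_error_ratio / error_budget(slo_target)


def should_page(
    short_window_error_ratio: float,
    long_window_error_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # assumed fast-burn threshold, see lead-in
) -> bool:
    """Multi-window check: alert only if BOTH windows exceed the threshold.

    Requiring the short window as well keeps the alert from firing long
    after a brief spike has already ended.
    """
    return (
        burn_rate(short_window_error_ratio, slo_target) > threshold
        and burn_rate(long_window_error_ratio, slo_target) > threshold
    )


if __name__ == "__main__":
    slo = 0.999  # a 99.9% availability target
    print(f"budget: {error_budget(slo):.4%}")
    print(f"downtime allowed: {allowed_downtime_minutes(slo):.1f} min / 30 days")
    print(f"page? {should_page(0.02, 0.02, slo)}")
```

For a 99.9% target, the budget is 0.1% of requests, or about 43.2 minutes of full downtime over 30 days. In a Prometheus setup, the two error ratios would typically come from recording rules evaluated over a short and a long window.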