About
This skill provides a framework for Site Reliability Engineering (SRE) practices that helps teams balance innovation velocity with system stability. It gives structured guidance on defining measurable Service Level Indicators (SLIs), setting realistic Service Level Objective (SLO) targets, and calculating error budgets with industry-standard formulas. It also includes support for Prometheus recording rules and multi-window burn rate alerts, so developers can move from reactive monitoring to proactive reliability management and use measured data to check whether services meet user expectations.
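As a rough illustration of the error budget and burn rate arithmetic mentioned above, here is a minimal Python sketch. It is not taken from the skill itself: the function names, the 30-day window, and the 14.4 paging threshold (a value commonly paired with a 1-hour long window and a 5-minute short window) are assumptions chosen for the example.

```python
# Illustrative sketch only: all names and default values here are
# assumptions for this example, not part of the skill's actual interface.


def error_budget(slo_target: float) -> float:
    """Return the allowed failure fraction for an SLO target (0.999 -> ~0.001)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Convert the error budget into minutes of full outage over the window."""
    return error_budget(slo_target) * window_days * 24 * 60


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the budget is being spent; 1.0 means exactly on budget."""
    return error_ratio / error_budget(slo_target)


def should_page(
    long_window_error_ratio: float,
    short_window_error_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # assumed default for a 1h/5m window pair
) -> bool:
    """Multi-window check: alert only if BOTH windows exceed the threshold.

    The long window shows the burn is significant; the short window shows
    it is still happening, so the alert clears soon after recovery.
    """
    return (
        burn_rate(long_window_error_ratio, slo_target) > threshold
        and burn_rate(short_window_error_ratio, slo_target) > threshold
    )


if __name__ == "__main__":
    slo = 0.999
    print(f"Error budget: {error_budget(slo):.4%}")
    print(f"Allowed downtime (30d): {allowed_downtime_minutes(slo):.1f} min")
    # 2% of requests failing in both windows is a burn rate of about 20.
    print("Page on sustained 2% errors:", should_page(0.02, 0.02, slo))
    # The long window is still elevated but the short window has recovered.
    print("Page after recovery:", should_page(0.02, 0.0005, slo))
```

With a 99.9% target over 30 days, the budget works out to about 43.2 minutes of full outage. In a Prometheus setup, the two error ratios would typically come from recording rules evaluated over the long and short windows, and the comparison itself would live in an alerting rule rather than in application code.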