What monitoring tools does this Claude Code skill support?

It is designed to generate configurations and strategies for industry-standard tools including Prometheus, Grafana, ELK/Kibana, Jaeger, and alerting platforms like PagerDuty.

Does it provide guidance on log and trace management?

Yes, it defines structured log formats and recommends trace sampling strategies (head-based vs. tail-based) to balance visibility with storage costs.
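The difference between the two sampling strategies can be sketched in a few lines of Python. This is only an illustration, not the skill's actual output: the field names (`ts`, `level`, `msg`, `trace_id`), the function names, and the 500 ms latency threshold are all assumptions chosen for the example.

```python
import hashlib
import json
import time


def structured_log(level, message, trace_id, **fields):
    """Render one log event as a single JSON line (field names are hypothetical)."""
    event = {"ts": time.time(), "level": level, "msg": message, "trace_id": trace_id}
    event.update(fields)
    return json.dumps(event, sort_keys=True)


def head_sample(trace_id, rate):
    """Head-based: decide when the trace starts, using only the trace ID.

    Hashing the ID makes the decision deterministic, so every service that
    sees the same trace ID reaches the same keep/drop verdict.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < rate * 10_000


def tail_sample(spans, latency_threshold_ms=500.0):
    """Tail-based: decide after the trace finishes.

    Keeps traces that contain an error or at least one slow span. The
    threshold is an assumed value, not one taken from the skill.
    """
    has_error = any(span["error"] for span in spans)
    slowest_ms = max((span["duration_ms"] for span in spans), default=0.0)
    return has_error or slowest_ms >= latency_threshold_ms
```

Head-based sampling is cheap because the verdict is made before any spans are collected, but it cannot know which traces will turn out to be interesting. Tail-based sampling buffers whole traces so it can keep the failing or slow ones, at the cost of more memory and processing in the collector.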

Can this skill help with dashboard design?

Yes, it creates hierarchical dashboard specifications (Overview, Service, Component, Instance) following cognitive load best practices and the Golden Signals methodology.

How does it help reduce on-call alert fatigue?

The skill applies multi-window burn-rate alerting logic and hysteresis, ensuring that only significant threats to your SLOs trigger a page, while less urgent issues are routed to lower-priority channels.
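As a rough sketch of how multi-window burn-rate logic and hysteresis fit together, consider the Python below. It is not the skill's implementation. The window pairs and thresholds (14.4 for 1h/5m, 6 for 6h/30m, 1 for 3d/6h) are the commonly cited values for a 30-day SLO and are assumptions here, as are all class and function names.

```python
from dataclasses import dataclass


def burn_rate(error_ratio, slo_target):
    """How fast the error budget is being spent; 1.0 means exactly on budget.

    Assumes slo_target is strictly below 1.0.
    """
    return error_ratio / (1.0 - slo_target)


@dataclass(frozen=True)
class WindowPair:
    name: str         # "long/short" window label
    threshold: float  # burn rate both windows must reach
    severity: str     # "page" or "ticket"


# Assumed policy for a 30-day SLO; tune for your own service.
POLICY = [
    WindowPair("1h/5m", 14.4, "page"),
    WindowPair("6h/30m", 6.0, "page"),
    WindowPair("3d/6h", 1.0, "ticket"),
]


def evaluate(error_ratios, slo_target):
    """error_ratios maps pair name -> (long_window_ratio, short_window_ratio).

    An alert fires only when BOTH windows exceed the threshold: the long
    window shows the burn is significant, the short one that it is still
    happening.
    """
    for pair in POLICY:
        long_ratio, short_ratio = error_ratios[pair.name]
        if (burn_rate(long_ratio, slo_target) >= pair.threshold
                and burn_rate(short_ratio, slo_target) >= pair.threshold):
            return pair.severity
    return None


class Hysteresis:
    """Fire at or above `fire_at`, clear only below `clear_at`, to avoid flapping."""

    def __init__(self, fire_at, clear_at):
        self.fire_at = fire_at
        self.clear_at = clear_at
        self.firing = False

    def update(self, value):
        if not self.firing and value >= self.fire_at:
            self.firing = True
        elif self.firing and value < self.clear_at:
            self.firing = False
        return self.firing
```

In this sketch a fast, severe burn returns `"page"`, a slow but sustained burn returns `"ticket"`, and anything else returns `None`, which mirrors the routing of urgent and less urgent issues to different channels.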

Observability Designer

Name: Observability Designer
Author: borghei

Analytics & Monitoring

Architects production-grade observability strategies using SLI/SLO frameworks, multi-window alerting, and golden signal monitoring.

The Observability Designer skill enables Claude to act as a Senior Site Reliability Engineer, helping you design and implement comprehensive monitoring systems. It provides a workflow for defining Service Level Indicators (SLIs) and Service Level Objectives (SLOs), calculating error budgets, and configuring multi-window burn-rate alerts that minimize on-call fatigue. By combining metrics, logs, and distributed traces into one strategy, it helps keep your production services reliable, transparent, and manageable using tools like Prometheus, Grafana, and Jaeger.
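The error budget arithmetic mentioned above is simple enough to show directly. The following is a minimal sketch with hypothetical function names and an assumed 30-day window. It illustrates the standard calculation, not the skill's own code.

```python
def error_budget_minutes(slo_target, window_days=30):
    """Minutes of full unavailability the SLO tolerates over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target, good_events, total_events):
    """Fraction of the error budget left for a request-based SLI.

    1.0 means untouched, 0.0 means fully spent, negative means overspent.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if total_events == 0:
        return 1.0  # no traffic, so nothing has been spent
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad
```

For example, a 99.9% target over 30 days allows roughly 43.2 minutes of downtime, and a service that has served 1,000,000 requests with 500 failures has used about half of its budget.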

Key Features

01 Automated SLI/SLO framework design and error budget calculation

02 Multi-window burn-rate alert optimization to reduce notification noise

03 Hierarchical Grafana dashboard generation based on Golden Signals

04 50 GitHub stars

05 Automated actionable runbook generation for critical alerts

06 Tail-based distributed tracing and structured logging strategy design

Use Cases

01 Designing data-driven dashboards for SREs, developers, and stakeholders

02 Establishing reliability standards and monitoring for a new microservice

03 Reducing alert fatigue by tuning noisy Prometheus rules and thresholds

Install with 🐟 Skill.Fish

npx skillfish add borghei/claude-skills observability-designer

For use in Claude.ai and ChatGPT
