Discover Agent Skills for analytics & monitoring. Browse 47 skills for Claude, ChatGPT & Codex.
Configures and deploys OpenTelemetry pipelines to manage traces, metrics, and logs in Kubernetes environments.
Optimizes Python application speed and memory efficiency through advanced profiling, benchmarking, and implementation strategies.
Monitors active development in real time to detect and prevent architectural drift and scope creep.
Monitors and analyzes application error rates across HTTP endpoints, databases, and background jobs to improve system reliability.
Automates the deployment and configuration of centralized logging infrastructure using ELK, Loki, or Splunk.
Analyzes infrastructure utilization and forecasts growth trends to provide proactive scaling recommendations and cost estimates.
Analyzes network request patterns and diagnoses latency bottlenecks to optimize application performance and communication efficiency.
Automates the deployment and configuration of centralized logging solutions like ELK, Loki, and Splunk for production environments.
Centralizes performance metrics from diverse applications and infrastructure into a unified monitoring and alerting system.
Automates the analysis and integration of logging, metrics, and tracing into existing software applications.
Analyzes and optimizes network request patterns to reduce latency and improve application performance.
Automates the deployment and configuration of production-ready monitoring stacks including Prometheus, Grafana, and Datadog.
Automates the configuration of uptime, transaction, and API monitoring to ensure application performance and availability.
Identifies and resolves memory leaks in code to improve application performance and stability.
Builds and manages production-ready Grafana dashboards for real-time observability and metric visualization.
Monitors real-time database health, detects long-running transactions, and identifies lock contention issues using proactive alerting.
Implements end-to-end request tracking across microservices using Jaeger and Tempo to identify performance bottlenecks and system dependencies.
Monitors and optimizes PostgreSQL and MySQL performance through real-time metrics, predictive alerts, and automated remediation.
Creates and manages production-ready Grafana dashboards for real-time visualization of system, infrastructure, and application metrics.
Configures comprehensive Prometheus monitoring environments including metric collection, alerting rules, and service discovery.
Establishes measurable reliability targets using SLIs, SLOs, and error budgets to balance service stability with innovation velocity.
Verifies the integrity and availability of sovereign-memory systems to prevent context loss during AI workflows.
Monitors and optimizes application resource consumption including CPU, memory, and network I/O to improve performance and reduce costs.
Monitors PostgreSQL and MySQL health using real-time metrics and predictive alerts to ensure database performance and uptime.
Diagnoses and resolves application performance issues across CPU, memory, I/O, and database layers to optimize resource utilization.
Identifies and resolves software bottlenecks through systematic measurement and empirical optimization workflows.
Manages and reports software implementation progress through real-time task tracking and automated metric calculation.
Configures and enables OpenTelemetry monitoring for Claude Code to track metrics, logs, and traces in real time.
Identifies and resolves software performance bottlenecks through systematic measurement and data-driven optimization.
Configures comprehensive Prometheus monitoring for infrastructure and applications through metrics collection, scraping, and alerting rules.