Extends effective AI context capacity through strategic compaction, observation masking, and KV-cache optimization to reduce token costs and improve performance.
This skill provides a comprehensive framework for managing AI context windows, enabling Claude to handle complex, long-running tasks that would otherwise exceed token limits. It implements advanced techniques such as summary-based compaction, observation masking to elide verbose tool outputs, and KV-cache-friendly data ordering to maximize inference efficiency. By focusing on signal preservation over raw data retention, this skill allows agents to maintain high performance across massive codebases and extended conversations while significantly reducing operational latency and API costs.
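The compaction idea above can be sketched in a few lines. This is a minimal, illustrative sketch, not the skill's actual implementation: the 4-characters-per-token heuristic and all function names (`estimate_tokens`, `compact_history`) are assumptions for demonstration.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def compact_history(messages: list[dict], budget: int) -> list[dict]:
    """Summary-based compaction: once the running token estimate exceeds
    the budget, replace the oldest messages with a single summary stub.
    Recent messages are kept verbatim, since they carry the most signal
    for the next turn."""
    total = sum(estimate_tokens(m["content"]) for m in messages)
    if total <= budget:
        return messages
    kept: list[dict] = []
    running = 0
    # Walk newest-first, keeping messages until the budget is spent.
    for m in reversed(messages):
        cost = estimate_tokens(m["content"])
        if running + cost > budget:
            break
        kept.append(m)
        running += cost
    dropped = len(messages) - len(kept)
    summary = {
        "role": "system",
        "content": f"[Summary of {dropped} earlier messages elided for space]",
    }
    return [summary] + list(reversed(kept))
```

In a real system the summary stub would be produced by a model call rather than a placeholder string; the point is that old turns are traded for a compact summary while recent turns survive intact.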
Key Features
1. Intelligent context compaction and summarization
2. Observation masking for verbose tool outputs
3. KV-cache optimization for cost and latency reduction
4. Sub-agent partitioning for isolated context workflows
5. Trigger-based optimization driven by token utilization
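Observation masking can be illustrated with a small sketch. This is a hypothetical example, assuming that the head and tail of a tool's output usually carry the most signal (the command echo and the final status lines); the function name and size cutoffs are illustrative.

```python
def mask_observation(output: str, head: int = 200, tail: int = 100) -> str:
    """Elide the middle of a verbose tool output, keeping the head and
    tail. The placeholder records how much was removed, so the agent
    knows the observation was truncated rather than short."""
    if len(output) <= head + tail:
        return output  # Short outputs pass through untouched.
    elided = len(output) - head - tail
    return (
        output[:head]
        + f"\n... [{elided} characters elided] ...\n"
        + output[-tail:]
    )
```

Applied to every stored tool observation, this keeps long trajectories from being dominated by raw logs and file dumps.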
Use Cases
1. Managing long-running agent trajectories without performance degradation
2. Analyzing massive documentation or codebases that exceed standard context limits
3. Scaling production AI systems by minimizing token consumption and costs
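Two of the techniques named above, KV-cache-friendly ordering and trigger-based optimization, can be sketched together. This is a hedged illustration under simple assumptions: the function names and the 0.8 utilization threshold are hypothetical, not taken from the skill itself.

```python
def build_prompt(system: str, tools: str, turns: list[str]) -> str:
    """KV-cache-friendly ordering: stable content (system prompt, tool
    schemas) comes first and is byte-identical between calls, so an
    inference server that caches by prefix can reuse those KV entries.
    Only the per-turn suffix varies."""
    return system + "\n" + tools + "\n" + "\n".join(turns)

def should_compact(used_tokens: int, context_limit: int,
                   threshold: float = 0.8) -> bool:
    """Trigger compaction only once utilization crosses a threshold,
    rather than on every turn, so the stable prefix (and any KV-cache
    entries keyed on it) is invalidated as rarely as possible."""
    return used_tokens / context_limit >= threshold
```

Successive prompts built this way share a common prefix, which is exactly what prefix-caching inference servers exploit to cut latency and cost.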