Reduces LLM token usage by semantically compressing prompts while preserving meaning and core constraints.
Token Compressor is a two-stage pipeline that optimizes Large Language Model (LLM) workflows by significantly reducing token usage without losing semantic intent. In the first stage, a local LLM rewrites the prompt to its semantic minimum while preserving all conditionals and negations. In the second stage, the compressed prompt is validated with embedding similarity: if its cosine similarity to the original falls below a set threshold, the original prompt is used as a fallback, ensuring no critical meaning is lost. The result is shorter prompts, lower operational costs, and consistent LLM response quality.
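The validation stage can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the `embed` function here is a stand-in bag-of-words vectorizer (the real pipeline would use a sentence-embedding model), and the threshold value is hypothetical.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in "embedding": bag-of-words term counts.
    # A real pipeline would call a sentence-embedding model here.
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def validate_compression(original: str, compressed: str,
                         threshold: float = 0.85) -> str:
    # Stage 2: accept the compressed prompt only if it stays
    # semantically close to the original; otherwise fall back.
    sim = cosine_similarity(embed(original), embed(compressed))
    return compressed if sim >= threshold else original
```

With a strict threshold the fallback keeps the original prompt; with a looser one the compressed prompt passes through.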
Key Features
1. Two-stage semantic compression pipeline
2. LLM-based prompt rewriting (llama3.2:1b)
3. CLI, Python API, and MCP server integration
4. Preserves conditionals and negations in prompts
5. Embedding validation to prevent meaning loss
Use Cases
1. Minimize LLM token consumption and associated costs
2. Integrate into LLM applications for automatic prompt compression
3. Optimize prompt length for models with token limits