Persistent task queue for managing AI requests efficiently
Dynamic model selection with allowlist and denylist capabilities
Configurable and encrypted caching system to improve performance
Comprehensive tools for Ollama interaction, including smart and speculative generation
Advanced prompt optimization and speculative decoding for AI models
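The allowlist/denylist model selection mentioned above could work roughly as follows; this is a minimal sketch using glob-style patterns, and the function name and pattern semantics are illustrative assumptions, not the project's actual API:

```python
from fnmatch import fnmatch

def select_models(available, allow=("*",), deny=()):
    """Filter model names: keep a model only if it matches at least one
    allow pattern and no deny pattern (glob-style, via fnmatch)."""
    chosen = []
    for name in available:
        # Denylist takes precedence over the allowlist.
        if any(fnmatch(name, pattern) for pattern in deny):
            continue
        if any(fnmatch(name, pattern) for pattern in allow):
            chosen.append(name)
    return chosen

# Example: allow the llama3 family but exclude the 70B variant.
models = ["llama3:8b", "mistral:7b", "llama3:70b"]
print(select_models(models, allow=("llama3:*",), deny=("*:70b",)))
```

Giving the denylist precedence keeps the rule easy to reason about: a single deny pattern reliably blocks a model even if a broad allow pattern would otherwise admit it.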