01 Zero-shot voice cloning from 3-10 second audio samples
02 Multi-voice synthesis with 54+ voices, 19 emotions, and natural language style control
03 Real-time streaming and SSML support for fine-grained speech control
04 Paralinguistic tags (e.g., [laugh], [sigh]) and support for 23 languages
05 0 GitHub stars
06 Integration as an MCP server, providing 40+ tools for AI agents
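
The SSML and paralinguistic-tag features listed above can be illustrated with a small sketch. It uses only the Python standard library and standard SSML elements (`<speak>`, `<prosody>`, `<break>`). The helper names `build_ssml`, `find_tags`, and `unknown_tags` are hypothetical and are not part of this project's API. The tag set is limited to the two examples the list mentions, and whether the server accepts bracketed tags inside SSML is an assumption.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: only the two tags named in the feature list are treated as known.
KNOWN_TAGS = {"laugh", "sigh"}


def build_ssml(text, rate="medium", pause_ms=300):
    """Wrap text in standard SSML: a <prosody> element followed by a <break>."""
    speak = ET.Element("speak")
    prosody = ET.SubElement(speak, "prosody", {"rate": rate})
    prosody.text = text  # ElementTree escapes &, <, > when serializing
    ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    return ET.tostring(speak, encoding="unicode")


def find_tags(text):
    """Return bracketed tags such as [laugh], in order of appearance."""
    return re.findall(r"\[([a-z_]+)\]", text)


def unknown_tags(text):
    """Return tags that are not in KNOWN_TAGS, so typos can be caught early."""
    return [tag for tag in find_tags(text) if tag not in KNOWN_TAGS]


line = "That was close [sigh] but we made it [laugh]"
ssml = build_ssml(line, rate="slow", pause_ms=500)
print(ssml)
print(find_tags(line))
print(unknown_tags("Hello [cough]"))
```

Checking tags before sending text to a synthesis endpoint is one way to catch misspelled tags that might otherwise be read aloud literally. The real set of supported tags should come from the project's own documentation.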