Generate QA datasets from diverse documents (PDFs, etc.) with configurable settings
28 GitHub stars
Evaluate RAG systems using a simple 2-line CLI or a flexible Python API
Ensure 100% data privacy with support for local LLMs like Ollama, requiring no cloud API keys
Integrate with various LLM providers including OpenAI, Anthropic, DashScope, vLLM, and any OpenAI-compatible API
Conduct detailed multi-metric evaluations across five diagnostic dimensions (Correctness, Completeness, Relevance, Conciseness, Faithfulness)
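
To make the five diagnostic dimensions concrete, here is a minimal Python sketch of how per-answer scores might be stored and aggregated. It is not the project's actual API, which this listing does not show. The names `MetricScores`, `overall`, `weakest`, and `summarize`, the 0-to-1 score scale, and the unweighted averaging are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from statistics import mean


@dataclass(frozen=True)
class MetricScores:
    """Hypothetical container: one score per diagnostic dimension (assumed 0 to 1)."""
    correctness: float
    completeness: float
    relevance: float
    conciseness: float
    faithfulness: float

    def overall(self) -> float:
        # Unweighted mean across the five dimensions (an assumed aggregation rule).
        return mean(getattr(self, f.name) for f in fields(self))

    def weakest(self) -> str:
        # Name of the lowest-scoring dimension, useful for diagnosing a failure.
        return min(fields(self), key=lambda f: getattr(self, f.name)).name


def summarize(results: list[MetricScores]) -> dict[str, float]:
    # Average each dimension separately over all evaluated answers.
    return {
        f.name: mean(getattr(r, f.name) for r in results)
        for f in fields(MetricScores)
    }
```

Keeping the dimensions separate rather than reporting one blended number is what makes the evaluation diagnostic: an answer that is correct but padded scores low on conciseness alone, while a fluent but unsupported answer scores low on faithfulness.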