01. 5 GitHub stars
02. High-speed batch processing at 0.01 s/page via PyMuPDF
03. AI-powered structure preservation with Docling for academic layouts
04. Optimized Markdown output specifically formatted for RAG systems
05. 100% on-device processing with no external API calls or data leaks
06. Precision table and list extraction for complex document formats
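
To make the "Markdown formatted for RAG" point concrete, here is a minimal sketch of one way heading-structured Markdown can be split into retrieval chunks on the consumer side. It is an illustration only, not the project's own code: the function name `chunk_markdown`, the helper `_emit`, and the one-chunk-per-heading strategy are assumptions, and it uses only the Python standard library.

```python
import re
from typing import Dict, List

# Matches ATX-style Markdown headings such as "# Title" or "### Section".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def _emit(chunks: List[Dict[str, str]], title: str, body: List[str]) -> None:
    """Append a chunk if the collected body contains any non-blank text."""
    text = "\n".join(body).strip()
    if text:
        chunks.append({"heading": title, "text": text})


def chunk_markdown(markdown: str) -> List[Dict[str, str]]:
    """Split Markdown into one chunk per heading (hypothetical helper).

    Each chunk keeps its heading as metadata so a retriever can show
    where the text came from. Simplification: a '#' line inside a fenced
    code block is also treated as a heading.
    """
    chunks: List[Dict[str, str]] = []
    title = ""
    body: List[str] = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the previous chunk.
            _emit(chunks, title, body)
            title = match.group(2).strip()
            body = []
        else:
            body.append(line)
    # Close the final chunk at end of input.
    _emit(chunks, title, body)
    return chunks
```

Because tables and lists stay inside the chunk of the heading they appear under, a table extracted as Markdown is not split across chunks by this approach.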