01. Preservation of document structure, including markdown tables, headings, and lists
02. 2 GitHub stars
03. Natural sorting for image sequences to ensure coherent document flow
04. Fast processing speeds optimized for local performance and sequential extraction
05. Private batch OCR processing running entirely on local hardware via Ollama
06. Multi-level verbosity logging for detailed extraction tracking and troubleshooting
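
The natural sorting mentioned in the list above can be illustrated with a minimal sketch. This is not the project's actual code: the function names `natural_key` and `sort_pages` and the sample file names are hypothetical, chosen only to show why numeric-aware ordering matters when page images are OCR'd in sequence.

```python
import re
from pathlib import Path

# Capturing group so re.split keeps the digit runs in the result.
_DIGITS = re.compile(r"(\d+)")


def natural_key(name: str) -> list:
    """Build a sort key that compares digit runs as integers.

    re.split with a capturing group always alternates text, digits, text, ...
    so even indices are strings and odd indices are digit runs. Keys from
    different names therefore compare str with str and int with int.
    """
    parts = _DIGITS.split(name.lower())
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]


def sort_pages(paths):
    """Return paths ordered by the natural order of their file names (hypothetical helper)."""
    return sorted(paths, key=lambda p: natural_key(Path(p).name))


if __name__ == "__main__":
    # Hypothetical scan names; a plain string sort would place scan_10 before scan_2.
    pages = ["scan_10.png", "scan_2.png", "scan_1.png"]
    print(sort_pages(pages))
```

A plain lexicographic sort compares character by character, so `scan_10.png` would be processed before `scan_2.png` and the extracted text would come out in the wrong page order. Comparing the digit runs as integers avoids that.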