01 Automatic detection and conversion of mathematical formulas to LaTeX (OMML, MathML, MathJax, Unicode, OCR, LLM fallback)
02 Automatic linking of text references (e.g., 'Figure 1') to extracted images, generating a figure index
03 MCP Server integration for seamless document conversion workflows with AI assistants and web applications
04 Intelligent extraction, automatic cropping, and rasterization of embedded images from various document types
05 Support for a wide array of input formats including PDF, DOCX, PPTX, HTML, EPUB, Jupyter Notebooks, and images
06 0 GitHub stars
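
As a rough illustration of the figure-linking feature listed above, the following is a minimal sketch of how text references such as 'Figure 1' could be tied to extracted images while building a figure index. It is not the project's actual implementation: the function name `link_figures`, the `FIGURE_REF` pattern, the `images` mapping, and the Markdown-style link output are all assumptions made for this example.

```python
import re

# Assumed pattern: matches "Figure 1", "Fig. 2", "fig 3" (case-insensitive).
FIGURE_REF = re.compile(r"\b(?:Figure|Fig\.?)\s*(\d+)", re.IGNORECASE)


def link_figures(text, images):
    """Link figure references in `text` to image paths and build an index.

    `images` is a hypothetical mapping of figure number -> extracted image path.
    Returns (linked_text, index), where index maps each referenced figure
    number to its image path and the number of times it is mentioned.
    """
    index = {}

    def replace(match):
        number = int(match.group(1))
        path = images.get(number)
        if path is None:
            # No extracted image for this number: leave the reference untouched.
            return match.group(0)
        entry = index.setdefault(number, {"image": path, "mentions": 0})
        entry["mentions"] += 1
        # Emit a Markdown-style link (an assumption about the output format).
        return f"[{match.group(0)}]({path})"

    linked = FIGURE_REF.sub(replace, text)
    return linked, index


if __name__ == "__main__":
    sample = "See Figure 1 and Fig. 2. Figure 1 again; Figure 9 is missing."
    linked, index = link_figures(sample, {1: "img/fig1.png", 2: "img/fig2.png"})
    print(linked)
    print(index)
```

References without a matching image are left unchanged, so a missing extraction does not produce a broken link.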