01 1 GitHub stars
02 Semantic boundary detection using embedding similarity thresholds
03 Late chunking and contextual retrieval for long-context models
04 Structure-aware parsing for code, Markdown, tables, and PDFs
05 Integrated evaluation framework for measuring retrieval precision and recall
06 Hierarchical Recursive Character Chunking for structural document integrity
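
The list does not say how the hierarchical recursive character chunking is implemented. As a rough illustration only, the sketch below assumes it resembles the common recursive-splitting approach: try the coarsest separator first (paragraph breaks), and fall back to finer ones (line breaks, then spaces) only for pieces that are still too long. The function name `recursive_split`, the `max_chars` parameter, and the separator order are hypothetical and are not taken from the project.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    max_chars: int,
    separators: Sequence[str] = ("\n\n", "\n", " "),
) -> List[str]:
    """Hypothetical sketch: split text into chunks of at most max_chars,
    preferring coarse structural boundaries over fine ones."""
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard cut every max_chars characters.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    parts = text.split(sep)
    if len(parts) == 1:
        # This separator does not occur in the text; try the next finer one.
        return recursive_split(text, max_chars, finer)

    chunks: List[str] = []
    current = ""
    for part in parts:
        candidate = part if not current else current + sep + part
        if len(candidate) <= max_chars:
            # Greedily merge neighbouring pieces while they still fit.
            current = candidate
            continue
        if current:
            chunks.append(current)
        if len(part) <= max_chars:
            current = part
        else:
            # The piece alone is too long: recurse with finer separators.
            chunks.extend(recursive_split(part, max_chars, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = (
        "Intro paragraph.\n\n"
        "Second paragraph is a bit longer than the first.\n\n"
        "End."
    )
    for chunk in recursive_split(sample, max_chars=40):
        print(repr(chunk))
```

In this sketch a paragraph that fits within `max_chars` is never broken apart, which is one plausible reading of "structural document integrity"; only oversized paragraphs are split at finer boundaries.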