01 3 GitHub stars
02 Out-of-core processing for datasets containing billions of rows
03 Optimized I/O for HDF5, Apache Arrow, and Parquet file formats
04 Seamless integration with scikit-learn, XGBoost, and CatBoost
05 Lazy evaluation and virtual columns for memory-efficient feature engineering
06 High-speed statistical aggregations and 1D/2D visualizations
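
The ideas behind out-of-core processing, lazy evaluation, and virtual columns can be illustrated with a small standard-library sketch. This is a conceptual toy, not the library's actual API: the names `LazyFrame`, `define_column`, and `make_chunks` are hypothetical and introduced here only for illustration. It assumes the data arrives as a stream of small column chunks, stores a derived column as an expression rather than as values, and evaluates that expression one chunk at a time during aggregation.

```python
from typing import Callable, Dict, Iterator, List

# A chunk is a small slice of the dataset: column name -> values.
Chunk = Dict[str, List[float]]


class LazyFrame:
    """Hypothetical toy frame: stores expressions, never full derived columns."""

    def __init__(self, chunk_source: Callable[[], Iterator[Chunk]]) -> None:
        # chunk_source returns a fresh iterator on each call, so the data
        # can be streamed again for every aggregation.
        self._chunk_source = chunk_source
        self._virtual: Dict[str, Callable[[Chunk], List[float]]] = {}

    def define_column(self, name: str, expr: Callable[[Chunk], List[float]]) -> None:
        # Lazy: only the expression is recorded; nothing is computed here.
        self._virtual[name] = expr

    def _values(self, name: str, chunk: Chunk) -> List[float]:
        # Virtual columns are evaluated per chunk; real columns are read directly.
        if name in self._virtual:
            return self._virtual[name](chunk)
        return chunk[name]

    def mean(self, name: str) -> float:
        # Out-of-core style aggregation: only one chunk is held at a time.
        total, count = 0.0, 0
        for chunk in self._chunk_source():
            values = self._values(name, chunk)
            total += sum(values)
            count += len(values)
        if count == 0:
            raise ValueError("cannot take the mean of an empty column")
        return total / count


def make_chunks(n_rows: int = 10, chunk_size: int = 4) -> Callable[[], Iterator[Chunk]]:
    """Simulate a dataset on disk by generating chunks on demand."""

    def chunks() -> Iterator[Chunk]:
        for start in range(0, n_rows, chunk_size):
            stop = min(start + chunk_size, n_rows)
            yield {
                "x": [float(i) for i in range(start, stop)],
                "y": [2.0 * i for i in range(start, stop)],
            }

    return chunks


if __name__ == "__main__":
    df = LazyFrame(make_chunks())
    # Derived feature x + y, defined as an expression and never stored in full.
    df.define_column("r", lambda c: [a + b for a, b in zip(c["x"], c["y"])])
    print(df.mean("r"))
```

In this sketch the derived column costs no extra memory beyond a single chunk, which is the property the feature list attributes to virtual columns. A real implementation would typically memory-map files and evaluate expressions in compiled code rather than with Python lists.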