About
This skill lets Claude analyze the performance of machine learning models directly within your development workflow. Using the model-evaluation-suite plugin, it computes metrics such as precision, recall, and F1-score, so developers and data scientists can compare model architectures, validate results on held-out datasets, and identify areas to optimize before production deployment.
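
For readers unfamiliar with the metrics named above, the following is a minimal sketch of how precision, recall, and F1-score are defined for a single positive class. It is not the plugin's implementation or API: the function name `precision_recall_f1`, its parameters, and the sample labels are hypothetical and chosen only for illustration. Returning 0.0 when a denominator is zero is also an assumption; other tools may handle that case differently.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for one positive class.

    Hypothetical helper for illustration; not part of model-evaluation-suite.
    """
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Assumption: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up held-out labels and predictions, for demonstration only.
    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
    p, r, f = precision_recall_f1(y_true, y_pred)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Running the same function on the predictions of two candidate models against the same held-out labels gives a simple side-by-side comparison of the kind described above.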