Data Science & ML Agent Skills

Discover Agent Skills for data science and ML. Browse 61 skills for Claude, ChatGPT & Codex.

PyTorch Lightning

Streamlines deep learning development by organizing PyTorch code into scalable, boilerplate-free Lightning modules and automated training workflows.

ETE Phylogenetic Toolkit

Provides comprehensive tools for phylogenetic tree manipulation, evolutionary analysis, and high-quality biological data visualization.

Scientific Critical Thinking

Evaluates scientific research rigor and evidence quality using standardized frameworks like GRADE and Cochrane.

Python Project Structure Refactor

Transforms monolithic Python research code and notebooks into modular, production-ready package structures.

SimPy Simulation Engine

Builds process-based discrete-event simulations in Python to model complex systems with resource contention and time-based events.

Embedding & Chunking Strategies

Optimizes vector search and RAG applications through intelligent embedding model selection and advanced document chunking strategies.

Bayesian Hierarchical Modeling

Implements hierarchical and multilevel Bayesian models with optimized parameterizations for robust statistical inference.

PennyLane Quantum Machine Learning

Builds, trains, and optimizes hybrid quantum-classical models using automatic differentiation and hardware-agnostic circuit programming.

Bayesian Modeling in R

Performs comprehensive Bayesian statistical modeling and posterior analysis using Stan-based R packages like brms and rstanarm.

Aeon Time Series ML

Performs time series machine learning, including classification, forecasting, and anomaly detection, using the aeon toolkit.

Tidymodels Code Review Patterns

Analyzes R machine learning code to detect data leakage, resampling violations, and workflow anti-patterns using tidymodels principles.

Zarr Python Data Storage

Manages large-scale N-dimensional arrays with chunked storage, compression, and seamless cloud integration for scientific computing pipelines.

PubMed Research Assistant

Accesses and queries the PubMed database for biomedical literature, systematic reviews, and citation management.

R Recipes Feature Engineering

Streamlines data preprocessing and feature engineering using the recipes package from R's tidymodels framework.

Bayesian Model Diagnostics

Performs comprehensive MCMC diagnostic checks and posterior predictive assessments for Bayesian models implemented in Stan or JAGS.

Spark Optimization

Optimizes Apache Spark performance through advanced partitioning, memory tuning, and shuffle management strategies.

LLM Evaluation & Benchmarking

Implements comprehensive evaluation frameworks for LLM applications using automated metrics, human-in-the-loop feedback, and A/B testing.

Vector Index Tuning

Optimizes vector database performance by tuning HNSW parameters, quantization strategies, and memory usage for high-scale search applications.

LangChain Architecture

Architects sophisticated LLM applications using agents, memory, and tool integration within the LangChain framework.

ML Pipeline Workflow

Orchestrates end-to-end MLOps pipelines from data preparation and model training to production deployment and monitoring.

Quantitative Backtesting Frameworks

Builds robust, production-grade backtesting systems for trading strategies while eliminating common statistical biases.

Quantitative Risk Metrics Calculation

Calculates comprehensive portfolio risk metrics like VaR, CVaR, and Sharpe ratios to monitor and manage financial exposure.

Prompt Engineering Patterns

Applies advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability in production.

Hybrid Search Implementation

Implements optimized hybrid search patterns combining vector similarity and keyword matching to enhance RAG system recall.

Similarity Search Patterns

Implements efficient semantic search and vector database patterns for production-grade retrieval systems.

Video Processor & Transcriber

Automates video format conversion, audio extraction, and high-accuracy speech-to-text transcription using FFmpeg and Whisper.

Clinical Trials Biostatistics

Implements industry-standard clinical trial design and statistical analysis workflows using regulatory-compliant R packages.

Clinical Trial Time-to-Event Analysis

Provides expert guidance and R implementation patterns for survival analysis methods and non-proportional hazards in clinical trials.

R Genomics & Bioinformatics Analysis

Performs comprehensive genomics and bioinformatics statistical analysis using Bioconductor and R tidy modeling workflows.

Tidymodels Hyperparameter Tuning

Optimizes machine learning models using advanced hyperparameter tuning strategies within the R tidymodels ecosystem.
