How does the URL contract ensure data integrity?

The URL contract mandates the use of real canonical URLs instead of placeholders, allowing the system to re-fetch source content and validate data provenance during restoration.
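A minimal sketch of what such a check could look like, assuming each dataset record is a dict with `id` and `url` fields. The placeholder host list and the function names are illustrative assumptions, not the skill's actual rules.

```python
from urllib.parse import urlparse

# Hypothetical list of placeholder hosts; the skill's real rules are not
# stated in the text above.
PLACEHOLDER_HOSTS = {"example.com", "example.org", "example.net", "placeholder.com"}


def is_canonical_url(url: str) -> bool:
    """Return True if url looks like a real, re-fetchable http(s) URL."""
    try:
        parsed = urlparse(url.strip())
        host = (parsed.hostname or "").lower()
    except ValueError:
        # urlparse raises ValueError on some malformed inputs.
        return False
    if parsed.scheme not in ("http", "https"):
        return False
    # Require a dotted host name, which also rejects "localhost".
    if not host or "." not in host:
        return False
    # Reject each placeholder host and any of its subdomains.
    return not any(host == p or host.endswith("." + p) for p in PLACEHOLDER_HOSTS)


def find_contract_violations(documents):
    """Return the ids of records whose 'url' field fails the check."""
    return [doc["id"] for doc in documents if not is_canonical_url(doc.get("url") or "")]
```

Running `find_contract_violations` before a backup is written would surface records that cannot be re-fetched later.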

Why does this skill recommend JSON for backups instead of SQL dumps?

JSON backups are more portable and easier to inspect, and they produce readable git diffs under version control, whereas SQL dumps are often tied to a specific database engine and version and are harder to audit.
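One way to keep git diffs small is to serialize records deterministically. The sketch below assumes records are dicts with an `id` field; `serialize_backup` and `write_backup` are hypothetical names, not part of the skill.

```python
import json


def serialize_backup(records) -> str:
    """Serialize records so that identical data always yields identical text."""
    # Sort records by id and keys alphabetically so ordering changes in the
    # database do not show up as noise in a diff.
    ordered = sorted(records, key=lambda r: str(r["id"]))
    return json.dumps(ordered, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def write_backup(records, path: str) -> None:
    """Write the deterministic serialization to a UTF-8 file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(serialize_backup(records))
```

With stable ordering and indentation, a change to one record shows up as a change to only that record's lines.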

What is a golden dataset in AI development?

A golden dataset is a curated collection of high-quality examples used as ground truth to benchmark model performance, measure retrieval quality, and ensure code changes don't cause regressions.

Can I use this skill to automate backups in CI/CD?

Yes, it includes patterns for integrating with GitHub Actions to perform scheduled backups and verify data integrity automatically during the deployment pipeline.
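As an illustration of the kind of check a scheduled CI job might run, the sketch below audits a backup file and exits non-zero on problems. The required fields and the script itself are assumptions for illustration; the skill's actual workflow and schema are not shown in the text.

```python
import json
import sys

# Hypothetical schema: fields every record is assumed to carry.
REQUIRED_FIELDS = ("id", "url", "content")


def audit_backup(records):
    """Return a list of human-readable problems found in the records."""
    problems = []
    seen = set()
    for i, rec in enumerate(records):
        for field in REQUIRED_FIELDS:
            if not rec.get(field):
                problems.append(f"record {i}: missing {field}")
        rec_id = rec.get("id")
        if rec_id is not None:
            if rec_id in seen:
                problems.append(f"record {i}: duplicate id {rec_id!r}")
            seen.add(rec_id)
    return problems


def main(path: str) -> int:
    """Audit the backup at path; return 1 if any problem is found, else 0."""
    with open(path, encoding="utf-8") as f:
        problems = audit_backup(json.load(f))
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1]))
```

A pipeline step that runs this script against the backup file would fail the job whenever the audit reports a problem.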

Does it back up vector embeddings?

No, it intentionally excludes embeddings to reduce file size and ensure portability; embeddings are regenerated upon restoration to match the current environment's model version.
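The separation can be sketched as two small functions, assuming records store their vector under an `embedding` key and their text under `content`. The `embed` callable stands in for whatever embedding model the current environment uses; all names here are hypothetical.

```python
from typing import Callable, Dict, List


def strip_embeddings(records: List[Dict]) -> List[Dict]:
    """Return copies of the records without the 'embedding' field."""
    return [{k: v for k, v in rec.items() if k != "embedding"} for rec in records]


def restore_with_embeddings(
    records: List[Dict], embed: Callable[[str], List[float]]
) -> List[Dict]:
    """Return copies of the records with embeddings recomputed from content."""
    restored = []
    for rec in records:
        row = dict(rec)
        # Recompute with the current model instead of trusting stored vectors.
        row["embedding"] = embed(rec["content"])
        restored.append(row)
    return restored
```

Because `embed` is passed in, the same backup can be restored against a different model version without editing the backup file.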

Golden Dataset Management

Name: Golden Dataset Management
Author: yonatangross

Data Science & ML

Protects and maintains high-quality test datasets for AI/ML systems through automated backup, restoration, and integrity validation.

This skill provides a framework for managing 'golden datasets', the curated ground-truth examples used to benchmark RAG systems, evaluate search quality, and prevent regressions. It automates data protection tasks, including JSON-based backups suited to version control, content validation via URL contracts, and disaster recovery workflows. By separating content from embeddings, it keeps evaluation data portable and reproducible across environments and model versions. It is aimed at AI engineers focused on retrieval quality and system reliability.

Key Features

01. Automated JSON-based backup and restoration for version control

02. CI/CD integration patterns for scheduled backups and automated audits

03. Embedding regeneration logic during dataset restoration to support model updates

04. Disaster recovery workflows for database migrations and accidental deletions

05. URL contract validation to ensure data provenance and prevent placeholders

06. 29 GitHub stars

Use Cases

01. Maintaining consistent test data across development, staging, and production environments

02. Benchmarking RAG retrieval performance using precision, recall, and MRR metrics

03. Protecting high-quality training examples from data loss during infrastructure migrations
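
The retrieval metrics named in the second use case can be computed against a golden dataset roughly as follows. This is a sketch using the standard definitions of precision@k, recall@k, and mean reciprocal rank; the function names and input shapes are assumptions, not the skill's API.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved ids that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def mean_reciprocal_rank(runs):
    """Average of 1/rank of the first relevant hit over all queries.

    runs is a list of (retrieved_ids, relevant_ids) pairs, one per query.
    A query with no relevant hit contributes 0.
    """
    total = 0.0
    for retrieved, relevant in runs:
        for rank, doc_id in enumerate(retrieved, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(runs) if runs else 0.0
```

Here `relevant` would come from the golden dataset's expected results for a query, and `retrieved` from the system under test.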

Install with 🐟 Skill.Fish

npx skillfish add yonatangross/skillforge-claude-plugin golden-dataset-management
