Can I use this for very large tables?

Yes. The skill uses efficient aggregation queries and sampling techniques to retrieve representative rows, though performance depends on your database engine's resources.
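As a rough illustration of why aggregation and sampling scale to large tables, the sketch below computes several statistics in a single query and fetches only a bounded number of rows. It is not the skill's actual query set. The `orders` table, its columns, and the use of SQLite as a stand-in engine are assumptions made for the example.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    # Every tenth row gets a NULL amount.
    [(i, float(i) if i % 10 else None) for i in range(1, 1001)],
)

# One aggregation query: the engine scans the table and returns a single row,
# so only four values travel back to the client regardless of table size.
row_count, non_null, min_amount, max_amount = conn.execute(
    "SELECT COUNT(*), COUNT(amount), MIN(amount), MAX(amount) FROM orders"
).fetchone()

# A bounded sample: LIMIT caps how many rows are returned.
# LIMIT without ORDER BY is not a random sample; it only bounds the result size.
sample = conn.execute("SELECT id, amount FROM orders LIMIT 5").fetchall()

print(row_count, non_null, min_amount, max_amount, len(sample))
```

Some engines also offer a dedicated sampling clause such as TABLESAMPLE. Whether the skill uses one is not stated here.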

Does this skill modify my database data?

No, the skill only performs read-only SELECT queries and metadata lookups to generate profiles, statistics, and quality assessments.

What specific data quality dimensions are evaluated?

The skill evaluates five key dimensions: completeness (null checks), uniqueness (duplicate detection), freshness (timestamp analysis), validity (range/format checks), and logical consistency.
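To make two of these dimensions concrete, here is a minimal sketch of a completeness check (null rate) and a uniqueness check (duplicate detection). The `users` table, the `email` column, and SQLite as the engine are assumptions for the example; the skill's own checks may be written differently.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (3, "a@example.com"), (4, "b@example.com")],
)

# Completeness: count NULLs alongside the total row count.
total, nulls, distinct_non_null = conn.execute(
    "SELECT COUNT(*), "
    "SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END), "
    "COUNT(DISTINCT email) "
    "FROM users"
).fetchone()
null_rate = nulls / total

# Uniqueness: list values that appear more than once.
duplicates = conn.execute(
    "SELECT email, COUNT(*) AS n FROM users "
    "WHERE email IS NOT NULL "
    "GROUP BY email HAVING COUNT(*) > 1"
).fetchall()

print(null_rate, distinct_non_null, duplicates)
```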

What databases does the Profiling Tables skill support?

It is designed to work with any database that supports standard SQL and provides an INFORMATION_SCHEMA, such as PostgreSQL, Snowflake, BigQuery, and Redshift.
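For reference, a metadata lookup against INFORMATION_SCHEMA might look like the sketch below. The helper name `columns_query` is hypothetical, and the `%s` placeholder style is an assumption that matches some Python drivers but not all. Some engines also require the schema view to be qualified differently (BigQuery, for instance, scopes it to a dataset), so treat this as a starting point rather than the skill's exact query.

```python
def columns_query(schema: str, table: str) -> tuple[str, tuple[str, str]]:
    """Build a parameterized column-metadata query (hypothetical helper)."""
    sql = (
        "SELECT column_name, data_type, is_nullable "
        "FROM information_schema.columns "
        "WHERE table_schema = %s AND table_name = %s "
        "ORDER BY ordinal_position"
    )
    # Parameters are passed separately so the driver handles quoting.
    return sql, (schema, table)


# Example values; "public" and "orders" are placeholders.
sql, params = columns_query("public", "orders")
print(sql)
print(params)
```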

Profiling Tables

Author: astronomer

Database Management

Generates comprehensive data profiles and quality assessments for database tables to help engineers and analysts understand dataset structure and health.

This skill automates data discovery by analyzing SQL database tables in depth. It extracts schema metadata, calculates type-specific column statistics, analyzes cardinality, and assesses data quality across five dimensions: completeness, uniqueness, freshness, validity, and consistency. It is intended for data engineers, analysts, and developers who need to understand an unfamiliar dataset quickly, identify potential data issues, or generate documentation without writing dozens of manual exploratory queries.

Key Features

01 Type-specific column statistics for numeric, string, and date types

02 Automated schema and metadata extraction from INFORMATION_SCHEMA

03 Multi-dimensional data quality scoring and issue identification

04 77 GitHub stars

05 Representative data sampling and structured summary reporting

06 Cardinality and frequency analysis for categorical data distribution
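The cardinality and frequency analysis listed above can be sketched as a distinct count plus a grouped count of the most common values. The `events` table, the `status` column, and SQLite as the engine are assumptions for illustration only.

```python
import sqlite3

# Hypothetical categorical column, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (status TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("ok",)] * 6 + [("error",)] * 3 + [("timeout",)],
)

# Cardinality: how many distinct values the column holds.
cardinality = conn.execute(
    "SELECT COUNT(DISTINCT status) FROM events"
).fetchone()[0]

# Frequency: the most common values, capped so high-cardinality
# columns do not return an unbounded result.
top_values = conn.execute(
    "SELECT status, COUNT(*) AS n FROM events "
    "GROUP BY status ORDER BY n DESC, status LIMIT 10"
).fetchall()

print(cardinality, top_values)
```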

Use Cases

01 Generating detailed data documentation and summaries for internal catalogs

02 Onboarding new team members to unfamiliar database schemas and datasets

03 Performing data quality audits prior to building ETL pipelines or ML models


Install with 🐟 Skill.Fish

npx skillfish add astronomer/agents profiling-tables

For use in Claude.ai and ChatGPT
