This skill provides a comprehensive library of domain-specific patterns for ingesting data from external sources such as S3, GCS, REST APIs, and Kafka. It bridges the gap between raw data sources and structured databases by offering implementation guidance for Python, TypeScript, Rust, and Go, covering essential ETL/ELT strategies, batch processing, and Change Data Capture (CDC). Whether you are migrating a legacy database, processing Parquet files at high throughput, or building a real-time streaming pipeline, this skill ensures best practices for idempotency, backpressure handling, and schema validation are applied correctly.
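As a concrete illustration of the idempotency practice mentioned above, the sketch below upserts records keyed on a natural ID, so replaying the same batch (a common failure-recovery scenario) never creates duplicates. The table and field names (`events`, `event_id`) are hypothetical, and SQLite stands in for whatever target database you use:

```python
import sqlite3

# Hypothetical target table keyed on the record's natural ID.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id TEXT PRIMARY KEY, payload TEXT, ingested_at TEXT)"
)

def ingest_batch(conn, records):
    """Upsert each record on its primary key; replaying a batch is harmless."""
    conn.executemany(
        """
        INSERT INTO events (event_id, payload, ingested_at)
        VALUES (:event_id, :payload, :ingested_at)
        ON CONFLICT(event_id) DO UPDATE SET
            payload = excluded.payload,
            ingested_at = excluded.ingested_at
        """,
        records,
    )
    conn.commit()

batch = [
    {"event_id": "evt-1", "payload": '{"v": 1}', "ingested_at": "2024-01-01T00:00:00Z"},
    {"event_id": "evt-2", "payload": '{"v": 2}', "ingested_at": "2024-01-01T00:00:01Z"},
]
ingest_batch(conn, batch)
ingest_batch(conn, batch)  # replay: same two rows, no duplicates
row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

The same upsert-on-natural-key pattern carries over to Postgres (`ON CONFLICT`), MySQL (`ON DUPLICATE KEY UPDATE`), and most warehouses (`MERGE`).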
Key Features
1. Multi-language support for Python, TypeScript, Go, and Rust implementations
2. Standardized patterns for cloud storage (S3, GCS, and Azure Blob)
3. API polling and webhook receiver implementation templates
4. High-performance file processing using Polars and dlt (data load tool)
5. Real-time streaming ingestion logic for Kafka and Kinesis
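File-processing features like these usually pair loading with per-row schema validation so bad records are routed to a dead-letter sink instead of aborting the batch. A minimal stdlib-only sketch of that pattern (the `SCHEMA` columns and sample CSV are invented for illustration; a real pipeline might use Polars or dlt instead):

```python
import csv
import io

# Hypothetical expected schema: column name -> converter that raises on bad input.
SCHEMA = {"id": int, "amount": float, "currency": str}

def validate_rows(reader):
    """Yield (row, None) for valid rows or (None, error) for dead-lettering."""
    for lineno, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            yield {col: cast(row[col]) for col, cast in SCHEMA.items()}, None
        except (KeyError, ValueError) as exc:
            yield None, f"line {lineno}: {exc!r}"

# Invented sample input: line 3 has a non-numeric amount and is quarantined.
raw = io.StringIO("id,amount,currency\n1,9.99,USD\n2,oops,EUR\n")
good, bad = [], []
for row, err in validate_rows(csv.DictReader(raw)):
    if err is None:
        good.append(row)
    else:
        bad.append(err)
```

Keeping validation as a generator means the same logic works for a 10-line file or a streamed multi-gigabyte one.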
Use Cases
1. Automating ETL pipelines to move CSV, JSON, and Parquet files into SQL/NoSQL databases
2. Implementing Change Data Capture (CDC) for seamless database migrations and replication
3. Building robust API consumers with built-in rate limiting, pagination, and backoff
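For the backoff portion of the last use case, a common approach is full-jitter exponential backoff: the delay doubles per attempt up to a cap, and a random factor spreads retries out so clients do not hammer a recovering API in lockstep. This is a generic sketch, not the skill's exact templates; `fetch_page` and the retry parameters are hypothetical:

```python
import random

def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Full-jitter backoff: a delay drawn from [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))

def fetch_with_retries(fetch, max_attempts=5, sleep=lambda s: None):
    """Call fetch(), retrying with jittered exponential backoff on failure."""
    last_exc = None
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception as exc:  # in practice, retry only transport errors / HTTP 429
            last_exc = exc
            sleep(backoff_delay(attempt))
    raise last_exc

# Usage with a hypothetical fetch_page() that succeeds on the third try:
# page = fetch_with_retries(fetch_page)
```

Pagination composes naturally on top: wrap each page request in `fetch_with_retries` and advance the cursor only after a page succeeds, which keeps the consumer idempotent across retries.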