Is there a page limit for document processing?

Yes, the skill supports PDF documents up to 1,000 pages long, with a fixed cost of 258 tokens per page regardless of content density.
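As a rough illustration of the arithmetic, the sketch below estimates the input tokens a PDF would consume from its page count alone. The function name `estimate_pdf_tokens` is hypothetical and not part of the skill. The estimate covers only the document pages, not the prompt text or the model's output.

```python
TOKENS_PER_PAGE = 258  # per-page cost stated in the answer above
MAX_PAGES = 1000       # page limit stated in the answer above


def estimate_pdf_tokens(page_count: int) -> int:
    """Estimate input tokens for a PDF (hypothetical helper, pages only)."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count > MAX_PAGES:
        raise ValueError(f"page_count exceeds the {MAX_PAGES}-page limit")
    return page_count * TOKENS_PER_PAGE


print(estimate_pdf_tokens(120))   # 30960
print(estimate_pdf_tokens(1000))  # 258000
```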

Do I need a specific API key for this skill?

Yes, you must provide a GEMINI_API_KEY from Google AI Studio. This can be configured via environment variables or a local .env file within your project.
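A minimal sketch of how the key lookup might work, using only the standard library. The helper names `parse_dotenv` and `resolve_api_key` are hypothetical, and the precedence (environment variable first, then `.env`) is an assumption rather than documented behavior of the skill.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(env: Optional[Mapping[str, str]] = None,
                    dotenv_path: Path = Path(".env")) -> str:
    """Return GEMINI_API_KEY from the environment, else from a .env file.

    Assumption: the environment variable takes precedence over .env.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY")
    if not key and dotenv_path.is_file():
        file_values = parse_dotenv(dotenv_path.read_text(encoding="utf-8"))
        key = file_values.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set in the environment or .env")
    return key
```

A `.env` file for this sketch would contain a single line of the form `GEMINI_API_KEY=your-key-here`.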

What types of files can this skill process?

This skill is optimized for PDF documents, utilizing Gemini's native vision capabilities to understand layout and visual elements. Text-only formats like TXT and Markdown are also supported.

How does it handle files larger than 20MB?

For files exceeding 20MB, the skill automatically switches from inline encoding to the Google File API, ensuring reliable processing of large documents.
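The switch can be pictured as a simple size check. The sketch below is illustrative only: `choose_upload_mode` and `upload_mode_for_path` are hypothetical names, and reading "20MB" as 20 × 1024 × 1024 bytes is an assumption, since the listing does not say which unit convention the skill uses.

```python
import os

# Assumption: "20MB" is interpreted as 20 MiB; the skill may use 20,000,000 bytes.
INLINE_LIMIT_BYTES = 20 * 1024 * 1024


def choose_upload_mode(size_bytes: int) -> str:
    """Return "inline" or "file_api" depending on the file size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Only files that exceed the limit go through the File API.
    return "file_api" if size_bytes > INLINE_LIMIT_BYTES else "inline"


def upload_mode_for_path(path: str) -> str:
    """Look up the size of a file on disk and pick the upload mode."""
    return choose_upload_mode(os.path.getsize(path))


print(choose_upload_mode(5 * 1024 * 1024))   # inline
print(choose_upload_mode(25 * 1024 * 1024))  # file_api
```

A file exactly at the limit stays inline in this sketch, matching the wording "exceeding 20MB".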

Gemini Document Processing

Name: Gemini Document Processing
Author: nodays-off


Data Science & ML

Integrates Google Gemini's native vision capabilities to analyze, summarize, and extract structured data from complex PDF documents.

Gemini Document Processing is a skill that uses Google Gemini's multimodal capabilities for document analysis. Unlike standard OCR, it interprets the visual context of PDFs, including charts, tables, diagrams, and images, in documents of up to 1,000 pages. It suits developers who need to extract data into validated JSON, generate context-aware summaries of long reports, or build Q&A systems on top of unstructured documentation. With built-in support for both small inline files and large documents via the Google File API, it can serve as a foundation for document intelligence pipelines.

Key Features

01 Native PDF vision processing for documents up to 1,000 pages

02 0 GitHub stars

03 Context-aware summarization and document-based Q&A

04 Automated handling of large files (>20MB) via Google File API

05 Multimodal understanding of charts, diagrams, and complex layouts

06 Structured data extraction with Pydantic and JSON schema validation
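To illustrate what validating extracted data can look like, here is a dependency-free sketch that checks a JSON response against a fixed set of fields. It deliberately uses the standard library instead of Pydantic, so it is a stand-in for the idea rather than the skill's actual validation code. The `Invoice` schema, its field names, and `parse_invoice` are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class Invoice:
    """Hypothetical target schema for an extracted invoice."""
    invoice_number: str
    vendor: str
    total: float


# Expected JSON type for each field of the hypothetical schema.
EXPECTED_TYPES = {"invoice_number": str, "vendor": str, "total": float}


def parse_invoice(raw_json: str) -> Invoice:
    """Validate a JSON string from the model and build an Invoice."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    kwargs = {}
    for name, expected in EXPECTED_TYPES.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # Accept whole numbers for float fields, but never booleans.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"field {name!r} must be {expected.__name__}")
        kwargs[name] = value
    return Invoice(**kwargs)


sample = '{"invoice_number": "INV-001", "vendor": "Acme", "total": 1250}'
print(parse_invoice(sample))
```

With Pydantic, the dataclass and the manual checks would typically collapse into a single model class, which is presumably why the skill relies on it.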

Use Cases

01 Creating automated executive summaries for high-volume research or financial reports

02 Extracting specific fields from invoices, resumes, and medical forms into structured databases

03 Converting complex technical manuals or legal contracts into clean Markdown or HTML


Install with 🐟 Skill.Fish

npx skillfish add nodays-off/hogans-alley ai-tools

For use in Claude.ai and ChatGPT, the skill is also available as a direct download.