Discover Agent Skills for web scraping & data collection. Browse 17 skills for Claude, ChatGPT & Codex.
Automates the retrieval and normalization of academic paper metadata from arXiv to support research pipelines and literature reviews.
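arXiv exposes paper metadata through its public Atom API at `export.arxiv.org/api/query`. A minimal sketch of building such a query URL — the `arxiv_query_url` helper name and its defaults are illustrative assumptions, not part of the skill itself:

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def arxiv_query_url(search_query: str, start: int = 0, max_results: int = 10) -> str:
    """Build a query URL for the public arXiv Atom API.

    Hypothetical helper: the endpoint and the search_query / start /
    max_results parameters are real arXiv API parameters; the defaults
    here are assumptions for illustration.
    """
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"

# Example: the first ten results matching a phrase across all fields
url = arxiv_query_url('all:"large language models"', max_results=10)
```

Fetching that URL returns an Atom XML feed whose entries carry title, authors, abstract, and PDF links, which a pipeline would then normalize.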
Extracts and analyzes Reddit content including posts, comments, subreddits, and user profiles using the public JSON API.
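The public JSON API mentioned here works by appending `.json` to most Reddit listing URLs. A minimal sketch of that URL mapping — the `to_json_endpoint` helper is a hypothetical name, and note that real requests should send a descriptive `User-Agent` header to avoid rate limiting:

```python
from urllib.parse import urlsplit, urlunsplit

def to_json_endpoint(reddit_url: str) -> str:
    """Map a Reddit page URL to its public JSON equivalent by appending
    '.json' to the path, e.g. /r/python/hot/ -> /r/python/hot.json.

    Hypothetical helper illustrating the '.json' convention only; it does
    not perform the request itself.
    """
    parts = urlsplit(reddit_url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

The returned JSON contains the same listing data the HTML page renders (posts, comments, scores), which is what makes scraper-free extraction possible.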
Extracts and formats transcripts from YouTube videos using URLs or video IDs.
Converts diverse file formats including PDFs, Office documents, and media into structured, token-efficient Markdown for LLM processing.
Extracts text transcripts and captions from YouTube videos for content analysis, summarization, and documentation.
Empowers Claude with real-time web search, content extraction, and deep research capabilities using the Tavily API.
Enables high-quality web search, content extraction, and deep research capabilities using the Tavily API.
Converts existing applications into serverless Apify Actors for scalable web scraping and data processing workflows.
Monitors brand reputation and sentiment by scraping reviews and mentions across major social and review platforms.
Develops, debugs, and deploys high-performance serverless Actors for web scraping, data processing, and automation on the Apify platform.
Analyzes market trends, competitive pricing, and consumer behavior by extracting real-time data from major social and location-based platforms.
Extracts and analyzes competitor data across social media, maps, and travel platforms using specialized Apify Actors.
Automates data extraction from over 55 major platforms including Instagram, TikTok, and Google Maps using AI-driven Apify Actors.
Monitors and extracts emerging trends from Google, Instagram, Facebook, YouTube, and TikTok to power data-driven content strategies.
Integrates the Tavily Search API to provide real-time, clean, and LLM-optimized web search results for RAG pipelines.
Parses and extracts structured content from complex PDF documents using LlamaParse and agentic OCR capabilities.
Transforms unstructured files like PDFs, Word documents, and presentations into structured Pydantic models using LlamaExtract services.
Converts websites into LLM-ready markdown and structured data using the Firecrawl API.
Performs semantic and neural web searches to find content based on context and meaning rather than simple keywords.
Orchestrates multi-agent parallel workflows to conduct comprehensive research and generate structured, insight-driven reports.
Downloads high-quality videos and HLS streams from platforms like YouTube, Vimeo, and Mux using optimized workflows for yt-dlp and ffmpeg.
Replicates existing websites into production-ready Next.js 16 and Tailwind CSS v4 codebases using Firecrawl MCP.
Scrapes, crawls, and extracts LLM-optimized content from any website using the Firecrawl CLI.
Automates web content extraction using a progressive four-tier strategy to bypass bot detection and CAPTCHAs.
Implements a four-tier progressive fallback strategy to reliably extract web content from any URL, regardless of bot detection or JavaScript requirements.
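The progressive fallback pattern described in these two entries can be sketched as a loop over ordered tiers, escalating only when a cheaper tier fails. The tier names and the `fetch_with_fallback` helper below are illustrative assumptions, not the skills' actual implementation:

```python
from typing import Callable, Optional

def fetch_with_fallback(
    url: str,
    tiers: list[tuple[str, Callable[[str], Optional[str]]]],
) -> tuple[str, str]:
    """Try each (name, fetcher) tier in order; return the first tier that
    yields content. Hypothetical sketch of a progressive fallback strategy:
    a fetcher signals failure by returning None or raising.
    """
    last_error: Optional[Exception] = None
    for name, fetch in tiers:
        try:
            content = fetch(url)
        except Exception as exc:  # timeouts, blocks, CAPTCHA walls
            last_error = exc
            continue
        if content:
            return name, content
    raise RuntimeError(f"all tiers failed for {url}") from last_error
```

In practice the tiers might wrap a plain HTTP GET, a GET with browser-like headers, a headless browser render, and a proxy or scraping service, in increasing order of cost.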
Downloads high-quality video and audio content from YouTube and HLS-based streaming platforms while resolving common authentication and formatting issues.
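The yt-dlp/ffmpeg workflow referenced above typically selects the best separate video and audio streams and merges them into a single container. A sketch of assembling such an invocation — `-f`, `--merge-output-format`, and `-o` are real yt-dlp options, but the defaults and the `ytdlp_args` helper are assumptions for illustration:

```python
def ytdlp_args(url: str, out_template: str = "%(title)s.%(ext)s") -> list[str]:
    """Build a yt-dlp command line (hypothetical helper).

    bestvideo+bestaudio asks for the best separate streams; yt-dlp then
    invokes ffmpeg to merge them into the requested container.
    """
    return [
        "yt-dlp",
        "-f", "bestvideo+bestaudio/best",  # fall back to best combined stream
        "--merge-output-format", "mp4",
        "-o", out_template,
        url,
    ]
```

The resulting list can be passed to `subprocess.run`; ffmpeg must be on the PATH for the merge step to succeed.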
Implements sophisticated ID and content-based deduplication with reputation-aware canonical selection for multi-source data aggregation.
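The technique this entry names can be sketched as two steps: group records by explicit ID (falling back to a content hash when no ID exists), then keep the copy from the most reputable source. The `REPUTATION` scores and record schema below are illustrative assumptions:

```python
import hashlib
from collections import defaultdict

# Hypothetical source-reputation scores; the highest-scoring copy in each
# duplicate group is selected as canonical.
REPUTATION = {"official_api": 3, "scraper": 2, "forum": 1}

def dedupe(records: list[dict]) -> list[dict]:
    """ID- and content-based deduplication with reputation-aware
    canonical selection (sketch under the assumed schema above)."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for rec in records:
        # Prefer an explicit ID; otherwise hash the normalized content.
        key = rec.get("id") or hashlib.sha256(
            rec["content"].strip().lower().encode()
        ).hexdigest()
        groups[key].append(rec)
    return [
        max(group, key=lambda r: REPUTATION.get(r["source"], 0))
        for group in groups.values()
    ]
```

Content normalization (here just strip + lowercase) is where real implementations vary most; stronger pipelines use shingling or minhash rather than exact hashes.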
Automates web content extraction and competitive intelligence by intelligently selecting between WebFetch, Tavily, and agent-driven browsers.
Automates intelligent web content extraction and competitive intelligence gathering through a multi-tiered tool selection framework.
Automates web content extraction and competitive monitoring by intelligently selecting the optimal tool for any target URL or research task.