Discover Agent Skills for web scraping & data collection. Browse 17 skills for Claude, ChatGPT & Codex.
Deploys a local, privacy-respecting metasearch engine to aggregate web, package repository, and code results in structured JSON.
Automates the collection and organization of AI and data-related job listings from Zighang into Obsidian-compatible markdown.
Automates the discovery, extraction, and organization of academic literature for qualitative research and theoretical pattern identification.
Extracts event data from Instagram, Facebook, and web aggregators to power local media newsletters.
Converts any webpage into clean, formatted Markdown using Chrome CDP for full JavaScript rendering and metadata extraction.
Searches multiple torrent trackers and automates content downloading via magnet links and WebTorrent.
Performs neural, context-aware web searches and deep research tasks to find high-quality information that keyword matching misses.
Automates the collection of bookmarked job postings from Zighang and synchronizes them into Obsidian as structured Markdown files.
Manages local API response caching for Wayback Machine operations to optimize performance and ensure data freshness.
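A cache like this can be sketched as a small file-backed TTL store: stale entries are evicted on read, which is what keeps the data fresh. The class name, one-file-per-key layout, and TTL policy below are illustrative assumptions, not this skill's actual implementation.

```python
import hashlib
import json
import time
from pathlib import Path


class ResponseCache:
    """Hypothetical file-backed TTL cache: one JSON file per key."""

    def __init__(self, directory: str, ttl_seconds: float = 3600):
        self.dir = Path(directory)
        self.dir.mkdir(parents=True, exist_ok=True)
        self.ttl = ttl_seconds

    def _path(self, key: str) -> Path:
        # Hash the key so arbitrary URLs become safe filenames.
        return self.dir / (hashlib.sha256(key.encode()).hexdigest() + ".json")

    def get(self, key: str):
        path = self._path(key)
        if not path.exists():
            return None
        entry = json.loads(path.read_text())
        if time.time() - entry["stored_at"] > self.ttl:
            path.unlink()  # stale: evict so callers re-fetch fresh data
            return None
        return entry["value"]

    def set(self, key: str, value) -> None:
        payload = {"stored_at": time.time(), "value": value}
        self._path(key).write_text(json.dumps(payload))
```

A caller would try `cache.get(url)` before hitting the Wayback Machine and `cache.set(url, response)` afterwards; tuning `ttl_seconds` trades API load against freshness.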
Implements ethical, resilient, and legally compliant web scraping strategies to extract high-quality data while avoiding bot detection.
Orchestrates a multi-source image pipeline to download, validate, and normalize fighter photos from Wikimedia, Sherdog, and Bing.
Retrieves the earliest archived snapshot of any URL from the Wayback Machine to identify a website's original version.
Crawls global AI news sources to generate deduplicated, Chinese-language summaries in a structured JSON format.
Adds Instagram accounts and web aggregators as sources for local media event-tracking systems and configures them.
Optimizes data extraction from websites and APIs using specialized Python scripts to maximize performance and minimize token consumption.
Discovers related web content, articles, and research papers using AI-powered similarity matching via Exa.ai.
Lists and manages archived snapshots from the Wayback Machine to track website history and recover lost content.
Conducts complex, multi-step asynchronous research and deep analysis using Exa's AI-driven search engine.
Executes autonomous multi-step research and information synthesis using the Google Gemini Deep Research Agent.
Extracts structured data and AI-generated summaries from any URL with high token efficiency and live crawling.
Generates fact-based answers and structured data from the web using AI-powered search and synthesis.
Automates the periodic search and refresh of Exa.ai websets to keep your data collections continuously updated.
Extracts and ingests social graph data and content from the AT Protocol and Bluesky into structured formats.
Downloads high-quality videos and audio from YouTube and other platforms for offline viewing, editing, and archival.
Conducts multi-step, iterative web investigations to produce comprehensive, structured research reports on any topic.
Extracts deep web content, screenshots, and parsed PDF data using the Firecrawl API.
Extracts fully rendered HTML and dynamic content from JavaScript-heavy websites using headless browser automation.
Extracts subtitles and transcripts from YouTube videos and saves them as local text files with timestamps.