Discover Agent Skills for web scraping & data collection. Browse 17 skills for Claude, ChatGPT & Codex.
Deploys a local, privacy-respecting metasearch engine to aggregate web, package repository, and code results in structured JSON.
Automates the collection and organization of AI and data-related job listings from Zighang into Obsidian-compatible markdown.
Automates the discovery, extraction, and organization of academic literature for qualitative research and theoretical pattern identification.
Extracts event data from Instagram, Facebook, and web aggregators to power local media newsletters.
Converts any webpage into clean, formatted Markdown using Chrome CDP for full JavaScript rendering and metadata extraction.
Searches multiple torrent trackers and automates content downloading via magnet links and WebTorrent.
Performs neural, context-aware web searches and deep research tasks to find high-quality information that keyword matching misses.
Automates the collection of bookmarked job postings from Zighang and synchronizes them into Obsidian as structured Markdown files.
Manages local API response caching for Wayback Machine operations to optimize performance and ensure data freshness.
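A cache like this can be sketched as a small file-backed TTL store: stale entries are evicted on read, which is what keeps the data fresh. The class name, one-file-per-key layout, and TTL policy below are illustrative assumptions, not this skill's actual implementation.

```python
import hashlib
import json
import time
from pathlib import Path


class ResponseCache:
    """Hypothetical file-backed TTL cache: one JSON file per key."""

    def __init__(self, directory: str, ttl_seconds: float = 3600):
        self.dir = Path(directory)
        self.dir.mkdir(parents=True, exist_ok=True)
        self.ttl = ttl_seconds

    def _path(self, key: str) -> Path:
        # Hash the key so arbitrary URLs become safe filenames.
        return self.dir / (hashlib.sha256(key.encode()).hexdigest() + ".json")

    def get(self, key: str):
        path = self._path(key)
        if not path.exists():
            return None
        entry = json.loads(path.read_text())
        if time.time() - entry["stored_at"] > self.ttl:
            path.unlink()  # stale: evict so callers re-fetch fresh data
            return None
        return entry["value"]

    def set(self, key: str, value) -> None:
        payload = {"stored_at": time.time(), "value": value}
        self._path(key).write_text(json.dumps(payload))
```

A caller would try `cache.get(url)` before hitting the Wayback Machine and `cache.set(url, response)` afterwards; tuning `ttl_seconds` trades API load against freshness.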
Implements ethical, resilient, and legally compliant web scraping strategies to extract high-quality data while avoiding bot detection.
Orchestrates a multi-source image pipeline to download, validate, and normalize fighter photos from Wikimedia, Sherdog, and Bing.
Retrieves the earliest archived snapshot of any URL from the Wayback Machine to identify a website's original version.
Crawls global AI news sources to generate deduplicated, Chinese-language summaries in a structured JSON format.
Adds Instagram accounts and web aggregators as sources for local media event-tracking systems and configures them.
Optimizes data extraction from websites and APIs using specialized Python scripts to maximize performance and minimize token consumption.
Discovers related web content, articles, and research papers using AI-powered similarity matching via Exa.ai.
Lists and manages archived snapshots from the Wayback Machine to track website history and recover lost content.
Conducts complex, multi-step asynchronous research and deep analysis using Exa's AI-driven search engine.
Executes autonomous multi-step research and information synthesis using the Google Gemini Deep Research Agent.
Extracts structured data and AI-generated summaries from any URL with high token efficiency and live crawling.
Generates fact-based answers and structured data from the web using AI-powered search and synthesis.
Automates the periodic search and refresh of Exa.ai websets to keep your data collections continuously updated.
Extracts and ingests social graph data and content from the AT Protocol and Bluesky into structured formats.
Downloads high-quality videos and audio from YouTube and other platforms for offline viewing, editing, and archival.
Conducts multi-step, iterative web investigations to produce comprehensive, structured research reports on any topic.
Extracts deep web content, screenshots, and parsed PDF data using the Firecrawl API.
Extracts fully rendered HTML and dynamic content from JavaScript-heavy websites using headless browser automation.
Extracts subtitles and transcripts from YouTube videos and saves them as local text files with timestamps.