Discover Agent Skills for web scraping & data collection. Browse 17 skills for Claude, ChatGPT & Codex.
Adds and configures Instagram accounts and web aggregators as sources for local media event tracking systems.
Extracts event data from Instagram, Facebook, and web aggregators to power local media newsletters.
Converts websites into LLM-ready markdown or structured data using the Firecrawl v2 API.
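For context, a Firecrawl scrape is a single authenticated POST; the sketch below builds such a request against the v2 scrape endpoint. The endpoint path and the `url`/`formats` field names follow Firecrawl's published API, but the markdown-only output choice and the key value are illustrative assumptions.

```python
import json

# Assumed v2 endpoint path; check the Firecrawl docs for your account's base URL.
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v2/scrape"

def build_scrape_request(target_url: str, api_key: str) -> tuple[str, dict, bytes]:
    """Build (endpoint, headers, body) for a Firecrawl v2 scrape call.

    Requesting only "markdown" keeps the response LLM-ready and compact;
    the API also supports other formats (e.g. structured extraction).
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"url": target_url, "formats": ["markdown"]}).encode()
    return FIRECRAWL_SCRAPE_URL, headers, body
```

Sending the request (with `urllib.request` or `requests`) requires a real API key, so the builder is kept separate from the network call.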
Discovers related web content, articles, and research papers using AI-powered similarity matching via Exa.ai.
Automates the periodic search and refresh of Exa.ai websets to keep your data collections continuously updated.
Generates fact-based answers and structured data from the web using AI-powered search and synthesis.
Conducts complex, multi-step asynchronous research and deep analysis using Exa's AI-driven search engine.
Checks the archival status and availability of URLs within the Internet Archive's Wayback Machine.
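Archival-status checks like this typically go through the Wayback Machine's public availability endpoint (`https://archive.org/wayback/available?url=<url>`). A minimal sketch of parsing its JSON response; the sample payload below is illustrative, not a live result:

```python
import json
from typing import Optional

# Illustrative response shape from GET https://archive.org/wayback/available?url=<url>
SAMPLE = json.loads("""
{
  "url": "example.com",
  "archived_snapshots": {
    "closest": {
      "status": "200",
      "available": true,
      "url": "http://web.archive.org/web/20240101000000/http://example.com/",
      "timestamp": "20240101000000"
    }
  }
}
""")

def closest_snapshot(payload: dict) -> Optional[dict]:
    """Return the closest archived snapshot, or None if the URL is unarchived."""
    snap = payload.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return snap
    return None
```

An unarchived URL comes back with an empty `archived_snapshots` object, which is why the helper tolerates missing keys.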
Searches for media and automates torrent downloads across multiple sources using a local API.
Conducts comprehensive market intelligence, company analysis, and competitive research using structured methodologies and automated data collection.
Downloads high-quality videos and audio from YouTube and other platforms for offline access and archival.
Searches multiple torrent trackers and automates content downloading via magnet links and WebTorrent.
Downloads YouTube videos and audio with customizable quality and format settings using yt-dlp integration.
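yt-dlp's quality and format control is driven by its `-f`/`--format` selector and `-x` audio-extraction flag; a sketch of assembling a download command around those options. The selector syntax (`bv*[height<=H]+ba/b`) is standard yt-dlp, while the default resolution cap and output template chosen here are illustrative.

```python
def build_ytdlp_command(url: str, audio_only: bool = False, max_height: int = 1080) -> list[str]:
    """Assemble a yt-dlp argv list with a format selector.

    Video mode picks best video up to max_height plus best audio,
    falling back to the best combined stream; audio mode extracts mp3.
    """
    if audio_only:
        fmt = ["-x", "--audio-format", "mp3"]
    else:
        fmt = ["-f", f"bv*[height<={max_height}]+ba/b[height<={max_height}]"]
    return ["yt-dlp", *fmt, "-o", "%(title)s.%(ext)s", url]
```

The same options map one-to-one onto the `YoutubeDL` Python API if the skill embeds yt-dlp as a library rather than shelling out.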
Automates source gathering and note synthesis for the development and validation of Claude Code skills.
Crawls global AI news sources to generate deduplicated, Chinese-language summaries in a structured JSON format.
Extracts and ingests social graph data and content from the AT Protocol and Bluesky into structured formats.
Optimizes data extraction from websites and APIs using specialized Python scripts to maximize performance and minimize token consumption.
Lists and manages archived snapshots from the Wayback Machine to track website history and recover lost content.
Manages local API response caching for Wayback Machine operations to optimize performance and ensure data freshness.
Locates and retrieves the most recent archived version of any URL from the Internet Archive's Wayback Machine.
Retrieves and calculates the full historical archive span for any URL using the Wayback Machine.
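Wayback Machine snapshot identifiers carry a `YYYYMMDDhhmmss` timestamp, so the archive span of a URL is just the distance between its first and last snapshot timestamps. A minimal sketch (the example timestamps are illustrative):

```python
from datetime import datetime

def archive_span_days(first_ts: str, last_ts: str) -> int:
    """Compute whole days between two Wayback timestamps (YYYYMMDDhhmmss)."""
    fmt = "%Y%m%d%H%M%S"
    first = datetime.strptime(first_ts, fmt)
    last = datetime.strptime(last_ts, fmt)
    return (last - first).days
```

The first and last timestamps themselves come from the CDX API (sorted ascending and descending, limit 1), which this helper deliberately leaves out.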
Retrieves comprehensive GitHub user and organization profile data including repository counts, follower statistics, and account metadata.
Retrieves the earliest archived snapshot of any URL from the Wayback Machine to identify a website's original version.
Retrieves and manages historical visual snapshots of websites using the Internet Archive's Wayback Machine.
Converts batches of images and scanned documents into structured markdown files using local DeepSeek-OCR models via Ollama.
Orchestrates a multi-source image pipeline to download, validate, and normalize fighter photos from Wikimedia, Sherdog, and Bing.
Extracts and analyzes large PDF documents locally with semantic chunking to minimize token usage and maximize context efficiency.
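The chunking idea behind such a skill can be sketched as overlap-preserving splitting on paragraph boundaries. This is a simplified stand-in for semantic chunking (real implementations segment on headings, sections, or embedding similarity); the size and overlap parameters are illustrative.

```python
def chunk_text(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks, preferring paragraph boundaries.

    When a paragraph would push the current chunk past max_chars, the chunk
    is emitted and a tail of it is carried forward so context spans chunks.
    """
    paragraphs = text.split("\n\n")
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            # carry a short tail of the previous chunk forward for context
            current = current[-overlap:]
        current = (current + "\n\n" + para) if current else para
    if current:
        chunks.append(current)
    return chunks
```

Overlap trades a little token overhead for continuity: a sentence cut at a chunk boundary still appears whole in the next chunk.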
Automates the collection and organization of AI and data-related job listings from Zighang into Obsidian-compatible markdown.
Performs neural, context-aware web searches and deep research tasks to find high-quality information that keyword matching misses.
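Neural search services of this kind are usually a single POST with an API key header and a query payload; the sketch below builds such a request in the shape of Exa's search API. The endpoint, the `x-api-key` header, and the `query`/`type`/`numResults` field names are assumptions based on Exa's public documentation; verify them against the current docs before relying on this.

```python
import json

# Assumed endpoint; confirm against Exa's current API reference.
EXA_SEARCH_URL = "https://api.exa.ai/search"

def build_exa_request(query: str, api_key: str, num_results: int = 10) -> tuple[str, dict, bytes]:
    """Build (endpoint, headers, body) for a neural search request.

    type="neural" requests embedding-based matching rather than keywords;
    the result count here is an illustrative default.
    """
    headers = {"x-api-key": api_key, "Content-Type": "application/json"}
    body = json.dumps({
        "query": query,
        "type": "neural",
        "numResults": num_results,
    }).encode()
    return EXA_SEARCH_URL, headers, body
```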
Downloads videos, extracts high-quality audio, and generates clean, paragraph-style transcripts from YouTube and other media platforms.