Discover Agent Skills for web scraping & data collection. Browse 17 skills for Claude, ChatGPT & Codex.
Extracts text and structural data from complex Microsoft Word documents containing nested tables, checkboxes, and multi-layered cell layouts.
Empowers Claude with real-time internet research capabilities by integrating Gemini's Google Search tool directly into the terminal workflow.
Automates the collection, filtering, and processing of Twitter search results into structured link lists and databases.
Extracts clean source code from GitHub file URLs using the GitHub CLI to bypass web scraping noise and HTML clutter.
Extracts data from JavaScript-heavy websites, authenticated pages, and complex documentation using advanced browser automation.
Optimizes B2B data enrichment through intelligent provider selection, waterfall logic, and credit-efficient routing.
Lists and manages configured event sources for Instagram accounts and web aggregators used in newsletter generation.
Extracts structured content from popular Chinese news platforms and converts it into JSON and Markdown formats.
Searches the web using Exa AI to provide real-time information retrieval and up-to-date data for AI coding workflows.
Converts entire websites into LLM-ready markdown and structured data with advanced anti-bot bypass and JavaScript rendering.
Extracts clean, markdown-formatted content and metadata from any URL using the Jina Reader API for LLM consumption.
Downloads and converts public Google Docs, Sheets, and Slides into local formats for direct analysis and integration.
Fetches and analyzes real-time stories, comments, and user data from Hacker News using the official API.
Integrates the Tavily API to perform live web searches and structured data retrieval for RAG-augmented workflows.
Integrates privacy-focused web, image, video, and news search capabilities directly into Claude Code via the Brave Search API.
Accesses real-time search engine results from Google, Bing, and YouTube directly within Claude Code using structured JSON.
Extracts structured data from major social media platforms and websites using the Bright Data Web Scraper API.
Fetches and parses RSS/Atom feeds to automate news gathering and content monitoring directly within Claude Code.
Extracts transcripts from social media videos and scrapes websites into LLM-ready markdown format.
Bypasses anti-bot protections and extracts structured data from complex websites using high-performance Chrome TLS fingerprinting and JS rendering.
Automates web data collection and browser tasks using pre-built Actors for popular sites like Amazon, Google, and LinkedIn.
Automates web scraping, site crawling, and structured data extraction from any URL using the Firecrawl API.
Searches and retrieves life sciences preprints from the bioRxiv server using keywords, authors, date ranges, and categories.
Accesses and queries the ClinicalTrials.gov API v2 to retrieve detailed medical study data, recruitment status, and eligibility criteria for clinical research.
Extracts text commands, terminal inputs, and gameplay moves from screen recordings using optimized OCR and image preprocessing techniques.
Facilitates direct access to PubMed literature and the NCBI E-utilities API for advanced biomedical research and data extraction.
Extracts typed commands and sequential text inputs from screen recordings and terminal sessions using optimized OCR workflows.
Extracts structured data from financial documents using OCR and text extraction while enforcing rigorous data safety and verification protocols.
Accesses official USPTO APIs to perform comprehensive patent and trademark searches, intellectual property analysis, and prosecution history tracking.
Extracts and implements code or algorithms from images by utilizing OCR tools, image preprocessing, and systematic verification strategies.