About
This skill lets Claude analyze images, video, audio, and PDF documents using Gemini 3 Pro's native multimodal capabilities. It provides a standardized framework for high-resolution image understanding, analysis of video files up to one hour long, long-form audio transcription, and PDF document extraction. Granular control over media resolution and token usage lets developers balance processing quality against cost when building applications that depend on media comprehension.