Does this skill support persistent storage?

Yes, it provides implementation patterns for attaching and managing persistent NFS filesystems so your datasets and checkpoints remain available across instance restarts.
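
As a minimal sketch of the checkpoint side of this pattern: the mount root `/lambda/nfs`, the filesystem name, and both helper names below are illustrative assumptions, not taken from the skill itself. The idea is to write each checkpoint to a temporary file on the persistent filesystem and rename it into place, so a restart never picks up a half-written file.

```python
import os
import tempfile
from pathlib import Path


def checkpoint_dir(fs_name: str, run_name: str, root: str = "/lambda/nfs") -> Path:
    """Return (and create) a per-run checkpoint directory.

    `root` is an assumed mount point; change it to wherever the
    persistent filesystem is actually mounted on the instance.
    """
    path = Path(root) / fs_name / "checkpoints" / run_name
    path.mkdir(parents=True, exist_ok=True)
    return path


def save_atomically(data: bytes, target: Path) -> Path:
    """Write `data` to a temp file beside `target`, then rename it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to storage before the rename
        os.replace(tmp_name, target)  # replaces the target in a single step
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return target
```

A training loop would call `save_atomically(serialized_state, checkpoint_dir("my-data", "run-01") / "step_1000.bin")`, where `my-data` and `run-01` are placeholder names. Rename behaviour on NFS is worth verifying for your own setup.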

How do I automate instance launching with this skill?

The skill provides ready-to-use Python API examples and curl commands to programmatically launch, list, and terminate GPU instances.
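
The skill's own examples are not reproduced on this page, so the following is only a sketch of what launch, list, and terminate calls might look like using the standard library. The base URL, endpoint paths, payload field names, and the example region and instance type are assumptions based on the public Lambda Cloud API and should be checked against its current documentation. The helpers build requests without sending them, so the payloads can be inspected first.

```python
import base64
import json
import urllib.request

# Assumed base URL; confirm against the current Lambda Cloud API docs.
API_BASE = "https://cloud.lambdalabs.com/api/v1"


def build_request(api_key, method, path, payload=None):
    """Build (but do not send) an authenticated request."""
    body = None if payload is None else json.dumps(payload).encode("utf-8")
    # HTTP Basic auth with the API key as username and an empty password.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    headers = {"Authorization": f"Basic {token}"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{API_BASE}{path}", data=body, headers=headers, method=method
    )


def launch_request(api_key, region, instance_type, ssh_key, filesystem=None):
    """Request one instance, optionally attaching a persistent filesystem."""
    payload = {
        "region_name": region,
        "instance_type_name": instance_type,
        "ssh_key_names": [ssh_key],
        "quantity": 1,
    }
    if filesystem:
        payload["file_system_names"] = [filesystem]
    return build_request(api_key, "POST", "/instance-operations/launch", payload)


def list_request(api_key):
    """Request the list of running instances."""
    return build_request(api_key, "GET", "/instances")


def terminate_request(api_key, instance_ids):
    """Request termination of the given instance IDs."""
    payload = {"instance_ids": list(instance_ids)}
    return build_request(api_key, "POST", "/instance-operations/terminate", payload)


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `send(launch_request(key, "us-east-1", "gpu_1x_a100", "my-key"))` would attempt a launch; the region, instance type, and key name here are placeholders. Keeping request construction separate from `send` makes it easy to review exactly what will be billed before anything runs.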

What software comes pre-installed on these instances?

Instances run Ubuntu with Lambda Stack pre-installed. Lambda Stack bundles NVIDIA drivers, CUDA, cuDNN, NCCL, PyTorch, and TensorFlow, so common ML frameworks work without manual setup.
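
To confirm what is present on a freshly launched instance, a quick check along these lines can help. This is a sketch only: the function name `stack_report` and the default module and tool names are assumptions chosen for illustration. It looks modules up without importing them and checks whether command-line tools are on `PATH`.

```python
import importlib.util
import shutil


def stack_report(modules=("torch", "tensorflow"), tools=("nvidia-smi", "nvcc")):
    """Map each module or tool name to True if it can be found, else False."""
    # find_spec locates a top-level module without importing it.
    report = {name: importlib.util.find_spec(name) is not None for name in modules}
    # shutil.which returns None when the executable is not on PATH.
    report.update({tool: shutil.which(tool) is not None for tool in tools})
    return report
```

Printing `stack_report()` on the instance gives a quick found/missing overview before starting a long job.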

Can I use this for multi-node distributed training?

Yes. The skill includes guidance on 1-Click Clusters, Slurm configuration, and the InfiniBand networking used for large-scale multi-node training.
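
One recurring step in Slurm-based multi-node training is translating the variables Slurm sets for each task into the ones `torch.distributed` reads with its environment-variable initialization. The sketch below assumes one Slurm task per GPU and that the caller already knows the head node's address; the function name and the default port are illustrative, not taken from the skill.

```python
import os


def torch_env_from_slurm(master_addr, master_port=29500, env=None):
    """Derive torch.distributed-style variables from Slurm task variables.

    Assumes one Slurm task per GPU. `master_addr` must be supplied by the
    caller, for example the first hostname in the job's node list.
    """
    env = os.environ if env is None else env
    return {
        "RANK": env["SLURM_PROCID"],         # global task index
        "WORLD_SIZE": env["SLURM_NTASKS"],   # total number of tasks
        "LOCAL_RANK": env["SLURM_LOCALID"],  # task index within the node
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
    }
```

Inside a job step, `os.environ.update(torch_env_from_slurm(head_node))` before initializing the process group would apply these values, where `head_node` is a placeholder for the address you resolved.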

What is the Lambda Labs GPU Cloud skill for Claude Code?

It is a specialized capability that allows Claude to help you provision, manage, and optimize GPU instances on Lambda Labs for machine learning tasks.

Lambda Labs GPU Cloud

Name: Lambda Labs GPU Cloud
Author: Orchestra-Research

by Orchestra-Research • 3,983 GitHub stars • Cloud Infrastructure

Provisions and manages high-performance GPU infrastructure on Lambda Labs for machine learning training and inference workflows.

This skill provides Claude with specialized knowledge to orchestrate Lambda Labs GPU resources, ranging from single-node instances to 1-Click Slurm clusters. It streamlines the deployment of NVIDIA H100, B200, and A100 GPUs while ensuring best practices for using the pre-configured Lambda Stack (PyTorch, CUDA, NCCL). By integrating this skill, users can efficiently handle persistent storage mounting, API-based instance management, and cost-optimized scaling for intensive AI research and engineering tasks.

Key Features

01 Support for a wide range of NVIDIA GPUs including B200, H100, and GH200.

02 Automated instance lifecycle management via Python API and CLI integration.

03 3,983 GitHub stars

04 Optimization for distributed training using PyTorch DDP and FSDP on Slurm clusters.

05 Configuration patterns for persistent NFS filesystems to preserve training data.

06 Seamless environment setup using pre-installed Lambda Stack for ML workloads.

Use Cases

01 Orchestrating high-performance distributed training on InfiniBand-connected clusters.

02 Fine-tuning Large Language Models (LLMs) on multi-GPU H100 instances.

03 Deploying scalable batch inference pipelines for computer vision or NLP models.

Install with 🐟 Skill.Fish

npx skillfish add orchestra-research/ai-research-skills lambda-labs