Environments
Website Environments
4 synthetic web apps: Calendr · Cloudfile · Shopora · Pandora’s Inbox
Full-stack websites with databases, logging, and built-in verification
Desktop Environments
3 VM platforms: Windows · Ubuntu · macOS
Isolated virtual machines for computer-use agent testing
MCP Environment
45+ servers · 300+ tools: Calendar · CRM · Email · Shopping · and more
Tool-based agent interactions via Model Context Protocol
Two Ways to Use Scale Gymnasium
| Method | Agent Loop | Best For |
|---|---|---|
| Gymnasium Web UI | Built-in — Scale provides an agent loop executor | Prototyping, testing, exploration |
| Docker Images | Bring your own — you implement the agent loop | Large-scale evaluation, CI/CD, training pipelines |
Web UI Guide
Try environments instantly through the Gymnasium interface
Docker Quick Start
Run environments locally with your own agent loop
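To make "bring your own agent loop" concrete, here is a minimal sketch of the observe, act, repeat cycle. Everything in it is an assumption for illustration: the `Environment` interface, the `ToyCounterEnv` stand-in, and the `run_episode` helper are hypothetical names, not part of Scale Gymnasium. A real loop would talk to a running container instead of an in-memory object.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class Environment(Protocol):
    """Hypothetical interface; the actual Docker images may expose something different."""

    def reset(self) -> str: ...
    def step(self, action: str) -> tuple[str, bool]: ...


@dataclass
class ToyCounterEnv:
    """In-memory stand-in for an environment: done once the counter reaches `target`."""

    target: int = 3
    count: int = 0

    def reset(self) -> str:
        self.count = 0
        return f"count={self.count}"

    def step(self, action: str) -> tuple[str, bool]:
        if action == "increment":
            self.count += 1
        return f"count={self.count}", self.count >= self.target


def run_episode(
    env: Environment,
    policy: Callable[[str], str],
    max_steps: int = 10,
) -> list[tuple[str, str]]:
    """Observe, act, and repeat until the environment reports done or the step budget runs out."""
    trajectory: list[tuple[str, str]] = []
    observation = env.reset()
    for _ in range(max_steps):
        action = policy(observation)
        trajectory.append((observation, action))
        observation, done = env.step(action)
        if done:
            break
    return trajectory


def always_increment(observation: str) -> str:
    """Trivial policy; a real agent would call a model here."""
    return "increment"


trajectory = run_episode(ToyCounterEnv(target=3), always_increment)
print(len(trajectory))  # 3 steps were needed to reach the target
```

Keeping the policy as a plain callable means the same loop can drive a scripted baseline or a model-backed agent without changes, and the `max_steps` budget guards against episodes that never finish.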
Key Capabilities
Session Isolation
Complete data isolation between test sessions ensures no cross-contamination
Comprehensive Logging
Automatic capture of all agent interactions for analysis and debugging
Built-in Verification
State checks, log validation, and rubric-based evaluation framework
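As a rough illustration of how the three verification styles could fit together, the sketch below combines a state check, a log check, and a weighted rubric. The function names, the dictionary-shaped state, the log line format, and the weighting scheme are all assumptions made for this example; the built-in verifiers may work quite differently.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class CheckResult:
    name: str
    passed: bool


def state_check(state: dict[str, Any], key: str, expected: Any) -> CheckResult:
    """Pass if the final environment state holds the expected value (hypothetical state shape)."""
    return CheckResult(f"state:{key}", state.get(key) == expected)


def log_check(logs: list[str], required_event: str) -> CheckResult:
    """Pass if any captured log line mentions the required event (hypothetical log format)."""
    return CheckResult(f"log:{required_event}", any(required_event in line for line in logs))


def rubric_score(results: list[CheckResult], weights: dict[str, float]) -> float:
    """Weighted fraction of passed checks; an assumed stand-in for rubric-based evaluation."""
    total = sum(weights.values())
    earned = sum(weights[r.name] for r in results if r.passed and r.name in weights)
    return earned / total if total else 0.0


# Made-up final state and logs for a calendar-style task.
state = {"event_title": "Team sync", "attendees": 3}
logs = ["POST /events 201", "GET /events 200"]

results = [
    state_check(state, "event_title", "Team sync"),
    log_check(logs, "POST /events"),
    state_check(state, "attendees", 4),  # fails: the state holds 3
]
weights = {"state:event_title": 2.0, "log:POST /events": 1.0, "state:attendees": 1.0}
score = rubric_score(results, weights)
print(score)  # 0.75
```

State checks confirm what the agent left behind, log checks confirm how it got there, and the rubric turns several pass/fail signals into a single graded score.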
What’s Included
| Resource | Description |
|---|---|
| Environments | 4 website apps + 3 desktop VMs + 45+ MCP servers |
| Data Packs | Pre-configured datasets to hydrate environments with realistic state |
| Verifiers | State checks, log checks, and rubric-based evaluation |
| Sample RL Data | Pre-built tasks with prompts and verification criteria |
| APIs | HTTP endpoints for programmatic environment control |