Scale Gymnasium provides a standardized collection of digital environments for agent training and evaluation. Whether you’re building tool-use agents or GUI agents for browser or desktop use, Gymnasium offers the environments, data, and verification systems you need.

Environments

Two Ways to Use Scale Gymnasium

Method | Agent Loop | Best For
Gymnasium Web UI | Built-in: Scale provides an agent loop executor | Prototyping, testing, exploration
Docker Images | Bring your own: you implement the agent loop | Large-scale evaluation, CI/CD, training pipelines
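
For the Docker Images path, "bring your own agent loop" means your code observes the environment, picks an action, applies it, and repeats until the task is done or a step budget runs out. The sketch below shows that general shape only. The `Environment` interface, `CounterEnv`, and `run_agent_loop` are hypothetical names invented for this example and are not the Scale Gymnasium API; a real loop would talk to a running container instead of the toy environment used here.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class Environment(Protocol):
    """Hypothetical interface for illustration; not the Scale Gymnasium API."""

    def observe(self) -> str: ...
    def act(self, action: str) -> None: ...
    def is_done(self) -> bool: ...


@dataclass
class CounterEnv:
    """Toy stand-in environment: the task is to raise a counter to a target."""

    target: int = 3
    count: int = 0
    log: list[str] = field(default_factory=list)

    def observe(self) -> str:
        return f"count={self.count} target={self.target}"

    def act(self, action: str) -> None:
        self.log.append(action)  # record every action, even unknown ones
        if action == "increment":
            self.count += 1

    def is_done(self) -> bool:
        return self.count >= self.target


def run_agent_loop(
    env: Environment,
    policy: Callable[[str], str],
    max_steps: int = 10,
) -> int:
    """Observe, choose an action, apply it; stop when done or out of budget.

    Returns the number of steps taken.
    """
    steps = 0
    while steps < max_steps and not env.is_done():
        observation = env.observe()
        action = policy(observation)
        env.act(action)
        steps += 1
    return steps


def always_increment(observation: str) -> str:
    """Trivial policy; a real agent would call a model here."""
    return "increment"


env = CounterEnv(target=3)
steps = run_agent_loop(env, always_increment)
print(steps, env.is_done())
```

Keeping the policy as a plain callable means the same loop can drive a scripted baseline or a model-backed agent, and the `max_steps` budget guards against an agent that never finishes.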

Key Capabilities

Session Isolation

Complete data isolation between test sessions ensures no cross-contamination

Comprehensive Logging

Automatic capture of all agent interactions for analysis and debugging

Built-in Verification

State checks, log validation, and rubric-based evaluation framework
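
To make the three verification styles concrete, here is a minimal sketch of how a state check, a log check, and a rubric score might be combined. It is an illustration under assumptions, not Scale Gymnasium's verifier implementation: `state_check`, `log_check`, `rubric_score`, `RubricItem`, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RubricItem:
    """One weighted criterion in a hypothetical rubric."""

    description: str
    weight: float
    passed: bool


def state_check(final_state: dict, expected: dict) -> bool:
    """Pass if every expected key is present with the expected value."""
    return all(k in final_state and final_state[k] == v for k, v in expected.items())


def log_check(log: list[str], required_actions: list[str]) -> bool:
    """Pass if the required actions appear in the log in order.

    Other actions may occur in between; the shared iterator enforces ordering.
    """
    remaining = iter(log)
    return all(action in remaining for action in required_actions)


def rubric_score(items: list[RubricItem]) -> float:
    """Weighted fraction of rubric items that passed, in the range 0.0 to 1.0."""
    total = sum(item.weight for item in items)
    if total == 0:
        return 0.0
    return sum(item.weight for item in items if item.passed) / total


# Hypothetical outcome of one agent session.
final_state = {"cart_items": 2, "order_status": "placed"}
action_log = ["open_cart", "add_item", "checkout"]

items = [
    RubricItem("order was placed", 2.0, state_check(final_state, {"order_status": "placed"})),
    RubricItem("item added before checkout", 1.0, log_check(action_log, ["add_item", "checkout"])),
    RubricItem("finished in at most two actions", 1.0, len(action_log) <= 2),
]
score = rubric_score(items)
print(score)
```

State checks look only at where the environment ended up, log checks look at how the agent got there, and the rubric folds both into a single graded score.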

What’s Included

Resource | Description
Environments | 4 website apps + 3 desktop VMs + 45 MCP servers
Data Packs | Pre-configured datasets to hydrate environments with realistic state
Verifiers | State checks, log checks, and rubric-based evaluation
Sample RL Data | Pre-built tasks with prompts and verification criteria
APIs | HTTP endpoints for programmatic environment control

Next Steps