Desktop Environments
Deploy full desktop virtual machine environments running Linux, Windows, or macOS.
Hardware Requirements:
- Linux/Windows: Bare-metal hosts with KVM support
- macOS: Apple Mac hardware (Mac Metal instances via Lumier provider)
Step 1: Load the Docker Image
Load the desktop orchestrator image into your local Docker image store:
docker load -i desktop-orchestrator.tar
The orchestrator manages VM disk images (qcow2 format) internally. Contact Scale for access to the VM images for your target operating systems.
Step 2: Run the Container
Start the orchestration container, exposing port 3000:
docker run -d -p 3000:3000 desktop-orchestrator:latest
The orchestration server manages VM lifecycle, noVNC connectivity, and task execution.
Step 3: Initialize a Session
Create a new desktop environment by specifying the OS type:
curl -X POST http://localhost:3000/create_desktop \
-H "Content-Type: application/json" \
-d '{
"os_type": "linux",
"require_a11y_tree": true,
"timeout": 3600
}'
Supported os_type values: linux, windows, macos
The create_desktop request returns a task_id for tracking. Poll the task status until the VM is ready:
curl -X GET http://localhost:3000/task_status/{task_id}
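The polling step can be scripted. The following is a minimal Python sketch, not the orchestrator's documented client. The response schema is not specified here, so the `status` field and the `completed` / `failed` values are assumptions; adjust them to match what your deployment actually returns. The helper names (`fetch_task_status`, `wait_for_task`) are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:3000"  # orchestrator address from Step 2


def fetch_task_status(task_id, base_url=BASE_URL):
    """GET /task_status/{task_id} and decode the body (assumed to be JSON)."""
    url = f"{base_url}/task_status/{task_id}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_task(task_id, fetch=fetch_task_status, interval=5.0, max_wait=3600.0,
                  done_states=("completed",), failed_states=("failed",),
                  sleep=time.sleep):
    """Poll until the task reaches a terminal state or max_wait seconds elapse.

    ASSUMPTION: the response is a JSON object with a "status" field whose
    terminal values are listed in done_states / failed_states.
    """
    deadline = time.monotonic() + max_wait
    while True:
        result = fetch(task_id)
        status = result.get("status")
        if status in done_states:
            return result
        if status in failed_states:
            raise RuntimeError(f"task {task_id} failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not ready after {max_wait}s")
        sleep(interval)
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without a running orchestrator. In normal use, `wait_for_task(task_id)` with the defaults is enough.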
Once the task completes, you’ll receive a vm_id and vnc_url for the environment.
Run task-specific initialization (download assets, open apps, run setup scripts):
curl -X POST http://localhost:3000/initialize_task \
-H "Content-Type: application/json" \
-d '{"vm_id": "vm-abc123", "task_config": {...}}'
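The same POST requests can be issued from Python. This sketch uses only the standard library and mirrors the curl calls above. The `build_post` and `send` helpers are hypothetical names, and decoding the response as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # orchestrator address from Step 2


def build_post(path, payload, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for an orchestrator endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the response body (ASSUMED to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payload as the create_desktop curl example.
create_req = build_post(
    "/create_desktop",
    {"os_type": "linux", "require_a11y_tree": True, "timeout": 3600},
)
# send(create_req) would submit it to a running orchestrator.
```

An `initialize_task` request is built the same way: `build_post("/initialize_task", {"vm_id": vm_id, "task_config": task_config})`, where `task_config` is your task's configuration object.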
Step 4: Interact with the Environment
Build your own agent loop by connecting directly to the in-VM server (running on the VM’s mapped port):
- Get screenshots: GET /screenshot
- Get accessibility tree: GET /accessibility
- Execute commands: POST /execute
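One way to drive these endpoints is a simple observe, decide, act loop. The sketch below is illustrative only. The request body for `/execute` (`{"command": ...}`), the raw-bytes screenshot response, and all function names are assumptions, so check the API reference for the real schemas. The model call is left as a callable you supply.

```python
import json
import urllib.request
from typing import Callable, List, Optional


def http_get_screenshot(vm_url: str) -> bytes:
    """GET /screenshot from the in-VM server (ASSUMED to return image bytes)."""
    with urllib.request.urlopen(f"{vm_url}/screenshot", timeout=30) as resp:
        return resp.read()


def http_execute(vm_url: str, command: str) -> bytes:
    """POST /execute; the {"command": ...} body shape is an ASSUMPTION."""
    req = urllib.request.Request(
        f"{vm_url}/execute",
        data=json.dumps({"command": command}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def run_agent_loop(
    get_screenshot: Callable[[], bytes],
    choose_action: Callable[[bytes], Optional[str]],
    execute: Callable[[str], object],
    max_steps: int = 20,
) -> List[str]:
    """Capture a screenshot, ask the policy for an action, execute it, repeat.

    choose_action wraps your vision-LLM call and returns None when it
    considers the task finished. Returns the list of executed actions.
    """
    history: List[str] = []
    for _ in range(max_steps):
        screenshot = get_screenshot()
        action = choose_action(screenshot)
        if action is None:  # policy signals completion
            break
        execute(action)
        history.append(action)
    return history
```

To run against a live VM, bind the HTTP helpers to the VM's mapped address, for example `run_agent_loop(lambda: http_get_screenshot(vm_url), my_policy, lambda a: http_execute(vm_url, a))`, where `vm_url` and `my_policy` are yours to define. The `max_steps` cap prevents a policy that never returns None from looping indefinitely.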
You can implement your agent loop from scratch by capturing screenshots, sending them to a vision LLM (e.g., GPT-4o, Claude), and executing the returned actions. Alternatively, use the try-cua library after spinning up a CUA server inside the VM.
See the Desktop Environment API Reference for all available endpoints.
Step 5: Verify Results
Run the task-specific verifier to assess state and return a score:
curl -X POST http://localhost:3000/run_evaluator \
-H "Content-Type: application/json" \
-d '{"vm_id": "vm-abc123", "task_config": {...}}'
See the Desktop Verifiers guide for more details on verification.
Success!
You’ve deployed a desktop environment locally. You can now:
- Scale to multiple parallel containers
- Integrate with your training pipeline
- Implement your own agent loop