Rollout Algorithms
Docker-only execution
All commands run inside Docker containers. Use the provided scripts.
W8-RL provides multiple rollout schedulers to maximize throughput and task diversity.
Select the scheduler and the horizon policy at runtime with --scheduler-type and --horizon-policy.
Schedulers
- FIFO: baseline queue
- SHDS: short-horizon diversified scheduler
- GRPO: group-based sampling
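To make the difference between a baseline queue and a diversity-aware scheduler concrete, here is a minimal sketch. It is not the W8-RL implementation: the Task fields (family, est_steps), both class names, and the selection rule (least-scheduled family first, then shortest estimated episode) are assumptions chosen for illustration. GRPO-style group sampling is not shown.

```python
from collections import Counter, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: str
    family: str      # hypothetical grouping key, e.g. a site or task type
    est_steps: int   # hypothetical estimate of episode length


class FifoScheduler:
    """Baseline: hand out tasks in arrival order."""

    def __init__(self, tasks):
        self._queue = deque(tasks)

    def next_task(self):
        return self._queue.popleft() if self._queue else None


class DiversifiedShortHorizonScheduler:
    """Illustrative rule: least-scheduled family first, then shortest episode."""

    def __init__(self, tasks):
        self._pending = list(tasks)
        self._scheduled = Counter()  # how many tasks each family has received

    def next_task(self):
        if not self._pending:
            return None
        best = min(
            self._pending,
            key=lambda t: (self._scheduled[t.family], t.est_steps),
        )
        self._pending.remove(best)
        self._scheduled[best.family] += 1
        return best


def drain(scheduler):
    """Return the task ids in the order the scheduler emits them."""
    order = []
    while (task := scheduler.next_task()) is not None:
        order.append(task.task_id)
    return order


TASKS = [
    Task("a1", "shop", 30),
    Task("a2", "shop", 10),
    Task("b1", "forms", 20),
    Task("a3", "shop", 5),
]
```

With these four tasks, the FIFO sketch keeps arrival order, while the diversified sketch interleaves the "forms" task between "shop" tasks and runs shorter episodes first.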
Horizon policies
- Fixed: constant max_steps
- Bucketed: different max_steps per difficulty bucket
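The two horizon policies can be sketched as small objects that map an episode's difficulty to a step budget. This is only an illustration: the class names, the method max_steps_for, the bucket labels, and all numeric values are assumptions, not taken from w8_rl.

```python
from dataclasses import dataclass, field


@dataclass
class FixedHorizon:
    """Same step budget for every episode."""

    max_steps: int = 30  # hypothetical default

    def max_steps_for(self, difficulty: str) -> int:
        return self.max_steps


@dataclass
class BucketedHorizon:
    """Step budget looked up by difficulty bucket."""

    # Hypothetical bucket names and budgets.
    per_bucket: dict = field(
        default_factory=lambda: {"easy": 10, "medium": 20, "hard": 40}
    )
    default: int = 20  # used when the bucket is unknown

    def max_steps_for(self, difficulty: str) -> int:
        return self.per_bucket.get(difficulty, self.default)
```

A bucketed policy lets easy tasks finish under a small budget while still giving hard tasks room to complete, which a single fixed budget cannot do.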
Early stop
Early stop policies use browser signals to terminate episodes that are unlikely to succeed. This frees rollout capacity for other episodes, increasing throughput without reducing signal quality.
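As a rough sketch of what such a policy might check, the function below stops an episode when the page has crashed, when errors keep repeating, or when the page has stopped changing. The signal names and thresholds are hypothetical; the actual signals used by early_stop.py are not described here.

```python
from dataclasses import dataclass


@dataclass
class BrowserSignals:
    # All fields are hypothetical examples of "browser signals".
    consecutive_errors: int = 0        # e.g. failed actions in a row
    steps_without_dom_change: int = 0  # agent acts but the page does not change
    page_crashed: bool = False


def should_stop_early(signals, max_errors=3, max_stalled_steps=5):
    """Return True when the episode looks unrecoverable (illustrative rule)."""
    if signals.page_crashed:
        return True
    if signals.consecutive_errors >= max_errors:
        return True
    return signals.steps_without_dom_change >= max_stalled_steps
```

A rollout loop would call should_stop_early after each step and end the episode as soon as it returns True, instead of spending the remaining max_steps budget.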
Where it lives
- w8_rl/rollout/schedulers/
- w8_rl/rollout/horizon.py
- w8_rl/rollout/early_stop.py
Example
Run the command below from the repo root in Docker:
python -m w8_rl.rollout.coordinator_main \
--scheduler-type shds \
--horizon-policy bucketed \
--max-episodes 10
All rollout execution still runs inside Docker.
Next Steps
- Read the Architecture overview: Architecture Overview
- Run a Design2Code task: Design2Code Runs
- Review troubleshooting: Troubleshooting