Tinker Compatibility
Docker-only execution
All commands run inside Docker containers. Use the provided scripts.
W8-RL includes a Tinker adapter for token-based RL training using the Tinker Cookbook API.
It wraps EnvActor directly and runs inside ray-worker.
How it works
- EnvActor provides reset_external/step_external
- TinkerEnvAdapter converts ObsRef to Tinker tokens
- GRPO training uses group rollouts and per-group advantage centering
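To illustrate per-group advantage centering, here is a minimal sketch. It assumes each group holds the scalar rewards of rollouts sampled for the same task, and it only subtracts the group mean. The function name center_advantages is hypothetical and is not part of W8-RL or the Tinker Cookbook API. Whether the actual training loop also rescales advantages (for example by the group standard deviation) is not stated here.

```python
from typing import List


def center_advantages(group_rewards: List[float]) -> List[float]:
    """Subtract the group's mean reward from each rollout's reward.

    Hypothetical helper for illustration; not a W8-RL or Tinker function.
    """
    if not group_rewards:
        return []
    mean = sum(group_rewards) / len(group_rewards)
    return [reward - mean for reward in group_rewards]


# Each inner list holds the rewards of rollouts sampled for the same task.
groups = [[1.0, 0.0, 0.5, 0.5], [0.2, 0.2]]
advantages = [center_advantages(group) for group in groups]
```

In a group where every rollout earns the same reward, all centered advantages are zero, so that group contributes no learning signal under this scheme.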
Run rollout (no training)
Run the command below from the repo root in Docker:
./scripts/run_design2code_tinker.sh \
--task-dir tasks --limit 3 \
--policy tinker \
--model Qwen/Qwen3-4B-Instruct-2507 \
--episodes 1 \
--max-tokens 2048 \
--max-total-tokens 3000000
Run training
See Tinker RL Training.
Next Steps
- Read the architecture overview: Architecture Overview
- Run a Design2Code task: Design2Code Runs
- Review troubleshooting: Troubleshooting