Verifier-backed workflow data for CUA
Workflow data for computer-use agents (CUA) is only useful when it comes with verifier audits, pass@k distributions, failure traces, and contamination controls.
Latest notes
For computer-use agents, the bottleneck is not only RL algorithms; it is also the supply of verifiable workflow environments.
Good CUA data is workflow data with verifiers, eval splits, failure traces, and evidence that it improves computer-use agents.
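To make "pass@k distributions" concrete, here is a minimal Python sketch of how a per-task pass@k distribution could be computed from verifier outcomes. It assumes the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), for n attempts of which c passed; the text above does not say which estimator is intended. The task names and the `results` layout are hypothetical and only illustrate the shape of the data.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n attempts, c of which the verifier accepted.

    Assumption: the standard unbiased estimator 1 - C(n-c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k attempts failed,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_at_k_distribution(results: dict[str, list[bool]], k: int) -> dict[str, float]:
    """Map each task to its pass@k estimate, given per-attempt verifier verdicts."""
    return {
        task: pass_at_k(len(verdicts), sum(verdicts), k)
        for task, verdicts in results.items()
    }


# Hypothetical verifier verdicts: task id -> one boolean per sampled attempt.
results = {
    "export_invoice_pdf": [True, False, False, False],
    "rename_shared_folder": [True, True, True, True],
    "file_expense_report": [False, False, False, False],
}

distribution = pass_at_k_distribution(results, k=2)
for task, score in sorted(distribution.items()):
    print(f"{task}: pass@2 = {score:.2f}")
```

Reporting the full per-task distribution rather than a single mean keeps the hard tasks visible: a task stuck at 0.0 across all attempts is a candidate for failure-trace review or a verifier audit, which an aggregate score would hide.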