The platform automatically runs untested task/harness pairs.
Each run is limited to 1 hour and $20 in API token cost.
...untested task/harness pairs
✓ evolve + test scored✓ evolve only (no test configured)◐ test missing (click to retest)↻ retest in progress○ pending (click to launch)✗ failed (click to inspect)— not applicable