Can LLMs autonomously evolve faster, better bioinformatics algorithms? Track the race across tasks, datasets, and evolution harnesses.
Which evolution harness produces the best results across all tasks? Harnesses are ranked by tasks won: first place on each task's primary metric, with ties broken by the remaining directional metrics.
| # | Harness | Type | Tasks | Results | Best Results |
|---|---|---|---|---|---|
| 1 | Claude Code | agent-loop | - | - | - |
| 2 | SkyDiscover (AdaEvolve) | evolutionary | - | - | - |
| 3 | SkyDiscover (Beam Search) | evolutionary | - | - | - |
| 4 | SkyDiscover (Best-of-N) | evolutionary | - | - | - |
| 5 | SkyDiscover (Top-K) | evolutionary | - | - | - |
Build the cell-cell kNN graph faster than scanpy.pp.neighbors while keeping edge-set Jaccard >= 0.9 against the reference. Train on PBMC 3K (~2.6K cells); evaluate on a held-out 50K-cell synthetic dataset to test scaling.
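The edge-set Jaccard above can be sketched as follows. This is a minimal illustration, not the harness's actual scoring code: it assumes each graph is given as an `(n_cells, k)` neighbor-index array (the shape `scanpy.pp.neighbors`-style kNN searches produce), and treats edges as undirected by canonicalizing each pair to `(min, max)`.

```python
import numpy as np

def edges_from_knn(indices: np.ndarray) -> set:
    """Undirected edge set {(i, j), i < j} from an (n_cells, k) neighbor-index array."""
    n, _ = indices.shape
    return {(min(i, j), max(i, j)) for i in range(n) for j in indices[i] if i != j}

def edge_jaccard(a: np.ndarray, b: np.ndarray) -> float:
    """|E_a ∩ E_b| / |E_a ∪ E_b| over the two graphs' undirected edge sets."""
    ea, eb = edges_from_knn(a), edges_from_knn(b)
    union = ea | eb
    return len(ea & eb) / len(union) if union else 1.0
```

An evolved implementation would pass the gate when `edge_jaccard(candidate, reference) >= 0.9`; canonicalizing to undirected edges matters because approximate kNN searches need not return symmetric neighbor lists.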
Evolve Leiden graph clustering to run faster while maintaining clustering quality. Train on PBMC 3K (~2.6K cells, fast iteration); evaluate on a held-out 50K-cell synthetic PBMC dataset that preserves the original cluster structure.
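"Maintaining clustering quality" implies comparing a candidate's cluster labels to a reference labeling. The exact metric is not stated here; the adjusted Rand index (ARI) is one common choice, sketched below from its contingency-table definition using only the standard library.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b) -> float:
    """ARI between two flat labelings of the same cells (1.0 = identical partitions)."""
    n = len(labels_a)
    # Pair counts within each contingency cell, row, and column.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate partitions (e.g. all one cluster)
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

ARI is invariant to label permutation, so an evolved Leiden variant is not penalized for numbering its clusters differently from the reference.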
Build a peak-calling implementation that matches or outperforms MACS3 on ATAC-seq data. Train on GM12878 scATAC-seq (111M reads).