Submit - BioEvolve Bench

What to evolve

Task ID * URL-safe slug

Name *

Description *

Problem *

Seed Algorithm *

Train Dataset *

Test Dataset(s) Datasets the harness never sees. Used for final leaderboard scores.

Harness *

Evaluation Metrics

Primary

Baseline

Baseline Command

Enter baseline metric values (will auto-populate from metrics above)

Harness Details

Harness ID * URL-safe slug

Name *

Description *

Type *

LLM Provider

LLM Model

Required API Keys Comma-separated env var names injected at runtime
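Since the keys are listed as comma-separated env var names, a harness may want to confirm they were actually injected before calling an LLM. A minimal sketch, assuming the same comma-separated string is available to the harness; the helper name `missing_api_keys` is hypothetical and not part of BioEvolve Bench:

```python
import os


def missing_api_keys(required: str, env=None) -> list[str]:
    """Return the names from a comma-separated list that are unset or empty.

    `required` mirrors the form field, e.g. "A_KEY, B_KEY" (hypothetical names).
    `env` defaults to os.environ; pass a dict to check without touching the process env.
    """
    env = os.environ if env is None else env
    # Split on commas and drop blanks left by stray or trailing commas.
    names = [name.strip() for name in required.split(",") if name.strip()]
    # A variable counts as missing when it is absent or set to an empty string.
    return [name for name in names if not env.get(name)]
```

Calling this at the top of run() and failing early on a non-empty result is one way to surface a misconfigured submission before any evaluation time is spent.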

Code Upload

Upload your harness code. Must include a run.py entry point that implements:

def run(workspace: str) -> None:
    """
    workspace contains:
      task.yaml     - Task config (metrics, baseline, evaluator command)
      seed/         - Modifiable seed code (e.g., solve.py)
      evaluator/    - Read-only evaluation scripts
      data/         - Read-only input data
      reference/    - Read-only reference algorithm source

    Write results to:
      results/metrics.json   - Final best scores
      results/best/          - Best evolved code snapshot
      results/history.jsonl  - Iteration log (optional)
    """

Files *

Drop files here or click to browse

Must include run.py. Can also include requirements.txt, configs, etc.

Submissions are reviewed by an admin before being registered. You'll be notified when approved.

Dataset Details

Dataset ID * URL-safe slug (lowercase, hyphens only)

Name *

Description *

Organism

Assay

Scale Plain number (no commas)

Source URL

Files

Files are uploaded to the bioevolve-data Modal Volume under <dataset-id>/. Sandboxes mount the volume read-only, so train/test runs see the files at /workspace/data/. Large files (multi-GB BAMs) should be prepared on Modal directly via a script like infra/data/prepare_*.py instead of going through this form.
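
To check what a run will actually see at the mount point, something like the following could be used. It is a sketch only: the helper name `list_data_files` is hypothetical, and the default path is taken from the description above rather than from any documented API.

```python
from pathlib import Path


def list_data_files(data_dir: str = "/workspace/data") -> list[str]:
    """Return sorted, '/'-separated relative paths of all files under data_dir.

    Returns an empty list when the directory does not exist, for example
    when running outside the sandbox.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []
    # The mount is read-only, so this only walks and reads directory entries.
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
```

Because uploaded file names are preserved, the returned paths should match the names chosen at upload time.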

Files *

Drop files here or click to browse

Each file's name is preserved. Total upload < 1 GB recommended.

File descriptions (optional) Used to populate the files: block of the registry YAML.

Admin-only: this writes a new registry/datasets/<id>.yaml and pushes the files to the shared Modal Volume immediately.