The Virtual Embryo Challenge
Generative modeling of mouse embryogenesis across space, scale, and time — under genetic perturbation.
Embryogenesis is fundamental — and largely unmodelled
A single fertilised cell becomes a complete organism through spatiotemporally coordinated gene regulation, cell-fate transitions, tissue morphogenesis, and organ formation. Disruptions cause congenital defects, which still affect 1 in 33 newborns and remain a leading cause of infant mortality.
Large embryo atlases and spatial-transcriptomics datasets give us snapshots, but they don't reveal how cell states transition, how local molecular changes propagate to tissue- and organ-level phenotypes, or how development responds to perturbation.
The Virtual Embryo Challenge establishes a standardised benchmark for predictive embryogenesis: a curated dataset, an evaluation pipeline, baseline models, and three tasks that jointly stress spatial context, multiscale reasoning, temporal dynamics, and perturbation response.
Three tasks, one shared atlas
Each task uses staged train / validation / hidden-test splits over the same whole-embryo + heart-focused resource. Hidden labels are never released; final rankings reflect generalisation to held-out stages, embryos, and genotypes.
Forecast the gene-expression distribution at unseen future stages from earlier ones.
Predict expression + cell-type composition + 3D spatial organisation jointly across stages.
Predict mutant developmental outcomes — cell-type distribution, heart morphology, gene expression — under unseen knock-outs.
Multimodal whole-embryo perturbation resource
~1 million cells across 11 developmental time points, spanning early gastrulation through cardiac progenitor emergence, heart-tube formation, looping, and later morphogenesis.
Whole-embryo per-cell expression and chromatin accessibility across staged embryos.
Coronal sections decoded into per-cell 3D positions plus measured RNA.
Per-cell cell-type, tissue-domain, and anatomical-region calls plus morphology-derived features.
Three conditional knock-out (CKO) conditions across cardiac developmental regulators, with paired wild-type controls and bulk-RNA validation.
Three metrics, automatic scoring, hidden labels
Scores are computed on held-out embryos once a submission passes schema validation (gene order, cell-type vocabulary, coordinate convention, missing-value policy). Each task reports sub-scores, and an overall composite determines the ranking.
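The four schema checks can be pictured with a minimal sketch. Everything concrete here is an assumption for illustration: the field names (`genes`, `cells`, `cell_type`, `xyz`, `expression`), the three-gene reference list, the vocabulary, and the policy of rejecting missing values outright. The real schema will be defined by the starter kit.

```python
import math

# Hypothetical reference schema; the real gene list and vocabulary come from the organisers.
REFERENCE_GENES = ["Nkx2-5", "Tbx5", "Hand1"]
CELL_TYPE_VOCAB = {"cardiomyocyte", "endothelium", "mesoderm"}


def validate_submission(sub):
    """Return a list of human-readable schema errors (empty if the submission passes)."""
    errors = []
    # Gene order must match the reference exactly, not just as a set.
    if sub.get("genes") != REFERENCE_GENES:
        errors.append("gene order does not match the reference")
    n_genes = len(REFERENCE_GENES)
    for i, cell in enumerate(sub.get("cells", [])):
        # Cell-type vocabulary check.
        if cell.get("cell_type") not in CELL_TYPE_VOCAB:
            errors.append(f"cell {i}: unknown cell type {cell.get('cell_type')!r}")
        # Coordinate convention (assumed here): exactly three values per cell.
        xyz = cell.get("xyz", ())
        if len(xyz) != 3:
            errors.append(f"cell {i}: expected 3 coordinates, got {len(xyz)}")
        expr = cell.get("expression", ())
        if len(expr) != n_genes:
            errors.append(f"cell {i}: expected {n_genes} expression values, got {len(expr)}")
        # Assumed missing-value policy: None or NaN is rejected outright.
        values = list(xyz) + list(expr)
        if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in values):
            errors.append(f"cell {i}: missing values are not allowed")
    return errors
```

Returning a list of errors instead of raising on the first one lets a participant fix every problem in a single round trip.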
Pseudobulk Pearson correlation per evaluation stratum (embryo / region / cell type), averaged with bootstrap confidence intervals.
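As a rough illustration of this metric, the sketch below averages cells into one pseudobulk profile per stratum, correlates predicted against observed profiles, and bootstraps the mean. It assumes that strata are the resampling unit and that the interval is a 95% percentile interval; neither detail is stated in the proposal, and the function names are hypothetical.

```python
import math
import random
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def pseudobulk(cells):
    """Average per-cell expression vectors within each stratum.

    `cells` is a list of (stratum, expression_vector) pairs.
    """
    groups = defaultdict(list)
    for stratum, expr in cells:
        groups[stratum].append(expr)
    return {s: [sum(col) / len(col) for col in zip(*vecs)] for s, vecs in groups.items()}


def stratified_pearson(pred_cells, obs_cells, n_boot=200, seed=0):
    """Mean per-stratum Pearson r plus a percentile bootstrap interval over strata."""
    pred, obs = pseudobulk(pred_cells), pseudobulk(obs_cells)
    shared = sorted(set(pred) & set(obs))
    if not shared:
        raise ValueError("no strata shared between prediction and observation")
    scores = [pearson(pred[s], obs[s]) for s in shared]
    mean = sum(scores) / len(scores)
    # Resample strata with replacement and recompute the mean each time.
    rng = random.Random(seed)
    boots = sorted(
        sum(rng.choices(scores, k=len(scores))) / len(scores) for _ in range(n_boot)
    )
    lo, hi = boots[int(0.025 * n_boot)], boots[int(0.975 * n_boot) - 1]
    return mean, (lo, hi)


# Toy data: two strata, three genes.
observed = [("heart", [1.0, 2.0, 3.0]), ("heart", [3.0, 2.0, 5.0]), ("gut", [0.0, 4.0, 1.0])]
predicted = [("heart", [1.5, 2.5, 3.5]), ("heart", [2.5, 1.5, 5.5]), ("gut", [0.5, 3.0, 1.0])]
mean_r, (ci_lo, ci_hi) = stratified_pearson(predicted, observed)
print(mean_r, ci_lo, ci_hi)
```

A stratum key can be any sortable value, such as a string or a tuple of embryo, region, and cell type.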
A frozen probe classifier, trained by the organisers and locked before evaluation, labels the predicted cells; predicted and observed cell-type proportions are then compared at global, regional, and per-condition levels.
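The idea can be sketched as follows. Two pieces are stand-ins chosen purely for illustration, since the proposal specifies neither: a nearest-centroid rule plays the role of the frozen probe, and total variation distance plays the role of the proportion comparison. The centroids and cell values are made up.

```python
from collections import Counter


def nearest_centroid(expr, centroids):
    """Stand-in for the frozen probe: label a cell by its closest centroid."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(expr, c))
    return min(centroids, key=lambda name: sq_dist(centroids[name]))


def proportions(labels, vocab):
    """Fraction of cells per cell type, with zeros for absent types."""
    counts = Counter(labels)
    total = len(labels)
    return {ct: counts[ct] / total for ct in vocab}


def total_variation(p, q):
    """Total variation distance between two proportion dicts over the same keys."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


# Hypothetical two-gene centroids, fixed before any predictions are scored.
CENTROIDS = {"cardiomyocyte": [5.0, 0.0], "endothelium": [0.0, 5.0]}

predicted_cells = [[4.0, 1.0], [5.0, 0.5], [1.0, 4.0], [0.5, 6.0]]
observed_labels = ["cardiomyocyte", "cardiomyocyte", "cardiomyocyte", "endothelium"]

pred_labels = [nearest_centroid(c, CENTROIDS) for c in predicted_cells]
p = proportions(pred_labels, CENTROIDS)
q = proportions(observed_labels, CENTROIDS)
print(total_variation(p, q))  # 0.25: predicted 50/50 vs observed 75/25
```

Running the same comparison on subsets of cells (one region, one knock-out condition) would give the regional and per-condition variants.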
Fused Gromov-Wasserstein distance combining expression similarity with spatial-structure preservation. Penalises predictions that get the marginals right but the geometry wrong.
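To make the geometry penalty concrete, the sketch below evaluates the fused Gromov-Wasserstein objective for a fixed coupling between predicted and observed cells. This is only an upper bound on the distance itself, which minimises over all couplings; a solver such as `ot.gromov.fused_gromov_wasserstein2` in the POT library would handle that step. The squared-Euclidean costs, the trade-off `alpha=0.5`, and the toy data are assumptions, not the challenge's exact formulation.

```python
import math


def pairwise_dist(points):
    """Euclidean distance matrix within one set of points."""
    return [[math.dist(a, b) for b in points] for a in points]


def fgw_objective(expr_x, pos_x, expr_y, pos_y, coupling, alpha=0.5):
    """Fused Gromov-Wasserstein objective for a *given* coupling T.

    (1 - alpha) * sum_ij M_ij T_ij + alpha * sum_ijkl (C1_ik - C2_jl)^2 T_ij T_kl
    where M is the squared expression distance across the two sets and
    C1, C2 are the within-set spatial distance matrices.
    """
    c1, c2 = pairwise_dist(pos_x), pairwise_dist(pos_y)
    n, m = len(expr_x), len(expr_y)
    feature = sum(
        coupling[i][j] * math.dist(expr_x[i], expr_y[j]) ** 2
        for i in range(n) for j in range(m)
    )
    structure = sum(
        coupling[i][j] * coupling[k][l] * (c1[i][k] - c2[j][l]) ** 2
        for i in range(n) for j in range(m)
        for k in range(n) for l in range(m)
    )
    return (1 - alpha) * feature + alpha * structure


# Three cells on a line; the "scrambled" prediction keeps every expression
# profile but swaps the positions of the last two cells.
expr = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
scrambled = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
# Uniform coupling that matches each cell to its expression twin.
identity = [[1 / 3 if i == j else 0.0 for j in range(3)] for i in range(3)]

print(fgw_objective(expr, pos, expr, pos, identity))        # exact match: zero cost
print(fgw_objective(expr, pos, expr, scrambled, identity))  # same expression, wrong layout: positive cost
```

The quadruple loop is fine for a handful of cells but scales as the square of the number of cell pairs, so a real evaluation would need a vectorised or subsampled formulation.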
Three phases · launch → development → final
- 2026-06-30: Site + submission portal + eval platform live
- 2026-07-20: Starter kit released; website opens to participants
- 2026-07-30: P1 · Test phase begins (workflow + leaderboard validation)
- 2026-08-15: P2 · Development phase begins; validation dataset released
- 2026-10-25: P3 · Final test phase begins (new held-out dataset)
- 2026-11-02: Final submissions due; official evaluation starts
- 2026-11-18: Winners announced at NeurIPS
$70K total from the Laude Institute Moonshots Seed Grant
Top teams per task across both tracks.
15–20 grants ($1K–$2K each) for early-career attendees of the NeurIPS workshop.
Website, starter-kit repo, tutorials, reproducible walkthroughs, Slack workspace, community support.
Evaluation runs on the Stanford Sherlock GPU cluster (NVIDIA H100 and H200 GPUs, 80 GB or more of memory each).
Explore the data that powers it
The challenge is grounded in the same atlas you can browse on this site: 3D spatial-transcriptomics specimens by Theiler stage, a whole-embryo single-cell time-lapse from gastrula to birth, and the EMA anatomical references. Use them now to understand the modality coverage and stage spacing before the starter kit is released.
Hosted by the Qiu Lab, Stanford University, in collaboration with developmental-biology, computational-biology, and machine-learning communities. The full organising committee will be announced alongside the starter-kit release.
Subscribe to receive starter-kit, dataset, and timeline updates. The competition site, GitHub repository, and Slack workspace go live two weeks before P1.
This page summarises the NeurIPS 2026 competition proposal currently under review. Dates, datasets, prize amounts, and exact metric formulations are subject to change between proposal acceptance and launch.