The Virtual Embryo KG

A reasoning-capable graph of mouse embryonic anatomy (EMAPA), cell types (CL), genes (MGI), developmental stages, datasets and papers — agent-queried via /kg/* + MCP. Federates Bgee for live gene-expression facts and accepts new claims from kg-extract-paper runs into a staging area.

Knowledge graph map

Each node is one entity type; bubble size scales with instance count. Click a type to load its top-degree instances; click any instance for its triples and atlas links. Alternatively, search above and click a hit to seed the map.

Literature pipeline

Papers at each stage of the ingest pipeline. Click a row to see the per-paper claim subgraph (curated / staging) or the abstract (queued). Each paper is in exactly one stage.

Lifecycle

QUEUED: discovered or in catalog · no claims yet
→ STAGING: agent has extracted claims · awaiting review / promotion
→ CURATED: claims with conf ≥ 4 have been promoted to the vetted graph

Internal detail: promoting a paper to curated copies its high-confidence claims into a vetted graph; the original staging graph is retained server-side as an audit trail (so we can re-promote later if new predicates are added). That backup is not user-facing — a promoted paper just shows CURATED, not both states.
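To make the lifecycle concrete, here is a minimal sketch of the promotion rule, assuming each claim carries an integer confidence score. The Claim fields, the expressed_in predicate, the GENE:example and EMAPA_example identifiers, and both function names are illustrative assumptions, not the backend's actual code.

```python
from dataclasses import dataclass

PROMOTION_THRESHOLD = 4  # "conf ≥ 4" from the lifecycle description


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    obj: str
    conf: int  # hypothetical field name for the claim's confidence score


def promote(staging: list[Claim], threshold: int = PROMOTION_THRESHOLD) -> list[Claim]:
    """Return the claims to copy into the vetted graph; staging is left untouched."""
    return [c for c in staging if c.conf >= threshold]


def paper_stage(staging: list[Claim], vetted: list[Claim]) -> str:
    """A paper reports exactly one stage, even though staging is kept as an audit trail."""
    if vetted:
        return "CURATED"
    if staging:
        return "STAGING"
    return "QUEUED"


# Placeholder identifiers, except EMAPA_16105 which appears elsewhere on this page.
staging = [
    Claim("GENE:example", "expressed_in", "EMAPA_16105", conf=5),
    Claim("GENE:example", "expressed_in", "EMAPA_example", conf=2),
]
vetted = promote(staging)
print(paper_stage(staging, vetted), len(vetted), len(staging))  # CURATED 1 2
```

Because promote copies rather than moves, the staging list still holds both claims afterwards, which is what allows a later re-promotion.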


Agent integration · MCP

Query this KG from any LLM agent

modelcontextprotocol.io ↗

The same FastAPI routes that power this page are wrapped as a Model Context Protocol server (ve_api.mcp_server), so an LLM agent can ask "what's the developmental stage of EMAPA_16105?" and the runtime dispatches a structured tool call instead of having the model hand-write Cypher. For the cases where structured tools are too narrow, kg_cypher gives the agent a read-only Cypher escape hatch directly into Neo4j; kg_cypher_write is the same thing with writes enabled, gated behind an admin key.
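For orientation, this is roughly what such a structured tool call looks like on the wire. MCP uses JSON-RPC 2.0 with a tools/call method; the argument names below (iri, query) are assumptions, so check each tool's input schema via tools/list before relying on them.

```python
import json


def tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialise an MCP tools/call request (JSON-RPC 2.0)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Structured lookup: "iri" is an assumed argument name.
entity_msg = tool_call(1, "kg_entity", {"iri": "EMAPA_16105"})

# Escape hatch: "query" is an assumed argument name; the Cypher itself is read-only.
cypher_msg = tool_call(2, "kg_cypher", {"query": "MATCH (n) RETURN count(n) AS n"})

print(entity_msg)
print(cypher_msg)
```

In practice the MCP host builds these messages for the model; the sketch only shows why no hand-written Cypher is needed for the common lookups.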

Tools exposed

15 MCP tools — grouped by what an agent typically does next.

Browse:
· kg_search: fuzzy substring across all labels + synonyms
· kg_entity: one-hop properties of an IRI
· kg_expand: neighbours, optionally filtered by predicate
· kg_subgraph: stage- or seed-anchored subgraph for visualisation
· kg_resolve_entity: free-text → canonical IRI (gene symbol, anatomy term…)

Raw graph access:
· kg_expression: where a mouse gene is expressed (local claims)
· kg_cypher: raw read-only Cypher against Neo4j
· kg_cypher_write: Cypher with writes; requires VE_ADMIN_KEY

Literature:
· pubmed_search: PubMed E-utilities passthrough
· biorxiv_recent: recent preprints in mouse-embryo space
· kg_known_papers: what's already curated/staged, to avoid duplicate work

Submit (paper extraction) · admin key:
· kg_predicate_list: allowed predicates for the current schema version
· kg_schema: JSON schema of the extraction envelope
· kg_validate_extraction: dry-run an envelope before submitting (open)
· kg_submit_extraction: land claims in the staging graph; requires VE_ADMIN_KEY

Wire it up

One command in Claude Code; Claude Desktop and other MCP hosts take the same stdio command in their config.

# Read-only access — no key needed
claude mcp add virtualembryo-kg -- \
  /path/to/venv/bin/python -m ve_api.mcp_server

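# With write access (curator role)
# --env stores the key in the server's launch config; a VAR=... prefix would
# only apply to the `claude mcp add` command itself.
claude mcp add virtualembryo-kg --env VE_ADMIN_KEY=<your-key> -- \
  /path/to/venv/bin/python -m ve_api.mcp_server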

The server reads VE_API_BASE (defaults to http://localhost:8787) and proxies tool calls to the FastAPI backend — so the same query budgets, schema validators, and Neo4j connection pool the web UI uses are reused. Tools fail loudly if the backend is down rather than silently returning empty results.
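The proxy behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the code in ve_api.mcp_server: the names api_base, proxy_get and BackendUnavailable are hypothetical, and only the VE_API_BASE variable and its default come from the text.

```python
import json
import os
import urllib.error
import urllib.request

DEFAULT_BASE = "http://localhost:8787"  # default stated in the text


class BackendUnavailable(RuntimeError):
    """Raised instead of returning an empty result when the backend is unreachable."""


def api_base(env=os.environ) -> str:
    """Resolve the backend base URL from VE_API_BASE, falling back to the default."""
    return env.get("VE_API_BASE", DEFAULT_BASE).rstrip("/")


def proxy_get(path, env=os.environ, opener=urllib.request.urlopen, timeout=10.0):
    """Forward a GET to the backend and decode the JSON body; never swallow failures."""
    url = f"{api_base(env)}{path}"
    try:
        with opener(url, timeout=timeout) as resp:
            return json.load(resp)
    except OSError as exc:  # urllib.error.URLError and HTTPError are OSError subclasses
        raise BackendUnavailable(f"backend call failed: {url}: {exc}") from exc
```

The opener parameter exists only so the failure path can be exercised without a running backend; a caller would normally leave it at its default.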

Write auth

Read tools (kg_search, kg_subgraph, kg_expression, kg_cypher, …) are open to any MCP client. Write tools (kg_submit_extraction, kg_cypher_write, and the admin POST /kg/invalidate_cache) require an X-Admin-Key header that matches the VE_ADMIN_KEY env var set on the backend. A write request without a matching header gets 401; if VE_ADMIN_KEY is not set on the backend at all, the backend fails closed with 503, so anonymous writes are never possible.
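The decision logic for write routes can be summarised in a few lines. This is a sketch of the rule as described, assuming a plain string comparison of the header against the configured key; write_auth_status is a hypothetical name and the real backend may structure this differently (for example as a FastAPI dependency).

```python
import hmac
from typing import Optional


def write_auth_status(configured_key: Optional[str], header_key: Optional[str]) -> int:
    """Return the HTTP status a write route would answer with before doing any work."""
    if not configured_key:
        # VE_ADMIN_KEY unset on the backend: fail closed, nobody can write.
        return 503
    if header_key is None:
        return 401
    # Constant-time comparison avoids leaking key contents through timing.
    if not hmac.compare_digest(header_key.encode(), configured_key.encode()):
        return 401
    return 200


print(write_auth_status(None, "anything"))       # 503
print(write_auth_status("s3cret", None))         # 401
print(write_auth_status("s3cret", "wrong"))      # 401
print(write_auth_status("s3cret", "s3cret"))     # 200
```

Checking for the missing env var first is what makes the failure mode "closed": a misconfigured backend rejects every write rather than accepting any key.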

How to get one: there's no self-service portal yet. The backend owner generates a key with python scripts/gen_admin_key.py and shares it through a secrets manager (1Password / Bitwarden) with the small set of curators who need write access. Everyone uses the same key, so there's no per-user audit trail today.

· For the NeurIPS Challenge agents, the same server is mounted into the eval harness so submissions can query the KG during inference (read-only by default).

· Schema lives at ve_api/extraction_schema/v8.json. Bump the version, re-export the envelope, and kg_predicate_list picks up the new predicates without a redeploy.

· The HTTP equivalents (GET /kg/search, POST /kg/sparql, …) stay live for non-MCP clients.
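For non-MCP clients, a request against the HTTP routes might be built like this. The base URL is the documented local default; the q query parameter and both function names are assumptions, so consult the backend's OpenAPI page for the real parameter names. Nothing is sent here; the sketch only constructs the requests.

```python
import urllib.parse
import urllib.request

BASE = "http://localhost:8787"  # documented default for VE_API_BASE


def search_request(term: str, base: str = BASE) -> urllib.request.Request:
    """Build GET /kg/search; 'q' is an assumed parameter name."""
    query = urllib.parse.urlencode({"q": term})
    return urllib.request.Request(f"{base}/kg/search?{query}", method="GET")


def invalidate_cache_request(admin_key: str, base: str = BASE) -> urllib.request.Request:
    """Build the admin POST /kg/invalidate_cache with the X-Admin-Key header."""
    return urllib.request.Request(
        f"{base}/kg/invalidate_cache",
        data=b"",
        headers={"X-Admin-Key": admin_key},
        method="POST",
    )


req = search_request("neural tube")
print(req.full_url)  # http://localhost:8787/kg/search?q=neural+tube
```

Passing either request to urllib.request.urlopen would send it; the search route needs no key, while the cache invalidation is rejected without a matching X-Admin-Key.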