An agentic system for rare disease diagnosis with traceable reasoning
Summary
Researchers developed DeepRare, an agentic Large Language Model (LLM) system designed to support differential diagnosis of rare diseases. DeepRare processes heterogeneous patient inputs (free text, HPO terms, genomic data) using a three-tier architecture inspired by the Model Context Protocol (MCP): a central host, specialized agent servers, and external medical resources. Two key features are a self-reflective loop and transparent, traceable reasoning chains linked to verifiable medical evidence, which together are intended to mitigate the hallucination problem common to LLMs.
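The three-tier design and self-reflective loop described above can be illustrated with a toy sketch. This is not DeepRare's implementation: the `Host` and `Candidate` classes, the `phenotype_agent` and `literature_agent` functions, and the substring-matching "reflection" step are all hypothetical stand-ins, with plain Python functions in place of the LLM and the external medical resources.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Candidate:
    """A candidate diagnosis plus the evidence items that support it."""
    disease: str
    evidence: List[str] = field(default_factory=list)


# Hypothetical agent-server interface: (patient description, round) -> evidence.
AgentServer = Callable[[str, int], List[str]]


class Host:
    """Toy central host: queries agent servers, then checks its own output."""

    def __init__(self, agents: Dict[str, AgentServer],
                 candidates: List[str], max_rounds: int = 3) -> None:
        self.agents = agents
        self.candidates = candidates
        self.max_rounds = max_rounds

    def diagnose(self, patient: str) -> List[Candidate]:
        evidence: List[str] = []
        for round_no in range(self.max_rounds):
            # Tier 2: each agent server returns evidence, tagged with its source.
            for name, agent in self.agents.items():
                evidence += [f"{name}: {item}" for item in agent(patient, round_no)]
            # Self-reflection (simplified): keep only candidates that at least
            # one evidence item mentions; otherwise gather more and retry.
            supported = [
                Candidate(d, [e for e in evidence if d.lower() in e.lower()])
                for d in self.candidates
            ]
            supported = [c for c in supported if c.evidence]
            if supported:
                return sorted(supported, key=lambda c: len(c.evidence), reverse=True)
        return []  # no candidate could be tied to evidence


def phenotype_agent(patient: str, round_no: int) -> List[str]:
    # Stand-in for a phenotype-matching resource (assumption, not a real API).
    return ["Marfan syndrome matches tall stature"] if "tall" in patient else []


def literature_agent(patient: str, round_no: int) -> List[str]:
    # Pretends that a deeper literature search only succeeds from round 1 on.
    if round_no >= 1:
        return ["case report links lens dislocation to Marfan syndrome"]
    return []


host = Host(
    {"phenotype": phenotype_agent, "literature": literature_agent},
    ["Marfan syndrome", "Ehlers-Danlos syndrome"],
)
for candidate in host.diagnose("tall stature, lens dislocation"):
    print(candidate.disease, candidate.evidence)
```

Every returned candidate carries the evidence strings that justified it, which is the property the traceable reasoning chains are meant to provide. A candidate with no supporting evidence is dropped rather than reported.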
Evaluated on 6,401 clinical cases across eight datasets covering 2,919 rare diseases, DeepRare consistently outperformed 15 baseline methods, achieving a Recall@1 of 57.18% in HPO-based evaluations, significantly surpassing the next-best method. With multi-modal input, it achieved a Recall@1 of 69.1% on whole-exome cases. When benchmarked against ten rare disease physicians using HPO input only, DeepRare achieved a Recall@1 of 64.4%, compared with the clinicians' average of 54.6%.
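For readers unfamiliar with the metric, Recall@k is conventionally the fraction of cases whose correct diagnosis appears among the top k ranked predictions. The sketch below uses that standard definition and assumes one reference diagnosis per case; the function name `recall_at_k` and the toy disease lists are illustrative and are not taken from the study.

```python
from typing import Sequence


def recall_at_k(ranked_predictions: Sequence[Sequence[str]],
                true_diagnoses: Sequence[str], k: int = 1) -> float:
    """Fraction of cases whose true diagnosis is within the top-k predictions."""
    if len(ranked_predictions) != len(true_diagnoses):
        raise ValueError("need one ranked prediction list per case")
    if not true_diagnoses:
        return 0.0
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_diagnoses)
        if truth in ranked[:k]
    )
    return hits / len(true_diagnoses)


# Toy data (invented for illustration): two cases, ranked best-first.
predictions = [
    ["Marfan syndrome", "Loeys-Dietz syndrome"],
    ["Fabry disease", "Gaucher disease"],
]
truths = ["Marfan syndrome", "Gaucher disease"]

print(recall_at_k(predictions, truths, k=1))
print(recall_at_k(predictions, truths, k=2))
```

With k=1 only the first case counts as a hit, while k=2 also credits the second case, which is why Recall@1 is the strictest of the Recall@k figures.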
Experts validated the system's reasoning chains and agreed with the factuality of the cited evidence in 95.4% of cases. Failure analysis showed that most errors stemmed from suboptimal weighting of evidence during reasoning (41.0%) or from phenotypic mimicry (38.5%), rather than from factual errors. Ablation studies confirmed that the full agentic design outperforms the underlying LLMs used on their own. DeepRare is deployed as a web application intended to serve as a diagnostic copilot, with the aim of shortening diagnostic odysseys and improving clinical efficiency.
(Source: Nature)