Endri Gupta

Towards Co-Morbidity-Aware Clinical Guidance: Integrating AI, Knowledge Graphs, and Biomedical Evidence for Mechanism-Informed Decision Support

Endri Gupta presents her Master's thesis which explores the integration of artificial intelligence, knowledge graphs, and biomedical evidence to support clinical decision-making in the presence of co-morbidities.

Background and Motivation

Clinical decision-making in healthcare is often guided by evidence-based recommendations focused on single diseases [1]. While effective for standardized treatment, such guidelines fall short for patients with multiple co-existing conditions (co-morbidities), which are common in real-world settings [2,3]. The COVID-19 pandemic highlighted this limitation. Though mainly affecting the lungs, COVID-19 has also been linked to neurological changes. Recent studies suggest it may increase the risk of neurodegenerative diseases (NDDs) like Alzheimer’s and Parkinson’s [4,5,6]. Yet, clinical guidelines have not kept pace with this evolving understanding, leaving physicians without tools to address these complex, overlapping conditions. To bridge this gap, my Master’s thesis proposes a new framework for co-morbidity-aware clinical guidance that combines semantic technologies with artificial intelligence (AI) to integrate diverse sources of biomedical evidence.

 

Methods

We present a prototype framework that integrates clinical trial metadata, guideline content, and biomedical mechanisms to support co-morbidity-aware decision-making. Focusing on COVID-19 and its potential links to Alzheimer’s, Parkinson’s, and Multiple Sclerosis, our framework unifies top-down and bottom-up evidence using graph-based methods. It combines two core strategies: (1) semantic analysis of clinical guidelines, and (2) construction of a co-morbidity knowledge graph.

Approach 1: Top-Down Semantic Analysis of Clinical Guidelines

This approach investigates whether clinical guidelines reflect co-morbidity-aware recommendations, focusing on documents related to COVID-19, Alzheimer’s, Parkinson’s, and Multiple Sclerosis.

Data Collection: We curated 108 guidelines from trusted health agencies (e.g., WHO, NICE, CDC), selected for clinical relevance, credibility, and English availability.

Recommendation Extraction: Due to inconsistent guideline format, traditional NLP meth- ods were ineffective. Instead, we used prompt engineering with gpt-3.5-turbo [7] (via LangChain) [8] to extract actionable recommendations. Few-shot learning [9] improved performance across diverse layouts.

Embedding and Visualization: Recommendations were embedded using all-mpnet-base-v2 [10] and reduced via UMAP (n neighbors = 15, min dist = 0.1, cosine metric). Resulting clusters were visualized to assess cross-disease thematic overlap. A masked version (with disease names replaced) was also tested to evaluate semantic generalization. As shown in Figure 1, this process enabled visualization of semantic patterns across different disease contexts.

Figure 1: Overview of Approach 1: Guidelines were collected and chunked, recommendations extracted using GPT-3.5 + prompt engineering, embedded using MPNet, and clustered using UMAP to assess cross-disease semantic similarity.

Approach 2: Bottom-Up Graph Construction for Mechanism-Aware Reasoning

Post-COVID conditions such as fatigue, dyspnea, and anosmia are increasingly documented but remain poorly reflected in clinical guidelines. This approach adopts a bottom-up strategy, starting from common post-COVID phenotypes to trace connections across clinical trials, biomedical mechanisms, and guideline content – supporting holistic, mechanism-aware clinical reasoning.

Step 1: Phenotype Extraction. We conducted a PubMed search using MeSH descriptor D000094024, identifying 171 phenotypic traits from post-COVID studies published between 2019–2024. Using SciSpacy and UMLS linking, we filtered for clinically relevant symptoms and selected the top 20 traits (e.g., fatigue, dyspnea, cognitive impairment) as semantic anchors.

Step 2: Trial and Guideline Matching. Each phenotype was used to query ClinicalTrials.gov, yielding 1,646 trials. Metadata such as conditions, interventions, and outcomes were extracted and matched to 237 clinical guidelines using MPNet-based cosine similarity (threshold 0.7). This created high-confidence links between symptoms, trial evidence, and guideline statements.

Step 3: Co-Morbidity Graph Construction. We developed a clinical evidence graph connecting post-COVID phenotypes to trials and matched guideline content, representing how symptoms and interventions appear across evidence layers. In parallel, a separate co-morbidity hypothesis graph – developed by colleagues – encoded curated biomedical relationships from sources like PrimeKG, DisGeNET, OpenTargets, etc.

Step 4: Graph Integration. Both graphs were integrated using shared ontology id values for overlapping disease and phenotype entities (e.g., olfaction disorders, MESH:D000857). Ontology normalization via UMLS, MeSH, and OLS ensured semantic consistency. The unified graph links clinical observations with mechanistic links, enabling explainable reasoning and future LLM-based guidance generation.

The overall workflow for Approach 2 is illustrated in Figure 2, showing how phenotypes drive the integration of trials, guidelines, and mechanistic evidence into a coherent graph structure.

Figure 2: Overview of Approach 2: Reverse workflow from phenotypic traits to LLM-driven guideline synthesis.

Results

Approach 1: Top-Down Analysis of Guideline Recommendations

Using GPT-3.5 for recommendation extraction significantly improved the clarity and consistency of clinical guideline content compared to traditional NLP methods. UMAP visualizations of embedded recommendations showed clear disease-specific clustering, indicating that current guidelines are disease-specific. However, when disease names were masked, semantic overlap emerged among neurological conditions, while COVID-19 remained isolated. This suggests that, although treatment themes may be shared, guidelines fail to reflect co-morbidity-aware thinking – especially in linking COVID-19 with neuro-degenerative care. These findings highlight the need for integrated frameworks that support cross-disease reasoning in clinical decision-making.

Approach 2: Bottom-Up Graph Construction – Olfaction Disorders Link COVID-19 to Neurodegenerative Risk

Olfaction disorders (MESH:D000857), including anosmia and hyposmia, emerged as central intermediaries in the co-morbidity knowledge graph, linking COVID-19 to neurodegenerative outcomes. Identified as frequent post-COVID symptoms in both clinical trials and guideline content, these conditions also appear in the COMMUTE hypothesis graph with strong mechanistic ties to nerve degeneration, neuroinflammation, and immune gene activation (IL1B, IL6, TNF). Their repeated presence across data layers highlights their dual role as both observable sequelae and early biomarkers of long-term neurological risk. Supporting this, trials targeting olfactory dysfunction often focus on cognitive impairment, memory loss, and inflammation – key endpoints within the graph. Notably, recent studies suggest that drugs like donepezil [11], used in Alzheimer’s treatment, may benefit post-COVID memory impairment in patients with olfactory loss. These findings reinforce the biological plausibility of olfaction-driven pathways and emphasize the need for clinical frameworks that recognize olfactory dysfunction not just as a symptom, but as a gateway to earlier detection and targeted intervention in post-COVID neurodegeneration.

 

Discussion

The transformation of the traditional evidence pyramid in the era of artificial intelligence, as discussed by Bellini et al. [12], emphasizes the need for intelligent systems that go beyond passively retrieving data – they should actively interlink and expand diverse layers of clinical evidence. My Master’s thesis supports this evolving vision by constructing a co-morbidity knowledge graph that connects biological mechanisms, clinical trials, and guideline recommendations into a unified, interpretable framework. This graph enables bidirectional reasoning: from molecular hypotheses to clinical validation, and from real-world observations back to mechanistic insight. For instance, conditions like olfaction disorders are not only treated as post-COVID symptoms but also as indicators of deeper neurodegenerative pathways. Unlike traditional hierarchical models of evidence, this graph-based approach reveals hidden patterns and supports more personalized, mechanism-aware clinical recommendations. It demonstrates how AI can bridge biological insight and clinical relevance – laying the groundwork for scalable, intelligent decision-support systems.

Citations

[1] G. Feder, M. Eccles, R. Grol, C. Griffiths, and J. Grimshaw, “Using clinical guidelines,” BMJ, vol. 318, no. 7185, pp. 728–730, 1999. https://doi.org/10.1136/bmj.318.7185.728

[2] C. Muth, J. W. Blom, S. M. Smith, K. Johnell, A. I. Gonz´alez-Gonz´alez et al., “Evidence supporting the best clinical management of patients with multimorbidity and polypharmacy: a systematic guideline review and expert consensus,” Journal of Internal Medicine, vol. 285, no. 3, pp. 272–288, 2019. https://doi.org/10.1111/joim.12842

[3] S. T. Skou, F. S. Mair, M. Fortin et al., “Multimorbidity,” Nature Reviews Disease Primers, vol. 8, no. 1, p. 48, 2022. https://doi.org/10.1038/s41572-022-00376-4

[4] G. Douaud, S. Lee, F. Alfaro-Almagro et al., “SARS-CoV-2 is associated with changes in brain structure in UK Biobank,” Nature, vol. 604, pp. 697–707, 2022. https://doi.org/10.1038/s41586-022-04569-5

[5] C. Greene, R. Connolly, D. Brennan et al., “Blood–brain barrier disruption and sustained systemic inflammation in individuals with long COVID-associated cognitive impairment,” Nature Neuroscience, vol. 27, pp. 421–432, 2024. https://doi.org/10.1038/s41593-024-01576-9

[6] COMMUTE Project,, “About the project: COMMUTE,” 2024. https://www.commute-project.eu/en/about.html

[7] OpenAI,, “Prompt engineering guide,” 2024, accessed: 2025-04-15.  https://platform.openai.com/docs/guides/prompt-engineering/prompt-engineering

[8] LangChain,, “Langchain: build context-aware LLM applications,” 2024, accessed: 2025-04-15. [Online]. Available: https://github.com/langchain-ai/langchain

[9] OpenAI,, “Gpt-3.5 turbo fine-tuning and API updates,” 2023, accessed: 2025-04-15. https://openai.com/index/gpt-3-5-turbo-fine-tuning-and-api-updates/

[10] Sentence  Transformers, “all-mpnet-base-v2,”  2021. https://huggingface.co/sentence-transformers/all-mpnet-base-v2

[11] P. Pooladgar, M. Sakhabakhsh, S. Soleiman-Meigooni, A. Taghva, M. Nasiri, and I. A. Darazam, “The effect of donepezil hydrochloride on post-COVID memory impairment: a randomized controlled trial,” Journal of Clinical Neuroscience, vol. 118, pp. 44–50, 2023. https://doi.org/10.1016/j.jocn.2023.09.005

[12] V. Bellini, E. Ori, F. Coccolini, G. Montori, M. Sartelli, and F. Catena, “Evidence pyramid and artificial intelligence: a metamorphosis of clinical research,” Discover Health Systems, vol. 2, p. 40, 2023. https://doi.org/10.1007/s44250-023-00050-w