Knowledge graph visualizes knowledge on psychoses from unstructured literature
SANKT AUGUSTIN. The Fraunhofer Institute for Algorithms and Scientific Computing SCAI and the software company Kairntech (Grenoble) have created a knowledge graph for psychiatric disorders, especially psychoses. Knowledge graphs allow unstructured texts to be represented in a structured, comparable format. They visualize cause-and-effect models to help medical professionals make decisions about therapies. SCAI and Kairntech use artificial intelligence (AI) and natural language processing (NLP) techniques to extract dependencies and relationships. This approach makes it possible to open up the entire mechanistic knowledge of an indication area.
Today, most of the existing knowledge in a particular discipline is available only as unstructured text spread across numerous scientific publications. These papers describe, for example, the interactions of protein X with regulatory gene sequence Y or the effects of a gene variant Z on the clinical course of a disease. Only a fraction of this knowledge is available in a structured form. The vast majority of our medical knowledge is encoded in barely structured scientific prose.
"In the field of psychiatric diseases, such structured databases do not exist as we would need them: comprehensive, detailed, and yet up-to-date," explains Prof. Martin Hofmann-Apitius, head of SCAI's Department of Bioinformatics. "Together with Kairntech, we have succeeded in analyzing the rapidly growing literature in the field of psychiatric diseases in an automated way in a short time."
Software company Kairntech has been working on a generic platform for AI/NLP tasks since 2018. "We had already successfully applied our software to several use cases such as the analysis of legal documents, press articles, or technical reports. Therefore, the transfer to another domain was obvious," say Stefan Geißler and Olivier Terrier from Kairntech.
The project required precise recognition of entities (uniquely identifiable objects about which information is stored) in the technical literature and linking them to suitable canonical vocabularies. In addition, appropriate cause-and-effect relationships between these entities had to be identified. Finally, constraints such as associated biological processes and test systems had to be determined before encoding the result in Biological Expression Language (BEL). "For the first time, we have extracted indication-specific cause-and-effect information on a large scale," explains Martin Hofmann-Apitius. "The study shows that it is possible to generate specific, high-quality, computable cause-and-effect models for other fields of knowledge."
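To illustrate what encoding an extracted relationship can look like, the following is a minimal Python sketch. It is not the SCAI/Kairntech implementation: the class names, the EXAMPLE namespace, and the placeholder entities are assumptions made purely for illustration. It only shows how a recognized subject, a cause-and-effect predicate, and an object could be rendered as a BEL-style statement and collected into a simple graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A recognized entity linked to a vocabulary (all values here are hypothetical)."""
    function: str   # BEL-style function, e.g. "p" (protein) or "path" (pathology)
    namespace: str  # vocabulary the name comes from; "EXAMPLE" is a placeholder
    name: str

    def to_bel(self) -> str:
        # Quote names that contain anything other than letters, digits, or underscores.
        needs_quotes = not self.name.replace("_", "").isalnum()
        label = f'"{self.name}"' if needs_quotes else self.name
        return f"{self.function}({self.namespace}:{label})"


@dataclass(frozen=True)
class Relation:
    """A cause-and-effect relationship between two entities."""
    subject: Entity
    predicate: str  # e.g. "increases" or "decreases"
    obj: Entity

    def to_bel(self) -> str:
        return f"{self.subject.to_bel()} {self.predicate} {self.obj.to_bel()}"


def build_graph(relations):
    """Group relations by subject into an adjacency mapping (a very small knowledge graph)."""
    graph = {}
    for rel in relations:
        graph.setdefault(rel.subject.to_bel(), []).append(
            (rel.predicate, rel.obj.to_bel())
        )
    return graph


if __name__ == "__main__":
    # Placeholder entities; they do not represent real findings.
    protein = Entity("p", "EXAMPLE", "PROTEIN_X")
    disorder = Entity("path", "EXAMPLE", "example disorder")
    relation = Relation(protein, "increases", disorder)
    print(relation.to_bel())
    print(build_graph([relation]))
```

In a real workflow, the entities and predicates would come from the NLP extraction step rather than being written by hand, and constraints such as biological processes or test systems would be attached to each statement as additional context.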
SCAI and Kairntech both see a wide range of future applications for their combined workflows in the pharmaceutical, biotech, and healthcare industries.
About Fraunhofer SCAI, Business Area Bioinformatics:
The research focuses on information extraction and NLP, and the generation of computable cause-and-effect models for neurological and psychiatric diseases.
Kairntech provides AI and NLP software solutions for document analysis in a variety of business contexts.