J 2022

Deep Learning Analysis of Polish Electronic Health Records for Diagnosis Prediction in Patients with Cardiovascular Diseases

ANETTA, Krištof, Aleš HORÁK, Tomasz JADCZYK, Wojciech WOJAKOWSKI, Krystian WITA et al.

Basic information

Original name

Deep Learning Analysis of Polish Electronic Health Records for Diagnosis Prediction in Patients with Cardiovascular Diseases

Authors

ANETTA, Krištof (703 Slovakia, belonging to the institution), Aleš HORÁK (203 Czech Republic, belonging to the institution), Tomasz JADCZYK (616 Poland), Wojciech WOJAKOWSKI (616 Poland) and Krystian WITA (616 Poland)

Edition

Journal of Personalized Medicine, Basel, MDPI, 2022, 2075-4426

Other information

Language

English

Type of outcome

Article in a professional journal

Field of Study

10200 1.2 Computer and information sciences

Country of publisher

Switzerland

Confidentiality degree

is not subject to a state or trade secret

Impact factor

3.508 in 2021

RIV identification code

RIV/00216224:14330/22:00125875

Organization unit

Faculty of Informatics

UT WoS

000818311800001

Keywords in English

electronic health records; deep learning; text analysis; diagnosis prediction; Polish language

Tags

International impact, Reviewed
Changed: 6/4/2023 10:01, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original

Electronic health records naturally contain most of the medical information in the form of doctor’s notes as unstructured or semi-structured texts. Current deep learning text analysis approaches allow researchers to reveal the inner semantics of text information and even identify hidden consequences that can offer extra decision support to doctors. In the presented article, we offer a new automated analysis of Polish summary texts of patient hospitalizations. The presented models were found to be able to predict the final diagnosis with almost 70% accuracy based just on the patient’s medical history (only 132 words on average), with possible accuracy increases when adding further sentences from hospitalization results; even one sentence was found to improve the results by 4%, and the best accuracy of 78% was achieved with five extra sentences. In addition to detailed descriptions of the data and methodology, we present an evaluation of the analysis using more than 50,000 Polish cardiology patient texts and dive into a detailed error analysis of the approach. The results indicate that the deep analysis of just the medical history summary can suggest the direction of diagnosis with a high probability that can be further increased just by supplementing the records with further examination results.

Links

EF19_073/0016943, research and development project
Name: Internal Grant Agency of Masaryk University
LM2018101, research and development project
Name: Digital Research Infrastructure for Language Technologies, Arts and Humanities (Acronym: LINDAT/CLARIAH-CZ)
Investor: Ministry of Education, Youth and Sports of the CR
MUNI/IGA/1326/2021, MU internal code
Name: New Horizons of Electronic Health Record Analysis using Deep Learning (Acronym: Health Record Analysis using Deep Learning)
Investor: Masaryk University