ANETTA, Krištof. Data Mining from Free-Text Health Records : State of the Art, New Polish Corpus. In Aleš Horák. Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020. Brno: Tribun EU, 2020, p. 13-22. ISBN 978-80-263-1600-8.
Basic information
Original name Data Mining from Free-Text Health Records : State of the Art, New Polish Corpus
Authors ANETTA, Krištof (703 Slovakia, guarantor, belonging to the institution).
Edition Brno, Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, p. 13-22, 10 pp. 2020.
Publisher Tribun EU
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Czech Republic
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
WWW PDF in the proceedings; Workshop home page
RIV identification code RIV/00216224:14330/20:00117842
Organization unit Faculty of Informatics
ISBN 978-80-263-1600-8
ISSN 2336-4289
UT WoS 000655471300002
Keywords in English EHR; electronic health records; named entity recognition; text data mining; NLP; natural language processing; Slavic languages; Polish
Tags named entity recognition, natural language processing, NLP, polish, Slavic languages, text data mining
Tags International impact
Changed by RNDr. Pavel Šmerk, Ph.D., učo 3880. Changed: 13/5/2024 17:46.
Abstract
This paper deals with data mining from free-form text electronic health records, both from a global perspective and with specific application to Slavic languages. It introduces the reader to the promises and challenges of this enterprise and provides a short overview of the global state of the art and of the general absence of this kind of research in Central European Slavic languages. It describes pl_ehr_cardio, a new corpus of Polish health records with 18 years’ worth of medical text. This paper marks the beginning of a pioneering research project in medical text data mining in Central European Slavic languages.
Links
LM2018101, research and development project. Name: Digital Research Infrastructure for Language Technologies, Arts and Humanities (Acronym: LINDAT/CLARIAH-CZ)
Investor: Ministry of Education, Youth and Sports of the CR
MUNI/A/1411/2019, internal MU code. Name: Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, visualization of big data and augmented reality.
Investor: Masaryk University, Category A