Other formats:
BibTeX
LaTeX
RIS
@inproceedings{1590018,
  author = {Bamburová, Michaela and Nevěřilová, Zuzana},
  address = {Brno},
  booktitle = {Proceedings of the Thirteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2019},
  editor = {Aleš Horák and Pavel Rychlý and Adam Rambousek},
  keywords = {structured information extraction; table understanding; entity recognition},
  howpublished = {printed version "print"},
  language = {eng},
  location = {Brno},
  isbn = {978-80-263-1530-8},
  pages = {55--62},
  publisher = {Tribun EU},
  title = {Structured Information Extraction from Pharmaceutical Records},
  url = {https://nlp.fi.muni.cz/raslan/2019/paper09-bamburova.pdf},
  year = {2019}
}
TY  - CONF
ID  - 1590018
AU  - Bamburová, Michaela
AU  - Nevěřilová, Zuzana
PY  - 2019
TI  - Structured Information Extraction from Pharmaceutical Records
PB  - Tribun EU
CY  - Brno
SN  - 9788026315308
KW  - structured information extraction
KW  - table understanding
KW  - entity recognition
UR  - https://nlp.fi.muni.cz/raslan/2019/paper09-bamburova.pdf
N2  - The paper presents an iterative approach to understanding semi-structured or unstructured tabular data with pharmaceutical records. The task is to split records with entities such as drug name, dosage strength, dosage form, and package size into the appropriate columns. The data is provided by many suppliers, and so it is very diverse in terms of structure. Some of the records are easy to parse using regular expressions; others are difficult and need advanced methods. We used regular expressions for the easy-to-parse data and conditional random fields for the more complex records. We iteratively extend the training data set using the above methods together with manual corrections. Currently, the F1 score for correct classification into 5 classes is 95%.
ER  -
BAMBUROVÁ, Michaela and Zuzana NEVĚŘILOVÁ. Structured Information Extraction from Pharmaceutical Records. In Aleš Horák, Pavel Rychlý and Adam Rambousek (eds.). \textit{Proceedings of the Thirteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2019}. Brno: Tribun EU, 2019, pp.~55--62. ISBN~978-80-263-1530-8.