HA, Hien Thi. Approximate String Matching for Detecting Keywords in Scanned Business Documents. Online. In Ales Horak, Pavel Rychly, Adam Rambousek. Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2019. Brno, Czech Republic: NLP Consulting, 2019, pp. 49-54. ISBN 978-80-263-1530-8.
Other formats:
BibTeX
LaTeX
RIS
@inproceedings{1648599,
  author = {Ha, Hien Thi},
  address = {Brno, Czech Republic},
  booktitle = {Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2019},
  editor = {Ales Horak and Pavel Rychly and Adam Rambousek},
  keywords = {approximate string matching; Levenshtein distance; weighted edit distance; OCR; invoice},
  howpublished = {electronic version "online"},
  language = {eng},
  location = {Brno, Czech Republic},
  isbn = {978-80-263-1530-8},
  pages = {49--54},
  publisher = {NLP Consulting},
  title = {Approximate String Matching for Detecting Keywords in Scanned Business Documents},
  url = {https://nlp.fi.muni.cz/raslan/raslan19.pdf},
  year = {2019}
}
TY  - CONF
ID  - 1648599
AU  - Ha, Hien Thi
PY  - 2019
TI  - Approximate String Matching for Detecting Keywords in Scanned Business Documents
PB  - NLP Consulting
CY  - Brno, Czech Republic
SN  - 9788026315308
KW  - approximate string matching
KW  - Levenshtein distance
KW  - weighted edit distance
KW  - OCR
KW  - invoice
UR  - https://nlp.fi.muni.cz/raslan/raslan19.pdf
L2  - https://nlp.fi.muni.cz/raslan/raslan19.pdf
N2  - Optical Character Recognition (OCR) is achieving higher accuracy. However, to decrease error rate down to zero is still a human desire. This paper presents an approximate string matching method using weighted edit distance for searching keywords in OCR-ed business documents. The evaluation on a Czech invoice dataset shows that the method can detect a significant part of erroneous keywords.
ER  -
HA, Hien Thi. Approximate String Matching for Detecting Keywords in Scanned Business Documents. Online. In Ales Horak, Pavel Rychly, Adam Rambousek. \textit{Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2019}. Brno, Czech Republic: NLP Consulting, 2019, pp.~49--54. ISBN~978-80-263-1530-8.