@inproceedings{1741557,
  author = {Ha, Hien Thi and Horák, Aleš and Minh Tuan, Bui},
  title = {Contract Metadata Identification in Czech Scanned Documents},
  booktitle = {Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
  editor = {Ana Paula Rocha and Luc Steels and Jaap van den Herik},
  publisher = {The SciTePress Digital Library},
  address = {Portugal},
  location = {Portugal},
  year = {2021},
  pages = {795--802},
  isbn = {978-989-758-484-8},
  doi = {10.5220/0010243807950802},
  url = {https://www.scitepress.org/PublicationsDetail.aspx?ID=Alw3hhTLE1M=&t=1},
  keywords = {Information Extraction; Scanned Documents; Document Metadata; Contract Metadata Extraction; Czech},
  howpublished = {electronic version "online"},
  language = {eng}
}
TY  - CONF
ID  - 1741557
AU  - Ha, Hien Thi
AU  - Horák, Aleš
AU  - Minh Tuan, Bui
PY  - 2021
TI  - Contract Metadata Identification in Czech Scanned Documents
PB  - The SciTePress Digital Library
CY  - Portugal
SN  - 9789897584848
KW  - Information Extraction
KW  - Scanned Documents
KW  - Document Metadata
KW  - Contract Metadata Extraction
KW  - Czech
UR  - https://www.scitepress.org/PublicationsDetail.aspx?ID=Alw3hhTLE1M=&t=1
N2  - Although nowadays digital-born documents are generally prevalent, exchange of business documents often consists in processing their scanned image form as a general human-readable format with one-to-one correspondence to paper documents. Bulk processing of such scanned documents then requires human intervention to extract and enter the main document metadata. In this paper, we present the design and evaluation of a contract processing module in the OCRMiner system. The information extraction process allows combining layout properties with text analysis as input to a rule-based extraction with confidence score propagation. The first results are evaluated with public Czech contract documents reaching the item extraction accuracy of almost 88%.
ER  -
HA, Hien Thi, Aleš HORÁK and Bui MINH TUAN. Contract Metadata Identification in Czech Scanned Documents. Online. In Ana Paula Rocha, Luc Steels and Jaap van den Herik (eds.). \textit{Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART}. Portugal: The SciTePress Digital Library, 2021, pp.~795--802. ISBN~978-989-758-484-8. Available from: https://dx.doi.org/10.5220/0010243807950802.