RADZISZEWSKI, Adam and Marek GRÁC. Using Low-Cost Annotation to Train a Reliable Czech Shallow Parser. In Habernal, Ivan; Matoušek, Václav. Text, Speech, and Dialogue. Plzeň: Springer Berlin Heidelberg, 2013, p. 575-582. ISBN 978-3-642-40584-6. Available from: https://dx.doi.org/10.1007/978-3-642-40585-3_72.
Basic information
Original name Using Low-Cost Annotation to Train a Reliable Czech Shallow Parser
Authors RADZISZEWSKI, Adam (616 Poland) and Marek GRÁC (703 Slovakia, guarantor, belonging to the institution).
Edition Plzeň, Text, Speech, and Dialogue, p. 575-582, 8 pp. 2013.
Publisher Springer Berlin Heidelberg
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 60200 6.2 Languages and Literature
Country of publisher Czech Republic
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
Impact factor 0.402 in 2005
RIV identification code RIV/00216224:14210/13:00069444
Organization unit Faculty of Arts
ISBN 978-3-642-40584-6
ISSN 0302-9743
Doi http://dx.doi.org/10.1007/978-3-642-40585-3_72
UT WoS 000337294900072
Keywords in English corpus annotation; shallow parsing; Czech
Tags NLP, rivok
Changed by Mgr. Vendula Hromádková, učo 108933. Changed: 6/4/2015 22:16.
Abstract
Bushbank is a relatively new concept: a type of annotated corpus in which annotation is driven by automatic tools and the task of human annotators is limited to accepting or rejecting parts of their output. This makes it possible to obtain annotated corpora of considerable size at relatively low cost. In this paper we ask whether the Czech Bushbank is reliable enough to be used for an NLP task instead of a traditional corpus with high annotation rigour. We evaluate three different parsers using its shallow syntactic annotation, including a CRF chunker originally made for Polish. The results are very promising, showing that many practical applications could benefit from low-cost annotation.