Detailed Information on Publication Record
2013
Using Low-Cost Annotation to Train a Reliable Czech Shallow Parser
RADZISZEWSKI, Adam and Marek GRÁC
Basic information
Original name
Using Low-Cost Annotation to Train a Reliable Czech Shallow Parser
Authors
RADZISZEWSKI, Adam (616 Poland) and Marek GRÁC (703 Slovakia, guarantor, belonging to the institution)
Edition
Plzeň, Text, Speech, and Dialogue, p. 575-582, 8 pp. 2013
Publisher
Springer Berlin Heidelberg
Other information
Language
English
Type of outcome
Article in proceedings
Field of Study
60200 6.2 Languages and Literature
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
Publication form
printed version "print"
Impact factor
Impact factor: 0.402 in 2005
RIV identification code
RIV/00216224:14210/13:00069444
Organization unit
Faculty of Arts
ISBN
978-3-642-40584-6
ISSN
UT WoS
000337294900072
Keywords in English
corpus annotation; shallow parsing; Czech
Changed: 6/4/2015 22:16, Mgr. Vendula Hromádková
Abstract
In the original
Bushbank is a relatively new concept - a type of annotated corpus where annotation is driven by the use of automatic tools and the task of human annotators is limited to accepting or rejecting parts of their output. This makes it possible to obtain annotated corpora of considerable size at relatively low cost. In this paper we ask whether the Czech Bushbank is reliable enough to be used for an NLP task instead of a traditional corpus with high annotation rigour. We evaluate three different parsers using its shallow syntactic annotation, including a CRF chunker originally made for Polish. The results are very promising, showing that many practical applications could benefit from low-cost annotation.