GRÁC, Marek. Case study of BushBank concept. In The 25th Pacific Asia Conference on Language, Information and Computation. Singapore: Institute for Digital Enhancement of Cognitive Development, Waseda University, 2011, p. 353-361, 8 pp. ISBN 978-4-905166-02-3.
Basic information
Original name Case study of BushBank concept
Authors GRÁC, Marek (703 Slovakia, guarantor, belonging to the institution).
Edition Singapore, The 25th Pacific Asia Conference on Language, Information and Computation, p. 353-361, 8 pp. 2011.
Publisher Institute for Digital Enhancement of Cognitive Development, Waseda University
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 60200 6.2 Languages and Literature
Country of publisher Singapore
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
RIV identification code RIV/00216224:14330/11:00065905
Organization unit Faculty of Informatics
ISBN 978-4-905166-02-3
Keywords in English corpus; rapid development; annotation; treebank
Tags best1
Tags International impact, Reviewed
Changed by: RNDr. Pavel Šmerk, Ph.D., university ID (učo) 3880. Changed: 30/4/2014 10:22.
Abstract
In this paper, we present a new type of annotated corpus, called BushBank, which improves the handling of ambiguity in natural language. Unlike traditional approaches, in which data are disambiguated directly, a BushBank defers disambiguation until later, based on application needs. This has a major impact on the structures used in the corpus, since ordinary syntactic trees disallow ambiguity. Our approach was tested on 10,000 sentences with more than a hundred annotators when creating the Czech BushBank. The paper describes how such a resource is created and the methods used to obtain high inter-annotator agreement.
Links
GAP401/10/0792, research and development project
Name: Temporální aspekty znalostí a informací (Temporal Aspects of Knowledge and Information)
Investor: Czech Science Foundation
LC536, research and development project
Name: Centrum komputační lingvistiky (Centre for Computational Linguistics)
Investor: Ministry of Education, Youth and Sports of the CR, Centrum komputační lingvistiky