2003
Text Corpus with Errors
PALA, Karel; Pavel RYCHLÝ and Pavel SMRŽ
Basic information
Original name
Text Corpus with Errors
Authors
PALA, Karel; Pavel RYCHLÝ and Pavel SMRŽ
Edition
Berlin, Text, Speech and Dialogue: Sixth International Conference, TSD 2003, p. 90-97, 8 pp. 2003
Publisher
Springer Verlag
Other information
Language
English
Type of outcome
Proceedings paper
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
RIV identification code
RIV/00216224:14330/03:00009149
Organization unit
Faculty of Informatics
ISBN
3-540-20024-X
UT WoS
000186386400012
Keywords in English
error detection
Changed: 26/5/2004 15:13, doc. Mgr. Pavel Rychlý, Ph.D.
Abstract
In the original language
This paper describes a Czech text corpus (Chyby) containing various kinds of errors: spelling, typographical, grammatical, stylistic, and lexical. We explain how Chyby was built and how the errors in it were discovered, marked, and annotated. The classification of the errors is presented, and statistics on the error types are given. The tools for annotating the errors are also described. To the best of our knowledge, this is the first text corpus of this sort prepared for Czech.
Links
MSM 143300003, plan (intention)