Detailed Information on Publication Record
2017
Text Punctuation: An Inter-annotator Agreement Study
BOHÁČ, Marek, Michal ROTT and Vojtěch KOVÁŘ
Basic information
Original name
Text Punctuation: An Inter-annotator Agreement Study
Authors
BOHÁČ, Marek (203 Czech Republic), Michal ROTT (203 Czech Republic) and Vojtěch KOVÁŘ (203 Czech Republic, guarantor, belonging to the institution)
Edition
Cham, Text, Speech, and Dialogue: 20th International Conference, TSD 2017, p. 120-128, 9 pp. 2017
Publisher
Springer International Publishing
Other information
Language
English
Type of outcome
Proceedings paper
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
Publication form
printed version "print"
References:
Impact factor
Impact factor: 0.402 in 2005
RIV identification code
RIV/00216224:14330/17:00095096
Organization unit
Faculty of Informatics
ISBN
978-3-319-64205-5
ISSN
UT WoS
000449869200014
Keywords (in Czech)
doplňování čárek;mluvený jazyk;mezianotátorská shoda
Keywords in English
Comma adding;Spoken language;Inter-annotator agreement
Tags
International impact, Reviewed
Changed: 27/4/2020 23:37, Mgr. Michal Petr
In the original language
Spoken language is hard to annotate accurately. One of the most ambiguous tasks is inserting punctuation marks into a spoken-language transcription. The punctuation used often depends on how annotators understand the content of the transcription. This understanding may differ, because spoken language often lacks the clear structure inherent to written language, owing to the spontaneity of utterances or to skipping between ideas. We therefore suspect that inserting commas into a spoken-language transcription is a very ambiguous task with low inter-annotator agreement (IAA). In this paper we analyze the IAA within a group of annotators and propose methods to increase it. We also propose and evaluate a reformulation of classical GT annotations for cases with multiple annotations available.
In Czech (translated)
The article deals with the problem of adding commas to spoken-language text, in particular inter-annotator agreement and the accuracy of current computer programs.
Links
GA15-13277S, research and development project
LM2015071, research and development project
MUNI/A/0897/2016, internal MU code