BOHÁČ, Marek, Michal ROTT and Vojtěch KOVÁŘ. Text Punctuation: An Inter-annotator Agreement Study. In Ekštein, Kamil and Matoušek, Václav (eds.). Text, Speech, and Dialogue: 20th International Conference, TSD 2017. Cham: Springer International Publishing, 2017, p. 120-128. ISBN 978-3-319-64205-5. Available from: https://dx.doi.org/10.1007/978-3-319-64206-2_14.
Basic information
Original name Text Punctuation: An Inter-annotator Agreement Study
Authors BOHÁČ, Marek (203 Czech Republic), Michal ROTT (203 Czech Republic) and Vojtěch KOVÁŘ (203 Czech Republic, guarantor, belonging to the institution).
Edition Cham, Text, Speech, and Dialogue: 20th International Conference, TSD 2017, p. 120-128, 9 pp. 2017.
Publisher Springer International Publishing
Other information
Original language English
Type of outcome Proceedings paper
Field of Study 10201 Computer sciences, information science, bioinformatics
Country of publisher Czech Republic
Confidentiality degree is not subject to a state or trade secret
Publication form printed version "print"
Impact factor 0.402 in 2005
RIV identification code RIV/00216224:14330/17:00095096
Organization unit Faculty of Informatics
ISBN 978-3-319-64205-5
ISSN 0302-9743
Doi http://dx.doi.org/10.1007/978-3-319-64206-2_14
UT WoS 000449869200014
Keywords (in Czech) doplňování čárek;mluvený jazyk;mezianotátorská shoda
Keywords in English Comma adding;Spoken language;Inter-annotator agreement
Tags firank_B
Tags International impact, Reviewed
Changed by Mgr. Michal Petr, učo 65024. Changed: 27/4/2020 23:37.
Abstract
Spoken language is hard to annotate accurately. One of the most ambiguous tasks is inserting punctuation marks into a transcription of spoken language. The punctuation marks used often depend on how annotators understand the content of the transcription. This understanding may differ because spoken language often lacks the clear structure inherent to written language, owing to the spontaneity of utterances or to skipping between ideas. We therefore suspect that inserting commas into spoken language transcriptions is a highly ambiguous task with low inter-annotator agreement (IAA). In this paper we analyze the IAA within a group of annotators and propose methods to increase it. We also propose and evaluate a reformulation of classical GT annotations for cases where multiple annotations are available.
Abstract (translated from Czech)
The article deals with inserting commas into spoken text, in particular with inter-annotator agreement and the accuracy of current computer programs.
Links
GA15-13277S, research and development project. Name: Hyperintensionální logika pro analýzu přirozeného jazyka
Investor: Czech Science Foundation
LM2015071, research and development project. Name: Jazyková výzkumná infrastruktura v České republice (Acronym: LINDAT-Clarin)
Investor: Ministry of Education, Youth and Sports of the CR
MUNI/A/0897/2016, internal MU code. Name: Rozsáhlé výpočetní systémy: modely, aplikace a verifikace VI.
Investor: Masaryk University, Category A