On Evaluation of Natural Language Processing Tasks: Is Gold
Standard Evaluation Methodology a Good Solution?

KOVÁŘ, Vojtěch, Miloš JAKUBÍČEK and Aleš HORÁK. On Evaluation of Natural Language Processing Tasks: Is Gold Standard Evaluation Methodology a Good Solution?. Online. In Jaap van den Herik and Joaquim Filipe. Proceedings of the 8th International Conference on Agents and Artificial Intelligence. Rome: SCITEPRESS, 2016. p. 540-545. ISBN 978-989-758-172-4. [cited 2024-04-24]

Basic information
Original name	On Evaluation of Natural Language Processing Tasks: Is Gold Standard Evaluation Methodology a Good Solution?
Name in Czech	K evaluaci úkolů zpracování přirozeného jazyka: je metodologie používající "gold standardy" dobrým řešením?
Authors	KOVÁŘ, Vojtěch (203 Czech Republic, guarantor, belonging to the institution), Miloš JAKUBÍČEK (203 Czech Republic, belonging to the institution) and Aleš HORÁK (203 Czech Republic, belonging to the institution)
Edition	Rome, Proceedings of the 8th International Conference on Agents and Artificial Intelligence, p. 540-545, 6 pp. 2016.
Publisher	SCITEPRESS

Other information
Original language	English
Type of outcome	Proceedings paper
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Italy
Confidentiality degree	is not subject to a state or trade secret
Publication form	storage medium (CD, DVD, flash disk)
RIV identification code	RIV/00216224:14330/16:00087757
Organization unit	Faculty of Informatics
ISBN	978-989-758-172-4
Keywords (in Czech)	zpracování přirozeného jazyka; aplikace; vyhodnocování; evaluace
Keywords in English	Natural Language Processing; Applications; Evaluation
Tags	firank_B
Tags	International impact, Reviewed
Changed by	RNDr. Vojtěch Kovář, Ph.D., učo 139915. Changed: 7/3/2016 17:20.

Abstract

The paper discusses problems with the state-of-the-art evaluation methods used in natural language processing (NLP). Usually, some form of gold standard data is used to evaluate NLP tasks ranging from morphological annotation to semantic analysis. We discuss the problems and validity of this type of evaluation for various tasks and illustrate the problems with examples. We then propose using application-driven evaluations wherever possible. Although such evaluations are more expensive, more complicated and less precise, they are the only way to find out whether a particular tool is useful at all.

Abstract (translated from Czech)
The paper deals with problems in evaluation methodology in the field of natural language processing (NLP). In most cases, so-called "gold standard" data are used for such evaluation. We discuss the problems and validity of this approach and propose an application-oriented alternative.

Links
GA15-13277S, research and development project	Name: Hyperintensionální logika pro analýzu přirozeného jazyka
GA15-13277S, research and development project	Investor: Czech Science Foundation
7F14047, research and development project	Name: Harvesting big text data for under-resourced languages (Acronym: HaBiT)
7F14047, research and development project	Investor: Ministry of Education, Youth and Sports of the CR
