Graded and Word-Sense-Disambiguation Decisions in Corpus
Pattern Analysis: a Pilot Study

CINKOVA, Silvie, Ema KREJČOVÁ, Anna VERNEROVÁ and Vít BAISA. Graded and Word-Sense-Disambiguation Decisions in Corpus Pattern Analysis: a Pilot Study. Online. In Nicoletta Calzolari (Conference Chair), Khalid Choukri, Thierry Declerck, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk and Stelios Piperidis. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). Portorož, Slovenia: European Language Resources Association (ELRA), 2016, p. 848-854. ISBN 978-2-9517408-9-1.


Basic information
Original name	Graded and Word-Sense-Disambiguation Decisions in Corpus Pattern Analysis: a Pilot Study
Authors	CINKOVA, Silvie (203 Czech Republic), Ema KREJČOVÁ (203 Czech Republic), Anna VERNEROVÁ (203 Czech Republic) and Vít BAISA (203 Czech Republic, belonging to the institution).
Edition	Portorož, Slovenia, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), p. 848-854, 7 pp. 2016.
Publisher	European Language Resources Association (ELRA)

Other information
Original language	English
Type of outcome	Proceedings paper
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Slovenia
Confidentiality degree	is not subject to a state or trade secret
Publication form	electronic version available online
RIV identification code	RIV/00216224:14330/16:00090038
Organization unit	Faculty of Informatics
ISBN	978-2-9517408-9-1
Keywords in English	CPA; graded decisions; English; verbs; usage patterns; annotation; Likert scales
Tags	firank_B
Tags	International impact, Reviewed
Changed by	Mgr. et Mgr. Vít Baisa, Ph.D., učo 139654. Changed: 27/5/2016 13:35.

Abstract

We present a pilot analysis of a new linguistic resource, VPS-GradeUp (available at http://hdl.handle.net/11234/1-1585). The resource contains 11,400 graded human decisions on usage patterns of 29 English lexical verbs from the Pattern Dictionary of English Verbs (PDEV; Hanks, 2000–2014). The verbs were selected randomly, with the selection based on their frequency and the number of senses their lemmas have in PDEV. The data set was created to observe inter-annotator agreement on PDEV patterns produced using Corpus Pattern Analysis (CPA; Hanks, 2013). Apart from the graded decisions, the data set also contains traditional Word-Sense-Disambiguation (WSD) labels. We analyze the associations between the graded annotation and the WSD annotation. The results of the respective annotations do not correlate with the size of the usage pattern inventory for the respective verb lemmas, which makes the data set worth further linguistic analysis.

Links
LM2015071, research and development project	Name: Jazyková výzkumná infrastruktura v České republice (Acronym: LINDAT-Clarin)
LM2015071, research and development project	Investor: Ministry of Education, Youth and Sports of the CR
7F14047, research and development project	Name: Harvesting big text data for under-resourced languages (Acronym: HaBiT)
7F14047, research and development project	Investor: Ministry of Education, Youth and Sports of the CR
