D 2016

VPS-GradeUp: Graded Decisions on Usage Patterns

BAISA, Vít, Silvie CINKOVA, Ema KREJČOVÁ and Anna VERNEROVÁ

Basic information

Original name

VPS-GradeUp: Graded Decisions on Usage Patterns

Authors

BAISA, Vít (203 Czech Republic, belonging to the institution), Silvie CINKOVA (203 Czech Republic), Ema KREJČOVÁ (203 Czech Republic) and Anna VERNEROVÁ (203 Czech Republic)

Edition

Portorož, Slovenia, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), p. 823-827, 5 pp. 2016

Publisher

European Language Resources Association (ELRA)

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Slovenia

Confidentiality degree

is not subject to a state or trade secret

Publication form

electronic version available online

RIV identification code

RIV/00216224:14330/16:00090124

Organization unit

Faculty of Informatics

ISBN

978-2-9517408-9-1

Keywords in English

Corpus Creation; Corpus Annotation; Word Sense Disambiguation; Validation of Language Resources

Tags

International impact, Reviewed
Changed: 7/6/2016 16:46, Mgr. et Mgr. Vít Baisa, Ph.D.

Abstract

In the original

We present VPS-GradeUp, a set of 11,400 graded human decisions on usage patterns of 29 English lexical verbs from the Pattern Dictionary of English Verbs (PDEV) by Patrick Hanks. For each verb lemma, the annotation contains a batch of 50 concordances with the given lemma as the KWIC (keyword in context). For each of these concordances, we provide graded human decisions on how well each PDEV pattern for that lemma illustrates the concordance, rated on a 7-point Likert scale. With this annotation, we pursued a pilot investigation of the foundations of human clustering and disambiguation decisions with respect to usage patterns of verbs in context. The data set is publicly available at http://hdl.handle.net/11234/1-1585.
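
The structure described in the abstract (lemma, concordance, pattern, graded score) can be pictured with a small sketch. This is an illustration only and does not reflect the file format of the released data set: the class name, the field names, the `annotator` field, the coding of the Likert scale as 1..7, and the toy records are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class GradedDecision:
    """One graded judgment; all field names are hypothetical."""
    lemma: str            # verb lemma shown as the KWIC
    concordance_id: int   # index of the concordance within the lemma's batch
    pattern_id: int       # number of the PDEV pattern being judged
    annotator: str        # hypothetical annotator label
    score: int            # 7-point Likert rating, assumed to be coded 1..7

    def __post_init__(self):
        # Reject ratings outside the assumed 1..7 coding.
        if not 1 <= self.score <= 7:
            raise ValueError(f"score must be in 1..7, got {self.score}")


def mean_score_per_pattern(decisions):
    """Average the ratings for each (lemma, concordance, pattern) triple."""
    grouped = defaultdict(list)
    for d in decisions:
        grouped[(d.lemma, d.concordance_id, d.pattern_id)].append(d.score)
    return {key: mean(scores) for key, scores in grouped.items()}


def best_pattern(decisions, lemma, concordance_id):
    """Return the pattern with the highest mean rating for one concordance."""
    means = mean_score_per_pattern(decisions)
    candidates = {
        pattern: value
        for (lem, conc, pattern), value in means.items()
        if lem == lemma and conc == concordance_id
    }
    return max(candidates, key=candidates.get)


# Toy records, invented for illustration; not taken from the data set.
toy = [
    GradedDecision("example_verb", 1, 1, "A", 7),
    GradedDecision("example_verb", 1, 1, "B", 6),
    GradedDecision("example_verb", 1, 2, "A", 2),
    GradedDecision("example_verb", 1, 2, "B", 3),
]

if __name__ == "__main__":
    print(mean_score_per_pattern(toy))
    print(best_pattern(toy, "example_verb", 1))
```

Because every pattern of a lemma receives its own score for each concordance, a graded representation like this keeps the information that a hard one-pattern-per-concordance label would discard. By the figures in the abstract, 29 verbs with 50 concordances each give 1,450 annotated concordances.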

Links

LM2015071, research and development project
Name: Language Research Infrastructure in the Czech Republic (Acronym: LINDAT-Clarin)
Investor: Ministry of Education, Youth and Sports of the CR
7F14047, research and development project
Name: Harvesting big text data for under-resourced languages (Acronym: HaBiT)
Investor: Ministry of Education, Youth and Sports of the CR