SemEval-2015 Task 15: A CPA dictionary-entry-building task

BAISA, Vít, Jane BRADBURY, Silvie CINKOVÁ, Ismaïl EL MAAROUF, Adam KILGARRIFF and Octavian POPESCU. SemEval-2015 Task 15: A CPA dictionary-entry-building task. Online. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). Denver, Colorado: Association for Computational Linguistics, 2015, p. 315-324. ISBN 978-1-941643-40-2.


Basic information
Original name	SemEval-2015 Task 15: A CPA dictionary-entry-building task
Authors	BAISA, Vít (203 Czech Republic, belonging to the institution), Jane BRADBURY (826 United Kingdom of Great Britain and Northern Ireland), Silvie CINKOVÁ (203 Czech Republic), Ismaïl EL MAAROUF (250 France, guarantor), Adam KILGARRIFF (826 United Kingdom of Great Britain and Northern Ireland) and Octavian POPESCU (642 Romania).
Edition	Denver, Colorado, Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), p. 315-324, 10 pp. 2015.
Publisher	Association for Computational Linguistics

Other information
Original language	English
Type of outcome	Proceedings paper
Field of Study	10201 Computer sciences, information science, bioinformatics
Country of publisher	United States of America
Confidentiality degree	is not subject to a state or trade secret
Publication form	electronic version available online
WWW	URL
RIV identification code	RIV/00216224:14330/15:00083584
Organization unit	Faculty of Informatics
ISBN	978-1-941643-40-2
Keywords in English	semeval; corpus pattern analysis; concordance clustering; semantic evaluation
Tags	firank_B
Changed by	Mgr. et Mgr. Vít Baisa, Ph.D., učo 139654. Changed: 11/5/2017 07:43.

Abstract

This paper describes the first SemEval task to explore the use of Natural Language Processing systems for building dictionary entries, in the framework of Corpus Pattern Analysis (CPA). CPA is a corpus-driven technique which provides tools and resources to identify and represent unambiguously the main semantic patterns in which words are used. Task 15 draws on the Pattern Dictionary of English Verbs (www.pdev.org.uk) for the targeted lexical entries, and on the British National Corpus for the input text. Dictionary entry building is split into three subtasks which all start from the same concordance sample: 1) CPA parsing, where arguments and their syntactic and semantic categories have to be identified; 2) CPA clustering, in which sentences with similar patterns have to be clustered; and 3) CPA automatic lexicography, where the structure of patterns has to be constructed automatically. Subtask 1 attracted 3 teams, though none could beat the baseline (a rule-based system). Subtask 2 attracted 2 teams, one of which beat the baseline (a majority-class classifier). Subtask 3 did not attract any participants. The task has produced a major semantic multidataset resource which includes data for 121 verbs and about 17,000 annotated sentences, and which is freely accessible.

Links
LM2010013, research and development project	Name: LINDAT-CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat [Institute for Analysis, Processing and Distribution of Linguistic Data] (Acronym: LINDAT-Clarin)
LM2010013, research and development project	Investor: Ministry of Education, Youth and Sports of the CR
7F14047, research and development project	Name: Harvesting big text data for under-resourced languages (Acronym: HaBiT)
7F14047, research and development project	Investor: Ministry of Education, Youth and Sports of the CR
