SemEval-2015 Task 15: A CPA dictionary-entry-building task

BAISA, Vít, Jane BRADBURY, Silvie CINKOVÁ, Ismaïl EL MAAROUF, Adam KILGARRIFF and Octavian POPESCU. SemEval-2015 Task 15: A CPA dictionary-entry-building task. Online. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). Denver, Colorado: Association for Computational Linguistics, 2015, pp. 315-324. ISBN 978-1-941643-40-2.

Basic information
Original title	SemEval-2015 Task 15: A CPA dictionary-entry-building task
Authors	BAISA, Vít (203 Czech Republic, belonging to the institution), Jane BRADBURY (826 United Kingdom of Great Britain and Northern Ireland), Silvie CINKOVÁ (203 Czech Republic), Ismaïl EL MAAROUF (250 France, guarantor), Adam KILGARRIFF (826 United Kingdom of Great Britain and Northern Ireland) and Octavian POPESCU (642 Romania).
Edition	Denver, Colorado, Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pp. 315-324, 10 pp., 2015.
Publisher	Association for Computational Linguistics

Other information
Original language	English
Type of result	Proceedings paper
Field	10201 Computer sciences, information science, bioinformatics
Publisher country	United States
Confidentiality	not subject to a state or trade secret
Publication form	electronic version "online"
RIV code	RIV/00216224:14330/15:00083584
Organizational unit	Faculty of Informatics
ISBN	978-1-941643-40-2
Keywords in English	semeval; corpus pattern analysis; concordance clustering; semantic evaluation
Tags	firank_B
Changed by	Mgr. et Mgr. Vít Baisa, Ph.D., učo 139654. Changed: 11. 5. 2017 07:43.

Abstract

This paper describes the first SemEval task to explore the use of Natural Language Processing systems for building dictionary entries, in the framework of Corpus Pattern Analysis (CPA). CPA is a corpus-driven technique which provides tools and resources to identify and represent unambiguously the main semantic patterns in which words are used. Task 15 draws on the Pattern Dictionary of English Verbs (www.pdev.org.uk) for the targeted lexical entries, and on the British National Corpus for the input text. Dictionary entry building is split into three subtasks which all start from the same concordance sample: 1) CPA parsing, where arguments and their syntactic and semantic categories have to be identified; 2) CPA clustering, in which sentences with similar patterns have to be clustered; and 3) CPA automatic lexicography, where the structure of patterns has to be constructed automatically. Subtask 1 attracted 3 teams, though none could beat the baseline (a rule-based system). Subtask 2 attracted 2 teams, one of which beat the baseline (a majority-class classifier). Subtask 3 did not attract any participants. The task has produced a major semantic multidataset resource which includes data for 121 verbs and about 17,000 annotated sentences, and which is freely accessible.

Links
LM2010013, research and development project	Name: LINDAT-CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data (Acronym: LINDAT-Clarin)
LM2010013, research and development project	Investor: Ministry of Education, Youth and Sports of the Czech Republic, LINDAT-Clarin Project - Building and operating the Czech node of the pan-European research infrastructure
7F14047, research and development project	Name: Harvesting big text data for under-resourced languages (Acronym: HaBiT)
7F14047, research and development project	Investor: Ministry of Education, Youth and Sports of the Czech Republic, Harvesting big text data for under-resourced languages
