Acquiring Data for Textual Entailment Recognition

NEVĚŘILOVÁ, Zuzana. Acquiring Data for Textual Entailment Recognition. In Seventh Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2013. Brno: Tribun EU, 2013, pp. 29-37. ISBN 978-80-263-0520-0.

Basic information
Original name	Acquiring Data for Textual Entailment Recognition
Authors	NEVĚŘILOVÁ, Zuzana (203 Czech Republic, guarantor, belonging to the institution).
Edition	Brno, Seventh Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2013, pp. 29-37, 9 pp. 2013.
Publisher	Tribun EU

Other information
Original language	English
Type of outcome	Proceedings paper
Field of study	10201 Computer sciences, information science, bioinformatics
Country of publisher	Czech Republic
Confidentiality	is not subject to a state or trade secret
Publication form	printed version "print"
RIV identification code	RIV/00216224:14330/13:00070350
Organization unit	Faculty of Informatics
ISBN	978-80-263-0520-0
Keywords in English	textual entailment; language game; games with a purpose; GWAP
Tags	International impact, Reviewed
Changed by	RNDr. Zuzana Nevěřilová, Ph.D., učo 3839. Changed: 27 May 2021 09:12.

Abstract

Language resources are hardly ever large enough. Building language resources that can serve as a gold standard for semantic analysis requires effort and investment. We present a prototype for acquiring language resources by means of a language game, a cheap but long-term method. Using games to acquire language resources is not new: for example, games with a purpose are used to collect common-sense knowledge. The game presented in this paper is a work in progress. It collects annotated text–hypothesis pairs suitable for recognizing textual entailment in Czech. The game narrative is based on dialogues between Sherlock Holmes and Dr. Watson. To generate the dialogue lines, we use rule-based approaches such as syntactic analysis, anaphora resolution, synonym and hypernym replacement, word order rearrangement, and verb-frame-based inference. To generate natural-sounding sentences, we added a language model score (based on n-gram frequencies in a corpus).
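
The final step of the abstract, ranking generated sentences by an n-gram language model score, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the n-gram order, the corpus, or the smoothing, so the bigram order, the add-one smoothing, the toy English corpus (standing in for a Czech one), and all function names below are assumptions made for illustration only.

```python
import math
from collections import Counter


def train_bigram_counts(corpus_sentences):
    """Count unigrams and bigrams in a whitespace-tokenized corpus (hypothetical helper)."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus_sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_log_score(sentence, unigrams, bigrams):
    """Average add-one-smoothed bigram log-probability per token transition.

    Add-one smoothing is an assumption; the abstract only says the score
    is based on n-gram frequencies in a corpus.
    """
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    vocab_size = len(unigrams)
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total / (len(tokens) - 1)


def rank_candidates(candidates, unigrams, bigrams):
    """Order candidate sentences from most to least natural-sounding under the model."""
    return sorted(
        candidates,
        key=lambda s: bigram_log_score(s, unigrams, bigrams),
        reverse=True,
    )


# Toy corpus and candidates, invented for this example.
toy_corpus = [
    "the man opened the door",
    "the woman closed the door",
    "the man closed the window",
]
unigrams, bigrams = train_bigram_counts(toy_corpus)
candidates = ["door the closed man the", "the man closed the door"]
for sentence in rank_candidates(candidates, unigrams, bigrams):
    print(round(bigram_log_score(sentence, unigrams, bigrams), 3), sentence)
```

In a setting like the one described, the candidates would come from the rule-based steps (synonym and hypernym replacement, word order rearrangement), and the score would serve to filter out rearrangements that read unnaturally.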

Links
LM2010013, research and development project	Name: LINDAT-CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data (Acronym: LINDAT-Clarin)
LM2010013, research and development project	Investor: Ministry of Education, Youth and Sports of the Czech Republic, LINDAT-Clarin project: establishment and operation of the Czech node of the pan-European infrastructure for research
