D 2008

GDEX: Automatically finding good dictionary examples in a corpus

RYCHLÝ, Pavel; Miloš HUSÁK; Adam KILGARRIFF; Michael RUNDELL; Katy MCADAM et al.

Basic information

Original name

GDEX: Automatically finding good dictionary examples in a corpus

Name in Czech

GDEX: Automatické vyhledávání dobrých slovníkových příkladů v korpusu

Authors

RYCHLÝ, Pavel (203 Czech Republic, guarantor); Miloš HUSÁK (203 Czech Republic); Adam KILGARRIFF (826 United Kingdom of Great Britain and Northern Ireland); Michael RUNDELL (826 United Kingdom of Great Britain and Northern Ireland) and Katy MCADAM (826 United Kingdom of Great Britain and Northern Ireland)

Edition

1st ed. Barcelona, Proceedings of the XIII EURALEX International Congress, p. 425-432, 7 pp. 2008

Publisher

Institut Universitari de Lingüística Aplicada

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Spain

Confidentiality degree

is not subject to a state or trade secret

RIV identification code

RIV/00216224:14330/08:00024233

Organization unit

Faculty of Informatics

ISBN

9788496742673

Keywords in English

good dictionary examples; evaluation of sentence informativeness and readability

Tags

International impact, Reviewed
Changed: 8/1/2009 09:50, doc. Mgr. Pavel Rychlý, Ph.D.

Abstract

In the original

Users appreciate examples. If a dictionary entry includes contextualized examples of the different senses a word may have, then the user generally gets what they want in a quick and straightforward way. Thus, there are grounds for including lots of examples and contexts. Producing good examples, however, can be labour-intensive and therefore expensive. We automatically found good candidate sentences in a corpus for lexicographers to work with. The technology was used to add examples to an online version of a leading dictionary; we describe and evaluate the project. We also consider a range of other ways in which finding good examples can bridge the gap between corpora, dictionaries, and language learning.

In Czech (English translation)

Examples of the usage of words or phrases in dictionary entries are a valuable source of information. Creating suitable examples, however, is demanding manual work. The article describes an algorithm for automatically identifying corpus sentences that can serve as good usage examples for dictionary entries.

Links

LC536, research and development project
Name: Centrum komputační lingvistiky
Investor: Ministry of Education, Youth and Sports of the CR, Centrum komputační lingvistiky
1ET100300419, research and development project
Name: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu
Investor: Academy of Sciences of the Czech Republic, Intelligent Models, Algorithms, Methods and Tools for the Semantic Web (realization)
1ET200610406, research and development project
Name: Jazyková poradna na internetu
Investor: Academy of Sciences of the Czech Republic, Internet Language Consulting Service
2C06009, research and development project
Name: Prostředky tvorby komplexní báze znalostí pro komunikaci se sémantickým webem v přirozeném jazyce (Acronym: COT-SEWing)
Investor: Ministry of Education, Youth and Sports of the CR