Detailed Information on Publication Record
2019
Automating dictionary production: a Tagalog-English-Korean dictionary from scratch
BAISA, Vít, Marek BLAHUŠ, Michal CUKR, Ondřej HERMAN, Miloš JAKUBÍČEK et al.
Basic information
Original name
Automating dictionary production: a Tagalog-English-Korean dictionary from scratch
Authors
BAISA, Vít (203 Czech Republic, belonging to the institution), Marek BLAHUŠ (203 Czech Republic), Michal CUKR (203 Czech Republic), Ondřej HERMAN (203 Czech Republic, belonging to the institution), Miloš JAKUBÍČEK (203 Czech Republic, belonging to the institution), Vojtěch KOVÁŘ (203 Czech Republic, belonging to the institution), Marek MEDVEĎ (703 Slovakia, belonging to the institution), Michal MĚCHURA (203 Czech Republic, belonging to the institution), Pavel RYCHLÝ (203 Czech Republic, belonging to the institution) and Vít SUCHOMEL (203 Czech Republic, belonging to the institution)
Edition
Brno, Czech Republic, Proceedings of the 6th Biennial Conference on Electronic Lexicography, p. 805-818, 14 pp. 2019
Publisher
Lexical Computing CZ s.r.o.
Other information
Language
English
Type of outcome
Proceedings paper
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
Publication form
electronic version available online
References:
RIV identification code
RIV/00216224:14330/19:00107599
Organization unit
Faculty of Informatics
ISSN
Keywords in English
Sketch Engine; Lexonomy; post-editing lexicography; dictionary; corpus; Tagalog; Filipino; English; Korean
Tags
International impact, Reviewed
Changed: 22/10/2023 01:49, RNDr. Miloš Jakubíček, Ph.D.
Abstract
In the original
In this paper we present lexicographic work on a Tagalog-English-Korean dictionary. The dictionary is created entirely from scratch and all of its content (besides audio pronunciation) is initially generated fully automatically from a large web corpus that we built for these purposes, and then post-edited by human editors. The full size of the dictionary is 45,000 entries, out of which 15,000 most frequent entries are manually post-edited, while the remaining 30,000 entries are left only as automated. The project is currently ongoing and will be finished in December 2019. The dictionary will be part of the online platform run by the Naver Corporation and freely available.
Links
GA18-23891S, research and development project
LM2015071, research and development project