DENISOVÁ, Michaela and Pavel RYCHLÝ. Bilingual Lexicon Induction From Comparable and Parallel Data: A Comparative Analysis. Online. In Nöth, E., Horák, A., Sojka, P. International Conference on Text, Speech, and Dialogue. Cham: Springer Nature Switzerland, 2024, pp. 30-42, 12 pp. ISBN 978-3-031-70563-2. Available from: https://dx.doi.org/10.1007/978-3-031-70563-2_3.
Other formats:
BibTeX
LaTeX
RIS
@inproceedings{2426237,
  author = {Denisová, Michaela and Rychlý, Pavel},
  address = {Cham},
  booktitle = {International Conference on Text, Speech, and Dialogue},
  doi = {10.1007/978-3-031-70563-2_3},
  editor = {Nöth, E. and Horák, A. and Sojka, P.},
  keywords = {bilingual lexicon induction; cross-lingual word embeddings; neural machine translation systems},
  howpublished = {electronic version "online"},
  language = {eng},
  location = {Cham},
  isbn = {978-3-031-70563-2},
  pages = {30--42},
  publisher = {Springer Nature Switzerland},
  title = {Bilingual Lexicon Induction From Comparable and Parallel Data: A Comparative Analysis},
  url = {https://tsdconference.org/tsd2024/download/preprints/1195.pdf},
  year = {2024}
}
TY  - CONF
ID  - 2426237
AU  - Denisová, Michaela
AU  - Rychlý, Pavel
PY  - 2024
TI  - Bilingual Lexicon Induction From Comparable and Parallel Data: A Comparative Analysis
PB  - Springer Nature Switzerland
CY  - Cham
SN  - 9783031705632
KW  - bilingual lexicon induction
KW  - cross-lingual word embeddings
KW  - neural machine translation systems
UR  - https://tsdconference.org/tsd2024/download/preprints/1195.pdf
N2  - Bilingual lexicon induction (BLI) from comparable data has become a common way of evaluating cross-lingual word embeddings (CWEs). These models have drawn much attention, mainly due to their availability for rare and low-resource language pairs. An alternative offers systems exploiting parallel data, such as popular neural machine translation systems (NMTSs), which are effective and yield state-of-the-art results. Despite the significant advancements in NMTSs, their effectiveness in the BLI task compared to the models using comparable data remains underexplored. In this paper, we provide a comparative study of the NMTS and CWE models evaluated on the BLI task and demonstrate the results across three diverse language pairs: distant (Estonian-English) and close (Estonian-Finnish) language pair and language pair with different scripts (Estonian-Russian). Our study reveals the differences, strengths, and limitations of both approaches. We show that while NMTSs achieve impressive results for languages with a great amount of training data available, CWEs emerge as a better option when faced less resources.
ER  -
DENISOVÁ, Michaela and Pavel RYCHLÝ. Bilingual Lexicon Induction From Comparable and Parallel Data: A Comparative Analysis. Online. In Nöth, E., Horák, A., Sojka, P. \textit{International Conference on Text, Speech, and Dialogue}. Cham: Springer Nature Switzerland, 2024, pp.~30--42, 12 pp. ISBN~978-3-031-70563-2. Available from: https://dx.doi.org/10.1007/978-3-031-70563-2\_{}3.