Final thesis: Bc. Martin Geletka, učo 456576: Speeding up inference time of neural machine translation
Master's thesis
Speeding up inference time of neural machine translation
Annotation
Thanks to so-called Transformer models, significant advances have recently been made in machine translation tasks. In practice, however, these models suffer from high latency, so they are often unusable in practical applications. This thesis studies the reasons for this high latency and summarizes, applies, and compares machine learning techniques that aim to shorten the inference time of Transformer models used in …
Abstract
Large qualitative gains were recently made in machine translation tasks thanks to Transformer models. In practice, however, these models suffer from high latency, which often makes them hardly usable in practical applications. This thesis studies the reasons behind this high latency and overviews, applies, and compares techniques that try to decrease the inference time of the Transformer …
Thesis assignment
Large qualitative gains were recently observed in Natural Language Processing (NLP) tasks thanks to huge so-called Transformer models with hundreds of millions of parameters. In practice, however, these models suffer from high latency, which often makes them hardly usable in practical applications. Specifically, in Neural Machine Translation (NMT), classical approaches iteratively condition each output word on the previously generated outputs, so the prediction time accumulates with the length of the output. This thesis aims to give an overview of techniques that avoid this property and produce their results in parallel, allowing an order of magnitude lower latency at inference.
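The contrast between iterative and parallel decoding described above can be illustrated with a minimal sketch. It is not the method used in the thesis: the "models" are toy stand-ins that simply uppercase the source tokens, and all names (`toy_step`, `toy_predict_all`, `autoregressive_decode`, `parallel_decode`) are hypothetical. The sketch only counts how many sequential model calls each decoding style needs.

```python
from typing import Callable, List, Tuple

EOS = "</s>"  # assumed end-of-sequence marker for this toy example


def toy_step(source: List[str], prefix: List[str]) -> str:
    """Toy stand-in for one autoregressive model call:
    predicts the next token given the source and the tokens produced so far."""
    if len(prefix) < len(source):
        return source[len(prefix)].upper()
    return EOS


def toy_predict_all(source: List[str]) -> List[str]:
    """Toy stand-in for a model that predicts all output tokens in one call."""
    return [token.upper() for token in source]


def autoregressive_decode(
    step: Callable[[List[str], List[str]], str],
    source: List[str],
    max_len: int,
) -> Tuple[List[str], int]:
    """Greedy decoding: each token is conditioned on the previous outputs,
    so the number of sequential model calls grows with the output length."""
    output: List[str] = []
    calls = 0
    for _ in range(max_len):
        token = step(source, output)
        calls += 1
        if token == EOS:
            break
        output.append(token)
    return output, calls


def parallel_decode(
    predict_all: Callable[[List[str]], List[str]],
    source: List[str],
) -> Tuple[List[str], int]:
    """Parallel decoding: the whole output is produced by a single call."""
    return predict_all(source), 1


source_tokens = ["a", "b", "c"]
ar_output, ar_calls = autoregressive_decode(toy_step, source_tokens, max_len=10)
par_output, par_calls = parallel_decode(toy_predict_all, source_tokens)
print(ar_output, ar_calls)
print(par_output, par_calls)
```

In this toy setting both decoders return the same tokens, but the autoregressive loop needs one model call per output token plus one for the end-of-sequence marker, while the parallel decoder needs a single call. Real parallel decoding methods usually trade some translation quality for this reduction.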
In this thesis, the student will:
- understand and describe the process of neural machine translation using transformer models
- train different standard (auto-regressive) transformer models for NMT
- measure the performance and inference times of these models
- study the relevant available options for speeding up inference and apply them to NMT
- evaluate the impact of these options on both translation quality and inference time in an NMT task
28. 1. 2022 11:13, doc. RNDr. Petr Sojka, Ph.D., učo 2378
Consultant
Theses on a related topic
List of theses that share the same keywords.
- One Bit at a Time: Impact of Quantisation on Neural Machine Translation
  Mgr. Marek Petrovič
- Transformer Neural Networks for Natural Language Processing
  Jonáš Konečný
- Translating Medical Texts using Neural Machine Translation
  Mgr. Magdalena Panská
- Pretraining and Evaluation of Czech ALBERT Language Model
  RNDr. Petr Zelina, učo 469366
- Analyse comparative des traductions automatiques des textes de différents styles fonctionnels du français vers le tchèque
  Mgr. et Mgr. Lenka Koňaříková
- Translating Science and Technology: Expert and Popular Science Texts in English-to-Czech Translation using NMT
  Mgr. Petr Zahradník
- Automatic text summarization
  Mgr. Adam Hájek
- Utilisation of language representations for Information Retrieval
  Ing. Petr Mička




