KOTZÉ, Gideon, Vincent VANDEGHINSTE, Scott MARTENS and Jörg TIEDEMANN. Large aligned treebanks for syntax-based machine translation. LANGUAGE RESOURCES AND EVALUATION. NETHERLANDS: SPRINGER, 2017, Vol. 51(2), p. 249-282. ISSN 1574-020X. Available from: https://dx.doi.org/10.1007/s10579-016-9369-0.
Basic information
Original name Large aligned treebanks for syntax-based machine translation
Authors KOTZÉ, Gideon, Vincent VANDEGHINSTE, Scott MARTENS and Jörg TIEDEMANN.
Edition LANGUAGE RESOURCES AND EVALUATION, NETHERLANDS, SPRINGER, 2017, 1574-020X.
Other information
Type of outcome Article in a journal
Confidentiality degree is not subject to a state or trade secret
Impact factor 0.656
Doi http://dx.doi.org/10.1007/s10579-016-9369-0
Keywords in English parallel treebank, parallel corpus, machine translation, syntax-based machine translation, constituent alignment, tree alignment, resource development
Tags constituent alignment, machine translation, parallel corpus, parallel treebank, Pouze vystaveno, resource development, syntax-based machine translation, tree alignment
Tags International impact, Reviewed
Changed by Gideon Kotzé, PhD, učo 247652. Changed: 1/11/2022 12:11.
Abstract
We present a collection of parallel treebanks that have been automatically aligned on both the terminal and the non-terminal constituent level for use in syntax-based machine translation. We describe how they were constructed and applied to a syntax- and example-based machine translation system called Parse and Corpus-Based Machine Translation (PaCo-MT). For the language pair Dutch to English, we present non-terminal alignment evaluation scores for a variety of tree alignment approaches. Finally, based on the parallel treebanks created by these approaches, we evaluate the MT system itself and compare the scores with those of Moses, a current state-of-the-art statistical MT system, when trained on the same data.