Current Challenges
Philipp Koehn 3 November 2022
Philipp Koehn
Machine Translation: Current Challenges
3 November 2022
WMT 2016
[Figure: human evaluation score plotted against BLEU (x-axis from 18 to 36) for WMT 2016 systems, labelled as Neural MT or Statistical MT. Systems shown: uedin-nmt, metamind, uedin-syntax, NYU-UMONTREAL, ONLINE-B, PROMT-RULE-BASED, KIT-LIMSI, CAMBRIDGE, KIT, ONLINE-A, JHU-SYNTAX, JHU-PBMT, UEDIN-PBMT, ONLINE-F, ONLINE-G]
(in 2017 barely any statistical machine translation submissions)
2017: Google: "Near Human Quality"
[Figure: translation quality on a scale up to 6 (perfect translation) for human, neural (GNMT), and phrase-based (PBMT) translation; x-axis: translation model, for English > Spanish, English > French, English > Chinese, Spanish > English, French > English, Chinese > English]
2018: More Hype
Microsoft Research Achieves Human Parity For Chinese English Translation
Written by Sue Gee Wednesday, 21 March 2018
Researchers in Microsoft's labs in Beijing and in Redmond and Washington have developed an AI machine translation system that can translate with the same accuracy as a human from Chinese to English.
SDL Cracks Russian to English Neural Machine Translation
Global Enterprises to Capitalize on Near Perfect Russian to English Machine Translation as SDL Sets New Industry Standard
90% of the system's output labelled as perfect by professional Russian-English translators
Just Better Fluency?
[Figure: adequacy and fluency scores (y-axis from 60 to 100) of ONLINE-B and UEDIN-NMT for CS→EN, DE→EN, RO→EN, RU→EN. Adequacy: +1%. Fluency: +13%.]
(from: Sennrich and Haddow, 2017)
lack of training data
Amount of Training Data
[Figure: x-axis is corpus size (English words)]
English-Spanish systems trained on 0.4 million to 385.7 million words
Translation Examples
Source   A Republican strategy to counter the re-election of Obama
1/1024   Un órgano de coordinación para el anuncio de libre determinación
1/512    Lista de una estrategia para luchar contra la elección de hojas de Ohio
1/256    Explosion realiza una estrategia divisiva de luchar contra las elecciones de autor
1/128    Una estrategia republicana para la eliminación de la reelección de Obama
1/64     Estrategia siria para contrarrestar la reelección del Obama .
1/32     Una estrategia republicana para contrarrestar la reelección de Obama
domain mismatch
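Domain Mismatch
BLEU by training domain (rows) and test domain (columns); each cell gives NMT / SMT

System    | Law         | Medical     | IT          | Koran       | Subtitles
All Data  | 30.5 / 32.8 | 45.1 / 42.2 | 35.3 / 44.7 | 17.9 / 17.9 | 26.4 / 20.8
Law       | 31.1 / 34.4 | 12.1 / 18.2 |  3.5 /  6.9 |  1.3 /  2.2 |  2.8 /  6.0
Medical   |  3.9 / 10.2 | 39.4 / 43.5 |  2.0 /  8.5 |  0.6 /  2.0 |  1.4 /  5.8
IT        |  1.9 /  3.7 |  6.5 /  5.3 | 42.1 / 39.8 |  1.8 /  1.6 |  3.9 /  4.7
Koran     |  0.4 /  1.8 |  0.0 /  2.1 |  0.0 /  2.3 | 15.9 / 18.8 |  1.0 /  5.5
Subtitles |  7.0 /  9.9 |  9.3 / 17.8 |  9.2 / 13.6 |  9.0 /  8.4 | 25.9 / 22.1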
Translation Examples
Source     Schaue um dich herum.
Ref.       Look around you.
All        NMT: Look around you.
           SMT: Look around you.
Law        NMT: Sughum gravecorn.
           SMT: In order to implement dich Schaue .
Medical    NMT: EMEA / MB / 049 / 01-EN-Final Work progamme for 2002
           SMT: Schaue by dich around .
IT         NMT: Switches to paused.
           SMT: To Schaue by itself .
Koran      NMT: Take heed of your own souls.
           SMT: And you see.
Subtitles  NMT: Look around you.
           SMT: Look around you .
rare words
Rare Words
• More frequent in training → more likely to get right in test
• Let's measure this
• One problem
— frequency measured for input words
— translation correctness measured for output words
Translation Accuracy for Input Words
• Generate word alignment between input and output words
• Look up count of input word in training
• Link to output word via word alignment
• Check if it is also in the reference translation
• A lot of tedious special cases
— one-to-many alignment, only some output words in reference
— input word not aligned to any target word
— many-to-one alignment
— output word occurs multiple times in output or reference sentence
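The procedure above can be sketched as follows. This is a minimal illustration, not the method used for the results in these slides: the function name, the data layout, and the handling of the special cases (clipped matching for repeated words, partial credit for one-to-many links, zero credit for unaligned input words) are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def accuracy_by_source_count(examples, train_counts):
    """Hypothetical helper: bucket translation accuracy by training count.

    examples: iterable of (src_tokens, out_tokens, ref_tokens, alignment),
              alignment being a list of (src_index, out_index) pairs.
    train_counts: dict mapping an input word to its training frequency.
    Returns {training_count: (credit_sum, n_input_words)}.
    """
    stats = defaultdict(lambda: [0.0, 0])
    for src, out, ref, alignment in examples:
        # Mark which output positions are confirmed by the reference.
        # The Counter clips credit when a word repeats in output or reference.
        budget = Counter(ref)
        matched = []
        for word in out:
            ok = budget[word] > 0
            if ok:
                budget[word] -= 1
            matched.append(ok)
        # Collect the output positions linked to each input position.
        links = defaultdict(list)
        for s, o in alignment:
            links[s].append(o)
        for s, word in enumerate(src):
            aligned = links.get(s, [])
            if aligned:
                # One-to-many: fraction of linked output words in the reference.
                # Many-to-one: each input word gets the shared word's credit.
                credit = sum(matched[o] for o in aligned) / len(aligned)
            else:
                credit = 0.0  # assumption: unaligned input words count as wrong
            bucket = stats[train_counts.get(word, 0)]
            bucket[0] += credit
            bucket[1] += 1
    return {count: (c, n) for count, (c, n) in stats.items()}

example = (
    ["das", "Haus", "ist", "klein"],
    ["the", "house", "is", "tiny"],
    ["the", "house", "is", "small"],
    [(0, 0), (1, 1), (2, 2), (3, 3)],
)
result = accuracy_by_source_count([example], {"das": 100, "Haus": 5, "ist": 100})
```

Dividing the credit sum by the word count in each bucket gives the accuracy for input words of that training frequency.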
Count vs. Accuracy
word alignment
Word Alignment
[Figure: word alignment matrix with attention weights in percent; the German side reads "die Beziehungen zwischen Obama und Netanjahu sind seit Jahren angespannt ."]
Word Alignment?
[Figure: word alignment matrix with attention weights in percent; the English side reads "the relationship between Obama and Netanyahu has been stretched for years"]
beam search
[Figure: x-axis is beam size (1, 2, 4, 8, 12, 20, 30, 50, 100, 200, 500, 1,000)]
noisy data
Noise in Training Data
• Crawled parallel data from the web (very noisy)
                SMT            NMT
WMT17           24.0           27.2
+ Paracrawl     25.2 (+1.2)    17.3 (-9.9)
(German-English, 90m words each of WMT17 and Crawl data)

Raw crawl data added   5%            10%           20%           50%           100%
NMT                    27.4 (+0.2)   26.6 (-0.6)   24.7 (-2.5)   20.9 (-6.3)   17.3 (-9.9)
SMT                    24.2 (+0.2)   24.2 (+0.2)   24.4 (+0.4)   24.8 (+0.8)   25.2 (+1.2)
• Corpus cleaning methods [Xu and Koehn, EMNLP 2017] give improvements
Types of Noise
• Misaligned sentences
• Disfluent language (from MT, bad translations)
• Wrong language data (e.g., French in German-English corpus)
• Untranslated sentences
• Short segments (e.g., dictionaries)
• Mismatched domain
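Some of these noise types can be caught with simple surface heuristics. The sketch below is only an illustration under stated assumptions: the function name, the thresholds (`min_tokens`, `max_ratio`), and the label strings are invented for the example, and it does not cover wrong-language data, disfluent language, or domain mismatch, which would need a language identifier or a language model.

```python
def noise_flags(src, tgt, min_tokens=3, max_ratio=3.0):
    """Return heuristic noise labels for one sentence pair (hypothetical helper)."""
    s, t = src.split(), tgt.split()
    if not s or not t:
        return ["empty"]
    flags = []
    # Untranslated sentence: source copied verbatim to the target side.
    if src.strip().lower() == tgt.strip().lower():
        flags.append("untranslated")
    # Short segment, e.g. a dictionary entry.
    if len(s) < min_tokens or len(t) < min_tokens:
        flags.append("short_segment")
    # Very different lengths hint at a misaligned pair.
    ratio = max(len(s), len(t)) / min(len(s), len(t))
    if ratio > max_ratio:
        flags.append("length_mismatch")
    return flags

clean = noise_flags("Das Haus ist klein .", "The house is small .")
copied = noise_flags("Das ist ein Test .", "Das ist ein Test .")
lopsided = noise_flags("Ich habe heute einen langen Brief geschrieben .", "Yes .")
```

A pair with an empty flag list passes these checks; that does not mean it is a good translation.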
Mismatched Sentences
• Artificially created by randomly shuffling sentence order
• Added to existing parallel corpus in different amounts
        5%            10%           20%           50%           100%
NMT                                               26.1          25.3
SMT     24.0 (-0.0)   24.0 (-0.0)   23.9 (-0.1)   23.9 (-0.1)   23.4 (-0.6)
• Bigger impact on NMT (green, left) than SMT (blue, right)
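A minimal sketch of how such misaligned noise could be generated. The function name and arguments are hypothetical, and it assumes that the percentage is measured relative to the size of the clean corpus and that the noise is drawn from a separate pool of sentence pairs.

```python
import random

def add_misaligned_noise(clean_pairs, noise_pool, proportion, seed=0):
    """Append shuffled (misaligned) pairs to a clean corpus.

    proportion is relative to the clean corpus size (assumption),
    so 1.0 adds as many noisy pairs as there are clean ones.
    noise_pool must be a list with at least that many pairs.
    """
    rng = random.Random(seed)
    n_noise = int(round(proportion * len(clean_pairs)))
    sample = rng.sample(noise_pool, n_noise)
    sources = [s for s, _ in sample]
    targets = [t for _, t in sample]
    # Shuffling one side breaks the alignment; a few pairs may
    # stay aligned by chance.
    rng.shuffle(targets)
    return clean_pairs + list(zip(sources, targets))

clean = [(f"s{i}", f"t{i}") for i in range(10)]
pool = [(f"x{i}", f"y{i}") for i in range(20)]
noisy = add_misaligned_noise(clean, pool, 0.5)
```

Training one system per proportion (5%, 10%, and so on) and comparing against the clean baseline gives a table of the kind shown on this slide.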
Misordered Words
• Artificially created by randomly shuffling words in each sentence
                5%            10%           20%           50%           100%
Source  NMT                                               26.6 (-0.6)   25.5
        SMT     24.0 (-0.0)   23.6 (-0.4)   23.9 (-0.1)   23.6 (-0.4)   23.7
Target  NMT                                               26.7 (-0.5)   26.1 (-1.1)
        SMT     24.0 (-0.0)   24.0 (-0.0)   23.4 (-0.6)   23.2 (-0.8)   22.9 (-1.1)
• Similar impact on NMT and SMT, worse for source reshuffle
Untranslated Sentences
                5%             10%            20%            50%            100%
Source  NMT     17.6 (-9.8)    11.2 (-16.0)   5.6 (-21.6)    3.2 (-24.0)    3.2 (-24.0)
        SMT     23.8 (-0.2)    23.9 (-0.1)    23.8 (-0.2)    23.4 (-0.6)    21.1 (-2.9)
Target  NMT     27.2 (-0.0)    27.0 (-0.2)    26.7 (-0.5)    26.8 (-0.4)    26.9 (-0.3)
Wrong Language
                   5%            10%           20%           50%           100%
fr source  NMT     26.9 (-0.3)   26.8 (-0.4)   26.8 (-0.4)   26.8 (-0.4)   26.8 (-0.4)
           SMT     24.0 (-0.0)   23.9 (-0.1)   23.9 (-0.1)   23.9 (-0.1)   23.8 (-0.2)
fr target  NMT     26.7 (-0.5)   26.6 (-0.6)   26.7 (-0.5)   26.2 (-1.0)   25.0 (-2.2)
           SMT     24.0 (-0.0)   23.9 (-0.1)   23.8 (-0.2)   23.5 (-0.5)   23.4
• Surprisingly robust, maybe due to domain mismatch of French data
Short Sentences
                   5%            10%           20%           50%
1-2 words  NMT     27.1 (-0.1)   26.5 (-0.7)   26.7 (-0.5)
           SMT     24.1 (+0.1)   23.9 (-0.1)   23.8 (-0.2)
1-5 words  NMT     27.8 (+0.6)   27.6 (+0.4)                 26.6 (-0.6)
           SMT     24.2 (+0.2)   24.5 (+0.5)   24.5 (+0.5)   24.2 (+0.2)
• No harm done
control over output
Specifying Decoding Constraints
• Overriding the decisions of the decoder
• Why?
⇒ translations have to follow strict terminology
⇒ rule-based translation of dates, quantities, etc.
XML Schema
The router is a model Psy X500 Pro .
• The XML tags specify to the decoder that
- the word "router" is to be translated as "Router"
- "The router is" is to be translated before the rest
- the brand name "Psy X500 Pro" is to be translated as a unit
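To illustrate how a decoder front end might read such constraints, here is a small sketch that parses an annotated sentence into tokens plus constraint spans. The tag and attribute names (`term`, `translation`, `wall`, `zone`) are assumptions chosen for this example, not necessarily the markup of any particular decoder, and nested tags are not handled.

```python
import xml.etree.ElementTree as ET

def parse_constraints(markup):
    """Split annotated input into tokens and constraint spans.

    Returns (tokens, forced, walls, zones) where
    forced maps (start, end) token spans to a required translation,
    walls lists token positions that must not be reordered across,
    zones lists (start, end) spans to be translated as a unit.
    """
    root = ET.fromstring(f"<s>{markup}</s>")
    tokens = (root.text or "").split()
    forced, walls, zones = {}, [], []
    for el in root:
        start = len(tokens)
        tokens.extend((el.text or "").split())
        end = len(tokens)
        if el.tag == "term":        # hypothetical tag: fixed terminology
            forced[(start, end)] = el.get("translation")
        elif el.tag == "wall":      # hypothetical tag: reordering barrier
            walls.append(start)
        elif el.tag == "zone":      # hypothetical tag: translate as a unit
            zones.append((start, end))
        tokens.extend((el.tail or "").split())
    return tokens, forced, walls, zones

annotated = ('The <term translation="Router">router</term> is <wall/> '
             'a model <zone>Psy X500 Pro</zone> .')
tokens, forced, walls, zones = parse_constraints(annotated)
```

The decoder would then consult `forced`, `walls`, and `zones` during search instead of making these decisions freely.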
Formal Constraints
• Subtitles
— translation has to fit into space on screen (may have to be shortened)
— input and output broken up into lines
• Speech translation
— input often not well-formed
— real time translation: start while sentence is spoken
— subtitles: have to be readable in limited time
— dubbing: sync up with video of speaker's mouth movement
• Poetry
— meter
— rhyme
catastrophic errors
Catastrophic Errors
News | Science and Technology
Facebook apologises for rude mistranslation of Xi Jinping's name
Company blames technical glitch that 'caused incorrect translations' of Chinese leader's name from Burmese to English.
Facebook's auto translation AI fail leads to a nightmare for a Palestinian man
The AI feature had "Good morning" in Arabic wrongly translated as "attack them" in Hebrew.
By Gianluca Mezzofiore on October 24, 2017
Industry News • By Marion Marking on 3 Aug 2020
Thai Mistranslation Shows Risk of Auto-Translating Social Media Content
After a machine translation of a post from English into Thai about the King's birthday proved offensive to the Thai monarchy, Facebook Thailand said it was deactivating auto-translate on Facebook and Instagram, revamping machine translation (MT) quality, and offering the Thai people its "profound apology."
What are Catastrophic Errors?
• Generation of profanity
— first step: maintain list of offensive words for each language
— only eliminate these words, if the input did not include such words
— but: offensive language is not limited to specific words
• Generation of violent / inciting content
• Opposite meaning
• Mistranslation of names
⇒ All this is hard to detect
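The word-list approach for profanity can be sketched as below. It is a deliberately naive illustration: the function name and the placeholder word lists are assumptions, tokenization is plain whitespace splitting, and, as noted above, offensive language that does not use listed words will not be caught.

```python
def introduced_profanity(source, output, src_offensive, tgt_offensive):
    """Return offensive output words that have no offensive counterpart in the input.

    src_offensive / tgt_offensive: sets of lowercase offensive words per language.
    If the input itself contains offensive words, nothing is flagged,
    since the output may be a faithful translation.
    """
    src_words = set(source.lower().split())
    out_words = set(output.lower().split())
    if src_words & src_offensive:
        return []
    return sorted(out_words & tgt_offensive)

# Placeholder word lists, for illustration only.
src_list = {"schimpfwort"}
tgt_list = {"badword"}

flagged = introduced_profanity("guten morgen", "good badword", src_list, tgt_list)
faithful = introduced_profanity("du schimpfwort", "you badword", src_list, tgt_list)
```

Flagged words could then be removed, or the translation withheld altogether.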
robustness
Robustness to User Generated Content
English: daily content of #scaramouche from genshin impact [...] mute #mouchecc for no cc tweets! not leak free http://dailymouche.carrd.co
German: täglicher Inhalt von #scaramouche von genshin impact [...] stumm #mouchecc für keine CC-Tweets! nicht auslaufsicher http://dailymouche.carrd.co
Challenges
• Jargon and acronyms
• Misspellings (sometimes intended for effect)
• Mangled grammar
• Special symbols (emojis, etc.)
• Hashtags, URLs, ...
• Use of dialectal language
• Use of non-standard writing systems (e.g., Latin script due to lack of keyboard)
Some Methods
• Special handling of non-words like emojis, hashtags, URLs
• Creating synthetic noisy training data
• Adversarial training
• Resources
— Machine translation of noisy text data set (MTNT)
— WMT 2020 Shared Task on Machine Translation Robustness
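One common way to give non-words special handling is to replace them with placeholder tokens before translation and restore them afterwards. The sketch below assumes that the translation system copies placeholder tokens through unchanged; the regular expression, the placeholder format, and the function names are all invented for illustration, and emojis are not covered.

```python
import re

# URLs, hashtags, and @-mentions (assumed pattern, not exhaustive).
NON_WORD = re.compile(r"https?://\S+|#\w+|@\w+")

def mask(text):
    """Replace non-words with numbered placeholders; return masked text and originals."""
    saved = []
    def repl(match):
        saved.append(match.group(0))
        return f"__PH{len(saved) - 1}__"
    return NON_WORD.sub(repl, text), saved

def unmask(text, saved):
    """Put the original non-words back in place of their placeholders."""
    for i, original in enumerate(saved):
        text = text.replace(f"__PH{i}__", original)
    return text

tweet = "daily content of #scaramouche from genshin impact http://dailymouche.carrd.co"
masked, saved = mask(tweet)
# Stand-in for a real MT call, which is outside the scope of this sketch.
translated = masked.replace("daily content of", "täglicher Inhalt von")
restored = unmask(translated, saved)
```

With this approach the URL and hashtag cannot be corrupted by the translation model, at the cost of hiding them from the model as context.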
bias
Gender Bias
The doctor asked the nurse to help her in the procedure
El doctor le pidio a la enfermera que le ayudara con el procedimiento
Gender Bias
English: the doctor said: take the pill.
Spanish: La doctora dijo: toma la píldora. (feminine)
         El doctor dijo: toma la píldora. (masculine)
(Google Translate output)
Robustness to Style
"You Sound Just Like Your Father" Commercial Machine Translation Systems Include Stylistic Biases
Dirk Hovy Federico Bianchi Tommaso Fornaciari
Bocconi University, Via Sarfatti 25, 20136 Milan, Italy
{dirk.hovy, f.bianchi, fornaciari.tommaso}@unibocconi.it
Dialect Bias
• Models often trained only on standard languages (British, American)
• Work less well on other dialects
• Bigger problem for automatic speech recognition
Evaluate Across Language Varieties
• BLEU score on standard language is not enough
• Also need test sets for each language variety
• Headline quality
• Acceptable degradation on important language varieties
document-level translation
Document-Level Translation
The shop is selling a nice table. Jane is quite taken by it. The table would match the chairs in her living room.
• Machine translation translates one sentence at a time
• But: surrounding context may help
— translation of pronouns may require co-reference
— ambiguous words may be informed by broader context
— consistent translation of repeated words
Conditioning on Broader Context
The shop is selling a nice table.
Jane is quite taken by it.
The table would match the chairs in her living room.
Der Laden verkauft einen schönen Tisch.
Full Document Translation
• Hierarchical attention
— compute which previous sentences matter most
— compute which words in these sentences matter most
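The two levels of attention can be sketched in plain Python as follows. This is a simplified illustration, not the architecture behind the slide: it uses unparameterized dot-product scoring with a single query vector, computes the word-level weights first and the sentence-level weights second, and all function names are assumptions.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]

def hierarchical_context(query, prev_sentences):
    """query: vector for the current decoding state.
    prev_sentences: list of sentences, each a list of word vectors.
    Returns (context_vector, sentence_weights)."""
    summaries = []
    for words in prev_sentences:
        # Word level: which words in this sentence matter most.
        word_weights = softmax([dot(query, w) for w in words])
        summaries.append(weighted_sum(word_weights, words))
    # Sentence level: which previous sentences matter most.
    sent_weights = softmax([dot(query, s) for s in summaries])
    return weighted_sum(sent_weights, summaries), sent_weights

query = [1.0, 0.0]
previous = [
    [[1.0, 0.0], [0.0, 1.0]],   # sentence with one word similar to the query
    [[-1.0, 0.0]],              # sentence pointing away from the query
]
context, sent_weights = hierarchical_context(query, previous)
```

In a trained model the scores would come from learned projections, and the resulting context vector would be fed to the decoder alongside the usual sentence-internal attention.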
Conditioning on Broader Context
The shop is selling a nice table. Jane is quite taken by it. The table would match the chairs in her living room.
Der Laden verkauft einen schönen Tisch. Er gefällt Jane sehr. ...
• Concatenate all sentences together
— document = very long sentence
— special treatment for sentence boundaries
— requires scaling of neural decoding implementation
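A minimal sketch of the data preparation for this approach. The boundary token `<sep>` and the function names are assumptions for illustration; the sketch also assumes the translation system reproduces the boundary token so that the output can be split back into sentences.

```python
SEP = "<sep>"  # hypothetical sentence-boundary token

def concat_document(sentences):
    """Join the sentences of a document into one long input sequence."""
    return f" {SEP} ".join(s.strip() for s in sentences)

def split_document(translated):
    """Recover sentence-level output from a translated document string."""
    return [s.strip() for s in translated.split(SEP)]

document = [
    "The shop is selling a nice table.",
    "Jane is quite taken by it.",
    "The table would match the chairs in her living room.",
]
joined = concat_document(document)
recovered = split_document(joined)
```

Since the joined sequence grows with document length, memory and decoding time for attention over it grow as well, which is why the implementation has to scale.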
questions?