Introduction
PA154 Jazykové modelování (Language Modeling) (1.1)
Pavel Rychlý
pary@fi.muni.cz
March 2, 2021

PA154 - Technical Information
► Slides and recorded videos in IS: https://is.muni.cz/auth/el/fi/jaro2021/PA154/index.qwarp
► Final written exam (online): 50 points, 25 points needed for grade E
► Optional individual projects: up to 25 points

Individual projects
► presentation on new research in language modeling
► small project as a part of bigger collaborative projects
    ► neural machine translation
    ► lexical acquisition

Language models - what are they good for?
► assigning scores to sequences of words
► predicting words
► generating text
► statistical machine translation
► automatic speech recognition
► optical character recognition

Predicting words
► Do you speak ...
► Would you be so ...
► Statistical machine ...
► Faculty of Informatics, Masaryk ...
► WWII has ended in ...
► In the town where I was ...
► Lord of the ...

Generating text
□escribes without et rots

MT + OCR
BblXOA B rOPOfl

Language models - probability of a sentence
► An LM is a probability distribution over all possible word sequences.
► What is the probability of an utterance s?
    P_LM(Catalonia President urges protests)
    P_LM(President Catalonia urges protests)
    P_LM(urges Catalonia protests President)
► Ideally, the probability should strongly correlate with the fluency and intelligibility of a word sequence.
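
To make the idea of P_LM concrete, the three word orders above can be scored with a very small model. The following is only a sketch under stated assumptions: the slides do not say which kind of model is meant, so a bigram model with add-one smoothing is used here as a stand-in, and the training corpus, the function names (train_bigram, log_prob, predict_next) and the boundary symbols are all invented for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for this illustration only.
CORPUS = [
    "Catalonia President urges protests",
    "President urges calm",
    "Catalonia President visits Madrid",
    "unions urge protests",
]

BOS, EOS = "<s>", "</s>"  # assumed sentence-boundary symbols


def train_bigram(sentences):
    """Count histories and bigrams over whitespace-tokenized sentences."""
    histories, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = [BOS] + sentence.split() + [EOS]
        histories.update(tokens[:-1])            # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    # Words that can be predicted: all corpus words plus the end symbol.
    vocab = {w for s in sentences for w in s.split()} | {EOS}
    return histories, bigrams, vocab


def log_prob(sentence, model):
    """Log-probability of a sentence under the add-one smoothed bigram model."""
    histories, bigrams, vocab = model
    tokens = [BOS] + sentence.split() + [EOS]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams from getting probability 0.
        p = (bigrams[(prev, word)] + 1) / (histories[prev] + len(vocab))
        total += math.log(p)
    return total


def predict_next(prev, model):
    """Most frequent word seen after `prev`; ties go to the alphabetically first."""
    _, bigrams, vocab = model
    return max(sorted(vocab), key=lambda w: bigrams[(prev, w)])


model = train_bigram(CORPUS)
candidates = [
    "Catalonia President urges protests",
    "President Catalonia urges protests",
    "urges Catalonia protests President",
]
# Print the candidates from most to least probable under the toy model.
for s in sorted(candidates, key=lambda s: log_prob(s, model), reverse=True):
    print(f"{log_prob(s, model):8.3f}  {s}")
```

On this toy corpus the fluent word order receives the highest score and the fully scrambled order the lowest, which is the behaviour the last bullet asks for. The same counts also support word prediction: predict_next("Catalonia", model) picks the word most often seen after "Catalonia" in the toy corpus.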