FI:PA154 Language Modeling - Course Information
PA154 Language Modeling
Faculty of Informatics
- Extent and Intensity
- 2/0. 2 credit(s) (plus extra credits for completion). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), z (credit).
- Teacher(s)
- doc. Mgr. Pavel Rychlý, Ph.D. (lecturer)
RNDr. Miloš Jakubíček, Ph.D. (seminar tutor)
RNDr. Vojtěch Kovář, Ph.D. (seminar tutor)
- Guaranteed by
- doc. RNDr. Aleš Horák, Ph.D.
Department of Machine Learning and Data Processing - Faculty of Informatics
Contact Person: doc. Mgr. Pavel Rychlý, Ph.D.
Supplier department: Department of Machine Learning and Data Processing - Faculty of Informatics
- Timetable
- Mon 10:00–11:50 C416
- Course Enrolment Limitations
- The course is also offered to students of fields other than those it is directly associated with.
- Fields of study the course is directly associated with
- There are 19 fields of study the course is directly associated with.
- Course objectives
- This course aims to give students an overview of state-of-the-art (mainly statistical) methods, algorithms and tools for processing large text corpora, both while the corpora are being built and when they are later used for information retrieval.
These tools are used in practice in many areas of natural language processing (semiautomatic building of text corpora, morphological analysis and disambiguation, syntactic analysis, efficient indexing and search in text corpora, statistical machine translation, semantic analysis, etc.).
At the end of the course, students will not only be able to use these tools but, more importantly, will understand the underlying theories and algorithms, which is often a key competence for using the tools effectively and correctly.
- Syllabus
- NLTK toolkit
- Elements of Probability and Information Theory
- Language Modeling in General and the Noisy Channel Model
- Smoothing and the Expectation-Maximization algorithm
- Markov models, Hidden Markov Models (HMMs)
- Viterbi Algorithm
- Tagging methods, HMM Tagging, Statistical Transformation Rule-Based Tagging
- Statistical Alignment and Machine Translation
- Text Categorization and Clustering
- Graphical Models
- Parallelization, MapReduce
- Teaching methods
- Assessment methods
- Written exam.
- Language of instruction
- Further Comments
The course is taught annually.