PA164 Machine learning and natural language processing

Faculty of Informatics
Autumn 2022
Extent and Intensity
2/1/0. 3 credit(s) (plus extra credits for completion). Recommended Type of Completion: zk (examination). Other types of completion: z (credit).
Taught in person.
Teacher(s)
doc. RNDr. Lubomír Popelínský, Ph.D. (lecturer)
doc. Mgr. Bc. Vít Nováček, PhD (lecturer)
RNDr. Ondřej Sotolář (assistant)
Guaranteed by
doc. RNDr. Lubomír Popelínský, Ph.D.
Department of Machine Learning and Data Processing – Faculty of Informatics
Supplier department: Department of Machine Learning and Data Processing – Faculty of Informatics
Timetable
Mon 14:00–15:50 A217
  • Timetable of Seminar Groups:
PA164/01: each even Wednesday 18:00–19:50 B116, V. Nováček
Prerequisites
Basic knowledge of machine learning (e.g. IB031), computational linguistics (e.g. PA153) and neural networks (e.g. PV021) is assumed. The course is given in English (or in Czech, depending on the audience). Task solutions can be in English, Czech or Slovak (exceptionally in another language).
Course Enrolment Limitations
The course is also offered to students of fields other than those the course is directly associated with.
Fields of study / plans the course is directly associated with
There are 55 fields of study the course is directly associated with.
Course objectives
Students will gain knowledge of methods and tools for text mining and natural language learning. At the end of the course, students should be able to create systems for text analysis using machine learning methods, and to understand, explain and exploit the contents of scientific papers in this area.
Learning outcomes
A student will be able:
- to pre-process text data for text mining;
- to build a system for analysis of text by means of machine learning;
- to understand research papers from this area;
- to write a technical report.
Syllabus
  • Natural language processing (NLP). Corpora. Tools for NLP.
  • Introduction to machine learning
  • Disambiguation. Morphological disambiguation and word-sense disambiguation
  • Shallow parsing and machine learning
  • Entity recognition and collocations
  • Document categorization
  • Information extraction from text
  • Keyness. Keyword detection
  • Anomaly detection in text. Novelty detection
  • Document and term clustering
  • Web mining
Literature
    recommended literature
  • AGGARWAL, Charu C. Machine learning for text. Springer. 2018.
  • MANNING, Christopher D. and Hinrich SCHÜTZE. Foundations of statistical natural language processing. Cambridge: MIT Press. xxxvii, 68. ISBN 0-262-13360-1. 1999.
  • LIU, Bing. Web data mining : exploring hyperlinks, contents, and usage data. Berlin: Springer. xix, 532. ISBN 9783540378815. 2007.
    not specified
  • Mining text data. Edited by Charu C. Aggarwal and ChengXiang Zhai. New York: Springer Science+Business Media. xi, 522. ISBN 9781461432227. 2012.
Teaching methods
Lectures combined with demonstrations and work on a project.
Assessment methods
A combination of written and oral examination. A project defence is part of the examination.
Language of instruction
English
Further Comments
The course is taught annually.
Teacher's information
http://www.fi.muni.cz/~popel/lectures/ll/
The course is also listed under the following terms: Autumn 2003, Autumn 2004, Autumn 2005, Autumn 2006, Autumn 2007, Autumn 2008, Autumn 2009, Autumn 2010, Autumn 2011, Autumn 2012, Autumn 2013, Autumn 2014, Autumn 2015, Autumn 2016, Autumn 2017, Autumn 2018, Autumn 2020, Autumn 2021, Autumn 2023.
  • Permalink: https://is.muni.cz/course/fi/autumn2022/PA164