PA153 Natural Language Processing

Faculty of Informatics
Autumn 2018
Extent and Intensity
2/0/0. 2 credit(s) (plus extra credits for completion). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), z (credit).
Teacher(s)
prof. PhDr. Karel Pala, CSc. (lecturer), doc. RNDr. Aleš Horák, Ph.D. (deputy)
RNDr. Marek Medveď, Ph.D. (seminar tutor)
RNDr. Vojtěch Kovář, Ph.D. (assistant)
RNDr. Zuzana Nevěřilová, Ph.D. (alternate examiner)
Guaranteed by
doc. RNDr. Aleš Horák, Ph.D.
Department of Machine Learning and Data Processing – Faculty of Informatics
Contact Person: prof. PhDr. Karel Pala, CSc.
Supplier department: Department of Machine Learning and Data Processing – Faculty of Informatics
Timetable
Wed 12:00–13:50 B410
Course Enrolment Limitations
The course is also offered to students of fields other than those it is directly associated with.
Course objectives
The course offers deeper knowledge of natural language processing and computational linguistics.
Students will learn about the individual levels of linguistic analysis: morphology, syntax, semantics and pragmatics.
They will be able to work with language data (corpora, types of corpora, corpus tools), tag corpus texts, and perform disambiguation with rule-based and statistical systems.
They will become acquainted with the representation of morphological structures and with the notation and algorithms for morphological analysis.
Students will be able to work with representations of syntactic structures, that is, formal grammars and their types. They will learn about context-free, functional and definite-clause grammars and the related parsing algorithms.
Data structures such as valency frames and their types will be explained as well.
They will learn about lexical semantics: meanings of words and collocations, machine-readable dictionaries, and lexical databases (WordNet, EuroWordNet, thesauri).
Semantic analysis of sentences, the principles of logical semantics and the Normal Translation Algorithm will be presented.
Pragmatics, discourse analysis and its segmentation, anaphora and (co-)reference will be explained.
Students will obtain basic knowledge of dialogue systems, inference systems and knowledge representation for NLP systems.
They will be able to understand the principles of communication agents and the main evaluation techniques.
Syllabus
  • Natural language processing and computational linguistics.
  • Natural language and understanding.
  • Levels of linguistic analysis - morphology, syntax, semantics.
  • Language data - corpora. Types of corpora. Corpus tools. Tagging corpus texts. Disambiguation, rule-based and statistical systems.
  • Representation of the morphological structures, notation, morphological algorithms.
  • Representation of syntactic structures - formal grammars and their types. Context-free and definite-clause grammars. Parsing algorithms. Valency frames and their types.
  • Semantic representation. Lexical meanings (words and collocations), machine-readable dictionaries, lexical databases (WordNet, EuroWordNet, thesauri).
  • Semantic analysis of sentence meaning, Normal Translation Algorithm.
  • Pragmatics.
  • Discourse analysis and its segmentation. Anaphora and (co-)reference.
  • Inference and knowledge representation for NL systems.
  • Dialogue systems.
  • Communication agents.
  • Evaluation techniques.
Literature
  • ALLEN, James. Natural language understanding. 2nd ed. Redwood City: Benjamin/Cummings Publishing Company, 1995, xv, 654 pp. ISBN 0-8053-0334-0.
  • CHOMSKY, Noam. Syntaktické struktury. gramatické pravidlo. Praha: Academia, 1966, 209 pp.
Teaching methods
Teaching takes the form of lectures and seminars that combine slides with demos of the relevant software tools. Students complete homework assignments, prepare presentations based on the literature they have read, and develop smaller projects. Where appropriate, open dialogue between the teacher and the students is used.
Assessment methods
written test
Language of instruction
Czech
Further Comments
The course is taught annually.
The course is also listed under the following terms: Autumn 2002, Autumn 2003, Autumn 2004, Autumn 2005, Autumn 2006, Autumn 2007, Autumn 2008, Autumn 2009, Autumn 2010, Autumn 2011, Autumn 2012, Autumn 2013, Autumn 2014, Autumn 2015, Autumn 2016, Autumn 2017, Autumn 2019, Autumn 2020, Autumn 2021, Autumn 2022, Autumn 2023, Autumn 2024.
  • Permalink: https://is.muni.cz/course/fi/autumn2018/PA153