I047 Introduction to Corpus Linguistics and Computer Lexicography

Faculty of Informatics
Spring 2002
Extent and Intensity
2/0. 2 credit(s) (plus extra credits for completion). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), z (credit).
Teacher(s)
prof. PhDr. Karel Pala, CSc. (lecturer)
Guaranteed by
prof. PhDr. Karel Pala, CSc.
Department of Machine Learning and Data Processing – Faculty of Informatics
Contact Person: prof. PhDr. Karel Pala, CSc.
Timetable
Tue 16:00–17:50 A107
Course Enrolment Limitations
The course is also offered to students of fields other than those with which it is directly associated.
Syllabus
  • Introduction to Corpus Linguistics and Computational Lexicography
  • Information technologies and language (text) corpora. Beginnings of corpus linguistics, purposes of corpora.
  • Building corpora, collecting corpus data and their standardization, SGML, TEI, representativeness of corpora, their maintenance.
  • Corpus tools, query processors: CQP, MANATEE; concordance programmes: XKWIC, OCP, LEXA, WORDCRUNCHER. Queries, regular expressions and their use. Statistical programmes, absolute and relative frequencies, MI (mutual information) score and t-score. Sorting programmes, different character encodings, encoding conversions.
  • Annotated corpora, tagging at various levels: structural tagging (SGML), grammatical tagging (POS, lemmata, word forms), the AJKA program.
  • Syntactic tagging, treebanks, skeleton analysis, constraint grammars, disambiguation at the morphological and syntactic levels.
  • Parallel corpora, alignment programmes.
  • Czech National Corpus (CNC), working with the CNC: words, constructions, collocations. Building dictionaries.
  • Basic concepts of Computational Lexicography.
Language of instruction
Czech
Further comments (probably available only in Czech)
The course is taught annually.
The course is also listed under the following terms: Spring 1996, Spring 1997, Spring 1998, Spring 1999, Spring 2000, Spring 2001.
  • Permalink: https://is.muni.cz/course/fi/spring2002/I047