FF:CJBB105 Introduction in Corp. Linguist - Course Information
	CJBB105 Introduction in Corpus Linguistics - Lecture
Faculty of Arts, Spring 2020
- Extent and Intensity
- 2/0/0. 4 credit(s). Type of Completion: zk (examination).
- Teacher(s)
- Mgr. Dana Hlaváčková, Ph.D. (lecturer)
- Guaranteed by
- Mgr. Dana Hlaváčková, Ph.D.
 Department of Czech Language – Faculty of Arts
 Contact Person: Jaroslava Vybíralová
 Supplier department: Department of Czech Language – Faculty of Arts
- Timetable
- Tue 10:00–11:40 D41
- Course Enrolment Limitations
- The course is also open to students of fields other than those it is directly associated with.
 The capacity limit for the course is 40 student(s).
 Current registration and enrolment status: enrolled: 0/40, only registered: 0/40, only registered with preference (fields directly associated with the programme): 0/40
- Fields of study / plans the course is directly associated with
- There are 17 fields of study the course is directly associated with.
- Course objectives
- The aim of the course is to give an introduction to the corpus-based approach to language and linguistics. The following topics are discussed:
 1) Corpus linguistics – history.
 2) What is a corpus and what is in it?
 3) Quantitative data.
 4) The use of corpora in language studies.
 5) Corpora and computational linguistics.
 6) Corpus managers.
 7) Part of speech analysis and tagging of a corpus.
 8) Czech National Corpus.
 9) Corpora at MU.
- Learning outcomes
- Upon completion of the course the student will be able to:
 - understand the issues of corpus linguistics,
 - understand the basic terminology of the field and use it,
 - find their way around corpus typology,
 - know how corpora can be used.
- Syllabus
- 1. Corpus Linguistics – History (ÚČNK).
- 2. Building Corpora.
- 3. Corpora of ČNK.
- 4. Automatic Morphological Analysis (tokenization, tagging, disambiguation).
- 5. Some Problems of Automatic Morphological Analysis.
- 6. Spoken Language Corpora.
- 7. Corpus of Private Correspondence.
- 8. Corpus Manager.
- 9. Quantitative Data.
- 10. Diachrony and Corpora.
 
- Literature
- Čermák, F.: Jazykový korpus: Prostředek a zdroj poznání. SaS, 56, 1995, s. 119-140.
- Čermák F., Králík J., Kučera K. (1997): Recepce současné češtiny a reprezentativnost korpusu (Výsledky a některé souvislosti jedné orientační sondy na pozadí budování Českého národního korpusu). SaS, 58, 2, s. 118-124.
- Čermák F, Blatná R. (eds.) (1995): Manuál lexikografie. Jinočany : H&H.
- http://ufal.mff.cuni.cz/pdt2.0/index-cz.html
- Čermák František (1999): Oxfordská lexikografie přechází také plně na korpus. Slovo a slovesnost, 60, s. 136-141.
- http://ucnk.ff.cuni.cz/
- Encyklopedický slovník češtiny. Edited by Petr Karlík - Marek Nekula - Jana Pleskalová. Praha: Nakladatelství Lidové noviny, 2002, 604 s. ISBN 80-7106-484-X.
- Studie z korpusové lingvistiky. Edited by František Čermák - Jana Klímová - Vladimír Petkevič. Vyd. 1. V Praze: Karolinum, 2000, 531 s. ISBN 807184893X.
- MCENERY, Tony and Andrew WILSON. Corpus linguistics. Edinburgh: Edinburgh University Press, 1996, 209 s. ISBN 0-7486-0482-0.
- BARNBROOK, Geoff. Language and computers: a practical introduction to the computer analysis of language. Edinburgh: Edinburgh University Press, 1996, ix, 209 s. ISBN 0-7486-0785-4.
 
- Teaching methods
- Lectures with demonstrations of corpora and corpus tools. Home reading.
- Assessment methods
- Colloquium. Written test on terminology and definitions (knowledge of the assigned home reading texts). The test contains ten questions.
- Language of instruction
- Czech
- Further Comments
 The course is taught annually.
- Permalink: https://is.muni.cz/course/phil/spring2020/CJBB105