CJBB75 Using a language corpus - elementary skills

Faculty of Arts
Spring 2014
Extent and Intensity
0/2/0. 3 credit(s). Type of Completion: z (credit).
Teacher(s)
doc. PhDr. Klára Osolsobě, Dr. (lecturer)
Guaranteed by
doc. PhDr. Klára Osolsobě, Dr.
Department of Czech Language – Faculty of Arts
Contact Person: Jaroslava Vybíralová
Supplier department: Department of Czech Language – Faculty of Arts
Timetable
each even Wednesday 9:10–10:45 G13
Course Enrolment Limitations
The course is also offered to students of fields other than those it is directly associated with.
The capacity limit for the course is 20 student(s).
Current registration and enrolment status: enrolled: 0/20, only registered: 0/20, only registered with preference (fields directly associated with the programme): 0/20
Course objectives
The seminar develops elementary skills in working with a language corpus.
Syllabus
  • 1. Bonito search engine; full access to the Czech National Corpus (CNC): agreement and registration
  • 2. Available corpora of the CNC
  • 3. How to search in corpora
  • 4. Morphological tagging (word, lemma, part of speech, ...)
  • 5. Processing the retrieved data (alphabetical sorting, etc.)
  • 6. Grammar and corpora: data observation and data mining
  • 7. Corpus evidence versus grammatical rule
  • 8. Problem solving and its presentation: How to search in an untagged corpus
  • 9. Problem solving and its presentation: Formal morphology
  • 10. Problem solving and its presentation: Word formation
  • 11. Problem solving and its presentation: Lexicon
  • 12. Problem solving and its presentation: Morphological variants
  • 13. Problem solving and its presentation: Drawing conclusions from corpus evidence
Literature
  • ŠULC, Michal. Korpusová lingvistika: první vstup. 1st ed. Praha: Karolinum, 1999. 94 pp. ISBN 8071848476.
  • Jak využívat Český národní korpus. Edited by František Čermák and Renata Blatná. 1st ed. Praha: NLN, Nakladatelství Lidové noviny, 2005. 181 pp. ISBN 8071067369.
  • Český národní korpus: úvod a příručka uživatele. Edited by Jan Kocek, Marie Kopřivová and Karel Kučera. 1st ed. Praha: Filozofická fakulta UK, Ústav Českého národního korpusu, 2000. 156 pp. ISBN 80-85899-94-9.
Teaching methods
A practical introduction to corpora is followed by a set of sample exercises, homework assignments and follow-up class discussion.
Assessment methods
Final project: problem solving and its presentation. During the course, every student hands in 3 homework assignments (1-3 pages each). One of the assignments takes the form of a 10-minute presentation (in ppt format).
Language of instruction
Czech
Further comments
The course is taught each semester.
General note: This is an updated version of the course formerly titled "Základy práce s korpusem" (Basics of Working with a Corpus). Students who have already completed the course under the old title should not enrol in it. It is recommended to complete CJBB105 before taking CJBB75, or to take both courses concurrently.
Information on course enrolment limitations: The course is compulsory for students of Czech Language with a specialization in computational linguistics, who are given priority in enrolment.
The course is also listed under the following terms: Spring 2004, Autumn 2004, Spring 2005, Spring 2006, Autumn 2006, Spring 2007, Autumn 2007, Spring 2008, Autumn 2008, Spring 2009, Autumn 2009, Spring 2010, Autumn 2010, Spring 2011, Autumn 2011, Spring 2012, Autumn 2012, Spring 2013, Spring 2015, Spring 2016, Spring 2017, Spring 2018, Autumn 2018, Spring 2019, Spring 2020, Spring 2021, Spring 2022, Spring 2023, Spring 2024, Spring 2025.
  • Permalink: https://is.muni.cz/course/phil/spring2014/CJBB75