FF:AJ22099 Statistics and Corpora

Faculty of Arts

Autumn 2014

AJ22099 Statistics and Corpora for Cognitive Linguistics

Extent and Intensity

0/20/0. 2 credit(s) (plus 3 credits for an exam). Recommended Type of Completion: zk (examination). Other types of completion: z (credit).

Teacher(s)

Dylan Glynn (lecturer), doc. PhDr. Jana Chamonikolasová, Ph.D. (deputy)
doc. Wei-lun Lu, Ph.D. (lecturer)

Guaranteed by

Jeffrey Alan Vanderziel, B.A.
Department of English and American Studies – Faculty of Arts
Contact Person: Tomáš Hanzálek
Supplier department: Department of English and American Studies – Faculty of Arts

Course Enrolment Limitations

The course is only offered to the students of the study fields the course is directly associated with.

The capacity limit for the course is 15 student(s).
Current registration and enrolment status: enrolled: 0/15, only registered: 0/15

Fields of study / plans the course is directly associated with

There are 8 fields of study the course is directly associated with.

Course objectives

To run the basic functionalities of R, one of the most widely used statistical programs in the social sciences.
To be able to apply and interpret tests for statistical significance.
To be able to apply multivariate statistical analyses to complex categorical data.
To be able to apply and interpret binary logistic regression modelling of complex categorical data.

Syllabus

Monday 27 Oct (3 hr)

Session 1. Introduction
A summary of the method, its strengths and weaknesses, and possible applications. In this session you sit and listen.

Session 2. Research Question
Choice of research topic, collection of data, and development of a coding schema. In this session we work together to obtain data to learn the method.

Wednesday 29 Oct (4.5 hr)

Session 3. R
The program we use for doing statistics. In this session, we introduce the program R and learn how to manipulate data.

Session 4. Statistical significance
Assessing whether what we have found could be just chance. In this session, we introduce basic tests for statistical significance such as the Chi-Squared Test.

Session 5. Correspondence analysis
A multivariate statistical technique for identifying complex patterns of association in the data. In this session we learn how to apply and interpret the technique.

Thursday 30 Oct (4.5 hr)

Session 6. Cluster Analysis
A multivariate statistical technique for identifying groups of similar items in the data. In this session we learn how to apply and interpret the technique.

Session 7. Logistic Regression
Confirmatory modelling of data and calculation of predictive accuracy. In this session we learn how to apply and interpret the technique.

Session 8. Logistic Regression (cont.)
Confirmatory modelling of data and calculation of predictive accuracy. In this session we continue applying and interpreting the technique.

Friday 31 Oct (after the Linguistics Student Workshop and the public talk)

Session 9. Optional - Your research! (1.5 hr)
This session will be devoted to the participants’ personal projects and how the techniques can be applied to their research or studies. Also, considering the tight schedule, this session may be used to go back to problems encountered during the course.

Literature

Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
Glynn, D. & Robinson, J. (Eds.). (2014). Corpus Methods for Semantics. Quantitative studies in polysemy and synonymy. Amsterdam & Philadelphia: John Benjamins.
Gries, St. Th., & Stefanowitsch, A. (Eds.). (2006). Corpora in Cognitive Linguistics: Corpus-based approaches to syntax and lexis. Berlin & New York: Mouton de Gruyter.
Divjak, D. (2010a). Structuring the lexicon: A clustered model for near-synonymy. Berlin & New York: Mouton de Gruyter.
Dalgaard, P. (2008). Introductory statistics with R (2nd ed.). Dordrecht: Springer.
Geeraerts, D., Grondelaers, St., & Bakema, P. (1994). The structure of lexical variation: Meaning, naming, and context. Berlin & New York: Mouton de Gruyter.
Glynn, D., & Fischer, K. (Eds.) (2010). Quantitative Cognitive Semantics: Corpus-driven approaches. Berlin & New York: Mouton de Gruyter.
Harrell, F. (2002). Regression modeling strategies: With applications to linear models, logistic regression, and survival analysis. Heidelberg & New York: Springer.
Maindonald, J., & Braun, J. (2003). Data analysis and graphics using R. Cambridge: Cambridge University Press.
Gries, St. Th. (2013). Statistics for linguistics with R: A practical introduction (2nd ed.). Berlin & New York: Mouton de Gruyter.
Husson, F., Lê, S., & Pagès, J. (2011). Exploratory multivariate analysis by example using R. London: Chapman & Hall.
Dirven, R., Goossens, L., Putseys, Y., & Vorlat, E. (1982). The scene of linguistic action and its perspectivization by speak, talk, say, and tell. Amsterdam & Philadelphia: John Benjamins.
Johnson, K. (2008). Quantitative methods in linguistics. Oxford: Blackwell.
Faraway, J. (2006). Extending the linear model with R: Generalized linear, mixed effects and nonparametric regression models. London: Taylor & Francis.
Hosmer, D., & Lemeshow, S. (2013) [1989, 2000]. Applied logistic regression. Hoboken: John Wiley.

Language of instruction

English

Further comments (probably available only in Czech)

The course is taught only once.
Information on the per-term frequency of the course: Takes place in the week starting 27 October 2014.
The course is taught: in blocks.
Note related to how often the course is taught: Takes place in the week starting 27 October 2014.

Permalink: https://is.muni.cz/course/phil/autumn2014/AJ22099
