AJ22099 Statistics and Corpora for Cognitive Linguistics

Faculty of Arts
Autumn 2014
Extent and Intensity
0/20/0. 2 credit(s) (plus 3 credits for an exam). Recommended Type of Completion: zk (examination). Other types of completion: z (credit).
Teacher(s)
Dylan Glynn (lecturer), doc. PhDr. Jana Chamonikolasová, Ph.D. (deputy)
doc. Wei-lun Lu, Ph.D. (lecturer)
Guaranteed by
Jeffrey Alan Vanderziel, B.A.
Department of English and American Studies – Faculty of Arts
Contact Person: Tomáš Hanzálek
Supplier department: Department of English and American Studies – Faculty of Arts
Course Enrolment Limitations
The course is only offered to the students of the study fields the course is directly associated with.

The capacity limit for the course is 15 student(s).
Current registration and enrolment status: enrolled: 0/15, only registered: 0/15
Fields of study / plans the course is directly associated with
There are 8 fields of study the course is directly associated with.
Course objectives
To run the basic functionalities of R, the most widely used statistical program in the social sciences. To be able to apply and interpret tests for statistical significance. To be able to apply multivariate statistical analyses to complex categorical data. To be able to apply and interpret binary logistic regression modelling of complex categorical data.
Syllabus
  • Monday 27 Oct (3 hr)

    Session 1. Introduction
    A summary of the method, its strengths and weaknesses, and possible applications. In this session you sit and listen.

    Session 2. Research Question
    Choice of research topic, collection of data, and development of a coding schema. In this session we work together to obtain the data with which we learn the method.

  • Wednesday 29 Oct (4.5 hr)

    Session 3. R
    The program for doing statistics. In this session we introduce the program R and learn how to manipulate data, etc.

    Session 4. Statistical significance
    Assessing whether what we have found is more than just chance. In this session we introduce basic tests for statistical significance such as the Chi-Squared Test.

    Session 5. Correspondence analysis
    A multivariate statistical technique for identifying complex patterns of association in the data. In this session we learn how to apply and interpret the technique.

  • Thursday 30 Oct (4.5 hr)

    Session 6. Cluster Analysis
    A multivariate statistical technique for identifying groups of similarities in the data. In this session we learn how to apply and interpret the technique.

    Session 7. Logistic Regression
    Confirmatory modelling of data and calculation of predictive accuracy. In this session we learn how to apply and interpret the technique.

    Session 8. Logistic Regression (cont.)
    Confirmatory modelling of data and calculation of predictive accuracy. In this session we learn how to apply and interpret the technique.

  • Friday 31 Oct (after the Linguistics Student Workshop and the public talk)

    Session 9. Optional: Your research! (1.5 hr)
    This session will be devoted to the participants’ personal projects and how the techniques can be applied to their research or studies. Given the tight schedule, this session may also be used to go back to problems encountered during the course.
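
As an illustration of the kind of test introduced in Session 4, the following is a minimal sketch of a Chi-Squared Test of independence on a small table of counts. It is not course material: the course itself works in R, the sketch is written in plain Python, and the counts (two hypothetical near-synonyms counted in two hypothetical registers) are invented purely for the example. The function names `chi_squared` and `p_value_one_dof` are likewise assumptions made for this sketch.

```python
import math


def chi_squared(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail p-value of the chi-squared distribution with 1 degree of freedom."""
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the tail probability can be written with the complementary error function.
    return math.erfc(math.sqrt(statistic / 2))


# Invented counts: rows are two hypothetical near-synonyms,
# columns are two hypothetical registers (e.g. spoken vs. written).
observed = [[30, 10],
            [20, 40]]

statistic, dof = chi_squared(observed)
p = p_value_one_dof(statistic)  # valid here because the 2x2 table has dof == 1
print(f"chi-squared = {statistic:.3f}, dof = {dof}, p = {p:.2g}")
```

The sketch applies no continuity correction, so for a 2x2 table its result will differ somewhat from the default output of R's `chisq.test`, which applies Yates' correction in that case.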
Literature
  • Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
  • Glynn, D. & Robinson, J. (Eds.). (2014). Corpus Methods for Semantics. Quantitative studies in polysemy and synonymy. Amsterdam & Philadelphia: John Benjamins.
  • Gries, St. Th., & Stefanowitsch, A. (Eds.). (2006). Corpora in Cognitive Linguistics: Corpus-based approaches to syntax and lexis. Berlin & New York: Mouton de Gruyter.
  • Divjak, D. (2010). Structuring the lexicon: A clustered model for near-synonymy. Berlin & New York: Mouton de Gruyter.
  • Dalgaard, P. (2008). Introductory statistics with R (2nd ed.). Dordrecht: Springer.
  • Geeraerts, D., Grondelaers, St., & Bakema, P. (1994). The structure of lexical variation: Meaning, naming, and context. Berlin & New York: Mouton de Gruyter.
  • Glynn, D., & Fischer, K. (Eds.) (2010). Quantitative Cognitive Semantics: Corpus-driven approaches. Berlin & New York: Mouton de Gruyter.
  • Harrell, F. (2002). Regression modeling strategies: With applications to linear models, logistic regression, and survival analysis. Heidelberg & New York: Springer.
  • Maindonald, J., & Braun, J. (2003). Data analysis and graphics using R. Cambridge: Cambridge University Press.
  • Gries, St. Th. (2013). Statistics for linguistics with R: A practical introduction (2nd ed.). Berlin & New York: Mouton de Gruyter.
  • Husson, F., Lê, S., & Pagès, J. (2011). Exploratory multivariate analysis by example using R. London: Chapman & Hall.
  • Dirven, R., Goossens, L., Putseys, Y., & Vorlat, E. (1982). The scene of linguistic action and its perspectivization by speak, talk, say, and tell. Amsterdam & Philadelphia: John Benjamins.
  • Johnson, K. (2008). Quantitative methods in linguistics. Oxford: Blackwell.
  • Faraway, J. (2006). Extending the linear model with R: Generalized linear, mixed effects and nonparametric regression models. London: Taylor & Francis.
  • Hosmer, D., & Lemeshow, S. (2013) [1989, 2000]. Applied logistic regression. Hoboken: John Wiley.
Language of instruction
English
Further comments
The course is taught only once.
Information on the per-term frequency of the course: Takes place in the week starting 27 Oct 2014.
The course is taught: in blocks.
Note related to how often the course is taught: Takes place in the week starting 27 Oct 2014.

  • Permalink: https://is.muni.cz/course/phil/autumn2014/AJ22099