PSY532 R101: A Practical Introduction to Using the Statistical Program R

Faculty of Social Studies
Autumn 2018
Extent and intensity
1/1/0. 4 credits. Type of completion: z (credit).
Teachers
Mgr. Hynek Cígler, Ph.D. (lecturer)
Mgr. Vít Gabrhel (lecturer)
doc. Mgr. Stanislav Ježek, Ph.D. (lecturer)
Guaranteed by
doc. Mgr. Stanislav Ježek, Ph.D.
Department of Psychology – Faculty of Social Studies
Supplier department: Department of Psychology – Faculty of Social Studies
Timetable
Mon 18:00–19:40 PC25
Course enrolment limitations
The course is offered only to students of the fields of study it is directly associated with.

The course may be taken by at most 15 students.
Current registration and enrolment status: enrolled: 0/15, only registered: 0/15
Fields of study the course is directly associated with
The course is directly associated with 48 fields of study.
Course objectives
This course has three aims. First, it teaches you how to become completely independent of SPSS, should you find yourself in a workplace without an expensive SPSS license. Second, it provides a “refresher” in common statistical analyses. Whatever software you see yourself using in the future, this is a chance to revise statistical theory and to learn new concepts that R requires you to consider as you tell it what to do. Third, it aims to make you excited about the wide range of analyses possible in R. When reviewers comment on drafts of your papers, they will often suggest small statistical checks that are easier to perform in R than in SPSS. Having completed this course, you should feel confident exploring any of these specialised operations in R.
Syllabus
  • 1. Data management: entering, inspecting and “cleaning” your data in R. This lecture and seminar set will describe how to import data into R from Excel and SPSS. Students will also become acquainted with the RStudio interface, which divides the screen into four windows for viewing and analysing data. The data set used for most of the course will be introduced, alongside methods of obtaining means, standard deviations and other descriptive statistics in R.
  • 2. Basic ANOVA and regression. The lecture will briefly revise the principles of analysis of variance and regression before running through examples in R. We will cover t-tests, one-way ANOVA with contrasts and post-hoc tests, factorial ANOVA, repeated measures ANOVA, analysis of covariance, simple linear regression, and hierarchical linear regression. The seminar will introduce “bootstrapping” in relation to ANOVA and regression.
  • 3. Graphing. R has many “packages” (comparable to “apps”) for drawing graphs, and in these two weeks we will learn one of the most popular, ggplot2. As well as letting you draw many different kinds of quick graphs to explore data, this package allows you to adjust graph features for publications and presentations.
  • 4. Working with skewed, clustered and categorical data. Often we wish to examine group-based differences in variables that are not normally distributed, that are clustered based on some other variable such as gender, or that are categorical (e.g., responses belonging to one of two categories, “yes” and “no”). This lecture and seminar set will cover techniques for working with this kind of data: generalized linear modelling, multilevel modelling, logistic regression and chi-square analysis. A new data set will be introduced for some of the analyses. Diverting our attention from R a little, we will discuss how results from some of these analyses tend to be interpreted and reported in research articles.
  • 5. Bayesian data analysis. We will dedicate one week to Bayesian data analysis, an approach to hypothesis testing that is gaining popularity around the world. SPSS is not suitable for this kind of analysis, whereas R is among the programs that offer many options. Under the Bayesian approach, belief in a hypothesis should be based on the data and on prior assumptions about the probability of the hypothesis. The lecture will highlight the convenience of reporting the degree of belief in a hypothesis, as opposed to the usual practice of reporting the probability of the observed data if the null hypothesis were true. Our exploration of “prior assumptions” will be largely in the context of a Bayesian analysis R package (arm) where the prior assumption is that the variable being predicted in a linear model has few rather than many predictors. The package will be used to re-analyse our course data sets, with reporting and interpretation discussed.
  • 6. Handling missing data. Among researchers using SPSS, a common approach to handling missing data is to replace missing values according to an expectation-maximisation (EM) algorithm. We will discuss what this algorithm means in broad terms and then consider the advantages of using it to calculate multiple possible values for each missing entry instead of just one. The norm package in R does precisely this, giving us what is termed a “multiple imputation”. The lecture will also cover an alternative approach to multiple imputation that is useful if categorical variables are among those with missing values. The mi package developed for this purpose will be demonstrated. In the seminar, we will put R aside for another moment and discuss an emerging methodological trend: planned missingness. This is the approach of asking participants to answer a smaller subset of survey questions rather than the full set. Analysis of the full survey is subsequently possible following multiple imputation.
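The Bayesian idea in item 5, that belief in a hypothesis combines the data with a prior assumption about the hypothesis, can be illustrated with a minimal sketch. It is not taken from the course materials: the course itself works in R with the arm package, and Python is used here only as a language-neutral illustration. The function name `posterior` and all the numbers are hypothetical.

```python
def posterior(prior: float, p_data_given_h: float, p_data_given_not_h: float) -> float:
    """Return P(H | data) by Bayes' rule for a hypothesis H and its complement."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability between 0 and 1")
    # Total probability of the data, averaged over H and not-H
    evidence = p_data_given_h * prior + p_data_given_not_h * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("the data have zero probability under both hypotheses")
    return p_data_given_h * prior / evidence


# Hypothetical likelihoods: the data are four times as probable under H as under not-H
sceptical = posterior(prior=0.1, p_data_given_h=0.8, p_data_given_not_h=0.2)
neutral = posterior(prior=0.5, p_data_given_h=0.8, p_data_given_not_h=0.2)

print(f"sceptical prior: {sceptical:.3f}")
print(f"neutral prior:   {neutral:.3f}")
```

The same data lead to different degrees of belief depending on the prior, which is why the syllabus pairs "the data" with "prior assumptions" when describing the approach.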
Teacher's information
Further comments
Study materials
The course is also listed under the following terms: Autumn 2014, Spring 2015, Autumn 2016, Autumn 2017.
  • Enrolment statistics (most recent)
  • Permalink: https://is.muni.cz/predmet/fss/podzim2018/PSY532