ENV006 Statistical Thinking and Data Treatment

Faculty of Science
Autumn 2015
Extent and Intensity
1/2. 3 credit(s) (fasci plus compl plus > 4). Recommended type of completion: zk (examination). Other types of completion: k (colloquium), z (credit).
Teacher(s)
doc. Mgr. Dominik Heger, Ph.D. (lecturer)
Mgr. Ján Krausko (seminar tutor)
Mgr. Ľubica Vetráková, Ph.D. (seminar tutor)
Guaranteed by
prof. RNDr. Jana Klánová, Ph.D.
RECETOX – Faculty of Science
Contact person: doc. Mgr. Dominik Heger, Ph.D.
Supplier department: RECETOX – Faculty of Science
Timetable
Mon 11:00–13:50 B09/316
Course enrolment limitations
The course is also offered to students of fields other than those the course is directly associated with.
Fields of study / plans the course is directly associated with
Course objectives
Statistics is the science of drawing conclusions from data. The aim of this introductory statistics course is to teach students how to think about data and to explain a few basic statistical concepts, all applied to practical examples, mostly from the natural sciences. The course consists of three parts: 1. Descriptive statistics, 2. Probability, 3. Inference. Each class requires weekly reading from the textbooks and working through the homework. At the end of the course, the student should be able to appreciate the language of statistics, solve certain problems independently, and know where to look for more advanced methods. The course provides the minimum knowledge needed to understand terms such as standard error of the mean, coefficient of determination, regression, statistical testing, and others.
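To make two of the terms named above concrete, the following is a minimal Python sketch of the standard error of the mean and the coefficient of determination. It is not course material: the function names and the data are made up for illustration, and only the Python standard library is assumed.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Sample standard deviation divided by the square root of the sample size."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def coefficient_of_determination(x, y):
    """R^2 of the least-squares line of y on x (the squared correlation coefficient)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Made-up measurements, for illustration only.
measurements = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sem = standard_error_of_mean(measurements)

# Made-up paired data that lie exactly on a line, so R^2 equals 1.
r2 = coefficient_of_determination([1, 2, 3, 4], [2, 4, 6, 8])

print(f"standard error of the mean: {sem:.4f}")
print(f"coefficient of determination: {r2:.4f}")
```

The standard error shrinks with the square root of the sample size, which is why quadrupling the number of measurements only halves it.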
Syllabus
  • 1. Statistics, data, variables
      1. Variable, data, statistics
      2. Categories of variables
      3. Depicting a categorical variable - bar graph
      4. Physical quantities and their notation
      5. Number of significant figures
      6. Accuracy, precision, and uncertainty
      7. Depicting a quantitative variable - stem and leaf plot
  • 2. Measures of location and spread
      1. Depicting a quantitative variable - histogram, percentiles
      2. Frequency distribution - histogram, distribution function
      3. Measures of location – median, mode, mean
      4. Markov's inequality
      5. Root mean square
      6. Measures of spread – range, interquartile range, variance, standard deviation, confidence limits
  • 3. Measures of spread
      1. SD
      2. Chebychev’s inequality
      3. Error propagation
  • 4. Normal distribution
      1. Affine transformation
      2. Standard units
      3. Normal distribution
  • 5. Relations between two variables
      1. Scatterplot and visual inspection of association
      2. Bivariate data, point of averages
      3. Post hoc ergo propter hoc fallacy
      4. Linear association - correlation
      5. Correlation coefficient
      6. Ecological correlation
  • 6. Regression, regression analysis
      1. Point of averages
      2. Graph of averages
      3. Regression line
      4. Interpolation vs. extrapolation
      5. Vertical residual
      6. Regression diagnostics
      7. RMS error in regression
  • 7. Probability 1: How to count without counting?
      1. What is randomness?
      2. Probability of one draw (addition rule, inclusion-exclusion formula, complement rule)
      3. Probability of multiple draws (multiplication rules with and without replacement)
      4. Probability distributions
      5. Binomial distribution and formula for random variable
      6. Hypergeometric distribution and formula for simple random samples
  • 8. Probability 2: Large random samples
      1. What is randomness?
      2. Probability of one draw (addition rule, inclusion-exclusion formula, complement rule)
      3. Probability of multiple draws (multiplication rules with and without replacement)
      4. Probability distributions
      5. Binomial distribution and formula for random variable
      6. Hypergeometric distribution and formula for simple random samples
  • 9. Probability 3: Central limit theorem
      1. Binomial distribution - expected value and standard error
      2. de Moivre-Laplace theorem
      3. Central limit theorem
      4. Sampling without replacement: the correction factor
  • 10. Inference 1
      1. Population vs. sample; parameter vs. estimator (statistic)
      2. Mean squared error; unbiased estimator
      3. s* - bootstrap estimate of SD of the box
      4. Sample standard deviation - s
      5. Confidence interval
      6. Statistical tests
  • 11. Inference 2
      1. One-sample z test - significance level, power of the test, P-value (the observed significance level)
      2. Two-tailed test
      3. One-sample t-test
      4. Two-sample test - standard error of the difference, s_pooled
      5. Paired samples
      6. Multisample hypotheses - ANOVA, multiple comparisons - Tukey honestly significant difference test, goodness of fit
      7. R^2 - coefficient of determination
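As an illustration of the one-sample z test and the two-tailed P-value listed under Inference 2, here is a minimal Python sketch. It is not taken from the course materials: the function name `one_sample_z_test` and all numbers are hypothetical, and the population standard deviation is assumed to be known, which is what separates the z test from the t-test.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, null_mean, population_sd, n):
    """Return (z, two-tailed P-value) for the null hypothesis mean == null_mean.

    Assumes the population SD is known and that the sample mean is
    approximately normal (e.g. by the central limit theorem).
    """
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Two-tailed: probability of a value at least this extreme in either tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up numbers: 100 measurements averaging 52, null hypothesis mean 50, SD 10.
z, p = one_sample_z_test(sample_mean=52.0, null_mean=50.0, population_sd=10.0, n=100)
print(f"z = {z:.2f}, two-tailed P-value = {p:.4f}")
```

With these numbers the standard error is 1, so the sample mean lies two standard errors above the hypothesized mean, and the two-tailed P-value falls just under 5 %.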
Literature
  • https://www.edx.org/course/uc-berkeleyx/uc-berkeleyx-stat2-1x-introduction-1138#.VBhMghaqIhY
  • http://www.stat.berkeley.edu/~stark/SticiGui/
  • Jerrold H. Zar: Biostatistical Analysis
  • Freedman, Pisani, and Purves: Statistics (4th edition); W.W. Norton, 2007
Teaching methods
Lectures explain the problems verbally with the help of the blackboard, supported by presentations where necessary. There is always a handful of practical examples that we try to solve together during the class, mostly on the blackboard, sometimes with the help of software (Excel, Statistica, R, Maple). Each lecture ends with an overview and is summarized by comprehension questions, which are used for discussion at the beginning of the next class. Students are asked to explain topics to their colleagues. Each week there is recommended reading in the online textbook, including a video and interactive Java exercises (http://www.stat.berkeley.edu/~stark/SticiGui/index.htm). There is also weekly homework graded in the IS.
Assessment methods
Grading rules: Homework 25 %, Partial Tests 25 %, Final Test 50 %. Oral exam if required. Grades: A > 90 %, B > 80 %, C > 70 %, D > 60 %, E > 50 %.
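The weighting above can be worked through in a short Python sketch. The weights and thresholds come from the grading rules; everything else is an assumption made for illustration: the names, the example scores, the reading of the thresholds as strict inequalities, and the "F" returned at 50 % or below. The oral exam is not modeled.

```python
# Weights and grade thresholds as stated in the grading rules.
WEIGHTS = {"homework": 0.25, "partial_tests": 0.25, "final_test": 0.50}
THRESHOLDS = [("A", 90.0), ("B", 80.0), ("C", 70.0), ("D", 60.0), ("E", 50.0)]


def overall_percentage(scores):
    """Weighted average of the component scores, each given in percent."""
    return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS)


def letter_grade(percentage):
    """Map an overall percentage to a letter, using strict '>' comparisons."""
    for letter, threshold in THRESHOLDS:
        if percentage > threshold:
            return letter
    return "F"  # assumption: the grading rules do not name the failing grade


# Hypothetical student: 80 % homework, 70 % partial tests, 90 % final test.
example_scores = {"homework": 80.0, "partial_tests": 70.0, "final_test": 90.0}
total = overall_percentage(example_scores)
grade = letter_grade(total)
print(f"overall: {total:.1f} %, grade: {grade}")
```

For this hypothetical student the components contribute 20 + 17.5 + 45 = 82.5 %, which lies above 80 % but not above 90 %, giving a B.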
Language of instruction
English
Teacher's information
https://is.muni.cz/auth/el/1431/podzim2015/ENV006/index.qwarp
Further comments
The course is taught annually.
The course is also listed under the following terms: Autumn 2011, Autumn 2011 - accreditation, Autumn 2012, Autumn 2013, Autumn 2014.
  • Permalink: https://is.muni.cz/predmet/sci/podzim2015/ENV006