Methodology and Statistics in Sports Sciences e059

Guarantor and Teacher of the Subject MaSiSS
Mgr. Michal Bozděch, Ph.D.
- Email: Michal.bozdech@fsps.muni.cz
- Phone number: 549 49 6863
- Department of Kinesiology
- Room: D33/333

Research Methodology
1. A Review of the Fundamentals
2. Type of Research
   1. Selected Types of Qualitative Research
   2. Selected Types of Quantitative Research
3. Research Project
4. Research Ethics
5. The Literature Review
6. The Research Hypotheses
7. Data Collection Methods
   1. Collecting Primary and Secondary Data
   2. Collecting Qualitative and Quantitative Data

Statistical Analysis
1. Characteristics of a Good Test
2. Variable Types and Scaling
   1. Research Variables
   2. Levels of Measurement (Scale of Measure)
3. Descriptive Statistics
   1. Measures of Distribution or Frequency
   2. Measures of Central Tendency
   3. Measures of Dispersion or Variation
   4. Measures of Position
4. Inferential or Analytical Statistics
   1. Hypothesis (or Prediction) Testing
   2. Choosing a Comparison Test
   3. Choosing a Correlation Test
   4. Choosing a Regression Test
   5. The Effect Size

Research Methodology
Examples of theory
Science (Benešová, 1999) - a systematic, critical and methodological pursuit of true and common knowledge in a defined area of reality.
Science (Ferjenčík, 2000) - a comprehensive system of information obtained by a scientific method. Science provides guidelines for examination (methods) and an explanation of the collected information (scientific theory).
Theory (Zháněl, 2014) - The primary goal of science is theory, i.e. an attempt to find general explanations of natural phenomena (via scientific research) (Dovalil et al., 2012).

Research Methodology
Sport science (Kinesiology, Kinanthropology, Sportwissenschaft) - a multidisciplinary science of human (voluntary) movement.
Paradigm - a fundamental concept of a certain scientific discipline, which is considered to be exemplary.
It defines the proper procedures and methods, together with the rules and conventions to be followed.
Methodology vs. methods (Gabriel, 2011) - a method is a research tool (e.g. an interview in a qualitative study) and methodology is the justification for using that method.
• Methodology - the summary and study of methods, or the science of methods.
• Method - a tool or process for monitoring, researching, learning, exploring and achieving certain goals.
• Methodical guidelines - specific instructions for carrying out specific activities.

Dunning-Kruger effect

Type of Research
Quantitative research | Qualitative research

Type of Research
Qualitative research - According to Švaříček & Šeďová et al. (2007), it is a process of examining phenomena and problems in an authentic environment in order to obtain a comprehensive picture of these phenomena, based on deep data and a specific relationship between a researcher and a research participant. The aim of qualitative research is to uncover and represent how people understand, experience and create social reality. It focuses on how individuals and/or groups view, understand and interpret the world (Zháněl, 2014), via non-numerical data.
Quantitative research - the process of collecting and analyzing numerical data. It can be used to find patterns and averages, make predictions, test causal relationships, and generalize results to wider populations (Bhandari, 2021).
A common mistake: when it is relatively simple to transform words (e.g. answers in a questionnaire) into numbers (occurrence), these are quantitative data.
Mixed research - Hendl (2005) speaks of a mixed research strategy when quantitative and qualitative methods are combined in a single research activity and the results obtained by the two strategies complement each other.

Type of Qualitative Research
Phenomenological method - how participants experience, feel about or form opinions on a specific event or activity. It uses in-depth interviews, observation or surveys to gather information.
Grounded theory - tries to explain why a course of action evolved the way it did. It needs a large number of subjects to develop a theoretical model based on existing data (DNA - genetic model).
Case study - an in-depth look at one test subject (the opposite of grounded theory). Various data are compiled to reach a broader conclusion.
Focus groups - a group of individuals who are asked questions about their opinions and attitudes towards a certain phenomenon.
Ethnographic research - subjects are experiencing a culture that is unfamiliar to them.

Type of Quantitative Research
Descriptive research - seeks to describe the current status of an identified variable. The researcher does not usually begin with a hypothesis, but is likely to develop one after collecting data.
Correlation research - aims to determine the extent of a relationship between two or more variables using statistical data. This type of research recognizes trends and patterns in data.
Causal-Comparative/Quasi-Experimental - aims to establish cause-effect relationships among the variables. The researcher does not randomly assign groups.
Experimental Research - aims to establish the cause-effect relationship.
Subjects are randomly assigned to experimental groups.

Type of Research
Types of different research (Hendl, 2016): methodological study, case study, comparison, correlation-predictive study, experiment, quasi-experiment, evaluation, development studies, trend analysis, attitudes, status, exploration, historical study, modelling, proposal and demonstration, systematic review, meta-analysis, theoretical studies, analytical, qualitative study.
Types of different research (Mishra & Alok, 2017)
- Descriptive vs. analytical research
- Applied vs. fundamental research
- Conceptual vs. empirical research
Other types of research
- one-time research vs. longitudinal research
- field-setting research vs. laboratory research vs. model research
- clinical vs. diagnostic research
- case-study methods vs. exhaustive approaches
- exploratory vs. formalized research: the objective of exploratory research is the creation of hypotheses rather than their testing, whereas in formalized research specific hypotheses are tested
- conclusion-oriented and decision-oriented research

Type of Research Studies
RCT* = randomized controlled trial

Type of Research Studies
Cross-sectional study | Longitudinal study
One point in time | Several points in time
Different samples | Same sample
Change at a societal level | Change at the individual level

Type of Research Studies
Longitudinal study | Pseudo-longitudinal study | Semi-longitudinal study
Several points in time | One point in time | Several points in time
Same sample | Different (age) samples | Different and same samples
Change at the individual level | Maturation | Until the youngest group becomes the oldest group

Research Project
(Williman, 2011) (Elman & Mahoney, 2020; Hendl, 2016)

Research Project
Phases of the research process (Rockmann & Bömermann, 2006)
1. Formulation of a research problem
2. Research planning
3. Implementation of research
4. Evaluation of research
5.
Publication of research results
Scheme of the logical progression of thought in scientific work (Zháněl, 2014)
Research intention → Research problem → Research objective → Research question (hypothesis)
Principles of a good research project (Robson, 1993)
1. it says what you want to do and why you want to do it,
2. it is written clearly and without unnecessary description (secondary facts),
3. it is clearly organized and straightforward

Research Ethics
There are two aspects of ethical issues in research (Williman, 2011)
1. The individual values of the researcher relating to honesty, frankness and personal integrity.
2. The researcher's treatment of other people involved in the research, relating to informed consent, confidentiality, anonymity and courtesy.
The most common problems are
• Plagiarism (or incorrect use of citations)
• Non-disclosure of acknowledgements
• Data collection/analysis/interpretation
• Disclosure of interest; using third-party material
Plagiarism - It is a simple matter to follow the clear guidelines in citation that will prevent you being accused of passing off other people's work as your own (Williman, 2011)

Research Ethics
Helsinki Declaration
General Principles
- The health of my patient will be my first consideration
- The physician must promote and safeguard the health, well-being and rights of patients
- … never take precedence over the rights and interests of individual research subjects
- The researcher protects the life, health, dignity, integrity, right to self-determination, privacy, and confidentiality of personal information of research subjects
- …
- Adopted by the 18th World Medical Association (WMA) General Assembly, Helsinki, Finland, June 1964
- Statement of ethical principles for medical research involving human subjects, including research on identifiable human material and data

Research Ethics
Cherry-picking data
• Carefully chosen time range (on the x-axis) or data scale (on the y-axis)
• Specifically picked data points can hide
important changes in between
• This means the graph is not wrong, but leaving out relevant data can give a misleading impression

Research Ethics
Cherry-picking data
Example of cherry-picking (on the y-axis)
• In 1992, Chevrolet car ads claimed that all Chevy trucks sold in the last 10 years were still on the road. A graph was used to support this claim.
• The graph can also be interpreted as saying that Chevy trucks are twice as dependable as Toyota trucks
• Until you see the scale of the graph, which runs from 95% to 100%
• The second graph shows how the graph looks on a 0% to 100% scale

Research Ethics
Cherry-picking data
Example of cherry-picking (on the x-axis)
• Usually occurs in line graphs showing something changing over time
• The graph shows job losses by quarter
• But the scale on the x-axis is inconsistent
  o September 08 to March 09 = 6 months
  o March 09 to June 10 = 15 months

Research Ethics
Cherry-picking data
Knowing the full significance of the data that a graph presents
• If both graphs use the same data (annual global ocean average temperatures), why do they look so different?
• And if you know that even a rise of half a degree Celsius can cause massive ecological disruption, which graph do you think is more appropriate?

Research Ethics
Cherry-picking data
Example of omitting variables
- Donald J. Trump: USA has the lowest / best mortality rate.
- if we omit better states
Bonus
- Donald J. Trump: "I noticed that the more tests we do, the more we get infected" = tests cause COVID-19?
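The y-axis example from the Chevrolet slide above can be sketched numerically. This is a minimal illustration only: the two percentages are hypothetical values chosen for the sketch (they are not the figures from the ad), and `bar_height_ratio` is a helper name introduced here.

```python
def bar_height_ratio(a: float, b: float, axis_min: float) -> float:
    """Ratio of the drawn bar heights for values a and b
    when the y-axis starts at axis_min instead of at the data's true zero."""
    return (a - axis_min) / (b - axis_min)


# Hypothetical shares of trucks still on the road (assumed for illustration).
brand_a = 98.0
brand_b = 96.5

# Axis drawn from 95% to 100%, as on the truncated graph.
truncated = bar_height_ratio(brand_a, brand_b, axis_min=95.0)

# Axis drawn from 0% to 100%, as on the full-scale graph.
full = bar_height_ratio(brand_a, brand_b, axis_min=0.0)

print(f"Apparent ratio on a 95-100% axis: {truncated:.2f}")
print(f"Apparent ratio on a 0-100% axis:  {full:.2f}")
```

With these assumed values, the truncated axis draws one bar twice as tall as the other, while on the full axis the two bars differ by less than 2%. The data are identical in both cases; only the chosen scale changes the impression.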
Research Ethics
Hacking of the p-value
Also known as data dredging, data snooping, data fishing or data butchery.
Results are reported as statistically significant when, in reality, there is no underlying effect (Head et al., 2015; Norman, 2014).
Academic journals prefer articles with statistically significant results, and researchers are forced to publish in high-quality journals (with a higher Impact Factor, IF).
In practice we often encounter so-called p-hacking and salami publication/slicing (splitting research data/results into several publications), which cause Type I errors (rejecting a true null hypothesis: a false positive, α).
Prevention of p-hacking
- Create two sets of data (cross-validation method)
- Build a strong project (do not change the project methodology during the research)
- Use the Bonferroni correction (for multiple statistical analyses)
- Scheffé's method
- False discovery rate
- Raw data publishing
p-hacking leads to false positive results = a negative impact on the future research field and on researchers' prestige

Research Ethics
Hacking of the p-value
A new way to reduce p-hacking or other unethical manipulation is a 2-step manuscript submission
  o First you submit only the introduction and the description of your method, and the journal decides whether to publish before seeing your results
Step 1 | Step 2
Blind peer review submission process

The Literature Review
Step-by-step guide to a literature review
1. Search for relevant literature on the selected topic
   a. What? Books, academic sources (journal articles)
   b. Where? Google Scholar; discovery.muni.cz; Web of Science; ScienceDirect; PubMed
   c. Define your keywords and their synonyms
   d. For more relevant results use Boolean operators
2. Evaluate and select sources
3. Identify themes, debates and gaps
4. Outline your project structure
5.
Write and rewrite what you wrote
“If you steal from one author it’s plagiarism, if you steal from many it’s research” - Wilson Mizner

The Literature Review
Step-by-step guide to a literature review
1. Search for relevant literature on the selected topic
2. Evaluate and select sources
   a. you can't read everything that has ever been written (unless your topic is very)
   b. read just the abstract, which normally consists of background, aim, methods, results, conclusion and sometimes even recommendations for practice
   c. save relevant articles (in PDF) for later use
   d. check the cited literature to find other relevant sources
   e. tips: pay attention to the citation count and to experts in the selected field

The Literature Review
Step-by-step guide to a literature review
1. Search for relevant literature on the selected topic
2. Evaluate and select sources
3. Identify themes, debates and gaps
   a. Find connections between different sources
      i. where they agree and disagree
      ii. slightly or very different conclusions
      iii. trends and patterns
      iv. processes, methods, equipment and statistical tests used
      v. gaps
4. Outline your project structure
   a. different approaches: the most used approach is from general to specific (this applies to chapters, subchapters and also to paragraphs)
5.
Write and rewrite what you wrote

The Research Hypotheses
Hypotheses are statements that relate to the existence of a relationship between variables or to the prediction of defined variables using other variables (Zháněl, 2014)
- Without a hypothesis the research is unfocused
- A hypothesis is the necessary link between theory and investigation
Sources of hypotheses
- Theory and studies (literature research)
- Observation (from own practice / experience)
- Intuition (less than theory and observation)
- Culture (behavior or beliefs of a social, ethnic or age group)
- New trend (possible future experience)

The Research Hypotheses
Characteristics of a good hypothesis
• Power of prediction - it predicts the future situation, not only the present situation
• Based on observation - we cannot verify a thing that we cannot observe
• Simplicity - everyone should be able to understand it
• Clarity - it should be free from ambiguous information
• Testability - it should be possible to test it empirically
• Limit the unninteraction
• Relevant to the problem - a hypothesis is guidance for the identification and solution of the problem
• Specific - avoid general terms, omit unwanted factors (variables)
• Relevant to available research techniques - you must know workable techniques before formulating a hypothesis
• Provides a new suggestion/knowledge/technique/process… - it is not a repetition of what we already know
• Consistency and harmony - there must be a close relationship between variables, where one is dependent on the other
(Mill, 1963)

The Research Hypotheses
Types of research hypotheses
Working hypothesis - not very specific, can be easily modified, used with insufficient data; example: Chocolate before training ensures maximum performance
Descriptive hypothesis - the variable can be a situation, event, organization, person, group or object
- Relation hypothesis (describes a relationship, positive, negative or causal, between two variables) example:
Children from higher-income families spend more time on leisure physical activities
- Formalised hypothesis (cause and effect relationship between an independent and a dependent variable) example: If families have higher incomes, then they spend more money on leisure physical activities
1. Null hypothesis (H0) - predicts that there is no relationship between two variables; example: after a 3-month training program, there are no statistically significant differences in the muscular strength of the knee extensors between the experimental and control groups.
2. Alternative hypothesis (Ha/H1) - the opposite statement to H0. For acceptance of Ha, H0 must be rejected first. Example: after a 3-month training program, there are statistically significant differences in the muscular strength of the knee extensors between the experimental and control groups
   1. Directional (left or right tailed test) - a hypothesis in which we can predict the effect (positive/negative) of one variable on others. Example: Girls are more flexible than boys
   2.
Non-directional (two tailed test) - a hypothesis in which we cannot predict the effect, but state a relationship between variables (we do not know what kind of difference); example: there will be a difference in the performance of the experimental and control groups
(Zháněl, 2014; Cauvery et al., 2010)

The Research Hypotheses
Rejecting the null hypothesis
Conclusion from hypothesis test | Actual value (reality): Positive | Actual value (reality): Negative
Positive | TRUE positive (power, 1 - β), correct | FALSE positive, Type I error (α)
Negative | FALSE negative, Type II error (β) | TRUE negative, correct

The Research Hypotheses
Rejecting the null hypothesis
- β = probability of a Type II error, known as a "false negative"
- 1 - β = probability of a "true positive", i.e. correctly rejecting the null hypothesis; also known as the power of the test
- α = probability of a Type I error, known as a "false positive"
- 1 - α = probability of a "true negative", i.e. correctly not rejecting the null hypothesis

Statistical Hypothesis Testing
- Null hypothesis (H0)
  - Assumption about the outcome
    There is no relationship between two variables (for correlation)
    There is no difference between the means of two populations (for t-test)
- p-value
  - Probability of observing a result at least as extreme as the one obtained, given that the null hypothesis is true
  - p ≤ α (0.05): reject H0, different distribution.
  - p > α (0.05): fail to reject H0, same distribution.
- Type I Error
  - Rejecting the null hypothesis when there is in fact no significant effect (false positive).
  - The p-value is optimistically small.
- Type II Error
  - Not rejecting the null hypothesis when there is a significant effect (false negative).
  - The p-value is pessimistically large.

Statistical Power
- Statistical power (or the power of a hypothesis test) is the probability that the test correctly rejects the null hypothesis.
- Statistical power has relevance only when the null is false. (Ellis, 2010)
- The higher the statistical power, the lower the probability of making a Type II error (false negative), that is, the higher the probability of detecting an effect when there is an effect.
Figure labels: 1 - α, 1 - β

Statistical Power
- Low statistical power: large risk of committing Type II errors (a false negative)
- High statistical power: small risk of committing Type II errors
- Experimental results with too low statistical power will lead to invalid conclusions about the meaning of the results.
- Therefore a minimum level of statistical power must be sought
- It is common to design experiments with a statistical power of 80% or better, i.e. 0.80.
- This means a 20% probability of encountering a Type II error.
- This differs from the 5% likelihood of encountering a Type I error at the standard significance level.
- Software for power analysis: G*Power

Statistical Power
Power Analysis
- Effect Size
  - The quantified magnitude of a result present in the population
    Pearson's correlation coefficient (r) for the relationship between variables
    Cohen's d for the difference between groups
- Sample Size
  - The number of observations in the sample.
- Significance
  - The significance level used in the statistical test
    Alpha (α) = 5% or 0.05
    Alpha (α) = 1% or 0.01
    Alpha (α) = 0.1% or 0.001
- Statistical Power
  - The probability of accepting the alternative hypothesis if it is true.

Statistical Power
Power Analysis
All variables are related:
- a larger sample size can make an effect easier to detect
- the statistical power of a test can be increased by increasing the significance level
- the statistical power can be estimated given an effect size, sample size and significance level
- the sample size can be estimated given different desired levels of significance

Statistical Power
Power Analysis
Power analysis answers these questions:
- How much statistical power does my study have?
- How big a sample size do I need?
- It is also used to estimate the minimum sample size required for an experiment
A priori power analysis = power analysis made before a study is conducted
Post-hoc power (observed power) = used as a follow-up analysis, if a finding is non-significant

Data Collection Methods
Collecting Primary and Secondary Data
Collecting Qualitative and Quantitative Data

Sampling
- Studying the total population is usually not possible and is impracticable.
- Practical limitations such as cost, time and other factors, which usually operate in the situation, stand in the way of studying the total population
- Sampling is the process of selecting a few (a sample) from a bigger group (the sampling population) to become the basis for estimating or predicting the prevalence of an unknown piece of information, situation, or outcome regarding the bigger group
Basically we have two types of sample:
- Random sample
- Non-random sample
(Singh, 2006; Kumar, 2011)

Sampling
Figure labels: Population, Research Population, Random Sample, Non-random Sample

Sampling
Characteristics of a Good Sample
• is a true representative of the population
• is free from bias
• is objective
• is comprehensive in nature
• maintains accuracy
• yields accurate results
Size of a Sample
The general rule is that a larger sample is better (it is likely to be more representative of the population, and the data are more accurate and precise, with a smaller standard error). But this is not always true: with very large samples even negligible effects become statistically significant, so the risk of claiming that something meaningful exists when it does not grows with the sample size. We can reduce this risk by reporting effect sizes or by calculating the statistical power of our sample to determine an adequate sample size. (Singh, 2006)

Statistical Analysis
Statistical analysis is the science of collecting data and uncovering patterns and trends
After collecting data you can analyse it to:
1. Summarize the data (make a pie chart)
   1. By Measures of Distribution or Frequency
2. Find key measures of location (median, mean, …)
   1. By Measures of Central Tendency
3. Calculate measures of spread: find out whether your data are tightly clustered or spread out (R, SD, IQR)
   1. By Measures of Variation and Position
4. Make future predictions based on past behavior
5. Test an experiment's hypothesis (hypothesis testing)
   1. p < 0.05 = statistically significant results, we reject H0 (because a result this extreme would occur in less than 5% of cases if H0 were valid)
   2.
p > 0.05 = statistically non-significant results, we fail to reject H0 (because a result this extreme would occur in more than 5% of cases if H0 were valid)

Selected Topics in Sport Science – Research Methodology and Statistics / Department of Kinesiology

Statistical Analysis
Note: you cannot accept H0 (only reject it or fail to reject it) or Ha … … after rejecting H0, we can add "we can accept Ha"

Characteristics of a Good Test
Reliability, Validity, Objectivity and Usability
Reliability means consistency, dependability or trustworthiness.
Measurement reliability is the consistency with which a test yields the same result in measuring whatever it does measure.
Example 1: If the mean score after the first test is 80, and one week later we use the same test on the same sample and the mean score is:
- 80 = the test provided stable and dependable results
- 102 = the test results are not consistent
Example 2: how two different teachers will evaluate the same test results
Types of reliability testing methods:
- Test-retest method (the same test over time)
- Interrater method (the same test conducted by different people)
- Parallel forms method (different versions of a test which are designed to be equivalent)
- Split-half method (the same test divided into two equivalent halves)
- Internal consistency method (correlation between multiple items in a test that are intended to measure the same construct)
Methods of determining reliability
• Intraclass Correlation Coefficient (ICC) - defines a measure's ability to discriminate among subjects
• Standard Error of Measurement (SEM) - quantifies error in the same units as the original measurement
(Gronlund and Linn, 1995; Ebel and Frisbie, 1991)

Characteristics of a Good Test
Reliability, Validity, Objectivity and Usability
Factors Affecting Reliability
- Length of the test
- Content of the test
- Spread of scores
- Heterogeneity of the group
- Experience with
the test
- Motivation
- Testing procedure
- Time limit of the test
- Cheating opportunity
How High Should Reliability Be?
Cronbach's coefficient alpha (α) | Reliability
0.80 to 0.95 | Very good
0.70 to 0.80 | Good
0.60 to 0.70 | Fair
< 0.60 | Poor
Note: what's wrong with this interval?

Characteristics of a Good Test
Reliability, Validity, Objectivity and Usability
• Validity is the extent to which the test measures what the test maker intends to measure
• It means the truthfulness of a test
• Validity does not have different types. It is a unitary concept, based on various types of evidence
Factors affecting validity
- Unclear directions on how to respond to the test
- Difficulty of the reading vocabulary and sentence structure
- Too easy or too difficult test items
- Ambiguous statements in the test items
- Inappropriate test items for measuring a particular outcome
- Inadequate time provided to take the test
- Length of the test
- Unfair aid to individual students (asking for help)
- Cheating during testing
- Unreliable scoring
- Anxiety, physical or psychological state of the pupil
- Response set (pattern in responding)

Characteristics of a Good Test
Reliability, Validity, Objectivity and Usability
• Validity is the extent to which the test measures what the test maker intends to measure
• It means the truthfulness of a test
• Validity does not have different types.
It is a unitary concept, based on various types of evidence
Figure labels: Neither reliable nor valid | Reliable but not valid | Both reliable and valid | Not reliable but valid

Characteristics of a Good Test
Reliability, Validity, Objectivity and Usability
Objectivity is the extent to which the instrument is free from personal error (personal bias), that is, subjectivity on the part of the scorer (Good, 1973).
A test is considered objective when it eliminates the scorer's personal opinion and biased judgement.
It affects both the validity and the reliability of test scores.

Characteristics of a Good Test
Reliability, Validity, Objectivity and Usability
While constructing a test, you need to keep in mind two main aspects of objectivity
1. Objectivity in scoring - the same person or different persons scoring the test at any time arrive at the same result without any chance error (personal individual judgement should not affect the test scores). The scoring procedures must be clearly defined (without doubt or ambiguity)
2. Objectivity of test items - the item must call for a definite single answer
   1.
free from ambiguity and sentences with a dual meaning (they make the test subjective)

Characteristics of a Good Test
Reliability, Validity, Objectivity and Usability
- The test must have practical value from the point of view of time, economy and administration
- Practical considerations cannot be neglected
While constructing or selecting a test you must take into account:
- Ease of Administration - any trained person can administer and evaluate it
- Time required for administration - an appropriate time limit for the test (20-60 minutes)
- Ease of Interpretation and Application - misinterpreted results are harmful, and results that are not applied are useless
- Availability of Equivalent Forms - you should have available equivalent forms of the same test in terms of content, level of difficulty and other characteristics
- Cost of Testing - a test should be economical from the point of view of preparation, administration and scoring

Variable Types and Scaling
Research Variables
Research variables are things you measure, manipulate and control in statistics and research
- Person, place, thing, idea, …
The most common types of variable
  o Independent variable (IV) - a singular characteristic that the other variables in your experiment cannot change, but the IV can change other variables
    ▪ Age (eating or exercise habits do not change your biological age, but as a senior you will not lift the same weight)
  o Dependent variable (DV) - relies on and can be changed by other components; the IV can influence the DV, but the DV cannot influence the IV. The researcher's goal is to determine what makes the dependent variable change and how.
    ▪ A grade on an exam (it depends on factors such as how much you slept or how long you studied, but your test does not affect the time you spent studying)
  o Control (controlling) variables - are held constant and do not change during a study, so that they have no effect on other variables.
    ▪ If we are investigating how the amount you slept (IV) affects a grade on an exam (DV), we need to control the time spent learning, the level of the students and more
Other variables: intervening (mediator) variables, moderating (moderator) variables, extraneous variables, quantitative (numerical) variables, qualitative (categorical) variables, composite variables

Variable Types and Scaling
Research Variables
The most common types of variable
  o Independent variable (IV) - a singular characteristic that the other variables in your experiment cannot change, but the IV can change other variables
    ▪ Age (eating or exercise habits do not change your biological age, but as a senior you will not lift the same weight)
    ▪ In t-test/ANOVA the IV = categorical variable (nominal/ordinal)
  o Dependent variable (DV) - relies on and can be changed by other components; the IV can influence the DV, but the DV cannot influence the IV. The researcher's goal is to determine what makes the dependent variable change and how.
    ▪ A grade on an exam (it depends on factors such as how much you slept or how long you studied, but your test does not affect the time you spent studying)
    ▪ In t-test/ANOVA the DV = continuous measurement (interval/ratio)
Figure labels: Dependent variable | Independent variable

Variable Types and Scaling
Research Variables
t-test (independent, one or two samples)
- Dependent variable: Value, DV (1x)
- Independent variable: Group A/B, IV (1x), 1-2 IV groups
- One sample t-test: 1 IV group
- Independent samples t-test: 2 IV groups
- Paired samples t-test: 2 IV groups
One-Way ANOVA (Analysis of Variance)
- Dependent variable (response variable): Exam Score, DV (1x)
- Independent variable (factor): Studying Technique A/B/C, IV (1x), 3+ IV groups
- Between-group variation (differences among group means)
- Within-group variation (variability within each group)

Variable Types and Scaling
Research Variables
ANCOVA (Analysis of Covariance)
One-Way MANOVA (Multivariate Analysis of Variance)
Diagram labels: Annual Income, Level of Education, Student Loan Debt; IV (1x), DV (2x)
Diagram labels: Exam
ScoreStudying Technique Current Grade IV (2x) DV (1x) Covariates One-Way ANOVA (Analysis of Variance) Dependent variable (Response Variable) Independent variable (Factor) Exam ScoreStudying Technique A/B/C IV (1x) DV (1x) 3+IVgroups 61 Variable Types and Scaling MaSiSS Levels of Measurement is a classification that describes the nature of information within the values assigned to variables (Stevens, 1936) ̶ nominal, ordinal, and interval/ratio Named variables Named + Ordered variables Named + Orderd variables + Proportionate interval between variables Named + Orderd variables + Proportionate interval between variables + Absolute zero Nominal Ordinal Interval Ratio Qualitative data or Nonparametric data Quantitative data or Parametric data or Scale 62 Variable Types and Scaling MaSiSS Levels of Measurement Types of variables Qualitative variable (categorical, words) Division according to the possibility of arrangement Nominal variable (cannot by order) Ordinal variable (can by order) Division according to number of categories Dichotomous (2 variant) Multiples (> 2 variant) Quantitative variables (numerical, numbers) Discrete variable Continous variable Levels of Measurement (Scale of Measure) are Nominal, Ordinal, Interval, Ratio (Litschmannová, 2011) 63 Variable Types and Scaling MaSiSS Levels of Measurement Levels of Measurement (Scale of Measure) are Nominal, Ordinal, Interval, Ratio Nominal Ordinal Interval Ratio Qualitative Quantitative Discrete Continous Categorical Dichotomis Multiples (Řezanková et al., 2017) 64 Variable Types and Scaling MaSiSS Levels of Measurement • Nominal data: o The number of female athletes in football association o Your political party affiliation o The state/region/city where you were born o The color of your hair • Ordinal data: o Order of finish in race/contest/tournament o A school grades (A, B, C, D, F) o Ranking of chilli peppers (hot, hotter, hottest; not Scoville scale, SHU) o Student’s year of study (freshman, Sophomore, 
junior, senior) o Cancer stage (I, II, III, IV) • Interval data: o Intelligence Quotient scores o Dates on calendar o The heights of waves in the ocean o Shoe size o Longitudes on map/globe • Ratio data: o Height o Pulse o length o Money in your bank/wallet/pocket o Monthly Income/expenses Levels of Measurement (Scale of Measure) are Nominal, Ordinal, Interval, Ratio 65 Misuse of statistics MaSiSS ̶ Target (2002): Can we determine which customers are pregnant? ̶ Even if they don‘t want us to wnow. ̶ Andrew Pole: ̶ yes + also they expectd due date is = Target send right coupons at the right time 66 Misuse of statistics MaSiSS ̶ Elderly woman was robbed (1964): ̶ She saw: Blonde woman, ponytail ̶ a passing man saw: yellow car driven by a black man (had beard and a mustache) ̶ Police catche: Janet and Malcom Collins They matched all the descriptions Prove of guilt for Collins by mathematician ̶ Mathematician calculated the probability of just randomly selecting a couple that was inocent and also share all charakteristice ̶ 1 in 12 million chance that Collins are innocent Selected Topics in Sport Science – Research Methodology and Statistics / Department of Kinesiology67 Misuse of statistics MaSiSS ̶ Sally Clark: guilty of murdering her two infant son‘s in 90s. ̶ 1st son: died suddenly due to unknown causes ̶ 2nd son: found dead 8 week after birth of suddent unknown causes ̶ During the trial a pediatrician professor: chance of two suddent unknown causes is 1 in 73 million 68 Misuse of statistics MaSiSS Elderly woman was robbed (1964) ̶ People v. Collins ̶ Example of Prosecutor‘s fallacy ̶ P(A/B) ≠ P(B/A) 69 Misuse of statistics MaSiSS Given: ̶ Behind curtain is an animal with 4 leg What is the probability that it‘s a dog ̶ 1 in 1000 Given: ̶ Behind curtain is a dog What is the probability that it has 4 legs? ̶ Almost 100% By switching the given and the question = change of probability 70 Misuse of statistics MaSiSS Elderly woman was robbed (1964) ̶ People v. 
Collins ̶ Example of Prosecutor‘s fallacy ̶ P(A/B) ≠ P(B/A) Given: ̶ Couple fit all description (blonde woman,…). ̶ < 1 in 12 million chance of innocence WRONG 71 Misuse of statistics MaSiSS Given: ̶ Innocent couple ̶ < 1 in 12 million ̶ Probability of fiffing all descriptions Right ̶ Example: random couple (of a mall) had very small chance of fitting the description People v. Collins Given: ̶ Couple fit all description (blonde woman,…). ̶ < 1 in 12 million chance of innocence WRONG (eg. Prosecutors fallacy) 72 Misuse of statistics MaSiSS Bacterial test reveal more specific information that a simple multiplication of two probabilities They were independent of each other (genetic and enviromental factor) Lost childrens (due natural caused), accused of murdering, found guilty (due to a misuse of statistic), spent 3 years in prison, after release not able to recovery and died (alcohol poisoning) Sally Clark - guilty of murdering her two infant son‘s 73 Misuse of statistics MaSiSS 5% If high school dropout rates go from 5% to 10% 10% 5% increase 100% increase 74 Misuse of statistics MaSiSS 1 in 1 mil ̶ .0001% If dropout rates are 1 in a million people and then next year go to 2 in million people • Headlines 1: Dropout rates go up by 100% • Heallines 2: Dropout rates go up by .0001% 2 in 1 mil ̶ .0002% 100% increase + .001% 75 Misuse of statistics MaSiSS 1 in 7000 ̶ ≈.014% UK committee on sefty of medicines (1995) • certain type of birth control pill increaased the risk of life-threating blood clots by 100% 2 in 7000 ̶ ≈.028% 100% increase + .014% Older pill New pill 76 Misuse of statistics MaSiSS Correlation or Causation? ̶ Or both? ̶ If A correlated with B, it don‘t mean that A cause B ̶ Rotating turbines correlated with wind ̶ but tubine don‘t cause wind, wind cause rotation of turbine 77 Misuse of statistics MaSiSS Correlation or Causation? ̶ Or both? ̶ Violent show cause kids to be more violent? 
- It is possible, but …
- More violent kids watch more violent TV shows
- This also seems reasonable

78 Misuse of statistics MaSiSS
Correlation or Causation?
- Or both?
- People who had head lice were healthy, and sick people rarely had head lice
- Do head lice cause better health?
- No: head lice are very sensitive to body temperature

79 Misuse of statistics MaSiSS
Correlation or Causation?
- Or both?
- Ice cream sales do not cause an increase in heat strokes, or the other way round, even though they are correlated
- Hot weather causes both (i.e. Third-Cause Fallacy)

80 Misuse of statistics MaSiSS
Correlation or Causation?
- Or both?
- CO2 production increased along with obesity. Does one cause the other?
- No, wealthy populations eat more and produce more CO2 (i.e. Third-Cause Fallacy)

81 Misuse of statistics MaSiSS
Correlation or Causation?
- Or neither?
- Cheese consumption per person and the number of people who died by becoming tangled in their bedsheets (r = 0.947)

82 Misuse of statistics MaSiSS
Correlation or Causation?
- Or neither?
- Mozzarella cheese consumption per person and the number of doctorates awarded in engineering (r = 0.950)
Both are examples of a false (spurious) correlation

83 Descriptive Statistic MaSiSS
1st step: Descriptive Statistic – presenting, organizing, simplifying, and summarizing data
- Measures of Distribution or Frequency
- Measures of Central Tendency
- Measures of Dispersion or Variation
- Measures of Position
2nd step: Inferential Statistic – drawing conclusions about a population based on data observed in a sample
- Comparison tests
- Correlation tests
- Regression tests

84 Descriptive Statistic MaSiSS
1. Measures of Distribution or Frequency
2. Measures of Central Tendency
3. Measures of Dispersion or Variation
4. Measures of Position

85 Descriptive Statistic MaSiSS
Measures of Distribution or Frequency
- a graph or data set organized to show the frequency of occurrence of each possible outcome of a repeatable event observed many times
- compare one part of the distribution to another part of the distribution
- Count, Percent, Frequency

87 Descriptive Statistic MaSiSS
Measures of Central Tendency
A measure of central tendency is the number used to represent the center or middle of a set of data values
• Mean, Median, Mode

88 Descriptive Statistic MaSiSS
Measures of Central Tendency
- Skewness: right (positive) skew; left (negative) skew
- Kurtosis: negative kurtosis (platykurtic); normal kurtosis (mesokurtic); positive kurtosis (leptokurtic)

89 Descriptive Statistic MaSiSS
Measures of Dispersion or Variation
How far apart data points lie from each other and from the center of a distribution (mean ± SD)
• Range, IQR, Standard Deviation, Variance
• Range (R) = x_max - x_min
• Interquartile range (IQR) = Q3 - Q1
• Standard deviation (SD, s) is the average amount of variability in your dataset
• s = sample standard deviation
• σ = population standard deviation
• Variance (s²) is the average of squared deviations from the mean
• To get the variance, square the standard deviation (s)
• s² = sample variance
• σ² = population variance

92 Descriptive Statistic MaSiSS
Measures of Position
Determine the position of a single value in relation to other values in a sample or a population data set.
• Quantiles (Percentile, Decile, Quartile), Outliers, Standard scores
o Percentiles (Pi) – divide a rank-ordered data set (from the smallest to the largest) into 100 equal parts
o Deciles – divide a rank-ordered data set into ten equal parts
o Quartiles (Qi) – divide a rank-ordered data set into four equal parts
▪ Q1 = P25, Q2 = P50 = median
o Standard scores – raw scores that, for ease of interpretation, are converted to a common scale of measurement, or z distribution
▪ The z-score indicates how many standard deviations an element is from the mean
▪ The t-score lets you take an individual score and transform it into a standardized form, which helps you to compare scores.

93 Inferential or Analytical Statistic MaSiSS
- Inferential statistics takes data from a sample and makes inferences about the larger population from which the sample was drawn
- The goal of inferential statistics is to draw conclusions from a sample and generalize them to a population

94 Inferential statistic (p-value) MaSiSS
Lady tasting tea
Ronald Fisher (1890–1962)
- The experiment provides a subject with 8 randomly ordered cups of tea
- 4 prepared by first pouring the tea, then adding the milk
- 4 prepared by first pouring the milk, then adding the tea
n = 8 total cups
k = 4 cups chosen

96 Inferential statistic (p-value) MaSiSS
Lady tasting tea
How many cups must be correctly identified to conclude that the subject can truly tell the difference?
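The question above can be explored by counting how many of the possible selections of 4 cups out of 8 contain a given number of correct picks. The following is a minimal sketch, assuming the subject guesses purely at random (the null hypothesis); the function name `tea_p_value` is hypothetical and only used for illustration.

```python
from math import comb

def tea_p_value(correct, cups=8, chosen=4):
    """P(at least `correct` right picks) if the subject is only guessing.

    Assumes `chosen` cups of one kind are hidden among `cups` cups and
    the subject selects exactly `chosen` cups (hypergeometric counting).
    """
    wrong_kind = cups - chosen
    total = comb(cups, chosen)  # all equally likely selections (70 for 8 choose 4)
    ways = sum(
        comb(chosen, k) * comb(wrong_kind, chosen - k)
        for k in range(correct, chosen + 1)
    )
    return ways / total

for c in (4, 3):
    print(f"at least {c} of 4 correct: p = {tea_p_value(c):.4f}")
```

Under these assumptions, only a perfect 4 of 4 gives a probability below the conventional 5% level; allowing one mistake already makes pure guessing a plausible explanation.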
Lady Ottoline Violet Anne Morrell (1873–1938), English aristocrat and society hostess

97 Inferential statistic (p-value) MaSiSS
Lady tasting tea
n = 8 total cups
k = 4 cups chosen
4 of 4 successes
- 1 / 70 = 0.0143 (1.43%)
At least 3 of 4 successes
- (16 + 1) / 70 = 0.243 (24.3%)
Fisher was willing to reject the null hypothesis … acknowledging the lady's ability at a 1.43% significance level
Fisher's Exact Test

98 Inferential or Analytical Statistic MaSiSS
Hypothesis (or Predictions) Testing
• Comparison tests
• t-test, ANOVA, Mann-Whitney U test, Wilcoxon test, Kruskal-Wallis H test
• Correlation tests
• Pearson's r, Spearman's rho, Chi-square test
• Regression tests
• Simple linear regression, Multiple linear regression, …

99 Inferential or Analytical Statistic MaSiSS
Hypothesis (or Predictions) Testing
To choose an adequate comparison test, you need to know:
- Which test answers the research question (Validity)
- Scales of measurement
o Nominal, Ordinal, Interval, Ratio data
o Binary, Multiple, Discrete, Continuous data
o Binary, Nominal, Ordinal, Discrete, Continuous data
- Parametric or nonparametric data
o using a normal distribution test (only for quantitative data)
- Paired (repeated measurements) or independent (unpaired) samples

100 Inferential or Analytical Statistic MaSiSS
Hypothesis (or Predictions) Testing
Examples of Comparison Tests
- Suitable tests for nominal data
o Odds ratio test (OR)
o Relative risk (RR)
- Suitable tests for ordinal data
o Chi-square test, χ²
▪ Goodness of fit (observed vs. expected)
▪ Homogeneity (separate subgroups)
▪ Independence (same population)
- Suitable tests for interval and ratio data (Scale data)
o t-test
▪ one sample
▪ two independent samples
▪ two paired samples
o ANOVA test
▪ for three or more samples
▪ also for repeated measurements

101 Inferential or Analytical Statistic MaSiSS
Hypothesis (or Predictions) Testing
William Sealy Gosset (aka Student)

102 Inferential or Analytical Statistic MaSiSS
Hypothesis (or Predictions) Testing
Correlation tests
- Pearson's r
- parametric, quantitative data
- Spearman's rho
- nonparametric data; quantitative data that are handled like qualitative data
- Chi-square test (χ²)
- qualitative data
- Dimension reduction techniques
- Factor analysis (use when you assume an association and want to understand the latent factors)
- Principal Component Analysis (PCA; seeks to identify the components and to predict using them)

103 Inferential or Analytical Statistic MaSiSS
Hypothesis (or Predictions) Testing
Regression tests
- Simple linear regression (1 metric IV; 1 metric DV)
- Multiple linear regression (2+ metric IVs; 1 metric DV)
- Logistic regression (1 IV of any type; 1 binary DV)
- Nominal regression (1 IV of any type; 1 nominal DV)
- Ordinal logistic regression (1 IV of any type; 1 ordinal DV)

105 Inferential or Analytical Statistic MaSiSS
Hypothesis (or Predictions) Testing
[Figure: logistic regression example; outcome: Win/pass vs. Loss/fail; predictor: serve speed / test points]

106 Multiple comparisons problem MaSiSS
- For 1 test: p = 0.04 (α = 0.05)
- For 2 tests: p = 0.04 and 0.02; α is no longer 0.05 but 0.05 / 2 = 0.025
- For 4 tests: p = 0.04; 0.02; 0.01; 0.001; α is no longer 0.05 but 0.05 / 4 = 0.0125

107 Multiple comparisons problem MaSiSS
Columns: F = Females, M = Males; "p (gender)" = p (gender dif.); the last two columns test the difference between 0.00% and 0.11% BrAC.
Parameter | 0.00% BrAC F Mean | F SD | M Mean | M SD | p (gender) | 0.11% BrAC F Mean | F SD | M Mean | M SD | p (gender) | p (females) | p (males)
Foot rotation, ° | 3.24 | 4.06 | 7.87 | 5.84 | <0.001* | 3.94 | 4.17 | 7.80 | 4.97 | <0.001* | 0.484 | 0.001*
Stride length, cm | 128.40 | 12.40 | 135.07 | 13.42 | 0.006* | 134.87 | 13.84 | 138.47 | 14.75 | 0.167 | 0.572 | 0.401
Step width, cm | 9.26 | 2.16 | 12.09 | 2.80 | <0.001* | 9.54 | 2.47 | 11.96 | 3.56 | <0.001* | 0.025* | 0.508
Stance phase, % | 63.04 | 2.12 | 62.55 | 2.19 | 0.132 | 61.90 | 2.36 | 62.34 | 2.55 | 0.620 | 0.013* | 0.308
Load response, % | 12.77 | 1.78 | 12.80 | 2.13 | 0.615 | 12.42 | 2.02 | 12.32 | 2.23 | 0.482 | 0.806 | 0.496
Single limb support, % | 37.39 | 2.17 | 37.49 | 2.34 | 0.807 | 37.49 | 1.94 | 37.67 | 2.58 | 0.566 | 0.498 | 0.140
Pre-Swing, % | 12.89 | 1.71 | 12.24 | 1.88 | 0.069 | 12.08 | 1.71 | 12.35 | 2.27 | 0.684 | 0.295 | 0.020*
Swing phase, % | 36.96 | 2.12 | 37.45 | 2.19 | 0.132 | 38.10 | 2.36 | 37.66 | 2.55 | 0.620 | 0.013* | 0.308
Double stance phase, % | 25.66 | 3.06 | 25.03 | 3.55 | 0.201 | 24.47 | 3.26 | 24.68 | 4.17 | 0.863 | 0.308 | 0.088
Stride time, sec | 1.12 | 0.10 | 1.18 | 0.09 | 0.002* | 1.13 | 0.10 | 1.19 | 0.09 | <0.001* | 0.057 | 0.134
Cadence, steps/min | 107.58 | 9.45 | 102.20 | 7.53 | 0.002* | 106.91 | 8.83 | 101.55 | 6.96 | <0.001* | 0.052 | 0.249
Velocity, km/h | 4.17 | 0.65 | 4.14 | 0.48 | 0.885 | 4.34 | 0.67 | 4.22 | 0.49 | 0.391 | 0.521 | 0.197
Table 2. Mean and SD of analysed parameters in forward gait for 0.00% and 0.11% BrAC conditions in females and males.
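Each p-value column in Table 2 covers 12 gait parameters, so 12 tests are run per comparison. The following is a minimal sketch of how a stricter per-test threshold could be derived and applied, assuming a Bonferroni adjustment with m = 12; the function name `bonferroni_threshold` is hypothetical, and only a few p-values from the 0.00% BrAC gender-difference column are copied in for illustration.

```python
def bonferroni_threshold(alpha, m):
    """Per-test significance threshold for m tests (Bonferroni adjustment)."""
    return alpha / m

# Assumption: 12 gait parameters are tested in each p-value column of Table 2.
alpha_adj = bonferroni_threshold(0.05, 12)

# A few "p (gender dif.)" values copied from the 0.00% BrAC condition.
p_values = {
    "Stride length": 0.006,
    "Stride time": 0.002,
    "Cadence": 0.002,
    "Velocity": 0.885,
}

# Keep only the parameters that remain significant after the adjustment.
still_significant = [name for name, p in p_values.items() if p < alpha_adj]

print(f"adjusted alpha = {alpha_adj:.5f}")
print(still_significant)
```

In this sketch, stride length (p = 0.006) is below the unadjusted 0.05 level but not below the adjusted threshold, which is the practical consequence of correcting for multiple comparisons.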
α = 0.05
Adjusted α = 0.0041 (0.05 / 12 tests)

108 Multiple comparisons problem MaSiSS
- False Discovery Rate (FDR) – the expected proportion of type I errors (false positives) among the significant results
- Family-wise error rate (FWER) – the probability of making at least one type I error
If you repeat a test enough times, you will always get a number of false positives

109 Multiple comparisons problem MaSiSS
How to correct for the risk of a type I error
Benjamini-Hochberg (p_k = (k / m) × Q)
- Controls the FDR
Holm correction (p_k = α / (m + 1 – k))
- More powerful than Bonferroni
- Controls the FWER
Bonferroni correction/adjustment (p_k = α / m)
- Controls the FWER
k = rank of the ordered p-value; m = number of tests; Q = FDR (0.05–0.25)

110 Multiple comparisons problem MaSiSS
When not to correct
- If false negatives are also important for future research
- Example: you're researching a new AIDS vaccine. A high number of false positives may be a hint that you're on the right track
- When the results are not statistically significant (p > α)
- For a single test
- In the case of an exploratory study
- we do not know the hypotheses in advance
(McDonald, 2014; Hendl & Remr, 2017)

111 Multiple comparisons problem MaSiSS
What if …?
Bonferroni correction (p_k = α / m)
- Number of tests (m) = 6; α = 0.05
- p_k = 0.008
- Number of tests (m) = 3; α = 0.05
- p_k = 0.017
Class | Upper limb Right | Upper limb Left | p | Lower limb Right | Lower limb Left | p
1st | Mean±SD | Mean±SD | 0.015 | Mean±SD | Mean±SD | 0.010
2nd | Mean±SD | Mean±SD | 0.041 | Mean±SD | Mean±SD | 0.032
3rd | Mean±SD | Mean±SD | 0.05 | Mean±SD | Mean±SD | 0.063

112 Multiple comparisons problem MaSiSS
What if …?
Bonferroni correction (p_k = α / n)
- Number of tests (n) = 6; α = 0.05
- p_k = 0.008
- Number of tests (n) = 3; α = 0.05
- p_k = 0.017
Class | Upper limb Right | Upper limb Left | p | Lower limb Right | Lower limb Left | p
Max GS [Nm] | Mean±SD | Mean±SD | 0.015 | Mean±SD | Mean±SD | 0.010
Power GS [W] | Mean±SD | Mean±SD | 0.041 | Mean±SD | Mean±SD | 0.032
Time to Max GS [s] | Mean±SD | Mean±SD | 0.05 | Mean±SD | Mean±SD | 0.063

113 The Effect size (ES) MaSiSS
• Is a quantitative measure of the magnitude of the experimental effect
• Is considered an essential complement to the statistical significance test, because it shows the relevance of a difference and helps to distinguish between the statistical significance of a test and its practical importance.
• Allows comparisons between statistically significant differences from groups with very different numbers of items, and between groups from different scientific works (meta-analysis)
• Helps to understand the magnitude of the differences found (statistical significance examines whether the findings are likely to be due to chance)
Most common effect size tests
- Cohen's d (Effect size index d)
• Appropriate for comparing two means (t-test)
• Small (d = 0.2; 58%*), Medium (d = 0.5; 69%*), Large (d = 0.8; 79%*)
- Pearson correlation (r)
- Confidence interval (CI)
• is a range of values that you can be (95%) certain contains the true mean of the population.
• there are variants: 90% CI, 95% CI, 99% CI
• Cohen's d [95% CI] = d [lower bound; upper bound] = 0.9 [0.12; 1.52]
- Odds ratio test (OR)
• For a 2x2 table, (AD) / (BC)
• Example for OR (95% CI): 2.1 (2.0 to 2.2)
- Other effect size tests
• Epsilon-squared (ε²), Cohen's w (w), Eta-squared (η²), Cramer's V (V), Effect size Phi (φ), Relative risk or risk ratio (RR), Coefficient of determination (r²), Common Language Effect Size (CLES), ...
*Percentage of the control group below the mean of the experimental group

115 The Effect size (ES) MaSiSS
Most common effect size tests
- Cohen's d (Effect size index d)
• Appropriate for comparing two means (t-test)
• Small (d = 0.2; 58%*)
• Medium (d = 0.5; 69%*)
• Large (d = 0.8; 79%*)
*Percentage of the control group below the mean of the experimental group

116 The Effect size (ES) MaSiSS
Most common effect size tests
- Pearson correlation (r)
- Summarises the strength of the bivariate relationship
- r varies between -1 and +1 (from a perfect negative to a perfect positive correlation)
Correlation matrix for all variables (Banerje et al., 2009)

117 The Effect size (ES) MaSiSS
Most common effect size tests
- Confidence interval (CI)
• is a range of values that you can be (95%) certain contains the true mean of the population.
• there are variants: 90% CI, 95% CI, 99% CI
• Cohen's d [95% CI] = d [lower bound; upper bound] = 0.9 [0.12; 1.52]
Schwarz et al. (2013)

118 The Effect size (ES) MaSiSS
Most common effect size tests
- Odds ratio test (OR)
• For a 2x2 table, (AD) / (BC)
• Example for OR (95% CI): 2.1 (2.0 to 2.2)
- Relative Risk (RR)

119 Probability MaSiSS
- Classic probability
- Frequentist probability
- Bayesian probability

120 Probability MaSiSS
- Classic probability
- Measures the likelihood (probability) of something happening.
- Tossing a coin or a die, …
- P(A) = f / N
- P(A) = probability of event A
- f = number of ways the event can happen (favourable outcomes)
- N = total number of possible outcomes
- For example, the probability of rolling a 6 on a (fair) die is 1 out of 6 (1/6) = one favourable outcome divided by the number of possible outcomes
- Frequentist probability
- Null hypothesis significance testing
- Probability of the observed data under the assumption that the null hypothesis is true
- Bayesian probability
- Probabilities of both hypotheses
- Bayes' theorem
- Uses the idea of updating a belief with new information

121 Probability MaSiSS
- Frequentist probability: P(E|H0)
Important:
- The p-value is not the probability of a theory or hypothesis, but the probability of the observed data under the null hypothesis

122 Probability MaSiSS
- Bayesian probability: P(H|E)

123 Probability MaSiSS
Bayesian probability
- Treasure hunting: Tommy Thompson (1988) – SS Central America with $700,000,000 worth of gold
- Scientists use Bayes to assess how new data validate or invalidate their models
- Programmers use it to build artificial intelligence (quantification of a machine's belief)
- How you view yourself,
- your own opinions and what it takes for your mind to change (reframing thought itself)
- Spam filter (looks at words) = the probability that the email is spam, given that those words appear
$P(H \mid E) = \frac{P(H)\,P(E \mid H)}{P(E)}$
$P(\text{Spam} \mid \text{Word}) = \frac{P(\text{Spam})\,P(\text{Word} \mid \text{Spam})}{P(\text{Word})}$

124 Probability MaSiSS
Bayesian probability
- Bayes never published his theorem
- It was submitted to the Royal Society
- and published after he died (by Richard Price)
- A man comes out of a cave for the first time, sees the sun rise for the first time, and asks himself: Is this a one-off, or does the sun always do this? Then, every day after that, as the sun rose again, he would get a little bit more confident that this is how the world works.
Original thought experiment
- He sat with his back to a perfectly flat, perfectly square table, then asked an assistant to throw a ball onto the table, and he wanted to figure out where it was. So he asked his assistant to throw another ball onto the table and tell him whether it landed to the left or right, in front of or behind the first ball. He would note that down and then ask for more and more balls to be thrown onto the table. Through this method he would keep updating his idea of where the first ball was. He would never be completely certain, but with each new piece of evidence he would get more and more accurate
And that's how Bayes saw the world

125 Probability MaSiSS
Bayesian probability
$P(H \mid E) = \frac{P(H)\,P(E \mid H)}{P(E)}$
- P(H|E) = Posterior = the probability that a hypothesis is true given some evidence; the belief about the hypothesis after seeing the evidence
- P(H) = Prior = the probability that a hypothesis is true (before any evidence); the hardest part of the equation (sometimes just a guess)
- P(E|H) = Likelihood = the probability of seeing the evidence if the hypothesis is true
- P(E) = Marginalization = the probability of seeing the evidence

126 Probability MaSiSS
Bayesian probability
$P(H \mid E) = \frac{P(H)\,P(E \mid H)}{P(E)} = \frac{P(H)\,P(E \mid H)}{P(H)\,P(E \mid H) + P(\neg H)\,P(E \mid \neg H)}$
- P(H|E) = Posterior = the probability that a hypothesis is true given some evidence
- P(H) = Prior = the probability that a hypothesis is true
- P(E|H) = Likelihood = the probability of the evidence given that the hypothesis is true
- P(E) = Marginalization = the probability of the evidence being true

127 Probability MaSiSS
Bayesian probability
Bayes' rule
- You have a hypothesis
- You have observed some evidence
- You want to know the probability that your hypothesis holds given that the evidence is true = P(Hypothesis given Evidence)
Given ("|") = restricting the view only to the possibilities where the evidence holds
$P(H \mid E) = \frac{P(H)\,P(E \mid H)}{P(E)} = \frac{P(H)\,P(E \mid H)}{P(H)\,P(E \mid H) + P(\neg H)\,P(E \mid \neg H)}$

128 Probability MaSiSS
Bayesian probability
Example: you feel a little bit sick, without symptoms, just not 100%. Doctors run a battery of tests and the results are … you tested positive for a rare disease that affects about 0.1% of the population, and it's a nasty disease. The test correctly identifies 99% of people who have the disease and incorrectly identifies only 1% of people who don't have the disease.
- What are the chances that you actually have this disease? 99%??? P(H|E) = 𝑃 𝐻 ∗ 𝑃(𝐸| 𝐻) 𝑃(𝐸) likelihood Posterior Marginalization 129 Probability MaSiSS Bayesian probability Prior probability of having the disease P(H|E) = 𝑃 𝐻 ∗ 𝑃(𝐸| 𝐻) 𝑃(𝐸) You would test + if you had the disease Probability of testing + Actually have the disease given you tested + Prior P(H|E) = 𝑃 𝐻 ∗ 𝑃(𝐸| 𝐻) 𝑃(𝐸) likelihood Posterior Marginalization 130 Probability MaSiSS Bayesian probability Prior probability of having the disease = 9 % P(H|E) = 𝑃 𝐻 ∗ 𝑃(𝐸| 𝐻) 𝑃(𝐸) = 𝑃 𝐻 ∗ 𝑃(𝐸| 𝐻) 𝑃 𝐻 ∗𝑃 𝐸| 𝐻 +𝑃 ¬𝐻 ∗𝑃(𝐸|¬𝐻) P(H|E) = 0.001 ∗ 0.99 0.001∗0.99+0.999∗0.01 You would test + if you had the disease You tested + Probability of testing + Actually have the disease given Actually having the disease after testing + Is it too Low? no, it is common sense applied to mathematics… Probability MaSiSS Bayesian probability o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o 
[Figure: 1,000 people shown as a grid of dots]
N = 1000; n = 1 (actually has the disease); n = 10 (false positives: 1% of the 999 healthy people)
1 in 11 people who test positive actually has the disease = 9%

133 Probability MaSiSS Bayesian probability
̶ Bayes' theorem is not a formula intended to be used just once.
̶ Each time you gain new evidence, you should update your probability that something is true (update a prior belief based on the evidence).
Back to the example:
̶ You tested positive for a rare disease.
̶ You get another doctor's opinion and take a second test, but that test also comes back positive …
What is the probability that you actually have the disease?
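The repeated updating described above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `bayes_update` and the variable names are assumptions, the figures (1 sick person in 1,000, a test that is positive for 99% of sick people and for 1% of healthy people) are taken from the example, and the two tests are assumed to be independent given the true disease status.

```python
def bayes_update(prior, p_pos_if_sick, p_pos_if_healthy):
    """Return P(H|E): probability of disease after ONE positive test."""
    numerator = prior * p_pos_if_sick                      # P(H) * P(E|H)
    evidence = numerator + (1 - prior) * p_pos_if_healthy  # P(E)
    return numerator / evidence


prior = 0.001            # 1 in 1,000 people has the disease
p_pos_if_sick = 0.99     # P(E|H), value used in the example
p_pos_if_healthy = 0.01  # P(E|not H), 1% false positives

# First positive test: the posterior is roughly 9%.
after_first = bayes_update(prior, p_pos_if_sick, p_pos_if_healthy)

# Second positive test: the previous posterior becomes the new prior.
# Assumes the second test is independent of the first given disease status.
after_second = bayes_update(after_first, p_pos_if_sick, p_pos_if_healthy)

print(f"after one positive test:  {after_first:.0%}")
print(f"after two positive tests: {after_second:.0%}")
```

The second result is slightly above 90.7% because the code carries the unrounded first posterior forward instead of the rounded 9%.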
134 Probability MaSiSS Bayesian probability
Prior: P(H) = 0.001
$$P(H \mid E) = \frac{P(H) \cdot P(E \mid H)}{P(E)} = \frac{P(H) \cdot P(E \mid H)}{P(H) \cdot P(E \mid H) + P(\neg H) \cdot P(E \mid \neg H)}$$
$$P(H \mid E) = \frac{0.001 \cdot 0.99}{0.001 \cdot 0.99 + 0.999 \cdot 0.01} \approx 0.09$$
̶ P(H|E): you actually have the disease, given that you tested +
̶ P(H): actually having the disease (prior)
̶ P(E|H): you would test + if you had the disease
̶ P(E): probability of testing +
Probability of having the disease after testing + = 9%
The posterior will be the new prior.

135 Probability MaSiSS Bayesian probability
Prior: P(H) = 0.09 (actually having the disease after testing + once)
$$P(H \mid E) = \frac{P(H) \cdot P(E \mid H)}{P(E)} = \frac{P(H) \cdot P(E \mid H)}{P(H) \cdot P(E \mid H) + P(\neg H) \cdot P(E \mid \neg H)}$$
$$P(H \mid E) = \frac{0.09 \cdot 0.99}{0.09 \cdot 0.99 + 0.91 \cdot 0.01} \approx 0.907$$
̶ P(H|E): you actually have the disease, given that you tested + (2 times)
̶ P(E|H): you would test + if you had the disease
̶ P(E): probability of testing +
New probability of having the disease (after 2 positive tests) = 90.7%

136 Probability MaSiSS Bayesian probability
Kahneman and Tversky (2003), example 1
̶ Steve is very shy and withdrawn, invariably helpful but with very little interest in people or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail.
Is Steve a librarian or a farmer?
Most common answer: librarian
But people hold biased views about the personalities of librarians and farmers (= stereotypes).
(Almost) no one incorporates information about the ratio of farmers to librarians into their judgment.

137 Probability MaSiSS Bayesian probability
!Prior!
P(H) = Prior = share of librarians in the general population (given by the ratio of farmers to librarians)
The probability of H being true (this is prior knowledge)
P(E|H) = Likelihood = proportion of librarians that fit this description
The probability of E being true, given that H is true
P(H|E) = Posterior = belief about the hypothesis after seeing the evidence
The probability of H being true, given that E is true
P(E) = Marginalization = the probability of the evidence being true
$$P(H \mid E) = \frac{P(H) \cdot P(E \mid H)}{P(E)} = \frac{P(H) \cdot P(E \mid H)}{P(H) \cdot P(E \mid H) + P(\neg H) \cdot P(E \mid \neg H)}$$
$$\underbrace{P(H \mid E)}_{\text{Posterior}} = \frac{\overbrace{P(H)}^{\text{Prior}} \cdot \overbrace{P(E \mid H)}^{\text{Likelihood}}}{\underbrace{P(E)}_{\text{Marginalization}}}$$

138 Probability MaSiSS Bayesian probability
Kahneman and Tversky (2003)
̶ Farmers to librarians (20:1)
$$P(H \mid E) = \frac{P(H) \cdot P(E \mid H)}{P(E)}$$
Population: 10 librarians to 200 farmers (10:200 = 1:20)
From this sample you expect 4 librarians and 20 farmers to fit the description.
$$P(\text{Librarian} \mid \text{Description}) = \frac{4}{4 + 20} = 16.7\%$$

139 Probability MaSiSS Bayesian probability
Kahneman and Tversky (example 2)
̶ Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Is Linda more likely:
A) a bank teller.
B) a bank teller who is active in the feminist movement.
85% chose B, even though B is a subset of A.
[Figure: nested sets; "Active feminist bank teller" lies inside "Bank teller"]

140 Probability MaSiSS Bayesian probability
The same assignment (Linda)
̶ 100 people fit this description. How many are:
A) Bank tellers? ___ of 100
B) Bank tellers who are active in the feminist movement? ___ of 100
100% of people assign a higher number to the first option than to the second.
Results: people understand "40 out of 100" best, "40%" less well, "0.4" less well again, and an abstract reference to the idea (Steve, Linda) much less well.

141 Probability MaSiSS Bayesian probability
Example:
̶ At the campus you meet a guy named Tom. After a few minutes you notice that Tom is shy.
̶ Is Tom more likely to be in an IT program?
̶ Is Tom more likely to be in an MNG program?
̶ Shyness is more common in IT programs
… but how many IT students are there relative to MNG students?

142 Probability MaSiSS Bayesian probability
Example: At the campus you meet a guy named Tom. After a few minutes you notice that Tom is shy.
̶ Is Tom more likely to be in an IT program (Bc.)?
̶ Is Tom more likely to be in an MNG program (Bc.)?
IT : MNG
Prior odds ratio: 1 : 10
Shy: 75% (IT), 15% (MNG)
Likelihood odds ratio: 75 : 15
Posterior odds ratio: 75 : 150 = 1 : 2
[Figure: 1 IT student next to 10 MNG students]

143 Probability MaSiSS Bayesian probability
Example:
̶ The stove repairman looks suspiciously around the various rooms in the apartment.
̶ Is the repairman more likely to be a robber?
Robber : Honest repairman
Prior odds ratio: 1 : 100
Snooping: 80% (robber), 10% (honest repairman)
Likelihood odds ratio: 80 : 10
Posterior odds ratio: 8 : 100 (8.0% as odds; as a probability, 8/108 ≈ 7.4%)
[Figure: 1 robber next to 100 honest repairmen]

144 Probability MaSiSS Bayesian probability
How jealous is a colleague who complains?
Do I have more energy because of the diet?
̶ Compare the world in which the diet doesn't work to the world in which it does.
Somebody evaluates your work and thinks it's great.
̶ How common is it for him to say that bad ideas are great?
̶ How much evidence is his approval that my work is great?
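The odds-ratio updates in the Tom and repairman examples above can be checked with a short Python sketch. It is an illustration only: the helper names `posterior_odds` and `odds_to_probability` are assumptions and do not come from the slides, while the prior odds and likelihood percentages are the ones given in the examples. Exact fractions are used so the ratios stay readable.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds a : b (passed as the fraction a/b) to the probability a / (a + b)."""
    return odds / (1 + odds)


# Tom (IT : MNG): prior 1 : 10, shy in 75% of IT vs 15% of MNG students.
tom_odds = posterior_odds(Fraction(1, 10), Fraction(75, 15))
tom_prob_it = odds_to_probability(tom_odds)

# Repairman (robber : honest): prior 1 : 100, snooping in 80% vs 10% of cases.
robber_odds = posterior_odds(Fraction(1, 100), Fraction(80, 10))
robber_prob = odds_to_probability(robber_odds)

print("Tom, IT : MNG odds =", tom_odds, "-> P(IT) =", tom_prob_it)
print("Robber : honest odds =", robber_odds, "-> P(robber) =", float(robber_prob))
```

Posterior odds of 1 : 2 for Tom correspond to a probability of 1/3 that he studies IT. For the repairman, odds of 8 : 100 correspond to a probability of 8/108, about 7.4%, which is slightly lower than the 8.0% obtained by reading the odds directly as a fraction.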