Relations Between Two Variables
Ľubica and Ján Krauskovi, Dominik Heger
Masaryk University
hegerd@chemi.muni.cz
STDT06 Rel Betw 2 Variables

Multivariate Data
[Figure: maternal ages (years) and birth weights of their babies (ounces)]

Bivariate Data: Scatter Diagram
[Figure: scatter diagram; horizontal axis: maternal age (years)]

Association
There is an association if, within a slice of X, the scatter of Y is smaller than the overall SD of Y (SDy).
Linear association: roughly, the scatter diagram is clustered around a straight line.
Positive association: above-average values of one variable tend to go with above-average values of the other; the scatterplot slopes up.
Negative association: above-average values of one variable tend to go with below-average values of the other; the scatterplot slopes down.

Describing Scatterplots
The point of averages in a scatterplot is the point with coordinates [mean of X, mean of Y] = [X̄, Ȳ]. The point of averages is a measure of the "center" of a scatterplot, quite analogous to the mean as a measure of the center of a list.
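The point of averages and the slice-based test for association can be sketched in a few lines of Python. This is only an illustration: the data values are hypothetical (they are not taken from the slides), and the helper names `point_of_averages` and `slice_sd` are made up for this sketch.

```python
from statistics import mean, pstdev

# Hypothetical illustration data (NOT from the slides):
# maternal age in years and birth weight in ounces.
ages = [20, 22, 25, 27, 30, 32, 35, 38]
weights = [105, 110, 118, 115, 124, 130, 128, 140]

def point_of_averages(xs, ys):
    """Return (mean of X, mean of Y), the 'center' of the scatterplot."""
    return (mean(xs), mean(ys))

def slice_sd(xs, ys, lo, hi):
    """SD of the Y values whose X lies in the slice lo <= x < hi."""
    in_slice = [y for x, y in zip(xs, ys) if lo <= x < hi]
    return pstdev(in_slice)

center = point_of_averages(ages, weights)
sd_y = pstdev(weights)                       # overall scatter of Y (SDy)
sd_slice = slice_sd(ages, weights, 20, 28)   # scatter of Y inside one slice of X

print("point of averages:", center)
print("SDy:", round(sd_y, 2), "SD of Y in slice:", round(sd_slice, 2))
print("association suggested:", sd_slice < sd_y)
```

In this made-up data set the scatter of Y inside the slice is smaller than SDy, which is the situation the slides describe as an association. The slice boundaries (20 to 28 years) are an arbitrary choice for the sketch.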
• Linearity and Nonlinearity
• Homoscedasticity and Heteroscedasticity
• Outliers

If a scatterplot shows linear association (or no association), homoscedasticity, and no outliers, it is said to be football-shaped (bivariate normal). Look at the scatter diagram and check whether there is an association, whether it is linear, and whether there are outliers.

Association
There is an association if, within a slice of X, the scatter of Y is smaller than the overall SD of Y (SDy).
Positive association: individuals with larger-than-average values of one variable tend to have larger-than-average values of the other, and individuals with smaller-than-average values of X tend to have smaller-than-average values of Y.
Visually examine whether there is an association. If yes, is it linear? If yes, talk about correlation.
Correlation ⊂ association (correlation is a subset of, is included in, association).
Association ⊃ correlation (association is a superset of, includes, correlation).

Post Hoc Ergo Propter Hoc Fallacy
"After this, therefore because of this."
Association between two variables is often, and erroneously, used as evidence that there is a causal relationship between them.
Fallacy (not true): if two things are associated, there is some causal relationship between them; one causes the other.
• Jeníček, Pepíček, Mařenka.
• Reading ability and shoe size have a positive association.
• Money spent on healthcare and life expectancy have a negative association.
• Waxing a car and its maximum speed have a positive association.
• The Na+/K+ ratio is the same as the size of the moons of ...
The variables are related in some way, but that does not mean that one causes the other.
Association is not causation!
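One way to see how two variables can be associated without either causing the other is a small simulation with a third, lurking variable. The sketch below assumes that a child's age drives both shoe size and a reading score; this explanation, the coefficients, and the noise levels are all assumptions made for illustration and are not stated in the slides.

```python
import random
from statistics import mean, pstdev

def correlation(xs, ys):
    """Correlation coefficient computed from population SDs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))

rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical model: age (in years) is the lurking variable.
ages = [rng.randint(6, 12) for _ in range(2000)]
shoe = [2.0 * a + rng.gauss(0, 1) for a in ages]      # depends on age only
reading = [10.0 * a + rng.gauss(0, 5) for a in ages]  # depends on age only

# Across all children, shoe size and reading score are strongly associated.
r_all = correlation(shoe, reading)

# Among children of the same age, only independent noise remains.
nine = [(s, r) for a, s, r in zip(ages, shoe, reading) if a == 9]
r_age9 = correlation([s for s, _ in nine], [r for _, r in nine])

print("all ages:   r =", round(r_all, 2))
print("age 9 only: r =", round(r_age9, 2))
```

In this model neither variable influences the other, yet the overall correlation is strong; holding the assumed lurking variable fixed makes the association largely disappear.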