Answers for Module 1, Exercise 1:

Module 1, Exercise 1a:

The correlation (r) is .837, the slope of the line (b) is .700, and the intercept (a) is 2.200. These values are taken from the right panel in the applet for Regression Module 1, Exercise 1.
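As a check on the applet values, the coefficients can be recomputed by hand. The Python sketch below assumes the four cases are (X, Y) = (1, 2), (3, 5), (5, 7), and (7, 6). These data are not listed in this answer key; they are reconstructed from the deviations and predicted scores given in the later answers, so treat them as an assumption rather than as the exercise's official table.

```python
import math

# Assumed data, reconstructed from the answers in this key (not listed in it).
x = [1, 3, 5, 7]
y = [2, 5, 7, 6]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Sums of squares and cross-products of deviations from the means.
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b = sp_xy / ss_x                    # slope
a = mean_y - b * mean_x             # intercept
r = sp_xy / math.sqrt(ss_x * ss_y)  # correlation

print(f"r = {r:.3f}, b = {b:.3f}, a = {a:.3f}")
```

With the assumed data, the printed values agree with the applet to three decimal places.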

Module 1, Exercise 1b:

The predicted values are 4.3, 5.7, and 7.1. The calculation for the last value is 2.2 + (.7 × 7) = 2.2 + 4.9 = 7.1.
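The same calculation can be repeated for each case with a few lines of Python. This is only a sketch: the X values 3 and 5 are an assumption inferred from the predicted values 4.3 and 5.7, since only X = 7 appears in the worked calculation.

```python
# Regression coefficients reported in Exercise 1a.
intercept = 2.2
slope = 0.7

# X values for the three cases. Only X = 7 is stated in this key;
# 3 and 5 are assumed because they reproduce 4.3 and 5.7.
x_values = [3, 5, 7]

# Predicted score Y' = a + bX, rounded to one decimal place.
predicted = [round(intercept + slope * x, 1) for x in x_values]
print(predicted)
```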

Module 1, Exercise 1c:

SS Total = 14.  The squared deviations from the mean for the four cases are 9, 0, 4, and 1, respectively.
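The table column can be reproduced with a short Python sketch. The Y values used here (2, 5, 7, 6) are an assumption: Cases 1 to 3 follow from deviations and scores quoted elsewhere in this key, and Case 4 is taken as 6 because the deviations from the mean must sum to zero.

```python
# Assumed Y scores for Cases 1-4, reconstructed from this answer key.
y = [2, 5, 7, 6]

mean_y = sum(y) / len(y)

# Squared deviation of each Y score from the mean of Y.
squared_deviations = [(yi - mean_y) ** 2 for yi in y]

# SS Total is the sum of that column.
ss_total = sum(squared_deviations)

print(squared_deviations, ss_total)
```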

Module 1, Exercise 1d:

The largest deviation from the mean is -3, for Case 1.

Module 1, Exercise 1e:

The contribution of the second case to SS Total is zero, because the Y value of 5 is exactly equal to the mean.

Module 1, Exercise 1f:

SS Total = 14. You can find this as the sum of the last column in your table, and this value is also shown in the applet in the SS column in the Analysis of Variance section.

Module 1, Exercise 1g:

SS Total is the sum of the squared deviations of Y scores from the mean of Y. If SS Total were much smaller, then all of the Y values would have to be close to the mean. SS Total could be much larger for several reasons: many of the Y values could be somewhat farther from the mean; a few values, or even one value, could be very far from the mean; or we could simply have many more Y values. Note that a single Y value that differed from the mean by 10 points would contribute 100 to SS Total.

There is a close relationship between SS Total and variance. An estimate of the population variance taken from a sample is calculated as the sum of the squared deviations from the mean divided by the degrees of freedom, which is (SS Total) / (n-1) for a single sample. In our example, this is 14/3 = 4.667. The standard deviation is the square root of the variance, which is 2.16, the value shown in the applet as the std dev for the DV.
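The arithmetic in this paragraph can be checked with a minimal Python sketch that uses only the values stated above (SS Total = 14 and n = 4). The variable names are illustrative.

```python
import math

ss_total = 14.0  # sum of squared deviations from the mean of Y
n = 4            # number of cases

# Sample estimate of the population variance: SS Total / (n - 1).
variance = ss_total / (n - 1)

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(f"variance = {variance:.3f}, std dev = {std_dev:.2f}")
```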

Module 1, Exercise 1h:

For the second case, Y′ = 4.3, (Y – Y′) = (5 – 4.3) = .7, and (Y – Y′)² = .49.  For the third case, Y′ = 5.7, (Y – Y′) = (7 – 5.7) = 1.3, and (Y – Y′)² = 1.69.

Module 1, Exercise 1i:

The largest deviation is for Case __3__, and the size of the deviation is __1.3__.

The smallest deviation is for Case __2__, and the size of the deviation is __.7__.

Module 1, Exercise 1j:

The calculated value for the Sum of Squares Error = SS Error = 4.200.
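The error deviations from parts h through j can be reproduced in Python. This sketch assumes the four cases are (X, Y) = (1, 2), (3, 5), (5, 7), and (7, 6), which are reconstructed from the answers in this key and are not listed in it.

```python
# Regression coefficients reported in Exercise 1a.
intercept, slope = 2.2, 0.7

# Assumed data, reconstructed from the answers in this key.
x = [1, 3, 5, 7]
y = [2, 5, 7, 6]

# Predicted score, error deviation, and squared error for each case.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]
squared_errors = [e ** 2 for e in residuals]

# SS Error is the sum of the squared error deviations.
ss_error = sum(squared_errors)

for case, (p, e, s) in enumerate(zip(predicted, residuals, squared_errors), start=1):
    print(f"Case {case}: Y' = {p:.1f}, Y - Y' = {e:+.1f}, squared = {s:.2f}")
print(f"SS Error = {ss_error:.3f}")
```

Under these assumptions, the deviation that is largest in absolute size belongs to Case 3 and the smallest to Case 2, which matches part i.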

Module 1, Exercise 1k:

SS Error is the sum of the squared deviations of observed scores from the predicted scores. If SS Error is very small, every observed score is close to the predicted score, so the plot of every observed score is close to the regression line.

If SS Error is much smaller than SS Total, then the sum of squared deviations around the regression line is much smaller than the sum of squared deviations around the mean. Thus, the regression equation gives much more accurate predictions of scores than simply using the mean as the prediction for all scores. The plot would show a strong linear relationship between X and Y.

If SS Error is about the same size as SS Total, then the regression equation has not improved our prediction of Y scores. The regression line would be close to horizontal at the mean. The plot would not show any indication of a linear relationship between X and Y.

Module 1, Exercise 1L:

For the second case, the predicted score is 4.3, which is .7 below the mean of 5.0, so the squared deviation of the predicted score from the mean is .49. For the third case, the deviation is +.7, and for the fourth case the deviation is +2.1. The sum of the squared deviations is 9.80.

Module 1, Exercise 1m:

Yes, it appears that X is useful in predicting Y in our plot. The blue lines, which indicate predictive ability, are substantial. They are relatively long, compared to the red lines we observed for error deviations, and the blue squares are relatively large compared to the red squares. Thus, it appears that the SS Predicted is substantial.

Module 1, Exercise 1n:

The Sum of Squares Predicted from the Analysis of Variance table in the applet is 9.800, which is also the sum of the last column in the table in part L.
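The column sum can be verified with a short Python sketch. The X values (1, 3, 5, 7) are an assumption inferred from the predicted scores in this key. The mean of Y (5.0) and the coefficients are taken from the answers above.

```python
# Regression coefficients reported in Exercise 1a and the mean of Y.
intercept, slope = 2.2, 0.7
mean_y = 5.0

# Assumed X values for Cases 1-4, inferred from the predicted scores.
x = [1, 3, 5, 7]

# Deviation of each predicted score from the mean of Y.
predicted = [intercept + slope * xi for xi in x]
deviations = [p - mean_y for p in predicted]

# SS Predicted is the sum of the squared deviations.
ss_predicted = sum(d ** 2 for d in deviations)

print([round(d, 1) for d in deviations])
print(f"SS Predicted = {ss_predicted:.3f}")
```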

Module 1, Exercise 1o:

SS Predicted is the sum of the squared deviations of predicted scores from the mean. If the regression model is not at all useful, then the predicted score will be the mean for each case, and SS Predicted will be zero. If the regression model is only slightly helpful, then the predicted scores will be only slightly different from the mean, and SS Predicted will be small relative to SS Total. In that case, the plot would show virtually no linear relationship between X and Y, and the regression line would be close to the horizontal line for the mean of Y.

If there is a strong linear relationship in the data, SS Predicted is large relative to SS Error, and the observed data fall close to the regression line.

Module 1, Exercise 1p:

[SS Predicted / SS Total] = 9.800 / 14.000 = .700.

The applet reports r = .837 and  r squared = .700.

This sample data shows a strong linear relationship, as measured by r=.837. The plot shows this strong positive relationship, with larger values of X generally associated with larger values of Y. In this sample, 70% of the variance in Y can be explained by the linear relationship with X.
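The link between the three sums of squares and r can be checked with a minimal Python sketch that uses only values reported in this key. Taking the positive square root for r is a choice made here because the slope is positive.

```python
import math

# Sums of squares reported in the answers above.
ss_predicted = 9.8
ss_error = 4.2
ss_total = 14.0

# SS Total partitions into SS Predicted plus SS Error.
partition_holds = math.isclose(ss_predicted + ss_error, ss_total)

# Proportion of variance in Y explained by the linear relationship with X.
r_squared = ss_predicted / ss_total

# r has the same sign as the slope, which is positive here.
r = math.sqrt(r_squared)

print(partition_holds, f"r squared = {r_squared:.3f}, r = {r:.3f}")
```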

We should note that this is an extremely small sample, and that we would not be able to generalize to the relationship in a population of X and Y values, even if these four cases are a random sample from that population.
