Module #2, Interactive Exercise #2

Distribute the data points so that you have a correlation of about +.30.

Move the points around until you have a correlation of about +.30. Note: this example works better if you move several of the points a little rather than moving one point to an extreme value on the distribution. After getting the correct correlation (or close to it), place a check mark in the box titled Show mean of Y. (Note: Y is the dependent variable.)

a. What does the distribution of scores look like now? How does it compare to the data in the previous problem (r = .90)?

b. Do the points deviate a lot from the mean on Y? (Check ‘Show SS total’ to see the deviations, and check ‘Show Error as Squares’ to see how much the deviation of each data point from the mean contributes to SS total.)

c. Record the numeric value for SS total here _____________________ (The SS values are shown in the applet.)

d. Now place a check mark in the box titled Show Regression Line. Do your points seem to deviate a lot from the regression line? (Remove the check from the ‘Show SS total’ box and check ‘Show SS error’ to see the deviations of the observed points from the regression line, shown in red.)

e. Record the SS error here ____________________

f. Compare the SS error to the result from the previous problem (r = .90). How do they compare? Compare the proportion of error variance in each case by computing SS error / SS total.

g. Now click the box marked Show SS predicted and remove the check from the box Show SS error. The blue lines that appear represent the difference between the mean and the predicted scores. How are these scores distributed? Do the predicted scores deviate a lot from the mean?

h. Record the correct numeric value for SS predicted here ____________________

i. How does the difference between predicted scores and the mean differ from the previous problem (r = .90)?  Focus on the proportion of predicted out of total (SSPredicted / SStotal) rather than the SS value itself.

j. Why is the distribution of predicted vs. mean scores so different between the situations where r = .90 versus r = .30?
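
If you would like to check the applet's numbers by hand, the quantities in parts c through i can be sketched in a few lines of Python. This is only an illustration, not the applet's own code: the function name sums_of_squares and the six data points are hypothetical, and the fit is assumed to be an ordinary least-squares line of Y on X. Under that assumption SS total splits into SS predicted plus SS error, and SS predicted / SS total equals r squared.

```python
def sums_of_squares(x, y):
    """Return (ss_total, ss_predicted, ss_error, r) for a least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)  # observed points vs. mean of Y

    # Least-squares regression line and the predicted scores
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * xi for xi in x]

    ss_predicted = sum((p - mean_y) ** 2 for p in predicted)      # predicted vs. mean of Y
    ss_error = sum((yi - p) ** 2 for yi, p in zip(y, predicted))  # observed vs. predicted

    r = sxy / (sxx * ss_total) ** 0.5  # Pearson correlation
    return ss_total, ss_predicted, ss_error, r


# Hypothetical data points, made up for illustration only
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 2, 5, 3]

ss_total, ss_predicted, ss_error, r = sums_of_squares(x, y)
print("r =", round(r, 3))
print("SS total =", round(ss_total, 3))
print("SS predicted / SS total =", round(ss_predicted / ss_total, 3))
print("SS error / SS total =", round(ss_error / ss_total, 3))
```

Entering your own points from the r = .30 and r = .90 problems lets you compare the two proportions side by side.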

Go to the Regression Applet