Some of the following were adapted from problems suggested by Kathleen Wong Nirei of Iolani School, Honolulu and Bill Harrington.
1. Circle the correct answer:
If a correlation coefficient is 0.80, then:
a. The explanatory variable is usually less than the response variable.
b. The explanatory variable is usually more than the response variable.
c. Below average values of the explanatory variable are more often associated with below average values of the response variable.
d. Below average values of the explanatory variable are more often associated with above average values of the response variable.
e. None of the above.
2. Circle the correct answer:
a. The closer a correlation coefficient is to -1
or 1, the more evidence there is of a causal relationship between the explanatory
variable and the response variable.
b. The closer a correlation coefficient is to 0,
the more evidence there is of a causal relationship between the explanatory
variable and the response variable.
c. The closer the value of r^2 is to -1 or 1, the
more evidence there is of a causal relationship between the explanatory
variable and the response variable.
d. The closer the value of r^2 is to 0, the more
evidence there is of a causal relationship between the explanatory variable
and the response variable.
e. None of the above.
3. One of the following statements is better than
the others. Circle that statement. VERY BRIEFLY explain why you did not
choose each of the other statements:
When comparing the size of the residuals from two different models for the same data:
a. Use the range of each set of residuals as a basis for comparison.
b. Use the mean of each set of residuals as a basis for comparison.
c. Use the sum of each set of residuals as a basis for comparison.
d. Use the standard deviation of each set of
residuals as a basis for comparison.
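The four summaries in problem 3 can be computed side by side, which may help when checking your reasoning. The following is a minimal sketch, assuming NumPy is available; the data values and the choice of a line versus a quadratic as the two models are invented for illustration and do not come from any problem on this sheet.

```python
import numpy as np

# Hypothetical data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 9.2, 15.8, 25.1, 35.9])

def residuals(degree):
    """Residuals (observed minus fitted) of a least squares polynomial fit."""
    coefficients = np.polyfit(x, y, degree)
    return y - np.polyval(coefficients, x)

def summarize(res):
    """Range, mean, sum, and sample standard deviation of a set of residuals."""
    return {
        "range": res.max() - res.min(),
        "mean": res.mean(),
        "sum": res.sum(),
        "std": res.std(ddof=1),
    }

linear_summary = summarize(residuals(1))
quadratic_summary = summarize(residuals(2))

for name, summary in [("linear", linear_summary), ("quadratic", quadratic_summary)]:
    print(name, {key: round(float(value), 4) for key, value in summary.items()})
```

Compare which of the four summaries actually differ between the two models and which come out the same, up to rounding, for any least squares fit that includes an intercept.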
4. Below is a plot of the 1986 profits versus sales
(each in tens of thousands of dollars) of 12 large US companies, the results
of a least squares regression performed on a TI-83, and some other summary
data. Note that some of the data with lower Sales values overlap on the
graph.

a. Demonstrating your knowledge of the definition of r^2, explain what the value of r^2 means in the context of this problem.
b. Annotate, i.e. fully add labels and lines, at any one point on the plot to help a reader understand what r^2 measures.
c. The teacher who supplied this data set suggested that even though
r^2 is close to one there is reason to doubt some of the interpolative predictive
value of this model. He came to this conclusion with no further computation
or residual analysis. Explain his reasoning.
5. Note: The data for this problem is stored in a program named AIDS which is available from Mr. Coons. Do NOT enter this data by hand.
Consider the following data on the number of AIDS cases reported in the US by state health departments between 1982 and 1986:
Year               1982    1983    1984    1985    1986
Number of Cases     434   1,416   3,196   6,242  10,620
a. Using year as the independent variable, state the value of and interpret
the slope of the least squares regression line in the context of this data.
b. State the value of and interpret the y-intercept of the regression
line in the context of this data.
c. Use the least squares regression line to predict the number of AIDS
cases in the year 2000.
d. Assuming this data was an adequate and representative sample, how
confident are you in the prediction you made in part c? Your answer must
include conclusions from a residual analysis. Include a rough residual plot.
e. State the equation of a quadratic model and compare it fully to your previous model. Include a rough plot(s).
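For readers without the calculator program, the fits in problem 5 can be reproduced with a short script. This is only a sketch, assuming NumPy is available. The data are copied from the table above; centering the years on 1984 is a choice made here to keep the quadratic fit numerically well behaved, and all variable names are illustrative. Interpreting the numbers is still left to you.

```python
import numpy as np

# Data copied from the table in problem 5.
years = np.array([1982, 1983, 1984, 1985, 1986])
cases = np.array([434, 1416, 3196, 6242, 10620])

# Assumption: work with years centered on 1984 (x = -2, ..., 2).
x = years - 1984

# Least squares line in the centered variable: cases ~ intercept + slope * x.
slope, intercept = np.polyfit(x, cases, 1)
linear_residuals = cases - (intercept + slope * x)

# The same line written against the raw year has this y-intercept (year = 0).
intercept_year0 = intercept - slope * 1984

# Prediction for 2000, which lies 14 years beyond the last observation.
prediction_2000 = intercept + slope * (2000 - 1984)

# Least squares quadratic in the centered variable.
c2, c1, c0 = np.polyfit(x, cases, 2)
quadratic_residuals = cases - (c0 + c1 * x + c2 * x**2)

print("slope:", slope)
print("y-intercept (year 0):", intercept_year0)
print("prediction for 2000:", prediction_2000)
print("linear residuals:", np.round(linear_residuals, 1))
print("quadratic coefficients (c2, c1, c0):", c2, c1, c0)
print("quadratic residuals:", np.round(quadratic_residuals, 1))
```

Plotting each set of residuals against year gives the rough residual plots that parts d and e ask for.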
b) Create a numerical example of Simpson's Paradox. Briefly point out
how your example demonstrates this deceiving situation.
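One way to check a candidate example is to compare the success rates within each subgroup against the pooled rates. The sketch below uses counts invented purely for illustration (two hypothetical treatments, A and B, in two hypothetical subgroups); substitute your own numbers to test them.

```python
# Invented counts, (successes, trials), for illustration only.
counts = {
    "mild":   {"A": (8, 10),   "B": (70, 100)},
    "severe": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, trials):
    """Proportion of successes."""
    return successes / trials

def pooled(counts, arm):
    """Total (successes, trials) for one arm, summed over all subgroups."""
    successes = sum(group[arm][0] for group in counts.values())
    trials = sum(group[arm][1] for group in counts.values())
    return successes, trials

# A has the higher rate inside every subgroup ...
a_wins_each_group = all(
    rate(*group["A"]) > rate(*group["B"]) for group in counts.values()
)
# ... yet B has the higher rate once the subgroups are combined.
b_wins_overall = rate(*pooled(counts, "B")) > rate(*pooled(counts, "A"))

for name, group in counts.items():
    print(name, "A:", rate(*group["A"]), "B:", rate(*group["B"]))
print("pooled A:", pooled(counts, "A"), "pooled B:", pooled(counts, "B"))
print("reversal present:", a_wins_each_group and b_wins_overall)
```

The reversal in these invented counts arises because each treatment was applied mostly to a different subgroup, so the pooled totals are dominated by unlike cases.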