# Relationship between two variables and regression analysis

3 [8pts] Data on annual wine consumption and heart disease death rate were collected in 1994 from selected countries by the University of California, San Diego. Use these data to answer the questions below. Use SPSS for your computations (see the SPSS practice document for this segment, listed under reading assignment and study material, as a guide).

(a) Identify the response variable and the independent variable in this study.

(b) Make a scatterplot of Average Annual Consumption (AAC) against Heart Disease Death Rate (HDDR). Describe the pattern in the plot. (Hint: What does the scatterplot tell you? Interpret it in simple language.)

(c) Does the scatterplot display any outliers? Explain.

(d) Compute the correlation coefficient and draw conclusions, in simple language, about the strength of the association between the two variables.
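
As a cross-check on the SPSS output for part (d), the Pearson correlation coefficient can also be computed by hand. The sketch below is a minimal illustration in Python, not part of the required SPSS workflow. The function name `pearson_r` and the five data pairs are made up for illustration and are not the 1994 data set; substitute the actual AAC and HDDR columns.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only, NOT the real data set
aac = [1.0, 2.0, 3.0, 4.0, 5.0]
hddr = [10.0, 8.0, 7.0, 4.0, 1.0]

r = pearson_r(aac, hddr)
print(f"r = {r:.3f}")
```

A value of r close to -1 or +1 indicates a strong linear association, and the sign gives its direction. A value near 0 indicates little or no linear association.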

4 [16pts]

Continue the analysis of the data in part 3 above.

(a) Find the regression line of HDDR on AAC. Write down the regression line to predict HDDR from AAC. (Remember to save the standardized residuals; you will need them later to check the validity of your analysis.)

(b) What is the slope of the fitted regression line? Explain in simple language what the slope value indicates about the relationship between HDDR and AAC.

(c) What is the y-intercept of this line? Express in simple language what the y-intercept says about the relationship of HDDR and AAC.

(d) What is the coefficient of determination R² for the regression line? Explain in simple language what the R² value says about how well the regression line predicts HDDR.

(e) Use SPSS to create the residual plot, defined as the scatterplot of the standardized residuals versus AAC. Describe (interpret) the pattern displayed by the residuals. Are the points randomly scattered around the zero line? If not, do they show deviations from linearity?
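
The quantities asked for in parts (a) through (e) can be sketched in a few lines of Python as a sanity check on the SPSS output. This is a minimal illustration, not the required method. The helper names (`fit_line`, `r_squared`, `standardized_residuals`) and the data values are hypothetical. The standardized residual is assumed here to be the raw residual divided by the standard error of the estimate, sqrt(SSE / (n - 2)); check the SPSS documentation for the exact definition it uses.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_y = sum(y) / len(y)
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - sse / sst


def standardized_residuals(x, y, intercept, slope):
    """Residuals divided by sqrt(SSE / (n - 2)) (assumed definition)."""
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    std_error = math.sqrt(sse / (len(x) - 2))
    return [e / std_error for e in residuals]


# Hypothetical values for illustration only, NOT the real data set
aac = [1.0, 2.0, 3.0, 4.0, 5.0]
hddr = [10.0, 8.0, 7.0, 4.0, 1.0]

intercept, slope = fit_line(aac, hddr)
r2 = r_squared(aac, hddr, intercept, slope)
z = standardized_residuals(aac, hddr, intercept, slope)

print(f"Predicted HDDR = {intercept:.2f} + ({slope:.2f}) * AAC")
print(f"R^2 = {r2:.3f}")
for a, zi in zip(aac, z):
    print(f"AAC = {a:.1f}, standardized residual = {zi:.2f}")
```

The slope is the predicted change in HDDR for a one-unit increase in AAC, and the intercept is the predicted HDDR when AAC is zero. Plotting `z` against `aac` gives the residual plot of part (e): a patternless band around zero supports the linear model, while a curve or funnel shape suggests a problem with it.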