# Relationship between two variables and regression analysis

3 [8pts] Data on annual wine consumption and heart disease death rate were collected in 1994 from selected countries by the University of California, San Diego. Use these data to answer the questions below. Use SPSS for your computations (see the SPSS practice document for this segment, under reading assignment and study material, as a guide).

(a) Identify the response variable and the independent variable in this study.

(b) Make a scatterplot of Average Annual Consumption (AAC) against Heart Disease Death Rate (HDDR). Describe the pattern in the plot. (Hint: What does the scatterplot tell you? Interpret it in simple language.)

(c) Does the scatterplot display any outliers? Explain.

(d) Compute the correlation coefficient and draw conclusions (in simple language) about the strength of the association between the two variables.
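
Note: The correlation coefficient in (d) can be cross-checked outside SPSS. The following is a minimal sketch in Python, not part of the required SPSS workflow. The function name `pearson_r` and the two short lists are illustrative assumptions; the numbers are made-up placeholders and are NOT the 1994 wine data.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical placeholder values, NOT the actual data set
aac = [1.0, 2.0, 3.0, 4.0, 5.0]             # liters per person per year
hddr = [250.0, 230.0, 200.0, 190.0, 150.0]  # deaths per 100,000

r = pearson_r(aac, hddr)
print(f"r = {r:.3f}")
```

A value of r close to -1 or +1 indicates a strong linear association, and the sign gives its direction. With the real data, the result should match the Pearson correlation reported by SPSS.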

4 [16pts]

Continue the analysis of the data in part 3 above.

(a) Find the regression line of HDDR on AAC. Write down the regression line to predict HDDR from AAC. (Remember to save the Standardized Residuals; you’ll need them later to check the validity of your analysis.)

(b) What is the slope of the fitted regression line? Explain in simple language what the slope value indicates about the relationship between HDDR and AAC.

(c) What is the y-intercept of this line? Express in simple language what the y-intercept says about the relationship of HDDR and AAC.

(d) What is the coefficient of determination R² for the regression line? Explain in simple language what the R² value says about the regression line to predict HDDR.

(e) Use SPSS to create the residual plot defined as the scatterplot of the standardized residuals versus AAC. Describe (interpret) the pattern displayed by the residuals. Are the points randomly scattered around the zero line? If not, do they show deviations from linearity?

(f) Are there observations with large residuals? Explain your answer.

(g) You want to predict HDDR given that the AAC is 3.5 liters. Use the regression line to compute the predicted HDDR.

(h) Suppose you want to predict HDDR given that the AAC is 10 liters. Explain whether you can still use the regression line to compute the prediction.

(i) From your analysis, what did you find out? Do you think wine consumption helps reduce heart disease? Explain. According to your R², what percentage of the variation in HDDR is due to other factors? What factors do you suggest could contribute to a healthy heart?
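
Note: The quantities asked for in question 4 (slope, y-intercept, R², standardized residuals, and predictions) can be cross-checked with a short script. The following is a minimal sketch in Python under stated assumptions, not the required SPSS procedure. All function names are illustrative, the data are made-up placeholders (NOT the 1994 wine data), and standardized residuals are taken here as each residual divided by the residual standard error, which should be compared against the definition SPSS uses.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed y minus fitted y for each observation."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def r_squared(x, y, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    sse = sum(e ** 2 for e in residuals(x, y, intercept, slope))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - sse / sst


def standardized_residuals(x, y, intercept, slope):
    """Residuals divided by the residual standard error (assumed definition)."""
    res = residuals(x, y, intercept, slope)
    s = math.sqrt(sum(e ** 2 for e in res) / (len(x) - 2))
    return [e / s for e in res]


def predict(x_new, x, intercept, slope):
    """Return the prediction and whether x_new lies inside the observed x range."""
    in_range = min(x) <= x_new <= max(x)
    return intercept + slope * x_new, in_range


# Hypothetical placeholder values, NOT the actual data set
aac = [1.0, 2.0, 3.0, 4.0, 5.0]             # liters per person per year
hddr = [250.0, 230.0, 200.0, 190.0, 150.0]  # deaths per 100,000

intercept, slope = fit_line(aac, hddr)
r2 = r_squared(aac, hddr, intercept, slope)
z = standardized_residuals(aac, hddr, intercept, slope)

print(f"HDDR = {intercept:.2f} + ({slope:.2f}) * AAC")
print(f"R^2 = {r2:.3f}; share left to other factors = {1 - r2:.3f}")
print("standardized residuals:", [round(v, 2) for v in z])

for value in (3.5, 10.0):
    y_hat, ok = predict(value, aac, intercept, slope)
    note = "within the observed AAC range" if ok else "extrapolation, treat with caution"
    print(f"AAC = {value}: predicted HDDR = {y_hat:.1f} ({note})")
```

The range check in `predict` reflects the point of (h): a fitted line is only supported by the data over the AAC values actually observed, so a prediction outside that range is an extrapolation. The quantity 1 - R² corresponds to the "percentage due to other factors" asked about in (i).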