The goal of the project is to model and understand the socio-economic factors affecting cancer mortality. The data were aggregated from a number of sources, including the American Community Survey (http://census.gov), http://clinicaltrials.gov, and http://cancer.gov. The data dictionary is provided in the Appendix. We will attempt to predict cancer mortality (TARGET_deathRate) in counties across the nation and try to understand how different socio-economic factors might influence health and mortality. The data have been partitioned into two files: (1) CancerData.csv and (2) CancerHoldoutData.csv. Use CancerData.csv for model training, parameter tuning (if any), etc. CancerHoldoutData.csv should be used only to evaluate model performance. It should not be used in any way in the model development process.
We will use R or RStudio to do this assignment.
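The train/holdout split described above could be sketched in R roughly as follows. This is only an illustration, not a prescribed solution: the helper name holdout_rmse, the predictor names predictor_a and predictor_b, the choice of a plain linear model, and RMSE as the metric are all assumptions. The real predictor names should come from the data dictionary in the Appendix. The toy data at the end are made-up numbers whose only purpose is to show that the function runs.

```r
# Sketch: fit on the training data, then score the holdout data once.
# holdout_rmse is a hypothetical helper, not part of the assignment.
holdout_rmse <- function(train, holdout, predictors,
                         target = "TARGET_deathRate") {
  # Build the formula TARGET_deathRate ~ x1 + x2 + ... from column names.
  model_formula <- reformulate(predictors, response = target)

  # Fit on the training data only; the holdout data never enters the fit.
  fit <- lm(model_formula, data = train)

  # Predict on the holdout data and compute root mean squared error.
  predicted <- predict(fit, newdata = holdout)
  sqrt(mean((holdout[[target]] - predicted)^2, na.rm = TRUE))
}

# Intended use with the assignment files (predictor names are placeholders;
# replace them with columns listed in the data dictionary):
# train   <- read.csv("CancerData.csv")
# holdout <- read.csv("CancerHoldoutData.csv")
# holdout_rmse(train, holdout, predictors = c("predictor_a", "predictor_b"))

# Toy check on made-up numbers, only to show that the function runs.
set.seed(1)
toy <- data.frame(predictor_a = rnorm(60), predictor_b = rnorm(60))
toy$TARGET_deathRate <- 180 + 5 * toy$predictor_a -
  3 * toy$predictor_b + rnorm(60)
toy_rmse <- holdout_rmse(toy[1:40, ], toy[41:60, ],
                         c("predictor_a", "predictor_b"))
```

Keeping the fit and the evaluation inside one function that takes the two data frames separately makes it harder to let holdout rows leak into training or tuning by accident. Any parameter tuning would be done with CancerData.csv alone, for example by cross-validation within it.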