The NYPD dataset will then be combined with the BJS statistics to reduce the risk of bias from relying on a single data source. The BJS dataset includes statistics on the characteristics of crimes and victims and on the consequences of victimization. The BJS statistics are drawn from the National Crime Victimization Survey (hereafter NCVS), which collects information on nonfatal crimes, both reported and not reported to the police, against persons age 12 or older from a nationally representative sample of U.S. households.
The median household income data will be acquired from the NYC census tract dataset at the census tract level. Due to time limitations, this project will only investigate incidents that occurred in 2015.
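The data preparation described above could be sketched roughly as follows: keep only incidents from the study year, count them per census tract, and attach each tract's median household income. This is a minimal sketch, not the final pipeline. The record layout, the tract IDs, the income figures, and the helper names `count_by_tract` and `join_income` are all hypothetical placeholders, since the actual column names of the NYPD and census files are not specified here.

```python
from collections import Counter
from datetime import date

# Hypothetical incident records: (census tract ID, incident date).
# These values are made up for illustration only.
incidents = [
    ("36061000100", date(2015, 3, 14)),
    ("36061000100", date(2015, 7, 2)),
    ("36047000200", date(2014, 12, 30)),  # outside the study year
    ("36047000200", date(2015, 1, 5)),
]

# Hypothetical median household income (USD) per census tract.
median_income = {
    "36061000100": 85000,
    "36047000200": 42000,
    "36081000300": 61000,  # tract with no recorded incidents
}


def count_by_tract(records, year):
    """Count incidents per tract, keeping only those in the given year."""
    return Counter(tract for tract, day in records if day.year == year)


def join_income(counts, income):
    """Build one row per tract that has income data.

    Tracts without any incident get a count of 0 instead of being dropped,
    so that low-incidence areas stay in the analysis.
    """
    return [
        {"tract": tract, "income": value, "incidents": counts.get(tract, 0)}
        for tract, value in sorted(income.items())
    ]


counts_2015 = count_by_tract(incidents, 2015)
table = join_income(counts_2015, median_income)

for row in table:
    print(row)
```

Keeping tracts with zero incidents is a deliberate choice in this sketch: dropping them would leave only tracts where at least one incident was recorded, which could distort any later comparison across income levels.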
To test whether rape is more or less likely to occur in lower-income areas, I will fit a linear regression model of rape incidents on median household income by census tract. Additionally, I will generate a plot to visualize the distribution of median household income alongside the locations of rape incidents.
1. Map (most likely a heat map) of median household income by census tract and rape occurrence within the corresponding zip codes
2. Linear regression model table
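As a rough illustration of the regression step, the sketch below fits a one-predictor ordinary least squares line of incident counts on median household income. It assumes one row per census tract; the function name `fit_simple_ols` and the sample numbers are hypothetical and carry no empirical meaning. In practice a statistics package would likely be used to produce the full model table (standard errors, p-values), which this sketch does not compute.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    x: predictor values (here, median household income per tract)
    y: response values (here, incident count per tract)
    Returns a dict with the slope, intercept, R-squared, and sample size.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / syy if syy > 0 else float("nan")

    return {"slope": slope, "intercept": intercept,
            "r_squared": r_squared, "n": n}


# Made-up example values: income in thousands of USD, incidents per tract.
income_k = [32, 41, 48, 57, 66, 83]
incident_counts = [5, 3, 4, 2, 3, 1]

result = fit_simple_ols(income_k, incident_counts)
print(result)
```

A negative slope in the real data would be consistent with more incidents in lower-income tracts, and a positive slope with the opposite. The sign alone does not establish significance, which is why the model table in item 2 would also need standard errors and p-values.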