what analytical tools were used, why were they the
appropriate tool? Give references here to the use of these tools in
similar contexts and the strengths and weaknesses of the methods. What
methods could not be used because the data did not support them, but
would have been able to answer the question? (You probably want to
include plots and tables here too).
Conclusions:
The correlation analysis showed a strong positive correlation between the ratio of unbanked households and unemployment. Similarly, there is a strong negative correlation between the ratio of unbanked households and median income. These findings are in line with the Urban Institute's results. The UI report identifies the Bronx as the borough with the highest unbanked population, the highest poverty and unemployment levels, and the lowest median income. 4
Surprisingly, the ratio of unbanked households is only weakly correlated with both the availability of banks and the percentage of immigrants in a neighborhood. When comparing boroughs, the Bronx has one of the lowest numbers of banks and a relatively high percentage of immigrants, so I expected these two variables to be strongly correlated with the ratio of unbanked households.
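The correlation step described above can be sketched as follows. This is only a minimal illustration, not the analysis actually run for this report: the data is synthetic, and the variable names (unemployment, median_income, banks, unbanked), the sample size, and the generating formulas are all assumptions chosen so that the signs of the correlations resemble the findings reported here.

```python
import numpy as np

# Synthetic, illustrative data only (NOT the data used in this report).
rng = np.random.default_rng(0)
n = 200  # hypothetical number of neighborhoods

unemployment = rng.uniform(3, 20, n)                        # percent
median_income = 90_000 - 3_000 * unemployment + rng.normal(0, 5_000, n)
banks = rng.integers(1, 40, n).astype(float)                # bank count, unrelated by construction
unbanked = 0.8 * unemployment + rng.normal(0, 2, n)         # percent of households


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


correlations = {
    "unemployment": pearson_r(unbanked, unemployment),
    "median_income": pearson_r(unbanked, median_income),
    "banks": pearson_r(unbanked, banks),
}

for name, r in correlations.items():
    print(f"unbanked vs {name}: r = {r:+.2f}")
```

With real data, the same function would be applied to each pair of neighborhood-level columns. Note that a weak Pearson coefficient only rules out a strong linear relationship; it does not rule out a nonlinear one.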
The regression model showed that the effect of the unemployment rate is statistically insignificant. At the 95% confidence level, and controlling for the number of banks and the ratio of foreign-born residents, the model concluded the following:
- The average ratio of unbanked households in a neighborhood increases by 0.85 per 1% increase in the unemployment ratio.
- The average ratio of unbanked households in a neighborhood increases by 0.53 per 1% increase in the poverty ratio.
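To make the interpretation of these coefficients concrete, the sketch below fits an ordinary least squares model with controls to synthetic data. It is not the model fitted for this report: the data is generated so that the true coefficients are 0.85 and 0.53, the column names, intercept, noise level, and percent units are assumptions, and the t-statistics are compared against the usual approximate 1.96 cutoff rather than an exact t-distribution quantile.

```python
import numpy as np

# Synthetic, illustrative data only (NOT the data used in this report).
rng = np.random.default_rng(1)
n = 300  # hypothetical number of neighborhoods

unemployment = rng.uniform(3, 20, n)          # percent
poverty = rng.uniform(5, 40, n)               # percent
banks = rng.integers(1, 40, n).astype(float)  # control: number of banks
foreign_born = rng.uniform(10, 60, n)         # control: percent foreign born

# Assumed data-generating process; the controls have zero true effect.
unbanked = 1.0 + 0.85 * unemployment + 0.53 * poverty + rng.normal(0, 2, n)

# Design matrix with an intercept column.
names = ["intercept", "unemployment", "poverty", "banks", "foreign_born"]
X = np.column_stack([np.ones(n), unemployment, poverty, banks, foreign_born])

# Ordinary least squares fit.
beta, *_ = np.linalg.lstsq(X, unbanked, rcond=None)

# Classical standard errors: sigma^2 * (X'X)^-1.
residuals = unbanked - X @ beta
dof = n - X.shape[1]
sigma2 = residuals @ residuals / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t_stats = beta / se

for name, b, s, t in zip(names, beta, se, t_stats):
    flag = "|t| > 1.96" if abs(t) > 1.96 else "|t| <= 1.96"
    print(f"{name:>13}: coef = {b:+.3f}, se = {s:.3f}, t = {t:+.2f} ({flag})")
```

Each coefficient is read as the expected change in the unbanked percentage for a one-unit increase in that variable, holding the other variables fixed. Significance in this synthetic example says nothing about significance in the real data.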
Future work: what improvements to the analysis, or what data
would be needed to improve the result?