The data were further processed by determining the latitude and longitude of the poll site nearest to the centroid of each election district. This step is computationally expensive and takes a long time to run. The data were then separated into "has nycha" and "no nycha" dataframes for statistical analysis.
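As a rough illustration of the two steps described here, the sketch below finds the nearest poll site to each district centroid by brute force and then splits the rows on a NYCHA flag. It is not the project's actual code: the field names (`centroid_lat`, `centroid_lon`, `has_nycha`, `lat`, `lon`), the use of haversine distance, and the use of plain lists of dicts in place of dataframes are all assumptions made to keep the example self-contained.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_poll_site(centroid, poll_sites):
    """Return the poll site closest to the centroid (brute-force scan)."""
    lat, lon = centroid
    return min(poll_sites,
               key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))


def attach_nearest_sites(districts, poll_sites):
    """Add the nearest site's coordinates to every district row."""
    rows = []
    for d in districts:
        site = nearest_poll_site((d["centroid_lat"], d["centroid_lon"]),
                                 poll_sites)
        rows.append({**d, "site_lat": site["lat"], "site_lon": site["lon"]})
    return rows


def split_by_nycha(rows):
    """Split rows into (has NYCHA, no NYCHA) groups on a boolean flag."""
    has_nycha = [r for r in rows if r["has_nycha"]]
    no_nycha = [r for r in rows if not r["has_nycha"]]
    return has_nycha, no_nycha


# Hypothetical example data; the coordinates are made up for illustration.
districts = [
    {"ed": "A", "centroid_lat": 40.70, "centroid_lon": -74.00, "has_nycha": True},
    {"ed": "B", "centroid_lat": 40.80, "centroid_lon": -73.95, "has_nycha": False},
]
poll_sites = [
    {"name": "P1", "lat": 40.71, "lon": -74.01},
    {"name": "P2", "lat": 40.79, "lon": -73.94},
]

has_nycha, no_nycha = split_by_nycha(attach_nearest_sites(districts, poll_sites))
```

The brute-force scan compares every district with every poll site, so its cost grows with the product of the two counts, which is one plausible reason the step is slow. A spatial index would likely reduce that cost, though whether it is worth the added complexity depends on the size of the data.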