The OLS regression indicates that distance to a poll site is not a good predictor of voter turnout in NYC, as seen in Figure 6. Scatter plots of the data on both linear and log scales likewise show no apparent relationship (Figures 7 and 8). One possible reason for this result is that dropping NaNs from the poll site data removed key polling sites from the analysis. However, this is unlikely. A more likely explanation is scale: the effect of polling distance on voter turnout is usually measured at the mile level (Dyck & Gimpel, 2005), whereas the average distance between poll sites in NYC is 0.3 km. In fact, the furthest distance from an ED centroid to a poll site is 2.2 km. These distances are small compared with other regions of the US, where some voters must travel miles to vote. This model can be expanded to other regions, for example to the statewide level, which includes both densely populated urban areas and sparsely populated rural areas. Other improvements include less computationally demanding code to perform these tests. In particular, finding the nearest poll site for each ED centroid and calculating the distance between them both need to be optimized.
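One way the nearest-poll-site search might be made cheaper is to place the poll sites in a uniform grid index, so that each ED centroid is compared only against sites in nearby cells rather than against every site. The following is a minimal sketch of that idea, not the code used in this analysis. It assumes coordinates are projected to local kilometres with an equirectangular approximation; all function names, the cell size, and the sample coordinates are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0


def to_local_km(lat, lon, ref_lat):
    """Equirectangular projection to kilometres (an approximation that is
    reasonable at city scale; ref_lat is a latitude near the study area)."""
    x = math.radians(lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_KM
    y = math.radians(lat) * EARTH_RADIUS_KM
    return x, y


def build_grid(points, cell_km):
    """Bucket point indices by the grid cell that contains each point."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / cell_km), math.floor(y / cell_km))].append(i)
    return grid


def ring_cells(cx, cy, r):
    """Yield the cells whose Chebyshev distance from cell (cx, cy) is r."""
    if r == 0:
        yield (cx, cy)
        return
    for d in range(-r, r + 1):       # bottom and top rows, corners included
        yield (cx + d, cy - r)
        yield (cx + d, cy + r)
    for d in range(-r + 1, r):       # left and right columns, corners excluded
        yield (cx - r, cy + d)
        yield (cx + r, cy + d)


def nearest_site(query, points, grid, cell_km):
    """Return (index, distance_km) of the point nearest to query."""
    if not points:
        raise ValueError("points must not be empty")
    qx, qy = query
    cx, cy = math.floor(qx / cell_km), math.floor(qy / cell_km)
    best_i, best_d = -1, math.inf
    r = 0
    while True:
        for cell in ring_cells(cx, cy, r):
            for i in grid.get(cell, ()):
                d = math.hypot(points[i][0] - qx, points[i][1] - qy)
                if d < best_d:
                    best_i, best_d = i, d
        # Every point outside the rings searched so far is more than
        # r * cell_km away, so the search can stop once best_d is within that.
        if best_d <= r * cell_km:
            return best_i, best_d
        r += 1


# Hypothetical coordinates for illustration only (not from the study data).
poll_sites = [(40.7128, -74.0060), (40.7306, -73.9866), (40.6782, -73.9442)]
ed_centroids = [(40.7200, -74.0000), (40.6800, -73.9500)]
REF_LAT = 40.7   # assumed reference latitude for the projection
CELL_KM = 0.5    # assumed cell size, roughly the typical spacing of sites

sites_xy = [to_local_km(lat, lon, REF_LAT) for lat, lon in poll_sites]
grid = build_grid(sites_xy, CELL_KM)
for lat, lon in ed_centroids:
    idx, dist_km = nearest_site(to_local_km(lat, lon, REF_LAT), sites_xy, grid, CELL_KM)
    print(f"nearest poll site index: {idx}, distance: {dist_km:.2f} km")
```

A cell size close to the typical spacing between poll sites keeps most searches to the first one or two rings. An existing spatial index, such as SciPy's cKDTree, could fill the same role with less custom code.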
References: