Fig. 4 suggests that there is no significant relationship between the independent and dependent variables. If the dataset were much larger, for example covering all neighborhoods in the United States, Random Forest (RF) would be a good choice. RF is an ensemble machine-learning method used for classification and regression (Breiman 2001). Its main advantage in this case is that the fitted model reports the importance of each feature, which makes the result easier to interpret.
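As an illustration of how feature importance can be read from a Random Forest, the following is a minimal sketch using scikit-learn's `RandomForestRegressor`. The data are synthetic and the feature names (`feature_a`, `feature_b`, `feature_c`) are placeholders, not the variables used in this study; the choice of scikit-learn is also an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data; the feature names are hypothetical,
# not the variables used in this study.
rng = np.random.default_rng(0)
feature_names = ["feature_a", "feature_b", "feature_c"]
X = rng.normal(size=(500, 3))

# The target depends strongly on feature_a, moderately on feature_b,
# and not at all on feature_c.
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances; scikit-learn normalizes them to sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

In this toy setup the feature that drives the target receives the largest share of the importance, which is the kind of ranking that would help interpret a model fitted on a larger dataset.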

Conclusion

Citizens' perceived safety of an urban street does not reflect whether the location is actually safe. There may be several reasons. First, there are too many crimes in New York City, especially in Manhattan: while people's feelings vary with location, crimes are nearly ubiquitous. Second, when people feel that a place is dangerous, they might not be thinking about violence. In Fig. 1, most highways and bridges are marked as most dangerous. One possible explanation is that the spaciousness of open places makes people anxious in a way that is not really related to crime.

Future

To find the relationship between perceived safety and dangerous events in the real world, I need more data to build a reliable model, at least data from a city where crime is not as dense as in New York. Moreover, the perceived safety score shows significant spatial autocorrelation even after whitening, so geographically weighted regression might be useful.
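Spatial autocorrelation of a score is commonly summarized with Moran's I. The sketch below computes it in plain Python on a toy 4 x 4 grid with binary rook-adjacency weights. The grid, the weights, and the helper names `rook_weights` and `morans_i` are assumptions for illustration; they are not the data or the procedure used in this study, and the sketch does not implement geographically weighted regression.

```python
def rook_weights(rows, cols):
    """Binary weights: 1.0 for cells sharing an edge, 0.0 otherwise."""
    n = rows * cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    w[i][rr * cols + cc] = 1.0
    return w


def morans_i(values, weights):
    """Moran's I = (n / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
    where d_i is the deviation of value i from the mean and W is the
    sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


rows, cols = 4, 4
w = rook_weights(rows, cols)

# Hypothetical scores that increase smoothly from left to right (clustered).
clustered = [float(c) for r in range(rows) for c in range(cols)]
# Hypothetical scores that alternate like a checkerboard (dispersed).
checker = [float((r + c) % 2) for r in range(rows) for c in range(cols)]

print("clustered:", morans_i(clustered, w))  # positive autocorrelation
print("checker:  ", morans_i(checker, w))    # negative autocorrelation
```

A clearly positive value on real scores would indicate that nearby locations have similar perceived safety, which is the situation in which a spatially varying model such as geographically weighted regression is usually considered.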