The plots show that neither model performs well at higher sale prices. The Random Forest model explains more of the variance in the data than the Gradient Boosting model. The models also do not give a reliable answer to the question of which features are most important. The feature importances may be inaccurate because some features are correlated with each other, such as gross square feet and number of residential units.
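One way to check whether correlated features could be distorting the importances is to compute pairwise correlations before interpreting them. The sketch below is only an illustration: the data is synthetic, the column names (`residential_units`, `gross_square_feet`, `year_built`), the helper names, and the 0.8 threshold are all assumptions and are not taken from the project code.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every feature pair with |r| >= threshold."""
    flagged = []
    for name_a, name_b in combinations(columns, 2):
        r = pearson(columns[name_a], columns[name_b])
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Synthetic data for illustration only (assumption: not the project's dataset).
# Square footage is generated to grow with the number of units, plus noise.
rng = random.Random(0)
units = [rng.randint(1, 20) for _ in range(500)]
features = {
    "residential_units": units,
    "gross_square_feet": [900 * u + rng.gauss(0, 800) for u in units],
    "year_built": [rng.randint(1900, 2015) for _ in range(500)],
}

flagged = correlated_pairs(features)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```

When a pair is flagged like this, tree-based importances tend to be split between the two features, so it may be more reasonable to interpret their importances jointly, or to drop one of the pair and refit, than to rank them individually.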
Weakness
The model performance can be improved by using additional features. Such features could supply information that is currently missing and help train the model better, for example the number of rooms, the number of bathrooms, the presence of a garage, and other utilities.
Links
Data :
Code :
Bibliography
Yusof, A., & Ismail, S. (2012). Multiple Regressions in Analyzing House Price Variations. Communications of the IBIMA, 1–9.
Fan, G. Z., Ong, S. E., & Koh, H. C. (2006). Determinants of house price: A decision tree approach. Urban Studies, 43(12), 2301–2315.