Methodology 

The scaled dataset has been split into training and test sets in an 80/20 ratio, and two models have been trained on the training set. The features used for training are: number of residential units, gross square feet, land square feet, year built, building class category, mean income in the zipcode area, number of people with a bachelor's degree or higher in the zipcode area, number of employed people in the zipcode area, and school enrollment as a proxy for the number of schools in the area. Decision trees have been used over multiple linear regression because, when the relationship between the predictor variables and the target variable is not linear, linear regression requires substantial feature engineering, which decision trees do not. Decision trees can also work with mixed feature types and are robust to outliers. Furthermore, decision trees find the best way to split the data according to the importance of features, which is required in this case. A Random Forest model and a Gradient Boosting model, both tree-based ensembles, have been trained on the data. The built models are used to make predictions on the test set, and the results have been visualized as plots. The built models also report the importance of each feature in the dataset. Decision trees are, however, still susceptible to overfitting, and ensembles of them are not easy to interpret.
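The pipeline described above can be sketched as follows. This is a minimal illustration using scikit-learn; the data here is synthetic stand-in data (the original dataset is not reproduced), and the feature names simply mirror the ones listed in the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

rng = np.random.default_rng(0)
# Hypothetical feature names mirroring those described in the text
feature_names = [
    "residential_units", "gross_sqft", "land_sqft", "year_built",
    "building_class", "mean_income", "bachelors_or_higher",
    "employed", "school_enrollment",
]
# Synthetic stand-in data: 500 rows, one column per feature
X = rng.normal(size=(500, len(feature_names)))
y = X @ rng.normal(size=len(feature_names)) + rng.normal(scale=0.1, size=500)

# 80/20 train/test split, as described above
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train the two tree-based ensemble models
models = {
    "random_forest": RandomForestRegressor(random_state=42),
    "gradient_boosting": GradientBoostingRegressor(random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)            # predictions on the test set
    importances = model.feature_importances_  # per-feature importance
    print(name, dict(zip(feature_names, importances.round(3))))
```

Both scikit-learn ensembles expose `feature_importances_` after fitting, which is what allows the per-feature importance reporting mentioned above.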

Results

The Random Forest model gives an in-sample R² of 0.9369 and an out-of-sample R² of 0.6701. The difference in performance between the training and test datasets suggests overfitting.
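The in-sample versus out-of-sample R² comparison used above to diagnose overfitting can be sketched as follows (scikit-learn, with synthetic stand-in data; the exact values reported above depend on the original dataset).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
# Synthetic stand-in data with a noisy linear signal
X = rng.normal(size=(300, 5))
y = X[:, 0] * 2 + rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestRegressor(random_state=42).fit(X_train, y_train)

r2_in = r2_score(y_train, model.predict(X_train))  # in-sample R²
r2_out = r2_score(y_test, model.predict(X_test))   # out-of-sample R²

# A large gap between the two (as in the 0.9369 vs. 0.6701 reported
# above) is the signature of overfitting.
print(f"in-sample R² = {r2_in:.3f}, out-of-sample R² = {r2_out:.3f}")
```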