Discussion of results:
The R-squared value of 0.188 means that the OLS model explains about 19% of the variance in test scores using the selected independent variables. Of the five independent variables selected, Crimes is not statistically significant, as its p-value of 0.735 is higher than the alpha of 0.05. The same holds for izone, with a p-value of 0.209. This means the model finds no statistically significant evidence that students' English test scores depend on whether a school has more crimes on its premises, or on whether the school is under the special izone scheme, where it receives more funding for teaching software. Failing to reject the null hypothesis is not proof that these variables have no effect at all. On the other hand, both class size and budget have a significant positive association with English test scores. The class size result is interesting because it implies that schools with more students per teacher tend to have better test scores, which is counter-intuitive.
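The decision rule used above can be written out as a minimal sketch. It uses only the two p-values quoted in the discussion; the remaining coefficients are left out because their p-values are not reported in the text. The names `ALPHA`, `p_values`, and `is_significant` are illustrative and not taken from the original analysis.

```python
ALPHA = 0.05  # significance level used in the discussion

# p-values quoted in the discussion; other coefficients are omitted
# because their p-values are not reported in the text
p_values = {"Crimes": 0.735, "izone": 0.209}


def is_significant(p, alpha=ALPHA):
    """Return True when the null hypothesis (coefficient equals 0) is rejected."""
    return p < alpha


for name, p in p_values.items():
    verdict = "significant" if is_significant(p) else "not significant"
    print(f"{name}: p = {p:.3f} -> {verdict} at alpha = {ALPHA}")
```

A verdict of "not significant" only says that the data did not detect an effect at this alpha level; it does not show that the true coefficient is zero.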
Note from qg412:
Dear Professor, I am really sorry that I could not complete this entire EC. I tried to clean the dataset to the best of my ability, but the OLS refused to run, and I could not do anything beyond that step. Only after the submission deadline did I discover that my y-variable contained the non-numerical value 's', and I managed to produce the regression results only after the deadline. In any case, I have attached the regression results to this report for completeness' sake. I will complete the entire EC again after finals, with plots, just so I won't be a letdown to myself.
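For reference, one possible way to handle a non-numerical marker such as 's' in the y-variable before running a regression is sketched below. It is not the cleaning code used for this report. The rows, the column name `score`, and the helper names are hypothetical, and the sketch assumes that rows with a non-numerical score should simply be dropped.

```python
def to_float_or_none(value):
    """Convert a raw cell to float; return None for markers such as 's'."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def drop_non_numeric_rows(rows, y_key):
    """Keep only rows whose y value parses as a number, stored as float."""
    cleaned = []
    for row in rows:
        y = to_float_or_none(row[y_key])
        if y is not None:
            cleaned.append({**row, y_key: y})
    return cleaned


# Hypothetical rows for illustration only, not taken from the dataset
raw_rows = [
    {"school": "A", "score": "62.5"},
    {"school": "B", "score": "s"},  # non-numerical marker
    {"school": "C", "score": "48"},
]

clean_rows = drop_non_numeric_rows(raw_rows, "score")
print(len(raw_rows) - len(clean_rows), "row(s) dropped")
```

If the data is held in a pandas DataFrame, `pd.to_numeric(..., errors="coerce")` followed by dropping the resulting missing values should achieve the same effect.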