
Accurate prediction of standard enthalpy of formation based on semiempirical quantum chemistry methods with artificial neural network and molecular descriptors
  • Zhongyu Wan,
  • Quan-de Wang,
  • Jinhu Liang
Zhongyu Wan
Xuzhou Institute of Technology
Quan-de Wang
China University of Mining and Technology
Jinhu Liang
North University of China

Peer review status: UNDER REVIEW

05 Jun 2020: Submitted to International Journal of Quantum Chemistry
05 Jun 2020: Assigned to Editor
05 Jun 2020: Submission Checks Completed
30 Jun 2020: Reviewer(s) Assigned


This work investigates whether artificial neural networks (ANNs) combined with molecular descriptors can improve the accuracy of semiempirical quantum chemistry (SQC) methods in predicting the standard enthalpy of formation (Δ_f H^o). The training set comprises 142 organic compounds with sufficient structural diversity. Standard enthalpies of formation for the selected compounds at the semiempirical PM3 and PM6 levels are collected from the literature, and those at the PM7 level are calculated in this work. Multiple stepwise regression is first employed to screen for effective molecular descriptors, i.e., those highly correlated with the errors of the calculated standard enthalpies of formation relative to experimental values. The 7 effective molecular descriptors obtained are then used as inputs to three 7-11-1 neural network correction models that improve the accuracy of the SQC methods. With these correction models, the mean absolute errors (MAEs) in Δ_f H^o for the PM3, PM6, and PM7 methods are reduced from 22.36, 18.60, and 17.27 kJ/mol to 9.86, 9.83, and 8.95 kJ/mol, respectively. The results for the test set indicate that the neural networks do not suffer from over-fitting. Detailed analysis of the 7 effective molecular descriptors indicates that the electron-withdrawing effect is the main contributor to the correction models. The developed ANN models for the three selected SQC methods provide an efficient approach for the quick and accurate prediction of thermodynamic properties.
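
The correction scheme described in the abstract can be illustrated with a minimal pure-Python sketch of a 7-11-1 network that predicts the error of an SQC enthalpy from 7 descriptors and subtracts it from the calculated value. Everything beyond the 7-11-1 topology is an assumption made for illustration: the tanh activation, the plain stochastic gradient descent training, the sign convention (error = calculated minus experimental), and all function names are hypothetical and are not taken from the paper.

```python
import math
import random

N_IN, N_HIDDEN = 7, 11  # 7 descriptors -> 11 hidden units -> 1 output


def init_params(seed=0):
    """Small random weights for a 7-11-1 network (initialisation is an assumption)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
    b1 = [0.0] * N_HIDDEN
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b2 = 0.0
    return w1, b1, w2, b2


def _hidden(w1, b1, x):
    # tanh hidden layer (activation choice is an assumption)
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w1, b1)]


def predict_error(params, x):
    """Predicted SQC error (calculated minus experimental) for one molecule."""
    w1, b1, w2, b2 = params
    h = _hidden(w1, b1, x)
    return sum(w * hj for w, hj in zip(w2, h)) + b2


def train(params, X, errors, lr=0.05, epochs=200):
    """Plain SGD on squared error; the paper's training algorithm is not stated here."""
    w1, b1, w2, b2 = params
    for _ in range(epochs):
        for x, target in zip(X, errors):
            h = _hidden(w1, b1, x)
            y = sum(w * hj for w, hj in zip(w2, h)) + b2
            g = y - target  # gradient of 0.5 * (y - target)**2 w.r.t. y
            for j in range(N_HIDDEN):
                gh = g * w2[j] * (1.0 - h[j] ** 2)  # backprop through tanh
                w2[j] -= lr * g * h[j]
                b1[j] -= lr * gh
                for i in range(N_IN):
                    w1[j][i] -= lr * gh * x[i]
            b2 -= lr * g
    return w1, b1, w2, b2


def corrected_enthalpy(params, h_sqc, descriptors):
    """Corrected value = SQC value minus the predicted error."""
    return h_sqc - predict_error(params, descriptors)


def mean_absolute_error(pred, ref):
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


if __name__ == "__main__":
    # Synthetic stand-in data, NOT the paper's 142-compound training set.
    rng = random.Random(1)
    X = [[rng.uniform(-1, 1) for _ in range(N_IN)] for _ in range(40)]
    errors = [2.0 * x[0] - x[3] for x in X]  # made-up error pattern
    params = train(init_params(), X, errors)
    fitted = [predict_error(params, x) for x in X]
    print("MAE of fitted error model:", mean_absolute_error(fitted, errors))
```

In practice the descriptor columns would be the 7 descriptors retained by the stepwise regression, scaled to a comparable range, and a separate network would be fitted to the errors of each SQC method.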