
A systematic DNN-based QSPR modeling methodology for rapid and reliable prediction on flashpoints of chemicals
  • Huaqiang Wen,
  • Yang Su,
  • Zihao Wang,
  • Saimeng Jin,
  • Jingzheng Ren,
  • Weifeng Shen,
  • Mario Eden
Huaqiang Wen
Chongqing University
Yang Su
Chongqing University of Science and Technology
Zihao Wang
Max Planck Institute for Dynamics of Complex Technical Systems
Saimeng Jin
Chongqing University
Jingzheng Ren
The Hong Kong Polytechnic University
Weifeng Shen
Chongqing University
Mario Eden
Auburn University

Abstract

Quantitative structure-property relationship (QSPR) studies based on deep neural networks (DNNs) are receiving increasing attention because of their excellent performance. A systematic methodology coupling multiple machine learning techniques is proposed to address key problems in DNN-based QSPRs, including the applicability domain and prediction uncertainty. Key features are rapidly extracted from abundant but disordered descriptors by principal component analysis (PCA) and kernel PCA. A detailed applicability domain (AD) is then defined with the K-means algorithm to avoid unreliable predictions and to examine its potential impact on uncertainty. Moreover, prediction uncertainty is analyzed with a dropout-embedded DNN through thousands of independent tests to assess the reliability of predictions. The prediction of flashpoint temperature is employed as a case study, demonstrating that model accuracy is remarkably improved compared with the reference model. More importantly, the proposed methodology overcomes difficulties in analyzing the uncertainty of DNN-based QSPRs and presents an AD correlated with the uncertainty.
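The three steps named in the abstract (feature extraction, a cluster-based applicability domain, and dropout-based uncertainty) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic descriptor matrix, linear PCA only (kernel PCA is omitted), the number of clusters, the 95th-percentile distance threshold used as the AD rule, and the tiny untrained network with random weights. It shows the general shape of the techniques, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable


def pca_fit(X, n_components):
    """Return the column means and the top principal axes of X (rows = samples)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal axes of the centered data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def pca_transform(X, mean, components):
    """Project samples onto the retained principal axes."""
    return (X - mean) @ components.T


def nearest_centroid_distance(Z, centroids):
    """Distance from each row of Z to its closest centroid, plus that centroid's index."""
    d = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return d.min(axis=1), d.argmin(axis=1)


def kmeans(Z, k, n_iter=50):
    """Plain Lloyd's algorithm, initialised from k randomly chosen samples."""
    centroids = Z[rng.choice(len(Z), size=k, replace=False)].copy()
    for _ in range(n_iter):
        _, labels = nearest_centroid_distance(Z, centroids)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centroid unchanged
                centroids[j] = Z[labels == j].mean(axis=0)
    return centroids


def mc_dropout_predict(z, W1, b1, W2, b2, p_drop=0.2, n_passes=1000):
    """Repeat stochastic forward passes with dropout kept active at inference."""
    preds = np.empty(n_passes)
    for t in range(n_passes):
        h = np.maximum(0.0, z @ W1 + b1)       # ReLU hidden layer
        mask = rng.random(h.shape) >= p_drop   # drop each unit with probability p_drop
        h = h * mask / (1.0 - p_drop)          # inverted-dropout rescaling
        preds[t] = h @ W2 + b2
    return preds.mean(), preds.std()


# Synthetic stand-in for a descriptor matrix (hypothetical data).
X_train = rng.normal(size=(200, 10))
mean, components = pca_fit(X_train, n_components=3)
Z_train = pca_transform(X_train, mean, components)

# Assumed AD rule: within the 95th-percentile training distance of some centroid.
centroids = kmeans(Z_train, k=4)
train_dist, _ = nearest_centroid_distance(Z_train, centroids)
threshold = np.percentile(train_dist, 95)


def in_domain(Z):
    """Boolean mask: True where a sample falls inside the assumed AD."""
    return nearest_centroid_distance(Z, centroids)[0] <= threshold


# Untrained network with random weights, only to show the sampling loop.
W1, b1 = rng.normal(size=(3, 16)), rng.normal(size=16)
W2, b2 = rng.normal(size=16), 0.0

z_query = Z_train[0]
z_far = Z_train.max(axis=0) + 100.0  # deliberately far from all training data
pred_mean, pred_std = mc_dropout_predict(z_query, W1, b1, W2, b2)
print("in domain:", in_domain(z_query[None, :])[0], in_domain(z_far[None, :])[0])
print("prediction mean and spread:", pred_mean, pred_std)
```

In a sketch like this, the spread of the repeated dropout passes serves as the uncertainty estimate, and comparing that spread for in-domain and out-of-domain queries is one way to look for the kind of AD-uncertainty correlation the abstract describes.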

Peer review status: IN REVISION

14 May 2021: Submitted to AIChE Journal
19 May 2021: Assigned to Editor
19 May 2021: Submission Checks Completed
26 May 2021: Reviewer(s) Assigned
15 Jun 2021: Editorial Decision: Revise Major