Modeling the effects of media formulated with various yeast extracts on
heterologous protein production by Escherichia coli using machine
learning
Abstract
In microbial manufacturing, yeast extract is an important component of
growth media. Heterologous protein production often varies with yeast
extract composition. To identify why yeast extract composition affects
protein production, the effects of yeast extract compositions on the
growth and green fluorescent protein (GFP) production of engineered
Escherichia coli were investigated using a deep neural network
(DNN)-mediated metabolomics approach. We observed 205 peaks from various
yeast extracts using gas chromatography-mass spectrometry. Principal
component analyses of the peaks identified at least three different
clusters. When 20 different compositions of yeast extract were added to
M9 media, the yields of cells and GFP were approximately 3.0–5.0 fold and
1.5–2.0 fold higher, respectively, than those in the control without
yeast extract. We compared machine
learning models and found that the DNN fit the data best. To estimate the
importance of each variable, we combined the DNN with a mean increase in
error calculation based on a permutation algorithm. This method identified the
significant components of yeast extract. DNN learning with varying
numbers of input variables indicated the number of significant
components. The influence of specific components on cell growth and GFP
production was confirmed with a validation cultivation.
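
The permutation-based importance calculation mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, a least-squares linear model stands in for the DNN, mean squared error is assumed as the error measure, and the function name permutation_importance and its parameters are hypothetical. The idea is to shuffle one input variable at a time and record how much the prediction error increases on average.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in mean squared error when each column of X is shuffled."""
    rng = np.random.default_rng(seed)
    base_error = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its relationship with the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            perm_error = np.mean((predict(X_perm) - y) ** 2)
            increases.append(perm_error - base_error)
        importances[j] = np.mean(increases)
    return importances

# Synthetic stand-in data (assumption): 3 "peaks", only the first two matter.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Least-squares linear model used here in place of a trained DNN (assumption).
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(data):
    return data @ coef

importances = permutation_importance(predict, X, y)
ranking = np.argsort(importances)[::-1]  # most important variable first
print(ranking, importances)
```

Any fitted model that exposes a prediction function could be passed in place of the linear stand-in; variables whose shuffling barely changes the error are candidates for exclusion.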