Introduction
Biologically-derived drugs have formed a notable sector of the
pharmaceutical industry over the past 20 years. Prokaryotic systems
cannot effectively express glycosylated biologically-derived
drugs. Moreover, 90% of pharmaceutical proteins are typically
terminated at the initial steps of clinical development because of their
low solubility (Dai et al., 2014). In many cases, recovering active
recombinant protein by solubilizing inclusion bodies is considered
undesirable. The solubility of a recombinant
protein can indicate the quality of its function. Generally, 30% of
recombinant proteins are expressed in aggregate or insoluble form
(Malaei, Rasaee, Latifi, & Rahbarizadeh, 2019; Sørensen & Mortensen,
2005). Soluble, pure and functional proteins are in high
demand in biotechnology, for both vaccine development and biologically-derived
drugs. Scarce natural protein sources, complex purification steps and high
cost are factors favoring the application of recombinant cells as
suitable tools for protein production. Owing to its short generation time,
high-density culture, well-known genetics and cost effectiveness, the
Gram-negative Escherichia coli (E. coli) is an attractive
host for the expression of recombinant proteins. In spite of all these
qualities, expression of recombinant proteins in E. coli mostly
yields insoluble or inclusion body forms (Esmaili, Sadeghi, & Akbari,
2018; Fakruddin, Mohammad Mazumdar, Bin Mannan, Chowdhury, & Hossain,
2012; Singhvi, Saneja, Srichandan, & Panda, 2020; Terol, Gallego-Jara,
Martínez, Díaz, & de Diego Puente, 2019). Although inclusion body
formation can simplify protein purification and increase recombinant
protein yield, the protein refolding process involves a series of onerous
tasks (Hamidi, Safdari, & Arabi, 2019; He & Ohnishi, 2017;
Leong, Chua, Samah, & Chew, 2019), and the majority of refolded
proteins lack any biological activity. Soluble, properly folded
protein is nevertheless necessary for the structural and functional studies of a
protein (Rosano, Morales, & Ceccarelli, 2019). Hence, bioinformatics
tools can be considered a useful approach to predict the solubility of
overexpressed proteins in E. coli.
To our knowledge, this is the first report comparing bioinformatics
prediction and experimental results in overexpression of soluble
recombinant proteins in E. coli (Habibi, Hashim, Norouzi, &
Samian, 2014). Here, the recommended strategies for improving soluble
expression of a protein of interest are grouped into the following three
sections: (1) gene design and bioinformatics prediction
tools; (2) selection of vector and host strain; and (3) cell culture
conditions.