[insert Figure 2 here]
Step 1: Problem Definition
When a new cosmetic design project is launched, the marketing team first
decides the product type and form, such as a facial powder or lipstick,
based on the target market, potential consumers, competing products,
etc. The quality of the new cosmetic product depends on its sensorial
and functional attributes. Usually, the consumer-desired attributes can
be specified through interviews and surveys of potential consumers. The
sensorial perceptions that a cosmetic provides are the most essential
factor for consumer satisfaction and repeated use.6 In practice,
sensorial perception is assessed through sensorial evaluation. A number
of panelists assess various cosmetic samples using well-defined
protocols, and their perceptions are quantified as sensorial ratings.
Then, an overall sensorial rating can be obtained to represent the
degree of consumer satisfaction with the cosmetic.7 Note that, in
addition to perception, other factors such as packaging and price affect
consumers’ purchase decisions. These factors are not considered in this
work, and the objective is to maximize the overall sensorial rating
(\(q\)).
\(\max\ q\) (1)
In addition, the functional attributes must also be satisfied.
Each cosmetic has its own functional attributes. For instance, a hair
spray should dry rapidly and a perfume should be transparent (Table 2).
The product attributes can be translated into relevant physicochemical
properties (e.g., melting point for lipstick) and product specifications
(e.g., sun protection factor for sunscreen products) using engineering
know-how. For the four cosmetics in Table 2, the last column lists
various properties related to their sensorial and functional attributes.
For example, the viscosity of a lipstick affects how it is sensed by the
lips, and the pH of a skin cream affects its safety. Then, a set of
design targets (i.e., lower and upper bounds) on the properties can be
identified based on engineering know-how and in-house product data.
These bounds serve as constraints in the optimization problem.
\(PL^{k}\leq P^{k}\leq PU^{k},\ \ \ k\in K\) (2)
where \(P^{k}\) is the \(k\)-th desired property, \(K\) is the set of
properties, and \(PL^{k}\) and \(PU^{k}\) are the lower and upper bounds,
respectively. Note that the nomenclature is presented in the Supporting
Information.
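As a minimal sketch of how Equation 2 acts as a screening test, the Python snippet below checks predicted property values against their design targets. The function name, property names, and numerical bounds are hypothetical and chosen only for illustration; they are not taken from this work.

```python
def out_of_bounds(properties, targets):
    """Return the properties that violate PL^k <= P^k <= PU^k (Equation 2).

    properties: {name: value P^k}
    targets:    {name: (PL^k, PU^k)}
    """
    violations = {}
    for name, (lower, upper) in targets.items():
        value = properties[name]
        if not lower <= value <= upper:
            violations[name] = (value, lower, upper)
    return violations


# Hypothetical design targets for a skin cream (illustrative numbers only).
targets = {"pH": (4.5, 6.5), "viscosity_Pa_s": (10.0, 50.0)}
candidate = {"pH": 5.5, "viscosity_Pa_s": 62.0}

# Only the viscosity falls outside its assumed target range.
violations = out_of_bounds(candidate, targets)
print(violations)
```

An empty dictionary would indicate that the candidate meets every design target in the set \(K\).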
Step 2: Ingredient Candidate Generation
To provide multiple desired attributes, many chemical ingredients are
needed. Cosmetic ingredients are classified into different types based
on their functionalities. Table S1 lists the ingredient types that are
widely used in various cosmetics and their
functions.7,28 For instance, an abrasive in a facial
cleanser consists of solid particles that physically clean hard
surfaces such as the epidermis. Three types of moisturizers (i.e., emollient,
humectant, and occlusive) can be used to provide a hydration effect.
An emollient improves the skin’s water-oil balance, a humectant inhibits
water evaporation, and an occlusive forms a water-repellent layer that
reduces water loss. For a cosmetic, the needed ingredient types can be
identified based on fundamental formulation science and the desired
product attributes.
For each ingredient type, a set of ingredient candidates can be
generated using databases29,30 and computer-aided
tools.14,31 For each of the ingredient types in
Table S1, the last column lists two commonly used ingredient candidates.
For instance, lactic acid and triethanolamine are often used as
acidic and alkaline pH buffers, respectively. After years of
development in the cosmetic industry, hundreds of ingredient candidates
exist for each ingredient type. To reduce the search space, the
candidates can be pre-screened using ingredient screening tools based on
cost, regulations, availability, etc. to generate a more organized pool
of ingredient candidates:
\(\left.\begin{matrix}
I_{A}=\left\{I_{A,1},I_{A,2},\ldots,I_{A,a}\right\}\\
I_{B}=\left\{I_{B,1},I_{B,2},\ldots,I_{B,b}\right\}\\
\cdots\\
I_{Z}=\left\{I_{Z,1},I_{Z,2},\ldots,I_{Z,z}\right\}\\
\end{matrix}\ \ \ \right\}\) (3)
where \(I_{A}\), \(I_{B}\), …, \(I_{Z}\) are ingredient types, and
\(I_{A,1}\), \(I_{A,2}\), …, \(I_{A,a}\), etc. represent the
generated ingredient candidates. Here, the subscripts \(a\), \(b\), and
\(z\) denote the number of candidates in each ingredient type. Each
candidate has different properties (e.g., density, solubility, and pH),
which can be collected from the literature, databases, and experiments.
The selection of ingredients is naturally a discrete-continuous
optimization problem. Each ingredient candidate is assigned a binary
variable \(S_{i}\) to control ingredient selection and a continuous
variable (e.g., volume fraction \(V_{i}\)) to denote its composition. If
the \(i\)-th candidate is selected, \(S_{i}\) is equal to 1 and
\(V_{i}\) is constrained by its lower (\(VL_{i}\)) and upper
(\(VU_{i}\)) bounds. Otherwise, \(S_{i}\) and \(V_{i}\) are equal to 0.
\(\sum_{i}V_{i}=1\) (4)
\(VL_{i}\cdot S_{i}\leq V_{i}\leq VU_{i}\cdot S_{i},\ \ \ i\in\left\{I_{A,1},I_{A,2},\ldots,I_{Z,z}\right\}\) (5)
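The coupling between the binary and continuous variables in Equations 4 and 5 can be sketched as a simple feasibility check. The function name, ingredient names, and bounds below are hypothetical placeholders rather than data from this work.

```python
def satisfies_eq4_eq5(S, V, VL, VU, tol=1e-9):
    """Check a recipe against Equations 4 and 5.

    S: {ingredient: 0 or 1}       selection variables S_i
    V: {ingredient: float}        volume fractions V_i
    VL, VU: {ingredient: float}   lower and upper bounds on V_i
    """
    # Equation 4: the volume fractions must sum to one.
    if abs(sum(V.values()) - 1.0) > tol:
        return False
    for i in V:
        if S[i] not in (0, 1):
            return False
        # Equation 5: if S_i = 0 both bounds collapse to 0, forcing V_i = 0.
        if not (VL[i] * S[i] - tol <= V[i] <= VU[i] * S[i] + tol):
            return False
    return True


# Hypothetical candidates and bounds (illustrative numbers only).
VL = {"water": 0.50, "glycerin": 0.02, "dimethicone": 0.01}
VU = {"water": 0.95, "glycerin": 0.20, "dimethicone": 0.10}

ok = satisfies_eq4_eq5(
    S={"water": 1, "glycerin": 1, "dimethicone": 0},
    V={"water": 0.90, "glycerin": 0.10, "dimethicone": 0.00},
    VL=VL, VU=VU,
)
# An unselected ingredient with a nonzero fraction violates Equation 5.
bad = satisfies_eq4_eq5(
    S={"water": 1, "glycerin": 1, "dimethicone": 0},
    V={"water": 0.85, "glycerin": 0.10, "dimethicone": 0.05},
    VL=VL, VU=VU,
)
print(ok, bad)
```

Multiplying both bounds by \(S_{i}\) is what lets a single pair of inequalities cover the selected and unselected cases.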
In addition to ingredients, microstructure can affect the properties
when certain product forms are used. Typically, the major
microstructural features can be characterized by some geometric
descriptors that can be correlated with the mixture properties by
experiment and multi-scale modeling to account for the
microstructure-property relationship.32 The last
column of Table 1 lists the relevant microstructure descriptors for
various commonly used cosmetic product forms. For example, the oil
droplet size affects the viscosity and texture of a moisturizing lotion
in the form of an oil-in-water emulsion.33 The
emulsion type and particle shape can be decided using
heuristics.34 Geometric descriptors \(ms\), such
as particle size, are continuous variables
\(msL\leq ms\leq msU\) (6)
where \(msL\) and \(msU\) are the lower and upper bounds,
respectively. The microstructure is determined by both the formulation and
the manufacturing process design.35
Step 3: Model Identification
Model for Sensorial Perception
A surrogate model that captures the input-output data is built to predict
the sensorial rating. After the surrogate model is trained, its analytical
form can be used for optimization.
\(q=f\left(V_{I_{A,1}},V_{I_{A,2}},\ldots,V_{I_{Z,z}},ms\right)\) (7)
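To illustrate what an analytical form of Equation 7 can look like, assume (purely as an example, not the model used in this work) a feedforward network with one hidden layer of \(H\) hyperbolic tangent units. Writing the inputs (the volume fractions and \(ms\)) as \(x_{j}\), and introducing hypothetical hidden-layer weights \(w_{hj}\), hidden biases \(\beta_{h}\), output weights \(\alpha_{h}\), and an output bias \(\alpha_{0}\), the rating and its partial derivatives are
\(q=\alpha_{0}+\sum_{h=1}^{H}\alpha_{h}\tanh\left(\beta_{h}+\sum_{j}w_{hj}x_{j}\right)\)
\(\frac{\partial q}{\partial x_{j}}=\sum_{h=1}^{H}\alpha_{h}w_{hj}\left[1-\tanh^{2}\left(\beta_{h}+\sum_{j'}w_{hj'}x_{j'}\right)\right]\)
Both expressions are available in closed form once the weights are fixed by training, so they can be supplied directly to a derivative-based optimization solver.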
The first task is to collect training data. The input data are the
cosmetic recipes and the microstructures, namely
(\(V_{I_{A,1}},V_{I_{A,2}},\ldots,V_{I_{Z,z}},ms\)). The output data are
the corresponding sensorial ratings (\(q\)). Here, historical data from
sensorial evaluations can be utilized. When historical data are
scarce, additional data sampling is required. To date, many efficient
sampling approaches have been used in the cosmetic industry, such as
Latin hypercube sampling, Plackett-Burman design, full factorial design, etc.
Following the “one in ten” rule, the number of data samples should
preferably be at least ten times the number of ingredient candidates. The
second task is to build an accurate surrogate model. Currently, multiple
types of surrogate models can be utilized, such as linear regression,
kriging, artificial neural network (ANN), radial basis function, etc.
Some surrogate models (e.g., random forest) do not provide
derivative information, whereas the derivatives of many other
surrogate models, such as linear regression and ANN with the tansig
(hyperbolic tangent sigmoid) transfer function, are symbolically
available.36 Here, a surrogate model with available derivative
information is preferred because solving a discrete-continuous
optimization problem without derivative information is very challenging.
The hyperparameters of the surrogate model should be carefully tuned;
the heuristics and experience reported in the literature can be
consulted.36,37 Afterward, model accuracy needs to be
validated. Widely used validation methods include K-fold
cross-validation and the holdout method. If the model is not sufficiently
accurate, the type of surrogate model and the hyperparameters should be
re-selected.
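The data collection and validation tasks above can be sketched with the standard library alone. The snippet below is an illustrative implementation of Latin hypercube sampling and of K-fold index splitting; the function names, input bounds, and sample count are assumptions made here. It samples independent inputs within box bounds and does not enforce the mixture constraint of Equation 4.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points; each variable's range is split into n_samples
    equal strata and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    columns = []
    for lower, upper in bounds:
        # One uniform draw inside each stratum, then shuffle the strata order.
        unit = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(unit)
        columns.append([lower + u * (upper - lower) for u in unit])
    # Transpose: one row per sample, one column per variable.
    return [list(row) for row in zip(*columns)]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]


# Hypothetical inputs: two volume fractions and a droplet size in micrometers.
bounds = [(0.02, 0.20), (0.01, 0.10), (1.0, 20.0)]
samples = latin_hypercube(30, bounds)  # 30 samples for 3 input variables
splits = list(k_fold_indices(len(samples), k=5))
```

Each sampled recipe would then be prepared and rated by the panel, and each of the five splits would be used once to fit the surrogate model on the training indices and score it on the held-out indices.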
Models for Target Properties
Three types of models can be applied for predicting the target
properties: rigorous mechanistic models, short-cut models, and surrogate
models. Typically, the formulation and application of cosmetics involve
various phenomena (e.g., kinetics, thermodynamics, and transport). For
any property, the associated phenomena should first be identified based
on the basic engineering sciences and domain knowledge, followed by the
identification of the relevant mechanistic models. Generally, rigorous
models are the most accurate but are more complex and sometimes contain
unknown parameters. The perfume diffusion model38 and
ingredient percutaneous absorption model39 are
examples. Instead of fully accounting for the physical phenomena, a simple
short-cut model captures the property’s dependence on the most
influential factors. Usually, a short-cut model is sufficiently accurate
within pre-specified conditions. Note that both rigorous and short-cut
models involve many intermediate variables for describing the relevant
phenomena. The rigorous or short-cut model for the \(k\)-th desired property
(\(P^{k}\)) can be represented as
\(P^{k}=G^{k}\left(IM_{I_{A,1}}^{k},IM_{I_{A,2}}^{k},\ldots,IM_{I_{Z,z}}^{k}\right),\ \ \ k\in K\) (8)
\(IM_{i}^{k}=IMG^{k}\left(V_{i},ms\right),\ \ \ i\in\left\{I_{A,1},I_{A,2},\ldots,I_{Z,z}\right\}\) (9)
where \(IM_{i}^{k}\) denotes the intermediate variable related to
the \(i\)-th ingredient candidate (e.g., vapor pressure and activity
coefficient). If there are no suitable mechanistic models but data are
available, surrogate models can be adopted,40 although
the model validity is often limited to the range of available data. The
input data are the sampled cosmetic recipes and microstructures, and
the output data are the target properties. For the \(k\)-th property
(\(P^{k}\)), the surrogate model is
\(P^{k}=g^{k}\left(V_{I_{A,1}},V_{I_{A,2}},\ldots,V_{I_{Z,z}},ms\right),\ \ \ k\in K\) (10)
Accordingly, for any desired property, a set of models (rigorous,
short-cut, and surrogate) should be identified for use in the
optimization.
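The two-level structure of Equations 8 and 9 can be illustrated with a deliberately simple short-cut model. The sketch below assumes ideal volume additivity for the mixture density, which is an assumption made here for illustration and not a model proposed in this work; the function names, ingredient names, and pure-component densities are likewise placeholders.

```python
def intermediate(volume_fraction, pure_density):
    """IM_i^k = IMG^k(V_i, ms): density contribution of ingredient i.

    The microstructure ms does not enter this assumed mixing rule.
    """
    return volume_fraction * pure_density


def mixture_density(volume_fractions, pure_densities):
    """P^k = G^k(IM_1^k, ..., IM_n^k): here G^k is a plain sum."""
    return sum(
        intermediate(volume_fractions[name], pure_densities[name])
        for name in volume_fractions
    )


# Placeholder data (kg/m^3); approximate values used only for illustration.
pure_densities = {"water": 1000.0, "glycerin": 1260.0}
volume_fractions = {"water": 0.8, "glycerin": 0.2}

rho = mixture_density(volume_fractions, pure_densities)
print(round(rho, 1))  # 1052.0
```

A rigorous model would replace both functions with the governing equations of the relevant phenomena, but the nesting of intermediate variables inside the property function would stay the same.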
The use of heuristics is often inevitable in cosmetic
formulation.24,41 The reason is that some phenomena
have not been identified or are poorly understood. For instance, a
hydrocolloid thickener with a weak gel network structure is preferred
for use in emulsion-based products to generate thixotropic behavior,
although no formal justification has been given.34 In
addition, heuristics can effectively help reduce the search space. Many
heuristics, although not all, can be transformed into mathematical
design constraints for use in the optimization. Table 3 shows the widely
used forms of heuristics and the associated equations for formulated
product design. For instance, if the number of ingredients for a certain
ingredient type \(I_{X}\) is suggested to lie between \(TL\) and \(TU\), an inequality constraint \(TL\leq\sum_{i}S_{i}\leq TU,\ \ i\in I_{X}\) can be generated.
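To see how such a cardinality heuristic shrinks the search space, the sketch below enumerates and counts the binary selections \(S_{i}\) for one ingredient type that satisfy \(TL\leq\sum_{i}S_{i}\leq TU\). The function names, candidate labels, and bounds are hypothetical.

```python
from itertools import combinations
from math import comb


def allowed_selections(candidates, TL, TU):
    """Enumerate subsets of one ingredient type with TL <= sum(S_i) <= TU."""
    for size in range(TL, TU + 1):
        for subset in combinations(candidates, size):
            yield subset


def count_allowed(n_candidates, TL, TU):
    """Number of binary vectors S with TL <= sum(S_i) <= TU."""
    return sum(comb(n_candidates, size) for size in range(TL, TU + 1))


# Hypothetical emollient candidates; the heuristic allows one or two of them.
emollients = ["E1", "E2", "E3", "E4", "E5", "E6"]
selections = list(allowed_selections(emollients, TL=1, TU=2))

print(len(selections), "of", 2 ** len(emollients))  # 21 of 64
```

In an optimization model the constraint would be passed to the solver rather than enumerated; the count only shows how many discrete combinations the heuristic rules out for a single ingredient type.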