Machine learning implementation
The aforementioned algorithms belong to the classical ML techniques and
can be implemented quickly, as they are readily available in software
libraries for e.g. Matlab, Python or R. The following procedure for
employing ML techniques in a digital twin framework was demonstrated by
Min et al. and can, in broad terms, be applied in most cases:
- Preprocessing
- Feature extraction
- Model training and validation
- Tryout and optimization
- Online deployment
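The first two steps of this procedure can be illustrated with a minimal sketch. It assumes z-score scaling as the preprocessing step and a correlation threshold as the feature selection rule; the sensor names, the readings and the threshold of 0.5 are made up for illustration and are not taken from Min et al.

```python
import math

def zscore(column):
    """Scale a list of floats to zero mean and unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    if std == 0.0:
        return [0.0 for _ in column]  # a constant signal carries no information
    return [(v - mean) / std for v in column]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / (sx * sy)

def select_features(features, target, threshold=0.5):
    """Return the names of features whose |correlation| with the target reaches the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= threshold]

# Hypothetical sensor readings (made-up numbers, for illustration only).
features = {
    "temperature": [20.0, 22.0, 24.0, 26.0, 28.0],
    "pressure": [1.0, 1.2, 0.9, 1.1, 1.0],
}
target = [0.50, 0.56, 0.61, 0.66, 0.72]

scaled = {name: zscore(values) for name, values in features.items()}
selected = select_features(scaled, target)
print(selected)
```

In this toy data set only the temperature signal is strongly correlated with the target, so it is the only feature that is kept. In practice the threshold would be chosen per process, and operator experience may override a purely statistical selection.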
Preprocessing comprises, for example, the temporal alignment, de-noising
or scaling of data. Feature extraction refers to the selection or
transformation of data by means of a correlation matrix, PCA or even
operator experience. Some common modelling techniques were already
presented in Table 1. Other noteworthy approaches include
genetic/evolutionary algorithms, fuzzy logic, probability-based
techniques (e.g. Gaussian processes), semi-supervised and
reinforcement learning or artificial neural networks. For the latter
kind, structures with convolutional layers and long short-term memory
units (LSTM) have proven to be effective for image analysis and time
series data, respectively. These neural networks capture the spatial or
temporal structure of the data and have led to remarkable advances in
image classification and speech recognition. As the large number of
choices for an ML model can be overwhelming at first, it is common
practice to test different models and compare their performance based on
chosen metrics. For a regression problem, the root mean squared
error (RMSE) or the coefficient of determination (R²) is often used. If an
adaptive online mechanism is desired, the training and forecast speed
can be a deciding factor as well.
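The comparison of candidate models by RMSE and R² can be sketched as follows. The two candidates (a mean baseline and a least-squares line) and the data are assumptions chosen only to keep the example small; any trained model that returns predictions could be scored in the same way.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(x, y):
    """Ordinary least squares for y = a * x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return a, my - a * mx

# Made-up training and test data, for illustration only.
x_train, y_train = [0.0, 1.0, 2.0, 3.0], [1.0, 3.1, 4.9, 7.0]
x_test, y_test = [4.0, 5.0, 6.0], [9.1, 10.9, 13.0]

a, b = fit_line(x_train, y_train)
baseline = sum(y_train) / len(y_train)

# Predictions of each candidate model on the held-out test data.
candidates = {
    "mean_baseline": [baseline for _ in x_test],
    "linear": [a * xi + b for xi in x_test],
}
scores = {name: {"rmse": rmse(y_test, pred), "r2": r_squared(y_test, pred)}
          for name, pred in candidates.items()}
best = min(scores, key=lambda name: scores[name]["rmse"])
print(best, scores[best])
```

Note that R² becomes negative for the baseline here, which indicates a model that performs worse on the test data than simply predicting the test mean.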
During the tryout and optimization stage, the trained model is tested in
a real-time operating environment. Since the model is trained on
historical data, it is important to verify its operational reliability
in the latest environment and adapt the model if necessary. Finally, the
virtual model is deployed with connection to the real-time data and the
process control or monitoring system. The optimal set of control
parameters can be found by means of search algorithms such as depth-first
search, breadth-first search or grid search, or by employing model
predictive control or other control strategies. For security reasons,
it is advisable to present the recommendations of the ML model visibly
to an operator rather than to grant the model direct access to the
process control system, especially at an experimental stage. This is
related to the veracity problem that is often associated with ML
solutions. It can be
difficult to interpret suggestions made by purely
data-driven models and to justify the adaptation of e.g. a control
strategy based on these models. A possible solution lies in the
incorporation of first-principle models to form hybrid models that
support rational decision-making. Such advances could revolutionize
the perception of ML solutions in process engineering, but are still
considered a “long, adventurous, and intellectually exciting
journey”.
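The grid search over control parameters and the advisory mode of operation described above can be sketched as follows. The function `surrogate_yield` is a hypothetical stand-in for a trained model, and the parameter names, grid values and message format are assumptions made for illustration.

```python
from itertools import product

def surrogate_yield(temperature, flow_rate):
    """Hypothetical stand-in for a trained model: predicted yield in arbitrary units."""
    return 1.0 - 0.001 * (temperature - 80.0) ** 2 - 0.05 * (flow_rate - 2.0) ** 2

def grid_search(model, grid):
    """Evaluate the model on every combination of grid values and return the best one."""
    names = list(grid)
    best_params, best_value = None, float("-inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        value = model(**params)
        if value > best_value:
            best_params, best_value = params, value
    return best_params, best_value

def recommend(current, proposed, predicted):
    """Format an advisory message for the operator; nothing is written to the control system."""
    changes = [f"{key}: {current[key]} -> {value}"
               for key, value in proposed.items() if current[key] != value]
    summary = ", ".join(changes) or "no change"
    return f"Suggested setpoints ({summary}), predicted yield {predicted:.3f}"

# Assumed search grid and current operating point.
grid = {"temperature": [70.0, 75.0, 80.0, 85.0], "flow_rate": [1.0, 1.5, 2.0, 2.5]}
current = {"temperature": 75.0, "flow_rate": 2.0}

best_params, best_value = grid_search(surrogate_yield, grid)
message = recommend(current, best_params, best_value)
print(message)
```

Keeping `recommend` free of any write access reflects the advisory approach: the operator remains the one who changes setpoints. An exhaustive grid is only practical for a few parameters with coarse levels, since the number of evaluations grows multiplicatively with each added parameter.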
Another opportunity lies in the integration of edge and cloud
computing solutions to facilitate the access to and treatment of data.
With the increasing computational power of microcontrollers, it becomes
increasingly feasible to preprocess and analyze sensor data locally,
e.g. for each unit operation, and to forward the processed
information to a higher-level control or data handling structure, which
in turn can function more efficiently due to the reduced amount, but
higher quality, of data. The development of this type of “smart
equipment”, which uses new sensor or ML solutions to determine
hard-to-measure variables and offer more process flexibility, has
attracted considerable attention in academic and industrial research.
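A minimal sketch of such local preprocessing is given below. It assumes a trailing moving average for de-noising and a block-wise summary as the reduced payload that an edge device would forward; the window length, block size and sample values are made up for illustration.

```python
from statistics import mean

def moving_average(samples, window=3):
    """Smooth raw samples with a trailing moving average (shorter window at the start)."""
    smoothed = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        smoothed.append(mean(samples[start:i + 1]))
    return smoothed

def summarize(samples, block_size):
    """Reduce each full block of samples to one record with mean, min and max."""
    records = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        records.append({"mean": mean(block), "min": min(block), "max": max(block)})
    return records

# Made-up raw sensor samples; the value 15.0 mimics a noise spike.
raw = [10.0, 10.4, 9.8, 15.0, 10.1, 9.9, 10.2, 10.0]

smoothed = moving_average(raw, window=3)
payload = summarize(smoothed, block_size=4)  # what would be forwarded upstream
print(len(raw), "samples reduced to", len(payload), "records")
```

Here eight raw samples are reduced to two summary records, and the spike is damped before it reaches the higher-level system. Which statistics to forward depends on what the downstream model needs, so the choice of mean, minimum and maximum is only one option.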
Some examples include the digitalization of extraction columns by means
of novel measurement techniques complemented by modeling and simulation
methods to create tools for predictive online monitoring, as
reviewed by Hlawitschka et al. Other approaches work on the
integration of novel sensors and actuators to obtain valuable process
information and create more responsive equipment. In order to
facilitate the integration of the given examples and other ML solutions
into existing processes, new concepts with standardized interfaces and
communication protocols are emerging. One promising concept that stands
out in these aspects is the Module Type Package (MTP), which is a
module-based approach with embedded process knowledge and standardized
interfaces according to VDI/VDE/NAMUR 2658 parts 1-4. MTP enables
a quick and flexible design of processes and the integration of modules,
so-called process equipment assemblies (PEAs; cf. VDI 2776), into
a higher-level control system, which is referred to as the process
orchestration layer (POL). Due to the standardized interfaces, the data
from every PEA or the entire process are easily accessible and ML
solutions can be implemented quickly. A special MTP feature is the
service-oriented architecture, which makes it possible to run
recipes with the predefined services each PEA offers to the POL. This
feature could be used to run the process with many different control
variables, study the respective effects of control variables and observe
states that are usually undesired. For most processes, this kind of data
is scarce because such states are not the optimal way to operate the
process, but it is useful for training ML models that are supposed to
prevent those states. The recipe feature could further be used for the
automated execution of experiments via “Design of Experiments” (DoE) in
conjunction with ML algorithms to optimize a product or process. The
implementation of such ML solutions via a service-like architecture, as
proposed by Soto et al., is also an interesting concept that
would greatly benefit from accepted standards.
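The idea of generating training data from automated recipe runs can be sketched with a full factorial design, which is one simple DoE scheme among many. The function `run_recipe` is a hypothetical placeholder and does not represent the MTP or POL interface; the factor names, levels and the response formula are likewise made up for illustration.

```python
from itertools import product

def full_factorial(factors):
    """Return one run (dict of setpoints) for every combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in product(*(factors[name] for name in names))]

def run_recipe(setpoints):
    """Hypothetical placeholder for executing one recipe and measuring a response.

    A real implementation would call the services a PEA offers to the POL;
    here a made-up formula returns a deterministic value instead.
    """
    return (-(setpoints["temperature"] - 60.0) ** 2
            + 0.01 * setpoints["stirrer_speed"]
            - setpoints["feed_rate"])

# Assumed control variables and levels to be explored.
factors = {
    "temperature": [50.0, 60.0, 70.0],
    "stirrer_speed": [200, 400],
    "feed_rate": [1.0, 2.0],
}

design = full_factorial(factors)
# Pairs of (setpoints, response) that could later serve as ML training data.
dataset = [(run, run_recipe(run)) for run in design]
print(len(design), "runs planned")
```

With three, two and two levels the design contains twelve runs. Since the run count grows quickly with the number of factors, fractional or adaptive designs may be preferable for larger studies.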