Activation functions are the functions that govern the behavior of the nodes of an artificial neural network. Passing an input through the network is known as forward propagation: at each layer, the inputs, weights, and biases are combined and transformed by the activation function, and the result becomes the input to the next layer (both ideas are illustrated in the code sketches after the following list). Examples of activation functions include:
a) Linear: a function in which the dependent variable has a direct, proportional relationship with the independent variable;
b) Sigmoid: a logistic function that maps an independent variable of essentially infinite range into simple probabilities between 0 and 1. Its characteristic effect is therefore to compress extreme values or outliers without removing them;
c) Tanh: a hyperbolic trigonometric function that transforms the independent variable to the range between -1 and 1. Its advantage is that it can deal easily with negative numbers;
d) Softmax: a generalization of logistic regression that can be applied to continuous data and supports multiple decision boundaries. This function is often used in multi-class classification problems;
e) Rectified Linear Unit (ReLU): a function that activates a node only if the input is above a certain threshold (zero in the standard form). Above this threshold, the output has a linear (identity) relationship with the input.
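As a concrete illustration, the following is a minimal NumPy sketch of the five activations above. The function names and the slope parameter of the linear variant are illustrative choices, not taken from any particular library.

```python
import numpy as np

def linear(x, slope=1.0):
    # Identity-style activation: output is directly proportional to the input.
    return slope * x

def sigmoid(x):
    # Logistic function: squashes any real input into the (0, 1) range.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Hyperbolic tangent: squashes input into the (-1, 1) range,
    # handling negative inputs symmetrically.
    return np.tanh(x)

def softmax(x):
    # Normalized exponentials: turns a score vector into a probability
    # distribution over classes (subtracting the max improves numerical stability).
    e = np.exp(x - np.max(x))
    return e / e.sum()

def relu(x):
    # Rectified linear unit: zero below the threshold (0), identity above it.
    return np.maximum(0.0, x)
```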
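Building on these, the sketch below illustrates forward propagation through a small two-layer network: each layer applies an affine transform (weights and bias) followed by an activation, and its output feeds the next layer. The layer sizes and random weights are hypothetical, and relu and softmax are redefined so the snippet runs on its own.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)

# Hypothetical layer sizes: 3 inputs -> 4 hidden units -> 2 outputs.
x = rng.normal(size=3)                         # input vector
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # hidden-layer weights and biases
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # output-layer weights and biases

# One forward pass: affine transform, then activation, layer by layer.
h = relu(W1 @ x + b1)       # hidden layer with ReLU
y = softmax(W2 @ h + b2)    # output layer with softmax (class probabilities)

print(y, y.sum())           # probabilities that sum to 1
```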