where \(\mathbf{x}_{t+1}\) is (the position of) a solution at time step \(t+1\), \(\alpha\) is the step size, and \(\mathbf{d}_{t}\) is the direction of the movement.
Since there is no general a priori clue about which direction leads toward the optimal solution, various mathematical properties are used to define a suitable search direction and step size that better guide the process to optimal points.
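As a minimal sketch of the iterative update \(\mathbf{x}_{t+1} = \mathbf{x}_t + \alpha\,\mathbf{d}_t\), the snippet below (illustrative, not from the source) uses the negative gradient of the simple objective \(f(x) = x^2\) as the direction; the function and parameter names are assumptions for illustration only. It also shows how the step size \(\alpha\) governs whether the iterates settle at the minimum or oscillate around it.

```python
def iterate(x0, direction, alpha, steps):
    """Generic update x_{t+1} = x_t + alpha * d_t."""
    x = float(x0)
    history = [x]
    for _ in range(steps):
        x = x + alpha * direction(x)
        history.append(x)
    return history

# Example objective f(x) = x^2, whose steepest-descent
# direction is d_t = -f'(x_t) = -2 * x_t.
grad_dir = lambda x: -2.0 * x

# A small step size contracts toward the minimum at x = 0 ...
small = iterate(5.0, grad_dir, alpha=0.1, steps=50)
# ... while alpha = 1.0 makes each step overshoot symmetrically
# (x -> -x), so the iterates oscillate and never reach 0.
large = iterate(5.0, grad_dir, alpha=1.0, steps=50)
```

Here `small[-1]` is close to 0, while `large` alternates between 5.0 and -5.0 indefinitely, a simple instance of the oscillatory behaviour a poorly chosen step size can cause.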
First-order derivative methods use the direction opposite to the gradient vector, with a certain step size, to find minimal points. Although first-order methods converge quickly, depending on the step size an oscillatory process can sometimes start, and the optimal point is never reached. The classical, well-known gradient method, a first-order method, is shown in Eq. (2):