6. Fit/train model: similar to the previous example, we use the training data to train the model and find the model parameters, such that the predicted outcome is as close as possible to the desired outcome:
kmodel.fit(X_train, y_train)
7. Evaluate Model:
scores = kmodel.evaluate(X_test, y_test)
print('Accuracy: ' + str(scores[1]*100) + '%')

Putting it together

All of the above steps can be summarized in a few lines of code. This easy and fast implementation is the most conspicuous characteristic of Keras, and it has made Keras a mainstream tool for researchers implementing deep learning models. Fast and easy implementation gives researchers a quick experimentation cycle, in which they can decide whether deep learning is a good option for their application.
from keras.models import Sequential
from keras.layers import Dense, Activation

kmodel = Sequential()
kmodel.add(Dense(nb_classes, input_shape=(dims,)))
kmodel.add(Activation('softmax'))
kmodel.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['acc'])
kmodel.fit(X_train, y_train)
scores = kmodel.evaluate(X_test, y_test)
print('Accuracy: ' + str(scores[1]*100) + '%')
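Those few Keras lines hide a lot of machinery. As a rough, hypothetical illustration of what the fit and evaluate steps do internally for a single-layer softmax model, the following NumPy-only sketch trains the same kind of classifier with plain gradient descent; the dataset, the learning rate, and the number of epochs are all made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the variables used above (illustrative only).
dims, nb_classes, n = 4, 3, 150
X_train = rng.normal(size=(n, dims))
true_W = rng.normal(size=(dims, nb_classes))
y_idx = np.argmax(X_train @ true_W, axis=1)
y_train = np.eye(nb_classes)[y_idx]          # one-hot labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)     # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Parameters the "fit" step learns: one Dense layer's weights and bias.
W = np.zeros((dims, nb_classes))
b = np.zeros(nb_classes)

lr = 0.5                                     # hyperparameter: learning rate
for epoch in range(200):                     # hyperparameter: number of epochs
    p = softmax(X_train @ W + b)
    grad = p - y_train                       # gradient of categorical cross-entropy
    W -= lr * (X_train.T @ grad) / n
    b -= lr * grad.mean(axis=0)

# The "evaluate" step: fraction of correctly predicted classes.
acc = (np.argmax(softmax(X_train @ W + b), axis=1) == y_idx).mean()
print('Accuracy: ' + str(acc * 100) + '%')
```

The gradient `p - y_train` is exactly the derivative of categorical cross-entropy composed with softmax, which is the loss and output activation compiled into the Keras model above.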

Deep Learning Hyperparameters

We talked about neural network parameters, including weights and biases. Parameters are the numbers that the machine learning algorithm learns during the learning process. For example, in logistic and linear regression, the parameters are the coefficients that the algorithm learns. Hyperparameters are all the knobs and numbers that you, as the human, control. Regularization offers some obvious examples: the L1, L2, and dropout rates are hyperparameters, and so is the decision of whether to use those regularization techniques at all. In neural networks, the number of hidden layers, the number of neurons in each layer, the activation function, the cost function, the network optimizer, the metrics used to evaluate model goodness, and any other knob that you specify about the algorithm are hyperparameters. In the following sections, you will learn what the role of each of these properties is, and what your options are for each of them.
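A minimal sketch can make the distinction concrete. In the hypothetical gradient-descent fit below, the learning rate and the number of epochs are hyperparameters you choose, while the slope and intercept are parameters the algorithm learns; the data is made up to follow the line y = 2x + 1.

```python
# Hyperparameters: chosen by the human before training begins.
learning_rate = 0.1
n_epochs = 100

# Parameters: learned by the algorithm. Here, the slope w and intercept b
# of a line fit by gradient descent on the squared error.
w, b = 0.0, 0.0
data = [(x, 2 * x + 1) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]

for _ in range(n_epochs):
    for x, y in data:
        err = (w * x + b) - y
        w -= learning_rate * err * x   # parameter update
        b -= learning_rate * err       # parameter update

# w converges toward 2 and b toward 1, recovering the line.
print(round(w, 2), round(b, 2))
```

Changing `learning_rate` or `n_epochs` changes how training behaves, but the algorithm never touches them; it only ever adjusts `w` and `b`.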

Activation Functions

As explained in the previous chapter, a neuron is a unit that calculates the weighted sum of its inputs, adds a bias, and, depending on this calculated value (Y) and its activation function, decides whether it should fire or not. Without an activation function, the weighted sum of the inputs plus the bias could take any value between negative infinity and positive infinity. The neuron does not know the bounds of this value; by adding an activation function, we map it to a certain range, based on which the neuron decides whether to fire. The question is, how do we choose what range to map the output values to, and which activation function to use?
The simplest form of an activation function is a step function, where the neuron fires if Y is greater than a threshold and does not fire otherwise. Step functions are a reasonable choice for binary classification problems; however, they fall short when we have several categories or a continuous outcome.
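The contrast is easy to see in code. In this illustrative sketch (the threshold and input values are arbitrary), the step function can only output 0 or 1, while a smooth function such as the sigmoid maps any Y into the continuous range (0, 1), so it can express degrees of confidence rather than a hard yes/no:

```python
import math

def step(y, threshold=0.0):
    """Fires (returns 1) only when the weighted sum Y exceeds the threshold."""
    return 1 if y > threshold else 0

def sigmoid(y):
    """Maps any real value of Y into the continuous range (0, 1)."""
    return 1 / (1 + math.exp(-y))

# Inputs just below and above the threshold: step flips abruptly,
# sigmoid changes gradually.
for y in (-2.0, -0.1, 0.1, 2.0):
    print(y, step(y), round(sigmoid(y), 3))
```

Notice that for Y = -0.1 and Y = 0.1 the step function jumps from 0 to 1, while the sigmoid moves only slightly around 0.5; this graded output is what multi-class and continuous problems need.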