Linear classifier
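minimize the negative log-likelihood (equivalently, maximize the log-likelihood), i.e. maximize the probability that positive examples are classified as positive (and negative examples as negative)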
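A minimal sketch of that objective, assuming the linear classifier is a binary logistic regression with labels in {0, 1}. The names `sigmoid`, `negative_log_likelihood`, `w`, `b`, and the toy data are illustrative and not from the notes.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def negative_log_likelihood(w, b, X, y):
    # Sum of -log p(y_i | x_i) over the data set; lower is better.
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x_i, y_i in zip(X, y):
        z = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b  # linear score
        p = sigmoid(z)                                      # P(y = 1 | x)
        total -= y_i * math.log(p + eps) + (1 - y_i) * math.log(1 - p + eps)
    return total

# Toy 1-D data (assumed): positives have x > 0, negatives have x < 0.
X = [[2.0], [1.0], [-1.0], [-2.0]]
y = [1, 1, 0, 0]

nll_good = negative_log_likelihood([2.0], 0.0, X, y)   # weights that separate the classes
nll_bad = negative_log_likelihood([-2.0], 0.0, X, y)   # weights with the wrong sign
```

Weights that give positive examples a high probability of being positive produce the smaller loss, so `nll_good` comes out below `nll_bad`.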
Ordinal logistic regression
uses the same coefficients (beta) for every class, but a different intercept (alpha) for each class
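A rough sketch of the shared-beta, per-class-alpha idea, assuming the cumulative (proportional-odds) form P(y <= k | x) = sigmoid(alpha_k - beta . x) with K - 1 increasing intercepts for K ordered classes. The sign convention, the function name `ordinal_probs`, and the numbers are assumptions for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def ordinal_probs(x, beta, alphas):
    # beta: one coefficient vector shared by all classes.
    # alphas: increasing intercepts, one per boundary between adjacent classes.
    eta = sum(b_j * x_j for b_j, x_j in zip(beta, x))
    # Cumulative probabilities P(y <= k | x); the last class closes at 1.
    cumulative = [sigmoid(a - eta) for a in alphas] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(y = k) = P(y <= k) - P(y <= k - 1)
        previous = c
    return probs

# Three ordered classes (0 < 1 < 2), so two intercepts (assumed values).
probs_mid = ordinal_probs([0.5], beta=[1.0], alphas=[-1.0, 1.0])
probs_high = ordinal_probs([3.0], beta=[1.0], alphas=[-1.0, 1.0])
```

Because beta is shared, changing x shifts all the cumulative curves together; only the intercepts decide where one class ends and the next begins.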
Linear models
pros:
cons: