Data Leakage
- two types: leakage in the training data and leakage in the features
- occurs when the data used for training contains information about the target you are trying to predict
- a leaked feature is highly predictive of the target but is not legitimately available at the time the prediction has to be made
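A minimal sketch of one way to spot a possibly leaked feature: flag any feature that tracks the target almost perfectly. The records, the column names (`is_smoker`, `took_antibiotics`, `got_pneumonia`) and the 0.95 threshold are all made up for illustration. High agreement is only a warning sign; whether the feature is really available at prediction time still has to be checked by hand.

```python
# Toy records (made up for illustration) for predicting `got_pneumonia`.
# `took_antibiotics` is recorded AFTER the diagnosis, so it leaks the target.
ROWS = [
    {"is_smoker": 1, "took_antibiotics": 1, "got_pneumonia": 1},
    {"is_smoker": 0, "took_antibiotics": 0, "got_pneumonia": 0},
    {"is_smoker": 1, "took_antibiotics": 0, "got_pneumonia": 0},
    {"is_smoker": 0, "took_antibiotics": 1, "got_pneumonia": 1},
    {"is_smoker": 1, "took_antibiotics": 1, "got_pneumonia": 1},
    {"is_smoker": 0, "took_antibiotics": 0, "got_pneumonia": 0},
]


def agreement(rows, feature, target):
    """Fraction of rows where a binary feature equals the binary target."""
    matches = sum(1 for r in rows if r[feature] == r[target])
    return matches / len(rows)


def flag_suspicious(rows, features, target, threshold=0.95):
    """Return the features whose agreement with the target is at or above threshold."""
    return [f for f in features if agreement(rows, f, target) >= threshold]


suspects = flag_suspicious(ROWS, ["is_smoker", "took_antibiotics"], "got_pneumonia")
print(suspects)
```

A flagged feature should be dropped unless it can be shown to exist before the prediction is made.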
Linear classifier
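- trained by minimizing the negative log-likelihood (equivalently, maximizing the likelihood), i.e. maximizing the probability that positive examples are classified as positive and negative examples as negative
- results are easy to interpret: each coefficient gives a measure of how relevant a predictor is (its size) and of the direction of its association with the target (its sign)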
- works well with high-dimensional sparse data.
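As an illustration of the training objective and of coefficient interpretation, here is a small logistic regression sketch fitted by plain gradient descent on the mean negative log-likelihood. The data, learning rate and step count are arbitrary choices for the example, not a recommended setup; in practice a library implementation would normally be used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear_score(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def nll(w, b, X, y):
    """Mean negative log-likelihood of binary labels y under the model."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(linear_score(w, b, xi))
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)


def fit(X, y, lr=0.5, steps=500):
    """Minimize the mean negative log-likelihood with batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(linear_score(w, b, xi)) - yi  # dNLL/dscore for one example
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


# Toy data: feature 0 mostly co-occurs with label 1, feature 1 with label 0.
X = [[1, 0], [1, 0], [1, 0], [1, 0],
     [0, 1], [0, 1], [0, 1], [0, 1],
     [1, 1], [1, 1], [0, 0], [0, 0]]
y = [1, 1, 1, 0,
     0, 0, 0, 1,
     1, 0, 1, 0]

w, b = fit(X, y)
print("coefficients:", w, "intercept:", b)
print("NLL before:", nll([0.0, 0.0], 0.0, X, y), "after:", nll(w, b, X, y))
```

The sign of each fitted coefficient gives the direction of association (positive for feature 0, negative for feature 1 on this data). Sizes are only comparable across features when the features are on the same scale.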
Ordinal logistic regression
- uses the same coefficients (beta) for every class, but a different intercept (alpha) for each class threshold
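A sketch of how shared coefficients and per-threshold intercepts turn into class probabilities, assuming the common proportional-odds form P(y <= k | x) = sigmoid(alpha_k - beta . x). The sign convention differs between textbooks and libraries, and the parameter values below are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probs(x, beta, alphas):
    """Class probabilities for an ordinal model with K = len(alphas) + 1 classes.

    beta   -- one coefficient vector shared by all classes
    alphas -- increasing intercepts, one per threshold between adjacent classes
    """
    score = sum(bj * xj for bj, xj in zip(beta, x))
    # Cumulative probabilities P(y <= k); the last class always closes at 1.
    cumulative = [sigmoid(a - score) for a in alphas] + [1.0]
    # Each class probability is the difference of neighbouring cumulatives.
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


beta = [1.0, -0.5]          # hypothetical shared coefficients
alphas = [-1.0, 0.5, 2.0]   # hypothetical thresholds, so 4 ordered classes

low = ordinal_probs([-2.0, 0.0], beta, alphas)   # low score
high = ordinal_probs([2.0, 0.0], beta, alphas)   # high score
print(low)
print(high)
```

Because beta is shared, changing x shifts all cumulative curves together: under this convention a higher score moves probability mass toward the higher classes.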
Linear models
pros:
- simple and easy to train
- fast prediction
- scales well to very large datasets
- works well with sparse data
- reasons for prediction are relatively easy to interpret
cons:
- may be outperformed by other models on lower-dimensional data
- data may not be linearly separable, in which case no linear decision boundary fits it perfectly
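The linear-separability limitation can be illustrated with a tiny perceptron, used here only as a stand-in for a linear classifier. It fits AND, which is linearly separable, but cannot fit XOR, where no straight line classifies more than three of the four points correctly. The epoch count is an arbitrary choice for the sketch.

```python
def predict(w, b, x):
    """Linear threshold rule: class 1 if w . x + b > 0, else class 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


def train_perceptron(X, y, epochs=50):
    """Classic perceptron updates: adjust weights only on misclassified points."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - predict(w, b, xi)
            if err != 0:
                w = [wj + err * xj for wj, xj in zip(w, xi)]
                b += err
    return w, b


def accuracy(w, b, X, y):
    correct = sum(1 for xi, yi in zip(X, y) if predict(w, b, xi) == yi)
    return correct / len(X)


POINTS = [[0, 0], [0, 1], [1, 0], [1, 1]]
AND_LABELS = [0, 0, 0, 1]   # linearly separable
XOR_LABELS = [0, 1, 1, 0]   # not linearly separable

w_and, b_and = train_perceptron(POINTS, AND_LABELS)
w_xor, b_xor = train_perceptron(POINTS, XOR_LABELS)

and_acc = accuracy(w_and, b_and, POINTS, AND_LABELS)
xor_acc = accuracy(w_xor, b_xor, POINTS, XOR_LABELS)
print("AND accuracy:", and_acc)
print("XOR accuracy:", xor_acc)
```

Adding a nonlinear feature such as the product of the two inputs would make XOR separable again, which is one common way to extend a linear model.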