Random variation enters in two ways:
-   bootstrap sample: each tree is trained on a bootstrap sample, which has N rows like the original training set but is drawn at random with replacement, so some original rows may be missing and others may appear several times.
-   random feature subset: at each split, instead of searching all features for the best split, a random subset of features is chosen and the best split is found within that smaller subset.
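
The two sources of randomness can be sketched with the standard library alone. This is an illustrative sketch, not a forest implementation: the helper names `bootstrap_indices` and `candidate_features`, the row count of 10, and the feature counts are all assumptions made for the example.

```python
import random


def bootstrap_indices(n_rows, rng):
    # Draw n_rows row indices WITH replacement: some rows repeat, others never appear.
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def candidate_features(n_features, max_features, rng):
    # Draw max_features distinct feature indices WITHOUT replacement.
    # A tree would repeat this draw for every split it considers.
    return sorted(rng.sample(range(n_features), max_features))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

rows = bootstrap_indices(10, rng)
missing = sorted(set(range(10)) - set(rows))  # original rows left out of this sample
features = candidate_features(n_features=8, max_features=3, rng=rng)

print("bootstrap rows:", rows)
print("rows never drawn:", missing)
print("features considered at one split:", features)
```

The bootstrap sample always has as many rows as the original set, while the feature subset is smaller than the full feature list whenever `max_features` is below the number of features.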
Prediction
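-   regression: mean of the individual tree predictions
-   classification: combine the individual tree predictions, typically by majority vote or by averaging the per-tree class probabilities and picking the most probable class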
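
A minimal sketch of how per-tree outputs might be combined. The regression case follows the note above; for classification the sketch assumes a simple majority vote, which is one common scheme (some libraries average class probabilities instead). The function names and the sample predictions are hypothetical.

```python
from collections import Counter
from statistics import mean


def predict_regression(tree_predictions):
    # Forest output is the mean of the individual tree predictions.
    return mean(tree_predictions)


def predict_classification(tree_votes):
    # Assumed scheme: majority vote over the class labels predicted by each tree.
    return Counter(tree_votes).most_common(1)[0][0]


reg = predict_regression([2.0, 3.0, 4.0])
cls = predict_classification(["cat", "dog", "cat"])

print(reg)  # 3.0
print(cls)  # cat
```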
Model Complexity
-   Learning is sensitive to max_features, the number of features considered at each split
-   max_features = 1 → trees are likely to be very different from one another and possibly deep, with many levels, because a split cannot pick the most informative feature
-   max_features ≈ number of features → trees are similar and need fewer levels, because each split can use the most informative feature
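
One way to see this effect is to fit two forests and compare average tree depth. The sketch below assumes scikit-learn is the library in use (its `RandomForestClassifier` takes a `max_features` parameter); the synthetic dataset, the tree count, and the helper `mean_depth` are choices made for illustration only, and the exact depths will depend on the data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for this sketch): 10 features, only 3 informative.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=3, random_state=0
)


def mean_depth(max_features):
    # Fit a small forest and report the average depth of its trees.
    forest = RandomForestClassifier(
        n_estimators=25, max_features=max_features, random_state=0
    )
    forest.fit(X, y)
    depths = [tree.get_depth() for tree in forest.estimators_]
    return sum(depths) / len(depths)


depth_one = mean_depth(1)           # one random feature per split
depth_all = mean_depth(X.shape[1])  # every feature available at each split

print("mean depth, max_features=1:", depth_one)
print("mean depth, max_features=all:", depth_all)
```

Comparing the two printed depths on your own data is a quick check of whether a small `max_features` is producing deeper, more varied trees.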