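The learning rate controls how hard each new tree tries to correct the remaining mistakes of the trees before it. A high learning rate puts strong emphasis on correcting those mistakes, which tends to give more complex trees; a low learning rate tends to give simpler trees.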
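To make the role of the learning rate concrete, here is a minimal sketch of boosting for regression with squared error, written in plain Python. It is not how any particular library implements it: the one-split "stump" learner, the toy data, and the function names (`fit_stump`, `boost`, `mse`) are all assumptions made for illustration. Each new stump is fit to the current residuals, and the learning rate scales how much of its correction is added to the running prediction.

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree to the residuals by minimising squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees, learning_rate):
    """Return (base prediction, list of stumps, final training predictions)."""
    base = sum(ys) / len(ys)          # start from the mean of the targets
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        # Each new stump tries to correct what the ensemble still gets wrong.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # The learning rate scales how much of that correction is applied.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, preds


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical toy data, chosen only for illustration.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0.0, 0.0, 1.0, 1.0, 4.0, 4.0, 9.0, 9.0]

for lr in (0.1, 1.0):
    _, _, preds = boost(xs, ys, n_trees=5, learning_rate=lr)
    print(f"learning_rate={lr}: training MSE after 5 stumps = {mse(ys, preds):.3f}")
```

With the same number of stumps, the larger learning rate drives the training error down faster, because each stump's correction is applied in full. The smaller rate applies only a fraction of each correction, so it usually needs more trees to reach the same training fit.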
Pros:
(1) often achieves excellent accuracy, and makes fast predictions without using a lot of memory.
(2) doesn't require normalization of features.
(3) handles a mixture of feature types (binary, continuous, categorical).
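One way to see why feature normalization is not needed: a tree split depends only on how a threshold partitions the samples, so rescaling a feature moves the threshold but leaves the partition unchanged. The sketch below illustrates this with a single split on made-up data. The helper names (`best_split`, `predict`) and the rescaling `1000 * x + 5` are assumptions chosen for the example, not part of any library.

```python
def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict(split, x):
    threshold, left_mean, right_mean = split
    return left_mean if x <= threshold else right_mean


# Hypothetical toy data: one feature, two clearly separated groups.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

# The same feature on a very different scale (an arbitrary linear rescaling).
scaled_xs = [1000 * x + 5 for x in xs]

raw_split = best_split(xs, ys)
scaled_split = best_split(scaled_xs, ys)

raw_preds = [predict(raw_split, x) for x in xs]
scaled_preds = [predict(scaled_split, x) for x in scaled_xs]

# The thresholds differ, but the samples are partitioned the same way,
# so the predictions for each sample are identical.
print(raw_split[0], scaled_split[0])
print(raw_preds == scaled_preds)
```

Because every tree in the ensemble chooses its splits this way, the argument carries over from a single split to the whole boosted model.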
This method does have several downsides.
(1) the models are difficult for people to interpret.
(2) requires careful tuning of the learning rate and other parameters.
(3) like decision trees, not recommended for text classification and other problems with very high-dimensional sparse features, for both accuracy and computational cost reasons.