\documentclass[10pt]{article}
\usepackage{fullpage}
\usepackage{setspace}
\usepackage{parskip}
\usepackage{titlesec}
\usepackage[section]{placeins}
\usepackage{xcolor}
\usepackage{breakcites}
\usepackage{lineno}
\usepackage{hyphenat}
\PassOptionsToPackage{hyphens}{url}
\usepackage[colorlinks = true,
linkcolor = blue,
urlcolor = blue,
citecolor = blue,
anchorcolor = blue]{hyperref}
\usepackage{etoolbox}
\makeatletter
\patchcmd\@combinedblfloats{\box\@outputbox}{\unvbox\@outputbox}{}{%
\errmessage{\noexpand\@combinedblfloats could not be patched}%
}%
\makeatother
\usepackage[round]{natbib}
\let\cite\citep
\renewenvironment{abstract}
{{\bfseries\noindent{\abstractname}\par\nobreak}\footnotesize}
{\bigskip}
\titlespacing{\section}{0pt}{*3}{*1}
\titlespacing{\subsection}{0pt}{*2}{*0.5}
\titlespacing{\subsubsection}{0pt}{*1.5}{0pt}
\usepackage{authblk}
\usepackage{graphicx}
\usepackage[space]{grffile}
\usepackage{latexsym}
\usepackage{textcomp}
\usepackage{longtable}
\usepackage{tabulary}
\usepackage{booktabs,array,multirow}
\usepackage{amsfonts,amsmath,amssymb}
\providecommand\citet{\cite}
\providecommand\citep{\cite}
\providecommand\citealt{\cite}
% You can conditionalize code for latexml or normal latex using this.
\newif\iflatexml\latexmlfalse
\providecommand{\tightlist}{\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}%
\AtBeginDocument{\DeclareGraphicsExtensions{.pdf,.PDF,.eps,.EPS,.png,.PNG,.tif,.TIF,.jpg,.JPG,.jpeg,.JPEG}}
\usepackage[utf8]{inputenc}
\usepackage[greek,english]{babel}
\begin{document}
\title{Multi-class Adaboost}
\author[1]{Adlane SAHED}%
\affil[1]{Affiliation not available}%
\vspace{-1em}
\date{\today}
\begingroup
\let\center\flushleft
\let\endcenter\endflushleft
\maketitle
\endgroup
\sloppy
\section{INTRODUCTION}\label{introduction}
The analysed article is \textbf{``Multi-class AdaBoost''},
which appeared in a special issue on data mining and machine learning. It was
written by \textbf{Ji Zhu}, a professor of statistics and EECS in the
department of statistics at the University of Michigan, \textbf{Hui Zou},
a professor of statistics at the University of Minnesota, \textbf{Saharon
Rosset}, an associate professor in the department of statistics and
operations research at Tel Aviv University, and \textbf{Trevor Hastie},
a professor of statistics and biomedical data science at Stanford
University.
The article was published in \textbf{``Statistics and Its Interface
(2009)'', Volume 2, pp.~349--360}. This journal, as defined in {[}5{]},
``is a quarterly peer-reviewed open access scientific journal covering
the interface between the field of statistics and other disciplines. The
journal was established in 2008 and is published by International Press.
The editor-in-chief is \textbf{Heping Zhang} (Yale University)''.
In this paper, the authors develop a new algorithm that directly
extends the \textbf{AdaBoost algorithm} from solving two-class
classification problems to solving multi-class classification
problems. Before presenting their new algorithm, the authors first
give some information about similar existing algorithms and their
problems, and then some background on the \textbf{AdaBoost algorithm}.
The paper is organized as follows: in \textbf{section 2}, we will see
the scientific context of the article. Some of the existing works on the
subject are listed in \textbf{section 3}. \textbf{Section 4} will
present the contribution of the article. In \textbf{section 5}, we will
see the experiments and validations that are presented in the
article. The last section (\textbf{section 6}) will summarize all the
work.
\section{Context of the work}\label{context-of-the-work}
As cited in {[}1{]}: ``\textbf{Boosting} is a machine learning ensemble
meta-algorithm for primarily reducing bias, and also variance in
supervised learning. The original algorithms, proposed by \textbf{Robert
Schapire} and \textbf{Yoav Freund}, were not adaptive and could not take
full advantage of the weak learners''.
A few years later, \textbf{Schapire} and \textbf{Freund} developed
\textbf{AdaBoost} {[}2{]}, a new boosting algorithm which can take full
advantage of the weak learners. However, this algorithm was only a good
technique for solving two-class classification problems, where the random
guessing error rate is equal to $1/2$. In the multi-class case the random
guessing error rate is equal to $(K-1)/K$, where $K$ is the number of
classes (for $K=2$, i.e. two classes, the error rate is again $1/2$);
in this case AdaBoost may fail, which is the main problem of this algorithm.
AdaBoost is a good technique for solving classification problems, which is
why the authors directly extend this algorithm and create a new
one named \textbf{SAMME} (Stagewise Additive Modeling using a Multi-class
Exponential loss function).
\section{Positioning}\label{positioning}
Several different algorithms have been developed to extend the boosting
algorithm, but most of them have been restricted to reducing the
multi-class classification problem to multiple two-class problems, e.g.:
Schapire, R. (1997), ``Using output codes to boost multiclass learning
problems'', and Schapire, R. and Singer, Y. (1999), ``Improved boosting
algorithms using confidence-rated predictions''.
Another multi-class boosting algorithm was proposed by Friedman, J.,
Hastie, T., and Tibshirani, R. {[}3{]}, based on the statistical
explanation that the success of AdaBoost can be understood by the fact
that the population minimizer of the exponential loss is one half of the
log-odds. This algorithm looks very different from AdaBoost, hence it is
not clear whether the statistical view of AdaBoost still works in the
multi-class case.
\section{Contributions}\label{contributions}
To reduce the test error produced by directly using the \textbf{AdaBoost}
algorithm for multi-class classification, the authors developed a new
algorithm that keeps the structure of AdaBoost with a small
modification, making it usable in the multi-class case without
reducing the problem to multiple two-class classification problems as
other algorithms did.
\section{Experiments}\label{experiments}
As said in section 4, the authors create a new algorithm (SAMME {[}4{]})
by keeping the structure of AdaBoost with one modification in the
equation of $\alpha$ (which is the equation of the boost weight): the
term $\log(K-1)$ is added at the end of the equation, giving
$\alpha^{(m)} = \log\frac{1-\mathrm{err}^{(m)}}{\mathrm{err}^{(m)}} + \log(K-1)$,
with $K$ the number of classes. Note that for two classes ($K=2$) this
additional term equals $0$, so the new algorithm reduces to AdaBoost.
To demonstrate the efficiency of their multi-class classification
algorithm (SAMME), the authors make a comparison between the AdaBoost and
SAMME algorithms.
\subsection{Example:}\label{example}
In this comparison the authors used, as cited in {[}4{]}, ``a simple
three-class simulation example. Each input $x \in \mathbb{R}^{10}$, and
the ten input variables for all training examples are randomly drawn from
a ten-dimensional standard normal distribution.''
\textbf{Description of Figures 1 \& 2}
\begin{itemize}
\tightlist
\item The left panels represent the test error of each algorithm as a function of the number of iterations.
\item The middle panels represent the classification error.
\item The right panels represent the boost weight of each tree.
\end{itemize}
\subsubsection{For the AdaBoost algorithm:}\label{for-adaboost-algorithm}
\textbf{Figure 1 (upper row)}
This figure shows the bad result of the AdaBoost algorithm when it is used
on a multi-class case. As we can see (left panel), the test error of
AdaBoost starts to increase after some iterations; after about 50
iterations the test error settles around one value, and it remains at this
value until the end of the test.
This result can be understood from the middle and right panels: as we
can see, $\mathrm{err}^{(m)}$ (which is the classification error) starts below
$0.5$; after about 50 iterations (the same number of iterations at which the
test error settles), the classification error becomes equal to
$0.5$. Once $\mathrm{err}^{(m)}$ is equal to $0.5$, we have $\alpha^{(m)} = 0$,
which means that the weights of the training samples do not get updated;
therefore the same weak classifier is fitted again and again, but is not
added to the existing fit, which is why the test error settles around one value.
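This stalling behaviour can be sketched numerically (a minimal illustration of the re-weighting rule, not the authors' code): with the update $w_i \leftarrow w_i \exp(\alpha \cdot \mathbf{1}[y_i \ne h(x_i)])$, an $\alpha$ of $0$ leaves every weight unchanged, while SAMME's extra $\log(K-1)$ keeps the update alive for $K>2$:

```python
import math

def reweight(weights, mistakes, alpha):
    """One boosting round: scale up the weights of misclassified samples."""
    new = [w * math.exp(alpha if m else 0.0) for w, m in zip(weights, mistakes)]
    total = sum(new)
    return [w / total for w in new]  # renormalize to sum to 1

weights = [0.25, 0.25, 0.25, 0.25]
mistakes = [True, False, True, False]  # which samples the weak learner got wrong

err, K = 0.5, 3
alpha_adaboost = math.log((1 - err) / err)        # = 0: weights frozen
alpha_samme = alpha_adaboost + math.log(K - 1)    # = log 2 > 0: weights move

frozen = reweight(weights, mistakes, alpha_adaboost)  # identical to weights
moved = reweight(weights, mistakes, alpha_samme)      # misclassified upweighted
```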
\subsubsection{For the SAMME algorithm:}\label{for-samme-algorithm}
\textbf{Figure 1 (lower row)}
As we can see (left panel), in contrast to AdaBoost, the test error of
SAMME starts to decrease after a few iterations and keeps decreasing
until the end of the test. \textbf{This result means that this algorithm
is a successful boosting algorithm for the multi-class case.}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.98\columnwidth]{figures/Z---Copie1/Z---Copie1}
\caption{{Comparison of AdaBoost and the new algorithm
\textbf{SAMME} on a simple three-class simulation example. The upper row
is for \textbf{AdaBoost} and the lower row is for \textbf{SAMME}.%
}}
\end{center}
\end{figure}
\subsection{Example 2:}\label{example-2}
As described in {[}4{]}: ``The \textbf{SAMME} algorithm expects the weak
learner to deliver a classifier $T(x) \in \{1,\ldots,K\}$. An alternative
is to use real-valued confidence-rated predictions, such as weighted
probability estimates, to update the additive model, rather than the
classifications themselves''.
This observation allowed the authors to create a new variant of the SAMME
algorithm called SAMME.R (R for Real). The main characteristic of this
new variant is its ability to use probability estimates to update the
additive model.
To compare the two variants, I ran the Python code available in
the toolbox; this test uses the same example as figure 1, but compares only
\textbf{SAMME} and \textbf{SAMME.R}.
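The real-valued update that distinguishes SAMME.R can be sketched as follows (my own illustration based on the formula in {[}4{]}, not the toolbox code): from the weak learner's class-probability estimates $p_k(x)$, the per-class contribution to the additive model is $h_k(x) = (K-1)\big(\log p_k(x) - \frac{1}{K}\sum_j \log p_j(x)\big)$.

```python
import math

def samme_r_update(probs):
    """SAMME.R per-class additive update from class-probability estimates:
    h_k(x) = (K-1) * (log p_k(x) - mean_j log p_j(x))."""
    K = len(probs)
    logs = [math.log(p) for p in probs]
    mean_log = sum(logs) / K
    return [(K - 1) * (lk - mean_log) for lk in logs]

# A confident weak learner (p = [0.8, 0.1, 0.1]) pushes hard toward class 0;
# the updates always sum to zero across the K classes.
h = samme_r_update([0.8, 0.1, 0.1])
```

Because the update uses the full probability vector rather than a single hard label, each boosting round carries more information, which is consistent with the faster early decrease of the test error seen in figure 2.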
As we can see in \textbf{figure 2}, after a few iterations the test error
of the \textbf{SAMME.R} algorithm decreases more quickly than the test
error of \textbf{SAMME}, but at the end they reach the same value.
This result shows that SAMME.R is more efficient than the SAMME algorithm
on this example.\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/Z1/Z1}
\caption{{Comparison of the SAMME and SAMME.R algorithms on a simple
three-class simulation example%
}}
\end{center}
\end{figure}
\section{Conclusion}\label{conclusion}
The work of the authors in this article, creating the SAMME algorithm to
solve the multi-class classification problem and comparing it with other
existing algorithms (like AdaBoost and AdaBoost.MH), illustrates
that \textbf{SAMME} is the natural extension of the AdaBoost algorithm
to the multi-class case. In conclusion, the SAMME and SAMME.R
algorithms are very efficient for solving multi-class classification
problems.
\section*{REFERENCES}
\begin{itemize}
\tightlist
\item
{[}1{]} https://en.wikipedia.org/wiki/Boosting\_(machine\_learning).
\item
{[}2{]} https://en.wikipedia.org/wiki/AdaBoost
\item
{[}3{]} Friedman, J., Hastie, T., and Tibshirani, R. (2000). Additive
logistic regression: a statistical view of boosting. Annals of
Statistics 28 337--407. MR1790002
\item
{[}4{]} Zhu, J., Zou, H., Rosset, S., and Hastie, T. (2009). Multi-class
AdaBoost. Statistics and Its Interface 2, 349--360.
https://web.stanford.edu/\textasciitilde{}hastie/Papers/samme.pdf
\item
{[}5{]} https://en.wikipedia.org/wiki/Statistics\_and\_Its\_Interface
\end{itemize}
\selectlanguage{english}
\FloatBarrier
\end{document}