3.1.1 Algorithmic steps for Agglomerative Hierarchical
clustering
Let \(X=\{x_{1},x_{2},x_{3},\ldots\ldots.,x_{n}\}\) be the set of
data points.
1) Begin with the disjoint clustering having level \(L(0)\ =\ 0\ \)and
sequence number \(m\ =\ 0.\)
2) Find the least distance pair of clusters in the current clustering,
say pair \((r),\ (s),\) according to\(d[(r),(s)]\ =\ min\ d[(i),(j)]\) where the
minimum is over all pairs of clusters in the current clustering.
3) Increment the sequence number\(:\ m\ =\ m\ +1.\)Merge clusters
(r) and (s) into a single cluster to form the next clustering m. Set
the level of this clustering to \(L(m)\ =\ d[(r),(s)].\)
4) Update the distance matrix, D, by deleting the rows and columns
corresponding to clusters (r) and (s) and adding a row and
column corresponding to the newly formed cluster. The distance between
the new clusters, denoted \((r,s)\ \)and \(old\ cluster(k)\) is defined
in this way:\(d[(k),\ (r,s)]\ =\ min\ (d[(k),(r)],\ d[(k),(s)]).\)
5) If all the data points are in one cluster then stop, else repeat from
step 2).
If points (objects) i and j are agglomerated
into cluster\(i\ \cup\ j\), then we must simply specify the new
dissimilarity between the cluster and all other points (objects or
clusters). The formula is:
\begin{equation}
d\left(i\cup j,k\right)=\alpha_{i}d\left(i,k\right)+\alpha_{j}d\left(j,k\right)+\beta d\left(i,j\right)+\gamma|d\left(i,k\right)-d\left(j,k\right)|\nonumber \\
\end{equation}Where \(\alpha_{i},\alpha_{j},\beta\ and\ \gamma\) define the
agglomerative criterion. In the case of the complete link method, using\(\alpha_{i}=\alpha_{j}=\gamma=\frac{1}{2},\ \ and\ \beta=0\ \)given us
\(d\left(i\cup j,k\right)=\frac{1}{2}d\left(i,k\right)+\frac{1}{2}d\left(j,k\right)+\frac{1}{2}\left|d\left(i,k\right)-d\left(j,k\right)\right|\text{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ }\)(1)