Hierarchical clustering falls into two types:
Agglomerative: This is a "bottom-up" approach: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
Divisive: This is a "top-down" approach: all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
Agglomerative Approach:
1) Treat each data point as its own cluster (n clusters are formed).
2) Take the 2 closest points (clusters) and merge them into one cluster (n-1 clusters remain after this step).
3) Again take the 2 closest clusters and merge them into one, and repeat this step until only one cluster remains.
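The three steps above can be sketched in plain Python. This is only a minimal illustration, not a production implementation: the steps do not say how the distance between two clusters is measured, so the sketch assumes single linkage (the distance between the closest pair of points) and Euclidean distance. The function names and the sample points are made up for the example.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(a, b):
    """Cluster distance as the closest pair of points (an assumed linkage)."""
    return min(dist(p, q) for p in a for q in b)


def agglomerate(points):
    """Merge the two closest clusters until one remains; return the merge history."""
    clusters = [[p] for p in points]  # step 1: n singleton clusters
    history = []                      # (merge distance, clusters remaining)
    while len(clusters) > 1:
        # steps 2 and 3: find the closest pair of clusters and merge it
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        d = single_linkage(clusters[i], clusters[j])
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        history.append((d, len(clusters)))
    return history


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
history = agglomerate(points)
for distance, remaining in history:
    print(f"merged at distance {distance:.2f}, {remaining} clusters remain")
```

The merge distances recorded in history are the heights at which a dendrogram would draw its horizontal joins. Rescanning all pairs on every pass is slow and is done here only for readability.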
Dendrogram:
The vertical axis is the distance between points (clusters), which represents the dissimilarity between them.
The threshold is placed where the jump in dissimilarity between two successive merges is largest; the number of clusters still separate at that threshold is the number of clusters.
In the dendrogram above, a value around 20 can serve as the threshold based on the highest dissimilarity, which gives 3 clusters.
Simple way of selecting clusters from a dendrogram:
Select the longest vertical line that is not crossed by any horizontal line (imagine every horizontal line extended across the full width of the plot), then draw a horizontal line across that vertical line. Then count the number of vertical lines crossed by this horizontal line, which gives the number of clusters (as shown in the figure below).
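The same rule can be expressed numerically. The sketch below is an illustration under stated assumptions: it takes only the list of merge heights (the vertical positions of the horizontal joins), assumes they never decrease as merging proceeds, and ignores the stretch below the first merge. The function name and the heights are hypothetical and are not read from the figure.

```python
def clusters_from_largest_gap(merge_heights):
    """Return (threshold, n_clusters) by cutting inside the largest height gap."""
    heights = sorted(merge_heights)
    n_points = len(heights) + 1  # n points need n-1 merges to form one cluster
    # gaps[k] is the vertical stretch between merge k and merge k+1
    gaps = [heights[k + 1] - heights[k] for k in range(len(heights) - 1)]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    threshold = (heights[k] + heights[k + 1]) / 2  # cut in the middle of the gap
    # merges 0..k lie below the cut, so k+1 merges have already happened
    return threshold, n_points - (k + 1)


# Hypothetical merge heights for 8 points.
heights = [1.0, 2.0, 3.0, 4.0, 8.0, 28.0, 30.0]
threshold, n_clusters = clusters_from_largest_gap(heights)
print(threshold, n_clusters)
```

With these made-up heights the largest gap lies between 8 and 28, so the cut falls at 18 and crosses three vertical lines, giving three clusters.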