K-means algorithm step by step:
1) Choose the number of clusters, K.
2) Select K random points as the initial centroids.
3) Assign each point to its closest centroid (based on distance), then recompute each centroid as the mean of the points assigned to it.
4) Repeat step 3 until the centroids no longer change.
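The four steps above can be sketched in plain Python. This is only a minimal illustration, not the implementation from the git file: the function name kmeans, the max_iters and seed parameters, the toy points, and the choice to keep the old centroid when a cluster ends up empty are all assumptions made for this example.

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Plain K-means on a list of equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 2: pick k distinct data points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iters):
        # Step 3a: assign every point to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 3b: recompute each centroid as the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Assumption: keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        # Step 4: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

On data this well separated, the first three points should end up in one cluster and the last three in the other, whichever two points are picked at the start.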
K-means random initialization Trap:
What would happen if we did a bad random initialization?
If we select different initial centroids, we may get different clusters.
How to tackle this?
Use the K-means++ algorithm, which picks initial centroids that are spread far apart from each other.
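A rough sketch of the K-means++ seeding idea, assuming the usual formulation: the first centroid is a random data point, and each following centroid is drawn with probability proportional to its squared distance from the nearest centroid already chosen. The function name kmeans_pp_init, the seed parameter, and the toy points are made up for this example.

```python
import math
import random


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centroids from points using K-means++ style seeding."""
    rng = random.Random(seed)
    # First centroid: a data point chosen uniformly at random.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Squared distance from every point to its nearest chosen centroid.
        d2 = [min(math.dist(p, c) ** 2 for c in centroids) for p in points]
        # Far-away points get a larger weight; already chosen points have
        # weight 0, so they are not picked again.
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids


# Hypothetical toy data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
init_centroids = kmeans_pp_init(points, k=2)
print(init_centroids)
```

The returned points would then replace the purely random centroids in step 2; the rest of K-means stays the same.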
Within Cluster Sum of Squares(WCSS):
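WCSS = sum over all clusters of the sum over all points p in that cluster of distance(p, c)^2
where p is a point and c is the centroid of the cluster that p belongs to.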
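A lower WCSS means tighter clusters, but WCSS keeps decreasing as K grows and reaches 0 when K = n (total points), where every point is its own cluster. So minimizing WCSS alone would always lead to n clusters.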
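To make this concrete, here is a small sketch that computes WCSS for a fixed assignment. The helper name wcss, the four toy points, and the hand-picked centroids and labels are assumptions for illustration only.

```python
import math


def wcss(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(
        math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels)
    )


# Hypothetical toy data: four points forming two obvious pairs.
points = [(1.0, 1.0), (1.0, 3.0), (7.0, 8.0), (9.0, 8.0)]

# K = 2: each centroid is the mean of its pair, every point is 1 unit away.
wcss_two = wcss(points, [(1.0, 2.0), (8.0, 8.0)], [0, 0, 1, 1])
print(wcss_two)  # 4.0

# K = n: every point is its own centroid, so every distance is 0.
wcss_n = wcss(points, points, [0, 1, 2, 3])
print(wcss_n)  # 0.0
```

The second call shows why the lowest possible WCSS is not a useful target by itself: it is always reached with one cluster per point.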
Then how do we choose the number of clusters?
Elbow method:
In the WCSS graph, 4 can be chosen as the number of clusters: WCSS drops sharply up to K = 4 and decreases only slowly after that, which forms the elbow.
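The elbow method can be tried out with a short script that runs K-means for several values of K and prints the WCSS for each. This is a sketch under assumptions: the helper kmeans_wcss, the made-up data with four tight groups, the range of K values, and the 10 random restarts per K are all choices for this example, not taken from the git file.

```python
import math
import random


def kmeans_wcss(points, k, rng, max_iters=100):
    """Run plain K-means once from a random start and return the final WCSS."""
    centroids = rng.sample(points, k)
    for _ in range(max_iters):
        # Assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = [
            tuple(sum(c) / len(m) for c in zip(*m)) if m else centroids[j]
            for j, m in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(math.dist(p, c) for c in centroids) ** 2 for p in points)


# Hypothetical data: four tight groups of three points each.
blob_centers = [(0, 0), (0, 10), (10, 0), (10, 10)]
offsets = [(0, 0), (1, 0), (0, 1)]
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx, dy in offsets]

rng = random.Random(0)
wcss_by_k = {}
for k in range(1, 7):
    # Keep the best of a few restarts to soften the random initialization trap.
    wcss_by_k[k] = min(kmeans_wcss(points, k, rng) for _ in range(10))
    print(k, round(wcss_by_k[k], 2))
```

Plotting K against the printed WCSS values gives the elbow graph. For data shaped like this, the curve would be expected to fall steeply up to the number of real groups and flatten afterwards.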
Python implementation:
See the Python implementation file in git.
