Partitioning cluster analysis: Quick start guide - Unsupervised Machine Learning
Clustering is an exploratory data analysis technique used for discovering groups or patterns in a dataset. There are two standard clustering strategies: partitioning methods and hierarchical clustering.
This article describes the most well-known and commonly used partitioning algorithms:
- K-means clustering (MacQueen, 1967), in which each cluster is represented by the center, or mean, of the data points belonging to the cluster.
- K-medoids clustering, or PAM (Partitioning Around Medoids; Kaufman & Rousseeuw, 1990), in which each cluster is represented by one of the objects in the cluster. We'll also describe CLARA (Clustering Large Applications), a variant of PAM used for analyzing large data sets.
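To make the difference between cluster representatives concrete, here is a minimal sketch in R. It is illustrative only and not taken from the article's lab sections: the data set (the built-in `USArrests`), the number of clusters `k = 4`, and the `nstart` and `samples` values are all assumptions. It uses `kmeans()` from base R and `pam()` and `clara()` from the `cluster` package.

```r
library(cluster)  # recommended package shipped with R; provides pam() and clara()

set.seed(123)     # k-means starts and CLARA subsamples are random

# Illustrative data (an assumption): built-in USArrests, standardized
# so that all variables contribute on a comparable scale.
df <- scale(USArrests)

k <- 4            # number of clusters, chosen here only for illustration

# K-means: each cluster is represented by the mean of its points,
# which is usually not an actual observation.
km <- kmeans(df, centers = k, nstart = 25)
km$centers        # k rows of cluster means

# PAM (k-medoids): each cluster is represented by one of its own
# observations, the medoid.
pm <- pam(df, k = k)
pm$medoids        # k rows copied from df

# CLARA: applies PAM to several subsamples and keeps the best set of
# medoids, which makes it usable on large data sets.
cl <- clara(df, k = k, samples = 5)
cl$medoids

# Cluster assignment of every observation under each method
head(cbind(kmeans = km$cluster, pam = pm$clustering, clara = cl$clustering))
```

Note that the rows of `pm$medoids` carry the names of real observations, whereas the rows of `km$centers` are averages. Cluster labels are arbitrary, so the same group may be numbered differently by each method.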
For each of these methods, we provide:
- the basic idea and the key mathematical concepts
- the clustering algorithm and implementation in R software
- R lab sections with many examples for computing clustering methods and visualizing the outputs
http://www.sthda.com/english/wiki/partitioning-cluster-analysis-quick-start-guide-unsupervised-machine-learning