4.3 Clustering
Tableau Video
Begin by following along with this video to learn how clustering works in Tableau: Clustering (2:25 min)
Download the Tableau workbook below to follow along with the Clustering video:
Clustering.twbx
As you can see, Tableau makes clustering incredibly easy. But knowing how to run a cluster analysis is not enough; you also need to understand how clustering works.
The objective of cluster analysis is to assign observations to groups (“clusters” or “segments”) so that observations within each group are similar to one another with respect to variables or attributes of interest, and the groups themselves stand apart from one another. In other words, the objective is to divide the observations into homogeneous and distinct groups.
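To make "homogeneous and distinct" concrete, here is a minimal sketch in Python that compares the average distance between observations in the same cluster with the average distance between observations in different clusters. The customer values, the cluster labels "A" and "B", and the function names are all hypothetical and chosen only for illustration; Euclidean distance is assumed as the similarity measure.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

# Hypothetical customers as (age, annual income in $1000s), already grouped.
clusters = {
    "A": [(25, 40), (27, 42), (24, 38)],
    "B": [(55, 90), (58, 95), (53, 88)],
}

def mean_within_distance(groups):
    """Average distance between pairs of observations in the same cluster."""
    pairs = [
        dist(p, q)
        for members in groups.values()
        for p, q in combinations(members, 2)
    ]
    return sum(pairs) / len(pairs)

def mean_between_distance(groups):
    """Average distance between pairs of observations in different clusters."""
    pairs = [
        dist(p, q)
        for g1, g2 in combinations(groups.values(), 2)
        for p in g1
        for q in g2
    ]
    return sum(pairs) / len(pairs)

within = mean_within_distance(clusters)
between = mean_between_distance(clusters)
print(f"within: {within:.2f}, between: {between:.2f}")
```

A good grouping gives a small within-cluster distance relative to the between-cluster distance. In practice the variables are usually put on a common scale first, so that a variable measured in large units does not dominate the distance.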
Clustering Explained
Cluster analysis is a form of "unsupervised machine learning," which means that we do not distinguish between cause and effect variables (a.k.a. "independent" and "dependent" variables), such as cause="income, age, commute distance" and effect="purchased bike". We do not yet know whether a customer will purchase a bike. Instead, we want to sort customers into distinct groups with potentially different interests in bikes so that we can market to each group differently and inform our product offerings.
In addition to the "market segmentation" example above, other examples of clustering might include grouping customers based on credit risk, employees based on performance, insurance policy holders based on claim history, houses based on type and value, and much more.
Cluster analysis includes the following five steps:
- Select a distance measure
- Select a clustering algorithm
- Define the distance between two clusters
- Determine the number of clusters
- Validate the analysis
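The five steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of what Tableau runs internally: it assumes Euclidean distance (step 1), a simple agglomerative (bottom-up) algorithm (step 2), single linkage as the distance between two clusters (step 3), a number of clusters chosen by hand (step 4), and a crude centroid check as validation (step 5). The data and all function names are hypothetical.

```python
from math import dist  # Step 1: Euclidean distance, Python 3.8+

# Hypothetical observations as (age, annual income in $1000s).
points = [(25, 40), (27, 42), (24, 38), (55, 90), (58, 95), (53, 88)]

def single_linkage(c1, c2, data):
    """Step 3: distance between two clusters = distance of their closest members."""
    return min(dist(data[i], data[j]) for i in c1 for j in c2)

def agglomerate(data, k, linkage=single_linkage):
    """Step 2: start with one cluster per point and merge the two closest
    clusters repeatedly until only k clusters remain (step 4)."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]], data),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because b > a
    return clusters

def centroid(cluster, data):
    """Mean of each variable over the members of one cluster."""
    n = len(cluster)
    return tuple(sum(data[i][d] for i in cluster) / n for d in range(len(data[0])))

result = agglomerate(points, k=2)

# Step 5 (a crude check): each point should be nearest to its own cluster's centroid.
centroids = [centroid(c, points) for c in result]
well_separated = all(
    dist(points[i], centroids[ci]) == min(dist(points[i], c) for c in centroids)
    for ci, cluster in enumerate(result)
    for i in cluster
)
print(result, well_separated)
```

Swapping `single_linkage` for a function that returns the farthest pair or the average pairwise distance changes step 3 without touching the rest, which is why the distance between clusters is listed as its own decision.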
Review the slides below to see how clustering algorithms work and how distance measures are calculated:
Now that you understand how the clustering algorithm works and how distance is measured, download the dataset below and follow along with the video to see a more realistic and practical application of clustering in Tableau: