Towards a Visual Perception-Based Analysis of Clustering Quality Metrics

Two scatter plots shown side by side, each with dots in three different colors. Does one scatter plot do a better job of representing the clusters than the other?

Abstract

Clustering is an essential technique across various domains, such as data science, machine learning, and explainable artificial intelligence. Information visualization and visual analytics techniques have proven effective in supporting human involvement in the visual exploration of clustered data, helping users understand and refine cluster assignments. To support this involvement, several perceptual studies and visual quality metrics have already been proposed. However, the visual perception of clustering quality metrics, also known as Cluster Validity Indexes (CVIs), remains largely unexplored. This paper presents the first attempt at a deep and exhaustive evaluation of the perceptual aspects of clustering quality metrics, focusing on the Davies-Bouldin Index, Dunn Index, Calinski-Harabasz Index, and Silhouette Score. Our research is centered around two main objectives: a) assessing the human perception of common CVIs in 2D scatterplots and b) exploring the potential of Large Multimodal Models, in particular GPT-4o, to emulate the assessed human perception. To this end, we conducted two systematic data studies and a user study covering a broad collection of datasets. By discussing the obtained results and highlighting limitations and areas for further exploration, this paper aims to lay a foundation for future research activities.
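For readers unfamiliar with the four CVIs named above, the following is a minimal pure-Python sketch of their textbook definitions on 2D points. It is not taken from the paper or its code: the function names and the toy dataset are illustrative assumptions, and the Dunn Index is shown in only one of its several common variants (minimum inter-cluster point distance over maximum cluster diameter).

```python
import math
from collections import defaultdict

def group(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = defaultdict(list)
    for p, l in zip(points, labels):
        clusters[l].append(p)
    return clusters

def centroid(pts):
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points; higher is better."""
    clusters = group(points, labels)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != l)
        total += (b - a) / max(a, b)
    return total / len(points)

def davies_bouldin_index(points, labels):
    """Mean worst-case ratio of scatter to centroid distance; lower is better."""
    clusters = list(group(points, labels).values())
    cents = [centroid(c) for c in clusters]
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, cents)]
    k = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k

def dunn_index(points, labels):
    """Min inter-cluster distance over max cluster diameter; higher is better."""
    clusters = list(group(points, labels).values())
    diameter = max(math.dist(p, q) for c in clusters for p in c for q in c)
    separation = min(math.dist(p, q)
                     for i, ci in enumerate(clusters)
                     for cj in clusters[i + 1:]
                     for p in ci for q in cj)
    return separation / diameter

def calinski_harabasz_index(points, labels):
    """Between-cluster over within-cluster dispersion; higher is better."""
    clusters = list(group(points, labels).values())
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = sum(len(c) * math.dist(centroid(c), overall) ** 2 for c in clusters)
    within = sum(math.dist(p, centroid(c)) ** 2 for c in clusters for p in c)
    return (between / (k - 1)) / (within / (n - k))

# Toy data (assumed for illustration): three tight blobs of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0),
          (0, 10), (0, 11), (1, 10)]
good = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # labels follow the blobs
bad = [0, 1, 2, 0, 1, 2, 0, 1, 2]   # labels cut across the blobs

for name, labels in (("good", good), ("bad", bad)):
    print(name,
          silhouette_score(points, labels),
          davies_bouldin_index(points, labels),
          dunn_index(points, labels),
          calinski_harabasz_index(points, labels))
```

On this toy data, the labeling that follows the blobs scores better on all four metrics than the labeling that cuts across them. Whether such differences match what a viewer perceives in a scatterplot is the question the paper investigates. The sketch assumes at least two clusters and no duplicate points.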

Materials

DOI | Code | BibTeX

Authors

Graziano Blasilli

Daniel Kerrigan

Enrico Bertini

Giuseppe Santucci

Citation

Towards a Visual Perception-Based Analysis of Clustering Quality Metrics

Graziano Blasilli, Daniel Kerrigan, Enrico Bertini, and Giuseppe Santucci. IEEE Visualization in Data Science (VDS). 2024. DOI: 10.1109/VDS63897.2024.00007


Khoury Vis Lab — Northeastern University