On the \(k\)-medoids model for semi-supervised clustering

Randel, Rodrigo Alves; Aloise, Daniel; Mladenović, Nenad; Hansen, Pierre

Clustering is a powerful automated technique for data analysis. It aims to divide a given set of data points into clusters that are homogeneous and/or well separated. The main challenge in data clustering is to find a criterion that expresses a good separation of the data into homogeneous groups, so that the resulting clusters provide useful information to the user. To address this issue, the user can provide a priori information about the data set. Clustering under this assumption is often called semi-supervised clustering. This work explores semi-supervised clustering through the \(k\)-medoids model. Results obtained by a Variable Neighborhood Search (VNS) heuristic show that the \(k\)-medoids model achieves classification accuracy comparable to that of the typical \(k\)-means approach. Furthermore, the model demonstrates high flexibility and performance by combining kernel projections with clustering constraints.

Published in March 2018, 14 pages

Research axis

Axis 1: Data valuation for decision making

Research application

Smart logistics (schedule design, supply chains, logistics, manufacturing systems)

Document

G1823.pdf (460 KB)

GERAD

G-2018-23
