G-2026-09
A variable neighborhood search heuristic for semi-supervised minimum sum-of-squares clustering
Semi-supervised clustering is a learning approach that relies primarily on unlabeled data but incorporates some prior information to improve the clustering results. Among clustering objectives, minimum sum-of-squares clustering (MSSC) is widely used; it partitions the data so as to minimize the sum of squared distances from each point to the centroid of its cluster. We propose a Variable Neighborhood Search (VNS) heuristic for semi-supervised MSSC in which the prior information takes the form of pairwise must-link and cannot-link constraints. Our approach reformulates the optimization problem: must-link constraints are handled by constructing super-points, which satisfy these constraints implicitly, while cannot-link constraints are incorporated as penalties in the objective function. Computational experiments indicate that, in the majority of tested cases, the proposed VNS heuristic finds better solutions than the state-of-the-art heuristic from the literature within the same computational time.
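The reformulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_super_points`, `penalized_cost`), the choice of a fixed penalty per violated cannot-link pair, and the use of union-find to merge must-linked points are all assumptions made for illustration. Each super-point is represented by the mean of its members and a weight equal to its size; the scatter of points inside a super-point is constant with respect to the assignment and is therefore left out of the cost.

```python
from collections import defaultdict


def build_super_points(points, must_link):
    """Merge points connected by must-link pairs into super-points.

    Returns (centroids, weights, super_of), where super_of[i] is the index
    of the super-point that contains original point i.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in must_link:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = defaultdict(list)
    for i in range(len(points)):
        groups[find(i)].append(i)

    dim = len(points[0])
    centroids, weights, super_of = [], [], [0] * len(points)
    for s, members in enumerate(groups.values()):
        centroids.append(
            [sum(points[i][d] for i in members) / len(members) for d in range(dim)]
        )
        weights.append(len(members))
        for i in members:
            super_of[i] = s
    return centroids, weights, super_of


def penalized_cost(centroids, weights, labels, centers, cannot_link, penalty):
    """Weighted sum of squares over super-points plus a cannot-link penalty.

    cannot_link holds pairs of super-point indices; `penalty` (hypothetical
    parameter) is charged once per pair placed in the same cluster.
    """
    sse = sum(
        w * sum((x - c) ** 2 for x, c in zip(m, centers[lab]))
        for m, w, lab in zip(centroids, weights, labels)
    )
    violations = sum(1 for s, t in cannot_link if labels[s] == labels[t])
    return sse + penalty * violations


# Toy data: points 0 and 1 must share a cluster; points 1 and 2 must not.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centroids, weights, super_of = build_super_points(points, must_link=[(0, 1)])
cannot_link = [(super_of[1], super_of[2])]  # translate to super-point indices
centers = [[0, 1], [10, 1]]
good = penalized_cost(centroids, weights, [0, 1, 1], centers, cannot_link, 5.0)
bad = penalized_cost(centroids, weights, [0, 0, 1], centers, cannot_link, 5.0)
```

In this toy instance the assignment `[0, 1, 1]` respects the cannot-link pair, while `[0, 0, 1]` violates it and also places a super-point far from its center, so `bad` is larger than `good`. A search heuristic such as VNS would explore assignments of super-points to clusters while evaluating a cost of this kind.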
Published March 2026, 11 pages
Document
G2609.pdf (400 KB)