Veronica Piccialli – Sapienza Università di Roma, Italy
The minimum sum-of-squares clustering (MSSC), or k-means type clustering, traditionally belongs to the unsupervised learning domain. However, using background knowledge to improve the cluster quality and promote the interpretability of the clustering process has attracted the attention of both the mathematical optimization and the machine learning community. Including background information leads to the so-called semi-supervised or constrained clustering. In this talk, we focus on exact algorithms for different variants of the MSSC problem: vanilla or unconstrained MSSC, MSSC with pairwise constraints (must-link and cannot-link constraints) and strict cardinality constraints. In all cases, the main ingredient is the use of semidefinite programming (SDP) tools for solving large-scale clustering problems to global optimality. The numerical results show that this class of approaches significantly increases the size of instances that can be solved to global optimality and represents state of the art for this class of problems.
This is a joint work with Anna Russo Russo, Antonio Maria Sudoso, and Angelika WIegele.
Campus de l'Université de Montréal
2920, chemin de la Tour
Montréal Québec H3T 1J4