Given a set of entities associated with points in Euclidean space, minimum sum-of-squares clustering (MSSC) consist in partitioning this set into clusters such that the sum of squared distances from each point to the centroid of its cluster is minimized. A column generation algorithm for MSSC was given by du Merle et al. (2000). The bottleneck of that algorithm is the resolution of the auxiliary problem of finding a column with negative reduced cost. We propose a new way to solve this auxiliary problem based on geometric arguments. This greatly improves the efficiency of the whole algorithm and leads to exact solution of instances with over 2300 entities, i.e., more than 10 times as much as previously done.
Published May 2009 , 26 pages