Groupe d’études et de recherche en analyse des décisions

G-2015-140

An iterative algorithm for the solution of very large-scale diameter clustering problems

et

We introduce an iterative algorithm for the solution of the diameter minimization clustering problem (DMCP). Our algorithm is based upon two observations: 1) subsets induce lower bounds on the value of the optimal solution of the original problem; and 2) there exists a subset whose optimal clustering has the same value as that of the original problem. We also describe how to adapt our algorithmic framework for the solution of other clustering problems, namely the minimum sum-of-diameters clustering problem (MSDCP), the split maximization clustering problem (SMCP) and the maximum sum-of-splits clustering problem (MSSCP). A parallel implementation of our algorithm can solve problems containing almost 600,000 entities while consuming only moderate amounts of time and memory. The size of the problems that can be solved using our algorithm is two orders of magnitude larger than the largest problems solved by the current state-of-the-art algorithm.

, 18 pages