The minimum sum-of-stars (or p-median, or p-medianoïd) clustering problem consists in partitioning a given set of entities into P clusters such that the sum for all clusters of the minimum sum of dissimilarities between an entity of that cluster (the cluster center) and all others is as small as possible. This problem is studied under the additional constraint that the sum of weights of the entities in each cluster is bounded by a given value. An algorithm is proposed to solve it, which combines the column generation technique of linear programming with branch-and-bound. A theoretical comparison is made between the bounds obtained by column generation and by Lagrangean relaxation. The auxiliary problem solved to determine the entering column for the linear program corresponding to each node in the branch-and-bound search tree is a new combinatorial one. It is called knapsack problem with incompatibilities. A specific algorithm is provided for its solution. Computational experience with randomly generated data and with problems from the literature is reported.
Published February 1993 , 36 pages
This cahier was revised in February 1994