How to Choose K Entities Among N

Hansen, Pierre; Jaumard, Brigitte; Mladenović, Nenad

A new paradigm for cluster analysis, called Sequential Clustering, is outlined. Given a set O of N entities, a best cluster of K entities (where K is a parameter) is found and its entities are removed from O. The procedure is iterated until all entities have been removed or the remaining ones exhibit no further structure. Various dissimilarity-based criteria for finding a best cluster are considered. The complexity of the resulting problems is examined; algorithms are recalled where they have already been proposed and outlined where none appears to be available yet. Experiments with threshold-type criteria on Ruspini's data are reported.
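
The iteration described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the criterion (smallest cluster diameter, i.e. smallest maximum pairwise dissimilarity), the exhaustive search over K-subsets, and the `max_diameter` stopping threshold standing in for "no more structure" are all assumptions chosen for brevity, and all function names are hypothetical.

```python
from itertools import combinations

def diameter(cluster, d):
    """Largest pairwise dissimilarity within a cluster (0 for a singleton)."""
    return max((d[i][j] for i, j in combinations(cluster, 2)), default=0.0)

def best_cluster(remaining, k, d):
    """Exhaustively pick the k-subset of `remaining` with the smallest diameter.

    Assumption: minimum diameter is used as the 'best cluster' criterion.
    Brute force is only practical for small inputs.
    """
    return min(combinations(sorted(remaining), k), key=lambda c: diameter(c, d))

def sequential_clustering(d, k, max_diameter=float("inf")):
    """Repeatedly extract a best cluster of k entities and remove it.

    `d` is a symmetric dissimilarity matrix (list of lists).
    `max_diameter` is a hypothetical stand-in for the 'no more structure'
    stopping rule: extraction stops once the best cluster is too spread out.
    Returns the extracted clusters and the entities left over.
    """
    remaining = set(range(len(d)))
    clusters = []
    while len(remaining) >= k:
        cluster = best_cluster(remaining, k, d)
        if diameter(cluster, d) > max_diameter:
            break
        clusters.append(cluster)
        remaining -= set(cluster)
    return clusters, sorted(remaining)

# Toy data: points on a line, dissimilarity = absolute difference.
points = [0, 1, 2, 10, 11, 12, 50]
d = [[abs(a - b) for b in points] for a in points]
clusters, leftover = sequential_clustering(d, k=3, max_diameter=5)
print(clusters, leftover)
```

On this toy input the two tight groups are extracted in turn and the isolated point is left unassigned, which is the behaviour the sequential scheme is meant to allow.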

Published September 1994, 16 pages

GERAD

G-94-38
