Groupe d’études et de recherche en analyse des décisions

# No-means clustering: A stochastic variant of $$k$$-means

## Vahid Partovi Nia, Martin Lysy and Geoffroy Mouret

Simple, intuitive, and scalable to large problems, $$k$$-means clustering is perhaps the most frequently used technique for unsupervised learning. However, global optimization of the $$k$$-means objective function is challenging, as the clustering algorithm is highly sensitive to its initial value. Exploiting the connection between $$k$$-means and Bayesian clustering, we explore the benefits of stochastic optimization to address this issue. Our "$$\texttt{no means}$$" algorithm has provably superior mixing time to a natural Gibbs sampler with auxiliary cluster centroids, yet it retains the same computational complexity as the original $$k$$-means approach. Comparisons on two benchmark datasets indicate that stochastic search usually produces more homogeneous clusters than the steepest descent algorithm employed by $$k$$-means. Moreover, the objective function of our $$\texttt{no means}$$ method has multiple modes that lie close to one another.
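To make the contrast concrete, the sketch below compares standard $$k$$-means (hard assignments via steepest descent) with a generic stochastic-search variant that samples cluster labels from a Gibbs-style distribution with annealing. This is an illustrative toy, not the authors' $$\texttt{no means}$$ algorithm; the function names, the annealing schedule, and the `beta0` temperature parameter are assumptions for demonstration only.

```python
import numpy as np

def kmeans(X, k, n_iter=50, rng=None):
    # Standard Lloyd's algorithm: each point is hard-assigned to its
    # nearest centroid (steepest descent on the k-means objective).
    rng = np.random.default_rng(rng)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(0)
    return labels, centroids

def stochastic_kmeans(X, k, n_iter=200, beta0=1.0, rng=None):
    # Illustrative stochastic search (NOT the paper's algorithm):
    # labels are sampled proportionally to exp(-beta * distance^2),
    # with beta growing over iterations so the sampler gradually
    # concentrates on low-cost configurations and can escape poor
    # initial assignments.
    rng = np.random.default_rng(rng)
    centroids = X[rng.choice(len(X), k, replace=False)].copy()
    labels = rng.integers(k, size=len(X))
    for t in range(n_iter):
        beta = beta0 * (1 + t)  # simple linear annealing schedule (assumed)
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        logits = -beta * d
        logits -= logits.max(1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(1, keepdims=True)
        labels = np.array([rng.choice(k, p=pi) for pi in p])
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(0)
    return labels, centroids
```

On well-separated data both versions recover the true partition; the stochastic version's random label draws are what allow it, in harder problems, to move between the nearby modes of the objective mentioned above.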

13 pages