Groupe d’études et de recherche en analyse des décisions

G-2005-61

Variable Neighborhood Search for Least Squares Clusterwise Regression

Clusterwise regression is a technique for clustering data. Instead of using the classical homogeneity or separation criteria, clusterwise regression is based on the accuracy of a linear regression model associated with each cluster. This approach has many advantages, especially for data mining. However, the underlying mathematical model is difficult to solve because of its large number of local optima. In this paper, we propose the use of the Variable Neighborhood Search (VNS) metaheuristic to improve the quality of the solution. Two perturbation strategies are described; one of them yields a substantial improvement over multistart (the error is reduced by a factor of more than 1.5 on average for the 10-cluster problem).
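To make the idea concrete, the following is a minimal sketch of least squares clusterwise regression wrapped in a basic VNS loop, assuming NumPy is available. The function names (`fit_clusters`, `local_search`, `vns`) and the parameters `k_max` and `n_rounds` are hypothetical, chosen for illustration only. The perturbation used here (randomly reassigning a few points) is a generic choice and is not necessarily either of the two strategies described in the paper.

```python
import numpy as np

def fit_clusters(X, y, labels, k):
    """Fit one least-squares model per cluster; return coefficients and total squared error."""
    coefs = np.zeros((k, X.shape[1]))
    for c in range(k):
        mask = labels == c
        if mask.any():  # empty clusters keep zero coefficients
            coefs[c], *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
    # Residual of each point with respect to its own cluster's model.
    resid = y - np.einsum("ij,ij->i", X, coefs[labels])
    return coefs, float(resid @ resid)

def local_search(X, y, labels, k, max_iter=100):
    """Alternate between reassigning points to their best model and refitting."""
    coefs, err = fit_clusters(X, y, labels, k)
    for _ in range(max_iter):
        sq = (y[:, None] - X @ coefs.T) ** 2   # squared residual of each point under each model
        new_labels = sq.argmin(axis=1)
        new_coefs, new_err = fit_clusters(X, y, new_labels, k)
        if new_err >= err - 1e-12:             # no improvement: local optimum reached
            break
        labels, coefs, err = new_labels, new_coefs, new_err
    return labels, err

def vns(X, y, k, k_max=5, n_rounds=50, seed=0):
    """Basic VNS: perturb the incumbent with growing strength, then run local search."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(y))      # random initial partition
    best_labels, best_err = local_search(X, y, labels, k)
    for _ in range(n_rounds):
        size = 1
        while size <= k_max:
            trial = best_labels.copy()
            # Assumed perturbation: move `size` random points to random clusters.
            idx = rng.choice(len(y), size=min(size, len(y)), replace=False)
            trial[idx] = rng.integers(k, size=len(idx))
            trial, err = local_search(X, y, trial, k)
            if err < best_err - 1e-12:
                best_labels, best_err = trial, err
                size = 1                       # improvement: restart from the smallest neighborhood
            else:
                size += 1                      # no improvement: try a stronger perturbation
    return best_labels, best_err
```

Here `X` is expected to be a two-dimensional float array that already includes a column of ones if an intercept is wanted, and `y` a one-dimensional float array. Because the incumbent is replaced only on strict improvement, the error returned by `vns` is never worse than that of the initial local search; how much better it gets depends on the data and on the perturbation chosen.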

18 pages

This report was revised in December 2007.