
Session TC2 - Recherche à Voisinage Variable / Variable Neighborhood Search

Day Tuesday, May 8, 2007
Room Van Houtte
Chair Nenad Mladenovic

Presentations

03h30 PM - 03h55 PM
Optimization by Combining EM and VNS Algorithms for Fitting Finite Gaussian Mixture Model Clustering
  Bessadok Adel, Faculté des Sciences Economiques et de Gestion de Sfax, Méthodes quantitatives, 4 rue de la Jeunesse, Gabès, Gabès, Tunisia, 6001
Pierre Hansen, GERAD et HEC Montréal, Méthodes quantitatives de gestion, 3000, chemin de la Côte-Sainte-Catherine, Montréal, Québec, Canada, H3T 2A7

This paper presents an improvement of the expectation-maximization (EM) algorithm used for fitting finite Gaussian mixture models in clustering, obtained by combining it with the variable neighbourhood search (VNS) metaheuristic. As a data mining method, this approach is potentially useful in many fields where one seeks a finite set of categories, or clusters, of similar objects to describe a given dataset. It treats the combinatorial and global optimization problem underlying maximum likelihood estimation (MLE) via EM using VNS. In clustering, reaching a suitable and useful grouping requires estimating the finite mixture parameters by MLE. In practice, however, the likelihood function is not quadratic, which makes the MLE problem nonlinear; moreover, since it has multiple local maxima, it must be maximized by iterative numerical methods. The EM algorithm is now a popular tool for iterative MLE in a variety of problems involving missing data or incomplete information. As an iterative algorithm, EM depends on its initialization and may not yield the best parameter estimates. The main idea here is to reformulate EM so that it is not wholly dependent on its starting value: after running EM from a poor initialization, VNS - a systematic change of neighbourhood within a local search - provides a perturbation that leads to a better initial parameter. We generate a sample from a finite Gaussian mixture model (GMM) and estimate its parameters from two initializations, which lead to two different results. Comparing with the true parameters, we take the poor choice of starting value and explore successive neighbourhoods until the best local maximum is reached. The results suggest that the resulting EMVNS method yields parameter estimates for GMM clustering that are largely free of the influence of the starting value.
Keywords: EM algorithm; metaheuristics; variable neighbourhood search; maximum likelihood estimation; finite Gaussian mixture model; clustering.
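The EM-plus-shaking idea described in the abstract can be sketched roughly as follows for a two-component one-dimensional mixture. This is a minimal illustrative sketch, not the authors' EMVNS implementation: the function names, the choice of perturbing only the means, and the neighbourhood radii (multiples of the sample standard deviation) are all assumptions.

```python
# Illustrative sketch: EM for a two-component 1-D Gaussian mixture, wrapped
# in a VNS-style loop that shakes the incumbent means in growing
# neighbourhoods. Names and neighbourhood choices are illustrative only.
import numpy as np

def em_gmm_1d(x, mu, steps=50):
    """Run EM from initial means `mu`; weights and variances start uniform."""
    w = np.array([0.5, 0.5])
    var = np.array([np.var(x), np.var(x)])
    mu = np.asarray(mu, dtype=float).copy()
    for _ in range(steps):
        # E-step: posterior responsibilities under the current parameters.
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        n = r.sum(axis=0)
        w = n / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n
        var = np.maximum(var, 1e-6)          # floor to avoid degenerate components
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return mu, np.log(dens.sum(axis=1)).sum()  # final means and log-likelihood

def em_vns(x, mu0, k_max=3, rng=None):
    """VNS loop: perturb the incumbent means in neighbourhoods of growing size."""
    rng = np.random.default_rng(rng)
    best_mu, best_ll = em_gmm_1d(x, mu0)
    k = 1
    while k <= k_max:
        trial = best_mu + rng.normal(scale=k * x.std(), size=2)  # shaking step
        mu, ll = em_gmm_1d(x, trial)
        if ll > best_ll + 1e-9:
            best_mu, best_ll, k = mu, ll, 1   # improvement: restart the ladder
        else:
            k += 1
    return best_mu, best_ll
```

By construction, the VNS wrapper can only keep or improve the log-likelihood reached by plain EM from the same (possibly poor) starting value.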


03h55 PM - 04h20 PM
Primal-Dual Variable Neighborhood Search for Data Clustering
  Jack Brimberg, Kingston, Canada
Dragan Urosevic, Mathematical Institute SANU, Belgrade, Serbia
Pierre Hansen, GERAD et HEC Montréal, Méthodes quantitatives de gestion, 3000, chemin de la Côte-Sainte-Catherine, Montréal, Québec, Canada, H3T 2A7
Nenad Mladenovic, Brunel University, School of Mathematics and GERAD, Uxbridge, Middlesex, West London, United Kingdom, UB8 3PH

Data clustering methods have been developed extensively in the data mining literature for detecting useful patterns in large datasets, in the form of densely populated regions of a multi-dimensional Euclidean space. One of the challenges in dealing with these large databases is to obtain quality solutions within reasonable CPU time and memory requirements. Earlier partitioning methods such as PAM (1990) become inefficient on larger sets because of repetitive and lengthy neighborhood searches in the solution space. To get around this problem, CLARA (1990) first draws a random sample, and then solves a clustering problem (the $p$-median) on the much smaller representative sample using PAM. CLARANS (2002) instead randomly selects neighborhood points up to a maximum number rather than searching the entire neighborhood. BIRCH (1996) uses a hierarchical approach to cluster the data in a single pass, with improvements made at the user's option by further passes through the (classified) data.
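The CLARA-style sampling idea mentioned above can be sketched as follows: solve a small $p$-median (medoid) problem on a random sample with a PAM-like swap heuristic, then evaluate the chosen medoids on the full dataset. This is a toy one-dimensional sketch under assumed names, not the actual CLARA or the primal-dual VNS of the talk.

```python
# Illustrative sketch of sampling-based clustering: a PAM-style swap
# (interchange) heuristic solves the p-median problem on a random sample,
# and the resulting medoids are then applied to the whole dataset.
import random

def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid (1-D here)."""
    return sum(min(abs(p - m) for m in medoids) for p in points)

def pam_swap(sample, p):
    """First-improvement swap heuristic for p-median on the sample."""
    medoids = sample[:p]
    best = total_cost(sample, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(p):
            for cand in sample:
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                c = total_cost(sample, trial)
                if c < best:                 # accept any improving swap
                    medoids, best, improved = trial, c, True
    return medoids

def clara_like(points, p, sample_size, rng):
    """Solve on a random sample, then evaluate on the full dataset."""
    sample = rng.sample(points, sample_size)
    medoids = pam_swap(sample, p)
    return medoids, total_cost(points, medoids)
```

The point of the sampling step is that the expensive swap search runs only over the sample, while the final assignment pass over the full data is linear in its size.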


04h20 PM - 04h45 PM
Continuous Variable Neighborhood Search
  Milan Drazic, Faculty of Mathematics, University of Belgrade, Belgrade, Serbia
Vera Kovacevic-Vujcic, Faculty of Organizational Sciences, University of Belgrade, Belgrade, Serbia
Mirjana Cangalovic, Faculty of Organizational Sciences, University of Belgrade, Belgrade, Serbia
Nenad Mladenovic, Brunel University, School of Mathematics and GERAD, Uxbridge, Middlesex, West London, United Kingdom, UB8 3PH

We suggest a heuristic for solving unconstrained continuous optimization problems, based on a generalized version of the variable neighborhood search metaheuristic. Different neighborhoods and distributions, induced by different metrics, are ranked and used to generate random points in the shaking step. We also propose a VNS for solving constrained optimization problems, in which the constraints are handled by exterior-point penalty functions within an algorithm that combines sequential and exact penalty transformations. An extensive computational analysis, including a comparison with a genetic algorithm and some other approaches on standard test functions, is given.
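A continuous VNS of the kind described can be sketched as follows: shake the incumbent inside balls of increasing radius drawn under two different metrics (uniform in an L-infinity box, Gaussian for the Euclidean case), then apply a local descent. The neighbourhood structures, radii, and the simple finite-difference gradient descent used as the local solver here are all assumptions for illustration; the authors' method may differ.

```python
# Illustrative sketch of continuous VNS for unconstrained minimization:
# shaking under two metrics with growing radii, plus a basic gradient
# descent with backtracking as the local search.
import numpy as np

def grad(f, x, h=1e-6):
    """Central finite-difference gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def local_search(f, x, iters=200):
    """Gradient descent; halve the step until the objective decreases."""
    x = x.copy()
    for _ in range(iters):
        g = grad(f, x)
        t = 1.0
        while t > 1e-12 and f(x - t * g) >= f(x):
            t *= 0.5
        if t <= 1e-12:
            break                       # no improving step found: stop
        x = x - t * g
    return x

def vns(f, x0, radii=(0.5, 1.0, 2.0), max_shakes=60, rng=None):
    """VNS: shake in the k-th neighbourhood, descend, move on improvement."""
    rng = np.random.default_rng(rng)
    best = local_search(f, np.asarray(x0, dtype=float))
    k, shakes = 0, 0
    while shakes < max_shakes:
        r = radii[k]
        if k % 2 == 0:                  # L-infinity ball: uniform shake
            y = best + rng.uniform(-r, r, size=best.shape)
        else:                           # Euclidean metric: Gaussian shake
            y = best + rng.normal(scale=r, size=best.shape)
        y = local_search(f, y)
        shakes += 1
        if f(y) < f(best) - 1e-12:
            best, k = y, 0              # improvement: back to first neighbourhood
        else:
            k = (k + 1) % len(radii)    # cycle through the neighbourhood ladder
    return best
```

Ranking the neighbourhoods by radius lets the search escape progressively larger basins of attraction while always returning to the smallest neighbourhood after an improvement.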

