We present a new column generation algorithm for determining a classifier in the two-class LAD (Logical Analysis of Data) model. Unlike existing algorithms, which seek a classifier that simultaneously maximizes the margin of correctly classified observations and minimizes the violations of incorrectly classified observations, we fix the margin to a difficult-to-achieve target and minimize a convex piecewise linear function of the violations of incorrectly classified observations. Moreover, part of the training set, called the control set, is reserved to select, among all feasible classifiers found by the algorithm, the one with the highest performance on that set. Computational results demonstrate the effectiveness of this approach.
Published October 2009, 28 pages
This cahier was revised in January 2011