Genetic algorithms perform feature selection by evolving populations under a fitness function. When the algorithm is run with parallel universes, an importance score can be produced for each feature, and the features to retain are then chosen subjectively from a plot of those scores. The authors derive the distribution of the importance scores under the null hypothesis that none of the features has predictive power, and from it they determine an objective threshold for feature selection. They discuss the parameter settings for which the theoretical results hold, illustrate the method on real data, and run simulation studies to describe its performance.
Published September 2018, 18 pages
G1870.pdf (500 KB)
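
To make the idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. Several points are assumptions made only for illustration: the fitness function (sum of squared correlations minus a per-feature `penalty`), the definition of the importance score as the fraction of universes whose best solution keeps a feature, and all function and parameter names. The abstract does not give the theoretical null distribution, so the threshold here uses an empirical permutation null as a plainly different stand-in.

```python
import random


def squared_correlation(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)


def fitness(mask, merits, penalty):
    """Toy fitness (assumption): reward per kept feature minus a fixed penalty."""
    return sum(m - penalty for m, keep in zip(merits, mask) if keep)


def run_universe(merits, rng, pop_size=20, generations=30,
                 mutation_rate=0.05, penalty=0.1):
    """Evolve one independent population; return its best feature mask."""
    p = len(merits)

    def score(mask):
        return fitness(mask, merits, penalty)

    pop = [[rng.random() < 0.5 for _ in range(p)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # elitism: the top half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, p)  # one-point crossover
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in a[:cut] + b[cut:]]
            children.append(child)
        pop = parents + children
    return max(pop, key=score)


def importance_scores(X, y, n_universes, seed):
    """Fraction of universes whose best mask keeps each feature (assumed definition)."""
    rng = random.Random(seed)
    p = len(X[0])
    merits = [squared_correlation([row[j] for row in X], y) for j in range(p)]
    counts = [0] * p
    for _ in range(n_universes):
        best = run_universe(merits, rng)
        for j, keep in enumerate(best):
            counts[j] += keep
    return [c / n_universes for c in counts]


def permutation_threshold(X, y, n_universes, n_perm, seed, quantile=0.95):
    """Empirical stand-in for the null: scores obtained after shuffling y."""
    rng = random.Random(seed)
    null_scores = []
    for k in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # breaks any link between features and response
        null_scores.extend(importance_scores(X, y_perm, n_universes, seed + 1 + k))
    null_scores.sort()
    idx = min(len(null_scores) - 1, int(quantile * len(null_scores)))
    return null_scores[idx]


def make_toy_data(seed, n=60, p=6):
    """Hypothetical data: only feature 0 drives the response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [row[0] + 0.1 * rng.gauss(0, 1) for row in X]
    return X, y
```

A typical use would be to call `importance_scores` on the data, call `permutation_threshold` on the same data, and keep the features whose score exceeds the returned value. Because the toy fitness is separable across features, the sketch only illustrates the workflow of scores plus threshold; it says nothing about the performance of the method described in the report.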