Mixture-of-experts (MOE) models have attracted much attention in machine learning, statistics, and the health sciences. Statistically, MOE models are used to estimate the conditional distribution of a random variable \(Y\) given a vector of features \(x = (x_1, x_2, \ldots, x_p)\). It is assumed that the data \(Y\) arise from a heterogeneous population that can be decomposed into sub-populations defined on (possibly overlapping) regions of the space of \(x\). In the first part of the talk I will discuss estimation and feature selection problems in MOE models. A penalized maximum likelihood estimator is proposed as an alternative to the ordinary maximum likelihood estimator. The new estimator is particularly advantageous when fitting an MOE model to data with many correlated features. It is shown that the proposed estimator is root-\(n\) consistent, and simulations show that its finite-sample behaviour is superior to that of the maximum likelihood estimator. I will then turn to feature selection: a computationally efficient and statistically optimal feature selection method is proposed for MOE models, and its properties will be discussed. A real-data example will be presented to illustrate the use of the new method.
In the second part of the talk I will present some new developments on feature selection problems in mixture regression models when the dimension of the model can grow with the sample size.