In this article, we consider nonparametric regression analysis between two variables when data are sampled through a complex survey. While nonparametric regression analysis has been widely used with data that may be assumed to be generated from independently and identically distributed (iid) random variables, the methods and asymptotic analyses established for iid data need to be extended in the framework of complex survey designs. Local polynomial regression estimators are studied, which include as particular cases design-based versions of the Nadaraya-Watson estimator and of the local linear regression estimator. In this paper, special emphasis is given to the local linear regression estimator. Our estimators incorporate both the sampling weights and the kernel weights. We derive the asymptotic mean squared error (MSE) of the kernel estimators using a combined inference framework, and as a corollary consistency of the estimators is deduced. Selection of a bandwidth is necessary for the resulting estimators; an optimal bandwidth can be determined, according to the MSE criterion in the combined mode of inference. Simulation experiments are conducted to illustrate the proposed methodology and an application with the Canadian Survey of Labour and Income Dynamics is presented.
Published December 2008 , 29 pages