Group for Research in Decision Analysis

Feature selection in high-dimensional heterogeneous time-to-event data: a study on ovarian cancer

Farhad Shokoohi, McGill University, Canada

Ovarian cancer is among the leading causes of cancer deaths and is extremely difficult to detect at early stages. The DNA methylation profile of a genome may provide valuable information, and may even be part of a genetic signature for survival time after surgery. It is therefore of interest to study how time to recurrence of ovarian cancer is related to the methylation levels of gene promoters. A recent study of the relationship between genes and time to recurrence of ovarian cancer after surgery measured the methylation levels of a large number of genes. The observations are subject to right censoring and show signs of heterogeneity. To take this latter feature into account in our analysis, we consider a finite mixture of accelerated failure time models. Since a large number of genes are measured in the study, we apply screening and penalization methods to identify genes that may have a considerable effect on survival after surgery. There are many methods for dealing with time-to-event data with a large number of features (p) and a small sample size (n). There is, however, no method in the current literature for analyzing censored data with substructure in the large p, small n setting. In this talk, we consider variable selection in finite mixture models when observations are subject to right censoring. We propose a penalized likelihood method for this problem. Large-sample properties of the proposed method are studied. Simulation studies are carried out to evaluate the performance of the proposed method. The ovarian cancer data are analyzed and the results of the study are discussed.

Free admission.
Everyone is welcome!