Tree-based methods are frequently used in studies with censored survival times. Their structure and interpretability make them useful for identifying prognostic factors and for predicting conditional survival probabilities given an individual's covariates. Existing methods are tailor-made for survival times measured on a continuous scale. However, survival variables measured on a discrete scale are often encountered in practice. We propose a new tree construction method specifically adapted to such discrete-time survival variables. The splitting procedure can be seen as an extension, to the case of right-censored data, of the entropy criterion for a categorical outcome. The final tree is selected through a pruning algorithm combined with a selection procedure based on a penalized likelihood criterion similar to the AIC and BIC criteria. We also present a simple way of potentially improving the predictive performance of a single tree through bagging. A simulation study shows that single trees and bagged trees perform well compared to a parametric model. A real data example is presented that investigates the usefulness of personality dimensions in predicting early onset of cigarette smoking.
Published September 2007, 24 pages
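
To make the idea of an entropy-type split for right-censored discrete times more concrete, the following minimal Python sketch scores a candidate split by the gain in maximized discrete-time log-likelihood. It is an illustration under stated assumptions, not necessarily the exact criterion of the paper: the function names `node_loglik` and `split_gain` are hypothetical, each node is assumed to have its own hazard at every observed time point, and a subject censored at time k is assumed to count in the risk set at k as a non-event. With no censoring, the node log-likelihood reduces to the multinomial log-likelihood of the observed times, which is minus the node size times the empirical entropy.

```python
import math


def node_loglik(times, events):
    """Maximized discrete-time log-likelihood of one node.

    times  : observed discrete times (integers), one per subject
    events : 1 if the event occurred at that time, 0 if right-censored
    Assumption: a subject censored at time k is at risk at k and
    contributes as a non-event there.
    """
    ll = 0.0
    for k in sorted(set(times)):
        at_risk = sum(1 for t in times if t >= k)
        died = sum(1 for t, d in zip(times, events) if t == k and d == 1)
        survived = at_risk - died
        # Plug in the estimated hazard died / at_risk; 0 * log(0) is taken as 0.
        if died > 0:
            ll += died * math.log(died / at_risk)
        if survived > 0:
            ll += survived * math.log(survived / at_risk)
    return ll


def split_gain(times, events, go_left):
    """Log-likelihood gain from splitting a node into two children.

    go_left : booleans, True if the subject is sent to the left child.
    Returns 0.0 when one child would be empty.
    """
    left = [(t, d) for t, d, g in zip(times, events, go_left) if g]
    right = [(t, d) for t, d, g in zip(times, events, go_left) if not g]
    if not left or not right:
        return 0.0
    lt, ld = zip(*left)
    rt, rd = zip(*right)
    return node_loglik(lt, ld) + node_loglik(rt, rd) - node_loglik(times, events)


# Toy data (made up for illustration): six subjects, one covariate x.
times = [1, 1, 2, 3, 3, 3]
events = [1, 1, 1, 0, 0, 1]
x = [0.2, 0.4, 0.5, 1.5, 1.7, 1.9]
gain = split_gain(times, events, [xi <= 1.0 for xi in x])
print(f"gain for split x <= 1.0: {gain:.3f}")
```

In a tree-growing loop, this gain would be evaluated for every candidate covariate and cut point, and the split with the largest gain retained. The gain is never negative, because each child is fitted with its own hazards and so fits at least as well as the parent.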