Groupe d’études et de recherche en analyse des décisions

The statistical analysis of cross-sectional survival data with applications to the study of dementia

Marco Carone

To study the natural history of a disease, investigators sometimes resort to cross-sectional cohort studies, whereby participants are randomly sampled at a fixed point in time. Once their baseline information and disease history are recorded, these participants are followed until death, loss to follow-up or study termination. This design is attractive because it generally requires limited resources. The data it generates, however, suffer from various systematic biases. For example, individuals with longer lifetimes tend to be oversampled, and among observed cases, individuals with longer disease durations and more recent onsets are usually overrepresented. In this presentation, the analysis of data emerging from either a cross-sectional cohort study or its related subdesign, the prevalent cohort study, is discussed. The latter arises when only diseased individuals may be recruited at sampling time. Novel methodologies for estimating important epidemiologic measures of disease risk, such as the incidence rate and the lifetime risk, are presented. The conventional approach to the analysis of prevalent cohort survival data is then revisited and shown to generally result in biased inference. A more appropriate nonparametric methodology for estimating the bivariate distribution of age-at-onset and residual lifetime is proposed as an alternative. Finally, a semiparametric model capturing the impact of age-at-onset on lifetime in a scientifically meaningful manner is constructed, along with an estimating equations framework for inference. All these methodologies are illustrated using data from the Canadian Study of Health and Aging, a large study of dementia in the Canadian elderly population.
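
The overrepresentation of long disease durations among cases sampled at a fixed time can be illustrated with a small simulation. The sketch below is not the methodology of the talk. It assumes, purely for illustration, a constant onset rate over a long window and exponentially distributed disease durations. The function name and all parameter values are hypothetical.

```python
import random


def simulate_length_bias(n=200_000, mean_duration=5.0, window=200.0, seed=0):
    """Compare mean disease duration in the whole population with the mean
    among cases that are prevalent (alive with disease) at the sampling time.

    Illustrative assumptions, not taken from the abstract:
      - onsets are uniform on [0, window] (constant incidence),
      - durations are exponential with mean `mean_duration`,
      - the cross-sectional sample is drawn at time `window`.
    """
    rng = random.Random(seed)
    sampling_time = window
    all_durations = []
    prevalent_durations = []
    for _ in range(n):
        onset = rng.uniform(0.0, window)
        duration = rng.expovariate(1.0 / mean_duration)
        all_durations.append(duration)
        # A case is observable only if onset has occurred and death has not.
        if onset <= sampling_time < onset + duration:
            prevalent_durations.append(duration)
    population_mean = sum(all_durations) / len(all_durations)
    prevalent_mean = sum(prevalent_durations) / len(prevalent_durations)
    return population_mean, prevalent_mean


if __name__ == "__main__":
    population_mean, prevalent_mean = simulate_length_bias()
    print(f"mean duration, all cases:       {population_mean:.2f}")
    print(f"mean duration, prevalent cases: {prevalent_mean:.2f}")
```

Under these assumptions, the mean duration among prevalent cases should come out at roughly twice the population mean, which is why a naive analysis of durations collected this way overstates survival.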