This work proposes a statistical methodology for handling aggregate data. Aggregate data arise in many fields, such as medical science, ecology, social science, and reliability. They can be described as follows: individuals move progressively through a finite set of states, and observations are made over a time window split into several intervals. At each observation time, the only available information is the number of individuals in each state; the history of each individual, viewed as a stochastic process, is thus lost, and the time spent in a given state is unknown. Using a data completion technique, we obtain an estimate of the hazard rate in each state based on sojourn times and deduce an estimate of the survival function. Unlike other studies on the same topic, our approach does not require the Markov assumption to obtain these estimates. We study the method through simulations and apply it to a data set that originally motivated this research.
Published November 2008, 21 pages