We consider a general channel sensing problem in which a user has access to a communication system consisting of multiple channels. Each channel is modeled as a multi-state Markov chain (M.C.). At each time instant the user can select only one channel to sense, and uses that channel to transmit information. Each transmission earns a reward that depends on the state of the selected channel. The objective is to design a channel sensing policy that maximizes the expected total reward collected over a finite or infinite horizon. This channel sensing problem arises in opportunistic scheduling over fading channels and in cognitive radio networks. In opportunistic transmission over fading channels, the states of the M.C. describe, at each time instant, the quality of the fading channel. In cognitive radio networks, a secondary user may transmit over a channel only when that channel is not occupied by the primary user; thus, at any time instant, one state of the channel indicates that it is currently occupied by the primary user, while the remaining states describe the quality of the channel available to the secondary user.
The same problem arises in many other areas of science and technology, as it is an instance of the restless bandit problem, for which the form of optimal policies is unknown in general. We identify sets of conditions sufficient to guarantee the optimality of a myopic sensing policy, and we show that under one particular set of conditions the myopic policy coincides with the Gittins index rule.
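As a concrete illustration (not taken from the paper), the following Python sketch simulates a myopic sensing policy on a bank of identical two-state (good/bad) Markov channels. The user maintains a belief (the probability that each channel is in the good state), senses the channel with the highest expected immediate reward, observes that channel's state exactly, and propagates all beliefs through the transition matrix. The transition probabilities, reward values, and function names here are hypothetical choices made for the example.

```python
import random

# Hypothetical two-state channel (0 = bad, 1 = good); the transition
# matrix and rewards are illustrative, not from the paper.
P = [[0.8, 0.2],   # P[s][s']: prob. of moving from state s to state s'
     [0.3, 0.7]]
REWARD = [0.0, 1.0]  # reward earned when the sensed channel is in state s

def propagate(belief):
    """One-step prior update: prob. the channel is good at the next slot."""
    return belief * P[1][1] + (1 - belief) * P[0][1]

def myopic_policy(beliefs):
    """Sense the channel with the highest expected immediate reward."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])

def simulate(n_channels=3, horizon=1000, seed=0):
    rng = random.Random(seed)
    states = [rng.randint(0, 1) for _ in range(n_channels)]
    beliefs = [0.5] * n_channels  # initial belief: good with prob. 1/2
    total = 0.0
    for _ in range(horizon):
        k = myopic_policy(beliefs)
        total += REWARD[states[k]]
        obs = states[k]  # sensed channel's state is observed exactly
        # Unsensed channels: propagate the prior one step; sensed
        # channel: propagate the observed state (a point mass).
        beliefs = [propagate(b) for b in beliefs]
        beliefs[k] = propagate(float(obs))
        # All channels evolve according to the Markov chain.
        states = [0 if rng.random() < P[s][0] else 1 for s in states]
    return total / horizon
```

Because the good state is "sticky" here (P[1][1] > P[0][1]), the myopic policy tends to revisit a channel last seen in the good state, which is the intuition behind the optimality conditions studied in the paper.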