Markov Decision Processes: the control of high-dimensional systems

Bhulai, Sandjai
Markov Decision Processes: the control of high-dimensional systems
Ph.D. Thesis, Vrije Universiteit, 2002 (Postscript - )

Abstract: We develop algorithms for the computation of (nearly) optimal decision rules in high-dimensional systems. We start with a survey of the literature on this topic. Next, the formal theory of Markov decision processes is discussed. Extra emphasis is put on the role of value functions within this theory. The value function plays a central role in one-step policy improvement and in neuro-dynamic programming; these are two techniques to approximate optimal decision rules. We systematically analyze the value function of birth-death processes. The well-known queueing systems, such as the multi-server queue, the single server queue, and the infinite server queue, are special cases of it. We also study more complex queueing systems, such as the priority queue and the tandem queue. The second part considers Markov decision processes in which the decision maker is confronted with additional uncertainty, apart from the uncertainty of the future development of the system. This uncertainty can be due to partial observation of the state of the system, or because of partial specification of transitions between states of the system when a decision is taken. We show that the latter is a special case of the former, and that the problem can be cast as a standard Markov decision process. However, the reformulation causes the system to be high-dimensional again. Although the number of state configurations is very large, only a limited fraction of them can be actually attained. This make it possible to reduce the system in dimensionality and to analyze it. These techniques will be illustrated by means of routing problems and multi-armed bandit problems.