A Markov chain is a random dynamical system: the state at time t+1 depends on the state at time t and a random draw. Markov chains model a wide range of problems: meteorology, industrial processes, even demographic models.

Adding the notion of decision yields a Markov decision process (MDP): the state at time t+1 depends on (1) the state at time t, (2) a random draw, and (3) the decision made. We can also add the notion of reward. For an economic system, the reward might be money earned; for an industrial system, it might be pollution (a negative reward).

There may be several actors making decisions, possibly with different or even completely antagonistic objectives, as is often the case in games. There may also be a difficulty of observation: a decision must then be made knowing only part of the state.

We will discuss the models (in depth), the theory (the broad outlines), and the scope of application (briefly).
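The dynamics described above can be sketched in a few lines of code. This is a minimal toy MDP, not taken from the lecture: the states, actions, transition probabilities, and rewards below are invented for illustration. The next state is drawn from a distribution that depends on both the current state and the chosen action, and each transition yields a reward.

```python
import random

# Toy Markov decision process: the state at time t+1 depends on
# (1) the state at time t, (2) a random draw, and (3) the decision made.
# All states, actions, probabilities, and rewards here are illustrative.

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "dry": {"irrigate": [(0.9, "wet", -1.0), (0.1, "dry", -1.0)],
            "wait":     [(0.2, "wet",  0.0), (0.8, "dry",  0.0)]},
    "wet": {"irrigate": [(1.0, "wet", -1.0)],
            "wait":     [(0.6, "wet",  1.0), (0.4, "dry",  1.0)]},
}

def step(state, action, rng=random):
    """Draw the next state and reward given the current state and action."""
    u = rng.random()
    cumulative = 0.0
    for prob, nxt, reward in transitions[state][action]:
        cumulative += prob
        if u < cumulative:
            return nxt, reward
    return nxt, reward  # guard against floating-point rounding

def simulate(policy, start="dry", horizon=10, seed=0):
    """Run the MDP for `horizon` steps under a fixed policy; return total reward."""
    rng = random.Random(seed)
    state, total = start, 0.0
    for _ in range(horizon):
        state, reward = step(state, policy(state), rng)
        total += reward
    return total

# A simple deterministic policy: irrigate when dry, wait when wet.
total = simulate(policy=lambda s: "irrigate" if s == "dry" else "wait")
```

Dropping the action argument (a single transition distribution per state) recovers a plain Markov chain; the policy is what turns the chain into a decision problem.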