Example:
Imagine a player keeps rolling a die until they either stop or roll a 6. Their score is the last value they've rolled, or 0 if they rolled a 6.

What’s the optimal strategy if the player wants to maximize their expected payoff?

Define the payoff $f(x) = x$ if $x \in \{1, \dots, 5\}$, and $f(6) = 0$.

Let $v(x)$ be the expected winnings of the player, given that the first roll is $x$, if the player follows the optimal strategy.

Let $C$ be the expected payoff of continuing after the first roll (and then playing optimally); by symmetry this does not depend on the current roll.

If $x > C$, the player should take the money; otherwise the player should roll again. In other words, $v(x) = \max\{x, C\}$ for $x \leq 5$.

Since the player can always stop after the next roll, $C \geq \frac{1 + 2 + 3 + 4 + 5 + 0}{6} = 2.5$, so we have $C > 2$.

We now know a bit more about this game: if the first roll is a 1 or a 2, the player should roll again.

Suppose the first roll is a 4. If the optimal strategy were to continue, then the game would look like rolling until a 5 or a 6 is rolled, meaning the expected winnings $w$ would satisfy $w = \frac{5}{6} + \frac{4}{6} w$, i.e. $w = 2.5$. This is less than 4, so clearly the optimal strategy is to stop on a 4.

Let $w$ be the expected winnings after continuing, and assume we continue after a 3 (so we stop only on a 4 or a 5): $w = \frac{4 + 5}{6} + \frac{3}{6} w$, giving $w = 3$.

So we can stop or continue after rolling a 3 for the same result: the value of continuing is exactly 3. The optimal strategy is therefore to stop on a roll of 4 or 5 (stopping on 3 as well changes nothing), and the expected payoff of the game is 3.
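As a quick sanity check (my own addition, not part of the original example), the following Python sketch evaluates every threshold strategy of the form "stop on the first roll of at least $k$" exactly; the function name is illustrative.

```python
from fractions import Fraction

def threshold_value(k):
    """Expected payoff of the strategy 'stop on the first roll >= k'.

    Rolling a 6 ends the game with payoff 0; rolls below k mean rolling again.
    Solves  w = sum(stop values)/6 + (number of continue values/6) * w  for w.
    """
    stop = [x for x in range(1, 6) if x >= k]   # rolls we stop and score on
    cont = [x for x in range(1, 6) if x < k]    # rolls we keep going on
    return Fraction(sum(stop), 6 - len(cont))

for k in range(1, 6):
    print(k, threshold_value(k))
# Thresholds k = 3 and k = 4 both give 3; every other threshold does worse.
```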

Optimal Stopping of Markov Chains

Suppose $P$ is the transition matrix for a discrete-time Markov chain with state space $\mathcal{S}$, and define a payoff function $f(x) \geq 0$ for all $x \in \mathcal{S}$.

We are interested in cases where $P$ is not irreducible, since for an irreducible finite chain we could always keep going until the chain reaches a state maximizing $f$, and the problem would be trivial.

A stopping rule or stopping time is a random variable $T \in \{0, 1, 2, \dots\} \cup \{\infty\}$ that gives the time at which the chain is stopped, where the decision to stop at time $n$ is based only on what has happened up to step $n$, i.e. on $X_0, X_1, \dots, X_n$.

All reasonable rules divide the state space into two sets $S_1$ (continue) and $S_2$ (stop) and stop the first time the chain reaches a state in $S_2$.

Our aim is to maximize the expected payoff $\mathbb{E}[f(X_T)]$ over all legal stopping rules $T$.

We generalize the quantities defined in the initial example. The value function is
$$v(x) = \sup_T \mathbb{E}\left[ f(X_T) \mid X_0 = x \right],$$
where the supremum is over all legal stopping rules $T$.

Assuming the optimal strategy, $v(x) \geq f(x)$ (the player can always stop immediately) and $v(x) \geq \sum_y P(x,y)\, v(y)$ (the player can always take one more step and then play optimally).

We call a function $u$ superharmonic with respect to $P$ if $u(x) \geq \sum_y P(x,y)\, u(y)$ for all $x$.

Let $u$ be a superharmonic function and let $T$ be any stopping rule. Then $\mathbb{E}[u(X_{T \wedge n}) \mid X_0 = x] \leq u(x)$ for every $n$.

We can prove this by induction on $n$: the case $n = 0$ is immediate, and applying the superharmonic inequality at the state occupied at time $T \wedge n$ gives $\mathbb{E}[u(X_{T \wedge (n+1)}) \mid X_0 = x] \leq \mathbb{E}[u(X_{T \wedge n}) \mid X_0 = x]$.

Since $u$ is a bounded function, we can let $n \to \infty$ and say $\mathbb{E}[u(X_T) \mid X_0 = x] \leq u(x)$.

Suppose in addition that $u(x) \geq f(x)$ for all $x$; then $\mathbb{E}[f(X_T) \mid X_0 = x] \leq \mathbb{E}[u(X_T) \mid X_0 = x] \leq u(x)$ for every stopping rule $T$, so $v(x) \leq u(x)$.

Hence every superharmonic function larger than $f$ is greater than or equal to the value function $v$.

$v(x) = \inf \{ u(x) : u \text{ superharmonic}, \; u \geq f \}$, the infimum over all superharmonic functions dominating $f$; i.e. $v$ is the smallest superharmonic function with respect to $P$ that is greater than or equal to $f$.

We use this fact to determine $v$. Start with $u_1(x)$, which equals $f(x)$ if $x$ is absorbing and otherwise equals the maximum value of $f$.

Let $u_{n+1}(x) = \max\left\{ f(x), \; \sum_y P(x,y)\, u_n(y) \right\}$.
We see $u_2 \leq u_1$, and therefore by induction $u_{n+1} \leq u_n$ for every $n$, meaning the decreasing limit $u = \lim_{n \to \infty} u_n$ exists and is also superharmonic and greater than or equal to $f$.

Passing to the limit in the recursion,
$$u(x) = \max\Big\{ f(x), \; \sum_y P(x,y)\, u(y) \Big\}.$$

It's already clear from this equation that every state satisfies either $u(x) = f(x)$ or $u(x) = \sum_y P(x,y)\, u(y)$; we claim that $u$ is in fact the value function $v$.
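As a concrete illustration (my own sketch, not part of the notes), the recursion can be run numerically. Below, the die game from the first section is encoded as a 6-state chain: state 0 is the absorbing "rolled a 6" state with payoff 0, and state $k \in \{1,\dots,5\}$ means the current roll shows $k$; the variable names are illustrative.

```python
import numpy as np

# Die game as a Markov chain: state 0 = "rolled a 6" (absorbing, payoff 0),
# states 1..5 = value currently showing (payoff = that value).
P = np.zeros((6, 6))
P[0, 0] = 1.0                      # absorbing state
P[1:, :] = 1.0 / 6.0               # rolling again: each face equally likely
f = np.array([0.0, 1, 2, 3, 4, 5])

# u_1: the payoff at absorbing states, the maximum payoff elsewhere
u = np.where(np.isclose(np.diag(P), 1.0), f, f.max())

# Iterate u_{n+1}(x) = max{ f(x), sum_y P(x,y) u_n(y) } until it stabilizes
for _ in range(1000):
    u_next = np.maximum(f, P @ u)
    if np.allclose(u_next, u, rtol=0, atol=1e-12):
        break
    u = u_next

print(u)                # approximately [0, 3, 3, 3, 4, 5]
print(np.isclose(u, f)) # True marks the stopping set S_2 = {x : v(x) = f(x)}
```

The output agrees with the analysis of the die game: the value of continuing is 3, and it makes no difference whether a 3 is treated as a stopping state.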

Suppose we know the optimal strategy and its stopping set $S_2 = \{x : v(x) = f(x)\}$; then
$v(x) = f(x)$ if $x \in S_2$, and $v(x) = \sum_y P(x,y)\, v(y)$ otherwise.

For a finite-state Markov chain, the solution can be found directly from this system of linear equations
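For instance (again my own sketch), if we take the stopping set of the die chain to be $S_2 = \{0, 3, 4, 5\}$, the value function can be read off by solving the linear system directly:

```python
import numpy as np

# Same chain and payoff as in the previous sketch
P = np.zeros((6, 6))
P[0, 0] = 1.0
P[1:, :] = 1.0 / 6.0
f = np.array([0.0, 1, 2, 3, 4, 5])

S2 = np.array([True, False, False, True, True, True])  # assumed stopping set

# v(x) = f(x) on S2, and v(x) = sum_y P(x,y) v(y) on S1 = complement of S2
A = np.eye(6)
b = np.zeros(6)
A[~S2] -= P[~S2]   # rows for S1: (I - P) v = 0
b[S2] = f[S2]      # rows for S2: v = f

v = np.linalg.solve(A, b)
print(v)           # [0, 3, 3, 3, 4, 5]
```

Of course, in general we do not know $S_2$ ahead of time, which is why the recursive algorithm above is useful.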

Now to prove the above, we see that $u$ is a superharmonic function and $u(x) \geq f(x)$ for all $x$, so $u \geq v$.

Define $S_2 = \{x : u(x) = f(x)\}$ and $S_1 = \{x : u(x) > f(x)\}$, and let $T$ be the first time the chain enters $S_2$.
On $S_1$, $u(x) = \sum_y P(x,y)\, u(y)$, so the inequality used earlier holds with equality up to time $T$.
Therefore, $u(x) = \mathbb{E}[u(X_T) \mid X_0 = x] = \mathbb{E}[f(X_T) \mid X_0 = x]$.
Since $v$ is the largest expected payoff over all possible stopping rules, we get $v(x) \geq \mathbb{E}[f(X_T) \mid X_0 = x] = u(x)$.

So $u = v$, and the optimal rule is to stop the first time the chain enters $S_2 = \{x : v(x) = f(x)\}$.

Optimal Stopping with Cost

Assume continuing from state $x$ has a cost of $g(x) \geq 0$, so that stopping at time $T$ yields payoff $f(X_T) - \sum_{j=0}^{T-1} g(X_j)$, and $v(x)$ is the supremum of the expectation of this quantity over all stopping rules.

The optimal stopping rule is again to stop when the chain enters $S_2 = \{x : v(x) = f(x)\}$.

$v$ is the smallest function greater than or equal to $f$ satisfying $v(x) \geq \sum_y P(x,y)\, v(y) - g(x)$ for all $x$.

Our algorithm is similar,

$u_1$ is defined as before, and $u_{n+1}(x) = \max\left\{ f(x), \; \sum_y P(x,y)\, u_n(y) - g(x) \right\}$.

Including a cost function makes irreducible Markov chains more interesting: waiting is no longer free, so it is no longer optimal to simply wait until the chain hits a state maximizing $f$.
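To illustrate (my own example, with an assumed constant cost of $g = 0.25$ per step), here is the modified recursion run on a simple random walk on the cycle $\{0,\dots,5\}$ with payoff $f(x) = x$, which is irreducible; without the cost the answer would trivially be $\max f$ everywhere.

```python
import numpy as np

# Irreducible example: simple random walk on the cycle {0,...,5},
# payoff f(x) = x, constant cost g = 0.25 per step of continuing.
n = 6
P = np.zeros((n, n))
for x in range(n):
    P[x, (x - 1) % n] = 0.5
    P[x, (x + 1) % n] = 0.5
f = np.arange(n, dtype=float)
g = np.full(n, 0.25)

u = np.full(n, f.max())            # u_1: no absorbing states, so max f everywhere
for _ in range(10_000):
    u_next = np.maximum(f, P @ u - g)
    if np.allclose(u_next, u, rtol=0, atol=1e-10):
        break
    u = u_next

print(np.round(u, 3))              # value function with the per-step cost
print(np.isclose(u, f))            # True marks the stopping set
```

Raising $g$ can only shrink the continuation set, since waiting becomes more expensive.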

Optimal Stopping with Discounting

In financial matters, we often assume that the value of money decreases over time: a payoff received at time $n$ is worth only $\alpha^n$ times its face value, for some discount factor $0 < \alpha < 1$.

The value function becomes $v(x) = \sup_T \mathbb{E}\left[ \alpha^T f(X_T) \mid X_0 = x \right]$.

$v$ is the smallest function with $v \geq f$ and $v(x) \geq \alpha \sum_y P(x,y)\, v(y)$ for all $x$.
We can use the same recursive algorithm, now with $u_{n+1}(x) = \max\left\{ f(x), \; \alpha \sum_y P(x,y)\, u_n(y) \right\}$.
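As a final sketch (my own, with an assumed discount factor $\alpha = 0.9$), the discounted recursion applied to the earlier die chain:

```python
import numpy as np

alpha = 0.9                        # assumed discount factor
P = np.zeros((6, 6))
P[0, 0] = 1.0                      # state 0: rolled a 6, absorbing
P[1:, :] = 1.0 / 6.0
f = np.array([0.0, 1, 2, 3, 4, 5])

u = np.where(np.isclose(np.diag(P), 1.0), f, f.max())
for _ in range(10_000):
    u_next = np.maximum(f, alpha * (P @ u))
    if np.allclose(u_next, u, rtol=0, atol=1e-10):
        break
    u = u_next

print(np.round(u, 3))   # discounting lowers the value of continuing
print(np.isclose(u, f)) # True marks the stopping set
```

Since discounting only lowers the continuation value, the stopping set can never be smaller than in the undiscounted problem.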