Example:
Imagine a player keeps rolling a fair die until they either stop or roll a 6. Their score is the last value they've rolled, or 0 if they rolled a 6.
What’s the optimal strategy if the player wants to maximize their expected payoff?
Define the payoff $f(x) = x$ if $x \in \{1, 2, 3, 4, 5\}$, and $f(6) = 0$.
Let $v(x)$ be the expected winnings of the player given that the first roll is $x$, if the player takes the optimal strategy.
Let $u$ be the expected payoff from continuing after the first roll (since the rolls are independent and identically distributed, $u$ does not depend on what the first roll was).
If $f(x) \ge u$, the player should take the money; otherwise the player should roll again.
Since $v(x) \ge f(x)$ for every $x$, we have $u = \frac{1}{6}\sum_{x=1}^{6} v(x) \ge \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 0) = \frac{5}{2}$.
We now know a bit more about this game: since $u \ge \frac{5}{2} > 2$, if the first roll is a 1 or a 2 then the player should roll again.
Suppose the first roll is a 4. If the optimal strategy were to continue, then (since $u$ is the same no matter what the current roll is, continuing would also be optimal on anything smaller) the game would look like rolling until a 5 or 6 is rolled, meaning the expected winnings would be $\frac{1}{2}(5) + \frac{1}{2}(0) = \frac{5}{2}$. This is less than 4, so clearly the optimal strategy is stopping on 4.
Let $u$ be the expected winnings after continuing, and assume we continue after rolling a 3: then $u = \frac{1}{6}(u + u + u + 4 + 5 + 0)$, which gives $u = 3$.
So we can stop or continue after rolling a 3 with the same expected result; either way the value function is $v = (3, 3, 3, 4, 5, 0)$.
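As a quick sanity check on this arithmetic, here is a small simulation (not part of the original notes; the function name and threshold parameter are just for illustration) of the rule "roll again on 1-3, stop on 4 or 5":

```python
import random

# Simulate the strategy "stop on a roll of 4 or 5, roll again on 1-3";
# rolling a 6 ends the game with a score of 0.
def play(rng, threshold=4):
    while True:
        roll = rng.randint(1, 6)
        if roll == 6:
            return 0          # bust: the game ends with nothing
        if roll >= threshold:
            return roll       # stop and take the money

rng = random.Random(0)
n = 200_000
print(sum(play(rng) for _ in range(n)) / n)   # close to 3, the value computed above
```

Changing the threshold to 3 gives the same average, matching the indifference noted above.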
Optimal Stopping of Markov Chains
Suppose $P$ is the transition matrix for a discrete-time Markov chain $(X_n)$ with finite state space $S$, and define a payoff function $f(x) \ge 0$ for all $x \in S$.
We are interested in cases where $P$ is not irreducible, since for an irreducible chain the player can simply keep going until the chain reaches a state where $f$ is maximal.
A stopping rule or stopping time is a random variable $T \in \{0, 1, 2, \dots\}$ that gives the time at which the chain is stopped, based only on what has happened up to step $n$: the event $\{T = n\}$ is determined by $X_0, \dots, X_n$.
All reasonable rules divide the state space into two sets, a continuation set and a stopping set $\mathcal{E}$, and stop when the chain first reaches a state in $\mathcal{E}$.
Our aim is to maximize the expected payoff $\mathbb{E}[f(X_T)]$ over all legal stopping rules $T$.
We generalize the quantities defined in the initial example: let $v(x) = \sup_T \mathbb{E}[f(X_T) \mid X_0 = x]$ be the value function.
Assuming the optimal strategy, $v(x) \ge f(x)$ (the player can always stop immediately) and $v(x) \ge \sum_y p(x,y)\, v(y)$ (the player can always take one more step and then play optimally).
We call a function $u$ superharmonic with respect to $P$ if $u(x) \ge \sum_y p(x,y)\, u(y)$ for all $x \in S$.
Let $T$ be any stopping rule and let $u$ be any superharmonic function.
We can prove by induction on $n$ that $\mathbb{E}[u(X_{T \wedge n}) \mid X_0 = x] \le u(x)$: the case $n = 0$ is immediate, and applying the superharmonic inequality at time $n$ on the event $\{T > n\}$ gives the step from $n$ to $n+1$.
Since $u$ is a bounded function, we can say $\mathbb{E}[u(X_T) \mid X_0 = x] = \lim_{n \to \infty} \mathbb{E}[u(X_{T \wedge n}) \mid X_0 = x] \le u(x)$.
Suppose $u(x) \ge f(x)$ for all $x$. Then $\mathbb{E}[f(X_T) \mid X_0 = x] \le \mathbb{E}[u(X_T) \mid X_0 = x] \le u(x)$.
Hence every superharmonic function larger than $f$ is greater than or equal to the value function: taking the supremum over all rules $T$ gives $u(x) \ge v(x)$.
$v(x) = \inf\{u(x) : u \text{ superharmonic and } u \ge f\}$, the infimum over all superharmonic functions dominating $f$; i.e. $v$ is the smallest superharmonic function with respect to $P$ that is greater than or equal to $f$.
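As a small numerical illustration (the chain construction and the helper name is_superharmonic below are my own, not from the notes), we can check that the die-game value function found earlier dominates $f$ and is superharmonic for the die chain with 6 made absorbing:

```python
import numpy as np

# u is superharmonic w.r.t. P if u(x) >= sum_y p(x,y) u(y) at every state.
def is_superharmonic(P, u, tol=1e-9):
    return bool(np.all(u >= P @ u - tol))

# Die game as a Markov chain: from any live state the next roll is uniform on 1-6,
# and rolling a 6 (last state) is absorbing with payoff 0.
P = np.full((6, 6), 1 / 6)
P[5] = 0.0
P[5, 5] = 1.0
f = np.array([1, 2, 3, 4, 5, 0], dtype=float)
v = np.array([3, 3, 3, 4, 5, 0], dtype=float)   # value function found in the example

print(is_superharmonic(P, v), np.all(v >= f))   # True True
```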
We use this fact to determine $v$. Start with $u_1$, which equals $f(x)$ if $x$ is absorbing and otherwise equals the maximum value of $f$.
Let $u_{n+1}(x) = \max\{f(x),\ \sum_y p(x,y)\, u_n(y)\}$.
We see $u_2 \le u_1$ and therefore, by induction, $u_{n+1} \le u_n$ for every $n$; the decreasing limit $u(x) = \lim_{n \to \infty} u_n(x)$ exists and satisfies $u(x) = \max\{f(x),\ \sum_y p(x,y)\, u(y)\}$, meaning $u$ is also superharmonic and greater than or equal to $f$.
It's already clear from this equation that the chain should be stopped exactly at those states where $u(x) = f(x)$, i.e. where stopping is at least as good as taking one more step.
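Here is a sketch of the recursion above in code, assuming the die chain from the first example (the array names are mine); the iterates decrease to the value function:

```python
import numpy as np

# Die game as a chain: states are the last roll 1-6, with 6 absorbing and f(6) = 0.
P = np.full((6, 6), 1 / 6)
P[5] = 0.0
P[5, 5] = 1.0
f = np.array([1, 2, 3, 4, 5, 0], dtype=float)

absorbing = np.isclose(np.diag(P), 1.0)
u = np.where(absorbing, f, f.max())      # u_1: f on absorbing states, max f elsewhere

for _ in range(200):
    u_next = np.maximum(f, P @ u)        # u_{n+1}(x) = max{ f(x), sum_y p(x,y) u_n(y) }
    if np.allclose(u_next, u):
        break
    u = u_next

print(u)   # converges from above to v = (3, 3, 3, 4, 5, 0)
```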
Suppose we know the optimal strategy, i.e. its stopping set $\mathcal{E}$.
Then $v(x) = f(x)$ if $x \in \mathcal{E}$, otherwise $v(x) = \sum_y p(x,y)\, v(y)$.
For a finite-state Markov chain, the solution can be found directly from this system of linear equations
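For instance, once a candidate stopping set is fixed, the system can be solved in a few lines (a sketch; the function name and the boolean-mask representation of the stopping set are my own):

```python
import numpy as np

def value_given_stopping_set(P, f, stop):
    """Solve v(x) = f(x) on the stopping set, v(x) = sum_y p(x,y) v(y) elsewhere."""
    n = len(f)
    A = np.eye(n)
    b = np.zeros(n)
    for x in range(n):
        if stop[x]:
            b[x] = f[x]        # equation v(x) = f(x)
        else:
            A[x] -= P[x]       # equation v(x) - sum_y p(x,y) v(y) = 0
    return np.linalg.solve(A, b)

# Die chain again: stopping on 4, 5 (and the absorbing 6) reproduces v = (3, 3, 3, 4, 5, 0).
P = np.full((6, 6), 1 / 6); P[5] = 0.0; P[5, 5] = 1.0
f = np.array([1, 2, 3, 4, 5, 0], dtype=float)
stop = np.array([False, False, False, True, True, True])
print(value_given_stopping_set(P, f, stop))
```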
Now to prove the above, we see that the limit $u$ is a superharmonic function and $u(x) \ge f(x)$ for all $x$, so $u \ge v$.
Define $\mathcal{E} = \{x \in S : u(x) = f(x)\}$ and $T = \min\{n \ge 0 : X_n \in \mathcal{E}\}$.
On $S \setminus \mathcal{E}$, $u(x) = \sum_y p(x,y)\, u(y)$; that is, $u$ is harmonic on the continuation set.
Therefore, $u(x) = \mathbb{E}[u(X_T) \mid X_0 = x] = \mathbb{E}[f(X_T) \mid X_0 = x]$.
Since $v(x)$ is the largest expected payoff over all possible stopping rules, we get $v(x) \ge \mathbb{E}[f(X_T) \mid X_0 = x] = u(x)$.
So $v = u$: the algorithm computes the value function, and stopping when the chain first enters $\mathcal{E}$ is an optimal rule.
Optimal Stopping with Cost
Assume continuing from state $x$ has a cost of $g(x) \ge 0$, so the quantity to maximize is $\mathbb{E}\bigl[f(X_T) - \sum_{j=0}^{T-1} g(X_j)\bigr]$.
The optimal stopping rule is again to stop when the chain enters $\mathcal{E} = \{x : v(x) = f(x)\}$.
$v$ is the smallest function greater than or equal to $f$ satisfying $v(x) \ge \sum_y p(x,y)\, v(y) - g(x)$.
Our algorithm is similar: $u_1$ is defined as before, and $u_{n+1}(x) = \max\{f(x),\ \sum_y p(x,y)\, u_n(y) - g(x)\}$.
Including a cost function means irreducible Markov chains become interesting too: the chain will eventually reach the state with the largest payoff, but the accumulated cost of waiting for it may outweigh that payoff.
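A sketch of the modified recursion, again on the die chain, with a hypothetical cost of 1 per extra roll (the cost value is only an illustration, not from the notes):

```python
import numpy as np

P = np.full((6, 6), 1 / 6); P[5] = 0.0; P[5, 5] = 1.0
f = np.array([1, 2, 3, 4, 5, 0], dtype=float)
absorbing = np.isclose(np.diag(P), 1.0)
g = np.where(absorbing, 0.0, 1.0)        # pay 1 for each roll taken from a live state

u = np.where(absorbing, f, f.max())
for _ in range(500):
    u_next = np.maximum(f, P @ u - g)    # u_{n+1}(x) = max{ f(x), sum_y p(x,y) u_n(y) - g(x) }
    if np.allclose(u_next, u):
        break
    u = u_next

print(u)                      # about (1.6, 2, 3, 4, 5, 0)
print(np.isclose(u, f))       # the stopping set grows: it is now optimal to stop on a 2 as well
```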
Optimal Stopping with Discounting
In financial matters, we often assume the value of money decreases over time: a payoff received $n$ steps in the future is worth only $\alpha^n$ times its face value for some discount factor $\alpha \in (0, 1)$, so we now maximize $\mathbb{E}[\alpha^T f(X_T)]$.
$v$ is the smallest function with $v(x) \ge f(x)$ and $v(x) \ge \alpha \sum_y p(x,y)\, v(y)$.
We can use the same recursive algorithm, now with $u_{n+1}(x) = \max\{f(x),\ \alpha \sum_y p(x,y)\, u_n(y)\}$.
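The same sketch with a discount factor instead of a cost ($\alpha = 0.9$ is just an example value chosen for this illustration):

```python
import numpy as np

alpha = 0.9
P = np.full((6, 6), 1 / 6); P[5] = 0.0; P[5, 5] = 1.0
f = np.array([1, 2, 3, 4, 5, 0], dtype=float)
absorbing = np.isclose(np.diag(P), 1.0)

u = np.where(absorbing, f, f.max())
for _ in range(500):
    u_next = np.maximum(f, alpha * (P @ u))   # u_{n+1}(x) = max{ f(x), alpha * sum_y p(x,y) u_n(y) }
    if np.allclose(u_next, u):
        break
    u = u_next

print(u)   # about (2.57, 2.57, 3, 4, 5, 0): with discounting it becomes strictly better to stop on a 3
```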