Mathematics

Markov Chains

Markov chains are mathematical structures which describe a sequence of states. Transitions can occur between states, and the probability of a state transitioning into another state is determined only by the current state itself. In this sense, the Markov chain is memoryless. Markov chains are extremely important in quantitative finance and are applied in a wide range of areas, such as asset price modelling, portfolio management, and risk management. Markov chains can be used to solve a wide range of problems and their use is commonplace in technical interview questions. Therefore, a strong working knowledge of Markov chains is an indispensable tool for landing a job and excelling in the industry.

Consider the discrete state space $S=\{S_1, S_2, ..., S_n\}$. For each pair of states, $S_i$ and $S_j$, there is an associated transition probability for the transition $S_i \rightarrow S_j$. The transition probabilities between all pairs of states can be organised into a {\bf transition matrix}. For a state space consisting of $n$ states, the transition matrix is an $n \times n$ matrix where each entry $T_{ij}$ represents the probability of the transition $S_i \rightarrow S_j$ occurring. For example, the transition matrix of two states $A$ and $B$ $$ T = \begin{pmatrix} 0.8 & 0.2\\ 0.7 & 0.3\\ \end{pmatrix} $$ represents a Markov chain where the probabilities of transition between A and B are $$ p_{A \rightarrow A} = 0.8, $$ $$ p_{A \rightarrow B} = 0.2, $$ $$ p_{B \rightarrow A} = 0.7, $$ $$ p_{B \rightarrow B} = 0.3. $$ An intuitive way to visualise a Markov chain is with a Markov chain diagram. A Markov chain diagram represents each state as a node on a graph, with connected states being linked by an arrow. Each transition arrow is labelled with the corresponding transition probability.

There are typically two types of Markov chain questions asked in technical interviews. The first are questions which involve solving for the expected number of transitions before first reaching some state. Consider a Markov chain with state space $S = \{ S_1, S_2, ..., S_n \}$. Let $A$ be a set of states which is a subset of $S$. Let $T$ be the first time the chain visits a state in $A$. For all states in $S$, we can define the expected number of transitions before reaching a state in $A$, $$ t_i = \text{E}\left[T \hspace{1mm} | \hspace{1mm} X = S_i\right]. $$ Clearly, if a state is in $A$ then $t_i = 0$. The expected number of transitions until we reach a state in $A$ is $$ t_i = 1 + \sum_k t_k p_{ik} $$

$$ t_A = 1 + \sum_{k} t_k p_{ik} $$ $$ t_A = 1 + t_A p_{AA} + t_B p_{AB} = 1 + 0.8p_{AA} $$ Clearly, $t_{BB} = 0$. Therefore, we can solve the above equation to get $$ E[A] = 5. $$

Now, let's consider a more challenging problem - the expected number of die rolls until we roll two consecutive sixes. The Markov chain for this problem is shown below.

The sample space is $\{X_S, X_6, X_{66} \}$, denoting the starting node, the state where we have just rolled the first six, and the the state where we have just rolled two consecutive sixes respectively. The quantity we want to calculate is $$ t_S = \text{E}\left[ T \hspace{1mm} | \hspace{1mm} X = X_S \right] $$ So, let's set up the equations, keeping in mind that $t_{66} = 0$, $$ t_S = 1 + t_6 p_{S \rightarrow 6} + t_{S} p_{S \rightarrow S} = 1 + \frac{1}{6} t_6 + \frac{5}{6} t_{S}, $$ $$ t_6 = 1 + t_{66} p_{6 \rightarrow 66} + t_{S} p_{6 \rightarrow S} = 1 + \frac{5}{6} t_{S}. $$ We solve these linear equations and get $t_S = 42$. It takes 42 dice rolls on average to get two sixes in a row.

Now let's consider a follow-up question. What is the expected number of dice rolls to roll a $1$ and then roll a $6$ immediately afterwards? Before we look at the solution, you should seriously consider this question. Is the expected number of dice rolls to roll a $1$ and then a $6$ more, less or equal to rolling two consecutive sixes, and why? The difference between these two questions becomes clear when looking at the Markov chain diagram. Try drawing out the Markov chain yourself. The solution is displayed below.

We see that the intermediate state has a transition back to itself. This transition occurs when we roll a $1$ immediately after rolling a $1$. In light of this, do you expect the expected number of rolls to be greater, less than or equal to $42$? Why?

$$ t_S = \text{E}\left[ T \: | \: X = X_S \right] $$ $$ t_S = 1 + t_6 p_{S \rightarrow 1} + t_{S} p_{S \rightarrow S} = 1 + \frac{1}{6} t_1 + \frac{5}{6} t_{S} $$ $$ t_1 = 1 + t_{S} p_{1 \rightarrow S} + t_{6} p_{1 \rightarrow 6} + t_{1} p_{1 \rightarrow 1} = 1 + \frac{4}{6} t_{S} + \frac{1}{6}t_{1} $$ Solving these simultaneous equations give us $t_S = 36$. The expected number of die rolls to roll a $1$ and then a $6$ is less than rolling two consecutive sixes because the probability of returning to the starting state in the former case is less, with $1/6$ probability of returning to intermediate state instead.

Now let's consider the other type of question: the probability of absorption by a particular state in a Markov chain. We describe a state as absorbing if it has the property $$ P_{S_i \rightarrow S_j} = 0 \hspace{2mm} \text{for all} \hspace{2mm} j \neq i. $$ That is, the probability of transition from an absorbing state $S_i$ to another state $S_j$ is zero for all states except for itself. These are the final states where no further transitions to a different state can occur. Suppose that we have a Markov chain with sample space $S = \{A_1, A_2, .., A_n, S_1, S_2, ..., S_n\}$. The subset $A = \{ A_1, A_2,...,A_n \}$ are absorbing states which are accessible from some of the states $\{ S_1, S_2, ... S_n\}$. The question we are concerned with solving is - what is the probability of absorption at $A_1$ from some initial state $S_i$? Let us denote the probability of absorption at $A_1$ from an initial state $S_i$ as $a_i$, $$ a_i = P( \text{absorption at } A_1 \: | \: X = S_i) $$ The key idea is that if the current state is $S_i$, then the next state will be $S_j$ with probability $p_{ij}$. Therefore we can write $a_i$ as a function of other absorption probabilities $$ a_i = \sum_k a_k p_{ik} $$ We can derive a set of simultaneous equations of this form, and solve for $a_i$.

Consider a Markov chain consisting of 4 available states $\{ A, B, C, D\}$, which has transition matrix $$ T = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0.5 & 0 & 0.5 & 0\\ 0 & 0.5 & 0.2 & 0.5\\ 0 & 0 & 0 & 1\\ \end{pmatrix} $$

We want to calculate $$ a_C = \sum_k a_k p_{Ck} $$ $$ a_C = a_A p_{CA} + a_B p_{CB} + a_C p_{CC} + a_D p_{CD} $$ where $a_A = 1$, $a_D = 0$. $$ a_C = 0.5 a_B + 0.2 a_C $$ $$ a_B = 0.5 + 0.2 a_C $$ Solving gives us $$ a_C = 0.5 (0.5 + 0.2 a_C) + 0.2 a_C $$ $$ a_c = 35.7\%. $$