Let \(X_0\) be a random variable taking values in a countable set \(I\), and let \(Y_1, Y_2, \ldots\) be independent and identically distributed (say \(Y_i \sim U[0,1]\)) and independent of \(X_0\). Suppose \(G: I \times [0,1] \to I\) is a function, and define \(X_{i+1} = G(X_i, Y_{i+1})\). The problem (from Richard Weber’s Markov chain course) is to prove that \((X_i)_{i\geq 0}\) is a Markov chain - the point being that by choosing \(G\) and the distribution of \(X_0\) appropriately we can simulate any Markov chain, given only the ability to draw from the distribution of \(X_0\) and to draw uniform random numbers.
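
To make the simulation remark concrete, here is a minimal sketch. The transition matrix `P` and the inverse-CDF choice of \(G\) below are my own illustrative assumptions, not part of the problem: \(G(i, u)\) reads off row \(i\) of `P` via its cumulative sums, so that \(G(i, Y)\) is distributed according to that row when \(Y \sim U[0,1]\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical transition matrix on I = {0, 1, 2}; each row sums to 1.
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.3, 0.7]])

def G(i, u):
    # Inverse CDF of row i: the smallest k with u < P[i, 0] + ... + P[i, k],
    # so that G(i, Y) = k with probability P[i, k] when Y is uniform on [0, 1).
    return int(np.searchsorted(np.cumsum(P[i]), u, side="right"))

# X_{m+1} = G(X_m, Y_{m+1}) with Y_1, Y_2, ... i.i.d. uniform, independent of X_0.
X = [int(rng.integers(3))]  # X_0 drawn uniformly from I, say
for _ in range(10):
    X.append(G(X[-1], rng.uniform()))
print(X)
```

With this choice of \(G\), each \(X_{m+1}\) is drawn from row \(X_m\) of `P`, which is the sense in which the construction simulates an arbitrary chain.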

Unpacking the definitions, the first step is to show the Markov property at time 1 - that is, writing \(X_1 = G(X_0, Y_1)\) and \(X_2 = G(X_1, Y_2) = G(G(X_0, Y_1), Y_2)\), to show that for any \(i,j,k \in I\),

\[\begin{gather*} P(G(G(X_0, Y_1), Y_2) = k | X_0 = i, G(X_0, Y_1) = j) =\\ P(G(G(X_0, Y_1), Y_2)=k | G(X_0, Y_1) = j).\end{gather*}\]

This is “intuitively” clear because once we know \(G(X_0, Y_1) = j\), the probability that \(G(j,Y_2) = k\) isn’t affected by the value of \(X_0\). I can’t find a nice formal proof though. Here’s an attempt.

First we note that this would follow if we knew the events \(X_0=i\) and \(G(G(X_0, Y_1), Y_2) =k\) were conditionally independent given \(G(X_0, Y_1) = j\).
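
To spell out why that suffices, write \(A\) for the event \(X_0 = i\), \(B\) for \(G(G(X_0, Y_1), Y_2) = k\) and \(E\) for \(G(X_0, Y_1) = j\), and assume \(P(A \cap E) > 0\) so that the conditional probability on the left of the display above makes sense. Then

\[\begin{align*} P(B | A \cap E) &= \frac{P(A \cap B \cap E)}{P(A \cap E)} = \frac{P(A \cap B | E)}{P(A | E)} \\ &= \frac{P(A | E)\,P(B | E)}{P(A | E)} = P(B | E), \end{align*}\]

where the second equality divides top and bottom by \(P(E)\) and the third uses the conditional independence.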

Let \(E\) be the event \(G(X_0, Y_1) = j\). The fact that \(X_0\) and \(Y_2\) are independent doesn’t immediately imply that they are conditionally independent given \(E\), but they are. Since \(Y_2\) is uniform, individual values of \(Y_2\) have probability zero, so we work with events of the form \(Y_2 \in B\); the one we will actually need is \(G(j, Y_2) = k\), which is \(Y_2 \in B\) for \(B = \{y : G(j,y) = k\}\). For any \(x \in I\) and any measurable \(B \subseteq [0,1]\),

\[\begin{align*} P(X_0=x, Y_2 \in B | E) &= \frac{P(X_0=x, E, Y_2 \in B)}{P(E)} \\ &= \frac{P(Y_2 \in B) P(X_0=x, E)}{P(E)} \\ &= P(Y_2 \in B | E) P(X_0=x | E) \end{align*}\]

where the last two equalities hold because \(Y_2\) is independent of \((X_0, Y_1)\), and both the event \(X_0 = x\) and the event \(E\) are determined by \((X_0, Y_1)\).

(There are some nice examples around conditional independence at this MSE question and in the sidebar there. In general, independence of \(A\) and \(B\) needn’t imply they’re conditionally independent given \(C\), because \(C\) might give you some information connecting them - in Nate Eldridge’s example at the link, \(A\) is one coin being heads, \(B\) is another coin being heads, and \(C\) is the event that the coins show the same face. In our case we are safe because \(Y_2\) is independent of the pair \((X_0, Y_1)\), and both \(X_0\) and the event \(E\) are functions of that pair.)

Now we can finish:

\[\begin{align*} P( X_0 = i, G(G(X_0, Y_1), Y_2)= k | E) & = P(X_0=i, G(j, Y_2)=k | E) \\ &= P(X_0 = i | E) P(G(j,Y_2) = k |E) \end{align*}\]

where the first equality holds because \(G(X_0, Y_1) = j\) on the event \(E\), and the second is the conditional independence of \(X_0\) and \(Y_2\) given \(E\), applied with \(B = \{y : G(j,y) = k\}\). By the remark above, this is exactly what we wanted.
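
For what it’s worth (this isn’t needed for the Markov property, but it is the payoff promised at the start), the same independence identifies the transition probabilities. Since \(Y_2\) is independent of \(E\),

\[ P(G(j, Y_2) = k | E) = P(Y_2 \in B | E) = P(Y_2 \in B) = P(G(j, Y_2) = k), \]

so the transition probability from \(j\) to \(k\) is just the Lebesgue measure of \(B = \{y : G(j,y) = k\}\), and choosing \(G(j, \cdot)\) to partition \([0,1]\) into intervals of lengths \(p_{j1}, p_{j2}, \ldots\) realises any prescribed transition matrix \((p_{jk})\).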

I guess it’s not as bad as all that.