However, they have their own issues.
But obviously, we need to massage this a little bit, because if it were trivial, someone else would have done it already!
This is where they differ from traditional "linear" kernels, which depend only on the current state of the Markov chain.
\[x_n = \frac{1}{n+1} \sum^{n}_{k=0} \delta_{X_k}\]
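where $\delta_{X_k}$ is the Dirac measure at $X_k$, i.e. a unit point mass sitting on the state the chain visited at step $k$. In other words, $x_n$ is the empirical measure of the chain's history up to step $n$.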
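To make the formula above concrete, here is a minimal sketch for a chain on a small discrete state space. It treats the sum as "put weight $1/(n+1)$ on every state visited so far". The names `empirical_measure`, `integrate`, and the toy `chain` are made up for illustration and do not come from the paper.

```python
from collections import Counter
from fractions import Fraction


def empirical_measure(samples):
    """Return {state: mass} for (1/(n+1)) * sum_k delta_{X_k}.

    `samples` is the history X_0, ..., X_n, so it has n + 1 entries.
    """
    total = len(samples)
    counts = Counter(samples)
    # Each visit contributes a point mass of 1/(n+1) at the visited state.
    return {state: Fraction(count, total) for state, count in counts.items()}


def integrate(f, measure):
    """Integrate f against a discrete measure (a weighted sum)."""
    return sum(f(state) * mass for state, mass in measure.items())


# Hypothetical history X_0..X_3 of a chain on the states {0, 1, 2}.
chain = [0, 1, 1, 2]
mu = empirical_measure(chain)
mean_state = integrate(lambda x: x, mu)
```

Integrating a function against this measure is just averaging that function over the chain's history, which is why the empirical measure is a natural stand-in for the chain's distribution.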
If we set this parameter too high, we might be too strict in our selection of new states.
Similarly, if we set it too low, we will basically be selecting states at random.
Which is incredible!
I don't blame the authors for this, though. What they did was complex and hard, and I came into this paper without the background required to understand it.