refs #92 MCMC
AnthonyLim23 committed Nov 14, 2024
1 parent 8c36e1c commit 2ac1d26
docs/source/cf_methods.rst
To evaluate the odds factor, the probability of the data given the model needs to be calculated.
This is written as

.. math::
   :label: P_int

   P(D | M) = \int_\Omega d\underline{\theta} \quad P(D | \underline{\theta}, M) P(\underline{\theta} | M)

where the integral over :math:`\Omega` is over the available parameter space for :math:`\underline{\theta}`.
This quantity can be evaluated using either Markov Chain Monte Carlo (MCMC) or nested sampling.
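As a simple illustration of what such an evaluation involves, the evidence integral can be approximated by naive Monte Carlo: draw parameter values from the prior and average the likelihood. The sketch below assumes a hypothetical one-dimensional model with a Gaussian likelihood and a flat prior over :math:`\Omega = [-5, 5]`; all names and values are illustrative, not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data set, drawn around a "true" parameter value of 1.0
data = rng.normal(loc=1.0, scale=0.5, size=20)

def likelihood(theta):
    # P(D | theta, M): independent Gaussian measurements with known width 0.5
    return np.prod(np.exp(-0.5 * ((data - theta) / 0.5) ** 2)
                   / (0.5 * np.sqrt(2.0 * np.pi)))

# Flat prior P(theta | M) over Omega = [-5, 5]
theta_samples = rng.uniform(-5.0, 5.0, size=50_000)

# P(D | M) is approximated by the average likelihood under the prior
evidence = np.mean([likelihood(t) for t in theta_samples])
print(np.isfinite(evidence) and evidence > 0.0)
```

Naive sampling like this scales poorly with the number of parameters, which is why methods such as MCMC or nested sampling are used in practice.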



Markov Chain Monte Carlo (MCMC)
-------------------------------

The integral in equation :math:numref:`P_int` typically requires a numerical method to evaluate it.
Markov Chain Monte Carlo (MCMC) is a method that uses random walkers to estimate the probability distribution.
The MCMC has two main components: the first defines how the walkers select their new positions and the second determines whether to accept the new values.
There are several options for each of these parts, leading to numerous possible MCMC simulations.
For this section, the differential evolution and Metropolis Hastings methods will be considered.
These methods are relatively straightforward and provide insight into MCMC methods.

The differential evolution algorithm provides an equation that determines how the random walkers will be updated.
Each walker describes the potential parameters for the model, :math:`\underline{\theta}_i^t`, where :math:`i` is used to label the different walkers and :math:`t` labels the step of the algorithm.
The walkers then evolve to their new positions according to the equation

.. math::
   :label: MCMCDE

   \underline{\theta}_i^{t+1} = \underline{\theta}_i^t + \gamma(\underline{\theta}_j^t - \underline{\theta}_k^t) + \underline{\epsilon}

where :math:`i\ne j\ne k`, :math:`\gamma` gives the strength of the coupling between the walkers and :math:`\underline{\epsilon}` provides a random change to the parameters.
The first term provides some history to the method, so the new parameter values are not completely random.
This prevents the walker from picking new guesses that are significantly worse than the current one.
The second term can be thought of as a diffusion term: it determines how much the walker will move.
It depends on the distance between two different walkers, multiplied by the scalar :math:`\gamma`.
So if the MCMC has a bad estimate of the PDF, then the distance between the walkers is probably large.
Hence, the next guess will move further in an attempt to find a better set of parameters.
This part of the algorithm is described as the burn in period, and is the time taken for the walkers to find a good PDF.
If the walkers provide a good description of the PDF, then the difference between them is small.
This results in the walker staying close to its original position.
These competing effects of the second term, which depend on the distance between two walkers, pull the walkers towards good parameter values.
The final term adds a small amount of randomness to the position update.
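The update in equation :math:numref:`MCMCDE` can be sketched as a small function. The walker count, dimensionality, :math:`\gamma` value and noise scale below are illustrative assumptions, not values from the source.

```python
import numpy as np

rng = np.random.default_rng(1)

def de_proposal(walkers, i, gamma=0.5, eps_scale=1e-3):
    """Propose a new position for walker i via differential evolution."""
    n = len(walkers)
    # pick two distinct walkers j and k, both different from i
    j, k = rng.choice([m for m in range(n) if m != i], size=2, replace=False)
    # random perturbation term (the epsilon in the update equation)
    eps = rng.normal(scale=eps_scale, size=walkers.shape[1])
    return walkers[i] + gamma * (walkers[j] - walkers[k]) + eps

walkers = rng.uniform(-5.0, 5.0, size=(10, 2))  # 10 walkers, 2 parameters
proposal = de_proposal(walkers, i=0)
print(proposal.shape)  # (2,)
```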

Once a walker has a new position, it will not automatically move to it.
Instead it has a finite probability of its values being updated.
One method for determining if the walker position should be updated is the Metropolis Hastings algorithm.
For a given walker position it is possible to calculate the likelihood (typically a Gaussian).
The likelihoods are calculated for both the proposed new (:math:`\underline{\theta}_i^{t+1}`) and current (:math:`\underline{\theta}_i^t`) walker positions.
Then the ratio is taken and compared to a random number drawn uniformly from :math:`[0, 1]`.
The proposed value is accepted if the random number is smaller than the ratio.
As a result the walker is not guaranteed to move to the new position if its current values are already good.
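The acceptance step above can be sketched as follows; working with log-likelihoods is a common numerical-stability choice and an assumption on my part, not something stated in the source.

```python
import numpy as np

rng = np.random.default_rng(2)

def metropolis_accept(log_like_new, log_like_old):
    """Metropolis Hastings test: accept with probability min(1, L_new / L_old)."""
    # the likelihood ratio, computed via log-likelihoods for stability
    ratio = np.exp(min(0.0, log_like_new - log_like_old))
    # accept if a uniform random number falls below the ratio
    return bool(rng.uniform() < ratio)

# a proposal with a higher likelihood is always accepted (ratio = 1)
print(metropolis_accept(-1.0, -5.0))  # True
```

A proposal with a lower likelihood is still accepted occasionally, which lets walkers escape local maxima.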

The combined effect of the two algorithms is that the walkers will eventually converge onto a distribution that describes the PDF.
To make this tractable, a few extra conditions are needed.
The first is to limit the probability space, by placing limits on the potential parameter values.
This is the prior for the problem and the starting distribution for the walkers is normally flat across the parameter space.
The second is to define the behaviour of the walkers at the boundaries of the parameter space.
Typically they are chosen to be either reflective or a periodic boundary.
For a Metropolis Hastings algorithm both options are suitable, because the results are independent of the path taken by the walker.
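The two boundary behaviours can be sketched as a single helper; the interval and test values below are illustrative assumptions.

```python
import numpy as np

def apply_boundary(theta, low, high, mode="reflective"):
    """Map an out-of-bounds proposal back into [low, high]."""
    width = high - low
    if mode == "periodic":
        # wrap around: a walker leaving one side re-enters on the other
        return low + np.mod(theta - low, width)
    # reflective: fold the excursion back across the boundary
    t = np.mod(theta - low, 2.0 * width)
    return low + np.where(t > width, 2.0 * width - t, t)

print(apply_boundary(np.array([5.7]), 0.0, 5.0))              # reflected to 4.3
print(apply_boundary(np.array([5.7]), 0.0, 5.0, "periodic"))  # wrapped to 0.7
```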

MCMC will eventually give a very good representation of the PDF, even when that PDF is complex.
However, it can be very computationally expensive to evaluate due to the burn in period.
It is also difficult to know prior to the calculation how long the burn in period should be.
Hence, it can be too short leading to poor results or too long wasting valuable computational time.
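Putting the two pieces together, a minimal differential evolution MCMC loop might look like the sketch below. The Gaussian target, :math:`\gamma` value, noise scale, walker count and step count are all illustrative assumptions, not values from the source.

```python
import numpy as np

rng = np.random.default_rng(3)

def log_like(theta):
    # toy Gaussian log-likelihood centred on 1.0 (hypothetical target PDF)
    return -0.5 * float(np.sum((theta - 1.0) ** 2))

n_walkers, n_dim = 20, 1
# flat starting distribution across the bounded parameter space
walkers = rng.uniform(-5.0, 5.0, size=(n_walkers, n_dim))

for step in range(1500):  # includes the burn in period
    for i in range(n_walkers):
        # differential evolution proposal built from two other walkers
        j, k = rng.choice([m for m in range(n_walkers) if m != i],
                          size=2, replace=False)
        proposal = (walkers[i] + 0.7 * (walkers[j] - walkers[k])
                    + rng.normal(scale=0.05, size=n_dim))
        # Metropolis Hastings acceptance on the log-likelihood ratio
        if np.log(rng.uniform() + 1e-300) < log_like(proposal) - log_like(walkers[i]):
            walkers[i] = proposal

# after burn in, the walkers should cluster near the peak of the PDF
print(abs(float(walkers.mean()) - 1.0) < 1.0)
```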


`MCMC <https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo/>`_
