Merge pull request #128 from TuringLang/change_api_function_names

Change api function names
TuringLang · May 31, 2021 · 568fa9c · 568fa9c · HarrisonWilde · May 31, 2021
2 parents 92703ea + bfe9f36
commit 568fa9c
Show file tree

Hide file tree

Showing 5 changed files with 87 additions and 49 deletions.
diff --git a/Project.toml b/Project.toml
@@ -1,7 +1,7 @@
 name = "MCMCTempering"
 uuid = "ce233488-44ea-4441-b732-192676ce2298"
 authors = ["Harrison Wilde <[email protected]> and contributors"]
-version = "0.1.0"
+version = "0.1.1"
 
 [deps]
 AbstractMCMC = "80f14c24-f653-4e6a-9b94-39d6b0f70001"

diff --git a/README.md b/README.md
@@ -9,6 +9,20 @@ We offer support for `MCMCTempering`'s sampling functionality through extrogenou
 
 To illustrate this, we step through an example showing all of the work required in order to get `AdvancedHMC` working with `MCMCTempering`.
 
+## First Steps
+
+We should define a new file, `tempering.jl` is a reasonable name and the choice we made for all of our supported packages, to contain the `MCMCTempering` function implementations. All of the functions discussed below are to be written in this file, and then should be exported such that they can be accessed during sampling. To do this correctly, `MCMCTempering` should also be added as a project dependency to the sampling package in question. Then something along the lines below should be included such that the implemented API can be accessed by the samplers of the package during the calls to `MCMCTempering`'s functionality:
+
+```julia
+##### ...
+import MCMCTempering
+include("tempering.jl")
+export Joint, TemperedJoint, make_tempered_model, make_tempered_loglikelihood, get_params
+##### ...
+```
+
+With this in mind, we should populate `tempering.jl` with the aforementioned implementations. The first major consideration to deal with is how to carry through all of the required tempering information upon calling `sample` on our sampler.
+
 ## Tempering a sampler
 
 Firstly, observing the signature of the generic `sample` call exposed by `AbstractMCMC`, we see that we minimally require a `model`, a `sampler` and other args such as the number of samples `N` to return, etc. Given the aforementioned base assumption that your sampler should conform to `AbstractMCMC`'s `sample` and `step` structure, it is sufficient here to ensure these methods will be called as expected. In general, we carry through the tempering schedule and other information via the `sampler` as this can be presumed to be present at each step of a sampling routine; to do this in `AdvancedHMC`'s case, we must circumnavigate the internal definition of `AbstractMCMC.sample` (which requires a user to provide a `kernel`, `metric` and `adaptor`) to build a `sampler` object ourselves (in this case we want to temper the `HMCSampler` from `AdvancedHMC` which is a struct containing the aforementioned three components), this can be done as in this minimal working example based on standard usage of the `AdvancedHMC` package. These first lines are to setup sampling, as in any other standard usage of `AdvancedHMC`:
@@ -53,90 +67,114 @@ So usage is fairly simple provided we stick with the expected `AbstractMCMC.samp
 
 ## Stepping using the sampler and a tempered model
 
-The first requirement for tempering is to be able to call the `model`'s log likelihood density function in product with an inverse temperature. For this we define the `make_tempered_model` function that returns an instance of the relevant model type; in `AdvancedHMC`'s case this is a `DifferentiableDensityModel`, but should be whatever model type your sampler expects, where the internals of the `model`'s log likelihood are adjusted according to an inverse temperature multiplier `β`:
+The first requirement for tempering is to be able to call the `model`'s **log-likelihood** density function in product with an inverse temperature. For this we define the `make_tempered_model` function that returns an instance of the relevant model type - in `AdvancedHMC`'s case this is a `DifferentiableDensityModel` (but this should of course be whatever model type your sampler expects) - where the internals of the `model`'s log-likelihood are adjusted according to an inverse temperature multiplier `β`. To achieve the desired functionality and act *only* on the log-likelihood whilst leaving the log-prior as is, we define two callabale structs `Joint` and `TemperedJoint` containing a log-prior and log-likelihood (and a temperature `β`). This is necessary so as to adhere correctly with `AdvancedHMC`'s expectation of a `model` that wraps only a **joint** density function rather than the two components we require for tempering, and the `Joint` structs allow us to by default return the value of the log-joint as expected uon calling the `model`'s density:
 
 ```julia
-function MCMCTempering.make_tempered_model(model::DifferentiableDensityModel, β::T) where {T<:AbstractFloat}
-    ℓπ_β(θ) = model.ℓπ(θ) * β
-    ∂ℓπ∂θ_β(θ) = model.∂ℓπ∂θ(θ) * β
-    model = DifferentiableDensityModel(ℓπ_β, ∂ℓπ∂θ_β)
-    return model
+struct Joint{Tℓprior, Tℓll} <: Function
+    ℓprior      :: Tℓprior
+    ℓlikelihood :: Tℓll
+end
+
+function (joint::Joint)(θ)
+    return joint.ℓprior(θ) .+ joint.ℓlikelihood(θ)
+end
+
+
+struct TemperedJoint{Tℓprior, Tℓll, T<:Real} <: Function
+    ℓprior      :: Tℓprior
+    ℓlikelihood :: Tℓll
+    β           :: Real
+end
+
+function (tj::TemperedJoint)(θ)
+    return tj.ℓprior(θ) .+ (tj.ℓlikelihood(θ) .* tj.β)
+end
+
+
+function MCMCTempering.make_tempered_model(
+    model::DifferentiableDensityModel,
+    β::Real
+)
+    ℓπ_β = TemperedJoint(model.ℓπ.ℓprior, model.ℓπ.ℓlikelihood, β)
+    ∂ℓπ∂θ_β = TemperedJoint(model.∂ℓπ∂θ.ℓprior, model.∂ℓπ∂θ.ℓlikelihood, β)
+    model_β = DifferentiableDensityModel(ℓπ_β, ∂ℓπ∂θ_β)
+    return model_β
 end
 ```
 
 This is all that is required to ensure `MCMCTempering`'s functionality injects between each step and ensures each chain in our implementation of parallel tempering can step according to the correct inverse temperature `β`.
 
 ## Carrying out temperature swap steps
 
-For the tempering specific "swap steps" between pairs of chains' temperature levels, we must first offer a way to temper the *density* of the model; for this we implement the `make_tempered_logπ` function which accepts the `model` and a temperature `β`; then it returns a function `logπ(z)` which is a transformation of the `model`'s log likelihood function:
+For the tempering specific "swap steps" between pairs of chains' temperature levels, we must similarly offer a way to temper the **log-likelihood** of the model independently from the model itself; for this we implement the `make_tempered_loglikelihood` function which accepts the `model` and a temperature `β`; then it returns a function `logπ(z)` which is a transformation of the `model`'s log-likelihood function contained in the aforementioned `Joint`:
 
 ```julia
-function MCMCTempering.make_tempered_logπ(
+function MCMCTempering.make_tempered_loglikelihood(
     model::DifferentiableDensityModel,
-    β::T
-) where {T<:AbstractFloat}
+    β::Real
+)
     function logπ(z)
-        return model.ℓπ(z) * β
+        return model.ℓπ.ℓlikelihood(z) * β
     end
     return logπ
 end
 ```
 
-Access to the current proposed parameter values is required, and this should be a relatively simple getter function accessing the current `state` of the sampler in most cases to return `θ`:
+Access to the current proposed parameter values is required, and this should be a relatively simple getter function, accessing the current `state` of the sampler in most cases, to then return `θ`:
 
 ```julia
-function MCMCTempering.get_θ(trans::Transition)
+function MCMCTempering.get_params(trans::Transition)
     return trans.z.θ
 end
 ```
 
-Both of these parts should then be used in a function called `get_densities_and_θs` that returns the densities and parameters for the `k`th and `k+1`th chains, the interface is built in this way as the requirements for accessing the two aforementioned components can reasonably change, with some samplers being built such that they require state information, sampler information, model information etc. to access these properties, this allows for flexibility in implementation.
+Both of these parts should then be used in a function called `get_tempered_loglikelihoods_and_params` that returns the densities and parameters for the `k`th and `k+1`th chains, the interface is built in this way as the requirements for accessing the two aforementioned components can reasonably change, with some samplers being built such that they require state information, sampler information, model information etc. to access these properties, this allows for flexibility in implementation.
 
-In this case, the implementation of `get_densities_and_θs` is relatively simple, in fact, the code below is the default "fallback" implementation of this function and so provided your sampler submits to this fairly standard set of arguments you do not need to implement this method, as is the case for `AdvancedHMC`:
+In this case, the implementation of `get_tempered_loglikelihoods_and_params` is relatively simple, in fact, the code below is the default "fallback" implementation of this function and so provided your sampler submits to this fairly standard set of arguments you do not need to implement this method at all, as is the case for `AdvancedHMC`:
 
 ```julia
-function get_densities_and_θs(
+function get_tempered_loglikelihoods_and_params(
     model,
     sampler::AbstractMCMC.AbstractSampler,
     states,
     k::Integer,
-    Δ::Vector{T},
+    Δ::Vector{Real},
     Δ_state::Vector{<:Integer}
-) where {T<:AbstractFloat}
+)
 
-    logπk = make_tempered_logπ(model, Δ[Δ_state[k]])
-    logπkp1 = make_tempered_logπ(model, Δ[Δ_state[k + 1]])
+    logπk = make_tempered_loglikelihood(model, Δ[Δ_state[k]])
+    logπkp1 = make_tempered_loglikelihood(model, Δ[Δ_state[k + 1]])
 
-    θk = get_θ(states[k][1])
-    θkp1 = get_θ(states[k + 1][1])
+    θk = get_params(states[k][1])
+    θkp1 = get_params(states[k + 1][1])
 
     return logπk, logπkp1, θk, θkp1
 end
 ```
 
-In some cases we do need to override this functionality. This is necessary in `Turing.jl` for example where the `sampler` and `VarInfo` are required to access the density and parameters resulting in the following implementations:
+Note that in some cases we **do** need to override this functionality. This is necessary in `Turing.jl` for example where the `sampler` and `VarInfo` are required to access the density and parameters resulting in the following implementations:
 
 ```julia
-function MCMCTempering.get_densities_and_θs(
+function MCMCTempering.get_tempered_loglikelihoods_and_params(
     model::Model,
     sampler::Sampler{<:TemperedAlgorithm},
     states,
     k::Integer,
-    Δ::Vector{T},
+    Δ::Vector{Real},
     Δ_state::Vector{<:Integer}
-) where {T<:AbstractFloat}
+)
 
-    logπk = MCMCTempering.make_tempered_logπ(model, Δ[Δ_state[k]], sampler, get_vi(states[k][2]))
-    logπkp1 = MCMCTempering.make_tempered_logπ(model, Δ[Δ_state[k + 1]], sampler, get_vi(states[k + 1][2]))
+    logπk = MCMCTempering.make_tempered_loglikelihood(model, Δ[Δ_state[k]], sampler, get_vi(states[k][2]))
+    logπkp1 = MCMCTempering.make_tempered_loglikelihood(model, Δ[Δ_state[k + 1]], sampler, get_vi(states[k + 1][2]))
 
-    θk = MCMCTempering.get_θ(states[k][2], sampler)
-    θkp1 = MCMCTempering.get_θ(states[k + 1][2], sampler)
+    θk = MCMCTempering.get_params(states[k][2], sampler)
+    θkp1 = MCMCTempering.get_params(states[k + 1][2], sampler)
 
     return logπk, logπkp1, θk, θkp1
 end
 
 
-function MCMCTempering.make_tempered_logπ(model::Model, β::T, sampler::DynamicPPL.Sampler, varinfo_init::DynamicPPL.VarInfo) where {T<:AbstractFloat}
+function MCMCTempering.make_tempered_loglikelihood(model::Model, β::Real, sampler::DynamicPPL.Sampler, varinfo_init::DynamicPPL.VarInfo)
 
     function logπ(z)
         varinfo = DynamicPPL.VarInfo(varinfo_init, sampler, z)
@@ -150,7 +188,7 @@ end
 get_vi(state::Union{HMCState,GibbsState,EmceeState,SMCState}) = state.vi
 get_vi(vi::DynamicPPL.VarInfo) = vi
 
-MCMCTempering.get_θ(state, sampler::DynamicPPL.Sampler) = get_vi(state)[sampler]
+MCMCTempering.get_params(state, sampler::DynamicPPL.Sampler) = get_vi(state)[sampler]
 ```
 
 At this point we have implemented all of the necessary components such that the first code block will run and we can temper `AdvancedHMC`'s samplers successfully.
diff --git a/src/MCMCTempering.jl b/src/MCMCTempering.jl
@@ -11,7 +11,7 @@ include("model.jl")
 include("swapping.jl")
 include("plotting.jl")
 
-export Tempered, TemperedSampler, plot_swaps, make_tempered_model, get_densities_and_θs, make_tempered_logπ, get_θ
+export Tempered, TemperedSampler, plot_swaps, make_tempered_model, get_tempered_loglikelihoods_and_params, make_tempered_loglikelihood, get_params
 
 function AbstractMCMC.bundle_samples(
     ts::Vector,

diff --git a/src/swapping.jl b/src/swapping.jl
@@ -6,30 +6,30 @@ function swap_βs(Δ_state, k)
     return Δ_state
 end
 
-function make_tempered_logπ end
-function get_θ end
+function make_tempered_loglikelihood end
+function get_params end
 
 """
-    get_densities_and_θs
+    get_tempered_loglikelihoods_and_params
 
 Temper the `model`'s density using the `k`th and `k + 1`th chains' temperatures 
 selected using `Δ` and `Δ_state`. Then retrieve the parameters using the chains'
 current transitions via extracted from the collection of `states`
 """
-function get_densities_and_θs(
+function get_tempered_loglikelihoods_and_params(
     model,
     sampler::AbstractMCMC.AbstractSampler,
     states,
     k::Integer,
-    Δ::Vector{T},
+    Δ::Vector{Real},
     Δ_state::Vector{<:Integer}
-) where {T<:AbstractFloat}
+)
 
-    logπk = make_tempered_logπ(model, Δ[Δ_state[k]])
-    logπkp1 = make_tempered_logπ(model, Δ[Δ_state[k + 1]])
+    logπk = make_tempered_loglikelihood(model, Δ[Δ_state[k]])
+    logπkp1 = make_tempered_loglikelihood(model, Δ[Δ_state[k + 1]])
 
-    θk = get_θ(states[k][1])
-    θkp1 = get_θ(states[k + 1][1])
+    θk = get_params(states[k][1])
+    θkp1 = get_params(states[k + 1][1])
 
     return logπk, logπkp1, θk, θkp1
 end
@@ -56,7 +56,7 @@ end
 
 function swap_attempt(model, sampler, states, k, Δ, Δ_state)
 
-    logπk, logπkp1, θk, θkp1 = get_densities_and_θs(model, sampler, states, k, Δ, Δ_state)
+    logπk, logπkp1, θk, θkp1 = get_tempered_loglikelihoods_and_params(model, sampler, states, k, Δ, Δ_state)
 
     A = swap_acceptance_pt(logπk, logπkp1, θk, θkp1)
     U = rand(Distributions.Uniform(0, 1))

diff --git a/src/tempered.jl b/src/tempered.jl
@@ -1,7 +1,7 @@
 """
     struct TemperedSampler{A} <: AbstractSampler
         internal_sampler :: A
-        Δ                :: Vector{<:AbstractFloat}
+        Δ                :: Vector{<:Real}
         Δ_init           :: Vector{<:Integer}
         N_swap           :: Integer
         swap_strategy    :: Symbol
@@ -15,7 +15,7 @@ A `TemperedSampler` struct wraps an `internal_sampler` (could just be an algorit
 """
 struct TemperedSampler{A} <: AbstractMCMC.AbstractSampler
     internal_sampler :: A
-    Δ                :: Vector{<:AbstractFloat}
+    Δ                :: Vector{<:Real}
     Δ_init           :: Vector{<:Integer}
     N_swap           :: Integer
     swap_strategy    :: Symbol
@@ -24,7 +24,7 @@ end
 
 function Tempered(
     internal_sampler,
-    Δ::Vector{<:AbstractFloat};
+    Δ::Vector{<:Real};
     swap_strategy::Symbol = :standard,
     kwargs...
 )
@@ -61,7 +61,7 @@ end
 """
 function Tempered(
     internal_sampler,
-    Δ::Vector{<:AbstractFloat},
+    Δ::Vector{<:Real},
     swap_strategy::Symbol;
     Δ_init = collect(1:length(Δ)),
     N_swap::Integer = 1,