A PyMC and Bambi implementation of the algorithms from:
Sean Talts, Michael Betancourt, Daniel Simpson, Aki Vehtari, Andrew Gelman: “Validating Bayesian Inference Algorithms with Simulation-Based Calibration”, 2018; arXiv:1804.06788
Many thanks to the authors for providing open, reproducible code and implementations in rstan and PyStan (link).
May be pip installed from github:

```bash
pip install git+https://github.com/ColCarroll/simulation_based_calibration
```
- Define a PyMC or Bambi model. For example, the centered eight schools model:
```python
import numpy as np
import pymc as pm

data = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])

with pm.Model() as centered_eight:
    obs = pm.MutableData("obs", data)
    mu = pm.Normal('mu', mu=0, sigma=5)
    tau = pm.HalfCauchy('tau', beta=5)
    theta = pm.Normal('theta', mu=mu, sigma=tau, shape=8)
    y_obs = pm.Normal('y', mu=theta, sigma=sigma, observed=obs)
```
- Pass the model to the `SBC` class, and run the simulations. This will take a while, as it is running the model many times.

```python
sbc = SBC(centered_eight,
          num_simulations=100,  # ideally this should be higher, like 1000
          sample_kwargs={'draws': 25, 'tune': 50})
sbc.run_simulations()
```
```
79%|███████▉ | 79/100 [05:36<01:29, 4.27s/it]
```
- Plot the empirical CDF for the difference between prior and posterior. The lines should be close to uniform and stay within the oval envelope.

```python
sbc.plot_results()
```
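The uniformity claim is the main result of the paper: if $\tilde{\theta} \sim p(\theta)$ is a prior draw, $\tilde{y} \sim p(y \mid \tilde{\theta})$ is data simulated from it, and $\theta_1, \ldots, \theta_L \sim p(\theta \mid \tilde{y})$ are exact posterior draws, then the rank statistic

$$r = \sum_{l=1}^{L} \mathbb{1}[\theta_l < \tilde{\theta}]$$

is uniformly distributed over $\{0, 1, \ldots, L\}$, so systematic deviations from uniformity point to a miscalibrated sampler.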
The paper on the arXiv is very well written and explains the algorithm clearly. Morally, the example below is exactly what this library does, but it generalizes to more complicated models:
```python
with pm.Model() as model:
    x = pm.Normal('x')
    pm.Normal('y', mu=x, observed=y)  # `y` here is some observed dataset
```
Then what this library does amounts to computing:
```python
with model:
    prior_samples = pm.sample_prior_predictive(num_trials)

simulations = {'x': []}
for idx in range(num_trials):
    # one prior draw of `x` and the matching prior predictive data
    y_tilde = prior_samples['y'][idx]
    x_tilde = prior_samples['x'][idx]
    # pseudocode: re-condition the model on the simulated data
    with model(y=y_tilde):
        idata = pm.sample()
    # rank of the prior draw among the posterior draws
    simulations['x'].append((idata.posterior['x'] < x_tilde).sum())
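```

For concreteness, here is a minimal runnable sketch of the same loop for the `centered_eight` model defined above, tracking the rank of `mu`. It assumes the `obs` `MutableData` container from that example and swaps in each simulated dataset with `pm.set_data`; the `SBC` class automates this bookkeeping for you.

```python
num_trials = 10  # keep small for illustration; use many more in practice
ranks = []

with centered_eight:
    prior = pm.sample_prior_predictive(num_trials)

for idx in range(num_trials):
    # one prior draw of mu and the matching prior predictive dataset
    mu_tilde = prior.prior['mu'].sel(chain=0, draw=idx).values
    y_tilde = prior.prior_predictive['y'].sel(chain=0, draw=idx).values

    with centered_eight:
        pm.set_data({'obs': y_tilde})  # condition on the simulated data
        idata = pm.sample(draws=25, tune=50, progressbar=False)

    # rank of the prior draw among the posterior draws of mu
    ranks.append(int((idata.posterior['mu'] < mu_tilde).sum()))
```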