-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'master' into lauren_branch
- Loading branch information
Showing
33 changed files
with
1,564 additions
and
201 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
name: CI | ||
|
||
on: push | ||
|
||
jobs: | ||
RunTests: | ||
runs-on: ${{ matrix.os }} | ||
|
||
# Don't run tests when merging PRs, they would have been run already and tests | ||
# should have passed. The following checks that the committer's and author's username differ; | ||
# if so, the event was probably triggered by github actions. | ||
# NOTE: Directly editing code on GitHub on the main branch pybasses tests, and is | ||
# discouraged. If editing code on GitHub, please submit a PR by creating a temporary branch. | ||
if: | | ||
github.ref_name != github.event.repository.default_branch || | ||
github.event.head_commit.committer.username != | ||
github.event.head_commit.author.username | ||
continue-on-error: ${{ matrix.python-version == 3.6 || matrix.os == 'windows-latest' }} | ||
strategy: | ||
matrix: | ||
os: [ubuntu-latest, macos-latest, windows-latest] | ||
python-version: ['3.12'] | ||
include: | ||
- os: ubuntu-20.04 | ||
python-version: 3.6 | ||
steps: | ||
- uses: actions/checkout@v3 | ||
- name: Set up Python ${{ matrix.python-version }} | ||
uses: actions/setup-python@v4 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install . | ||
pip install pytest | ||
- name: Run Pytest | ||
run: pytest -s tests/ | ||
|
||
CreateTag: | ||
permissions: | ||
contents: write | ||
|
||
needs: RunTests | ||
runs-on: ubuntu-latest | ||
defaults: | ||
run: | ||
shell: bash | ||
|
||
# Create tag only on master branch and tests pass or tests were skipped | ||
# (i.e. skipped due to a PR merge; in that case tests should already have | ||
# passed). | ||
if: | | ||
always() && | ||
github.ref == 'refs/heads/master' && | ||
(needs.RunTests.result == 'success' || needs.RunTests.result == 'skipped') | ||
steps: | ||
- uses: actions/checkout@v3 | ||
|
||
- name: Install Python | ||
uses: actions/setup-python@v4 | ||
with: | ||
python-version: '3.12' | ||
|
||
- name: Install setuptools | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install setuptools | ||
- name: Get python package metadata | ||
id: pkg_info | ||
run: | | ||
echo "version=v$(python setup.py --version)" >> "$GITHUB_OUTPUT" | ||
- name: Get git info | ||
id: git_info | ||
run: | | ||
git fetch --tags | ||
echo "tag=$(git tag | sort -V | tail -n 1)" >> "$GITHUB_OUTPUT" | ||
- name: Configure Git | ||
run: | | ||
git config user.name "github-actions" | ||
git config user.email "[email protected]" | ||
- name: Make tag if updated | ||
if: ${{ (steps.pkg_info.outputs.version != steps.git_info.outputs.tag) }} | ||
env: | ||
pkg_version: ${{ steps.pkg_info.outputs.version }} | ||
run: | | ||
git tag "$pkg_version" | ||
git push --tags | ||
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -163,3 +163,4 @@ dmypy.json | |
# Pyre type checker | ||
.pyre/ | ||
|
||
dev/ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
test: | ||
conda run --no-capture-output -n impala python -m pytest -s |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,185 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "73a88bcb-ffcb-4b96-90e4-b8422c627f3c", | ||
"metadata": {}, | ||
"source": [ | ||
"# Evaluating the IMPALA Model Fit\n", | ||
"In this document, we introduce some IMPALA functionality for evaluating the quality of the IMPALA calibration in terms of MCMC diagnostics/convergence, prediction in the training data, and statistical coverage of corresponding uncertainties. This is a non-exhaustive set of analyses that one can use to diagnose IMPALA outputs, but it should provide a solid foundational toolkit. We refer interested users to our other tutories (e.g., troubleshooting_convergence) for additional details on specific issues." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c205167c", | ||
"metadata": {}, | ||
"source": [ | ||
"\n", | ||
"## 1. Diagnosing the MCMC Sampler\n", | ||
"IMPALA uses Markov Chain Monte Carlo sampling for parameter estimation, and users should do at minimum some basic evaluation of the performance of the sampler. Additional details are provided elsewhere; here, we just introduce some default functionality available to users with some basic visualization. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c8725fae", | ||
"metadata": {}, | ||
"source": [ | ||
"### Trace Plots\n", | ||
"Trace plots show the posterior draws for various parameters (y-axis) as a function of the number of iterations of the MCMC sampler (x-axis). Visual evaluation of these plots provides a sense of how well the sampler has converged, how correlated parameter draws are across iterations and between different parameters, how many iterations to toss out at the start of the MCMC chain as burn-in, etc. \n", | ||
"\n", | ||
"Here is an example of code one might run to plot to theta0 draws for a hierarchical calibration stored in *out*:\n", | ||
"\n", | ||
" import impala.superCal.post_process as pp\n", | ||
" pp.parameter_trace_plot(out.theta0[:,0,:], ylim=[0, 1])\n", | ||
"\n", | ||
"Here is an example trace plot. Notice the high autocorrelation evident in the last two parameters. In this calibration, the last two parameters were poorly identified with essentially uniform posteriors.\n", | ||
"![something](./images/trace_example.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "4ad5c758", | ||
"metadata": {}, | ||
"source": [ | ||
"### Performance of Parallel Tempering\n", | ||
"IMPALA uses a sophisticated parameter sampling method called parallel tempering, which allows the sampler to more easily move between local modes of the posterior distribution. Parallel tempering works by running many parallel MCMC chains with different properties and at each iteration proposing swaps between chains. One useful diagnostic of the health of the parallel tempering is the proportion of iterations in which a proposed swap between parallel chains is accepted. Ideally, we would like this acceptance rate somewhere between 20-60%. \n", | ||
"\n", | ||
"For pooled and hierarchical calibrations, a diagnostic plot can be generated for a calibration stored in *out* and a IMPALA setup object stored in *setup*:\n", | ||
"\n", | ||
" pp.total_temperature_swaps(out,setup)\n", | ||
"\n", | ||
"Here is an example output plot. The total proportion of swaps accepted is shown in the text, and the proportion of swaps between pairs of chains (indexed by temperatures) is shown in a heatmap. This calibration shows some evidence of sub-optimal tempering performance. \n", | ||
"\n", | ||
"![something](./images/tempering_example.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "9679a3d3", | ||
"metadata": {}, | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "44be0aeb", | ||
"metadata": {}, | ||
"source": [ | ||
"### Additional Convergence Diagnostics\n", | ||
"Trace plots and proportion of parallel swaps accepted are the minimal set of convergence diagnostics one might want to run. However, many more diagnostics exist. The pyMC module in Python provides some useful functions for evaluating convergence (e.g., https://pymcmc.readthedocs.io/en/latest/modelchecking.html), and these methods can be applied to many objects stored in the outputted IMPALA calibration object to evaluate convergence." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "b7aa21da", | ||
"metadata": {}, | ||
"source": [ | ||
"## 2. Plotting the Posterior Draws for Theta\n", | ||
"\n", | ||
"### Pairs Plots\n", | ||
"Pairs plots show parameter draws for pairs of parameters. A standard pairs plot can be generated using the *pairs* function, and specialized pairs plots for each calibration type can also be generated. \n", | ||
"\n", | ||
" pp.pairs(...)\n", | ||
" pp.pairwise_theta_plot_pool(...)\n", | ||
" pp.pairwise_theta_plot_hier(...)\n", | ||
" pp.pairwise_theta_plot_cluster(...)\n", | ||
"\n", | ||
"Here is an example of a plot generated by *pairwise_theta_plot_hier*. Parameter theta_i indicates experiment-specific parameter draws, theta_0 indicates the mean of the parent distribution for theta_i's, and theta_star indicates the parent distribution itself (i.e., theta_0 draws + additional noise). \n", | ||
"\n", | ||
"![something](./images/pairs_hier_example.png)\n", | ||
"\n", | ||
"### Experiment-Specific Parameter Boxplots (Hierarchical and Clustering Only)\n", | ||
"\n", | ||
"Suppose we have a setup object with a single *addVecExperiments* call and that nexp is the number of unique theta_i value/experiments entered by the single call. Here is some code that can be used to generate boxplots of the theta_i values across experiments. This type of plot is useful for identifying how variable the theta_i values are across experiments. \n", | ||
"\n", | ||
" import numpy as np\n", | ||
" import pandas as pd\n", | ||
" import seaborn as sns\n", | ||
" KEYS = np.array(pd.DataFrame(setup.bounds.keys())).flatten()\n", | ||
" theta_i = []\n", | ||
" for j in range(nexp):\n", | ||
" mat = pd.DataFrame(np.array(pd.DataFrame(sc.tran_unif(out.theta[0][:,0,j,:],setup.bounds_mat, setup.bounds.keys()).values())).T, columns = KEYS)\n", | ||
" mat['exp'] = j\n", | ||
" if j == 0:\n", | ||
" theta_i = [mat]\n", | ||
" else:\n", | ||
" theta_i.append(mat)\n", | ||
" theta_i_long = pd.concat(theta_i)\n", | ||
"\n", | ||
" fig,ax=plt.subplots(1,setup.p,figsize=(20,4))\n", | ||
" fig.tight_layout(pad=2.0)\n", | ||
" for j in range(setup.p):\n", | ||
" ax[j].set_xlabel('Experiment')\n", | ||
" ax[j].set_ylabel('')\n", | ||
" ax[j].set_title(KEYS[j]) \n", | ||
" A = sns.boxplot(data = theta_i_long, x = theta_i_long['exp'], y = theta_i_long[KEYS[j]], ax=ax[j],showfliers=False)\n", | ||
" A.set(xlabel='Experiment',ylabel='')\n", | ||
" A.set(xticklabels=[]) \n", | ||
"\n", | ||
"Here is an example output: \n", | ||
"![something](./images/thetai_example.png)\n", | ||
"\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "33249e09", | ||
"metadata": {}, | ||
"source": [ | ||
"## 3. Evaluating Predictions in Training Data\n", | ||
"Another important diagnostic is how well the calibrated model predicts the training data. Poor predictions can indicate model discrepancy, errors in implementation, etc. IMPALA provides several utilities for calculating and visualizing predictions. \n", | ||
"\n", | ||
"### Getting and Plotting Predictions\n", | ||
"The function *get_outcome_predictions_impala* can be used to generate predictions for all training data points and a user-specified set of theta values. The *post_process.py* script also has various other functions that help with plotting these predictions. \n", | ||
"\n", | ||
"For example, prediction plots specific to ModelMaterialStrength calibrations can be generated as follows:\n", | ||
"\n", | ||
" pp.ptw_prediction_plots_pool(...)\n", | ||
" pp.ptw_prediction_plots_hier(...)\n", | ||
" pp.ptw_prediction_plots_cluster(...)\n", | ||
"\n", | ||
"Generate prediction plots can be generated using the *func_prediction_plot* function. \n", | ||
"\n", | ||
"### Cross-Experiment Prediction Errors (Hierarchical and Clustering Only)\n", | ||
"\n", | ||
"One extremely useful diagnostic for identifying outliers and determining if there are subsets of experiments that behave \"differently\" than others is to calculate the predictions for experiment i using the experiment-specific theta values estimated for other experiments. If an experiment has low prediction error using its own theta_i estimate but high prediction error using other experiments' theta estimates, there is some evidence that the experiment is somehow \"different\" than the others. \n", | ||
"\n", | ||
"Here, we provide some code for generating these prediction errors in a particular setting. This code can be adapted by users to accomodate different model structures. For this demonstration, we suppose we have a setup object with a single *addVecExperiments* call and that nexp is the number of unique theta_i value/experiments entered by the single call. \n", | ||
"\n", | ||
" THETAi_Y = [get_outcom_predictions_impala(setup, theta_input = np.array(pd.DataFrame(sc.tran_unif(out.theta[0][:,0,j,:],setup.bounds_mat, setup.bounds.keys()).values())).T)['outcome_draws'] for j in range(nexp)]\n", | ||
" CROSS_ERRORS = np.empty([nexp,nexp])\n", | ||
" for i in range(nexp): \n", | ||
" for j in range(nexp):\n", | ||
" CROSS_ERRORS[i,j] = np.mean(np.abs(setup.ys[0][np.where(setup.s2_ind[0] == j)[0]] -np.mean(THETAi_Y[i][0][:,np.where(setup.s2_ind[0] == j)[0]],axis=0))/setup.ys[0][np.where(setup.s2_ind[0] == j)[0]])\n", | ||
" ax = sns.heatmap(CROSS_ERRORS.T*100, linewidths =0)\n", | ||
" ax.set_xlabel('Theta Index')\n", | ||
" ax.set_ylabel('Prediction Index')\n", | ||
" ax.set_title('Prediction Errors (%)')\n", | ||
"\n", | ||
"Here, we provide an example output. Rows correspond to experiments being predicted and columns correspond to experiment-specific theta_i values. The color in each cell corresponds to the mean absolute percent error (MAPE) for predicting the training data. This figure indicates some differing behavior between the first 7 and last 9 experiments. It also highlights experiment 6 as particularly poorly-predicted using the theta_i values obtained for experiments 7-15. \n", | ||
"\n", | ||
"![something](./images/cross_errors_example.png)\n" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.8.18" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
Oops, something went wrong.