diff --git a/README.md b/README.md
index 5896979..494ade8 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,7 @@ Get started now!
 ## What's New
+- Feb 2025: Added a post-evaluation notebook that shows how to run fine-grained model selection after running MMF. Try the [notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb).
 - Jan 2025: [TimesFM](https://github.com/google-research/timesfm) is available for univariate and covariate forecasting. Try the notebooks: [univariate](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/foundation_daily.py) and [covariate](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/external_regressors/foundation_external_regressors_daily.py).
 - Jan 2025: [Chronos Bolt](https://github.com/amazon-science/chronos-forecasting) models are available for univariate forecasting. Try the [notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/foundation_daily.py).
 - Jan 2025: [Moirai MoE](https://github.com/SalesforceAIResearch/uni2ts) models are available for univariate forecasting. Try the [notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/foundation_daily.py).
@@ -25,7 +26,7 @@ To run this solution on a public [M4](https://www.kaggle.com/datasets/yogesh94/m
 Local models are used to model individual time series. They could be advantageous over other types of model for their capabilities to tailor fit to individual series, offer greater interpretability, and require lower data requirements. We support models from [statsforecast](https://github.com/Nixtla/statsforecast), [r fable](https://cran.r-project.org/web/packages/fable/vignettes/fable.html) and [sktime](https://www.sktime.net/en/stable/). Covariates (i.e. exogenous regressors) are currently only supported for some models from statsforecast.
-To get started, attach the [examples/daily/local_univariate_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/local_univariate_daily.py) notebook to a cluster running [DBR 15.4LTS for ML](https://docs.databricks.com/en/release-notes/runtime/15.4lts-ml.html) or later versions. The cluster can be either a single-node or multi-node CPU cluster. Make sure to set the following [Spark configurations](https://spark.apache.org/docs/latest/configuration.html) on the cluster before you start using MMF: ```spark.sql.execution.arrow.enabled true``` and ```spark.sql.adaptive.enabled false``` (more detailed explanation to follow).
+To get started, attach the [examples/daily/local_univariate_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/local_univariate_daily.py) notebook to a cluster running [DBR 15.4LTS for ML](https://docs.databricks.com/en/release-notes/runtime/15.4lts-ml.html) or later versions. The cluster can be either a single-node or multi-node CPU cluster. Make sure to set the following [Spark configurations](https://spark.apache.org/docs/latest/configuration.html) on the cluster before you start using MMF: ```spark.sql.execution.arrow.enabled true``` and ```spark.sql.adaptive.enabled false``` (a more detailed explanation can be found [here](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/local_univariate_daily.py)).
 In this notebook, we will apply 20+ models to 100 time series. You can specify the models to use in a list:
@@ -110,7 +111,7 @@ run_forecast(
 To modify the model hyperparameters, change the values in [mmf_sa/models/models_conf.yaml](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/mmf_sa/models/models_conf.yaml) or overwrite these values in [mmf_sa/forecasting_conf.yaml](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/mmf_sa/forecasting_conf.yaml).
-MMF is fully integrated with MLflow, so once the training kicks off, the experiments will be visible in the MLflow Tracking UI with the corresponding metrics and parameters (note that we do not log all local models in MLFlow, but we store the binaries in the tables ```evaluation_output``` and ```scoring_output```). The metric you see in the MLflow Tracking UI is a simple mean over backtesting trials over all time series.
+MMF is fully integrated with MLflow, so once the training kicks off, the experiments will be visible in the MLflow Tracking UI with the corresponding metrics and parameters (note that we do not log all local models in MLflow, but we store the binaries in the tables ```evaluation_output``` and ```scoring_output```). The metric you see in the MLflow Tracking UI is a simple mean over backtesting trials across all time series. Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
 We encourage you to read through [examples/daily/local_univariate_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/local_univariate_daily.py) notebook to better understand how local models can be applied to your time series using MMF. An example notebook for forecasting with exogenous regressors can be found in [examples/external_regressors/local_univariate_external_regressors_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/external_regressors/local_univariate_external_regressors_daily.py).
@@ -189,7 +190,7 @@ To modify the model hyperparameters or reset the range of the hyperparameter sea
 MMF is fully integrated with MLflow and so once the training kicks off, the experiments will be visible in the MLflow Tracking UI with the corresponding metrics and parameters. Once the training is complete the models will be logged to MLFlow and registered to Unity Catalog.
-We encourage you to read through [examples/daily/global_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/global_daily.py) notebook to better understand how global models can be applied to your time series using MMF. An example notebook for forecasting with exogenous regressors can be found in [examples/external_regressors/global_external_regressors_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/external_regressors/global_external_regressors_daily.py).
+We encourage you to read through the [examples/daily/global_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/global_daily.py) notebook to better understand how global models can be applied to your time series using MMF.
An example notebook for forecasting with exogenous regressors can be found in [examples/external_regressors/global_external_regressors_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/external_regressors/global_external_regressors_daily.py). Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
 ### Foundation Models
@@ -238,7 +239,7 @@ To modify the model hyperparameters, change the values in [mmf_sa/models/models_
 MMF is fully integrated with MLflow and so once the training kicks off, the experiments will be visible in the MLflow Tracking UI with the corresponding metrics and parameters. During the evaluation, the models are logged and registered to Unity Catalog.
-We encourage you to read through [examples/daily/foundation_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/foundation_daily.py) notebook to better understand how foundation models can be applied to your time series using MMF. An example notebook for forecasting with exogenous regressors can be found in [examples/external_regressors/foundation_external_regressors_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/external_regressors/foundation_external_regressors_daily.py).
+We encourage you to read through the [examples/daily/foundation_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/daily/foundation_daily.py) notebook to better understand how foundation models can be applied to your time series using MMF. An example notebook for forecasting with exogenous regressors can be found in [examples/external_regressors/foundation_external_regressors_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/external_regressors/foundation_external_regressors_daily.py). Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
 #### Using Time Series Foundation Models on Databricks
diff --git a/examples/daily/foundation_daily.py b/examples/daily/foundation_daily.py
index ad8f2e6..bdf60f1 100644
--- a/examples/daily/foundation_daily.py
+++ b/examples/daily/foundation_daily.py
@@ -200,6 +200,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/daily/global_daily.py b/examples/daily/global_daily.py
index 9189ea3..79e6542 100644
--- a/examples/daily/global_daily.py
+++ b/examples/daily/global_daily.py
@@ -193,6 +193,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/daily/local_univariate_daily.py b/examples/daily/local_univariate_daily.py
index 88345ec..bee2b2d 100644
--- a/examples/daily/local_univariate_daily.py
+++ b/examples/daily/local_univariate_daily.py
@@ -231,6 +231,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/external_regressors/foundation_external_regressors_daily.py b/examples/external_regressors/foundation_external_regressors_daily.py
index a569488..8176d2f 100644
--- a/examples/external_regressors/foundation_external_regressors_daily.py
+++ b/examples/external_regressors/foundation_external_regressors_daily.py
@@ -152,6 +152,11 @@
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/external_regressors/global_external_regressors_daily.py b/examples/external_regressors/global_external_regressors_daily.py
index 2f1fa94..5ed73d0 100644
--- a/examples/external_regressors/global_external_regressors_daily.py
+++ b/examples/external_regressors/global_external_regressors_daily.py
@@ -159,6 +159,11 @@
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/external_regressors/local_univariate_external_regressors_daily.py b/examples/external_regressors/local_univariate_external_regressors_daily.py
index 445dbd5..9adf777 100644
--- a/examples/external_regressors/local_univariate_external_regressors_daily.py
+++ b/examples/external_regressors/local_univariate_external_regressors_daily.py
@@ -190,6 +190,11 @@
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/hourly/foundation_hourly.py b/examples/hourly/foundation_hourly.py
index c55f298..3487e20 100644
--- a/examples/hourly/foundation_hourly.py
+++ b/examples/hourly/foundation_hourly.py
@@ -180,6 +180,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
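The README hunk earlier in this diff tells users to set two Spark configurations before running MMF's local models. A minimal sketch of setting them per session, assuming a Databricks notebook where `spark` is predefined (the README itself recommends setting them at the cluster level; per-session assignment is an illustrative alternative):

```python
# A minimal sketch, assuming a Databricks notebook where `spark` is predefined.
# Arrow speeds up the Spark <-> pandas conversions used when fanning out one
# model fit per time series; disabling adaptive query execution is an
# assumption about the rationale -- AQE can coalesce partitions and reduce
# that per-series parallelism. See local_univariate_daily.py for the repo's own note.
spark.conf.set("spark.sql.execution.arrow.enabled", "true")
spark.conf.set("spark.sql.adaptive.enabled", "false")
```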
diff --git a/examples/hourly/global_hourly.py b/examples/hourly/global_hourly.py
index 960cceb..4d27cf4 100644
--- a/examples/hourly/global_hourly.py
+++ b/examples/hourly/global_hourly.py
@@ -174,6 +174,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/hourly/local_univariate_hourly.py b/examples/hourly/local_univariate_hourly.py
index c3095fa..f89e8cd 100644
--- a/examples/hourly/local_univariate_hourly.py
+++ b/examples/hourly/local_univariate_hourly.py
@@ -201,6 +201,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/monthly/foundation_monthly.py b/examples/monthly/foundation_monthly.py
index 249c9ae..668ab4e 100644
--- a/examples/monthly/foundation_monthly.py
+++ b/examples/monthly/foundation_monthly.py
@@ -189,6 +189,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/monthly/global_monthly.py b/examples/monthly/global_monthly.py
index 78d072c..1366646 100644
--- a/examples/monthly/global_monthly.py
+++ b/examples/monthly/global_monthly.py
@@ -183,6 +183,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/monthly/local_univariate_monthly.py b/examples/monthly/local_univariate_monthly.py
index b2459eb..71fdf2b 100644
--- a/examples/monthly/local_univariate_monthly.py
+++ b/examples/monthly/local_univariate_monthly.py
@@ -223,6 +223,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
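The hunks above add the same pointer to each example notebook; the file introduced next is the post-evaluation notebook itself. For context, a hedged sketch of the `run_forecast` call that produces the `evaluation_output` and `scoring_output` tables the notebook queries. Parameter names follow the repo's example notebooks, but every value below (input table, model list, horizon settings, experiment path) is an illustrative assumption; treat the example notebooks as the authoritative reference:

```python
# Hedged sketch of the MMF entry point; values are illustrative assumptions.
from mmf_sa import run_forecast

catalog, db = "mmf_demo", "m4"  # as defined in the post-evaluation notebook

run_forecast(
    spark=spark,
    train_data=f"{catalog}.{db}.m4_monthly_train",                # hypothetical input table
    scoring_data=f"{catalog}.{db}.m4_monthly_train",
    scoring_output=f"{catalog}.{db}.monthly_scoring_output",      # queried by the notebook below
    evaluation_output=f"{catalog}.{db}.monthly_evaluation_output",
    group_id="unique_id",
    date_col="date",
    target="y",
    freq="M",
    prediction_length=3,
    backtest_length=30,  # with stride=3, should give about ten backtest windows
    stride=3,
    metric="smape",      # the built-in metric stored in evaluation_output
    active_models=["StatsForecastAutoArima", "StatsForecastAutoETS"],  # illustrative subset
    experiment_path="/Shared/mmf_experiment_monthly",             # hypothetical MLflow path
)
```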
diff --git a/examples/post-evaluation-analysis.ipynb b/examples/post-evaluation-analysis.ipynb
new file mode 100644
index 0000000..fe5e0e0
--- /dev/null
+++ b/examples/post-evaluation-analysis.ipynb
@@ -0,0 +1,301 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "1e2ce813-1aee-44e9-9517-f98e63f23a4b",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "source": [
+    "# Many Model Forecasting Demo\n",
+    "\n",
+    "This notebook demonstrates how to conduct fine-grained model selection after running the `mmf_sa.run_forecast` function. Before proceeding, ensure you have run the notebooks in [`/examples/monthly`](https://github.com/databricks-industry-solutions/many-model-forecasting/tree/main/examples/monthly). You can run this notebook on serverless compute."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "3bf99e2b-dcf8-44ca-a48b-37e162f1767f",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "catalog = \"mmf_demo\" # Name of the catalog we use to manage our assets\n",
+    "db = \"m4\" # Name of the schema we use to manage our assets (e.g. datasets)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "8f004c3a-03ee-4a1a-9587-7e69e9cf505c",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "source": [
+    "The `scoring_output` table stores the forecasts of every model for each time series. Let's filter by a specific time series (e.g., `M1`) and examine the forecasts from all models."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "implicitDf": true,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "aa631669-768f-45b7-a149-398094efec3b",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "scoring_output = spark.sql(f\"\"\"\n",
+    "    SELECT model, unique_id, date, y FROM {catalog}.{db}.monthly_scoring_output \n",
+    "    WHERE unique_id='M1' ORDER BY model\n",
+    "    \"\"\")\n",
+    "\n",
+    "display(scoring_output)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "768b3d5c-f351-43b1-b49a-6356df69a2dd",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "source": [
+    "This table contains forecasts from 44 different models, but we need to determine which one is best for making business decisions. This is where the `evaluation_output` table becomes useful. Let's filter by a specific time series (e.g., `M1`) and review the evaluation results (i.e., backtesting trials) from all models."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "implicitDf": true,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "cb2c82f8-7690-47a5-98c2-ee3368dcc027",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "evaluation_output = spark.sql(f\"\"\"\n",
+    "    SELECT model, unique_id, backtest_window_start_date, metric_name, metric_value, forecast, actual \n",
+    "    FROM {catalog}.{db}.monthly_evaluation_output WHERE unique_id='M1'\n",
+    "    ORDER BY model, backtest_window_start_date\n",
+    "    \"\"\")\n",
+    "\n",
+    "display(evaluation_output)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "824a3595-1493-4e60-981d-5945f2dc08cf",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "source": [
+    "Based on the backtesting configuration, defined by the parameters `backtest_length`, `prediction_length`, and `stride` in the `mmf_sa.run_forecast` function, we obtain results from 10 backtesting trials for each model. For each trial, both forecasts and actual values are stored, enabling you to compute evaluation metrics based on residuals. Additionally, this table includes a built-in metric for quick assessment, which can be specified using the `metric` parameter. In this case, the metric is `smape`; currently, `mae`, `mse`, `rmse`, `mape`, and `smape` are supported.\n",
+    "\n",
+    "We compute the mean `smape` across the 10 backtesting trials for each model and each time series. The model with the lowest mean `smape` is then selected for each time series, and its forecast is retrieved from the `scoring_output` table. Below is a SQL query that performs this selection."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "implicitDf": true,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "e0e89219-3725-430f-8be3-de4817177f43",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "forecast_best_model = spark.sql(f\"\"\"\n",
+    "    SELECT eval.unique_id, eval.model, eval.average_smape, score.date, score.y\n",
+    "    FROM \n",
+    "    (\n",
+    "        SELECT unique_id, model, average_smape,\n",
+    "        RANK() OVER (PARTITION BY unique_id ORDER BY average_smape ASC) AS rank\n",
+    "        FROM (\n",
+    "            SELECT unique_id, model, AVG(metric_value) AS average_smape\n",
+    "            FROM {catalog}.{db}.monthly_evaluation_output\n",
+    "            GROUP BY unique_id, model) \n",
+    "        ORDER BY unique_id, rank\n",
+    "    ) AS eval\n",
+    "    INNER JOIN {catalog}.{db}.monthly_scoring_output AS score \n",
+    "    ON eval.unique_id=score.unique_id AND eval.model=score.model\n",
+    "    WHERE eval.rank=1\n",
+    "    ORDER BY eval.unique_id\n",
+    "    \"\"\")\n",
+    "\n",
+    "display(forecast_best_model)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {},
+     "inputWidgets": {},
+     "nuid": "bdc7d8c6-0410-43f7-8f85-6901d986ebf4",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "source": [
+    "These forecasts will be used to guide our business decisions. Let's count how many times each model was the best-performing one."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 0,
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "4f94dc20-e26f-4906-9f9c-635c73181038",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "outputs": [],
+   "source": [
+    "model_ranking = spark.sql(f\"\"\"\n",
+    "    SELECT model, COUNT(*) AS count\n",
+    "    FROM (\n",
+    "        SELECT unique_id, model, average_smape,\n",
+    "        RANK() OVER (PARTITION BY unique_id ORDER BY average_smape ASC) AS rank\n",
+    "        FROM (\n",
+    "            SELECT unique_id, model, AVG(metric_value) AS average_smape\n",
+    "            FROM {catalog}.{db}.monthly_evaluation_output\n",
+    "            GROUP BY unique_id, model) \n",
+    "        ORDER BY unique_id, rank\n",
+    "    ) WHERE rank=1 GROUP BY model \n",
+    "    ORDER BY count DESC\n",
+    "    \"\"\")\n",
+    "\n",
+    "display(model_ranking)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "application/vnd.databricks.v1+cell": {
+     "cellMetadata": {
+      "byteLimit": 2048000,
+      "rowLimit": 10000
+     },
+     "inputWidgets": {},
+     "nuid": "183eb27a-a19a-4967-aa65-26b6f5e8ce11",
+     "showTitle": false,
+     "tableResultSettingsMap": {},
+     "title": ""
+    }
+   },
+   "source": [
+    "\n",
+    "On this dataset (M4 monthly), it appears that TimesFM models performed best, as measured by the number of time series on which they achieved the lowest mean `smape`.\n",
+    "\n",
+    "Exposing the `evaluation_output` and `scoring_output` tables in these formats provides great flexibility in model selection. For example, you can define your own evaluation metric to compare forecasting accuracy. You can aggregate metrics using a weighted average or even the median across backtesting trials. Additionally, you can retrieve forecasts from multiple models for each time series and ensemble them. All of these options simply require writing queries against these tables."
+   ]
+  }
+ ],
+ "metadata": {
+  "application/vnd.databricks.v1+notebook": {
+   "computePreferences": null,
+   "dashboards": [],
+   "environmentMetadata": {
+    "base_environment": "",
+    "client": "2"
+   },
+   "language": "python",
+   "notebookMetadata": {
+    "mostRecentlyExecutedCommandWithImplicitDF": {
+     "commandId": -1,
+     "dataframes": [
+      "_sqldf"
+     ]
+    },
+    "pythonIndentUnit": 2
+   },
+   "notebookName": "post-evaluation-analysis",
+   "widgets": {}
+  },
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
diff --git a/examples/weekly/foundation_weekly.py b/examples/weekly/foundation_weekly.py
index aecd9ea..cbc2db6 100644
--- a/examples/weekly/foundation_weekly.py
+++ b/examples/weekly/foundation_weekly.py
@@ -180,6 +180,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
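The notebook's rank-1 selection is expressed in SQL; the same logic can be sketched with the PySpark DataFrame API, which may be easier to extend with custom metrics. Table and column names below are taken from the notebook above; everything else mirrors its SQL:

```python
# DataFrame-API sketch of the notebook's best-model selection.
from pyspark.sql import Window
from pyspark.sql import functions as F

catalog, db = "mmf_demo", "m4"  # as defined in the notebook

# Mean smape per (time series, model) across all backtesting trials.
avg_metric = (
    spark.table(f"{catalog}.{db}.monthly_evaluation_output")
    .groupBy("unique_id", "model")
    .agg(F.avg("metric_value").alias("average_smape"))
)

# Rank models within each series; rank 1 has the lowest mean smape.
w = Window.partitionBy("unique_id").orderBy(F.asc("average_smape"))
best = avg_metric.withColumn("rank", F.rank().over(w)).where(F.col("rank") == 1)

# Join back to the scoring output to fetch the winning model's forecasts.
forecast_best_model = best.join(
    spark.table(f"{catalog}.{db}.monthly_scoring_output"),
    on=["unique_id", "model"],
    how="inner",
)
display(forecast_best_model)
```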
diff --git a/examples/weekly/global_weekly.py b/examples/weekly/global_weekly.py
index d0ae5fb..1906ef7 100644
--- a/examples/weekly/global_weekly.py
+++ b/examples/weekly/global_weekly.py
@@ -174,6 +174,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
diff --git a/examples/weekly/local_univariate_weekly.py b/examples/weekly/local_univariate_weekly.py
index 87e58b8..79a833d 100644
--- a/examples/weekly/local_univariate_weekly.py
+++ b/examples/weekly/local_univariate_weekly.py
@@ -212,6 +212,11 @@ def transform_group(df):
 # COMMAND ----------
+# MAGIC %md
+# MAGIC Refer to the [post-evaluation notebook](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/post-evaluation-analysis.ipynb) for guidance on performing fine-grained model selection after running `run_forecast`.
+
+# COMMAND ----------
+
 # MAGIC %md ### Delete Tables
 # MAGIC Let's clean up the tables.
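The notebook's closing cell also suggests ensembling forecasts from several models per series. A hedged sketch of a simple unweighted mean ensemble over the top three models by mean `smape`; it assumes `monthly_scoring_output` holds one scalar forecast `y` per (`unique_id`, `model`, `date`) row, so explode array-typed columns first if your scoring output stores them that way:

```python
# Hedged sketch of the ensembling idea from the notebook's final cell.
# Assumes one scalar forecast `y` per (unique_id, model, date) row in
# monthly_scoring_output; explode array-typed columns first if needed.
from pyspark.sql import Window
from pyspark.sql import functions as F

catalog, db = "mmf_demo", "m4"
top_k = 3  # illustrative ensemble size

# Top-k models per series by mean smape across backtesting trials.
top_models = (
    spark.table(f"{catalog}.{db}.monthly_evaluation_output")
    .groupBy("unique_id", "model")
    .agg(F.avg("metric_value").alias("average_smape"))
    .withColumn(
        "rank",
        F.rank().over(Window.partitionBy("unique_id").orderBy(F.asc("average_smape"))),
    )
    .where(F.col("rank") <= top_k)
)

# Average the selected models' forecasts per series and date.
ensemble = (
    top_models.join(
        spark.table(f"{catalog}.{db}.monthly_scoring_output"),
        on=["unique_id", "model"],
    )
    .groupBy("unique_id", "date")
    .agg(F.avg("y").alias("y_ensemble"))  # simple unweighted mean across the top-k models
)
display(ensemble)
```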