Code repository for the paper "Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction" @ ECCV 2024 (Oral)

Overview

This is the public code repository for our work Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction presented as an oral paper at ECCV 2024.

Abstract 📝


Quantifying a model's predictive uncertainty is essential for safety-critical applications such as autonomous driving. We consider quantifying such uncertainty for multi-object detection. In particular, we leverage conformal prediction to obtain uncertainty intervals with guaranteed coverage for object bounding boxes. One challenge in doing so is that bounding box predictions are conditioned on the object's class label. Thus, we develop a novel two-step conformal approach that propagates uncertainty in predicted class labels into the uncertainty intervals of bounding boxes. This broadens the validity of our conformal coverage guarantees to include incorrectly classified objects, thus offering more actionable safety assurances. Moreover, we investigate novel ensemble and quantile regression formulations to ensure the bounding box intervals are adaptive to object size, leading to a more balanced coverage. Validating our two-step approach on real-world datasets for 2D bounding box localization, we find that desired coverage levels are satisfied with practically tight predictive uncertainty intervals.


Citation

If you find this repository useful, please consider citing our work:

@inproceedings{timans2024conformalod,
    title = {Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction}, 
    author = {Alexander Timans and Christoph-Nikolas Straehle and Kaspar Sakmann and Eric Nalisnick},
    booktitle = {Proceedings of the European Conference on Computer Vision},
    year = {2024}
}

Acknowledgements

The Robert Bosch GmbH is acknowledged for financial support.

Repo structure

The folder structure of the repo is largely self-explanatory, but below are brief comments on each top-level folder and file. The most important folder is control.

conformal-od/
├── calibration: Conformal nonconformity scores, quantile selection and multiple testing correction, prediction interval construction, random data splitting.
├── config: YAML files with config params per dataset and conformal method, incl. baselines. A description of the config file setup is given in cfg_description.yaml.
├── control: Individual files to run for every conformal method and baseline (both conformal and object detector). These scripts will produce the final output files with performance metrics and results.
├── data: Files data_loader.py for custom dataset loading in detectron2, data_collector.py for custom pre-collection of prediction information, cityscapes.py for custom dataset transforms.
├── detectron2: A local copy of the detectron2 library which is tightly integrated in the data and model prediction aspects of the code.
├── evaluation: Metrics definitions, custom AP evaluation script, final results table generation, notebook with ablation study.
├── gaussian_yolo: Minimal local integration of relevant files to run the GaussianYOLO and YOLOv3 baselines.
├── model: Model- and prediction-related functions such as model definitions that have been slightly modified from the original detectron2 code, model loading, matching predictions to ground truths, and custom quantile regression head training.
├── plots: Plotting utilities and the notebook plots.ipynb to reproduce final plots contained in the paper.
├── util: Read/write and other utilities, some required for the DETR baseline only.
├── commands.py: Script to create a file with python run commands to start a suite of experiments.
├── commands_abl.py: Script to create a file with python run commands to ablate for different coverage levels.
├── env_conf1m.yml: The package environment file as used by the author, with exact package versions.
├── env_minimal.yml: A (heuristic) minimal environment file that contains the most important packages to run the repo.
├── main.py: The main script to run experiments, with CLI args that override any YAML configs.
└── run.sh: A simple shell script that executes the run commands generated by commands.py in series, and logs errors.
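As a rough sketch of what commands.py does (the exact flags and option lists here are assumptions for illustration; see the script itself for the real ones), a suite of run commands can be generated by enumerating method combinations:

```python
# Hypothetical sketch of a commands.py-style run-suite generator.
# The option names mirror the CLI flags documented later in this README;
# the concrete lists used by the real script may differ.
from itertools import product

RISK_CONTROL = ["std_conf", "ens_conf", "cqr_conf", "base_conf"]
LABEL_SET = ["top_singleton", "full", "oracle", "class_threshold"]

def build_commands(config_path="conformal-od/config/coco_val"):
    """Return one main.py command string per method combination."""
    commands = []
    for rc, ls in product(RISK_CONTROL, LABEL_SET):
        commands.append(
            "python conformal-od/main.py"
            f" --config_path={config_path}"
            f" --risk_control={rc} --label_set={ls}"
            " --run_risk_control --run_eval"
        )
    return commands

if __name__ == "__main__":
    # One line per experiment; a run.sh-style script then executes them in series.
    print("\n".join(build_commands()))
```

run.sh then simply reads the generated file line by line and executes each command, logging failures.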

Guidelines for running the code are provided below.

Preparing the local setup

  1. Clone repo, e.g. using git
git clone https://github.com/alextimans/conformal-od
  • Note: Make sure your working directory is the parent directory of conformal-od; let's call it run-code, and set the working directory to run-code. Code runs are also launched from this directory, which is why the sample runs below follow the scheme python conformal-od/script.py -args.
  2. Set up the python env (e.g. with the conda package manager)
conda env create -f conformal-od/env_conf1m.yml
conda activate conf1m
  • Either generate the python env from env_minimal.yml and add any missing packages, or use env_conf1m.yml for an exact but heavier environment.
  • Make sure that the PYTHONPATH environment variable also points to the conformal-od repo to detect all python modules.
  • Code is written to also be compatible with cuda and GPU support.
  • Note: We have tried to outsource any modifications made to original detectron2 files into separate files (e.g., model/fast_rcnn.py) for ease of use. However, a few minimal changes have been left directly in our local copy of detectron2: (1) one changed import statement in detectron2/modeling/roi_heads/roi_heads.py; (2) one changed import statement in detectron2/modeling/meta_arch/rcnn.py; (3) one line comment in detectron2/data/dataset_mapper.py to retain instance annotations. Should a fresh install of detectron2 be used, these changes need to be manually integrated to point correctly.
  3. Get and prepare the data
  • Download COCO, Cityscapes and BDD100k datasets from their respective websites, which may require requesting access and log-in.
  • Keep the data in their original folder structures, except for Cityscapes: since we never train on it, we move all the data (which is structured by city) into the train folder and leave the test and val folders empty. This permits using all available data for calibration/test splitting.
  • You should now have a data folder accessible from the working directory, e.g., run-code/conformal-od/data/... directly, or run-code/data/... and access from within conformal-od with a symbolic link. Its structure should look something like
data
└── bdd100k
    └── images/100k
        └── test
        └── train
        └── val
    └── labels
└── cityscapes
    └── gtFine
        └── test (empty)
        └── train (all files)
        └── val (empty)
    └── leftImg8bit
        └── test (empty)
        └── train (all files)
        └── val (empty)
└── coco
    └── annotations
    └── labels/val2017
    └── val2017
  4. Download the necessary model checkpoints
  • For Box-CQR, the Faster R-CNN model with trained quantile regression head can be downloaded from this Google Drive (x101fpn_train_qr_5k_postprocess). The link also provides the model checkpoints for some of the original pre-trained object detectors (Faster R-CNN, (Gaussian)YOLO, Sparse R-CNN) for good measure.
  • The pretrained models should in principle all be obtainable from the respective original repositories: Faster R-CNN, YOLOv3, DETR, Sparse R-CNN.
  • Place them in a folder accessible by the working directory, e.g., run-code/checkpoints.
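As a quick sanity check before launching runs, a small script along these lines (the relative paths are assumptions based on the tree in step 3, and the script itself is not part of the repo) can verify that the expected data folders exist:

```python
# Minimal layout check for the data folder sketched in step 3.
# The relative paths are assumptions based on the tree in this README.
from pathlib import Path

EXPECTED = [
    "bdd100k/images/100k/train",
    "bdd100k/labels",
    "cityscapes/gtFine/train",
    "cityscapes/leftImg8bit/train",
    "coco/annotations",
    "coco/val2017",
]

def check_data_layout(root):
    """Return the list of expected subfolders missing under ``root``."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).is_dir()]

if __name__ == "__main__":
    missing = check_data_layout("data")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Data layout looks OK.")
```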

Example code runs

  • Note: Due to a name change in the class label sets, the 'Naive' method from the paper is referred to as oracle in the code. This should not be confused with the paper's 'Oracle', which corresponds to not applying any label set method at all (i.e., the results from only applying our box interval methods) and is therefore not an explicit label set option.

  • Note: The config and script files include path directory names, which will most likely initially throw errors due to non-matching paths. Modify/remove as necessary to obtain the correct directory pointers (see also config/cfg_description.yaml).

  • To check the main script CLI arguments that can be given and their default values and short explanations:

python conformal-od/main.py -h
  • To run a conformal bounding box method with a conformal label set method for a dataset:
    • Choose risk_control from ["std_conf", "ens_conf", "cqr_conf", "base_conf"]
    • Choose label_set from ["top_singleton", "full", "oracle", "class_threshold"]
    • Set config_path, which determines the dataset choice
    • Default object detector: Faster R-CNN X101-FPN
python conformal-od/main.py --config_file=cfg_std_rank --config_path=conformal-od/config/coco_val --run_collect_pred --save_file_pred --risk_control=std_conf --alpha=0.1 --label_set=class_threshold --label_alpha=0.01 --run_risk_control --save_file_control --save_label_set --run_eval --save_file_eval --file_name_suffix=_std_rank_class --device=cuda
  • To run the full suite of experiments (all combinations & conformal baselines):
python conformal-od/commands.py  # generate commands.txt
sh conformal-od/run.sh  # read and run all combos
  • To run the model baselines (DeepEns, GaussianYOLO, YOLOv3, DETR, Sparse-RCNN):
    • Follow the instructions in their respective script (found in control/baseline_*.py)
    • These scripts are standalone runnables, since these methods require additional customization that is not easily accommodated by main.py.
python conformal-od/control/baseline_gaussian_yolo.py --config_file=cfg_base_gaussian_yolo --config_path=conformal-od/config/coco_val --run_collect_pred --save_file_pred --risk_control=gaussian_yolo --alpha=0.1 --run_risk_control --save_file_control --run_eval --save_file_eval --file_name_suffix=_base_gaussian_yolo --device=cuda
  • To generate final aggregated results tables as .csv files, see the interactive evaluation/results_tables.ipynb notebook.
  • To generate final bounding box figures that can be saved to file, see the interactive plots/plots.ipynb notebook.
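The main.py invocations above are long, flat flag lists; when scripting your own runs, a small helper like the following (hypothetical, not part of the repo) keeps them readable:

```python
# Hypothetical helper for composing main.py invocations like the ones above.
def make_run_command(script, **options):
    """Render keyword options as CLI flags; a True value renders as a bare flag."""
    parts = [f"python {script}"]
    for key, value in options.items():
        if value is True:
            parts.append(f"--{key}")
        else:
            parts.append(f"--{key}={value}")
    return " ".join(parts)

cmd = make_run_command(
    "conformal-od/main.py",
    config_file="cfg_std_rank",
    risk_control="std_conf",
    alpha=0.1,
    label_set="class_threshold",
    run_risk_control=True,
    run_eval=True,
    device="cuda",
)
print(cmd)
```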

Interpreting the output

Suppose a standard command that runs and saves all intermediate outputs is executed, for example

python conformal-od/main.py --config_file=cfg_std_rank --config_path=conformal-od/config/coco_val --run_collect_pred --save_file_pred --risk_control=std_conf --alpha=0.1 --label_set=class_threshold --label_alpha=0.01 --run_risk_control --save_file_control --save_label_set --run_eval --save_file_eval --file_name_suffix=_std_rank_class --device=cuda

The executable will create an associated output folder which logs all the results, and which is located in the output directory specified in the config file. In this case, the outputs will be located in .../output/std_conf_x101fpn_std_rank_class.

The folder will contain a (sub)set of the possible files listed below. Most important are the .csv files which contain readable, aggregate-level results over evaluation metrics.

  • log.txt: The logging file capturing standard print output. It is quite verbose and useful for checking that all settings and behaviour are as expected.
  • *_ap_info.json, *_ap_scores.json: Outputs from the AP metric evaluations on predictive model performance (if the option is on).
  • *_ap_table.csv: The file with readable AP evaluation results.
  • *_img_list.json: An index list tracking which object classes appear in which images (see data/data_collector.py).
  • *_ist_list.json: A dictionary list where each dictionary collects all the object-level information for objects of a particular class (see data/data_collector.py).
  • *_test_idx.pt: A tensor tracking which images (and associated objects) are assigned to calibration/test across trials.
  • *_control.pt: A tensor storing all the evaluation metrics for the conformal box method across trials, classes and coordinates (used to generate the .csv files).
  • *_res_table_*.csv: A table with aggregate results for the conformal box method only (i.e., using 'oracle' box quantiles and no label sets). The second * denotes the assigned name of the nonconformity function (e.g., abs_res).
  • *_box_set_table_*.csv: A table with aggregate results for the conformal box method and selected conformal label set method (i.e., after the full two-step approach).
  • *_label_table.csv: A table with aggregate results for the conformal label set method only (i.e., no bounding box procedure).
  • *_label.pt: A tensor storing all the evaluation metrics for the conformal label set method across trials, classes and coordinates (used to generate the .csv file).

Since storing all these files for every run can become storage-heavy, boolean toggle arguments can be leveraged (see main.py -h) to reduce the intermediate write operations. However, it can be useful to store intermediate outputs once and then leverage those files to perform cheaper evaluations for other experiments. For example, *_img_list.json and *_ist_list.json contain all the necessary prediction information and can be reused for multiple different post-hoc conformal method combinations.
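As a sketch of such post-hoc reuse (the file-name patterns follow the list above, while the concrete prefix and directory are illustrative assumptions), the stored .json files can be read back with standard tooling:

```python
# Sketch of reloading stored run outputs for post-hoc evaluation.
# File-name patterns follow the output list in this README; the concrete
# prefix and directory are assumptions for illustration.
import json
from pathlib import Path

def load_run_outputs(out_dir, prefix):
    """Load the reusable prediction-information files from a run's output folder."""
    out = Path(out_dir)
    with open(out / f"{prefix}_img_list.json") as f:
        img_list = json.load(f)
    with open(out / f"{prefix}_ist_list.json") as f:
        ist_list = json.load(f)
    # The *.pt tensor files would be reloaded analogously with torch.load(...).
    return {"img_list": img_list, "ist_list": ist_list}
```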

Still open questions?

If you encounter any problems that have not been addressed here, please feel free to create an issue or reach out!