
A follow-up on large noise for input perturbation on the MDS method #156

Closed
2454511550Lin opened this issue May 3, 2023 · 5 comments

@2454511550Lin

Dear Authors, I have a follow-up question about the "extremely good" result obtained with the MDS method when a large noise is used for input perturbation, as discussed in #154. I was running with CIFAR-10 as the ID dataset and testing different noise magnitudes via the postprocessor_sweep in configs/postprocessors/mds.yml:

postprocessor:
  name: mds
  APS_mode: True
  postprocessor_args:
    noise: 0.0014
    ...
  postprocessor_sweep:
    noise_list: [0, 0.0005, 0.001, 0.0014, 0.002, 0.0024, 0.005, 0.01, 0.05, 0.1, 0.2, 0.3]

The script surprisingly picks noise = 0.3 and reports that it gives the best AUROC on the validation datasets. This does not make sense to me: a noise of 0.3 will no doubt destroy most of the information in the original image, so I would expect all ID/OOD samples to receive extremely high scores and become indistinguishable. But here are the log and results:

Performing inference on cifar10 dataset...
Starting automatic parameter search...
Hyperparam:[0], auroc:0.6337179444444444
Hyperparam:[0.0005], auroc:0.6298636666666666
Hyperparam:[0.001], auroc:0.625976388888889
Hyperparam:[0.0014], auroc:0.6228961111111111
Hyperparam:[0.002], auroc:0.6182671111111111
Hyperparam:[0.0024], auroc:0.6152154999999999
Hyperparam:[0.005], auroc:0.5956825
Hyperparam:[0.01], auroc:0.5598402222222223
Hyperparam:[0.05], auroc:0.3970671111111111
Hyperparam:[0.1], auroc:0.48044600000000004
Hyperparam:[0.2], auroc:0.8543432777777779
Hyperparam:[0.3], auroc:0.9787509999999999
Final hyperparam: 0.3

ood.csv outputs:

[screenshot of ood.csv results]

I also visualized the scores with a histogram and a boxplot:

[histogram and boxplot of ID vs. OOD scores]
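(For reference, here is a minimal sketch of how such plots can be produced. It assumes the ID and OOD confidence scores have already been loaded as NumPy arrays; the function name is just my own shorthand, not part of OpenOOD.)

import numpy as np
import matplotlib.pyplot as plt

def plot_score_distributions(id_conf: np.ndarray, ood_conf: np.ndarray,
                             ood_name: str = "OOD") -> None:
    # Histogram and boxplot of ID vs. OOD confidence scores, side by side.
    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))
    ax_hist.hist(id_conf, bins=50, alpha=0.5, label="ID (CIFAR-10)")
    ax_hist.hist(ood_conf, bins=50, alpha=0.5, label=ood_name)
    ax_hist.set_xlabel("MDS score")
    ax_hist.legend()
    ax_box.boxplot([id_conf, ood_conf], labels=["ID (CIFAR-10)", ood_name])
    ax_box.set_ylabel("MDS score")
    fig.tight_layout()
    plt.show()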

My guess is that only the ID samples are processed with the large noise of 0.3, while the OOD samples are processed with a small noise. But what causes this in the code? I think the issue lies in configs/postprocessors/mds.yml: the default noise specified in postprocessor.postprocessor_args.noise is not overwritten for the OOD samples even after the hyperparameter search. As a result, the code uses 0.3 for ID and 0.0014 for OOD for input perturbation in this case. To verify, I set postprocessor.postprocessor_args.noise = 0.3, expecting all ID/OOD scores to be pushed very high this time:

postprocessor:
  name: mds
  APS_mode: True
  postprocessor_args:
    noise: 0.3
    ...
  postprocessor_sweep:
    noise_list: [0.3]

The log gives:

Starting automatic parameter search...
Hyperparam:[0.3], auroc:0.500437
Final hyperparam: 0.3

ood.csv outputs:
[screenshot of ood.csv results]

And the boxplots show that the ID and OOD scores are indistinguishable (except for MNIST):

[boxplots of ID vs. OOD scores]

But now I am confused. The ID dataset (CIFAR-10) is not pushed toward zero as it was in the previous case, even though the noise is the same. On the other hand, the result kind of makes sense because the scores are indistinguishable. It is hard for me to tell whether the code is "correct" at this point. I am not sure if I set the config file correctly, but any feedback would be greatly appreciated. I am more than happy to provide more details if needed.

@zjysteven
Collaborator

zjysteven commented May 3, 2023

@2454511550Lin You are right. More specifically, the bug is caused by the following lines:

id_pred, id_conf, id_gt = postprocessor.inference(
    net, id_data_loader['test'])
if self.config.recorder.save_scores:
    self._save_scores(id_pred, id_conf, id_gt, dataset_name)
if self.config.postprocessor.APS_mode:
    self.hyperparam_search(net, [id_pred, id_conf, id_gt],
                           ood_data_loaders['val'], postprocessor)
# load nearood data and compute ood metrics
self._eval_ood(net, [id_pred, id_conf, id_gt],
               ood_data_loaders,
               postprocessor,
               ood_split='nearood')
# load farood data and compute ood metrics
self._eval_ood(net, [id_pred, id_conf, id_gt],
               ood_data_loaders,
               postprocessor,
               ood_split='farood')

Here the searched hyperparameter is only applied to the OOD samples and not to the ID samples, because [id_pred, id_conf, id_gt] is obtained only once, with the default hyperparameter. When you directly specify the hyperparameter instead of using the automatic search, the issue disappears, which is expected.
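Conceptually, the fix is simply to recompute the ID scores after the search so that ID and OOD are scored with the same hyperparameter. Roughly something like this (a sketch only, not necessarily the exact code):

id_pred, id_conf, id_gt = postprocessor.inference(
    net, id_data_loader['test'])
if self.config.recorder.save_scores:
    self._save_scores(id_pred, id_conf, id_gt, dataset_name)
if self.config.postprocessor.APS_mode:
    self.hyperparam_search(net, [id_pred, id_conf, id_gt],
                           ood_data_loaders['val'], postprocessor)
    # Re-run ID inference so that the ID scores also use the searched
    # hyperparameter instead of the default one from the config.
    id_pred, id_conf, id_gt = postprocessor.inference(
        net, id_data_loader['test'])
# load nearood / farood data and compute ood metrics as before
self._eval_ood(net, [id_pred, id_conf, id_gt],
               ood_data_loaders, postprocessor, ood_split='nearood')
self._eval_ood(net, [id_pred, id_conf, id_gt],
               ood_data_loaders, postprocessor, ood_split='farood')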

I've actually fixed this bug in my local branch but just realized that I haven't pushed it to the main branch. Thanks for all the experimentation though!

@2454511550Lin
Author

I see. Thank you for letting me know. While directly specifying the hyperparameter gives the correct result, are you planning to push the fixed version to the main branch soon? It would be very helpful to have a working automatic hyperparameter search script.

@zjysteven
Collaborator

Yes, I will make a pull request in 1-2 days. I'm just wrapping up a lot of commits and documenting the changes I made. Will let you know for sure.

@2454511550Lin
Author

Sounds good. Thank you so much!

@zjysteven
Collaborator

Hi @2454511550Lin, just letting you know that OpenOOD v1.5 has been released. See here for a summary of the updates.
