
Fixed condition_on_observations in fully Bayesian models #2151

Closed
wants to merge 3 commits

Conversation

@hvarfner (Contributor)

Motivation

Conditioning on observations in fully Bayesian models; this enables fully Bayesian JES and (possibly) KG.

Have you read the Contributing Guidelines on pull requests?

Yes.

Test Plan

Tests are written to ensure functionality for both inferred and fixed noise. Note that the _aug_batch_shape attribute assignment was removed in condition_on_observations: in fully Bayesian GPs, this attribute could not be assigned (hence the removal). I could not find any use for it, and all tests passed with it removed.

Other changes are commented throughout; they were made to ensure that fully Bayesian GPs can keep a single set of training data throughout. However, conditioning on observations adds a batch dimension to the training data, which GPyTorch needs in order to infer the correct batch shape.
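A minimal sketch of the usage this PR enables (hedged: the exact shape contract is discussed further down in this thread; here the conditioning targets carry a leading batch dimension matching the number of MCMC hyperparameter sets):

    import torch

    from botorch.fit import fit_fully_bayesian_model_nuts
    from botorch.models.fully_bayesian import SaasFullyBayesianSingleTaskGP

    train_X = torch.rand(10, 4, dtype=torch.float64)
    train_Y = train_X.sin().sum(dim=-1, keepdim=True)

    model = SaasFullyBayesianSingleTaskGP(train_X, train_Y)
    # Tiny MCMC budget, purely for illustration.
    fit_fully_bayesian_model_nuts(
        model, warmup_steps=32, num_samples=16, thinning=4, disable_progbar=True
    )
    num_models = model.batch_shape[0]  # number of retained hyperparameter sets

    # Condition on new data; Y gets a leading batch dim of size num_models.
    new_X = torch.rand(3, 4, dtype=torch.float64)
    new_Y = new_X.sin().sum(dim=-1, keepdim=True).repeat(num_models, 1, 1)
    conditioned_model = model.condition_on_observations(new_X, new_Y)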

@facebook-github-bot added the CLA Signed label Dec 16, 2023
@hvarfner (Contributor, Author)

@dme65

@facebook-github-bot (Contributor)

@sdaulton has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@sdaulton (Contributor)

Thanks for putting this up @hvarfner!

@dme65 (Contributor) commented Dec 18, 2023

Thanks for fixing this @hvarfner! These changes generally look good to me.

@saitcakmak (Contributor) left a comment

Thanks for making these changes. I left some (mostly cosmetic) inline comments.
One thing I want to make sure is that the model batch shape remains compatible with the model itself. To this end, could you also update the batch_shape property of SaasFullyBayesianSingleTaskGP to reflect the correct shape before & after fantasizing?

Resolved inline comments (outdated) on botorch/models/fully_bayesian.py.
    raise ValueError(
        "Conditioning in fully Bayesian models must contain a batch dimension. "
        "Add a batch dimension (the leading dim) with length matching the "
        "number of hyperparameter sets to the conditioned data."
    )
@saitcakmak (Contributor)
Is the number of hyperparameter sets self.num_mcmc_samples or self.batch_shape (the two are the same unless the model is already batched / fantasized)? If I were to attempt fantasizing from an already fantasized model, would this still hold? It'd be good to print out the expected shape as part of the error message.
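A hedged sketch of that suggestion (illustrative only, not the code as merged; Y here denotes the conditioning targets):

    raise ValueError(
        "Conditioning in fully Bayesian models must contain a batch dimension. "
        "Add a batch dimension (the leading dim) with length matching the "
        f"number of hyperparameter sets. Expected Y with batch shape "
        f"{self.batch_shape}, got {Y.shape}."
    )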

@hvarfner (Contributor, Author)
Had a bit of an incorrect rushed comment before, which I deleted. =)

Resolved inline comments (outdated) on botorch/models/gpytorch.py and test/models/test_fully_bayesian.py.
    @@ -656,6 +666,77 @@ def test_custom_pyro_model(self):
                atol=5e-4,
            )

        def test_condition_on_observation(self):
@saitcakmak (Contributor)

Thanks for adding these tests. For completeness, can you add a test that calls model.fantasize and a test that evaluates some acquisition function (e.g., qLogEI) with the fantasized model? I just want to make sure everything works e2e, including the sampler-related code inside the acquisition functions.
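A hedged sketch of the requested end-to-end check (model, train_X, and train_Y as in the earlier sketch; qLogExpectedImprovement and SobolQMCNormalSampler are the standard BoTorch classes):

    import torch

    from botorch.acquisition.logei import qLogExpectedImprovement
    from botorch.sampling.normal import SobolQMCNormalSampler

    # Fantasize at a couple of candidate points.
    sampler = SobolQMCNormalSampler(sample_shape=torch.Size([4]))
    fantasy_model = model.fantasize(X=train_X[:2], sampler=sampler)

    # Evaluating an acquisition function with the fantasized model exercises
    # the sampler-related code paths end-to-end; as the discussion below
    # shows, this is exactly where batch-shape issues surface.
    acqf = qLogExpectedImprovement(model=fantasy_model, best_f=train_Y.max())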

@hvarfner force-pushed the main branch 3 times, most recently from 01f9e5e to b4040b5 on December 20, 2023 at 08:49
@hvarfner (Contributor, Author)

@saitcakmak thanks for the feedback!

I hadn't considered fantasizing before, so I had to rework things a little to add it. Now, the batch shapes are inferred in condition_on_observations for both the fantasizing case (X.shape = n x d, Y.shape = fantasy_dim x batch_dim x n x 1) and the no-batch-shape case (X.shape = n x d, Y.shape = n x 1). In both cases, the output batch shape will be non-empty, as seen in the added tests.
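Spelling out those two cases with concrete shapes (a hedged illustration; the dimension sizes are arbitrary):

    import torch

    n, d, num_models, fantasy_dim = 5, 4, 16, 4

    # No-batch case: plain data, broadcast across the num_models
    # MCMC hyperparameter sets.
    X = torch.rand(n, d)
    Y = torch.rand(n, 1)

    # Fantasizing case: Y carries leading fantasy and model-batch dims.
    Y_fantasy = torch.rand(fantasy_dim, num_models, n, 1)

    # Either way, model.condition_on_observations(X, Y) yields a model
    # whose batch shape is non-empty (at least num_models).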

@saitcakmak (Contributor)

Thanks for making the changes. I had not realized that your original changes did not affect the model batch size (as in, the output shape and the shape of the lengthscales do not change in the conditioning tests). But fantasizing (which is the use case of conditioning I am most familiar with) does change the batch shape by adding another dimension to it.

So, if I try to use the fantasized model with some acquisition functions, it errors out because the batch_shape property of the model is not updated. E.g., adding

            acqf = qNoisyExpectedImprovement(
                model=fantasy_model, X_baseline=train_X, prune_baseline=False
            )
            acqf(train_X)

to the end of the fantasize test raises AssertionError: Expected the output shape to match either the t-batch shape of X, or the model.batch_shape in the case of acquisition functions using batch models; but got output with shape torch.Size([19]) for X with shape torch.Size([1, 10, 4]). This is because the model batch shape becomes fantasy_size x num_models after fantasizing, but lines 427-431 define batch_shape as torch.Size([self.num_mcmc_samples]) (equal to num_models in the test).

One option here would be to update the definition of the batch_shape property to correctly return the batch shape of both the basic and fantasized models (a sketch follows below). Another would be to roll back the fantasize support, which seems to require some more work to get right (probably beyond the scope of this PR). Knowledge gradient is the most common use case for fantasy models that I know of, and it triggers additional errors when I add this bit to the fantasize test:

        acqf = qKnowledgeGradient(
            model=model, num_fantasies=4
        )
        acqf(train_X)
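For the first option, a hedged sketch of what an updated batch_shape property on SaasFullyBayesianSingleTaskGP might look like (an assumption for illustration, not the code as merged):

    @property
    def batch_shape(self) -> torch.Size:
        self._check_if_fitted()
        if self.train_targets.dim() > 1:
            # Conditioned/fantasized: the train targets carry explicit
            # batch dims, e.g. fantasy_size x num_models x n.
            return self.train_targets.shape[:-1]
        # Basic fitted model: one batch per MCMC hyperparameter sample.
        return torch.Size([self.num_mcmc_samples])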

@hvarfner (Contributor, Author)

@saitcakmak Yeah, there seems to be some nuance to this that I failed to consider. I think there are two things that need to be ironed out if we want consistency (and retain only one batch of training data by default):

  1. What should the shape of the FBGP posterior be? Currently, the input must be batch_shape * q * d, so that batch_shape * num_models * q * d is returned by unsqueezing exactly MCMC_DIM = -3 in the posterior (previously the forward) call; see the sketch after this list. For the aforementioned fantasy_model to work, two dims must be unsqueezed (or len(batch_shape) dims in general). Moreover, this makes model.posterior(train_X) rather awkward, since the shape of the posterior will have the first two dims swapped compared to the conventional call described above. acqf(train_X) does not give the intended shape for any fully Bayesian acquisition function AFAIK; the input has to be three-dimensional.

  2. When using fantasize, I think the user would be pretty surprised to receive a batch_shape * num_fantasies * num_models * q * d posterior (since the fantasy dim typically goes first, right?). However, this is more in line with the current FBGP structure, and it could be accommodated by updating batch_shape and unsqueezing len(batch_shape) dimensions in the posterior call. Ultimately, the model dim needs to trail the batch_shape dim due to this line, so I struggle to see a general solution that gives the consistent num_fantasies * batch_shape * num_models * q * d when fantasizing.

I am not quite sure how this can be solved in a consistent manner without reconsidering the FBGP altogether (i.e., making the model dim a leading dim), so I am for rolling back as of now!
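The sketch referenced in point 1, showing where the unsqueeze happens (hedged: MCMC_DIM = -3 comes from botorch.models.fully_bayesian; the rest is illustrative):

    import torch

    from botorch.models.fully_bayesian import MCMC_DIM  # == -3

    batch, q, d = 2, 3, 4
    X = torch.rand(batch, q, d)

    # The posterior call unsqueezes the input at MCMC_DIM so that it
    # broadcasts against the num_models hyperparameter sets:
    X_expanded = X.unsqueeze(MCMC_DIM)  # batch x 1 x q x d
    # The resulting posterior has batch shape batch x num_models,
    # e.g. a mean of shape batch x num_models x q x 1.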

@saitcakmak (Contributor)

That sounds good to me. We can revisit it down the line if there's a need for fantasize support. Happy to merge this in after you push the changes rolling it back.

@hvarfner force-pushed the main branch 2 times, most recently from c52c329 to 17fdee8 on December 27, 2023 at 08:34
@hvarfner (Contributor, Author)

@saitcakmak done!

I think the long-term fix (which I don't think would be too much work) would be to have MCMC_DIM be a leading dim, but the last of the leading dims, i.e. at index -4. I could give this a go going forward if you don't foresee any issues with it.

@facebook-github-bot (Contributor)

@saitcakmak has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@saitcakmak (Contributor) left a comment

Thanks! I haven't thought too deeply about this, but I think a proper way to support fantasize would start with adding support for batched inputs more generally (in __init__) and adding the MCMC dimension as the outermost dimension, as you suggest. We could also store the train_inputs/targets batched along the MCMC dimension to bring the model internals further in line with other batched models (e.g., SingleTaskGP). That being said, making these changes is a decent bit of work (it would also require updating other parts of the code base that have specialized logic for this model), so it might not be worth the effort unless there's a real motivator.

@facebook-github-bot (Contributor)

@saitcakmak merged this pull request in 0c37aac.
