
NP Regression Model w/ LIG Acquisition #2683

Open

wants to merge 12 commits into main

Conversation

eibarolle

Motivation

This pull request adds a Neural Process Regression Model with a Latent Information Gain acquisition function to BoTorch.

Have you read the Contributing Guidelines on pull requests?

Yes, and I've followed all the steps, including testing.

Test Plan

I wrote my own unit tests for both the model and the acquisition function; the test files are in the appropriate folder. I also ran pytest on my files, and all tests passed.

Related

I made a repository holding the pushed files at https://github.com/eibarolle/np_regression, and it has the appropriate API documentation.

@facebook-github-bot added the CLA Signed label on Jan 18, 2025
@sdaulton
Contributor

Hi @eibarolle! Thanks for the PR! I'll review it shortly.

Contributor

@sdaulton left a comment

Thanks for the PR! This is looking good. I left some comments inline.

import torch
import torch.nn as nn
import matplotlib.pyplot as plts
# %matplotlib inline
Contributor

Suggested change
# %matplotlib inline

Comment on lines 24 to 27
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (RBF, Matern, RationalQuadratic,
ExpSineSquared, DotProduct,
ConstantKernel)
Contributor

Suggested change
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (RBF, Matern, RationalQuadratic,
ExpSineSquared, DotProduct,
ConstantKernel)

Let's avoid the sklearn dependency since it isn't used

ConstantKernel)
from typing import Callable, List, Optional, Tuple
from torch.nn import Module, ModuleDict, ModuleList
from sklearn import preprocessing
Contributor

Suggested change
from sklearn import preprocessing

from typing import Callable, List, Optional, Tuple
from torch.nn import Module, ModuleDict, ModuleList
from sklearn import preprocessing
from scipy.stats import multivariate_normal
Contributor

Suggested change
from scipy.stats import multivariate_normal

Let's remove the unused imports

from scipy.stats import multivariate_normal
from gpytorch.distributions import MultivariateNormal

device = torch.device("cpu")
Contributor

Suggested change
device = torch.device("cpu")

Let's make the code agnostic to the device used.
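
For illustration, a minimal sketch (assumed, not the PR's code) of the device-agnostic pattern this comment asks for: new tensors take their device and dtype from the inputs rather than from a module-level `device` constant.

```python
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Toy module illustrating device-agnostic tensor creation."""

    def __init__(self, in_dim: int, out_dim: int) -> None:
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Noise is created directly on x's device and dtype, so the module runs
        # unchanged on CPU or GPU without any global device constant.
        noise = torch.randn(x.shape, device=x.device, dtype=x.dtype)
        return self.linear(x + noise)
```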

Comment on lines 294 to 297
if n == 1:
eps = torch.autograd.Variable(logvar.data.new(self.z_dim).normal_()).to(device)
else:
eps = torch.autograd.Variable(logvar.data.new(n,self.z_dim).normal_()).to(device)
Contributor

Suggested change
if n == 1:
eps = torch.autograd.Variable(logvar.data.new(self.z_dim).normal_()).to(device)
else:
eps = torch.autograd.Variable(logvar.data.new(n,self.z_dim).normal_()).to(device)
shape = [n, self.z_dim]
if n == 1:
shape = shape[1:]
eps = torch.autograd.Variable(logvar.data.new(*shape).normal_()).to(device)

This is a bit more concise
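
As an aside (not part of the suggestion above): `torch.autograd.Variable` has been deprecated since PyTorch 0.4, so the same sampling could also be written without it, e.g. as a standalone helper:

```python
import torch


def sample_eps(logvar: torch.Tensor, z_dim: int, n: int = 1) -> torch.Tensor:
    """Draw standard-normal noise on the same device/dtype as `logvar`."""
    shape = [z_dim] if n == 1 else [n, z_dim]
    return torch.randn(*shape, device=logvar.device, dtype=logvar.dtype)
```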

mu: torch.Tensor,
logvar: torch.Tensor,
n: int = 1,
min_std: float = 0.1,
Contributor

This default seems high, no?

Comment on lines 356 to 368
def load_state_dict(
self,
state_dict: dict,
strict: bool = True
) -> None:
"""
Initialize the fully Bayesian model before loading the state dict.

Args:
state_dict (dict): A dictionary containing the parameters.
strict (bool): Case matching strictness.
"""
super().load_state_dict(state_dict, strict=strict)
Contributor

Not needed, since it just calls the parent class's method.

Comment on lines 450 to 455
ind = np.arange(x.shape[0])
mask = np.random.choice(ind, size=n_context, replace=False)
x_c = torch.from_numpy(x[mask])
y_c = torch.from_numpy(y[mask])
x_t = torch.from_numpy(np.delete(x, mask, axis=0))
y_t = torch.from_numpy(np.delete(y, mask, axis=0))
Contributor

Any reason to not do this in pytorch?
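
For reference, a hedged sketch of the same random context/target split done entirely in PyTorch (names mirror the snippet above; this is not the PR's final code):

```python
import torch


def random_context_target_split(
    x: torch.Tensor, y: torch.Tensor, n_context: int
) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]:
    # torch.randperm replaces np.random.choice(..., replace=False).
    perm = torch.randperm(x.shape[0], device=x.device)
    context_idx, target_idx = perm[:n_context], perm[n_context:]
    return x[context_idx], y[context_idx], x[target_idx], y[target_idx]
```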

import torch
#reference: https://arxiv.org/abs/2106.02770

class LatentInformationGain:
Contributor

Can we implement this as a subclass of Acquisition, so that we can use it more organically in botorch? Likely the context would need to be provided in LatentInformationGain.__init__.
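
A rough skeleton (assumed, not the final implementation) of what this could look like: subclass `AcquisitionFunction` and pass the context set to `__init__`, leaving the actual information-gain computation to `forward`.

```python
import torch
from botorch.acquisition import AcquisitionFunction
from botorch.models.model import Model


class LatentInformationGain(AcquisitionFunction):
    def __init__(
        self,
        model: Model,
        context_x: torch.Tensor,
        context_y: torch.Tensor,
        num_samples: int = 10,
    ) -> None:
        super().__init__(model=model)
        self.context_x = context_x
        self.context_y = context_y
        self.num_samples = num_samples

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # X follows the BoTorch convention of batch_shape x q x d; the LIG
        # computation itself is omitted here.
        raise NotImplementedError
```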

@eibarolle
Author

I applied the suggested changes to my new code. Note that for the decoder, the other functions are designed around its current state.

@Balandat
Contributor

@hvarfner curious if you have any thoughts on this PR re the information gain aspects

@hvarfner
Contributor

Interesting! I'll check out the paper quickly and get back to you


self.acquisition_function.num_samples = 20
lig_2 = self.acquisition_function.forward(candidate_x=self.candidate_x)
self.assertTrue(lig_2.item() < lig_1.item())
Contributor

Why would this unit test necessarily be a good check? I am not sure on the details, but it seems to me that this just improves the accuracy of the acquisition computation.

self.context_x = context_x.to(device)
self.context_y = context_y.to(device)

def forward(self, candidate_x):
Contributor

It seems to me like the acquisition function computes the information gain on one batch of points, with the batch being in the first dimension. Thus, the output of this forward would be one scalar.

This would run contrary to the acquisition function convention, so it couldn't be used with optimize_acqf etc.

Contributor

@hvarfner Jan 28, 2025

You would want the forward to be able to handle an N x q x D-shaped input, where you are currently only computing the q-element (but that aspect seems correct as far as I can tell right now). This may be a bit challenging, but certainly looks doable!

I think this is a pretty cool idea that could be useful generally in latent space models, so it would be nice to have some fairly general naming for the encoding steps if this were to be used with other encoder-decoder based architectures.
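
For illustration, a minimal sketch (using a hypothetical per-q-batch scoring helper `score_q_batch`, which is not in the PR) of how a forward can accept an N x q x D input and return one value per batch, which is what `optimize_acqf` expects:

```python
from typing import Callable

import torch


def batched_forward(
    X: torch.Tensor, score_q_batch: Callable[[torch.Tensor], torch.Tensor]
) -> torch.Tensor:
    # Promote a single q x D candidate set to a batch of size 1.
    if X.dim() == 2:
        X = X.unsqueeze(0)
    # Score each q x D batch independently and stack into an N-shaped output.
    return torch.stack([score_q_batch(x_q) for x_q in X])
```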

@eibarolle
Author

The Latent Information Gain forward function has been updated with the correct dimensions, with the test cases adjusted as needed.

@eibarolle
Author

eibarolle commented Feb 10, 2025

Please let me know when you have a chance to look over the new Latent Information Gain function. @hvarfner
