Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expanded guassian_process_torch hyperparameter space #243

Merged
merged 10 commits into from
Sep 30, 2024

Conversation

MaxBalmus
Copy link
Collaborator

@MaxBalmus MaxBalmus commented Sep 17, 2024

  1. Expanded the hyper parameter space of the torch GPEs to include more expensive, but also potentially more expensive cases:

    • added ScaleKernel(MatternKernel())+ConstantKernel()
    • set ard_num_dims to be the number of inputs for the kernels where the is possible (this allows the kernels to adjust the lengthscale based on individual parameters)
    • replaced ConstantMean() and ZeroMean() from the mean_module with the higher order LinearMean() and PolyMean(degree=2)
    • PolyMean is a new class of mean_module found in autoemulatate.emulators.gaussian_process_utils
    • process_param_space: added new parameter input_dim (default value 1) equal the number of dimensions of the input. This addition is necessary to be able to define the value of ard_num_dims.
  2. Defined a new callback EarlyStoppingMax found in autoemulatate.emulators.gaussian_process_utils:

    • the new class is derived skorch.callbacks.EarlyStopping
    • it fixes a bug which assumes that the cost function is always positive which can break the monotonicity of the cost function threshold
    • a pull request was created on skorch to fix the original bug and is currently under review.

@MaxBalmus MaxBalmus requested a review from mastoffel September 17, 2024 13:06
Copy link
Collaborator

@mastoffel mastoffel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR @MaxBalmus ! Lots of good stuff in there! I've given some comments, let me know whether anything is unclear.

Some broader comments:

  1. having seperate lengthscales for features sounds great, but I suggested a slightly different implementation (see below). Let me know whether you'd like to have a go, otherwise I'm happy to do that.
  2. I think I'd be good to put the train/valid split for early stopping in a seperate PR
  3. there's a typo in the utils file name
  4. it'd be great to add docstrings and tests where possible (see below for details)
  5. could you give a bit of context about the new mean modules for me to understand them a bit better?

@@ -253,7 +253,7 @@ def predict(self, X, return_std=False):
return mean

@staticmethod
def get_grid_params(search_type: str = "random"):
def get_grid_params(search_type: str = "random", input_dim=1):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setting ard_num_dims is great, and we should definitely do that (I hope that the computational cost isn't too big, but lets try this). I think adding an input_dim argument isn't the cleanest option as we don't need it for most models and it suddenly makes get_grid_params data dependent. Instead, we can change the kernels to be callables and initialise them in fit when self.n_features_in_ is available. I'm thinking about something like this:

So we could have the kernels as callables like this:

    """Returns the grid parameters for the emulator."""
    param_space = {
        "covar_module": [
            lambda n_features: gpytorch.kernels.RBFKernel(ard_num_dims=n_features).initialize(lengthscale=1.0),
            lambda n_features: gpytorch.kernels.MaternKernel(nu=2.5, ard_num_dims=n_features),
            lambda n_features: gpytorch.kernels.MaternKernel(nu=1.5, ard_num_dims=n_features),
            lambda n_features: gpytorch.kernels.PeriodicKernel(ard_num_dims=n_features),
            lambda n_features: gpytorch.kernels.RQKernel(ard_num_dims=n_features),
        ],

and then in the fit method call the callable with the number of features:

covar_module = self.covar_module(self.n_features_in_) if callable(self.covar_module) else self.covar_module

Let me know if there's anything I'm missing. Happy to implement this myself or leave it up to you!

Comment on lines +124 to +125
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)
train_split = predefined_split(Dataset(X_val, y_val))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, I was wondering about this. The thing is that we have quite small datasets already, and we split initially into test/train and then do cv on train which splits and this would therefore be a third split. Would if be ok to open a seperate PR for this? Maybe we implement it as an option and see how it does.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see your point. Let's open a new PR.

from .polynomial_features import PolynomialFeatures


class PolyMean(gpytorch.means.Mean):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding a docstring and tests for this? Feel free to start a new test file. Testing with pytest tests/test_gp_utils.py

import torch


class PolynomialFeatures:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, having a docstring and tests would be great here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitted the docstrings but to be honest I am not 100% how to go about making the tests.

autoemulate/emulators/gaussian_process_torch.py Outdated Show resolved Hide resolved
@aranas
Copy link
Collaborator

aranas commented Sep 18, 2024

4. it'd be great to add docstrings and tests where possible (see below for details)

Re testing, I personally would be very keen on some upskilling re best practices around code testing. I know there are many resources around code testing, see here for one developed at the Turing & Turing Way community: https://book.the-turing-way.org/reproducible-research/testing.

I wonder, @MaxBalmus might this also be of interest to you and other researchers in TRIC? And @mastoffel would there be capacity to do a small session on this from REG side? apart from general upskilling for TRIC researchers re writing good tests, we could use this as an opportunity to develop a paragraph on contributor requirements re testing for autoemulate specifically. What do you think?

also to note, there is an upcoming Turing Way book dash that could be an opportunity to run this session using (and potentially improving!) The Turing Way resource specifically.

@mastoffel
Copy link
Collaborator

@aranas I'm sure there would be capacity if there's interest. Happy to chat about this. Lots of possibilities here, like an interactive session or just a talk etc. . I haven't read the Turing Way bits on testing yet, but will definitely do it now, thanks for the link!

@mastoffel
Copy link
Collaborator

@MaxBalmus let me know if this is ready again to review!

@MaxBalmus
Copy link
Collaborator Author

I didn't get to write the tests that you indicated. Unfortunately, I am a bit pressed for time. Could you have a look into that?

@mastoffel mastoffel mentioned this pull request Sep 30, 2024
@mastoffel mastoffel merged commit f2ee0cb into alan-turing-institute:main Sep 30, 2024
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants