Wrap PyTorch models under TorchModule #147

Merged: 9 commits merged into main on Feb 9, 2024
Conversation

@bryanlimy (Member) commented Feb 6, 2024

  • new autoemulate/emulators/neural_networks module, where new PyTorch architectures can be defined by implementing TorchModule
  • NeuralNetTorch takes a string argument module, naming the module to initialize
  • moved set_random_seed to autoemulate/utils.py, since it might also be used by non-PyTorch modules

Addresses Issue #129.

@bryanlimy (Member, Author) commented Feb 6, 2024

The update fails the check_estimators_overwrite_params test in estimator_checks. I am not sure why: if we create a deepcopy of a model and fit the copy, why would we expect the fitted copy to still match the original model?

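For context, a rough paraphrase of what that sklearn check verifies (not the exact implementation): the parameters passed to __init__ must be unchanged after fit.

from copy import deepcopy

def fit_does_not_overwrite_params(estimator, X, y):
    # Paraphrased idea behind check_estimators_overwrite_params:
    # fitting must not mutate the parameters that were set in __init__.
    params_before = deepcopy(estimator.get_params())
    estimator.fit(X, y)
    return params_before == estimator.get_params()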
@bryanlimy requested a review from @mastoffel on February 6, 2024, 17:21
@mastoffel (Collaborator) left a comment:

Great work @bryanlimy! Just a few comments below.

  • the failed check_estimators_overwrite_params test is fine for now as long as everything else works. Could you add it to the _xfail_check dict in neural_net_torch.py to pass CI? (A sketch of one common way to declare this follows below.)

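One common way sklearn-compatible estimators mark such expected failures is an _xfail_checks entry returned from _more_tags; a minimal sketch, assuming a skorch-based estimator (the actual dict name and location in neural_net_torch.py may differ):

from skorch import NeuralNetRegressor

class NeuralNetTorch(NeuralNetRegressor):
    def _more_tags(self):
        # Tells sklearn's estimator checks to expect this check to fail.
        return {
            "_xfail_checks": {
                "check_estimators_overwrite_params": "known failure, see discussion above",
            }
        }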
autoemulate/compare.py (outdated review comment, resolved)
Comment on lines 9 to 15
def register(name):
    def add_to_dict(fn):
        global _MODULES
        _MODULES[name] = fn
        return fn

    return add_to_dict
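Presumably the decorator is applied to TorchModule subclasses so they can later be looked up by name, roughly like this (hypothetical usage):

@register("mlp")              # adds MLPModule to _MODULES under the key "mlp"
class MLPModule(TorchModule):
    ...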
@mastoffel (Collaborator):

I wonder whether global variables can lead to issues down the line (state management/testing) and whether it would be better to encapsulate this in a class. If you're frequently using globals and think it's fine, I'm ok with leaving this for the moment.

@bryanlimy (Member, Author):
Updated the get_module method to not use global variables.

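The updated code is not shown in this thread, but for illustration, one way to drop the module-level global is to keep the mapping as class state (hypothetical names, not the actual implementation):

class ModuleRegistry:
    """Holds the name -> module-class mapping as class state rather than a global dict."""

    _modules: dict = {}

    @classmethod
    def register(cls, name):
        def add_to_dict(fn):
            cls._modules[name] = fn
            return fn

        return add_to_dict

    @classmethod
    def get_module(cls, name):
        return cls._modules[name]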
autoemulate/emulators/neural_networks/neural_networks.py (outdated review comment, resolved)
Comment on lines 10 to 30
class MLPModule(TorchModule):
    def __init__(
        self,
        input_size: int = None,
        output_size: int = None,
        random_state: int = None,
        hidden_sizes: Tuple[int] = (100,),
    ):
        super(MLPModule, self).__init__(
            module_name="mlp",
            input_size=input_size,
            output_size=output_size,
            random_state=random_state,
        )
        modules = []
        for hidden_size in hidden_sizes:
            modules.append(nn.Linear(in_features=input_size, out_features=hidden_size))
            modules.append(nn.ReLU())
            input_size = hidden_size
        modules.append(nn.Linear(in_features=input_size, out_features=output_size))
        self.model = nn.Sequential(*modules)
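For context, the module above stacks a Linear/ReLU pair per entry in hidden_sizes plus a final output layer; with hypothetical sizes it would be instantiated like:

module = MLPModule(input_size=10, output_size=1, hidden_sizes=(64, 32))
# module.model is Sequential(Linear(10, 64), ReLU(), Linear(64, 32), ReLU(), Linear(32, 1))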
@mastoffel (Collaborator):

I wonder whether we should move the hyperparameter search space from neural_net_torch.py to here, as it will be quite specific for each PyTorch model.

@bryanlimy (Member, Author):

Yes, I think it makes sense that the hyperparameter settings live within each TorchModule.

@bryanlimy (Member, Author):

I have moved the get_grid_params method to TorchModule. A problem with this approach is that the module is not initialized when NeuralNetTorch is created; it is only initialized when we call fit for the first time. So if we call grid_params = nn_torch_model.get_grid_params() before a hyperparameter search, we get an error because self.module_ does not exist yet.

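As a sketch of that idea (placeholder values, not the actual search space), each TorchModule subclass could expose its own grid:

class MLPModule(TorchModule):
    ...

    def get_grid_params(self):
        # Hypothetical, architecture-specific search space.
        return {
            "module__hidden_sizes": [(50,), (100,), (100, 100)],
            "lr": [1e-3, 1e-2],
            "max_epochs": [10, 20],
        }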
@bryanlimy (Member, Author):

We can either always initialize the module in NeuralNetTorch.__init__, which would fail some cases in the estimator test suite, or initialize the module ourselves before running a hyperparameter search.

@mastoffel (Collaborator):

Yes I see, so this currently fails when running:

em = AutoEmulate()  
em.setup(X, y, model_subset=["NeuralNetTorch"], param_search=True)
em.compare()

with AttributeError: 'NeuralNetTorch' object has no attribute 'module_'

My feeling is that initializing the module in NeuralNetTorch.__init__ is fine and we just add the failed tests to "_xfail_checks", because this seems like what skorch is intending to do anyway.
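A minimal sketch of that option, assuming a skorch-style estimator where initialize() builds self.module_ (the real NeuralNetTorch __init__ takes more arguments, and get_module is the lookup added in this PR):

from skorch import NeuralNetRegressor

class NeuralNetTorch(NeuralNetRegressor):
    def __init__(self, module="mlp", **kwargs):
        super().__init__(module=get_module(module), **kwargs)
        self.initialize()  # build self.module_ eagerly instead of waiting for fit()

    def get_grid_params(self):
        # Now safe to call before fit() or a hyperparameter search.
        return self.module_.get_grid_params()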

@mastoffel (Collaborator):

@bryanlimy sorry one more comment: Would you mind adding a few docstrings?

github-actions bot (Contributor) commented Feb 9, 2024

Coverage report

Lines missing coverage, by file:

  autoemulate/utils.py: 322-323
  autoemulate/emulators/neural_net_torch.py: 75, 117
  autoemulate/emulators/neural_networks/__init__.py: none
  autoemulate/emulators/neural_networks/base.py: 28, 31
  autoemulate/emulators/neural_networks/get_module.py: 15-16
  autoemulate/emulators/neural_networks/mlp.py: 37-58
  tests/test_emulators.py: none

This report was generated by python-coverage-comment-action

@codecov-commenter commented Feb 9, 2024

Codecov Report

Attention: 16 lines in your changes are missing coverage. Please review.

Comparison is base (d6adb8b) 91.72% compared to head (44582e6) 91.54%.
Report is 15 commits behind head on main.

Files                                                 Patch %   Missing lines
autoemulate/emulators/neural_networks/mlp.py           71.42%   8
autoemulate/emulators/neural_net_torch.py              77.77%   2
autoemulate/emulators/neural_networks/base.py          86.66%   2
autoemulate/emulators/neural_networks/get_module.py    80.00%   2
autoemulate/utils.py                                   80.00%   2
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #147      +/-   ##
==========================================
- Coverage   91.72%   91.54%   -0.18%     
==========================================
  Files          36       40       +4     
  Lines        1691     1739      +48     
==========================================
+ Hits         1551     1592      +41     
- Misses        140      147       +7     

☔ View full report in Codecov by Sentry.

@bryanlimy merged commit a70aa6e into main on Feb 9, 2024 (5 checks passed).
@bryanlimy deleted the pytorch_wrapper branch on February 9, 2024, 16:40.