Wrap PyTorch models under TorchModule #147
Conversation
…plement the `TorchModule` class; users can define which NN architecture to use by providing the `module` argument
The update failed the test case.
Great work @bryanlimy! Just a few comments below.

- The failed `check_estimators_overwrite_params` test is fine for now as long as everything else works. Could you add it to the `_xfail_check` dict in `neural_net_torch.py` to pass CI?
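For reference, a minimal sketch of what such an entry might look like, assuming the estimator exposes sklearn's `_more_tags` mechanism (the reason string here is illustrative, not the PR's actual wording):

```python
from skorch import NeuralNetRegressor


class NeuralNetTorch(NeuralNetRegressor):
    def _more_tags(self):
        # Checks listed under _xfail_checks are treated as expected failures
        # by sklearn's estimator test suite; the value documents the reason.
        return {
            "_xfail_checks": {
                "check_estimators_overwrite_params": (
                    "skorch-based estimators modify parameters during fit"
                ),
            }
        }
```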
```python
def register(name):
    def add_to_dict(fn):
        global _MODULES
        _MODULES[name] = fn
        return fn

    return add_to_dict
```
I wonder whether global variables can lead to issues down the line (state management/testing) and whether it would be better to encapsulate this in a class. If you're frequently using globals and think it's fine, I'm ok with leaving this for the moment.
Updated the `get_module` method to not use global variables.
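One way to encapsulate the registry without module-level globals could look like this (the class name and structure are assumptions, not the PR's actual implementation):

```python
class ModuleRegistry:
    """Maps module names (e.g. "mlp") to TorchModule subclasses."""

    def __init__(self):
        self._modules = {}

    def register(self, name):
        # Decorator that records fn under the given name.
        def add_to_dict(fn):
            self._modules[name] = fn
            return fn

        return add_to_dict

    def get_module(self, name):
        if name not in self._modules:
            raise KeyError(f"Unknown module: {name!r}")
        return self._modules[name]
```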
```python
class MLPModule(TorchModule):
    def __init__(
        self,
        input_size: int = None,
        output_size: int = None,
        random_state: int = None,
        hidden_sizes: Tuple[int, ...] = (100,),
    ):
        super(MLPModule, self).__init__(
            module_name="mlp",
            input_size=input_size,
            output_size=output_size,
            random_state=random_state,
        )
        modules = []
        for hidden_size in hidden_sizes:
            modules.append(nn.Linear(in_features=input_size, out_features=hidden_size))
            modules.append(nn.ReLU())
            input_size = hidden_size
        modules.append(nn.Linear(in_features=input_size, out_features=output_size))
        self.model = nn.Sequential(*modules)
```
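For illustration, a module defined this way stacks a Linear/ReLU pair per hidden size (the argument values below are arbitrary):

```python
# Builds Linear(8, 64) -> ReLU -> Linear(64, 32) -> ReLU -> Linear(32, 1)
module = MLPModule(input_size=8, output_size=1, hidden_sizes=(64, 32))
```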
I wonder whether we should move the hyperparameter search space from `neural_net_torch.py` to here, as it will be quite specific to each PyTorch model.
Yes, I think it makes sense for the hyperparameter settings to live within each `TorchModule`.
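A sketch of what per-module settings could look like; the parameter names and values are illustrative, not the PR's actual search space:

```python
class MLPModule(TorchModule):
    @staticmethod
    def get_grid_params():
        # Search space specific to the MLP architecture; skorch routes
        # "module__"-prefixed parameters to the underlying nn.Module.
        return {
            "module__hidden_sizes": [(50,), (100,), (100, 100)],
            "lr": [1e-3, 1e-2],
            "max_epochs": [10, 30],
        }
```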
I have moved the `get_grid_params` method to `TorchModule`. A problem with this approach is that the `module` is not initialized upon creation of `NeuralNetTorch`; it is only initialized when we call `fit` for the first time. So if we call `grid_params = nn_torch_model.get_grid_params()` before a hyperparameter search, we get an error because `self.module_.get_grid_params` does not exist yet.
We can either always initialize the `module` in `NeuralNetTorch.__init__`, which would fail some cases in the estimator test suite, or initialize the module ourselves before running a hyperparameter search.
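The second option could look roughly like this: skorch's `NeuralNet.initialize()` builds `module_` eagerly, so calling it up front would make `get_grid_params` usable before `fit` (a sketch under that assumption; `nn_torch_model` is illustrative):

```python
nn_torch_model = NeuralNetTorch(module="mlp")
if not hasattr(nn_torch_model, "module_"):
    nn_torch_model.initialize()  # skorch instantiates module_ here, before any fit
grid_params = nn_torch_model.get_grid_params()  # now delegates to self.module_
```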
Yes, I see. So this currently fails when running:

```python
em = AutoEmulate()
em.setup(X, y, model_subset=["NeuralNetTorch"], param_search=True)
em.compare()
```

with `AttributeError: 'NeuralNetTorch' object has no attribute 'module_'`.
My feeling is that initializing the `module` in `NeuralNetTorch.__init__` is fine, and we just add the failed tests to `"_xfail_checks"`, because this seems to be what skorch intends anyway.
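That could look something along these lines (a sketch, not the merged code; it reuses the PR's `get_module` to resolve the name):

```python
from skorch import NeuralNetRegressor


class NeuralNetTorch(NeuralNetRegressor):
    def __init__(self, module="mlp", **kwargs):
        super().__init__(module=get_module(module), **kwargs)
        # Eagerly build module_ so get_grid_params() works before fit;
        # the estimator checks this breaks are listed in "_xfail_checks".
        self.initialize()
```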
@bryanlimy sorry, one more comment: would you mind adding a few docstrings?
…forward and `get_grid_params` in `TorchModule`; add docstrings
Codecov report

```diff
@@            Coverage Diff             @@
##             main     #147      +/-   ##
==========================================
- Coverage   91.72%   91.54%   -0.18%
==========================================
  Files          36       40       +4
  Lines        1691     1739      +48
==========================================
+ Hits         1551     1592      +41
- Misses        140      147       +7
```
…where model was tested without initialization
- Add `autoemulate/neural_networks` modules, where we can define new PyTorch architectures by implementing `TorchModule`.
- `NeuralNetTorch` takes an input argument (string) `module` as the name of the module to initialize.
- Move `set_random_seed` to `autoemulate/utils.py`, as it might be used by other non-PyTorch modules.
- Address Issue #129.
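Putting the pieces together, usage might look like this (the import path and data variables are assumptions for illustration):

```python
from autoemulate.emulators import NeuralNetTorch  # path assumed for illustration

model = NeuralNetTorch(module="mlp")  # select a registered TorchModule by name
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
```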