How to use masksembles? #3
Comments
Hey @ToBeNormal, thanks! Could you elaborate a bit on what you would like to be added? I plan to provide a tutorial on how to use Masksembles layers on, e.g., the CIFAR dataset. Would that be good?
In general, yes, that's the simplest way to use it. You could start by inserting Masksembles layers in place of dropout layers; in our experiments this worked fine and improved the quality of the generated uncertainty. On the other hand, if you would like to achieve the single model to ensemble transition that we described in our paper, then you need to increase the number of channels in your layers. From a practical point of view, though, I would recommend the first option.
A tutorial on how to use Masksembles layers on, e.g., the CIFAR dataset would be helpful enough! Thanks a lot!
Hey @ToBeNormal, sorry for the late response. I've updated the README with some examples and a Colab notebook with MNIST. It would be great if you could check it and comment on whether it works for you.
Hi, sorry for the late response. Does the batch size correspond to the number of sampling passes at test time?
@nikitadurasov Hi, is it right that we can only test one sample at a time during testing? Does the test batch size have to be 1?
@nikitadurasov Hi, I don't understand the difference between the total model size and the model size. Could you explain it? Thanks a lot!
@nikitadurasov Hi, does masksembles layer bring learnable parameters? In table 1, the model size of MC-dropout |
@nikitadurasov Hi~Could you tell me the batch size during the training and testing process? |
Hey @ToBeNormal, to put it plainly: when you have a batch of N samples and there are M models in Masksembles, then after inference you still get a batch of N predictions. The trick is that, because of the current implementation, the first N / M predictions in the batch come from the first submodel, the next N / M predictions from the second submodel, and so on. That's why N % M = 0 is required.
About total size: imagine a simple NN model with only one hidden layer. The input and output sizes are fixed (say I and O), and the hidden layer has size H. If you add a Masksembles layer after the hidden layer, then every submodel of Masksembles effectively uses fewer active neurons and, in general, has a smaller capacity. To compensate, we increase H for the Masksembles model so that each submodel's size matches the original model. Hope that helps!
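The batch layout described above can be illustrated with a small NumPy sketch. It is not the library's code: `tile_for_submodels`, `split_by_submodel` and `fake_masksembles_forward` are hypothetical helpers written for this example, and the fake forward pass only imitates the "chunk k goes to submodel k" behaviour. Tiling the inputs M times is one way (an assumption here, not something stated in this thread) to get a prediction from every submodel for every sample.

```python
import numpy as np


def tile_for_submodels(x, n_models):
    """Repeat the whole batch n_models times along axis 0 (hypothetical helper).

    After tiling, chunk k of the batch contains every original sample once,
    so submodel k sees all of them.
    """
    reps = (n_models,) + (1,) * (x.ndim - 1)
    return np.tile(x, reps)


def split_by_submodel(preds, n_models):
    """Reshape N predictions into (n_models, N // n_models, ...)."""
    n = preds.shape[0]
    if n % n_models != 0:
        raise ValueError("batch size N must satisfy N % M == 0")
    return preds.reshape(n_models, n // n_models, *preds.shape[1:])


def fake_masksembles_forward(x, n_models):
    """Stand-in for a real network: chunk k is scaled by (k + 1).

    This only mimics the chunk-to-submodel assignment, nothing else.
    """
    chunks = np.split(x, n_models, axis=0)
    return np.concatenate([c * (k + 1) for k, c in enumerate(chunks)], axis=0)


M = 4                                          # number of submodels
x = np.arange(6, dtype=float).reshape(3, 2)    # 3 samples, 2 features

tiled = tile_for_submodels(x, M)               # shape (12, 2), 12 % 4 == 0
preds = fake_masksembles_forward(tiled, M)     # still 12 predictions
per_model = split_by_submodel(preds, M)        # shape (4, 3, 2)

mean = per_model.mean(axis=0)                  # ensemble prediction per sample
std = per_model.std(axis=0)                    # spread across submodels
```

With this layout, `per_model[k]` holds the predictions of submodel k for all original samples, so the spread along axis 0 can serve as a per-sample uncertainty estimate. The test batch size therefore does not have to be 1; it only has to be divisible by M.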
Hello, do you have a PyTorch version of this code?
Hi, thank you for your excellent work!
I have read the masksembles layer code, but I have a little issue with understanding how to use it.
In practice, does it mean replacing the dropout layer with a Masksembles layer?
Could you give a more detailed example?
I have run the following test code successfully.
    import torch
    from masksembles.torch import Masksembles1D
    layer = Masksembles1D(20, 10, 2.)
    output = layer(torch.randn([40, 20]))
    print(output)