add grid lstm #345
An example of usage of the layer

But it works. My question is really about what happens behind the scenes in the Element-Research rnn framework, so that I can get it to run as fast as the original version and without a memory leak. Could someone advise me on these questions:
1. How does garbage collection work? Does training with multiple forward/backward passes free the tensors? Should I call the forget method at each step of the training?
2. If I want to initialize the parameters (https://github.com/christopher5106/grid-lstm/blob/master/train.lua#L163-L180) inside the layer, in which function should I put that code?
Thanks a lot
Using rnn:forget() solves the memory and speed issues.
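For readers wondering where such a call would go, here is a minimal sketch, not the code from this PR. It assumes the Element-Research rnn package is installed and uses a plain nn.LSTM in place of the Grid LSTM layer; the sizes and the loop counts are hypothetical. The idea is that a recurrent module stepped manually keeps per-step state until forget() is called, so calling it before each new sequence keeps memory from growing across iterations.

```lua
require 'torch'
require 'rnn'  -- Element-Research/rnn (assumed to be installed)

-- Hypothetical sizes, chosen only for illustration.
local inputSize, outputSize, batchSize, seqLen = 2, 3, 4, 5

-- nn.LSTM stands in for the Grid LSTM layer of this PR.
local lstm = nn.LSTM(inputSize, outputSize)

local lastOutput
for iter = 1, 3 do
   -- Reset the step counter and the stored per-step state
   -- before starting a new sequence.
   lstm:forget()
   for t = 1, seqLen do
      lastOutput = lstm:forward(torch.randn(batchSize, inputSize))
   end
end
```

Without the forget() call, each forward in this loop would be treated as one more step of a single ever-growing sequence.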
Small corrections. Works perfectly now.
@christopher5106 Is it too late for me to ask you to include documentation and a unit test? Sorry for the delay.
Hello guys! Any updates on this PR? Looks tasty!
What would you like exactly for this?
@christopher5106 For documentation, adding a section to README.md with a link to the paper and a brief explanation should do the trick. For unit tests, add a function to test.lua to make sure GridLSTM behaves as expected. It doesn't have to be extensive.
self.cells = {[0] = {}}

for L=1,self.nb_layers do
   local h_init = torch.zeros(input:size(1), self.outputSize):cuda()
You will have to fix this so it can work without a GPU; I'm getting errors on the CPU version.
I tried removing the cuda() references but could not get it to work on the rnn-sin demo with a 4x2 tensor/table.
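One possible way to avoid the hard-coded cuda() call is to allocate the initial state with the same tensor type as the input. The sketch below is an assumption-laden illustration, not the fix adopted in this PR: the helper name zeroState is hypothetical, and it assumes input is a single 2D tensor (batch x features) rather than a table of tensors.

```lua
require 'torch'

-- Hypothetical helper: build a zero-filled initial state whose tensor type
-- (FloatTensor, DoubleTensor, CudaTensor, ...) matches that of `input`.
-- Assumes `input` is a tensor, not a table of tensors.
local function zeroState(input, outputSize)
   return input.new(input:size(1), outputSize):zero()
end

-- CPU usage example with a 4x2 input and a hypothetical outputSize of 3.
local input = torch.FloatTensor(4, 2):zero()
local h_init = zeroState(input, 3)
print(h_init:type())
```

Because the state is created from the input itself, the same layer code would run unchanged on CPU or GPU tensors; a table input would need the type taken from one of its elements instead.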
I added a 2D Grid LSTM, following https://github.com/coreylynch/grid-lstm
It is pretty slow and I get an out-of-memory error. It seems I don't fully understand all aspects of Element-Research/rnn...