Bad size of gradInput in BiSequencerLM #418

Open
saztorralba opened this issue Jun 12, 2017 · 1 comment

@saztorralba

Hi

I'm trying to use BiSequencerLM to train a network on sequences of different lengths, but I'm running into an issue with the gradInput of the BiSequencerLM module. When a sequence is shorter than the previous one, self.gradInput in the function below keeps the number of elements of the previous sequence rather than of the current one (a minimal sketch that reproduces this follows the function).

function BiSequencerLM:updateGradInput(input, gradOutput)
   local nStep = #input

   self._mergeGradInput = self._merge:updateGradInput(self._mergeInput, gradOutput)
   self._fwdGradInput = self._fwd:updateGradInput(_.first(input, nStep - 1), _.last(self._mergeGradInput[1], nStep - 1))
   self._bwdGradInput = self._bwd:updateGradInput(_.last(input, nStep - 1), _.first(self._mergeGradInput[2], nStep - 1))

   -- add fwd rnn gradInputs to bwd rnn gradInputs
   for i=1,nStep do
      if i == 1 then
         self.gradInput[1] = self._fwdGradInput[1]
      elseif i == nStep then
         self.gradInput[nStep] = self._bwdGradInput[nStep-1]
      else
         self.gradInput[i] = nn.rnn.recursiveCopy(self.gradInput[i], self._fwdGradInput[i])
         nn.rnn.recursiveAdd(self.gradInput[i], self._bwdGradInput[i-1])
      end
   end
   return self.gradInput
end
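
Here's a minimal sketch of the kind of usage that triggers it for me (made-up module and sizes, untested as written, just to illustrate the stale length):

require 'rnn'

-- made-up sizes, for illustration only
local inputSize, hiddenSize = 4, 5
local brnn = nn.BiSequencerLM(nn.LSTM(inputSize, hiddenSize))

local function runSequence(seqLen)
   local inputs, gradOutputs = {}, {}
   for i = 1, seqLen do
      inputs[i] = torch.randn(inputSize)
   end
   local outputs = brnn:forward(inputs)
   for i = 1, seqLen do
      -- dummy gradients with the same shape as the outputs
      gradOutputs[i] = outputs[i]:clone():zero()
   end
   local gradInputs = brnn:backward(inputs, gradOutputs)
   print(seqLen, #gradInputs)  -- I expect these two numbers to match
end

runSequence(7)  -- prints 7   7
runSequence(5)  -- prints 5   7 : gradInput keeps the previous sequence's length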

I believe this is caused by self.gradInput not being recreated for the new sequence, so it keeps the length of the previous sequence. This causes an error when there are further modules to backpropagate through, because their gradOutput ends up with the wrong size (different from their input). The issue can be fixed by resetting gradInput to an empty table before the call to updateGradInput. I can do this by accessing the module from my code, but maybe it would be better to just add this line

self.gradInput={}

before the for loop (something equivalent would have to be done if working with Tensors instead of Tables).
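
In context, the patched function would read roughly like this (just the code quoted above with the one extra line; a sketch, not a tested patch):

function BiSequencerLM:updateGradInput(input, gradOutput)
   local nStep = #input

   self._mergeGradInput = self._merge:updateGradInput(self._mergeInput, gradOutput)
   self._fwdGradInput = self._fwd:updateGradInput(_.first(input, nStep - 1), _.last(self._mergeGradInput[1], nStep - 1))
   self._bwdGradInput = self._bwd:updateGradInput(_.last(input, nStep - 1), _.first(self._mergeGradInput[2], nStep - 1))

   -- start from an empty table so gradInput ends up with exactly nStep entries,
   -- even when the previous sequence was longer
   self.gradInput = {}

   -- add fwd rnn gradInputs to bwd rnn gradInputs
   for i=1,nStep do
      if i == 1 then
         self.gradInput[1] = self._fwdGradInput[1]
      elseif i == nStep then
         self.gradInput[nStep] = self._bwdGradInput[nStep-1]
      else
         self.gradInput[i] = nn.rnn.recursiveCopy(self.gradInput[i], self._fwdGradInput[i])
         nn.rnn.recursiveAdd(self.gradInput[i], self._bwdGradInput[i-1])
      end
   end
   return self.gradInput
end

The equivalent stop-gap from calling code is simply to do brnn.gradInput = {} (brnn being whatever variable holds the BiSequencerLM instance) right before calling backward, which is what I'm doing now.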

Or maybe this is expected behavior and I'm doing something wrong, in which case any advice is appreciated. Thanks!

@murthyrudra

Hi,
I'm facing the same issue when training a language model (sort of). Please suggest how I can get this resolved. Additionally, I'm using the optim package for optimization.
