
Reconstruction of table 6 from paper - Dealing with OOV words #13

Open
ThorJonsson opened this issue Feb 21, 2016 · 2 comments

@ThorJonsson

Hi, thank you very much for this.

I wanted to ask whether you could elaborate on how Table 6 is constructed; I am having some difficulty reconstructing it after training on the PTB data, specifically for OOV words.

I think I understand how to compute the cosine similarity between two words that exist in the word_vecs lookup table. However, when I compute the nearest-neighbor words based on cosine similarity, I get different results from those described in the paper:

th> get_sim_words('his',5,cpchar,word2idx,idx2word)                                         
{
  1 : 
   {
      1 : "his"
      2 : 1
    }
  2 : 
    {
      1 : "my"
      2 : 0.67714271790195
    }
  3 : 
    {
      1 : "your"
      2 : 0.67532773464339
    }
  4 : 
    {
      1 : "its"
      2 : 0.63439247861717
    }
  5 : 
    {
      1 : "her"
      2 : 0.62416681420755
    }
}

Here I am simply using the lookup table found in checkpoint.protos.rnn.modules[2].weight:double(). I take the row of the lookup table corresponding to the word whose nearest neighbors I want, compute the matrix-vector product against the full table, and sort by similarity.
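For reference, the procedure just described can be sketched in plain Python. This is only an illustrative toy, not code from this repo: the `embeddings` dict and its vectors are made up, and a real run would use the rows of the Torch lookup-table weight matrix instead.

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors given as lists of floats.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nearest(word, k, embeddings):
    # embeddings: dict mapping word -> embedding vector (hypothetical layout;
    # in the repo these would be rows of the word_vecs lookup table).
    q = embeddings[word]
    scored = [(w, cosine(q, v)) for w, v in embeddings.items()]
    scored.sort(key=lambda t: -t[1])  # highest similarity first
    return scored[:k]

# Toy embeddings, invented for the example:
emb = {
    "his": [1.0, 0.1, 0.0],
    "my":  [0.9, 0.2, 0.1],
    "cat": [0.0, 1.0, 0.5],
}
top = nearest("his", 2, emb)
print(top)  # the query word itself comes back first with similarity 1.0
```

The query word always appears first with similarity 1.0 (as in the Torch output above), so in practice one would skip the first entry when listing neighbors.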

I assume that for the nearest neighbors of OOV words you are using the character embedding space? Any help or tips on how you did this would be much appreciated.

Thanks,


bqcao commented Apr 13, 2016

Any progress to share please?

Owner

yoonkim commented Apr 14, 2016

There is randomness built into the models (due to initialization), so you shouldn't expect the nearest neighbors to be exactly the same. Your nearest neighbors seem to make sense (and are close to the ones in the paper as well).
