
Regarding the BiLSTM baseline model stated in the PAWS paper #3

Open
AladarMiao opened this issue Jun 27, 2019 · 4 comments

Comments

@AladarMiao

If I read the PAWS paper correctly, it states that BiLSTM + cosine similarity is one of the baseline models used to evaluate the PAWS dataset. I tried to reproduce the experiment with a BiLSTM + cosine similarity model I designed, but the accuracy is still quite far from the one reported in the paper. Is there somewhere I can see how you defined the BiLSTM + cosine similarity model? It would be really helpful for my current study on paraphrase identification. Thanks in advance!

@yuanzh
Collaborator

yuanzh commented Jul 1, 2019

Hi, sorry for the delay. Could you please specify which number in the paper you would like to compare to, and whether you got a lower or a higher accuracy number?

Regarding our model architecture, it's a standard BiLSTM with dropout = 0.2, hidden size = 256, ReLU activation, and GloVe embeddings, using the last/first state vectors of the forward/backward LSTM. What's your model configuration?
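
For reference, here is a minimal PyTorch sketch of an encoder along these lines. It is not the authors' code: the embedding dimension, where the ReLU is applied, and the padding handling are assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    """Sketch of the described encoder: BiLSTM -> first/last states -> dense (256)."""

    def __init__(self, vocab_size, embed_dim=300, hidden_size=256, out_dim=256):
        super().__init__()
        # GloVe vectors would be loaded into this embedding table (300-d assumed).
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.dropout = nn.Dropout(0.2)
        self.bilstm = nn.LSTM(embed_dim, hidden_size,
                              batch_first=True, bidirectional=True)
        # Dense projection of the concatenated forward/backward states.
        self.proj = nn.Linear(2 * hidden_size, out_dim)

    def forward(self, token_ids):
        x = self.dropout(self.embedding(token_ids))    # (B, T, E)
        outputs, _ = self.bilstm(x)                    # (B, T, 2H)
        h = self.bilstm.hidden_size
        fwd_last = outputs[:, -1, :h]   # forward LSTM state at the last token
        bwd_first = outputs[:, 0, h:]   # backward LSTM state at the first token
        v = torch.cat([fwd_last, bwd_first], dim=-1)   # (B, 2H)
        return torch.relu(self.proj(v))                # sentence vector, (B, 256)
```

Padding is omitted for brevity; with padded batches you would index the last non-pad token instead of position -1.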

@AladarMiao
Author

I am currently using a self-trained embedding, a BiLSTM, the last state vector, concatenation, and a dense layer as the last layer. If what you stated is the case, where does cosine similarity come in? I am comparing my model with what's stated on page 8 of the paper, where the BiLSTM achieved 86.3 accuracy and 91.6 AUC.

@yuanzh
Collaborator

yuanzh commented Jul 3, 2019

  1. Each input is first mapped to a vector by the BiLSTM. Let v_l and v_r be the vectors of the left/right inputs.
  2. The final score is sigmoid(a(cosine_similarity(v_l, v_r) + b)), where a and b are learned variables; see the sketch below. I'm not sure if the affine transformation makes a big difference.

Just to be more precise, we take the state at the last token for the forward LSTM and the state at the first token for the backward LSTM, concatenate the two states, and add a dense layer to project them to the required dimension (256).
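
A minimal sketch of that scoring head (PyTorch again, reusing the BiLSTMEncoder sketch above; treating a and b as scalar parameters and training with binary cross-entropy are assumptions on my part, not the authors' stated setup):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineScorer(nn.Module):
    """Sketch of sigmoid(a * (cosine_similarity(v_l, v_r) + b)) with learned a, b."""

    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(1.0))  # learned scale
        self.b = nn.Parameter(torch.tensor(0.0))  # learned shift

    def forward(self, v_left, v_right):
        cos = F.cosine_similarity(v_left, v_right, dim=-1)  # (B,)
        return torch.sigmoid(self.a * (cos + self.b))       # paraphrase probability

# Hypothetical usage with one encoder shared by both sentences:
#   encoder = BiLSTMEncoder(vocab_size=30000)
#   scorer = CosineScorer()
#   p = scorer(encoder(left_ids), encoder(right_ids))
#   loss = F.binary_cross_entropy(p, labels.float())
```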

@AladarMiao
Author

Thanks!
