
[Bug?] Positive probabilities in Alignment CRF when length nontrivial #46

Open
tyang92 opened this issue Jan 28, 2020 · 10 comments

@tyang92

tyang92 commented Jan 28, 2020

Hello!
First thanks for this awesome project. I think it will make a big difference in the NLP community and I am already excited to use it in my work.

Problem TL;DR: AlignmentCRF.log_prob returns values greater than 0 (i.e., probabilities greater than 1). I think this might be a bug unless I'm misusing the API.

More background

I'm attempting to use an Alignment CRF (Smith-Waterman) as the loss function for a generative model. The "true" labels are token sequences of varying length, and I like the alignment because it will still work if my model accidentally adds an extra token. Since my data is variable length, I provide the lengths tensor argument, which holds the length of the true labels for each batch element. When I do this and compute dist.log_prob(dist.argmax), some batch elements get a log-likelihood that is very positive. I checked the partition function and it looks very negative, so my guess is the error has to do with masking being applied in the partition but not the score, or vice versa.
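To make my masking guess concrete, here is a toy sketch in plain Python (not torch-struct's actual internals, and a deliberately degenerate single-path "model"): if the partition is masked to the true length but the score keeps summing past it, log_prob = score - log_partition can come out positive.

```python
# Toy illustration (NOT torch-struct internals): log_prob = score - log_partition.
# Per-position log-potentials for a single path; true length is 3 of 5,
# so positions 3 and 4 hold junk values past the sequence end.
log_potentials = [-0.1, -0.2, -0.3, 4.0, 5.0]
length = 3

# Partition correctly masked to the first `length` positions (a single-path
# toy, so the partition is just the path score over the valid prefix).
log_partition = sum(log_potentials[:length])

# Score mistakenly left unmasked: it keeps summing past the true length.
score = sum(log_potentials)

log_prob = score - log_partition
print(log_prob)  # positive: an impossible log-probability
```

If both score and partition were masked identically, the junk positions would cancel and log_prob would be 0 or below, as it must be.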

To reproduce this error

I can reproduce the same error in the CTC.ipynb example in the notebooks folder when I initialize the distribution with

```python
dist = torch_struct.AlignmentCRF(log_potentials, lengths=torch.Tensor([t] * 2 + [t - 1] * 3).long())
dist.log_prob(dist.argmax)
```

An image of the error is below. Here is a Colab notebook which reproduces it: https://colab.research.google.com/drive/1C1uDWNe8IcXB-Re6WcnwRsif-q-JQySe.

[screenshot of the positive log-probability values]

Do you agree this is a bug, or is this the intended behavior? Any suggestions for how I might debug this or correctly use your API would be much appreciated. I am not familiar with semirings, but if you explain what might fix the issue I can try to submit a pull request (I'm still fairly new to PyTorch, so I'm not sure how much help I can be).

Thanks for your attention and again for the great library!
Tim Y.

@srush
Collaborator

srush commented Jan 28, 2020

Yup, seems like a bug. It seems like it is not finding any valid solution at all when you set length < t.

Is there any way you could avoid using lengths on alignment? The Alignment code is a bit gnarly, so I am guessing I just messed up the length constraint. An alternative is to pad your sequences with matches on the boundaries.
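Roughly, the padding idea looks like this (a sketch with a made-up pad_to_length helper and PAD symbol, not anything in the library): extend every sequence in the batch to one common length, and later give the padded boundary positions a log-potential of 0.0 so they act as "free" certain matches.

```python
# Hypothetical helper, not part of torch-struct: right-pad token sequences
# so every batch item shares one length. The padded positions would then be
# assigned log-potential 0.0 (exp(0) = 1, a certain match) when building
# the log_potentials tensor, so they contribute nothing to any path's score.
PAD = "<pad>"

def pad_to_length(seq, max_len, pad=PAD):
    """Right-pad a token sequence to max_len with the pad symbol."""
    return list(seq) + [pad] * (max_len - len(seq))

batch = [["a", "b", "c"], ["a", "b"]]
max_len = max(len(s) for s in batch)
padded = [pad_to_length(s, max_len) for s in batch]
print(padded)  # [['a', 'b', 'c'], ['a', 'b', '<pad>']]
```

With every sequence at the same length, the distribution can be built without the lengths argument at all.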

@tyang92
Author

tyang92 commented Jan 28, 2020

Hello Dr. Rush,

Many thanks for your rapid response and diagnosis of the problem. My data is intrinsically variable length, so I appreciate your help thinking through alternatives to lengths while the bug stands.

If I understand your suggestion, you are recommending I create fake log_potentials by padding both my ground-truth sequences and predicted logits with match values? How do I determine what the match value should be? (In your example you use the predicted logit at the true label index.) Should I just use 0, so that exp(0) = 1 (a certain match)? How would this affect dist.argmax? And is it the case that if I follow this procedure exactly, my log marginal probability should not change? (Intuitively, it seems like this should change the partition function and therefore my marginal.)
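Here is a toy check of the arithmetic I have in mind (plain Python, treating each alignment path's total log-potential as a single number): adding the same forced-match log-potential c to every path shifts both the path score and the log-partition by c, so the log-probability cancels out and is unchanged for any c, while c = 0 also leaves the partition value itself numerically unchanged.

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) over a list of floats."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

# Toy model: each entry is one alignment path's total log-potential.
path_scores = [-1.0, -2.5, -0.7]
target = 2  # the path whose log-probability we inspect

base_log_prob = path_scores[target] - logsumexp(path_scores)

# Padding every path with the same match log-potential c shifts both the
# path score and the log-partition by c, so log_prob is unchanged.
for c in (0.0, -3.0, 1.5):
    shifted = [s + c for s in path_scores]
    shifted_log_prob = shifted[target] - logsumexp(shifted)
    assert abs(shifted_log_prob - base_log_prob) < 1e-9

# c = 0 is the natural choice: exp(0) = 1 (a certain match), and it also
# leaves the partition value itself numerically unchanged.
```

If this reasoning is right, the padding should not alter the marginals at all, only make the batch rectangular.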

Thank you again!
Tim Y.

@tyang92
Author

tyang92 commented Feb 19, 2020

Dear Dr. Rush,

First, thank you for writing the excellent arXiv explainer on this library! It very much helped my understanding. I have experimented with padding with matches, but I am not confident in my choice of log potentials, as I still do not fully understand how this will affect the partition function and argmax. I wonder if you might confirm that 0 is the proper choice of log potential for a certain match? Or perhaps it would be possible to write a wrapper that does this padding automatically for a given length constraint.

Either way, thank you and hope to hear from you soon.
Best,
Tim

@srush
Collaborator

srush commented Feb 19, 2020

Hi Tim,

Sorry for the slow reply. I started looking at the issue, but I am still not happy with the speed of this particular function and am worried it might not be useful. Are you using it in a production system or a demo?

Either way, I will try to put together a padded demo this week, but I want to make sure you are not waiting on a function that will not be directly applicable.

@tyang92
Author

tyang92 commented Feb 19, 2020

Hi Dr. Rush,
Thanks so much for the info. My project is a research prototype so a big performance hit should be just fine if I can get something running at all.

I look forward to the padding demo, and once again very grateful for your help!
Tim

@tyang92
Author

tyang92 commented Mar 17, 2020

Hi Dr. Rush and struct team,

Hope all is well with the virus and what not. I know it must be very hectic for you, but I wanted to check in on this. I'm now basically stuck at home and have a lot of time for this project and was hoping I could get a prototype of the alignment working.

Let me know if you have any updates or could point me in the right direction.
Thank you!
Tim

@srush
Collaborator

srush commented Mar 18, 2020

I'm so sorry, things have been a bit crazy. I will try to get to this this week.

@srush
Collaborator

srush commented Mar 23, 2020

Hi, I finally got around to making you a full example:

https://github.com/harvardnlp/pytorch-struct/blob/master/notebooks/CTC_with_padding.ipynb

[screenshot of the padded-example output]

Let me know if it is clear. I will eventually get this checked in under the lengths argument.

@tyang92
Author

tyang92 commented Mar 24, 2020

This looks like exactly what I needed! Thank you SO much Dr. Rush, best wishes and stay safe

@tyang92 tyang92 closed this as completed Mar 24, 2020
@srush
Collaborator

srush commented Mar 24, 2020

Awesome. I am going to leave it open so I remember to build it in.

@srush srush reopened this Mar 24, 2020