Case: fstalign ignores symbols table and does not find alignment in a simple transcript #40
Comments
Some progress: I've started playing with the symbol loading output, and it turns out fstalign wasn't loading the symbols file (the file wasn't there, but nothing failed). After fixing that and adding the ASR control symbols, I have this symbol table:
and fstalign correctly prints out the fst:
So the problem persists: with a correct FST and a transcript that is easy to align by hand, the graph walker still fails to align the transcripts.
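For anyone hitting the same silent-load problem, here is a minimal sketch of a pre-flight check to run before calling fstalign. It assumes the symbols file uses the usual OpenFst text layout (one `symbol id` pair per line); the helper names `parse_symbol_table` and `load_symbol_table` are made up for this example and are not part of fstalign.

```python
from pathlib import Path


def parse_symbol_table(text: str) -> dict[str, int]:
    """Parse 'symbol id' lines (assumed OpenFst text layout) into a mapping."""
    table: dict[str, int] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected 'symbol id', got {line!r}")
        symbol, raw_id = parts
        table[symbol] = int(raw_id)  # raises ValueError on a non-integer id
    return table


def load_symbol_table(path: str) -> dict[str, int]:
    """Read a symbol table from disk, failing loudly if it is missing or empty."""
    file = Path(path)
    if not file.is_file():
        raise FileNotFoundError(f"symbol table not found: {path}")
    table = parse_symbol_table(file.read_text(encoding="utf-8"))
    if not table:
        raise ValueError(f"symbol table is empty: {path}")
    return table
```

Calling `load_symbol_table` on the path passed to `--symbols` would have surfaced the missing file immediately instead of leaving it to show up later as a failed alignment.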
Changing composition approach to |
Hi, we don't support FST input yet for the composition we made default in https://github.com/revdotcom/fstalign/releases/tag/1.2.0, so you would have to use the
Thank you, happy I could perhaps help someone looking for this in the future! Your library is amazing!
Hi,
fstalign 1.6.1 does not load FST symbol tables properly and modifies the hypothesis FST so that it's completely borked:
Here is the output of the command
/fstalign/bin/fstalign wer --ref /data/customer/ref.txt --hyp /data/customer/hyp.nlp --symbols /data/customer/hyp.sym --output-sbs /data/customer/res.sbs --log /data/customer/res.log
in the current Docker image. It happens both with a txt file (one gold transcript word per line) and with a CTM file containing the time-aligned gold transcript. The proper FST, however, is:
with a symbol table:
The bug is thus:
hyp size: 0
when it is not 0 but were
i.e. the ids were mistakenly shifted by -8 when mapping to symbols.
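A quick way to confirm a constant offset like this is to compare the ids of the words you expected with the ids of the words that were actually printed. The sketch below uses a made-up symbol table (eight placeholder control symbols followed by two words) purely for illustration; it is not the table from this report, and `id_shifts` is a hypothetical helper, not an fstalign function.

```python
def id_shifts(symbols: dict[str, int], expected: list[str], observed: list[str]) -> set[int]:
    """Return the set of id differences (observed - expected), position by position."""
    if len(expected) != len(observed):
        raise ValueError("expected and observed must have the same length")
    return {symbols[o] - symbols[e] for e, o in zip(expected, observed)}


# Hypothetical symbol table: ids 0-7 are placeholder control symbols, words start at 8.
symbols = {f"<ctl{i}>": i for i in range(8)}
symbols.update({"hello": 8, "world": 9})

expected = ["hello", "world"]    # what the hypothesis should contain
observed = ["<ctl0>", "<ctl1>"]  # what was printed instead (made-up example)

print(id_shifts(symbols, expected, observed))  # {-8}
```

A single value in the result means every label is off by the same amount, which points at an indexing bug in the id-to-symbol mapping rather than at a bad transcript.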