Universal dictionary with information about the world #80
Comments
That is an interesting idea. Currently UDPipe can utilize only some columns in the CoNLL-U file. So either we could implement utilizing individual features from FEATS (which we should anyway), or support explicit "external" knowledge (i.e., a mapping from FORM, or maybe any other column, to a value which is passed to the tagger/parser/...). I will be improving support for the morphological dictionary in several months (currently it needs to be specified during training and is embedded in the model; we want to be able to utilize any given dictionary during inference, and I wanted to add support for providing only some of the columns). Maybe during the rewrite I could generalize the dictionary to also provide "additional" columns (like valency), which would be passed to the tagger/lemmatizer/parser. I will think about it, and I am leaving this open as a reminder.
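To make the two options concrete, here is a rough Python sketch. It is not UDPipe code and does not use any UDPipe API. `parse_feats` only relies on the standard CoNLL-U encoding of FEATS (`Name=Value` pairs joined by `|`, or `_` when empty). `external_value` and its lowercasing of FORM are assumptions made for illustration.

```python
def parse_feats(feats: str) -> dict[str, str]:
    """Split a CoNLL-U FEATS value such as 'Case=Nom|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def external_value(form: str, lexicon: dict[str, str], default: str = "_") -> str:
    """Look up 'external' knowledge for a word form (hypothetical helper).

    Lowercasing the form before lookup is an assumption, not UDPipe behaviour.
    """
    return lexicon.get(form.lower(), default)
```

With the first option, a tagger or parser could consume each entry of `parse_feats(...)` as a separate input. With the second, the value returned by `external_value(...)` would be passed alongside the token without touching FEATS at all.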
A fun example: you can provide a dictionary of average lengths of objects. The parser will deep-learn that bigger objects are rarely inside smaller ones, which should help to disambiguate e.g. the classical "Alice drove down the street in her car."
If there is (for example) a valency dictionary, one can tag each verb in the gold standard with valency, train the parser using that additional annotation, and then provide the dictionary at the inference stage so that the parser can make better, more informed decisions, like UDPipe already does with a morphological dictionary. I wonder if putting everything into the FEATS column isn’t suboptimal. Should there be a dedicated way to aid the parser with additional non-morphological annotation, or should using FEATS suffice? What if one does not have a morpho dict but has a valency dict?
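For illustration, the "put it into FEATS" workaround could look roughly like the sketch below. It is plain Python over CoNLL-U text and does not call UDPipe. The feature name `Valency`, its values, and the choice to key the dictionary on LEMMA are all assumptions invented for this example, not an existing convention.

```python
def add_feature(feats: str, name: str, value: str) -> str:
    """Return a FEATS string with name=value added, sorted by feature name."""
    pairs = [] if feats in ("", "_") else feats.split("|")
    # Drop any previous value of the same feature before adding the new one.
    pairs = [p for p in pairs if not p.startswith(name + "=")]
    pairs.append(f"{name}={value}")
    return "|".join(sorted(pairs, key=lambda p: p.split("=", 1)[0].lower()))


def annotate_valency(conllu: str, valency: dict[str, str],
                     feature: str = "Valency") -> str:
    """Add a (hypothetical) valency feature to every verb found in the dictionary."""
    out = []
    for line in conllu.split("\n"):
        cols = line.split("\t")
        # Token lines have 10 tab-separated columns; comments and blanks pass through.
        if len(cols) == 10 and not line.startswith("#"):
            lemma, upos = cols[2], cols[3]
            if upos == "VERB" and lemma in valency:
                cols[5] = add_feature(cols[5], feature, valency[lemma])
            line = "\t".join(cols)
        out.append(line)
    return "\n".join(out)
```

The same function would be run over the gold training data and over the input at inference time, so the parser sees the extra feature in both cases. The drawback raised above still applies: the annotation shares the FEATS column with morphology instead of having a channel of its own.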