Wrong POS for "keine": PRON instead of DET #1431

Open
GeorgeS2019 opened this issue Mar 4, 2024 · 7 comments

Comments

@GeorgeS2019

GeorgeS2019 commented Mar 4, 2024

Ich habe keine Übungen gemacht, weil ich keine Lust habe.

Stanza tags keine as DET.
CoreNLP 4.5.6 (with the corresponding 4.5.6 German model) tags keine as PRON.

@AngledLuffa
Contributor

The data used to train the Stanza tagger was

ud-treebanks-v2.13/UD_German-GSD/de_gsd-ud-train.conllu

where keine is treated as DET

The CoreNLP tagger has not been retrained since UD 2.4, where the standard was to treat keine as PRON

Retraining taggers with updated data is less of a hassle than the general feature additions you've been requesting, so we'll put updated data for some of those models on the list.

@GeorgeS2019
Author

GeorgeS2019 commented Mar 4, 2024

@AngledLuffa

I have tried to contact @manning through LinkedIn regarding CoreNLP 4.5.6, with a specific interest in the 4.5.6 German model.

@GeorgeS2019
Author

@AngledLuffa

I also have an issue with the dependency parsing results. Hopefully this will go away once the German POS assignment is correct.

@GeorgeS2019
Author

@AngledLuffa
I am comparing the CoreNLP German output, obtained through code, with that of Stanza.
I understand that the online CoreNLP demo is no longer running, so it will take a few extra steps to compare CoreNLP 4.5.6 with the latest Stanza.
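
For anyone doing the same comparison, here is a minimal sketch of the diffing step only. It assumes the tokens and UPOS tags have already been extracted from each tool into parallel lists. The class name `TagDiff` and the method `diff` are hypothetical helpers, not part of CoreNLP or Stanza, and the sample data contains only the one token and the two tags reported in this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of CoreNLP or Stanza.
class TagDiff {

    /**
     * Returns one tab-separated line (index, token, tagA, tagB) for every
     * position where the two tag sequences disagree.
     */
    static List<String> diff(List<String> tokens, List<String> tagsA, List<String> tagsB) {
        if (tokens.size() != tagsA.size() || tokens.size() != tagsB.size()) {
            // Different tokenizations must be aligned before calling this method.
            throw new IllegalArgumentException("tokens and tag lists must have equal length");
        }
        List<String> mismatches = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!tagsA.get(i).equals(tagsB.get(i))) {
                mismatches.add(i + "\t" + tokens.get(i) + "\t" + tagsA.get(i) + "\t" + tagsB.get(i));
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Only the token discussed in this issue, with the tags reported above.
        List<String> tokens = List.of("keine");
        List<String> stanzaTags = List.of("DET");
        List<String> coreNlpTags = List.of("PRON");
        diff(tokens, stanzaTags, coreNlpTags).forEach(System.out::println);
    }
}
```

The length check matters in practice: if the two tools tokenize a sentence differently (for example around multi-word tokens), the lists need to be aligned first, otherwise a position-by-position comparison is meaningless.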

@AngledLuffa
Contributor

AngledLuffa commented Mar 4, 2024 via email

@GeorgeS2019
Author

GeorgeS2019 commented Mar 15, 2024

@AngledLuffa

Does the German parser in CoreNLP support XPOS? I can only find UPOS.

CoreNLP

props.setProperty("annotators", "tokenize, ssplit, mwt, pos, lemma, ner, depparse");

Stanza

https://stanfordnlp.github.io/stanza/pos.html

@AngledLuffa
Contributor

AngledLuffa commented Mar 15, 2024 via email
