Duplicate Lexical Entries in OdeNet #46

Open
hdaSprachtechnologie opened this issue Aug 29, 2023 · 2 comments

Comments

@hdaSprachtechnologie
Owner

Lexical entries (LexEntries) with the same lemma should occur only for true homonyms. OdeNet, however, contains quite a lot of duplicate lexical entries that are not homonyms, and these should be resolved. The attached file "echte_Homonyme.txt" lists (some) true homonyms that should stay in OdeNet.
echte_Homonyme.txt
I will add a list of all duplicated entries. What are the rules for keeping, deleting or merging entries?

@hdaSprachtechnologie
Owner Author

There is a list of German homonyms in Wiktionary: https://de.wiktionary.org/wiki/Verzeichnis:Deutsch/Homonyme

@hdaSprachtechnologie
Owner Author

This is a list of all duplicate lexical entries in OdeNet.
all_duplicated_lexentries_odenet.txt
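
For anyone who wants to reproduce or re-check such a list, here is a minimal sketch of one way to find duplicates. It is not necessarily how the attached file was generated. It assumes the lexicon uses a WN-LMF style layout in which each `LexicalEntry` has a `Lemma` child with a `writtenForm` attribute, and that the element names carry no XML namespace. The function name `find_duplicate_entries`, the `known_homonyms` parameter, and the sample entries and ids are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_duplicate_entries(xml_text, known_homonyms=frozenset()):
    """Return {lemma: [entry ids]} for lemmas that have more than one entry.

    Lemmas listed in known_homonyms (e.g. loaded from a homonym list)
    are skipped, because those duplicates are intended.
    """
    root = ET.fromstring(xml_text)
    ids_by_lemma = defaultdict(list)
    # Assumption: WN-LMF style <LexicalEntry><Lemma writtenForm="..."/></LexicalEntry>
    for entry in root.iter("LexicalEntry"):
        lemma = entry.find("Lemma")
        if lemma is None:
            continue
        ids_by_lemma[lemma.get("writtenForm")].append(entry.get("id"))
    return {
        form: ids
        for form, ids in ids_by_lemma.items()
        if len(ids) > 1 and form not in known_homonyms
    }


# Hypothetical sample data, not taken from OdeNet.
SAMPLE = """<LexicalResource>
  <Lexicon id="sample" language="de">
    <LexicalEntry id="w1"><Lemma writtenForm="Bank" partOfSpeech="n"/></LexicalEntry>
    <LexicalEntry id="w2"><Lemma writtenForm="Bank" partOfSpeech="n"/></LexicalEntry>
    <LexicalEntry id="w3"><Lemma writtenForm="Haus" partOfSpeech="n"/></LexicalEntry>
    <LexicalEntry id="w4"><Lemma writtenForm="Haus" partOfSpeech="n"/></LexicalEntry>
    <LexicalEntry id="w5"><Lemma writtenForm="Baum" partOfSpeech="n"/></LexicalEntry>
  </Lexicon>
</LexicalResource>"""

if __name__ == "__main__":
    # "Bank" is treated as a true homonym here, so only "Haus" is reported.
    for form, ids in find_duplicate_entries(SAMPLE, {"Bank"}).items():
        print(form, ids)
```

The sketch groups by written form only. Grouping by written form plus `partOfSpeech` would be a stricter variant; which one is appropriate depends on the rules for keeping, deleting or merging entries, which are still open in this issue.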

hdaSprachtechnologie pushed a commit that referenced this issue Aug 30, 2023