Do not account for stopwords as much as we did #540

Open
alexgarel opened this issue Sep 20, 2024 · 0 comments
@alexgarel
Member

Currently we use stop words to normalize entries, but this is wrong: Product Opener does not do this. It uses stop words only at matching time.

This leads Taxonomy Editor to treat some entries as identical when they are not (e.g. "fruit juice concentrate" vs "fruit juice from concentrate"), and it may also raise synonym matches that are false positives.
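To make the collision concrete, here is a minimal sketch. The stop-word set and the `normalize` helper are hypothetical stand-ins for illustration, not the actual Taxonomy Editor or Product Opener code:

```python
# Hypothetical English stop-word set, for illustration only
STOPWORDS = {"from", "of", "the"}


def normalize(name: str, remove_stopwords: bool) -> str:
    """Lowercase, split on whitespace, optionally drop stop words, join with '-'."""
    words = name.lower().split()
    if remove_stopwords:
        words = [w for w in words if w not in STOPWORDS]
    return "-".join(words)


a = "fruit juice concentrate"
b = "fruit juice from concentrate"

# Removing stop words at normalization time makes the two entries collide
collide = normalize(a, True) == normalize(b, True)

# Keeping stop words in the normalized form keeps them distinct
distinct = normalize(a, False) != normalize(b, False)
```

Under these assumptions both `collide` and `distinct` are true, which is the behaviour described above: the entries are different, but stop-word removal at normalization hides the difference.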

Re-evaluate, at each point where we use stop words, whether we need them or not. For parent matching we might try both, but raise an error if there is more than one match once stop words are removed.
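One possible reading of the parent-matching idea, sketched under assumptions: try a strict match first, then fall back to a match with stop words removed, and fail loudly if the fallback is ambiguous. The names `find_parent`, `AmbiguousParentError`, and the stop-word set are hypothetical and do not exist in the code base:

```python
from typing import Optional

# Hypothetical English stop-word set, for illustration only
STOPWORDS = {"from", "of", "the"}


class AmbiguousParentError(Exception):
    """Raised when several entries match once stop words are removed."""


def normalize(name: str, remove_stopwords: bool = False) -> str:
    words = name.lower().split()
    if remove_stopwords:
        words = [w for w in words if w not in STOPWORDS]
    return "-".join(words)


def find_parent(parent_name: str, entry_names: list[str]) -> Optional[str]:
    # 1. Strict match: stop words are kept in the normalized form
    strict_key = normalize(parent_name)
    for name in entry_names:
        if normalize(name) == strict_key:
            return name

    # 2. Fallback: compare with stop words removed on both sides
    loose_key = normalize(parent_name, remove_stopwords=True)
    loose = [n for n in entry_names if normalize(n, True) == loose_key]
    if len(loose) > 1:
        raise AmbiguousParentError(
            f"{parent_name!r} matches several entries without stop words: {loose}"
        )
    return loose[0] if loose else None
```

With the entries "fruit juice concentrate" and "fruit juice from concentrate", looking up either name resolves strictly to itself, while a lookup such as "fruit juice of concentrate" finds no strict match and raises `AmbiguousParentError` in the fallback step.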
