remove parenthesed portion of names #559

Draft: wants to merge 1 commit into base: master

Conversation

@missinglink (Member) commented Aug 11, 2021

This draft PR explores the idea of removing parenthesized portions of names.
I'm not 100% sure this is a great idea; the test cases illustrate some positive and some potentially negative results.
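
For readers skimming the thread, here is a minimal sketch of what "removing parenthesized portions" could look like. It is not the code from this PR: the function name `strip_parenthesized`, the regular expression, and the fallback behaviour are all assumptions made for illustration, and it only handles non-nested, balanced parentheses.

```python
import re

# Hypothetical pattern: one non-nested "( ... )" group plus any whitespace before it.
_PAREN_GROUP = re.compile(r"\s*\([^()]*\)")


def strip_parenthesized(name: str) -> str:
    """Return `name` with non-nested parenthesized groups removed.

    Assumption for this sketch: if stripping leaves nothing behind
    (the whole name was in parentheses), the original name is kept.
    """
    stripped = _PAREN_GROUP.sub("", name)
    # Collapse any whitespace left over from the removal.
    stripped = " ".join(stripped.split())
    return stripped or name


if __name__ == "__main__":
    # Made-up names, not taken from the PR's test cases.
    for example in ["Main Street (North) Station", "(Old) Town Hall", "(unnamed)"]:
        print(repr(example), "->", repr(strip_parenthesized(example)))
```

The fallback hints at the kind of potentially negative result mentioned above: whatever sits inside the parentheses is discarded, even when it is the part a user would search for.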

@missinglink (Member, Author)

This was motivated by the following results from a TV series showing up for the query 90210:

[Image: Screenshot 2021-08-11 at 15 03 31]

@missinglink (Member, Author) commented Aug 11, 2021

I also considered implementing something similar in pelias/schema, where we would store the original text verbatim but only index the tokens outside the parentheses. That approach is also not without its potential issues...
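
The "store verbatim, index only the outside tokens" alternative can be sketched in a few lines. This is an illustration under stated assumptions, not how pelias/schema works: `IndexedName`, `analyze_name`, the lowercasing, and the `\w+` tokenizer are all hypothetical, and a real implementation would live in the search index's analysis configuration rather than in application code.

```python
import re
from typing import List, NamedTuple


class IndexedName(NamedTuple):
    """Hypothetical record: the text to display and the tokens to search on."""
    original: str      # stored verbatim, parentheses included
    tokens: List[str]  # only tokens found outside parentheses


_PAREN_GROUP = re.compile(r"\([^()]*\)")  # non-nested groups only
_TOKEN = re.compile(r"\w+")


def analyze_name(name: str) -> IndexedName:
    # Blank out parenthesized groups before tokenizing, but leave `name` untouched.
    outside = _PAREN_GROUP.sub(" ", name)
    tokens = [token.lower() for token in _TOKEN.findall(outside)]
    return IndexedName(original=name, tokens=tokens)


if __name__ == "__main__":
    # Made-up name for illustration only.
    record = analyze_name("Example House (as seen in 90210)")
    print(record.original)
    print(record.tokens)
```

In this sketch a query for `90210` would no longer match the made-up record, while the displayed label keeps its parenthesized suffix. The trade-off is the same as with stripping at import time: anything inside the parentheses becomes unsearchable.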

@orangejulius (Member)

I like this! I'm sure it has a downside somewhere, but I think it's worth exploring. Definitely worth kicking off a build. The diff in the Vancouver extract actually looks very positive.

@orangejulius (Member)

I came across this PR again today and figured we should test it out. Branch is rebased and a planet build is kicked off :)


2 participants