Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: skip street parse when its also the subject #1506

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

orangejulius
Copy link
Member

@orangejulius orangejulius commented Feb 4, 2021

In past work (#1469 and #1468), we've discovered that searching across both the name and address fields can be problematic. Records that are a poor name match but also match the address fields can easily outscore a record with an excellent name match.

This PR explores another general variant of that situation: times when the Pelias Parser returns the same value as both the subject parse and the street parse. By removing the street parse in this case, it looks like there are at least several cases where better POI results are returned.

Here's two examples from the acceptance tests, but more investigation into the effects is needed:
image

@missinglink
Copy link
Member

Sounds like a reasonable change, it's always a tradeoff, this approach will of course eliminate some legitimate street results, such as street: Wrigley Field or street: Union Square in other regions not associated with those popular venues.

Having said that I think it's still a good change.

Could you please put in some code comments, I had to read it twice to see what's going on even with a 10 line diff ;)

@orangejulius
Copy link
Member Author

Sure, will definitely add comments and tests for this code. Just wanted to float the idea early to see what everyone thought.

@missinglink
Copy link
Member

One other thought I had was that it might introduce jitter, although on reflection it's likely not an issue since the parser likely detects place and street classifications on the same keypress.

You might have to move the code closer to where the solutions are returned in order to detect it:
https://github.com/pelias/api/blob/master/sanitizer/_text_pelias_parser.js

I don't have my head fully wrapped around this issue but it might be more powerful to look at all the solutions returned from the parser in the t.solution Array and make changes there.

@orangejulius orangejulius force-pushed the subject-street-parsing branch from e5b712f to abc03c1 Compare February 9, 2021 16:34
This is just for testing and rollout, it's not intended to be permanent
:)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants