Poor matches against SO, likely due to use of underscores #57

Open
cmungall opened this issue Feb 19, 2021 · 3 comments

@cmungall

Searching for "splice site", I would expect HIGH confidence matches for both SIO and SO, since the query exactly matches the main name in each:

$ curl -L -s 'http://www.ebi.ac.uk/spot/zooma/v2/api/services/annotate?propertyValue=splice+site' | jq '.[] | .confidence, .semanticTags, .annotatedProperty.propertyValue'
"MEDIUM"
[
  "http://semanticscience.org/resource/SIO_010451"
]
"splice site"
"MEDIUM"
[
  "http://purl.obolibrary.org/obo/SO_0000162"
]
"splice_site"

SO uses underscores in names (arguably a bug in SO, which I may be partly to blame for, but it is what it is), and indeed, if I search using underscores:

$ curl -L -s 'http://www.ebi.ac.uk/spot/zooma/v2/api/services/annotate?propertyValue=splice_site' | jq '.[] | .confidence, .semanticTags, .annotatedProperty.propertyValue'
"GOOD"
[
  "http://purl.obolibrary.org/obo/SO_0000162"
]
"splice_site"

However, a poor user is unlikely to know to use underscores when searching SO.

Recommendations/questions:

  1. Treat underscores as identical to spaces, both when indexing and when searching.
  2. The first hit (SIO) should be returned as a HIGH confidence match.
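
To make recommendation 1 concrete, here is a minimal sketch of applying the same normalization at index time and at query time. This is not Zooma's actual code: `normalize_label`, `build_index`, and `lookup` are hypothetical names, the dictionary stands in for a real search index, and folding hyphens in alongside underscores is an extra assumption.

```python
import re

# Assumption: underscores, hyphens and whitespace runs all count as one separator.
_SEPARATORS = re.compile(r"[_\-\s]+")


def normalize_label(text: str) -> str:
    """Collapse underscores, hyphens and whitespace runs into single spaces."""
    return _SEPARATORS.sub(" ", text).strip()


def build_index(labels_by_iri: dict) -> dict:
    """Map each normalized label to the IRIs that carry it (toy in-memory index)."""
    index = {}
    for iri, label in labels_by_iri.items():
        index.setdefault(normalize_label(label), []).append(iri)
    return index


def lookup(index: dict, query: str) -> list:
    """Normalize the query the same way as the labels, then do an exact lookup."""
    return index.get(normalize_label(query), [])


# The two labels returned by the queries in this issue.
labels = {
    "http://semanticscience.org/resource/SIO_010451": "splice site",
    "http://purl.obolibrary.org/obo/SO_0000162": "splice_site",
}
index = build_index(labels)

# Both spellings now land on the same index key, so both terms are exact matches.
print(lookup(index, "splice site"))
print(lookup(index, "splice_site"))
```

The point is only that the normalization has to happen on both sides; applying it to the query alone would turn "splice_site" into "splice site" and lose the SO match that works today.
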
@bgood-d4c

@cmungall I think your indexing suggestion makes sense (and probably the same for hyphens). If you tell Zooma to look in SIO specifically, you can get it to give you a GOOD hit for 'splice site'.
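
For anyone who wants to try the SIO-only query, a rough sketch of building the request URL follows. The endpoint and `propertyValue` parameter are the ones used in the curl calls above; the `filter=ontologies:[...]` syntax is an assumption based on my recollection of the Zooma API documentation and should be checked there before relying on it. `annotate_url` is a hypothetical helper, and nothing here sends a request.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl commands earlier in this issue.
ZOOMA_ANNOTATE = "http://www.ebi.ac.uk/spot/zooma/v2/api/services/annotate"


def annotate_url(property_value, ontologies=None):
    """Build an annotate URL, optionally restricted to the given ontologies.

    Assumption: Zooma accepts a `filter` parameter of the form
    `ontologies:[sio,so]`; verify against the Zooma API docs.
    """
    params = {"propertyValue": property_value}
    if ontologies:
        params["filter"] = "ontologies:[" + ",".join(ontologies) + "]"
    # Keep the filter punctuation readable instead of percent-encoding it.
    return ZOOMA_ANNOTATE + "?" + urlencode(params, safe=":[],")


print(annotate_url("splice site", ["sio"]))
```

The printed URL can be passed to `curl -L -s` and piped through the same `jq` filter as above to compare confidence levels.
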

@cmungall
Author

cmungall commented Feb 19, 2021 via email

@henrietteharmse
Contributor

It would be helpful if Zooma could be more resilient to potential human error.

@henrietteharmse henrietteharmse added this to the Resilient searches milestone Mar 3, 2021