Add grounding refiner option for SPIRES #361

caufieldjh · 2024-04-01T16:16:29Z

(related to discussion with @enockniyonkuru re: MAXO extraction on Apr 1 2024)

SPIRES extraction does grounding recursively, but still doesn't always catch instances where the term to match is within a longer string, e.g.
lorem ipsum dolor TERM TO MATCH sit amet
Sometimes this is due to closely related texts not otherwise defined as synonyms, like vitamin supplementation (MAXO:0001129) vs vitamin therapy.

An additional refiner pass could assist with grounding by doing one or more of the following:

- Performing an additional round of recursive searching, particularly in cases where the extracted string is longer than expected
- Doing class-agnostic chunking with the LLM - essentially asking it to try again, but make it more/less specific.
- Ditto, but with more traditional NLP methods, even just further tokenization
- For ungrounded terms, replace common prefixes/suffixes with those more common in the source annotator. This may be better as a RAG approach but could also work in-context (i.e., instructions passed directly to the LLM) or as a post-processing step capable of recognizing repeated patterns among phrases. Sounds a bit like a curategpt thing though.
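
A rough sketch of what the non-LLM options could look like as a post-processing step, purely for discussion. Nothing here is existing OntoGPT/SPIRES code: `LABEL_INDEX`, `REWRITES`, and all function names are hypothetical, the toy index stands in for a real annotator, and the `therapy` to `supplementation` rule is an assumed example of a repeated pattern rather than something derived from MAXO.

```python
import re
from typing import Optional

# Toy label index standing in for a real annotator (assumption).
# Only this label/ID pair is taken from the issue text.
LABEL_INDEX = {"vitamin supplementation": "MAXO:0001129"}

# Hypothetical rewrite rules: words swapped for ones assumed to be
# more common in the source ontology's labels.
REWRITES = [("therapy", "supplementation")]


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-word characters (plain tokenization)."""
    return re.findall(r"\w+", text.lower())


def ground_subspans(text: str, index: dict[str, str]) -> Optional[tuple[str, str]]:
    """Try every contiguous token window, longest first.

    The full string is the first window tried, so an exact match
    is still preferred over a match inside the string.
    """
    tokens = tokenize(text)
    n = len(tokens)
    for size in range(n, 0, -1):
        for start in range(n - size + 1):
            candidate = " ".join(tokens[start:start + size])
            if candidate in index:
                return candidate, index[candidate]
    return None


def ground_with_rewrites(
    text: str, index: dict[str, str], rewrites: list[tuple[str, str]]
) -> Optional[tuple[str, str]]:
    """Swap known words for their preferred forms, then search again."""
    tokens = tokenize(text)
    for old, new in rewrites:
        if old in tokens:
            swapped = " ".join(new if tok == old else tok for tok in tokens)
            hit = ground_subspans(swapped, index)
            if hit:
                return hit
    return None


def refine_grounding(
    text: str,
    index: dict[str, str] = LABEL_INDEX,
    rewrites: list[tuple[str, str]] = REWRITES,
) -> Optional[tuple[str, str]]:
    """Refiner pass: sub-span search first, rewrite rules as a fallback."""
    return ground_subspans(text, index) or ground_with_rewrites(text, index, rewrites)


print(refine_grounding("lorem ipsum dolor vitamin supplementation sit amet"))
print(refine_grounding("vitamin therapy"))
```

Searching longest windows first is meant to avoid grounding to an overly general term when a more specific label is present in the string. The rewrite table is the part that would probably be better served by RAG or in-context instructions, since a hand-written list will not scale.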

caufieldjh added the enhancement New feature or request label Jul 22, 2024
