Improve inefficient SPARQL query template #307
Labels: performance, released
Our current SPARQL query template has a couple of problems that make it slow:
- Unnecessary nesting of `OPTIONAL` clauses in api/app/api/utility.py, lines 232 to 237 in 1e9ef34, and in api/app/api/utility.py, lines 258 to 264 in 1e9ef34 (even worse).
In order of importance:
- `nb:hasAcquisition/nb:hasContrastType` -> very slow!
- The `OPTIONAL` statement is not necessary. If a subject does not have a phenotypic or imaging session, we don't need to look any further anyway.
- `?subject` comes from the outer scope, so there is no need to restate it here.
Also: api/app/api/utility.py, lines 207 to 211 in 1e9ef34, is most likely not necessary. It would only help to capture those subjects who do not have any file system path associated and thus would not be datalad gettable.
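
To make the points above concrete, here is a rough sketch of the kind of restructuring meant. The real template in `utility.py` is not reproduced here, so both query strings are hypothetical: the overall query shape, `nb:Subject`, `nb:hasSession`, and the `?session`, `?acquisition`, and `?contrast` variables are assumptions for illustration, the `PREFIX` declaration is omitted, and expanding the property path into two triple patterns is only one possible rewrite whose benefit would need to be measured. The small helper just reports how deeply `OPTIONAL` blocks are nested in a query string.

```python
import re

# Hypothetical "before" shape (NOT the actual template): an OPTIONAL nested
# inside another OPTIONAL, restating a triple pattern on ?subject and
# walking a property path.
NESTED_QUERY = """
SELECT ?subject ?session ?contrast
WHERE {
    ?subject a nb:Subject .
    OPTIONAL {
        ?subject nb:hasSession ?session .
        OPTIONAL {
            ?subject nb:hasSession ?session .
            ?session nb:hasAcquisition/nb:hasContrastType ?contrast .
        }
    }
}
"""

# Hypothetical "after" shape: the session pattern is required (subjects
# without a session are not of interest), ?subject is not restated, and the
# property path is written as two explicit triple patterns (an assumption).
FLAT_QUERY = """
SELECT ?subject ?session ?contrast
WHERE {
    ?subject a nb:Subject ;
             nb:hasSession ?session .
    OPTIONAL {
        ?session nb:hasAcquisition ?acquisition .
        ?acquisition nb:hasContrastType ?contrast .
    }
}
"""


def max_optional_depth(query: str) -> int:
    """Return the deepest nesting level of OPTIONAL blocks in a query string."""
    open_braces = []  # one flag per open brace: True if it was opened by OPTIONAL
    deepest = 0
    for token in re.findall(r"OPTIONAL\s*\{|\{|\}", query):
        if token == "}":
            if open_braces:
                open_braces.pop()
        else:
            open_braces.append(token.startswith("OPTIONAL"))
            deepest = max(deepest, sum(open_braces))
    return deepest


if __name__ == "__main__":
    print("nested:", max_optional_depth(NESTED_QUERY))
    print("flat:", max_optional_depth(FLAT_QUERY))
```

A check like `max_optional_depth` could be run against the rendered template in a unit test to catch nested `OPTIONAL` blocks creeping back in. It is a plain text scan, not a SPARQL parser, so it is only suitable as a quick sanity check.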
TODOs:
- remove superfluous `OPTIONAL` statements (they are expensive)
- remove repeated triple patterns in sub-queries
From initial testing, this will let us cut query execution time by an order of magnitude.
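
For anyone wanting to reproduce the timing comparison, a minimal sketch of a timing helper follows. It is not part of the API code: `run_query` is a hypothetical callable that sends a query string to whatever graph store is being tested, and the number of repeats is an arbitrary choice.

```python
import statistics
import time
from typing import Callable


def time_query(run_query: Callable[[str], object], query: str, repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds over `repeats` runs of `query`.

    `run_query` is a hypothetical callable supplied by the caller; it should
    send the query to the graph store and block until the result is back.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query(query)
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)
```

Calling `time_query` once with the old template and once with the new one, against the same graph, gives two medians that can be compared directly. Using the median rather than the mean keeps a single slow run (for example a cold cache) from dominating the result.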
See also: neurobagel/planning#142