SPARQL query matches same session too many times due to subclass inferencing #374
Is there an existing issue for this?
Expected Behavior
If a specific session matches my query and has both imaging and phenotypic data, I expect to see in the unaggregated API results two entries for that subject-session: one ImagingSession and one PhenotypicSession instance.
Current Behavior
When the query goes to a graph that also contains the Neurobagel vocabulary, a matching session (unexpectedly) has three instead of two entries in the results: one ImagingSession, one PhenotypicSession, and one Session instance.
Note that the number of matching subjects, as well as the num_matching_{phenotypic,imaging}_sessions for a particular subject, are still calculated correctly. The main problem is that the extra session instance (which also appears as an extra row in a participant-level results TSV from the query tool) gives the impression of three different session types, which is not how the data is modeled in the graph.
e.g.,
Error message
No response
Environment
How to reproduce
No response
Anything else?
This happens due to the new class relationships we have in the graph, specifically
from https://github.com/neurobagel/recipes/blob/main/vocab/nb_vocab.ttl.
When we first select for all sessions in the Neurobagel query:
api/docs/default_neurobagel_query.rq
Lines 19 to 20 in d2e090b
This will return ImagingSession, PhenotypicSession, and Session for session_type, since an instance of either of the first two is also inferred, through RDFS inference, to be an instance of Session. Under RDFS semantics, each class is also a subclass of itself.
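To make the mechanism concrete, here is a minimal sketch in plain Python that mimics the inference on a toy triple set. It is not the Neurobagel graph or a real SPARQL engine: the `nb:`/`ex:` identifiers, the two subClassOf triples, and the helper names are all assumptions for illustration (the exact vocabulary triples are not reproduced in this issue). The final filter is only one possible way to drop the inferred row, not necessarily the fix that was adopted.

```python
# Toy illustration only: a hand-rolled triple set, not a real graph store.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    # Assumed class hierarchy (hypothetical stand-in for the vocabulary file).
    ("nb:ImagingSession", SUBCLASS_OF, "nb:Session"),
    ("nb:PhenotypicSession", SUBCLASS_OF, "nb:Session"),
    # Hypothetical data: one subject-session with imaging and phenotypic records.
    ("ex:ses-01-imaging", RDF_TYPE, "nb:ImagingSession"),
    ("ex:ses-01-pheno", RDF_TYPE, "nb:PhenotypicSession"),
}


def superclasses(cls, triples):
    """Return cls plus all its ancestors (reflexive, transitive subClassOf)."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def inferred_types(triples):
    """Map each instance to every class it belongs to once inference applies."""
    result = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            result.setdefault(s, set()).update(superclasses(o, triples))
    return result


def session_types(triples, parent="nb:Session"):
    """Mimic selecting session_type for every instance of the parent class."""
    found = set()
    for types in inferred_types(triples).values():
        if parent in types:
            found.update(types)
    return sorted(found)


all_types = session_types(triples)
# Possible workaround: keep only the specific classes by dropping the parent.
specific_types = [t for t in all_types if t != "nb:Session"]

print(all_types)       # three session types, including the inferred parent
print(specific_types)  # only the two types the data actually models
```

With inference applied, the parent class shows up as a third session type even though no instance was asserted with that type directly; excluding it restores the two expected entries.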