SPARQL query matches same session too many times due to subclass inferencing #374

alyssadai · 2024-11-05T04:35:07Z

Is there an existing issue for this?

I have searched the existing issues

Expected Behavior

If a specific session matches my query and has both imaging and phenotypic data, I expect to see in the unaggregated API results two entries for that subject-session: one ImagingSession and one PhenotypicSession instance.

Current Behavior

When the query runs against a graph that also contains the Neurobagel vocabulary, a matching session unexpectedly has three entries in the results instead of two: one ImagingSession, one PhenotypicSession, and one Session instance.

Note that the number of matching subjects, as well as the num_matching_{phenotypic,imaging}_sessions for a particular subject are still calculated correctly. The main problem is that the extra session instance (which also appears as an extra row in a participant-level results TSV from the query tool) gives the impression of 3 different session types, which is not how the data is modeled in the graph.

e.g.,

    ...
    "subject_data": [
      {
        "sub_id": "sub-01",
        "session_id": "ses-01",
        "num_matching_phenotypic_sessions": 2,
        "num_matching_imaging_sessions": 2,
        "session_type": "http://neurobagel.org/vocab/ImagingSession",
        "age": null,
        "sex": null,
        "diagnosis": [
          null
        ],
        "subject_group": null,
        "assessment": [
          null
        ],
        "image_modal": [
          "http://purl.org/nidash/nidm#T1Weighted",
          "http://purl.org/nidash/nidm#FlowWeighted"
        ],
        "session_file_path": "/data/neurobagel/bagel-cli/bids-examples/synthetic/sub-01/ses-01",
        "completed_pipelines": {
          "https://github.com/nipoppy/pipeline-catalog/tree/main/processing/fmriprep": [
            "23.1.3"
          ],
          "https://github.com/nipoppy/pipeline-catalog/tree/main/processing/freesurfer": [
            "7.3.2"
          ]
        }
      },
      {
        "sub_id": "sub-01",
        "session_id": "ses-01",
        "num_matching_phenotypic_sessions": 2,
        "num_matching_imaging_sessions": 2,
        "session_type": "http://neurobagel.org/vocab/PhenotypicSession",
        "age": 34.1,
        "sex": "http://purl.bioontology.org/ontology/SNOMEDCT/248152002",
        "diagnosis": [
          null
        ],
        "subject_group": "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C94342",
        "assessment": [
          "https://www.cognitiveatlas.org/task/id/trm_57964b8a66aed",
          "https://www.cognitiveatlas.org/task/id/tsk_4a57abb949ece"
        ],
        "image_modal": [
          null
        ],
        "session_file_path": null,
        "completed_pipelines": {}
      },
      {
        "sub_id": "sub-01",
        "session_id": "ses-01",
        "num_matching_phenotypic_sessions": 2,
        "num_matching_imaging_sessions": 2,
        "session_type": "http://neurobagel.org/vocab/Session",
        "age": 34.1,
        "sex": "http://purl.bioontology.org/ontology/SNOMEDCT/248152002",
        "diagnosis": [
          null
        ],
        "subject_group": "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C94342",
        "assessment": [
          "https://www.cognitiveatlas.org/task/id/trm_57964b8a66aed",
          "https://www.cognitiveatlas.org/task/id/tsk_4a57abb949ece",
          null
        ],
        "image_modal": [
          null,
          "http://purl.org/nidash/nidm#T1Weighted",
          "http://purl.org/nidash/nidm#FlowWeighted"
        ],
        "session_file_path": "/data/neurobagel/bagel-cli/bids-examples/synthetic/sub-01/ses-01",
        "completed_pipelines": {
          "https://github.com/nipoppy/pipeline-catalog/tree/main/processing/fmriprep": [
            "23.1.3"
          ],
          "https://github.com/nipoppy/pipeline-catalog/tree/main/processing/freesurfer": [
            "7.3.2"
          ]
        }
      },
      ...

Error message

No response

Environment

OS:
Python/Node version:

How to reproduce

No response

Anything else?

This happens due to the new class relationships we have in the graph, specifically:

    nb:ImagingSession a rdfs:Class;
        rdfs:subClassOf nb:Session.

    nb:PhenotypicSession a rdfs:Class;
        rdfs:subClassOf nb:Session.

    ...

    nb:Session a rdfs:Class.

(from https://github.com/neurobagel/recipes/blob/main/vocab/nb_vocab.ttl)

When we first select for all sessions in the Neurobagel query (api/docs/default_neurobagel_query.rq, lines 19 to 20 in d2e090b):

    ?session a ?session_type;
        nb:hasLabel ?session_id.

This pattern returns ImagingSession, PhenotypicSession, and Session for session_type, because RDFS inference treats an instance of either of the first two classes as also an instance of Session. (Under RDFS semantics, each class is also a subclass of itself.)
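To make the mechanism concrete, here is a minimal sketch in plain Python that mimics subclass inference on a toy set of type assertions. It is not the graph store's actual inference engine or the real query. The instance names (`ses-01-imaging`, `ses-01-pheno`) and the helper functions are hypothetical and only serve the illustration; the class IRIs are the ones from the vocabulary above. It also shows that restricting the types to ImagingSession and PhenotypicSession, which is the direction taken in #375, brings the result back to two session types.

```python
# Toy model of RDFS subclass inference; NOT the real graph store or SPARQL engine.
NB = "http://neurobagel.org/vocab/"

# Subclass relationships from nb_vocab.ttl, as {subclass: superclass}.
SUBCLASS_OF = {
    NB + "ImagingSession": NB + "Session",
    NB + "PhenotypicSession": NB + "Session",
}

# Hypothetical asserted types for one subject-session (assumption: the data
# itself only states the specific session types, never nb:Session).
ASSERTED_TYPES = [
    ("ses-01-imaging", NB + "ImagingSession"),
    ("ses-01-pheno", NB + "PhenotypicSession"),
]


def inferred_types(asserted, subclass_of):
    """Return all (instance, class) pairs after walking up the subclass chain."""
    result = set(asserted)
    for instance, cls in asserted:
        parent = subclass_of.get(cls)
        while parent is not None:
            result.add((instance, parent))
            parent = subclass_of.get(parent)
    return result


def session_types(pairs, allowed=None):
    """Distinct classes matched by a '?session a ?session_type' style pattern."""
    types = {cls for _, cls in pairs}
    if allowed is not None:
        types &= set(allowed)  # analogous to filtering on ?session_type
    return sorted(types)


all_pairs = inferred_types(ASSERTED_TYPES, SUBCLASS_OF)

# Without a filter, the inferred nb:Session type shows up as a third type.
unfiltered = session_types(all_pairs)

# Restricting to the two specific classes removes the inferred extra.
filtered = session_types(
    all_pairs,
    allowed=[NB + "ImagingSession", NB + "PhenotypicSession"],
)

print(unfiltered)
print(filtered)
```

In this toy model the unfiltered result contains three classes and the filtered one contains two, which mirrors the three result entries versus the expected two described above.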


neurobagel-bot · 2024-11-07T16:07:44Z

🚀 Issue was released in v0.4.2 🚀

alyssadai added type:bug and removed type:bug labels Nov 5, 2024

alyssadai added this to Neurobagel Nov 5, 2024

alyssadai moved this to Implement - Active in Neurobagel Nov 5, 2024

alyssadai mentioned this issue Nov 5, 2024

Regenerate example API responses neurobagel/neurobagel_examples#37

Closed

1 task

alyssadai self-assigned this Nov 5, 2024

alyssadai mentioned this issue Nov 6, 2024

[FIX] Filter for only ImagingSessions or PhenotypicSessions in SPARQL query #375

Merged

8 tasks

alyssadai moved this from Implement - Active to Implement - Done in Neurobagel Nov 6, 2024

alyssadai closed this as completed in #375 Nov 7, 2024

github-project-automation bot moved this from Review - Active to Review - Done in Neurobagel Nov 7, 2024

neurobagel-bot bot added the released This issue/pull request has been released. label Nov 7, 2024
