SPARQL query matches same session too many times due to subclass inferencing #374

Closed
1 task done
alyssadai opened this issue Nov 5, 2024 · 1 comment · Fixed by #375

alyssadai commented Nov 5, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Expected Behavior

If a specific session matches my query and has both imaging and phenotypic data, I expect to see in the unaggregated API results two entries for that subject-session: one ImagingSession and one PhenotypicSession instance.

Current Behavior

When the query is run against a graph that also contains the Neurobagel vocabulary, a matching session unexpectedly has three entries in the results instead of two: one ImagingSession, one PhenotypicSession, and one Session instance.

Note that the number of matching subjects and the num_matching_{phenotypic,imaging}_sessions values for a particular subject are still calculated correctly. The main problem is that the extra Session instance (which also appears as an extra row in a participant-level results TSV from the query tool) gives the impression that there are three different session types, which is not how the data is modeled in the graph.

e.g.,

    ...
    "subject_data": [
      {
        "sub_id": "sub-01",
        "session_id": "ses-01",
        "num_matching_phenotypic_sessions": 2,
        "num_matching_imaging_sessions": 2,
        "session_type": "http://neurobagel.org/vocab/ImagingSession",
        "age": null,
        "sex": null,
        "diagnosis": [
          null
        ],
        "subject_group": null,
        "assessment": [
          null
        ],
        "image_modal": [
          "http://purl.org/nidash/nidm#T1Weighted",
          "http://purl.org/nidash/nidm#FlowWeighted"
        ],
        "session_file_path": "/data/neurobagel/bagel-cli/bids-examples/synthetic/sub-01/ses-01",
        "completed_pipelines": {
          "https://github.com/nipoppy/pipeline-catalog/tree/main/processing/fmriprep": [
            "23.1.3"
          ],
          "https://github.com/nipoppy/pipeline-catalog/tree/main/processing/freesurfer": [
            "7.3.2"
          ]
        }
      },
      {
        "sub_id": "sub-01",
        "session_id": "ses-01",
        "num_matching_phenotypic_sessions": 2,
        "num_matching_imaging_sessions": 2,
        "session_type": "http://neurobagel.org/vocab/PhenotypicSession",
        "age": 34.1,
        "sex": "http://purl.bioontology.org/ontology/SNOMEDCT/248152002",
        "diagnosis": [
          null
        ],
        "subject_group": "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C94342",
        "assessment": [
          "https://www.cognitiveatlas.org/task/id/trm_57964b8a66aed",
          "https://www.cognitiveatlas.org/task/id/tsk_4a57abb949ece"
        ],
        "image_modal": [
          null
        ],
        "session_file_path": null,
        "completed_pipelines": {}
      },
      {
        "sub_id": "sub-01",
        "session_id": "ses-01",
        "num_matching_phenotypic_sessions": 2,
        "num_matching_imaging_sessions": 2,
        "session_type": "http://neurobagel.org/vocab/Session",
        "age": 34.1,
        "sex": "http://purl.bioontology.org/ontology/SNOMEDCT/248152002",
        "diagnosis": [
          null
        ],
        "subject_group": "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C94342",
        "assessment": [
          "https://www.cognitiveatlas.org/task/id/trm_57964b8a66aed",
          "https://www.cognitiveatlas.org/task/id/tsk_4a57abb949ece",
          null
        ],
        "image_modal": [
          null,
          "http://purl.org/nidash/nidm#T1Weighted",
          "http://purl.org/nidash/nidm#FlowWeighted"
        ],
        "session_file_path": "/data/neurobagel/bagel-cli/bids-examples/synthetic/sub-01/ses-01",
        "completed_pipelines": {
          "https://github.com/nipoppy/pipeline-catalog/tree/main/processing/fmriprep": [
            "23.1.3"
          ],
          "https://github.com/nipoppy/pipeline-catalog/tree/main/processing/freesurfer": [
            "7.3.2"
          ]
        }
      },
      ...
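
As a rough illustration of what a consumer of the unaggregated results would have to do until this is fixed, the sketch below drops the generic Session entry whenever a more specific entry exists for the same subject-session. This is only a client-side workaround under assumptions: the function name `drop_generic_session_entries` is hypothetical, the input is assumed to be a list shaped like `subject_data` above, and it is not the change made in #375.

```python
from collections import defaultdict

BASE_SESSION = "http://neurobagel.org/vocab/Session"


def drop_generic_session_entries(subject_data):
    """Remove nb:Session entries that duplicate a more specific session entry.

    Hypothetical helper: `subject_data` is assumed to be a list of dicts with
    at least the keys "sub_id", "session_id", and "session_type".
    """
    # Collect every session_type reported for each (subject, session) pair.
    types_by_session = defaultdict(set)
    for entry in subject_data:
        key = (entry["sub_id"], entry["session_id"])
        types_by_session[key].add(entry["session_type"])

    kept = []
    for entry in subject_data:
        key = (entry["sub_id"], entry["session_id"])
        is_generic = entry["session_type"] == BASE_SESSION
        has_specific = len(types_by_session[key]) > 1
        # Skip the generic entry only when a subclass entry covers the same session.
        if is_generic and has_specific:
            continue
        kept.append(entry)
    return kept
```

Applied to the three entries shown above, this would keep only the ImagingSession and PhenotypicSession entries for sub-01/ses-01.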

Error message

No response

Environment

  • OS:
  • Python/Node version:

How to reproduce

No response

Anything else?

This happens due to the new class relationships we have in the graph, specifically

    nb:ImagingSession a rdfs:Class;
        rdfs:subClassOf nb:Session.

    nb:PhenotypicSession a rdfs:Class;
        rdfs:subClassOf nb:Session.

    ...

    nb:Session a rdfs:Class.

from https://github.com/neurobagel/recipes/blob/main/vocab/nb_vocab.ttl.

When we first select for all sessions in the Neurobagel query:

    ?session a ?session_type;
        nb:hasLabel ?session_id.

This returns ImagingSession, PhenotypicSession, and Session for ?session_type, because with RDFS inference enabled an instance of either subclass is also inferred to be an instance of nb:Session. (Under RDFS semantics, each class is also a subclass of itself.)
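
To make the mechanism concrete, here is a minimal pure-Python sketch of the subclass inference, assuming one ImagingSession node and one PhenotypicSession node for the same subject-session. The node names (`img-ses-01`, `pheno-ses-01`) and all function names are hypothetical, and this does not use the graph store or the actual Neurobagel query. It also shows one possible way to avoid the extra type (keeping only the most specific types), which is not necessarily what #375 does.

```python
NB = "http://neurobagel.org/vocab/"

# Subclass axioms mirroring the vocabulary snippet quoted above.
SUBCLASS_OF = {
    NB + "ImagingSession": {NB + "Session"},
    NB + "PhenotypicSession": {NB + "Session"},
}

# Types explicitly asserted in the data (hypothetical node names).
ASSERTED_TYPES = {
    "img-ses-01": {NB + "ImagingSession"},
    "pheno-ses-01": {NB + "PhenotypicSession"},
}


def inferred_types(asserted):
    """Return the asserted types plus all of their superclasses."""
    result = set(asserted)
    stack = list(asserted)
    while stack:
        cls = stack.pop()
        for parent in SUBCLASS_OF.get(cls, ()):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


def most_specific(types):
    """Drop any type that is a superclass of another type in the set."""
    superclasses = set()
    for cls in types:
        superclasses |= inferred_types({cls}) - {cls}
    return types - superclasses


def session_types(with_inference, keep_most_specific=False):
    """Collect the distinct values ?session_type would bind to across both nodes."""
    found = set()
    for asserted in ASSERTED_TYPES.values():
        types = inferred_types(asserted) if with_inference else set(asserted)
        if keep_most_specific:
            types = most_specific(types)
        found |= types
    return sorted(found)
```

Without inference, `session_types(False)` yields the two asserted types. With inference, `session_types(True)` additionally yields `nb:Session`, which corresponds to the third entry in the results. `session_types(True, keep_most_specific=True)` is back to two.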

@alyssadai alyssadai moved this to Implement - Active in Neurobagel Nov 5, 2024
@alyssadai alyssadai self-assigned this Nov 5, 2024
@alyssadai alyssadai moved this from Implement - Active to Implement - Done in Neurobagel Nov 6, 2024
@github-project-automation github-project-automation bot moved this from Review - Active to Review - Done in Neurobagel Nov 7, 2024

neurobagel-bot bot commented Nov 7, 2024

🚀 Issue was released in v0.4.2 🚀

@neurobagel-bot neurobagel-bot bot added the released This issue/pull request has been released. label Nov 7, 2024