SCHEMA: Update existence checks to consider empty lists #1747

Merged
merged 3 commits into from
Apr 18, 2024

Conversation

effigies
Collaborator

@effigies effigies commented Mar 23, 2024

Fixes two issues found in real datasets:

  1. stim_file columns in which every value is n/a. A duplicated check already handles this case; I'm guessing the fix got rebased out before pushing.

  2. IntendedFor fields with an empty array ([]) as a value. Other reference checks have the same potential bug, so they are fixed at the same time.
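
To illustrate the second case, here is a minimal Python sketch of how the rule's selectors decide whether the reference check runs at all. It is an illustration under assumptions, not the validator's code: `json_type`, `rule_applies`, and the dictionary-based sidecar are hypothetical stand-ins, and values that are neither strings nor arrays are simply treated as out of scope.

```python
def json_type(value):
    """Return a JSON-style type name for a Python value (hypothetical helper)."""
    if value is None:
        return "null"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "other"


def rule_applies(datatype, sidecar):
    """Evaluate the selectors; the rule's checks run only if all of them hold."""
    intended_for = sidecar.get("IntendedFor")
    if datatype == "ieeg":
        return False
    if json_type(intended_for) == "null":
        return False
    if json_type(intended_for) not in ("string", "array"):
        return False  # assumption: other types are outside this sketch
    # Added selector: an empty value means there is nothing to check.
    return len(intended_for) > 0


print(rule_applies("fmap", {"IntendedFor": []}))
print(rule_applies("fmap", {"IntendedFor": ["func/sub-01_bold.nii.gz"]}))
```

With the length selector in place, an empty array is skipped instead of being evaluated by a check that has no references to find.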

@effigies effigies added labels Mar 23, 2024: schema (Issues related to the YAML schema representation of the specification. Patch version release.) and exclude-from-changelog (This item will not feature in the automatically generated changelog)

codecov bot commented Mar 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.93%. Comparing base (bd08602) to head (f11fb28).
Report is 43 commits behind head on master.

❗ Current head f11fb28 differs from pull request most recent head 9467f62. Consider uploading reports for the commit 9467f62 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1747   +/-   ##
=======================================
  Coverage   87.93%   87.93%           
=======================================
  Files          16       16           
  Lines        1351     1351           
=======================================
  Hits         1188     1188           
  Misses        163      163           


@@ -3,35 +3,31 @@ SubjectRelativeIntendedFor:
selectors:
- datatype != "ieeg"
- type(sidecar.IntendedFor) != "null"
- length(sidecar.IntendedFor) > 0
Collaborator

But is it really an issue?
Do we have some common requirement that forbids empty lists or columns full of n/a?

If not, then I don't see why these particular attributes/columns are special.

Collaborator Author

These changes respond to problems found in existing datasets. We can warn on empty lists if desired, but we shouldn't raise these errors.

Collaborator Author

From an in-person conversation: special-casing empty lists isn't great. The problem here is that the value could be either a string or a list of strings. exists() handles that fine, but a check of the form exists() == length() would fail on non-empty strings.

One way to do this is to update the check to:

exists(sidecar.IntendedFor, "bids-uri") + exists(sidecar.IntendedFor, "subject")
== length(type(sidecar.IntendedFor) == "string" && 1 || length(sidecar.IntendedFor))

@rwblair Any opinions?
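
The intent of that expression can be sketched in Python, under stated assumptions. `count_existing` is a hypothetical stand-in for exists() that collapses the "bids-uri" and "subject" lookups into a single set membership test, and `expected_count` omits the outer length() wrapper, which is questioned further down this thread.

```python
def expected_count(intended_for):
    """Number of references named: 1 for a string, otherwise the list length."""
    # Same shape as `type(x) == "string" && 1 || length(x)`.
    return isinstance(intended_for, str) and 1 or len(intended_for)


def count_existing(intended_for, known_paths):
    """Hypothetical stand-in for exists(): count references found in known_paths."""
    refs = [intended_for] if isinstance(intended_for, str) else intended_for
    return sum(1 for ref in refs if ref in known_paths)


def check_passes(intended_for, known_paths):
    """The check holds when every named reference exists."""
    return count_existing(intended_for, known_paths) == expected_count(intended_for)


known = {"func/a.nii.gz", "func/b.nii.gz"}
print(check_passes("func/a.nii.gz", known))
print(check_passes([], known))
print(check_passes(["func/a.nii.gz", "func/missing.nii.gz"], known))
```

An empty list needs no special case here: both sides evaluate to 0, so the check passes without a dedicated selector.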

Member

Should the RHS of that equality be wrapped in a length? Otherwise it's good. It is complicated enough that we should probably have some unit tests to make sure it works how we think it does.

A simpler but more verbose option would be to have multiple rules based on the type of IntendedFor: one for when it is a string and another for when it is a list.
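
As a rough sketch of the multiple-rules option, written in Python rather than schema YAML: each rule pairs a type-based selector with its own check. The rule names, the `RULES` table, and `failed_rules` are hypothetical, and a set membership test stands in for exists().

```python
RULES = [
    {
        "name": "IntendedForString",  # hypothetical rule name
        "selector": lambda value: isinstance(value, str),
        "check": lambda value, known: value in known,
    },
    {
        "name": "IntendedForArray",  # hypothetical rule name
        "selector": lambda value: isinstance(value, list),
        "check": lambda value, known: all(ref in known for ref in value),
    },
]


def failed_rules(value, known):
    """Return the names of rules that apply to value and whose check fails."""
    return [
        rule["name"]
        for rule in RULES
        if rule["selector"](value) and not rule["check"](value, known)
    ]


known = {"func/a.nii.gz"}
print(failed_rules("func/a.nii.gz", known))
print(failed_rules(["func/a.nii.gz", "func/missing.nii.gz"], known))
print(failed_rules([], known))
```

Each rule stays readable on its own, and small cases like the ones printed above could serve as a starting point for the unit tests mentioned in this comment.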

Collaborator Author

You're right, we should drop that outer length().

Maybe two rules is best. Was trying to avoid it, but it would be more readable IMO.

@effigies effigies requested a review from rwblair April 18, 2024 19:30
@rwblair rwblair merged commit 1cb92eb into bids-standard:master Apr 18, 2024
23 of 24 checks passed
@effigies effigies deleted the schema/fix-rules branch April 18, 2024 19:56