Overhaul global filtering of participants
--participant-filter and --exclude-participant-filter have had a
couple of bugs over the past few versions:

1. Both terms would apply to every single component, even components
   without the subject entity. These components would have all of their
   entries filtered out.
2. Using --exclude-participant-filter would turn on regex matching
   mode for every single filter. This changed the meaning of the filters
   (e.g. allowing partial matches), leading to workflow disruption.
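
The difference between exact and regex matching described in item 2 can be illustrated with a minimal sketch. It does not call snakebids; `match_exact` and `match_regex` are hypothetical helpers, and the subject labels are made up for the illustration.

```python
import re

# Made-up subject labels, for illustration only.
subjects = ["01", "001", "010", "02"]


def match_exact(values, pattern):
    """Keep only values equal to the pattern (exact matching)."""
    return [v for v in values if v == pattern]


def match_regex(values, pattern):
    """Keep values where the pattern matches anywhere (regex search)."""
    return [v for v in values if re.search(pattern, v)]


exact = match_exact(subjects, "01")
partial = match_regex(subjects, "01")

print(exact)    # ['01']
print(partial)  # ['01', '001', '010']
```

A filter written with exact matching in mind selects more entries once regex search is switched on, which is the kind of change in meaning item 2 refers to.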

The overhaul fixes both bugs, while improving the organization of the
input generation code.
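
A rough sketch of the intended behaviour for the first bug, not the actual snakebids code: the participant filter is only added to components that use the subject entity. The function name `apply_participant_filter` is hypothetical, and treating "has the subject entity" as "`subject` appears in the component's wildcards" is an assumption made for the illustration.

```python
def apply_participant_filter(component_filters, wildcards, participant_label):
    """Return the filters for one component.

    Assumption: a component "has" the subject entity when "subject" is
    listed in its wildcards. Components without it are left untouched.
    """
    if "subject" not in wildcards or not participant_label:
        return dict(component_filters)
    return {**component_filters, "subject": list(participant_label)}


# A subject-level component receives the participant filter...
bold = apply_participant_filter({"suffix": "bold"}, ["subject", "task"], ["01"])
# ...while a component without the subject entity keeps its own filters only.
template = apply_participant_filter({"suffix": "T1w"}, ["template"], ["01"])

print(bold)      # {'suffix': 'bold', 'subject': ['01']}
print(template)  # {'suffix': 'T1w'}
```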

Additionally, the documentation states the magic filter
`regex_search: True` can be used to enable regex searching for a block
of filters. This hasn't worked for the past few versions. Rather than
fix it, the behaviour has been silently disabled in preparation for an
overhaul of the regex filtering API. Mention of this feature has been
removed from the documentation.

Resolves #303
Resolves #216
pvandyken committed Dec 14, 2023
1 parent 9a7fe2c commit c510230
Showing 7 changed files with 500 additions and 182 deletions.
14 changes: 6 additions & 8 deletions docs/bids_app/config.md
@@ -9,11 +9,11 @@ Config Variables

### `pybids_inputs`

-A dictionary that describes each type of input you want to grab from an input BIDS dataset. Snakebids will parse your dataset with {func}`generate_inputs() <snakebids.generate_inputs>`, converting each input type into a {class}`BidsComponent <snakebids.BidsComponent>`. The value of each item should be a dictionary with keys ``filters`` and ``wildcards``.
+A dictionary that describes each type of input you want to grab from an input BIDS dataset. Snakebids will parse your dataset with {func}`generate_inputs() <snakebids.generate_inputs>`, converting each input type into a {class}`BidsComponent <snakebids.BidsComponent>`. The value of each item should be a dictionary with keys `filters` and `wildcards`.

-The value of ``filters`` should be a dictionary where each key corresponds to a BIDS entity, and the value specifies which values of that entity should be grabbed. The dictionary for each input is sent to the [PyBIDS' get() function ](#bids.layout.BIDSLayout). `filters` can be set according to a few different formats:
+The value of `filters` should be a dictionary where each key corresponds to a BIDS entity, and the value specifies which values of that entity should be grabbed. The dictionary for each input is sent to the [PyBIDS' `get()` function ](#bids.layout.BIDSLayout). `filters` can be set according to a few different formats:

-* [string](#str): specifies an exact value for the entity. In the following example:
+* `[string](#str)``: specifies an exact value for the entity. In the following example:
```yaml
pybids_inputs:
bold:
@@ -29,19 +29,17 @@ The value of ``filters`` should be a dictionary where each key corresponds to a
sub-xxx/.../func/ent1-xxx_ent2-xxx_..._bold.nii.gz
```
-* [boolean](#bool): constrains presence or absence of the entity without restricting its value. `False` requires that the entity be **absent**, while `True` requires that the entity be **present**, regardless of value.
+* `[boolean](#bool)``: constrains presence or absence of the entity without restricting its value. `False` requires that the entity be **absent**, while `True` requires that the entity be **present**, regardless of value.
```yaml
pybids_inputs:
derivs:
filters:
datatype: 'func'
-desc: True # or true, or yes
-acquisition: False # or false, or no
+desc: True
+acquisition: False
```
The above example maps all paths in the `func/` datatype folder that have a `_desc-` entity but do not have the `_acq-` entity.

-In addition, the special filter `regex_search` can be set to `true`, which causes all other filters in the component to use regex matching instead of exact matching.
-
The value of ``wildcards`` should be a list of BIDS entities. Snakebids collects the values of any entities specified and saves them in the {attr}`entities <snakebids.BidsComponent.entities>` and {attr}`~snakebids.BidsComponent.zip_lists` entries of the corresponding {class}`BidsComponent <snakebids.BidsComponent>`. In other words, these are the entities to be preserved in output paths derived from the input being described. Placing an entity in `wildcards` does not require the entity be present. If an entity is not found, it will be left out of {attr}`entities <snakebids.BidsComponent.entities>`. To require the presence of an entity, place it under `filters` set to `true`.

In the following (YAML-formatted) example, the ``bold`` input type is specified. BIDS files with the datatype ``func``, suffix ``bold``, and extension ``.nii.gz`` will be grabbed, and the ``subject``, ``session``, ``acquisition``, ``task``, and ``run`` entities of those files will be left as wildcards. The `task` entity must be present, but there must not be any `desc`.
2 changes: 2 additions & 0 deletions snakebids/core/datasets.py
@@ -447,6 +447,8 @@ def filter(
if not isinstance(regex_search, bool):
msg = "regex_search must be a boolean"
raise TypeError(msg)
+if not filters:
+return self
return attr.evolve(
self,
zip_lists=filter_list(self.zip_lists, filters, regex_search=regex_search),
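
The early return added in this hunk can be sketched with a simplified stand-in. `ToyComponent` is a hypothetical class built on `dataclasses` rather than `attrs`, and it only supports exact matching; it is meant to show the shape of the short-circuit, not the real `BidsComponent.filter()` implementation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyComponent:
    """Hypothetical stand-in holding entity -> list-of-values columns."""

    zip_lists: dict

    def filter(self, **filters: str) -> "ToyComponent":
        # With no filters there is nothing to do: return the same object
        # instead of building a filtered copy.
        if not filters:
            return self
        n_rows = len(next(iter(self.zip_lists.values()), []))
        keep = [
            i
            for i in range(n_rows)
            if all(self.zip_lists[key][i] == value for key, value in filters.items())
        ]
        return replace(
            self,
            zip_lists={k: [v[i] for i in keep] for k, v in self.zip_lists.items()},
        )


comp = ToyComponent({"subject": ["01", "02"], "run": ["1", "1"]})
unchanged = comp.filter()
only_02 = comp.filter(subject="02")

print(unchanged is comp)   # True
print(only_02.zip_lists)   # {'subject': ['02'], 'run': ['1']}
```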
2 changes: 1 addition & 1 deletion snakebids/core/filtering.py
@@ -25,7 +25,7 @@ def filter_list(
def filter_list(
zip_list: ZipListLike,
filters: Mapping[str, Iterable[str] | str],
-return_indices_only: Literal[True] = ...,
+return_indices_only: Literal[True],
regex_search: bool = ...,
) -> list[int]:
...
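
A small sketch of the overload pattern this hunk touches, using a hypothetical `pick()` function rather than the real `filter_list()`. Only the `Literal[False]` overload carries a default, so a call that omits the flag resolves to a single overload; the `Literal[True]` overload applies only when the flag is passed explicitly. That this was the motivation for dropping the default is an assumption, since the commit message does not say.

```python
from typing import Literal, overload


@overload
def pick(
    values: list[str], wanted: str, indices_only: Literal[False] = ...
) -> list[str]: ...
@overload
def pick(
    values: list[str], wanted: str, indices_only: Literal[True]
) -> list[int]: ...
def pick(values, wanted, indices_only=False):
    """Return matching values, or their indices when indices_only is True."""
    hits = [i for i, v in enumerate(values) if v == wanted]
    return hits if indices_only else [values[i] for i in hits]


matched = pick(["a", "b", "a"], "a")          # typed as list[str]
indices = pick(["a", "b", "a"], "a", True)    # typed as list[int]

print(matched)  # ['a', 'a']
print(indices)  # [0, 2]
```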