Extension of BidsComponent.filter api #335

Open
pvandyken opened this issue Sep 6, 2023 · 0 comments
Labels
enhancement New feature or request


Background

Currently, BidsComponentRow.filter supports a positional-only spec argument that directly takes the list of values to keep. The idea is that, since a row has only one entity, explicitly specifying the entity name via a keyword argument is redundant.

When implementing the above, we raised the possibility of extending the spec filtering mechanism to BidsComponent and BidsPartialComponent (hereafter treated synonymously). I'd like to propose a starting point, not necessarily comprehensive, for this API.

Note that "entries" below refers to a specific combination of entity values in a component, e.g. one (subject, session, run) combination.

Proposal

Filtering with lists of tuples

Analogous to BidsComponentRow.filter taking a list of values, BidsComponent.filter shall take a list of tuples of strings:

class BidsComponent:
	def filter(self, spec: Sequence[tuple[str, ...]], /, ...): ...

Each tuple shall be of the same length, equal to the number of entities in the component. Each position in the tuple shall correspond to one of the component's entities, with the order matching the internal entity order of the component. For example, for a component with entities "subject", "session" (in that order), a filter spec may look like: [("001", "01"), ("001", "02"), ("002", "01"), ...].
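To make the intended semantics concrete, here is a minimal sketch of tuple-based filtering. It does not use snakebids itself: the component is modelled as a plain dict mapping entity names to equal-length lists of values, and `filter_by_tuples` is a hypothetical helper name, not part of any existing API. Raising a `ValueError` on a wrong-length tuple is likewise an assumption about how the length rule would be enforced.

```python
from __future__ import annotations

from collections.abc import Mapping, Sequence


def filter_by_tuples(
    zip_lists: Mapping[str, Sequence[str]],
    spec: Sequence[tuple[str, ...]],
) -> dict[str, list[str]]:
    """Keep only the entries whose entity values appear in ``spec``.

    ``zip_lists`` is a stand-in for a component: one column of values per
    entity, with the dict order playing the role of the entity order.
    """
    entities = list(zip_lists)
    for item in spec:
        # Assumed behaviour: every tuple must cover every entity exactly once
        if len(item) != len(entities):
            raise ValueError(
                f"expected tuples of length {len(entities)}, got {item!r}"
            )
    allowed = {tuple(item) for item in spec}
    # zip(*columns) yields one tuple per entry, in entity order
    kept = [entry for entry in zip(*zip_lists.values()) if entry in allowed]
    return {
        entity: [entry[i] for entry in kept]
        for i, entity in enumerate(entities)
    }
```

With the subject/session example above, a spec of `[("001", "01"), ("002", "02")]` would keep only those two entries and drop every other combination.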

Filtering with BidsComponents

Any BidsComponent may be filtered against another BidsComponent, which acts as a template. Only entries found in the template component shall be kept in the original component. Entities found in only one of the two components will not be considered. The logic would be identical to the consensus logic proposed for BidsDataset.expand().
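The matching rule can be sketched the same way. Again this is plain Python rather than snakebids: both components are modelled as dicts of entity name to equal-length value lists, and `filter_by_component` is a hypothetical name. Returning the original unchanged when the two components share no entities is an assumption, since the proposal does not cover that case.

```python
from __future__ import annotations

from collections.abc import Mapping, Sequence


def filter_by_component(
    original: Mapping[str, Sequence[str]],
    template: Mapping[str, Sequence[str]],
) -> dict[str, list[str]]:
    """Keep entries of ``original`` whose shared-entity values occur in ``template``."""
    # Entities present in only one of the two components are ignored
    shared = [entity for entity in original if entity in template]
    if not shared:
        # Assumption: with nothing to compare, nothing is filtered out
        return {entity: list(values) for entity, values in original.items()}
    # Every combination of shared-entity values present in the template
    allowed = set(zip(*(template[entity] for entity in shared)))
    # Indices of the original entries whose shared values match the template
    keep = [
        i
        for i, entry in enumerate(zip(*(original[entity] for entity in shared)))
        if entry in allowed
    ]
    return {
        entity: [values[i] for i in keep]
        for entity, values in original.items()
    }
```

Entities that exist only in the original (for example a "run" column) are carried through untouched for the entries that survive.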

Motivating Example

The above API would enable the following example:

# pandas dataframe containing metadata for the dataset, indexed by subject and session ids
df = get_metadata()

inputs = generate_inputs(...)
component = inputs["T1w"]

# Select only the subject/session combinations found in the metadata
component = component.filter(component["subject", "session"].filter(df.index))

This example is not easily achieved using a combination of snakebids and pure Python (my current workaround is to convert the snakebids component into a pandas dataframe). More specifically, indexing by multiple entities at the same time (e.g. subject/session pairs) is not currently possible with snakebids. The new API will enable this, as shown in the example.

@pvandyken pvandyken added the enhancement New feature or request label Feb 17, 2024