Add coordinate-based coactivation-based parcellation class #533
Draft: tsalo wants to merge 27 commits into neurostuff:main from tsalo:coord-cbp
Changes from 7 commits
Commits (27)
08ae309 Add n option to get_studies_by_coordinate (tsalo)
88d5c05 Mock up maybe passable CoordCBP class. (tsalo)
3ce90f1 Fix docstrings. (tsalo)
40f4585 Don't calculate distances. (tsalo)
530048f Remove extra newline. (tsalo)
8382704 Clustering should be fairly good. Still need to implement metrics. (tsalo)
c79b251 Add empty methods for the different metrics. (tsalo)
3e2d580 Work on filter-selection metric. Far from done. (tsalo)
3365d5c Add a couple of metrics. (tsalo)
7081604 Improve metric documentation. (tsalo)
12cddec VI should be fairly good now. (tsalo)
7023291 Merge branch 'main' into coord-cbp (tsalo)
f37751a Draft voxel misclassification metric. (tsalo)
f555790 Merge branch 'main' into coord-cbp (tsalo)
0896745 Draft another metric. (tsalo)
f9611db Little cleanup. (tsalo)
909c822 Move ratio calculations to main fit method. (tsalo)
65857d1 More cleanup. (tsalo)
65b9675 More work and debugging. (tsalo)
118e718 Mention re-labeling (not implemented). (tsalo)
267a135 Merge branch 'main' into coord-cbp (tsalo)
530a4de Remove added line. (tsalo)
e58a9d9 Merge branch 'main' into coord-cbp (tsalo)
a1bcf0f Merge remote-tracking branch 'upstream/main' into jperaza-coord-cbp (JulioAPeraza)
1c352f7 Update test_dataset.py (JulioAPeraza)
917e943 Merge branch 'main' into coord-cbp (JulioAPeraza)
0c60dd5 Update utils.py (JulioAPeraza)
"""Parcellation tools."""
import datetime
import inspect
import logging
import os
from tempfile import mkstemp

import numpy as np
from sklearn.cluster import KMeans

from .base import NiMAREBase
from .extract.utils import _get_dataset_dir
from .meta.base import CBMAEstimator
from .meta.cbma.ale import ALE
from .results import MetaResult
from .utils import add_metadata_to_dataframe, check_type, listify, use_memmap, vox2mm

LGR = logging.getLogger(__name__)


class CoordCBP(NiMAREBase):
    """Perform coordinate-based coactivation-based parcellation.

    .. versionadded:: 0.0.10

    Parameters
    ----------
    target_mask : :obj:`nibabel.nifti1.Nifti1Image`
        Mask of target of parcellation.
        Currently must be in same space/resolution as Dataset mask.
    n_clusters : :obj:`list` of :obj:`int`
        Number of clusters to evaluate in clustering.
        Metrics will be calculated for each cluster-count in the list, to allow users to select
        the optimal cluster solution.
    r : :obj:`float` or :obj:`list` of :obj:`float` or None, optional
        Radius (in mm) within which to find studies.
        If a list of values is provided, then MACMs and clustering will be performed across all
        values, and a selection procedure will be performed to identify the optimal ``r``.
        Mutually exclusive with ``n``. Default is None.
    n : :obj:`int` or :obj:`list` of :obj:`int` or None, optional
        Number of closest studies to identify.
        If a list of values is provided, then MACMs and clustering will be performed across all
        values, and a selection procedure will be performed to identify the optimal ``n``.
        Mutually exclusive with ``r``. Default is None.
    meta_estimator : :obj:`nimare.meta.base.CBMAEstimator`, optional
        CBMA Estimator with which to run the MACMs.
        Default is :obj:`nimare.meta.cbma.ale.ALE`.
    target_image : :obj:`str`, optional
        Name of meta-analysis results image to use for clustering.
        Default is "ale", which is specific to the ALE estimator.
    """

    _required_inputs = {"coordinates": ("coordinates", None)}

    def __init__(
        self,
        target_mask,
        n_clusters,
        r=None,
        n=None,
        meta_estimator=None,
        target_image="ale",
        *args,
        **kwargs,
    ):
        super().__init__(*args, **kwargs)

        if meta_estimator is None:
            meta_estimator = ALE()
        else:
            meta_estimator = check_type(meta_estimator, CBMAEstimator)

        if r and n:
            raise ValueError("Only one of 'r' and 'n' may be provided.")
        elif not r and not n:
            raise ValueError("Either 'r' or 'n' must be provided.")

        # Store the target mask; _fit reads it as self.target_mask.
        self.target_mask = target_mask
        self.meta_estimator = meta_estimator
        self.target_image = target_image
        self.n_clusters = listify(n_clusters)
        self.filter_selection = isinstance(r, list) or isinstance(n, list)
        self.filter_type = "r" if r else "n"
        self.r = listify(r)
        self.n = listify(n)

    def _preprocess_input(self, dataset):
        """Mask required input images using either the dataset's mask or the estimator's.

        Also, insert required metadata into coordinates DataFrame.
        """
        super()._preprocess_input(dataset)

        # All extra (non-ijk) parameters for a kernel should be overrideable as
        # parameters to __init__, so we can access them with get_params()
        kt_args = list(self.meta_estimator.kernel_transformer.get_params().keys())

        # Integrate "sample_size" from metadata into DataFrame so that
        # kernel_transformer can access it.
        if "sample_size" in kt_args:
            self.inputs_["coordinates"] = add_metadata_to_dataframe(
                dataset,
                self.inputs_["coordinates"],
                metadata_field="sample_sizes",
                target_column="sample_size",
                filter_func=np.mean,
            )

    @use_memmap(LGR, n_files=1)
    def _fit(self, dataset):
        """Perform coordinate-based coactivation-based parcellation on dataset.

        Parameters
        ----------
        dataset : :obj:`nimare.dataset.Dataset`
            Dataset to analyze.
        """
        self.dataset = dataset
        # The masker attribute is not set in __init__, so fall back to the dataset's masker.
        self.masker = getattr(self, "masker", None) or dataset.masker

        # Loop through voxels in target_mask, selecting studies for each and running MACMs (no MCC)
        target_ijk = np.vstack(np.where(self.target_mask.get_fdata()))
        target_xyz = vox2mm(target_ijk, self.masker.mask_img.affine)
        n_target_voxels = target_xyz.shape[1]
        n_mask_voxels = self.masker.transform(self.masker.mask_img).shape[0]

        n_filters = len(getattr(self, self.filter_type))
        labels = np.zeros((n_filters, len(self.n_clusters), n_target_voxels), dtype=int)
        kwargs = {"r": None, "n": None}

        # Use a memmapped 2D array
        start_time = datetime.datetime.now().strftime("%Y%m%dT%H%M%S")
        dataset_dir = _get_dataset_dir("temporary_files", data_dir=None)
        fd, memmap_filename = mkstemp(
            prefix=self.__class__.__name__,
            suffix=start_time,
            dir=dataset_dir,
        )
        # mkstemp returns an open file descriptor; close it, since np.memmap opens the path itself.
        os.close(fd)
        data = np.memmap(
            memmap_filename,
            dtype=float,
            mode="w+",
            shape=(n_target_voxels, n_mask_voxels),
        )

        for i_filter in range(n_filters):
            kwargs[self.filter_type] = getattr(self, self.filter_type)[i_filter]
            # Iterate over voxel indices; n_target_voxels is an int, not an iterable.
            for j_coord in range(n_target_voxels):
                xyz = target_xyz[:, j_coord]
                macm_ids = dataset.get_studies_by_coordinate(xyz, **kwargs)
                coord_dset = dataset.slice(macm_ids)

                # This seems like a somewhat inelegant solution
                # Check if the meta method is a pairwise estimator
                if "dataset2" in inspect.getfullargspec(self.meta_estimator.fit).args:
                    unselected_ids = sorted(list(set(dataset.ids) - set(macm_ids)))
                    unselected_dset = dataset.slice(unselected_ids)
                    self.meta_estimator.fit(coord_dset, unselected_dset)
                else:
                    self.meta_estimator.fit(coord_dset)

                data[j_coord, :] = self.meta_estimator.results.get_map(
                    self.target_image,
                    return_type="array",
                )  # data is overwritten across filters

            # Perform clustering
            for j_cluster, cluster_count in enumerate(self.n_clusters):
                kmeans = KMeans(
                    n_clusters=cluster_count,
                    init="k-means++",
                    n_init=10,
                    random_state=0,
                    algorithm="elkan",
                ).fit(data)
                labels[i_filter, j_cluster, :] = kmeans.labels_

        # Clean up MACM data memmap
        LGR.info(f"Removing temporary file: {memmap_filename}")
        os.remove(memmap_filename)

        images = {"labels": labels}
        return images

    def _filter_selection(self):
        pass

    def _silhouette(self):
        pass

    def _voxel_misclassification(self):
        pass

    def _variation_of_information(self):
        pass

    def _nondominant_voxel_percentage(self):
        pass

    def _cluster_distance_ratio(self):
        pass

    def fit(self, dataset, drop_invalid=True):
        """Perform coordinate-based coactivation-based parcellation on dataset.

        Parameters
        ----------
        dataset : :obj:`nimare.dataset.Dataset`
            Dataset to analyze.
        drop_invalid : :obj:`bool`, optional
            Whether to automatically ignore any studies without the required data or not.
            Default is True.
        """
        self._validate_input(dataset, drop_invalid=drop_invalid)
        self._preprocess_input(dataset)
        maps = self._fit(dataset)

        if hasattr(self, "masker") and self.masker is not None:
            masker = self.masker
        else:
            masker = dataset.masker

        self.results = MetaResult(self, masker, maps)
        return self.results
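The clustering step in `_fit` can be tried in isolation. The following is a minimal sketch rather than the PR's code: the synthetic `data` array, its dimensions, and the three planted clusters are assumptions invented for the example. Only the `KMeans` settings mirror the diff above.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical stand-in for the MACM data: one row per target voxel,
# one column per voxel in the whole-brain mask.
n_target_voxels, n_mask_voxels = 60, 200
centers = rng.normal(size=(3, n_mask_voxels))
true_labels = np.repeat(np.arange(3), n_target_voxels // 3)
noise = 0.1 * rng.normal(size=(n_target_voxels, n_mask_voxels))
data = centers[true_labels] + noise

# One clustering solution per requested cluster count, as in the diff.
n_clusters = [2, 3, 4]
labels = np.zeros((len(n_clusters), n_target_voxels), dtype=int)
for j_cluster, cluster_count in enumerate(n_clusters):
    kmeans = KMeans(
        n_clusters=cluster_count,
        init="k-means++",
        n_init=10,
        random_state=0,
        algorithm="elkan",
    ).fit(data)
    labels[j_cluster, :] = kmeans.labels_

print(labels.shape)
```

Each row of `labels` is one candidate parcellation of the target voxels. In the class itself there is an additional leading axis for the `r` or `n` filter value.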
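The metric methods in this view of the diff are still empty stubs. As one illustration of what `_variation_of_information` might compute, here is a minimal sketch of the standard variation-of-information distance between two label vectors, VI = H(A) + H(B) - 2 I(A; B). The function name, its signature, and the use of natural logarithms are assumptions for the example, not the implementation added later in this PR.

```python
import numpy as np


def variation_of_information(labels_a, labels_b):
    """Return VI (in nats) between two 1D label vectors of equal length.

    Hypothetical helper for illustration only.
    """
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    if labels_a.ndim != 1 or labels_a.shape != labels_b.shape:
        raise ValueError("Label vectors must be 1D and of equal length.")

    # Map arbitrary label values onto 0..k-1 so they can index a table.
    _, a_idx = np.unique(labels_a, return_inverse=True)
    _, b_idx = np.unique(labels_b, return_inverse=True)

    # Joint distribution of (label in A, label in B) over voxels.
    contingency = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(contingency, (a_idx, b_idx), 1)
    p_ab = contingency / labels_a.size
    p_a = p_ab.sum(axis=1)
    p_b = p_ab.sum(axis=0)

    # Marginal entropies; every marginal is nonzero by construction.
    h_a = -np.sum(p_a * np.log(p_a))
    h_b = -np.sum(p_b * np.log(p_b))

    # Mutual information, skipping empty cells of the joint table.
    nonzero = p_ab > 0
    mi = np.sum(p_ab[nonzero] * np.log(p_ab[nonzero] / np.outer(p_a, p_b)[nonzero]))
    return h_a + h_b - 2 * mi


same = variation_of_information([0, 0, 1, 1], [1, 1, 0, 0])
independent = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

VI is zero when two solutions define the same partition, regardless of how the clusters are numbered, and grows as the partitions diverge. That makes it a candidate for comparing solutions with neighbouring cluster counts.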
Chase 2020:
I interpret this to mean: