Add coordinate-based coactivation-based parcellation class #533
base: main
Conversation
@mriedel56 @62442katieb if possible, I'd love it if you could check out the new class (especially the […]). Ultimately, I want this class to be fairly basic, meaning not including too many tunable parameters, with some documentation pointing toward […]. Additional questions: […]
nimare/parcellate.py (Outdated)

        images = {"labels": labels}
        return images

    def _filter_selection(self):
Chase 2020:
We implemented a two-step procedure that involved a decision on those filter sizes to be included in the final analysis and subsequently a decision on the optimal cluster solution. In the first step, we examined the consistency of the cluster assignment for the individual voxels across the cluster solutions of the co-occurrence maps performed at different filter sizes. We selected a filter range with the lowest number of deviants, that is, number of voxels that were assigned differently compared with the solution from the majority of filters. In other words, we identified those filter sizes which produced solutions most similar to the consensus-solution across all filter sizes. For example, the proportion of deviants for the second parcellation is illustrated in Figure S1; this shows the borders of the filter range to be used for subsequent steps was based on the Z-scores of the number of deviants.
I interpret this to mean:
- Derive a mode array of label assignments for each cluster count across filter sizes.
  - I assume this means the mode of each voxel determined independently, rather than the mode of the full set of assignments.
  - What if label numbers don't match? E.g., label 1 in filter size 1 is most similar to label 2 in filter size 2.
    - I assume we should do some kind of synchronization, unless there's some inherent order to KMeans labels? (See the label-alignment sketch after this list.)
- Count the number of voxels that don't match the mode for each filter size.
- Calculate the proportion of deviants in each cluster solution and filter size.
- Calculate a weighted z-score for each filter size (across cluster solutions) somehow?
  - What is it weighted by?
- Select the range of filter sizes with the lowest z-scores. (See the filter-selection sketch after this list.)
  - How? Is there some kind of threshold? Figure S1 grabs the range with z-scores < -0.5. No clue if that's a meaningful threshold, something like 2 standard deviations from the average z-score, or what.
  - What if there are multiple dips? Does amplitude (z-scores of filters below threshold) or width (number of filters below threshold) matter more?
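On the label-matching question: since KMeans labels are arbitrary, one common way to synchronize solutions is to relabel each one against a reference solution by maximizing label overlap with the Hungarian algorithm. This is only a sketch of that idea, not code from this PR; the `align_labels` helper and the choice of reference are my own assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def align_labels(labels, reference):
    """Permute ``labels`` so they agree with ``reference`` as much as possible.

    Both arguments are 1D integer arrays of per-voxel cluster assignments
    (same voxels, same number of clusters), assumed to be 0-based.
    """
    n_clusters = int(max(labels.max(), reference.max())) + 1
    # Contingency table: overlap between each (reference, candidate) label pair.
    contingency = np.zeros((n_clusters, n_clusters), dtype=int)
    np.add.at(contingency, (reference, labels), 1)
    # Hungarian algorithm: one-to-one relabeling that maximizes total overlap.
    ref_ids, cand_ids = linear_sum_assignment(-contingency)
    mapping = np.empty(n_clusters, dtype=int)
    mapping[cand_ids] = ref_ids
    return mapping[labels]
```

With all filter-size solutions relabeled against the same reference (e.g., the solution from a middle filter size), the per-voxel mode across filter sizes is at least well defined.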
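And for the deviant-counting / filter-range-selection steps, here is a rough sketch of my reading of the procedure for a single cluster count. The z-scoring and the -0.5 cutoff just mirror Figure S1; how the "weighted" z-score combines multiple cluster solutions is exactly the open question above, so none of this should be taken as what Chase et al. actually did.

```python
import numpy as np


def select_filter_range(labels_per_filter, z_threshold=-0.5):
    """Pick the filter sizes whose solutions best match the consensus.

    Parameters
    ----------
    labels_per_filter : (n_filters, n_voxels) int array
        Label-aligned cluster assignments for one cluster count,
        one row per filter size.
    z_threshold : float
        Filter sizes with z-scored deviant proportions below this value
        are retained (the Figure S1 cutoff; an assumption, not a rule).
    """
    labels_per_filter = np.asarray(labels_per_filter)
    n_filters, n_voxels = labels_per_filter.shape

    # Consensus solution: per-voxel mode across filter sizes.
    consensus = np.array(
        [np.bincount(voxel_labels).argmax() for voxel_labels in labels_per_filter.T]
    )

    # Deviants: voxels whose assignment differs from the consensus, per filter size.
    prop_deviants = (labels_per_filter != consensus).sum(axis=1) / n_voxels

    # Z-score the deviant proportions across filter sizes and keep the
    # filter sizes that deviate least from the consensus.
    z_deviants = (prop_deviants - prop_deviants.mean()) / prop_deviants.std()
    return np.flatnonzero(z_deviants < z_threshold)
```

Across cluster counts, the per-filter z-scores could then be averaged (possibly weighted, per the question above) and only a contiguous run below the threshold kept, but that part of the procedure is the unclear bit.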
Codecov Report
Base: 88.55% // Head: 84.29% // Decreases project coverage by 4.27%.
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #533      +/-   ##
==========================================
- Coverage   88.55%   84.29%   -4.27%
==========================================
  Files          38       36       -2
  Lines        4370     4069     -301
==========================================
- Hits         3870     3430     -440
- Misses        500      639     +139
☔ View full report at Codecov.
Will move most of the work to the main fit loop, because this metric requires too much information.
@62442katieb has some code from her naturalistic meta-analysis that may implement some of these metrics: https://github.com/62442katieb/meta-analytic-kmeans/blob/daf3904caad990aeadc89bc98769aaed32857e09/evaluating_clustering_solutions.ipynb
Closes #260. Tagging @DiveicaV in case she wants to look at this.
We are using Chase et al. (2020) as the basis for our general approach, especially the metrics we're using for kernel and order selection.
EDIT: A recommendation from @SBEickhoff is to look at Liu et al. (2020) and Plachti et al. (2019) as well.
To do:
- `r` and `n` parameters. These correspond to the "filter sizes" in Chase et al. (2020).

Changes proposed in this pull request:
- `n` option to `Dataset.get_studies_by_coordinate()`.
- `parcellate` module with `CoordCBP` class.
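For reviewers skimming this, a rough idea of how these pieces might fit together. The `n` keyword is the option this PR proposes for `Dataset.get_studies_by_coordinate()`; the `CoordCBP` arguments (target mask, cluster counts, filter sizes) are placeholders for illustration, not the actual API.

```python
from nimare.dataset import Dataset
from nimare.parcellate import CoordCBP  # module added in this PR

dset = Dataset("dataset.json")

# Existing radius-based selection: studies with a coordinate within 6 mm of the seed.
ids_by_r = dset.get_studies_by_coordinate([[0, -52, 26]], r=6)

# Proposed n-based selection: the n studies reporting coordinates closest to the seed.
ids_by_n = dset.get_studies_by_coordinate([[0, -52, 26]], n=50)

# Hypothetical CoordCBP usage; parameter names are guesses based on the discussion above.
cbp = CoordCBP(
    target_mask="my_roi.nii.gz",      # region to parcellate
    n_clusters=range(2, 9),           # candidate cluster counts
    filter_sizes=range(20, 201, 20),  # candidate n values ("filter sizes")
)
results = cbp.fit(dset)
```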