
Merge the Updated Codebase #201

Open

wants to merge 22 commits into base: main

Changes from 16 of 22 commits
2 changes: 0 additions & 2 deletions .gitignore

This file was deleted.

13 changes: 4 additions & 9 deletions README.md
@@ -3,21 +3,16 @@

# Federated Tumor Segmentation Challenge

The repo for the FeTS Challenge: The 1st Computational Competition on Federated Learning.
Repo for the FeTS Challenge: The 1st Computational Competition on Federated Learning (2024 and beyond).

## Website

https://www.synapse.org/#!Synapse:syn28546456

## Challenge Tasks
## Challenge Task
The challenge involves customizing core functions of a baseline federated learning system implementation. The goal is to improve over the baseline consensus models in terms of robustness in final model scores to data heterogeneity across the simulated collaborators of the federation. For more details, please see [Task_1](./Task_1).

### Task 1

The first task of the challenge involves customizing core functions of a baseline federated learning system implementation. The goal is to improve over the baseline consensus models in terms of robustness in final model scores to data heterogeneity across the simulated collaborators of the federation. For more details, please see [Task_1](./Task_1).

### Task 2

This task utilizes federated testing across various sites of the FeTS initiative in order to evaluate model submissions across data from different medical institutions, MRI scanners, image acquisition parameters and populations. The goal of this task is to find algorithms (by whatever training technique you wish to apply) that score well across these data. For more details, please see [Task_2](./Task_2).

## Documentation and Q&A

8 changes: 8 additions & 0 deletions Task_1/.gitignore
@@ -0,0 +1,8 @@
FeTS_Challenge-flair.py
FeTS_Challenge_FedPOD_partitioning2.py
FeTS_Challenge_RL.py
FeTS_Challenge_RecEng.py
FeTS_Challenge_leonardklausman.py
*final_submission*
build*
cert*
11 changes: 7 additions & 4 deletions Task_1/FeTS_Challenge.py
@@ -518,15 +518,18 @@ def FedAvgM_Selection(local_tensors,
# to those you specify immediately above. Changing the below value to False will change
# this fact, excluding the three hausdorff measurements. As hausdorff distance is
# expensive to compute, excluding them will speed up your experiments.
include_validation_with_hausdorff=True
include_validation_with_hausdorff=True # NOTE: if set to False, the whole code runs segmentation, for a currently unknown reason

# We encourage participants to experiment with partitioning_1 and partitioning_2, as well as to create
# other partitionings to test your changes for generalization to multiple partitionings.
#institution_split_csv_filename = 'partitioning_1.csv'
institution_split_csv_filename = 'small_split.csv'
#institution_split_csv_filename = 'partitioning_1.csv'
institution_split_csv_filename = '/home/locolinux2/datasets/MICCAI_FeTS2022_TrainingData/partitioning_2.csv'
institution_split_csv_filename = '/home/locolinux2/datasets/MICCAI_FeTS2022_TrainingData/sanity_partitioning.csv' # a small subset for sanity checks and debugging. Comment out to run the actual challenge partition.

# change this to point to the parent directory of the data
brats_training_data_parent_dir = '/raid/datasets/FeTS22/MICCAI_FeTS2022_TrainingData'
brats_training_data_parent_dir = '/home/locolinux2/datasets/MICCAI_FeTS2022_TrainingData'
brats_training_data_parent_dir = '/home/locolinux2/datasets/MICCAI_FeTS2022_Resized'
Comment on lines +527 to +532 (Member):

These paths (and anything similar) are going to be problematic. I would suggest using something defined based on `pathlib.Path.cwd()` (or something similar) to ensure non-breakage.
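A minimal sketch of what the suggested change might look like; the `data/...` directory layout below is an assumption for illustration, not the repository's actual structure:

```python
# Sketch of the reviewer's suggestion: resolve data paths relative to the
# working directory instead of hard-coding machine-specific absolute paths.
# The "data" subdirectory name here is a hypothetical placeholder.
from pathlib import Path

DATA_ROOT = Path.cwd() / "data" / "MICCAI_FeTS2022_TrainingData"

institution_split_csv_filename = str(DATA_ROOT / "partitioning_2.csv")
brats_training_data_parent_dir = str(DATA_ROOT)
```

This way the same configuration works on any machine that keeps the dataset under the working directory, rather than breaking on paths like `/home/locolinux2/...`.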


# increase this if you need a longer history for your algorithms
# decrease this if you need to reduce system RAM consumption
@@ -537,7 +540,7 @@ def FedAvgM_Selection(local_tensors,

# you'll want to increase this most likely. You can set it as high as you like,
# however, the experiment will exit once the simulated time exceeds one week.
rounds_to_train = 5
rounds_to_train = 20

# (bool) Determines whether checkpoints should be saved during the experiment.
# The checkpoints can grow quite large (5-10GB) so only the latest will be saved when this parameter is enabled
16 changes: 8 additions & 8 deletions Task_1/README.md
@@ -1,4 +1,4 @@
# FeTS 2022 Challenge Task 1
# FeTS 2024+ Challenge Task 1
Task 1 (**"Federated Training"**) aims at effective weight aggregation methods for the creation of a consensus model given a pre-defined segmentation algorithm for training, while also (optionally) accounting for network outages.

Please ask any additional questions in our discussion pages on our github site and we will try to update this README.md as we identify confusions/gaps in our explanations and instructions.
@@ -20,18 +20,18 @@ Please ask any additional questions in our discussion pages on our github site a
2. ```git clone https://github.com/FETS-AI/Challenge.git```
3. ```cd Challenge/Task_1```
4. ```git lfs pull```
5. Create virtual environment (python 3.6-3.8): using Anaconda, a new environment can be created and activated using the following commands:
5. Create virtual environment (python 3.9): using Anaconda, a new environment can be created and activated using the following commands:
```sh
## create venv in specific path
conda create -p ./venv python=3.7 -y
conda create -p ./venv python=3.9 -y
conda activate ./venv
```
6. ```pip install --upgrade pip```
7. Install Pytorch LTS (1.8.2) for your system (use CUDA 11):
```pip3 install torch==1.8.2 torchvision==0.9.2 torchaudio==0.8.2 --extra-index-url https://download.pytorch.org/whl/lts/1.8/cu111```
*Note all previous versions of pytorch can be found in [these instructions](https://pytorch.org/get-started/previous-versions/)
9. Set the environment variable `SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True` (to avoid sklearn deprecation error)
10. ```pip install .```
7. Install Pytorch (2.3.1) for your system:
```pip install torch==2.3.1 torchvision==0.18.1```
*Note: all previous versions of PyTorch can be found in [these instructions](https://pytorch.org/get-started/previous-versions/)*
8. Set the environment variable `SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True` (to avoid sklearn deprecation error)
9. ```pip install .```
> * _Note: if you run into ```ERROR: Failed building wheel for SimpleITK```, try running ```pip install SimpleITK --only-binary :all:``` then rerunning ```pip install .```_
10. ```python FeTS_Challenge.py```
> * _Note: if you run into ```ImportError: /home/locolinux/FETS2024/fets2024env/bin/../lib/libstdc++.so.6: version GLIBCXX_3.4.30' not found (required by /home/locolinux/FETS2024/fets2024env/lib/python3.7/site-packages/SimpleITK/_SimpleITK.so)```, try installing a previous version of SimpleITK (version 2.2.0 works)
115 changes: 115 additions & 0 deletions Task_1/fets_challenge.egg-info/PKG-INFO
@@ -0,0 +1,115 @@
Metadata-Version: 2.1
Name: fets_challenge
Version: 2.0
Summary: FeTS Challenge Part 1
Home-page: https://github.com/FETS-AI/Challenge
Author: Sarthak Pati and Ujjwal Baid and Maximilian Zenk and Brandon Edwards and Micah Sheller and G. Anthony Reina and Patrick Foley and Alexey Gruzdev and Jason Martin and Shadi Albarqouni and Yong Chen and Russell Taki Shinohara and Annika Reinke and David Zimmerer and John B. Freymann and Justin S. Kirby and Christos Davatzikos and Rivka R. Colen and Aikaterini Kotrotsou and Daniel Marcus and Mikhail Milchenko and Arash Nazer and Hassan Fathallah-Shaykh and Roland Wiest Andras Jakab and Marc-Andre Weber and Abhishek Mahajan and Lena Maier-Hein and Jens Kleesiek and Bjoern Menze and Klaus Maier-Hein and Spyridon Bakas
Classifier: Environment :: Console
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: License :: OSI Approved :: FETS UI License
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: openfl@ git+https://github.com/securefederatedai/openfl.git@kta-intel/fets-2024-patch-1
Requires-Dist: GANDLF@ git+https://github.com/CBICA/[email protected]
Requires-Dist: fets@ git+https://github.com/FETS-AI/Algorithms.git@fets_challenge

# FeTS 2024+ Challenge Task 1
Task 1 (**"Federated Training"**) aims at effective weight aggregation methods for the creation of a consensus model given a pre-defined segmentation algorithm for training, while also (optionally) accounting for network outages.

Please ask any additional questions in our discussion pages on our github site and we will try to update this README.md as we identify confusions/gaps in our explanations and instructions.

## Getting started

### System requirements

1. [Git](https://git-scm.com/downloads)
2. [Git LFS](https://github.com/git-lfs/git-lfs#downloading)
3. Python with virtual environment management system: we recommend using [Anaconda](https://www.anaconda.com/products/individual).
4. **Windows-only**: Pickle5 requires Microsoft C++ 14.0 or greater from the [C++ build tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/).
> * _Note: if you run into ```UnicodeDecodeError``` during installation, try pinning ```openfl @ git+https://github.com/intel/[email protected]``` in [setup.py](https://github.com/FeTS-AI/Challenge/blob/main/Task_1/setup.py#L31)_
5. Use CUDA 11 for your installation, as CUDA 12 is not compatible with this codebase.

### Instructions --- IMPORTANT

1. Register for the FeTS 2022 Challenge [here](https://www.synapse.org/#!Synapse:syn28546456/wiki/617093) and submit a data request.
2. ```git clone https://github.com/FETS-AI/Challenge.git```
3. ```cd Challenge/Task_1```
4. ```git lfs pull```
5. Create virtual environment (python 3.9): using Anaconda, a new environment can be created and activated using the following commands:
```sh
## create venv in specific path
conda create -p ./venv python=3.9 -y
conda activate ./venv
```
6. ```pip install --upgrade pip```
7. Install Pytorch (2.3.1) for your system:
```pip install torch==2.3.1 torchvision==0.18.1```
*Note: all previous versions of PyTorch can be found in [these instructions](https://pytorch.org/get-started/previous-versions/)*
8. Set the environment variable `SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True` (to avoid sklearn deprecation error)
9. ```pip install .```
> * _Note: if you run into ```ERROR: Failed building wheel for SimpleITK```, try running ```pip install SimpleITK --only-binary :all:``` then rerunning ```pip install .```_
10. ```python FeTS_Challenge.py```
> * _Note: if you run into ```ImportError: /home/locolinux/FETS2024/fets2024env/bin/../lib/libstdc++.so.6: version GLIBCXX_3.4.30' not found (required by /home/locolinux/FETS2024/fets2024env/lib/python3.7/site-packages/SimpleITK/_SimpleITK.so)```, try installing a previous version of SimpleITK (version 2.2.0 works)
11. All lower-level details are in the [FeTS Challenge python file](./FeTS_Challenge.py)
12. To view intermediate results with TensorBoard during training, you can run the following command: ```tensorboard --logdir ~/.local/workspace/logs/tensorboard```

## Time to Convergence Metric (formerly "communication cost")
Along with the typical DICE and Hausdorff metrics, we include a "time to convergence metric" intended to encourage solutions that converge to good scores quickly in terms of time. We simulate the time taken to run each round so that competitors don't need to be concerned with runtime optimizations such as compiled vs. interpreted code, and so that final scoring will not depend on the hardware used. This simulated time is computed in the experiment.py file and provided in the metrics output of the experiment execution.

The time to convergence metric will be computed as the area under the validation learning curve over 1 week of simulated time where the horizontal axis measures simulated runtime and the vertical axis measures the current best score, computed as the average of enhancing tumor, tumor core, and whole tumor DICE scores over the validation split of the training data.

You can find the code for the "time to convergence metric" in the experiment.py file by searching for ## CONVERGENCE METRIC COMPUTATION.
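As an illustration only (not the actual implementation in experiment.py), the metric described above can be sketched as the area under a step function of the running best validation score over one week of simulated time:

```python
# Illustrative sketch -- not the challenge's actual code.
# Horizontal axis: simulated runtime; vertical axis: running best mean DICE
# (average of enhancing tumor, tumor core, and whole tumor scores).
ONE_WEEK = 7 * 24 * 3600.0  # simulated-time budget in seconds

def convergence_score(times, dice_scores, total_time=ONE_WEEK):
    """times: cumulative simulated round-end times (ascending, seconds);
    dice_scores: mean DICE after each round. Returns normalized AUC in [0, 1]."""
    area, best, prev_t = 0.0, 0.0, 0.0
    for t, score in zip(times, dice_scores):
        t = min(t, total_time)
        area += best * (t - prev_t)       # best score held since previous round
        best = max(best, score)
        prev_t = t
        if prev_t >= total_time:
            break
    area += best * (total_time - prev_t)  # hold the final best to week's end
    return area / total_time
```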

### How Simulated Time is computed
The simulated time is stochastic, and computed per collaborator, per round, with the round time equaling the greatest round time of all collaborators in the round.

A given collaborator's round time is computed as the sum of:
- The simulated time taken to download the shared model
- The simulated time taken to validate the shared model
- The simulated time taken to train the model (if training)
- The simulated time taken to validate that collaborator's trained model (if training)
- The simulated time taken to upload that collaborator's model update (if training)

During the experiment, to generate these simulated times, we first assign each collaborator four normal distributions representing:
1. download speed
2. upload speed
3. training speed
4. validation speed

We then draw from the appropriate distribution when generating one of the times listed above (at each round).

We assign these network and compute distributions by drawing uniform-randomly from lists of normal distributions created using timing information collected from a subset of the 50+ participants in the May FeTS initiative training of this same model. In this way, the statistics used to simulate timing information come from timing information collected over an actual federation of hospitals that trained this exact model. In particular, for each actual hospital in our subset, we collected:
1. The mean and stdev seconds to download the model
2. The mean and stdev seconds to train a batch
3. The mean and stdev seconds to validate a batch
4. The mean and stdev seconds to upload the model.

For a given collaborator, these normal distributions are constant throughout the experiment. Again, each possible timing distribution is based on actual timing information from a subset of the hospitals in the FeTS initiative. You can find these distributions in the experiment.py file (search for ## COLLABORATOR TIMING DISTRIBUTIONS), as well as the random seed used to ensure reproducibility.
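A hedged sketch of the simulation described above; the `(mean, stdev)` profiles below are invented placeholders, not the real measurements collected from the FeTS federation:

```python
# Sketch only: per-collaborator timing profiles drawn uniformly at random,
# then per-round times drawn from those normal distributions.
import random

def assign_profile(rng, profile_lists):
    # pick one (mean, stdev) normal distribution per activity
    return {activity: rng.choice(dists) for activity, dists in profile_lists.items()}

def simulated_round_time(rng, profile, training=True):
    draw = lambda activity: max(0.0, rng.gauss(*profile[activity]))
    total = draw("download") + draw("validate")   # shared-model download + validation
    if training:
        # train, validate own trained model, upload the model update
        total += draw("train") + draw("validate") + draw("upload")
    return total

rng = random.Random(0)  # fixed seed for reproducibility
profile_lists = {      # placeholder values, in seconds
    "download": [(60.0, 10.0), (120.0, 20.0)],
    "upload":   [(90.0, 15.0), (180.0, 30.0)],
    "train":    [(600.0, 60.0), (900.0, 90.0)],
    "validate": [(120.0, 20.0), (240.0, 40.0)],
}
collaborators = [assign_profile(rng, profile_lists) for _ in range(3)]
# a round's simulated time is the slowest collaborator's time that round
round_time = max(simulated_round_time(rng, c) for c in collaborators)
```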

### Use in Ranking

For ranking of multidimensional outcomes (or metrics), for each team we will compute the sum of their ranks across the averages of the **7** metrics (i.e., time to convergence, and Dice & Hausdorff from 3 regions of interest), yielding a univariate overall summary measure. This measure will decide the overall ranking for each specific team. Notably, since all teams are ranked per patient, whereas the communication cost is only accounted for once over the complete training phase, the communication cost **will be weighted** according to the number of testing subjects in order to give it **equal importance** to the quality of the tumor segmentations.
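For illustration, a toy rank-sum aggregation (with made-up scores, omitting the per-patient ranking and the communication-cost weighting described above) might look like:

```python
# Toy illustration of rank-sum ranking; not the challenge's scoring code.
def rank_sum(scores_by_team):
    """scores_by_team: {team: [score per metric]}, higher is better per metric.
    Returns {team: summed rank}; the lowest total wins overall."""
    teams = list(scores_by_team)
    n_metrics = len(next(iter(scores_by_team.values())))
    totals = {team: 0 for team in teams}
    for m in range(n_metrics):
        # rank 1 goes to the best score on this metric
        ordered = sorted(teams, key=lambda t: scores_by_team[t][m], reverse=True)
        for rank, team in enumerate(ordered, start=1):
            totals[team] += rank
    return totals
```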


## Data Partitioning and Sharding
The FeTS 2022 data release consists of a training set and two CSV files - each providing information for how to partition the training data into non-IID institutional subsets. The release will contain subfolders for single patient records whose names have the format `FeTS2022_###`, and two CSV files:
- **partitioning_1.csv**
- **partitioning_2.csv**

Each of the partitioning CSV files has two columns, `Partition_ID` and `Subject_ID`. The `Subject_ID` column exhausts the patient records contained in the release, and the `Partition_ID` column provides an integer identifier indicating the institution to which each record is assigned. The path to a partition CSV can be provided as the value of the parameter ```institution_split_csv_filename``` to the jupyter notebook function run_challenge_experiment to specify the institutional split used when running experimental federated training on your custom federation logic. A description of each of these split CSVs is provided in Table 1. We encourage participants to create and explore training performance for other non-IID splits of the training data to help in developing generalizable customizations to the federated logic that will perform well during the validation and testing phases. A third CSV is hidden from participants and defines a test partitioning to be used in the challenge testing phase. This hidden partitioning (also described in Table 1) is another refinement of the institution split, with a difficulty level similar to the institution tumor size split in our own experiments using the default customization functions.

Table 1: Information for partitionings provided in the FeTS 2022 data release as well as the hidden partitioning not provided in the release (to be used in the competition testing phase).

| Split name | CSV filename | Description | Number of institutions |
|-------------------------------------|------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------|
| Institution Split | partitioning_1.csv | Split of FeTS 2022 training data by originating institution. | 23 |
| Institution Tumor Size Split | partitioning_2.csv | Refinement of the institution split by tumor size, further splitting the larger institutions according to whether a record’s tumor size fell above or below the mean size for that institution. | 33 |
| Test Split | - not provided - | Undisclosed refinement of the institution split. | Hidden from participants |
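As a small illustration (not part of the challenge code), the two-column partitioning CSVs described above can be grouped into institutional shards with standard-library tools:

```python
# Group release records into institutional shards using the
# Partition_ID / Subject_ID columns of a partitioning CSV.
import csv
from collections import defaultdict

def load_partitioning(csv_path):
    shards = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            shards[row["Partition_ID"]].append(row["Subject_ID"])
    return dict(shards)
```

Per Table 1, calling `load_partitioning("partitioning_1.csv")` would yield 23 shards (one list of `FeTS2022_###` subject IDs per institution) and `partitioning_2.csv` would yield 33.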



29 changes: 29 additions & 0 deletions Task_1/fets_challenge.egg-info/SOURCES.txt
@@ -0,0 +1,29 @@
LICENSE
MANIFEST.in
README.md
setup.py
fets_challenge/__init__.py
fets_challenge/checkpoint_utils.py
fets_challenge/custom_aggregation_wrapper.py
fets_challenge/experiment.py
fets_challenge/gandlf_csv_adapter.py
fets_challenge/inference.py
fets_challenge/spec_sens_code.py
fets_challenge.egg-info/PKG-INFO
fets_challenge.egg-info/SOURCES.txt
fets_challenge.egg-info/dependency_links.txt
fets_challenge.egg-info/requires.txt
fets_challenge.egg-info/top_level.txt
openfl-workspace/fets_challenge_workspace/partitioning_1.csv
openfl-workspace/fets_challenge_workspace/partitioning_2.csv
openfl-workspace/fets_challenge_workspace/requirements.txt
openfl-workspace/fets_challenge_workspace/small_split.csv
openfl-workspace/fets_challenge_workspace/smaller_split.csv
openfl-workspace/fets_challenge_workspace/validation.csv
openfl-workspace/fets_challenge_workspace/plan/cols.yaml
openfl-workspace/fets_challenge_workspace/plan/data.yaml
openfl-workspace/fets_challenge_workspace/plan/defaults
openfl-workspace/fets_challenge_workspace/plan/plan.yaml
openfl-workspace/fets_challenge_workspace/src/__init__.py
openfl-workspace/fets_challenge_workspace/src/challenge_assigner.py
openfl-workspace/fets_challenge_workspace/src/fets_challenge_model.py
1 change: 1 addition & 0 deletions Task_1/fets_challenge.egg-info/dependency_links.txt
@@ -0,0 +1 @@

3 changes: 3 additions & 0 deletions Task_1/fets_challenge.egg-info/requires.txt
@@ -0,0 +1,3 @@
openfl@ git+https://github.com/securefederatedai/openfl.git@kta-intel/fets-2024-patch-1
GANDLF@ git+https://github.com/CBICA/[email protected]
fets@ git+https://github.com/FETS-AI/Algorithms.git@fets_challenge
2 changes: 2 additions & 0 deletions Task_1/fets_challenge.egg-info/top_level.txt
@@ -0,0 +1,2 @@
fets_challenge
openfl-workspace
2 changes: 1 addition & 1 deletion Task_1/fets_challenge/custom_aggregation_wrapper.py
@@ -1,4 +1,4 @@
from openfl.component.aggregation_functions.experimental import PrivilegedAggregationFunction
from openfl.interface.aggregation_functions.experimental import PrivilegedAggregationFunction


# extends the openfl agg func interface to include challenge-relevant information