MicrobleedNet

This repository contains the source code, data, and results for the project titled "Automatic Detection of Cerebral Microbleeds: A Clinically Robust Deep Learning Framework". The project was conducted as part of my 30 ECTS Master's thesis, in collaboration with CEREBRIU, during my MSc in Data Science at the IT University of Copenhagen (ITU).

Supporting documents:

Full detailed report: Full Report (report.pdf)
Poster presented at the D3A conference 2.0: D3A Poster (D3A_poster.pdf)

Authors

Jorge del Pozo Lerida (ITU)(CEREBRIU)
Veronika Cheplygina (ITU)
Mathias Perslev (CEREBRIU)
Silvia Ingala (CEREBRIU)
Akshay Pai (CEREBRIU)

Abstract

Cerebral Microbleeds (CMBs) are neuroimaging biomarkers visible as small round hypointensities on magnetic resonance images (MRI) in T2*-weighted (T2S) or susceptibility-weighted imaging (SWI). Because CMBs are associated with over 30 medical conditions, their accurate quantification and localization are crucial for diagnostic and prognostic assessments. However, their manual detection by radiologists is labor-intensive and error-prone, particularly when numerous CMBs are present, positioning them as prime candidates for automated detection. Despite this, automation remains challenging due to CMBs’ small size, the scarcity of publicly available annotated data, and their similarity to various biological mimics. Interpreting the performance of existing methods is complicated by a lack of standardized metrics and task definitions, incomplete metrics reporting, imperfect evaluations, and the absence of a robust benchmark for comparison. Current methods often exhibit suboptimal performance, characterized by a high rate of false positives, and are not trained or evaluated on data representative of the variations found in clinical settings, which typically include a broad range of demographic, pathological, and MRI data variation. In this study, we develop a sequence-agnostic model applicable to SWI or T2S that is robust against the data variation commonly found in real-life clinical settings. We enhance our model’s robustness through the curation of a large collection of public and private data, supplemented by advanced data augmentation and an initial pretraining phase with a large set of negative samples and synthetic CMBs. In parallel, we investigate the benefit of using transfer learning from a bigger source segmentation task with a larger and more representative dataset but find no tangible improvement.
We first conduct a rigorous evaluation on a public dataset to benchmark our model, achieving 69% recall and 84% precision across the entire test set with an average of 0.1 false positives per scan. Next, we evaluate the model on an in-house annotated dataset, crafted to simulate challenging real-life conditions. Performance drops to 30% recall and 70% precision, with 0.13 false positives per scan. We find that the main reason is that the model struggles with insufficient slice thickness, which makes CMBs look elongated like veins and turns them into false negatives. Additionally, the high number of CMBs per scan poses a significant challenge and leads us to raise concerns about the reliability of inter-rater agreement assessments for existing rating methods.

Our Approach

3D U-Net based segmentation network
MRI sequence agnostic (GRE T2* or SWI)
Standardization of 7 different datasets (public & private)
Transfer Learning from thousands of clinical studies
Intense and diverse data augmentation
Pretraining on a large dataset of synthetic CMBs and negative scans with a wide range of acquisition parameters
Location-based post-processing (SynthSeg)
Comprehensive evaluation on 2 unbiased datasets and 3 tasks: segmentation, detection, and classification
Performance analysis by CMB features and MRI acquisition parameters
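As an illustration of the location-based post-processing item above, the following is a minimal sketch, not the repository's implementation. It assumes that an anatomical label volume (such as one produced by SynthSeg) is available on the same voxel grid as the prediction, and that candidate detections are discarded when their centroid falls in an excluded region. The function name `filter_by_location`, the `excluded_labels` parameter, and the label IDs in the toy example are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Voxel = Tuple[int, int, int]  # (z, y, x) index into the label volume


def filter_by_location(
    centroids: Sequence[Voxel],
    label_volume: Sequence[Sequence[Sequence[int]]],
    excluded_labels: Set[int],
) -> List[Voxel]:
    """Keep only detections whose centroid lies outside the excluded regions."""
    kept: List[Voxel] = []
    for z, y, x in centroids:
        # Look up the anatomical label at the detection centroid.
        label = label_volume[z][y][x]
        if label not in excluded_labels:
            kept.append((z, y, x))
    return kept


# Toy 1x2x3 label volume with hypothetical IDs:
# 0 = background, 2 = tissue, 24 = fluid.
labels = [[[0, 2, 2],
           [24, 2, 0]]]
candidates = [(0, 0, 1), (0, 1, 0), (0, 0, 0), (0, 1, 1)]

kept = filter_by_location(candidates, labels, excluded_labels={0, 24})
print(kept)  # the two candidates located in tissue
```

Which regions are excluded is a design choice; the set used here is only for demonstration and is not taken from the project.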

Repository structure overview

The following folders exist in the repository:

cmbnet

This folder is structured as a Python package with several modules and submodules. Scripts have many interdependencies and some are simply utility functions for other scripts.

Some of the main scripts are the following:

data_preprocessing.py: preprocesses a full dataset
data_post-processing.py: post-processes the model's predictions
evaluate_CMBlevel.py: evaluates the performance of the trained model at the CMB level
generate_radiomics_metadata.py

Many other scripts exist for various purposes in the project, all of which are documented and have self-explanatory names.

Note: training and prediction are performed as part of a larger MLOps codebase that could not be included in this repository.
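Evaluating at the CMB level, as evaluate_CMBlevel.py does, requires deciding which predicted CMBs correspond to which annotated ones. The sketch below shows one common way to do this: greedy one-to-one matching of centroids within a distance tolerance. It is an assumption made for illustration; the matching criterion actually used in this repository may differ (for example, it may rely on mask overlap). The function `match_detections` and the `max_distance_mm` parameter are hypothetical names.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]  # centroid coordinates in millimetres


def match_detections(
    predicted: Sequence[Point],
    ground_truth: Sequence[Point],
    max_distance_mm: float,
) -> Tuple[int, int, int]:
    """Greedy one-to-one centroid matching; returns (tp, fp, fn)."""
    # Collect every predicted/annotated pair that lies within the tolerance,
    # sorted so that the closest pairs are matched first.
    pairs = sorted(
        (math.dist(p, g), i, j)
        for i, p in enumerate(predicted)
        for j, g in enumerate(ground_truth)
        if math.dist(p, g) <= max_distance_mm
    )
    used_pred, used_gt = set(), set()
    for _, i, j in pairs:
        # Each prediction and each annotation can be matched at most once.
        if i not in used_pred and j not in used_gt:
            used_pred.add(i)
            used_gt.add(j)
    tp = len(used_gt)
    fp = len(predicted) - tp      # predictions with no annotated partner
    fn = len(ground_truth) - tp   # annotated CMBs that were missed
    return tp, fp, fn


gt = [(10.0, 10.0, 10.0), (30.0, 30.0, 30.0)]
pred = [(11.0, 10.0, 10.0), (50.0, 50.0, 50.0), (10.0, 11.0, 10.0)]
print(match_detections(pred, gt, max_distance_mm=5.0))  # (1, 2, 1)
```

In the example, two predictions lie close to the same annotated CMB. Only one of them counts as a true positive, and the duplicate is counted as a false positive.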

data-misc

Contains metadata from dataset preprocessing, split files, training config files, images for the report, CSVs generated for analysis, etc.

notebooks

Contains Python notebooks used to visualize and get overviews of different steps of the project

R

Contains R code used for various data analysis and processing purposes

Some results

Detection performance on the DOU test set compared to the best-performing published methods for this dataset. Metrics are computed globally for all CMB detections. FPscan is the average number of false positives per scan, $n_{FP}/n_{scans}$. DOU is a public dataset with 20 scans and 74 CMBs.
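To make "computed globally" concrete, here is a minimal sketch that pools true positives, false positives, and false negatives over all scans before computing precision, recall, and FPscan. The function name `global_detection_metrics` is hypothetical, and the per-scan counts in the example are invented for demonstration; they are not results from this project.

```python
from typing import Dict, Iterable, Tuple


def global_detection_metrics(
    per_scan_counts: Iterable[Tuple[int, int, int]],
) -> Dict[str, float]:
    """Pool (tp, fp, fn) counts over all scans, then compute the metrics."""
    counts = list(per_scan_counts)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    n_scans = len(counts)
    return {
        # Guard against empty denominators (no detections / no CMBs / no scans).
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "fp_per_scan": fp / n_scans if n_scans else 0.0,  # n_FP / n_scans
    }


# Invented (tp, fp, fn) counts for four scans, one of them without any CMB.
example_counts = [(3, 1, 1), (0, 0, 0), (5, 0, 2), (1, 1, 0)]
metrics = global_detection_metrics(example_counts)
print(metrics)
```

Pooling counts this way weights every CMB equally, so scans with many CMBs dominate precision and recall. Scans without any CMB still count in the denominator of FPscan.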

Note: To see the results in more detail, please refer to the Full Report.

Note: To see results from preliminary work (7.5 ECTS research course), please visit Segmentation_CMB.
