Amir Reza Sadri edited this page Aug 2, 2020 · 96 revisions

MRQy

Introduction

Quality control (QC) is a fundamental step in analyzing Magnetic Resonance Imaging (MRI) data within a standard machine learning pipeline. Individual datasets often contain many imaging scans obtained from different centers using a variety of scanner equipment. A key aspect of using these cohorts for reliable model development and optimization of computational imaging tools is to curate datasets with minimal to no artifacts, so that they are relatively homogeneous in appearance [1]. Evaluating variations and relative image quality between cohorts is thus critical to determine whether a machine learning model that was trained on one cohort will perform reproducibly on a different cohort.

This wiki describes the usage of MRQy, a new open-source quality control tool for MR imaging data.

Properties

MRQy is built on a Python and JavaScript framework and has been specialized for analyzing large-scale MRI cohorts through the following modules: (i) automatic foreground detection for any MR image from any body region, from which it will (ii) extract a series of imaging-specific metadata and quality measures generalized to work with any structural MR sequence, in order to (iii) compute representations that capture relevant MR image quality trends in a data cohort.

These are presented within a specialized HTML5-based front-end that the end user can easily interrogate to identify batch effects and imaging artifacts, supporting curation of MR imaging cohorts of acceptable quality for model development. We demonstrate this using a representative large-scale MRI cohort from TCIA as well as an in-house rectal dataset.

MRQy works in an unsupervised, standalone setting, runs efficiently on a standard computer, and has a modular design that allows additional algorithms and metrics to be incorporated as plugins in the future. Its compatibility with multiple organs sets it apart from similar tools: it can be used to assess individual body parts as well as whole-body scans. MRQy can provide quantitative metrics for benchmarking the quality and consistency of MRI data.

Format and Usage

After the installation of all the prerequisite Python packages (specified in the installation instructions), MRQy can be run on a directory containing files for a given cohort via the command:

python QC.py output_folder_name "input directory address" 

No additional configuration files need to be specified. This results in the following steps being executed:

  • Thumbnail images are generated for all 2D sections in each MRI dataset and saved as .png files within the UserInterface/Data folder.
  • Each dataset is processed to detect the foreground and background region.
  • Metadata are extracted from file headers for each dataset. Measurements are computed based on the detected foreground region for each dataset.
  • Both metadata and measurements are saved for each dataset within a tab-separated file (results.tsv) that is stored within the UserInterface/Data folder.
  • For a given cohort, a single UMAP and a single t-SNE embedding are computed for all the datasets based on the 23 measures (after whitening). The embedding coordinates are also saved into the results.tsv file.
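
The whitening step mentioned in the last bullet can be illustrated with a minimal sketch. It assumes that "whitening" means scaling each measure to zero mean and unit variance across the cohort; MRQy may apply a different transform, and the function name and sample values below are hypothetical.

```python
import statistics

def standardize_columns(rows):
    """Scale each column to zero mean and unit variance.

    rows: list of equal-length lists of floats, one list per dataset.
    Columns with zero spread are mapped to 0.0 to avoid dividing by zero.
    """
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [
        [(value - m) / s if s > 0 else 0.0
         for value, m, s in zip(row, means, stdevs)]
        for row in rows
    ]

# Hypothetical cohort: 3 datasets with 2 measures each (MRQy uses 23 measures).
measures = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
scaled = standardize_columns(measures)
```

Scaling the measures first keeps any single measure with a large numeric range from dominating the distances that t-SNE and UMAP rely on.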

Further interrogation of cohort variations and artifact trends may be done by reading results.tsv into any common data analysis tool (e.g. MATLAB or R). A specialized front-end HTML interface (index.html), designed for real-time manipulation and visualization, is available within the UserInterface folder. Quality control can be performed via multiple pathways:
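
As one possible way to read results.tsv outside MATLAB or R, the sketch below loads a tab-separated file with Python's standard csv module. It assumes the first line of the file holds the column names; if the file carries leading comment lines, they would need to be skipped first. The column names and values in the demo file are hypothetical stand-ins, not MRQy's actual output.

```python
import csv
import os
import tempfile

def load_results(path):
    """Read a tab-separated file into a list of dicts keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))

# Demo on a tiny stand-in file; the column names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "results.tsv")
    with open(demo_path, "w", newline="") as handle:
        handle.write("patient\tmeasure_a\nP001\t0.5\nP002\t0.7\n")
    rows = load_results(demo_path)

# Values are read as strings, so convert numeric columns explicitly.
values = [float(row["measure_a"]) for row in rows]
```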

  • Using sorting arrows available on each table column to re-order measures and examine numeric trends. Users can further annotate rows or remove non-informative patients.
  • As the different interface components are synchronized, highlighting a patient row in either table also highlights the corresponding line in the PC chart and bar in the bar chart, and shades the patient-specific bubble in the embedding plots. Thumbnail images for this patient volume are shown in the interface.
  • Using the PC and bar charts to directly compare a specific measure across all the subject scans. This can help quickly determine which of the metadata or measures are consistent across the entire cohort as well as identify outliers. The PC chart can also be used to evaluate positive or negative relationships between different measures and thus determine the trade-off in processing for specific artifacts.
  • Using embedding plots (t-SNE and UMAP) to track specific site- or scanner-specific trends within the cohort. By visualizing the 2D space into which the entire cohort has been mapped, any clusters that can be identified typically correspond to site- and scanner-specific variations. The overall distribution of points in space also provides an indication of the variability within the entire cohort.
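
Outlier identification can also be done numerically alongside the charts. The following is a minimal sketch, not part of MRQy, that flags scans whose value for one measure lies far from the cohort median using the modified z-score based on the median absolute deviation. The function name, the scan IDs, the values, and the 3.5 cutoff are assumptions chosen for illustration.

```python
import statistics

def flag_outliers(values_by_scan, threshold=3.5):
    """Return scan IDs whose modified z-score exceeds the threshold."""
    values = list(values_by_scan.values())
    median = statistics.median(values)
    # Median absolute deviation: a spread estimate that resists outliers.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # No spread to compare against; nothing is flagged.
    return [
        scan for scan, v in values_by_scan.items()
        if 0.6745 * abs(v - median) / mad > threshold
    ]

# Hypothetical values of a single measure for five scans.
cohort = {"P001": 1.0, "P002": 1.1, "P003": 0.9, "P004": 1.05, "P005": 5.0}
outliers = flag_outliers(cohort)
```

A median-based score is used here because a single extreme scan inflates the mean and standard deviation, which can hide the very outlier being sought.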

User-Specified Settings

MRQy extracts basic metadata that provide general information about the input data. For example, for a .dcm file input, the metadata contain 10 standard tags (see Table). By design, additional tag fields or private metadata can also be pulled from each scan. To do so, the user saves the desired metadata names in a .txt file and runs the command:

python QC.py output_folder_name "input directory" -t "tags .txt address"

In addition, the user can configure the foreground detection algorithm. With the following command, the tool computes the foreground region for each image object separately:

python QC.py output_folder_name "input directory" -c "True"

The default value for the -c flag is False, in which case the whole body part is treated as the foreground mask.
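
For scripted runs, the commands above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of MRQy; it uses only the positional arguments and the -t and -c flags described above, and it assumes the two flags can be combined in one call, which this page does not confirm. All paths are placeholders.

```python
def build_mrqy_command(output_name, input_dir, tags_file=None,
                       per_object_foreground=False):
    """Assemble the argument list for QC.py from the documented options."""
    command = ["python", "QC.py", output_name, input_dir]
    if tags_file is not None:
        command += ["-t", tags_file]  # .txt file listing extra metadata tags
    if per_object_foreground:
        command += ["-c", "True"]     # foreground computed per image object
    return command

# Placeholder names and paths for illustration only.
default_run = build_mrqy_command("my_cohort", "/data/my_cohort")
custom_run = build_mrqy_command("my_cohort", "/data/my_cohort",
                                tags_file="/data/tags.txt",
                                per_object_foreground=True)

# To execute, run from the folder containing QC.py, for example:
#   import subprocess
#   subprocess.run(custom_run, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the input directory contains spaces.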
