Project logic

YiranCdr edited this page May 22, 2021 · 25 revisions

Note: if you're unfamiliar with the terminology mentioned on this page, the React conceptual documentation is a good place to start.

C-PAC_GUI

C-PAC_GUI can be divided into 4 major components.

  • schedulers
  • datasets
  • pipelines
  • executions

In most cases, each component (except for pipelines) uses React-Redux and Redux-Saga to control its behavior. Redux actions can be found under the app/actions folder, Redux reducers under the app/reducers folder, and Redux-Saga effects under the app/sagas folder. With the help of Redux-Saga, a series of actions can be dispatched consecutively (an "action chain"). Understanding these chains for the different operations is the key to understanding the logic of C-PAC_GUI.
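The "action chain" idea can be sketched in plain JavaScript. This is a conceptual illustration only: the action types and the tiny dispatcher below are hypothetical stand-ins, not C-PAC_GUI's real actions or Redux-Saga effects.

```javascript
// Minimal illustration of an "action chain": one dispatched action
// triggers follow-up actions, as a Redux-Saga effect would via put().
// All action types here are invented for the example.
const log = [];

function dispatch(action) {
  log.push(action.type);
  const next = chain[action.type];
  if (next) next(action); // a saga watching this type fires the next link
}

const chain = {
  // e.g. connecting a scheduler kicks off a health check, which in
  // turn records the result: three actions from one user dispatch.
  'scheduler/CONNECT': (action) =>
    dispatch({ type: 'scheduler/HEALTH_CHECK', payload: action.payload }),
  'scheduler/HEALTH_CHECK': (action) =>
    dispatch({ type: 'scheduler/ONLINE', payload: action.payload }),
};

dispatch({ type: 'scheduler/CONNECT', payload: { port: 3333 } });
// log: ['scheduler/CONNECT', 'scheduler/HEALTH_CHECK', 'scheduler/ONLINE']
```

In the real app the links of a chain live in app/sagas, where effects such as takeEvery and put connect one action to the next.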

Schedulers

The C-PAC project structure: the C-PAC dashboard and C-PAC clusters (TACC and AWS clusters, in this case). The above image comes from Yiran Su and is in the public domain.

The scheduler is one of the most important concepts in C-PAC. As shown in the figure above, C-PAC schedulers receive preprocessing instructions from the web backend API and schedule preprocessing runs in C-PAC containers.

The dashboard accesses schedulers by communicating with the Tornado web APIs. Since the dashboard is built locally, scheduler information must be added to the dashboard manually. Currently, adding a scheduler requires a name, an address (localhost in all current cases), a port number, and a string called authKey provided by the C-PAC cluster.
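The four manually-entered fields might look like this. The field names, the port value, and the URL-building helper are illustrative assumptions, not taken from the GUI source.

```javascript
// Hypothetical shape of a manually-added scheduler entry.
const scheduler = {
  name: 'Local cluster',
  address: 'localhost', // localhost in all current cases
  port: 3333,           // example value, not a documented default
  authKey: 'abc123',    // provided by the C-PAC cluster
};

// Sketch: combining address and port into the backend URL the
// dashboard would talk to.
function schedulerUrl({ address, port }) {
  return `http://${address}:${port}`;
}

schedulerUrl(scheduler); // → 'http://localhost:3333'
```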

Datasets

A C-PAC execution needs to specify which MRI scans will be pre-processed. This specification is called a dataConfig.

A dataConfig is a list of local file paths or URLs to anatomical and functional MRI scans, along with additional required metadata. In other words, a dataConfig is not a list of fMRI scans, but a list of paths to those scans. A dataConfig is organized as a YAML file, for example:

- anat: s3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/sub-0050646/anat/sub-0050646_T1w.nii.gz
  func: 
    rest_run-1:
      scan: s3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/sub-0050646/func/sub-0050646_task-rest_run-1_bold.nii.gz 
      scan_parameters: s3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/task-rest_bold.json 
  site: CMU_a
  subject_id: '0050646' 
  unique_id: ses-1

When submitting a C-PAC execution, the dashboard sends this dataConfig to the selected scheduler. There are several ways to generate a dataConfig:

  • Build from BIDS. Neuroimaging data is often stored in BIDS, a directory structure popular among neuroimaging research institutes and individuals. Since the format C-PAC accepts is dataConfig, C-PAC provides functionality to convert a BIDS directory into a dataConfig.
  • Parse an uploaded dataConfig YAML file.
  • Fetch a dataConfig YAML file from a URL.
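Whichever way a dataConfig is produced, each entry has the structure shown in the YAML example above. The shape check below is an illustration of that expected structure, not C-PAC's actual validation code.

```javascript
// Sketch: minimal shape-check for one parsed dataConfig entry, using
// the field names from the YAML example on this page.
function isValidEntry(entry) {
  return typeof entry.anat === 'string' &&
    entry.func !== undefined &&
    Object.values(entry.func).every((run) => typeof run.scan === 'string') &&
    typeof entry.site === 'string' &&
    typeof entry.subject_id === 'string' &&
    typeof entry.unique_id === 'string';
}

// The YAML example above, as the object a YAML parser would produce:
const entry = {
  anat: 's3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/sub-0050646/anat/sub-0050646_T1w.nii.gz',
  func: {
    'rest_run-1': {
      scan: 's3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/sub-0050646/func/sub-0050646_task-rest_run-1_bold.nii.gz',
      scan_parameters: 's3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/task-rest_bold.json',
    },
  },
  site: 'CMU_a',
  subject_id: '0050646',
  unique_id: 'ses-1',
};

isValidEntry(entry); // → true
```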

Pipelines

A single C-PAC execution also needs a definition of its workflow, i.e. the MRI pre-processing steps it will include. This workflow information is sent to the selected scheduler when the dashboard submits an execution to it.

This "workflow" is called a "pipeline". Defining a pipeline requires the user to configure a large number of pre-processing options; through this configuration, the user decides what pre-processing will happen in the C-PAC clusters. Features like importing and saving pipelines make pipeline re-use convenient.

Executions

The execution stage brings together all the previous configuration: a pipeline and a dataConfig are specified, and an available scheduler is selected to run the C-PAC pre-processing execution. To improve the user experience, execution status is also presented in the dashboard.
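Conceptually, submission combines the three pieces into one payload for the scheduler. The function and field names below are hypothetical, sketching the idea rather than the GUI's actual wire format.

```javascript
// Hypothetical sketch of the payload assembled at submission time.
function buildExecution(schedulerName, pipeline, dataConfig) {
  return {
    scheduler: schedulerName, // which scheduler will run the execution
    pipeline,                 // the pre-processing options configured above
    dataConfig,               // the list of scan paths, as in the YAML example
  };
}

const execution = buildExecution('Local cluster', { name: 'default' }, []);
```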

A brief explanation of the C-PAC backend

A set of Tornado web APIs runs in the C-PAC cluster for dashboard-cluster communication, using both HTTP and WebSocket. To reduce load on the backend, the WebSocket applies a debouncing strategy: node-log messages are buffered and sent over the WebSocket once per second. An authKey is also presented each time the service restarts.
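The batching idea can be sketched as follows (in JavaScript for consistency with the rest of this page; the real backend is Python/Tornado, and all names here are illustrative). In the real service the flush runs on a roughly one-second timer; here it is called manually to keep the sketch synchronous.

```javascript
// Sketch: buffer node-log messages and send them as one WebSocket
// frame per flush, instead of one send() per message. `send` is a
// stand-in for a real WebSocket connection's send method.
function makeLogBuffer(send) {
  let buffer = [];
  return {
    push: (msg) => buffer.push(msg),
    flush: () => {
      if (buffer.length > 0) {
        send(JSON.stringify(buffer)); // one frame for the whole batch
        buffer = [];
      }
    },
  };
}

const frames = [];
const logs = makeLogBuffer((frame) => frames.push(frame));
logs.push({ node: 'anat_preproc', status: 'running' }); // hypothetical log entries
logs.push({ node: 'func_preproc', status: 'queued' });
logs.flush();
// frames.length === 1: two log messages, one WebSocket frame
```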
