Project logic
Note: if you are unfamiliar with the terminology used on this page, the React conceptual pages are a good place to start.
C-PAC_GUI can be divided into 4 major components.
- schedulers
- datasets
- pipelines
- executions
In most cases, each component (except for pipelines) uses React-Redux and Redux-Saga to control its behavior. Redux actions are under the app/actions folder, Redux reducers are under the app/reducers folder, and Redux-Saga effects are under the app/sagas folder. With the help of Redux-Saga, a series of actions can be dispatched consecutively (like an "action chain"). Understanding these chains for the different operations is the key to understanding the logic of C-PAC_GUI.
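The idea of an action chain can be illustrated without the libraries themselves. The following is a minimal sketch, not the project's code: the action names (`SCHEDULER_ADD`, `SCHEDULER_CONNECT`, `SCHEDULER_READY`), the `ChainAction` type, and the `dispatchChain` helper are all hypothetical, and a plain lookup table stands in for the role that Redux-Saga effects play in the real application.

```typescript
// Hypothetical action shape; the project's real actions live under app/actions.
interface ChainAction {
  type: string;
  payload: string;
}

// Each handler reacts to one action and may return the next action in the chain.
// In the real project this role is played by Redux-Saga effects, not by this map.
type ChainHandler = (action: ChainAction) => ChainAction | null;

const chainHandlers: Record<string, ChainHandler> = {
  SCHEDULER_ADD: (a) => ({ type: "SCHEDULER_CONNECT", payload: a.payload }),
  SCHEDULER_CONNECT: (a) => ({ type: "SCHEDULER_READY", payload: a.payload }),
  SCHEDULER_READY: () => null, // end of the chain
};

// Dispatches the first action, then keeps dispatching whatever each handler returns.
function dispatchChain(first: ChainAction, log: string[]): void {
  let current: ChainAction | null = first;
  while (current !== null) {
    log.push(current.type);
    const handler: ChainHandler | undefined = chainHandlers[current.type];
    current = handler ? handler(current) : null;
  }
}

const dispatchedTypes: string[] = [];
dispatchChain({ type: "SCHEDULER_ADD", payload: "local" }, dispatchedTypes);
```

Reading a saga in the project amounts to tracing the same kind of sequence: which action starts the chain, and which actions are dispatched in response.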
The C-PAC project structure: C-PAC dashboard and C-PAC clusters (TACC and AWS clusters, in this case). The above image comes from Yiran Su and belongs to the public domain.
The scheduler is one of the most important concepts in C-PAC. As shown in the figure above, C-PAC schedulers receive pre-processing instructions from the web backend API and schedule the pre-processing runs of C-PAC containers.
The dashboard accesses schedulers by communicating with the Tornado web APIs. Since the dashboard is built locally, scheduler information needs to be added to the dashboard manually. Currently, adding a scheduler requires a name, an address (localhost for all current cases), a port number, and a string called authKey provided by the C-PAC clusters.
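The four pieces of information listed above can be pictured as a small record. This is a sketch under assumptions: the `SchedulerEntry` type, the `schedulerProblems` helper, and the example port and key are hypothetical and only mirror the fields named in the text, not the dashboard's actual data model.

```typescript
// Hypothetical shape of one scheduler entry; field names follow the text above.
interface SchedulerEntry {
  name: string;
  address: string; // "localhost" for all current cases
  port: number;
  authKey: string; // provided by the C-PAC cluster
}

// Returns a list of human-readable problems; an empty list means the entry looks usable.
function schedulerProblems(entry: SchedulerEntry): string[] {
  const problems: string[] = [];
  if (entry.name.trim() === "") problems.push("name is required");
  if (entry.address.trim() === "") problems.push("address is required");
  const portIsInteger = Math.floor(entry.port) === entry.port;
  if (!portIsInteger || entry.port < 1 || entry.port > 65535) {
    problems.push("port must be an integer between 1 and 65535");
  }
  if (entry.authKey.trim() === "") problems.push("authKey is required");
  return problems;
}

// Example values are placeholders, not real defaults.
const exampleScheduler: SchedulerEntry = {
  name: "local",
  address: "localhost",
  port: 3333,
  authKey: "example-key",
};
const exampleSchedulerProblems = schedulerProblems(exampleScheduler);
```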
A C-PAC execution needs to know which MRI scans to pre-process. This specification of scan data is called dataConfig. A dataConfig is a list of local file paths or URLs to anatomical and functional MRI scans, with additional necessary information listed. In other words, dataConfig is not a list of fMRI scans, but a list of paths to these fMRI scans. A dataConfig is organized in a YAML file, for example:
- anat: s3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/sub-0050646/anat/sub-0050646_T1w.nii.gz
  func:
    rest_run-1:
      scan: s3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/sub-0050646/func/sub-0050646_task-rest_run-1_bold.nii.gz
      scan_parameters: s3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/task-rest_bold.json
  site: CMU_a
  subject_id: '0050646'
  unique_id: ses-1
When submitting a C-PAC execution, the dashboard sends this dataConfig to the selected scheduler.
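Once parsed, the YAML example above corresponds to a list of plain objects. The sketch below shows that structure in TypeScript and collects the scan paths, which illustrates that a dataConfig carries paths rather than image data. The type names and the `listScanPaths` helper are hypothetical, and treating `scan_parameters` as optional is an assumption; only the field names come from the example.

```typescript
// Hypothetical types mirroring the fields in the YAML example.
interface FuncScanEntry {
  scan: string;
  scan_parameters?: string; // assumed optional
}

interface DataConfigEntry {
  anat: string;
  func: Record<string, FuncScanEntry>;
  site: string;
  subject_id: string;
  unique_id: string;
}

const exampleDataConfig: DataConfigEntry[] = [
  {
    anat: "s3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/sub-0050646/anat/sub-0050646_T1w.nii.gz",
    func: {
      "rest_run-1": {
        scan: "s3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/sub-0050646/func/sub-0050646_task-rest_run-1_bold.nii.gz",
        scan_parameters: "s3://fcp-indi/data/Projects/ABIDE/RawDataBIDS/CMU_a/task-rest_bold.json",
      },
    },
    site: "CMU_a",
    subject_id: "0050646",
    unique_id: "ses-1",
  },
];

// Collects every anatomical and functional scan path referenced by the config.
function listScanPaths(config: DataConfigEntry[]): string[] {
  const paths: string[] = [];
  config.forEach((entry) => {
    paths.push(entry.anat);
    Object.keys(entry.func).forEach((key) => {
      const funcScan = entry.func[key];
      if (funcScan) paths.push(funcScan.scan);
    });
  });
  return paths;
}

const exampleScanPaths = listScanPaths(exampleDataConfig);
```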
There are several options for generating a dataConfig:
- Build from BIDS. Neuroimaging data can be stored in a file structure called BIDS, which is popular among neuroimaging research institutes and individuals. However, the data format that C-PAC accepts is dataConfig, so C-PAC provides functionality that converts BIDS into dataConfig.
- Parse an uploaded dataConfig YAML file.
- Fetch a dataConfig YAML file from a URL.
Each C-PAC execution needs a definition of its workflow, that is, the MRI scan pre-processing steps it includes. This workflow information is sent to the selected scheduler once the dashboard submits a C-PAC execution to that scheduler.
This workflow is called a "pipeline". Defining a pipeline requires the user to configure a large number of pre-processing options. By configuring a pipeline, the user decides what kind of pre-processing will happen in the C-PAC clusters. Features such as importing and saving a pipeline make pipelines convenient to reuse.
The execution stage brings together all the previous configurations: the user specifies the pipeline and dataConfig for the execution and selects an available scheduler to run the C-PAC pre-processing. To improve the user experience, the execution status should be presented in the dashboard as well.
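As a rough illustration of how the three parts come together, the sketch below checks whether an execution is ready to submit. The `ExecutionDraft` type and the `missingParts` helper are hypothetical and heavily simplified; the real dashboard state and the payload sent to the scheduler may look quite different.

```typescript
// Hypothetical, simplified view of what the user has selected so far.
interface ExecutionDraft {
  scheduler?: { name: string };
  pipeline?: { name: string };
  dataConfig?: { subject_id: string }[];
}

// Lists which of the three required parts are still missing.
function missingParts(draft: ExecutionDraft): string[] {
  const missing: string[] = [];
  if (!draft.scheduler) missing.push("scheduler");
  if (!draft.pipeline) missing.push("pipeline");
  if (!draft.dataConfig || draft.dataConfig.length === 0) missing.push("dataConfig");
  return missing;
}

// Placeholder names; an execution is submittable only when nothing is missing.
const incompleteDraft: ExecutionDraft = { pipeline: { name: "default" } };
const completeDraft: ExecutionDraft = {
  scheduler: { name: "local" },
  pipeline: { name: "default" },
  dataConfig: [{ subject_id: "0050646" }],
};
const incompleteMissing = missingParts(incompleteDraft);
const completeMissing = missingParts(completeDraft);
```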
A set of Tornado web APIs runs in the C-PAC cluster for dashboard-cluster communication. Both HTTP and WebSocket are used. To relieve pressure on the backend, a debouncing strategy is applied to the WebSocket: node-log information is accumulated and sent over the WebSocket once every second. An authKey is also presented every time the service is restarted.
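The batching behavior described above can be sketched as follows. This is not the backend's implementation (which is Tornado-based); it is a minimal TypeScript illustration in which `createLogBatcher` and its parameters are hypothetical, and the clock is passed in explicitly so that the example stays deterministic instead of relying on real timers.

```typescript
// Buffers log lines and hands them to `send` at most once per `intervalMs`.
function createLogBatcher(
  send: (batch: string[]) => void,
  intervalMs: number,
  startMs: number
) {
  let buffer: string[] = [];
  let lastFlushMs = startMs;
  return {
    // Stack a log line without sending anything yet.
    add(line: string): void {
      buffer.push(line);
    },
    // Called periodically with the current time; flushes when the interval has elapsed.
    tick(nowMs: number): void {
      if (nowMs - lastFlushMs < intervalMs) return;
      lastFlushMs = nowMs;
      if (buffer.length === 0) return;
      const batch = buffer;
      buffer = [];
      send(batch);
    },
  };
}

const sentBatches: string[][] = [];
const logBatcher = createLogBatcher((batch) => sentBatches.push(batch), 1000, 0);
logBatcher.add("node A started");
logBatcher.add("node A finished");
logBatcher.tick(500); // too early, nothing is sent
logBatcher.tick(1000); // both lines go out together as one message
```

Sending one message per second instead of one per log line keeps the number of WebSocket messages bounded even when many nodes log at once.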