diff --git a/INSTALL.md b/INSTALL.md
index f1589893..b6142cc0 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -1,82 +1,85 @@
 # How to install NARPS Open Pipelines ?
 
-## 1 - Get the code
+## 1 - Fork the repository
 
-First, [fork](https://docs.github.com/en/get-started/quickstart/fork-a-repo) the repository, so you have your own working copy of it.
+[Fork](https://docs.github.com/en/get-started/quickstart/fork-a-repo) the repository, so you have your own working copy of it.
 
-Then, you have two options to [clone](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository) the project :
+## 2 - Clone the code
 
-### Option 1: Using DataLad (recommended)
+First, install [Datalad](https://www.datalad.org/). This will allow you to access the NARPS data easily, as it is included in the repository as [datalad subdatasets](http://handbook.datalad.org/en/latest/basics/101-106-nesting.html).
 
-Cloning the fork using [Datalad](https://www.datalad.org/) will allow you to get the code as well as "links" to the data, because the NARPS data is bundled in this repository as [datalad subdatasets](http://handbook.datalad.org/en/latest/basics/101-106-nesting.html).
+Then, [clone](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository) the project:
 
 ```bash
+# Replace YOUR_GITHUB_USERNAME in the following command.
 datalad install --recursive https://github.com/YOUR_GITHUB_USERNAME/narps_open_pipelines.git
 ```
 
-### Option 2: Using Git
+> [!WARNING]
+> It is still possible to clone the fork using [git](https://git-scm.com/), but by doing this, you will only get the code.
+> ```bash
+> # Replace YOUR_GITHUB_USERNAME in the following command.
+> git clone https://github.com/YOUR_GITHUB_USERNAME/narps_open_pipelines.git
+> ```
 
-Cloning the fork using [git](https://git-scm.com/) ; by doing this, you will only get the code.
+## 3 - Get the data
 
-```bash
-git clone https://github.com/YOUR_GITHUB_USERNAME/narps_open_pipelines.git
-```
-
-## 2 - Get the data
+Now that you have cloned the repository using Datalad, you are able to get the data:
 
-Ignore this step if you used DataLad (option 1) in the previous step.
-
-Otherwise, there are several ways to get the data.
+```bash
+# Move inside the root directory of the repository.
+cd narps_open_pipelines
 
-## 3 - Set up the environment
+# Select the data you want to download. Here is an example to get the data of the first 4 subjects.
+datalad get data/original/ds001734/sub-00[1-4] -J 12
+datalad get data/original/ds001734/derivatives/fmriprep/sub-00[1-4] -J 12
+```
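+
+The same `datalad get` pattern extends to any other part of the dataset, and DataLad can also free disk space again. The commands below are a sketch of standard DataLad usage, not part of the original instructions; the paths simply follow the layout shown above:
+
+```bash
+# Get the raw data and the fMRIprep derivatives for all subjects (a much larger download).
+datalad get data/original/ds001734/ -J 12
+datalad get data/original/ds001734/derivatives/fmriprep/ -J 12
+
+# Drop the content of one subject to free disk space (it can be re-downloaded later with `datalad get`).
+datalad drop data/original/ds001734/sub-001
+```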
 
-The Narps Open Pipelines project is build upon several dependencies, such as [Nipype](https://nipype.readthedocs.io/en/latest/) but also the original software packages used by the pipelines (SPM, FSL, AFNI...).
+> [!NOTE]
+> For further information and alternatives on how to get the data, see the corresponding documentation page [docs/data.md](docs/data.md).
 
-To facilitate this step, we created a Docker container based on [Neurodocker](https://github.com/ReproNim/neurodocker) that contains the necessary Python packages and software. To install the Docker image, two options are available.
+## 4 - Set up the environment
 
-### Option 1: Using Dockerhub
+[Install Docker](https://docs.docker.com/engine/install/), then pull the Docker image:
 
 ```bash
 docker pull elodiegermani/open_pipeline:latest
 ```
 
-The image should install itself. Once it's done you can check the image is available on your system:
+Once it's done, you can check that the image is available on your system:
 
 ```bash
 docker images
+  REPOSITORY                               TAG       IMAGE ID       CREATED        SIZE
 docker.io/elodiegermani/open_pipeline      latest    0f3c74d28406   9 months ago   22.7 GB
 ```
 
-### Option 2: Using a Dockerfile
+> [!NOTE]
+> Feel free to read this documentation page [docs/environment.md](docs/environment.md) to get further information about this environment.
+
+## 5 - Run the project
+
+Start a Docker container from the Docker image:
+
+```bash
+# Replace PATH_TO_THE_REPOSITORY in the following command (e.g. with /home/user/dev/narps_open_pipelines/)
+docker run -it -v PATH_TO_THE_REPOSITORY:/home/neuro/code/ elodiegermani/open_pipeline
+```
 
-The Dockerfile used to create the image stored on DockerHub is available at the root of the repository ([Dockerfile](Dockerfile)). But you might want to personalize this Dockerfile. To do so, change the command below that will generate a new Dockerfile:
+Install NARPS Open Pipelines inside the container:
 
 ```bash
-docker run --rm repronim/neurodocker:0.7.0 generate docker \
-    --base neurodebian:stretch-non-free --pkg-manager apt \
-    --install git \
-    --fsl version=6.0.3 \
-    --afni version=latest method=binaries install_r=true install_r_pkgs=true install_python2=true install_python3=true \
-    --spm12 version=r7771 method=binaries \
-    --user=neuro \
-    --workdir /home \
-    --miniconda create_env=neuro \
-        conda_install="python=3.8 traits jupyter nilearn graphviz nipype scikit-image" \
-        pip_install="matplotlib" \
-        activate=True \
-    --env LD_LIBRARY_PATH="/opt/miniconda-latest/envs/neuro:$LD_LIBRARY_PATH" \
-    --run-bash "source activate neuro" \
-    --user=root \
-    --run 'chmod 777 -Rf /home' \
-    --run 'chown -R neuro /home' \
-    --user=neuro \
-    --run 'mkdir -p ~/.jupyter && echo c.NotebookApp.ip = \"0.0.0.0\" > ~/.jupyter/jupyter_notebook_config.py' > Dockerfile
+source activate neuro
+cd /home/neuro/code/
+pip install .
 ```
 
-When you are satisfied with your Dockerfile, just build the image:
+Finally, you are able to run the pipelines:
 
 ```bash
-docker build --tag [name_of_the_image] - < Dockerfile
+python narps_open/runner.py
+  usage: runner.py [-h] -t TEAM (-r RSUBJECTS | -s SUBJECTS [SUBJECTS ...] | -n NSUBJECTS) [-g | -f] [-c]
 ```
 
-When the image is built, follow the instructions in [docs/environment.md](docs/environment.md) to start the environment from it.
+> [!NOTE]
+> For further information, read this documentation page [docs/running.md](docs/running.md).
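+
+For example, to run the pipeline of team `2T6S` on 4 randomly selected subjects (this mirrors the examples given in [docs/running.md](docs/running.md); the team ID is only an illustration):
+
+```bash
+python narps_open/runner.py -t 2T6S -r 4
+```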
diff --git a/README.md b/README.md
index 7ad3172c..20125d83 100644
--- a/README.md
+++ b/README.md
@@ -72,6 +72,6 @@ This project is developed in the Empenn team by Boris Clenet, Elodie Germani, Je
 In addition, this project was presented and received contributions during the following events:
  - OHBM Brainhack 2022 (June 2022): Elodie Germani, Arshitha Basavaraj, Trang Cao, Rémi Gau, Anna Menacher, Camille Maumet.
- - e-ReproNim FENS NENS Cluster Brainhack:
- - OHBM Brainhack 2023 (July 2023):
+ - e-ReproNim FENS NENS Cluster Brainhack (June 2023): Liz Bushby, Boris Clénet, Michael Dayan, Aimee Westbrook.
+ - OHBM Brainhack 2023 (July 2023): Arshitha Basavaraj, Boris Clénet, Rémi Gau, Élodie Germani, Yaroslav Halchenko, Camille Maumet, Paul Taylor.
  - ORIGAMI lab hackathon (Sept 2023):
diff --git a/docs/environment.md b/docs/environment.md
index 98addd6a..edab9b4d 100644
--- a/docs/environment.md
+++ b/docs/environment.md
@@ -1,100 +1,82 @@
-# Set up the environment to run pipelines
+# About the environment of NARPS Open Pipelines
 
-## Run a docker container :whale:
+## The Docker container :whale:
 
-Start a container using the command below:
+The NARPS Open Pipelines project is built upon several dependencies, such as [Nipype](https://nipype.readthedocs.io/en/latest/) but also the original software packages used by the pipelines (SPM, FSL, AFNI...). Therefore, we created a Docker container based on [Neurodocker](https://github.com/ReproNim/neurodocker) that contains the software dependencies.
 
-```bash
-docker run -ti \
-    -p 8888:8888 \
-    elodiegermani/open_pipeline
-```
-
-On this command line, you need to add volumes to be able to link with your local files (original dataset and git repository). If you stored the original dataset in `data/original`, just make a volume with the `narps_open_pipelines` directory:
-
-```bash
-docker run -ti \
-    -p 8888:8888 \
-    -v /users/egermani/Documents/narps_open_pipelines:/home/ \
-    elodiegermani/open_pipeline
-```
-
-If it is in another directory, make a second volume with the path to your dataset:
-
-```bash
-docker run -ti \
-    -p 8888:8888 \
-    -v /Users/egermani/Documents/narps_open_pipelines:/home/ \
-    -v /Users/egermani/Documents/data/NARPS/:/data/ \
-    elodiegermani/open_pipeline
-```
-
-After that, your container will be launched!
-
-## Other useful docker commands
-
-### START A CONTAINER
-
-```bash
-docker start [name_of_the_container]
-```
-
-### VERIFY A CONTAINER IS IN THE LIST
-
-```bash
-docker ps
-```
-
-### EXECUTE BASH OR ATTACH YOUR CONTAINER
+The simplest way to start the container is to use the command below:
 
 ```bash
-docker exec -ti [name_of_the_container] bash
+docker run -it elodiegermani/open_pipeline
 ```
 
-**OR**
+To this command, you need to add volumes to be able to link with your local files (the code repository).
 
 ```bash
-docker attach [name_of_the_container]
-```
+# Replace PATH_TO_THE_REPOSITORY in the following command (e.g. with /home/user/dev/narps_open_pipelines/)
+docker run -it \
+    -v PATH_TO_THE_REPOSITORY:/home/neuro/code/ \
+    elodiegermani/open_pipeline
+```
 
-## Useful commands inside the container
+## Use Jupyter with the container
 
-### ACTIVATE CONDA ENVIRONMENT
+If you wish to use [Jupyter](https://jupyter.org/) to run the code, port forwarding is needed:
 
 ```bash
-source activate neuro
-```
+docker run -it \
+    -v PATH_TO_THE_REPOSITORY:/home/neuro/code/ \
+    -p 8888:8888 \
+    elodiegermani/open_pipeline
+```
 
-### LAUNCH JUPYTER NOTEBOOK
+Then, from inside the container:
 
 ```bash
 jupyter notebook --port=8888 --no-browser --ip=0.0.0.0
 ```
 
-## If you did not use your container for a while
+You can now access Jupyter using the address provided by the command line.
 
-Verify it still runs :
+> [!NOTE]
+> Find useful information on the [Docker documentation page](https://docs.docker.com/get-started/). Here is a [cheat sheet with Docker commands](https://docs.docker.com/get-started/docker_cheatsheet.pdf).
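+
+If you would rather not create a brand new container every time, you can give the container a name and reuse it later. This is a sketch based on standard Docker commands, not part of the original documentation (the name `narps_open` is an arbitrary example, and PATH_TO_THE_REPOSITORY must be replaced as above):
+
+```bash
+# Create a named container once.
+docker run -it --name narps_open -v PATH_TO_THE_REPOSITORY:/home/neuro/code/ elodiegermani/open_pipeline
+
+# Later: list existing containers, restart yours and attach to it.
+docker ps -a
+docker start narps_open
+docker attach narps_open
+```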
 
-```bash
-docker ps -l
-```
+## Create a custom Docker image
 
-If your container is in the list, run :
+The `elodiegermani/open_pipeline` Docker image is based on [Neurodocker](https://github.com/ReproNim/neurodocker). It was created using the following command line:
 
 ```bash
-docker start [name_of_the_container]
+docker run --rm repronim/neurodocker:0.7.0 generate docker \
+    --base neurodebian:stretch-non-free --pkg-manager apt \
+    --install git \
+    --fsl version=6.0.3 \
+    --afni version=latest method=binaries install_r=true install_r_pkgs=true install_python2=true install_python3=true \
+    --spm12 version=r7771 method=binaries \
+    --user=neuro \
+    --workdir /home \
+    --miniconda create_env=neuro \
+        conda_install="python=3.8 traits jupyter nilearn graphviz nipype scikit-image" \
+        pip_install="matplotlib" \
+        activate=True \
+    --env LD_LIBRARY_PATH="/opt/miniconda-latest/envs/neuro:$LD_LIBRARY_PATH" \
+    --run-bash "source activate neuro" \
+    --user=root \
+    --run 'chmod 777 -Rf /home' \
+    --run 'chown -R neuro /home' \
+    --user=neuro \
+    --run 'mkdir -p ~/.jupyter && echo c.NotebookApp.ip = \"0.0.0.0\" > ~/.jupyter/jupyter_notebook_config.py' > Dockerfile
 ```
 
-Else, relaunch it with :
+If you wish to create your own custom environment, make changes to the parameters and build your custom image from the generated Dockerfile.
 
 ```bash
-docker run -ti \
-    -p 8888:8888 \
-    -v /home/egermani:/home \
-    [name_of_the_image]
+# Replace IMAGE_NAME in the following command
+docker build --tag IMAGE_NAME - < Dockerfile
 ```
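+
+You can then start a container from this custom image in the same way as with the default one. A sketch (IMAGE_NAME is the tag chosen above, and PATH_TO_THE_REPOSITORY is replaced as before):
+
+```bash
+# Replace IMAGE_NAME and PATH_TO_THE_REPOSITORY in the following command.
+docker run -it -v PATH_TO_THE_REPOSITORY:/home/neuro/code/ IMAGE_NAME
+```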
 
-### To use SPM inside the container, use this command at the beginning of your script:
+## Good to know
+
+To use SPM inside the container, use this command at the beginning of your script:
 
 ```python
 from nipype.interfaces import spm
diff --git a/docs/running.md b/docs/running.md
index 6344c042..eb614eef 100644
--- a/docs/running.md
+++ b/docs/running.md
@@ -1,6 +1,33 @@
-# :running: How to run NARPS open pipelines ?
+# How to run NARPS open pipelines ? :running:
 
-## Using the `PipelineRunner`
+## Using the runner application
+
+The `narps_open.runner` module allows you to run pipelines from the command line:
+
+```bash
+python narps_open/runner.py -h
+  usage: runner.py [-h] -t TEAM (-r RANDOM | -s SUBJECTS [SUBJECTS ...]) [-g | -f]
+
+  Run the pipelines from NARPS.
+
+  options:
+    -h, --help            show this help message and exit
+    -t TEAM, --team TEAM  the team ID
+    -r RANDOM, --random RANDOM  the number of subjects to be randomly selected
+    -s SUBJECTS [SUBJECTS ...], --subjects SUBJECTS [SUBJECTS ...]  a list of subjects
+    -g, --group           run the group level only
+    -f, --first           run the first levels only (preprocessing + subjects + runs)
+    -c, --check           check pipeline outputs (runner is not launched)
+
+python narps_open/runner.py -t 2T6S -s 001 006 020 100
+python narps_open/runner.py -t 2T6S -r 4
+python narps_open/runner.py -t 2T6S -r 4 -f
+python narps_open/runner.py -t 2T6S -r 4 -f -c # Check the output files without launching the runner
+```
+
+In this use case, the path to the dataset and the path where outputs are stored are picked by the runner from the [configuration](docs/configuration.md).
+
+## Using the `PipelineRunner` object
 
 The class `PipelineRunner` is available from the `narps_open.runner` module. You can use it from inside python code, as follows :
 
@@ -35,30 +62,3 @@ runner.start(True, True)
 
 runner.get_missing_first_level_outputs()
 runner.get_missing_group_level_outputs()
-
-## Using the runner application
-
-The `narps_open.runner` module also allows to run pipelines from the command line :
-
-```bash
-python narps_open/runner.py -h
-  usage: runner.py [-h] -t TEAM (-r RANDOM | -s SUBJECTS [SUBJECTS ...]) [-g | -f]
-
-  Run the pipelines from NARPS.
-
-  options:
-    -h, --help            show this help message and exit
-    -t TEAM, --team TEAM  the team ID
-    -r RANDOM, --random RANDOM  the number of subjects to be randomly selected
-    -s SUBJECTS [SUBJECTS ...], --subjects SUBJECTS [SUBJECTS ...]  a list of subjects
-    -g, --group           run the group level only
-    -f, --first           run the first levels only (preprocessing + subjects + runs)
-    -c, --check           check pipeline outputs (runner is not launched)
-
-python narps_open/runner.py -t 2T6S -s 001 006 020 100
-python narps_open/runner.py -t 2T6S -r 4
-python narps_open/runner.py -t 2T6S -r 4 -f
-python narps_open/runner.py -t 2T6S -r 4 -f -c # Check the output files without launching the runner
-```
-
-In this usecase, the paths where to store the outputs and to the dataset are picked by the runner from the [configuration](docs/configuration.md).