Credits section + install documentation #91

Merged
merged 10 commits
Oct 5, 2023
91 changes: 47 additions & 44 deletions INSTALL.md
@@ -1,82 +1,85 @@
# How to install NARPS Open Pipelines?

## 1 - Get the code
## 1 - Fork the repository

First, [fork](https://docs.github.com/en/get-started/quickstart/fork-a-repo) the repository, so you have your own working copy of it.
[Fork](https://docs.github.com/en/get-started/quickstart/fork-a-repo) the repository, so you have your own working copy of it.
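
If you prefer working from a terminal, here is a sketch of the same step using the [GitHub CLI](https://cli.github.com/), assuming it is installed and that the upstream repository is `Inria-Empenn/narps_open_pipelines`:

```bash
# Fork the upstream repository under your own account, without cloning it yet.
gh repo fork Inria-Empenn/narps_open_pipelines --clone=false
```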

Then, you have two options to [clone](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository) the project:
## 2 - Clone the code

### Option 1: Using DataLad (recommended)
First, install [DataLad](https://www.datalad.org/). This will allow you to access the NARPS data easily, as it is included in the repository as [datalad subdatasets](http://handbook.datalad.org/en/latest/basics/101-106-nesting.html).

Cloning the fork using [Datalad](https://www.datalad.org/) will allow you to get the code as well as "links" to the data, because the NARPS data is bundled in this repository as [datalad subdatasets](http://handbook.datalad.org/en/latest/basics/101-106-nesting.html).
Then, [clone](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository) the project:

```bash
# Replace YOUR_GITHUB_USERNAME in the following command.
datalad install --recursive https://github.com/YOUR_GITHUB_USERNAME/narps_open_pipelines.git
```
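
As a quick sanity check, you can list the subdatasets registered in your clone; the NARPS data under `data/original/ds001734` should appear among them:

```bash
cd narps_open_pipelines
# Print one line per registered subdataset.
datalad subdatasets
```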

### Option 2: Using Git
> [!WARNING]
> It is still possible to clone the fork using [git](https://git-scm.com/), but by doing this, you will only get the code.
> ```bash
> # Replace YOUR_GITHUB_USERNAME in the following command.
> git clone https://github.com/YOUR_GITHUB_USERNAME/narps_open_pipelines.git
> ```
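
Whichever option you choose, if you plan to contribute changes you may also want to track the upstream repository; a sketch, assuming it lives at `Inria-Empenn/narps_open_pipelines`:

```bash
cd narps_open_pipelines
# Register the upstream repository to keep your fork in sync.
git remote add upstream https://github.com/Inria-Empenn/narps_open_pipelines.git
git fetch upstream
```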

You can also clone the fork using [git](https://git-scm.com/); by doing this, you will only get the code.
## 3 - Get the data

```bash
git clone https://github.com/YOUR_GITHUB_USERNAME/narps_open_pipelines.git
```

## 2 - Get the data
Now that you have cloned the repository using DataLad, you are able to get the data:

Ignore this step if you used DataLad (option 1) in the previous step.

Otherwise, there are several ways to get the data.
```bash
# Move inside the root directory of the repository.
cd narps_open_pipelines

# Select the data you want to download. Here is an example to get data of the first 4 subjects.
datalad get data/original/ds001734/sub-00[1-4] -J 12
datalad get data/original/ds001734/derivatives/fmriprep/sub-00[1-4] -J 12
```
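
Since DataLad only downloads file content on demand, you can also free disk space once you are done with a subject; the files remain referenced and can be re-downloaded later with `datalad get`:

```bash
# Remove the locally cached content for subject 001 (an example path from above).
datalad drop data/original/ds001734/sub-001
```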

The NARPS Open Pipelines project is built upon several dependencies, such as [Nipype](https://nipype.readthedocs.io/en/latest/), but also the original software packages used by the pipelines (SPM, FSL, AFNI...).
> [!NOTE]
> For further information and alternatives on how to get the data, see the corresponding documentation page [docs/data.md](docs/data.md).

To facilitate this step, we created a Docker container based on [Neurodocker](https://github.com/ReproNim/neurodocker) that contains the necessary Python packages and software. To install the Docker image, two options are available.
## 4 - Set up the environment

### Option 1: Using Dockerhub
[Install Docker](https://docs.docker.com/engine/install/) then pull the Docker image:

```bash
docker pull elodiegermani/open_pipeline:latest
```

The image should install itself. Once it's done you can check the image is available on your system:
Once it's done, you can check that the image is available on your system:

```bash
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/elodiegermani/open_pipeline latest 0f3c74d28406 9 months ago 22.7 GB
```

### Option 2: Using a Dockerfile
> [!NOTE]
> Feel free to read this documentation page [docs/environment.md](docs/environment.md) to get further information about this environment.

## 5 - Run the project

Start a Docker container from the Docker image:

```bash
# Replace PATH_TO_THE_REPOSITORY in the following command (e.g.: with /home/user/dev/narps_open_pipelines/)
docker run -it -v PATH_TO_THE_REPOSITORY:/home/neuro/code/ elodiegermani/open_pipeline
```
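
If you keep a copy of the dataset outside the repository, a second volume can be mounted as well; a sketch, where `PATH_TO_THE_DATA` is a placeholder for your own data directory:

```bash
# Replace PATH_TO_THE_REPOSITORY and PATH_TO_THE_DATA with your own paths.
docker run -it \
-v PATH_TO_THE_REPOSITORY:/home/neuro/code/ \
-v PATH_TO_THE_DATA:/data/ \
elodiegermani/open_pipeline
```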

The Dockerfile used to create the image stored on DockerHub is available at the root of the repository ([Dockerfile](Dockerfile)), but you might want to personalize it. To do so, edit the command below, which generates a new Dockerfile:
Install NARPS Open Pipelines inside the container:

```bash
docker run --rm repronim/neurodocker:0.7.0 generate docker \
--base neurodebian:stretch-non-free --pkg-manager apt \
--install git \
--fsl version=6.0.3 \
--afni version=latest method=binaries install_r=true install_r_pkgs=true install_python2=true install_python3=true \
--spm12 version=r7771 method=binaries \
--user=neuro \
--workdir /home \
--miniconda create_env=neuro \
conda_install="python=3.8 traits jupyter nilearn graphviz nipype scikit-image" \
pip_install="matplotlib" \
activate=True \
--env LD_LIBRARY_PATH="/opt/miniconda-latest/envs/neuro:$LD_LIBRARY_PATH" \
--run-bash "source activate neuro" \
--user=root \
--run 'chmod 777 -Rf /home' \
--run 'chown -R neuro /home' \
--user=neuro \
--run 'mkdir -p ~/.jupyter && echo c.NotebookApp.ip = \"0.0.0.0\" > ~/.jupyter/jupyter_notebook_config.py' > Dockerfile
```

```bash
source activate neuro
cd /home/neuro/code/
pip install .
```
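
To verify the installation, a minimal check that the package can be imported from the active environment:

```bash
python -c "import narps_open; print(narps_open.__file__)"
```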

When you are satisfied with your Dockerfile, just build the image:
Finally, you are able to run the pipelines:

```bash
docker build --tag [name_of_the_image] - < Dockerfile
```

```bash
python narps_open/runner.py
usage: runner.py [-h] -t TEAM (-r RSUBJECTS | -s SUBJECTS [SUBJECTS ...] | -n NSUBJECTS) [-g | -f] [-c]
```
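
For example, the following command (taken from [docs/running.md](docs/running.md)) runs the pipeline of team `2T6S` on 4 randomly selected subjects:

```bash
python narps_open/runner.py -t 2T6S -r 4
```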

When the image is built, follow the instructions in [docs/environment.md](docs/environment.md) to start the environment from it.
> [!NOTE]
> For further information, read this documentation page [docs/running.md](docs/running.md).
4 changes: 2 additions & 2 deletions README.md
@@ -72,6 +72,6 @@ This project is developed in the Empenn team by Boris Clenet, Elodie Germani, Je

In addition, this project was presented and received contributions during the following events:
- OHBM Brainhack 2022 (June 2022): Elodie Germani, Arshitha Basavaraj, Trang Cao, Rémi Gau, Anna Menacher, Camille Maumet.
- e-ReproNim FENS NENS Cluster Brainhack: <ADD_NAMES_HERE>
- OHBM Brainhack 2023 (July 2023): <ADD_NAMES_HERE>
- e-ReproNim FENS NENS Cluster Brainhack (June 2023): Liz Bushby, Boris Clénet, Michael Dayan, Aimee Westbrook.
- OHBM Brainhack 2023 (July 2023): Arshitha Basavaraj, Boris Clénet, Rémi Gau, Élodie Germani, Yaroslav Halchenko, Camille Maumet, Paul Taylor.
- ORIGAMI lab hackathon (Sept 2023):
116 changes: 49 additions & 67 deletions docs/environment.md
@@ -1,100 +1,82 @@
# Set up the environment to run pipelines
# About the environment of NARPS Open Pipelines

## Run a docker container :whale:
## The Docker container :whale:

Start a container using the command below:
The NARPS Open Pipelines project is built upon several dependencies, such as [Nipype](https://nipype.readthedocs.io/en/latest/), but also the original software packages used by the pipelines (SPM, FSL, AFNI...). Therefore, we created a Docker container based on [Neurodocker](https://github.com/ReproNim/neurodocker) that contains these software dependencies.

```bash
docker run -ti \
-p 8888:8888 \
elodiegermani/open_pipeline
```

On this command line, you need to add volumes to be able to link with your local files (original dataset and git repository). If you stored the original dataset in `data/original`, just make a volume with the `narps_open_pipelines` directory:

```bash
docker run -ti \
-p 8888:8888 \
-v /users/egermani/Documents/narps_open_pipelines:/home/ \
elodiegermani/open_pipeline
```

If it is in another directory, make a second volume with the path to your dataset:

```bash
docker run -ti \
-p 8888:8888 \
-v /Users/egermani/Documents/narps_open_pipelines:/home/ \
-v /Users/egermani/Documents/data/NARPS/:/data/ \
elodiegermani/open_pipeline
```

After that, your container will be launched!

## Other useful docker commands

### START A CONTAINER

```bash
docker start [name_of_the_container]
```

### VERIFY A CONTAINER IS IN THE LIST

```bash
docker ps
```

### EXECUTE BASH OR ATTACH YOUR CONTAINER
The simplest way to start the container is to use the command below:

```bash
docker exec -ti [name_of_the_container] bash
```

```bash
docker run -it elodiegermani/open_pipeline
```

**OR**
To this command line, you need to add volumes so that the container can access your local files (the code repository).

```bash
docker attach [name_of_the_container]
```

```bash
# Replace PATH_TO_THE_REPOSITORY in the following command (e.g.: with /home/user/dev/narps_open_pipelines/)
docker run -it \
-v PATH_TO_THE_REPOSITORY:/home/neuro/code/ \
elodiegermani/open_pipeline
```
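
If you plan to reuse the same container across sessions, you can give it a name (here `narps_open`, an arbitrary example) and restart it later instead of creating a new container each time:

```bash
docker run -it --name narps_open \
-v PATH_TO_THE_REPOSITORY:/home/neuro/code/ \
elodiegermani/open_pipeline

# Later, restart and re-attach to the same container.
docker start -ai narps_open
```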

## Useful commands inside the container
## Use Jupyter with the container

### ACTIVATE CONDA ENVIRONMENT
If you wish to use [Jupyter](https://jupyter.org/) to run the code, port forwarding is needed:

```bash
source activate neuro
```

```bash
docker run -it \
-v PATH_TO_THE_REPOSITORY:/home/neuro/code/ \
-p 8888:8888 \
elodiegermani/open_pipeline
```

### LAUNCH JUPYTER NOTEBOOK
Then, from inside the container:

```bash
jupyter notebook --port=8888 --no-browser --ip=0.0.0.0
```

## If you did not use your container for a while
You can now access Jupyter using the address displayed by the command.

Verify it still runs:
> [!NOTE]
> Find useful information on the [Docker documentation page](https://docs.docker.com/get-started/). Here is a [cheat sheet with Docker commands](https://docs.docker.com/get-started/docker_cheatsheet.pdf).

```bash
docker ps -l
```
## Create a custom Docker image

If your container is in the list, run :
The `elodiegermani/open_pipeline` Docker image is based on [Neurodocker](https://github.com/ReproNim/neurodocker). It was created using the following command line:

```bash
docker start [name_of_the_container]
```

```bash
docker run --rm repronim/neurodocker:0.7.0 generate docker \
--base neurodebian:stretch-non-free --pkg-manager apt \
--install git \
--fsl version=6.0.3 \
--afni version=latest method=binaries install_r=true install_r_pkgs=true install_python2=true install_python3=true \
--spm12 version=r7771 method=binaries \
--user=neuro \
--workdir /home \
--miniconda create_env=neuro \
conda_install="python=3.8 traits jupyter nilearn graphviz nipype scikit-image" \
pip_install="matplotlib" \
activate=True \
--env LD_LIBRARY_PATH="/opt/miniconda-latest/envs/neuro:$LD_LIBRARY_PATH" \
--run-bash "source activate neuro" \
--user=root \
--run 'chmod 777 -Rf /home' \
--run 'chown -R neuro /home' \
--user=neuro \
--run 'mkdir -p ~/.jupyter && echo c.NotebookApp.ip = \"0.0.0.0\" > ~/.jupyter/jupyter_notebook_config.py' > Dockerfile
```

Else, relaunch it with :
If you wish to create your own custom environment, make changes to the parameters of this command, then build your custom image from the generated Dockerfile.

```bash
docker run -ti \
-p 8888:8888 \
-v /home/egermani:/home \
[name_of_the_image]
```

```bash
# Replace IMAGE_NAME in the following command
docker build --tag IMAGE_NAME - < Dockerfile
```
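
You can then start a container from your custom image, using the same volume options as before:

```bash
# Replace IMAGE_NAME and PATH_TO_THE_REPOSITORY in the following command.
docker run -it \
-v PATH_TO_THE_REPOSITORY:/home/neuro/code/ \
IMAGE_NAME
```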

### To use SPM inside the container, use this command at the beginning of your script:
## Good to know

To use SPM inside the container, use this command at the beginning of your script:

```python
from nipype.interfaces import spm
# These paths are an assumption, consistent with the Neurodocker command above
# (SPM12 r7771 standalone, run through the MATLAB Compiler Runtime v95).
matlab_cmd = '/opt/spm12-r7771/run_spm12.sh /opt/matlab-compiler-runtime/v95/ script'
spm.SPMCommand.set_mlab_paths(matlab_cmd=matlab_cmd, use_mcr=True)
```
58 changes: 29 additions & 29 deletions docs/running.md
@@ -1,6 +1,33 @@
# :running: How to run NARPS open pipelines ?
# How to run NARPS open pipelines? :running:

## Using the `PipelineRunner`
## Using the runner application

The `narps_open.runner` module allows running pipelines from the command line:

```bash
python narps_open/runner.py -h
usage: runner.py [-h] -t TEAM (-r RANDOM | -s SUBJECTS [SUBJECTS ...]) [-g | -f]

Run the pipelines from NARPS.

options:
-h, --help show this help message and exit
-t TEAM, --team TEAM the team ID
-r RANDOM, --random RANDOM the number of subjects to be randomly selected
-s SUBJECTS [SUBJECTS ...], --subjects SUBJECTS [SUBJECTS ...] a list of subjects
-g, --group run the group level only
-f, --first run the first levels only (preprocessing + subjects + runs)
-c, --check check pipeline outputs (runner is not launched)

python narps_open/runner.py -t 2T6S -s 001 006 020 100
python narps_open/runner.py -t 2T6S -r 4
python narps_open/runner.py -t 2T6S -r 4 -f
python narps_open/runner.py -t 2T6S -r 4 -f -c # Check the output files without launching the runner
```

In this use case, the paths to the dataset and to the outputs are picked by the runner from the [configuration](docs/configuration.md).

## Using the `PipelineRunner` object

The class `PipelineRunner` is available from the `narps_open.runner` module. You can use it from inside Python code, as follows:

@@ -35,30 +62,3 @@ runner.start(True, True)
runner.get_missing_first_level_outputs()
runner.get_missing_group_level_outputs()
```

## Using the runner application

The `narps_open.runner` module also allows running pipelines from the command line:

```bash
python narps_open/runner.py -h
usage: runner.py [-h] -t TEAM (-r RANDOM | -s SUBJECTS [SUBJECTS ...]) [-g | -f]

Run the pipelines from NARPS.

options:
-h, --help show this help message and exit
-t TEAM, --team TEAM the team ID
-r RANDOM, --random RANDOM the number of subjects to be randomly selected
-s SUBJECTS [SUBJECTS ...], --subjects SUBJECTS [SUBJECTS ...] a list of subjects
-g, --group run the group level only
-f, --first run the first levels only (preprocessing + subjects + runs)
-c, --check check pipeline outputs (runner is not launched)

python narps_open/runner.py -t 2T6S -s 001 006 020 100
python narps_open/runner.py -t 2T6S -r 4
python narps_open/runner.py -t 2T6S -r 4 -f
python narps_open/runner.py -t 2T6S -r 4 -f -c # Check the output files without launching the runner
```

In this use case, the paths to the dataset and to the outputs are picked by the runner from the [configuration](docs/configuration.md).