
# Set up your environment

## Install Python

First, connect to your interactive HPC node.

You will need Python 3; how you install it may vary per cluster.
For example, if your cluster has a module system, the command may be something like this:

```bash
module load python3
```

If you have root privileges, you may also be able to install Python directly:

```bash
sudo apt-get update
sudo apt-get install python3.8
```

Check with your sysadmin or documentation for more information.
Record any commands required, then check that the command works and is on your PATH:

```bash
python3 --version
```
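
You should see output similar to the following; the exact version depends on what your cluster provides:

```bash
$ python3 --version
Python 3.8.10   # example output -- any Python 3 version your cluster offers
```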

## Clone the Cast repo and set up a virtual environment

Install pipenv, which isolates our script dependencies, into your home directory:

```bash
python3 -m pip install pipenv
```

Snag a copy of this repository:

If you are tracking your settings as shown previously, first add the following to your HPC user's `~/.ssh/config`:

```
Host github.com
  IdentityFile ~/fw-cast-st-jude-key
```

Then clone and enter the repository:

```bash
git clone <your-github-location> fw-cast
cd fw-cast
```

Set up the pipenv for the `fw-cast` project:

```bash
cd src
python3 -m pipenv install
cd ../
```
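
Optionally, you can confirm the virtual environment was created by asking pipenv to print its location:

```bash
cd src
python3 -m pipenv --venv   # prints the path of the project's virtualenv
cd ../
```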

## Run the setup script

Prepare your cluster-specific files by running the setup script. You may have to prepend `bash` or `sh`.

```bash
./process/setup.sh
```

**Important:** In a shared environment, protect your credentials:

```bash
chmod 0600 ./settings/credentials.sh
```
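
You can verify the result; only your user should have read/write access:

```bash
ls -l ./settings/credentials.sh
# expected mode: -rw------- (owner read/write only)
```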

## Configure

The setup script just generated a new `settings` folder.
Edit each of the following files in turn to configure Cast for your cluster:

| Filename         | Purpose                                                          |
| ---------------- | ---------------------------------------------------------------- |
| `cast.yml`       | High-level settings                                              |
| `credentials.sh` | Sensitive information and Singularity environment config options |
| `start-cast.sh`  | Bootstrap script                                                 |

Each file has a variety of comments to guide you through the process.
Work with your collaborating Flywheel employee on these settings, particularly the
connection credential (i.e., `SCITRAN_CORE_DRONE_SECRET` in `credentials.sh`).
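
For illustration, the drone secret ends up as an exported variable in `credentials.sh`; the value below is a placeholder, and your generated file's comments describe the exact layout:

```bash
# Placeholder only -- obtain the real value from your Flywheel collaborator.
export SCITRAN_CORE_DRONE_SECRET="replace-with-shared-secret"
```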

## Folder settings

There are five directories/folders to consider. Four of these
defaults can be changed by exporting the corresponding environment
variable in `fw-cast/settings/credentials.sh`.

"When building a container, or pulling/running a SingularityCE container from a Docker/OCI source,
a temporary working space is required. The container is constructed in this temporary space
before being packaged into a SingularityCE SIF image."

"The working directory to be used for /tmp, /var/tmp and $HOME (if -c or --contain was also used)".
Instead of mounting to the default directory of the OS--i.e., tmp (not to be confused
with the singularity image's tmp directory)--one can mount a drive that can handle intermediate
files generated when the singularity image is run.

Note: when the Singularity container is built and Cast executes `singularity`, it passes the
`--containall` flag, which does not mount the user's `$HOME` directory and additionally contains
PID, IPC, and the environment. You can pass this flag yourself when developing and testing Singularity
images to simulate similar conditions.
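
For example, to test an image under similar isolation yourself (the image name below is hypothetical):

```bash
# my-gear.sif is a hypothetical local image; --containall isolates $HOME,
# PID, IPC, and the environment, as Cast does.
singularity run --containall my-gear.sif
```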

### Singularity cache folder

When a gear is pulled and converted to a SIF file, this folder (`SINGULARITY_CACHEDIR`) is where both the Docker layers and
SIF images are stored. The cache is created at `$HOME/.singularity/cache` by default.
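
A sketch of how these standard SingularityCE variables could be set in `settings/credentials.sh`; the paths are placeholders, and the comments in your generated file describe which variables your version of Cast actually reads:

```bash
# Placeholder paths -- point these at storage large enough for image builds
# and the image cache.
export SINGULARITY_TMPDIR="/scratch/$USER/singularity/tmp"
export SINGULARITY_CACHEDIR="/scratch/$USER/singularity/cache"
```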

### Engine folders

The folders `ENGINE_CACHE_DIR` and `ENGINE_TEMP_DIR` are where gear input and output files
are stored. Set these to a location that can accommodate the combined size of
input and output files, and set both to the same directory.
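
For example (placeholder paths; note that both variables point to the same directory):

```bash
# Placeholder path -- choose a filesystem large enough for gear inputs/outputs.
export ENGINE_CACHE_DIR="/scratch/$USER/fw-cast/engine"
export ENGINE_TEMP_DIR="/scratch/$USER/fw-cast/engine"
```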

### Log folders

When Cast finds a job on a Flywheel instance, it creates an executable script (`.sh`) for the
job and an associated log file. The job ID appears in the name of both the executable and
its `.txt` log file; they are stored in the directories `fw-cast/logs/generated` and
`fw-cast/logs/queue`, respectively.
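
To find a given job's artifacts, list the two directories and look for the job ID in the filenames:

```bash
ls fw-cast/logs/generated   # job scripts (.sh), named with the job ID
ls fw-cast/logs/queue       # job logs (.txt), named with the job ID
```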

The executable job script is created from a `SCRIPT_TEMPLATE` (found in `fw-cast/src/cluster`),
which depends on the HPC's job scheduler/cluster type (e.g., Slurm). If you need to customize it
for your HPC, it is recommended that you create your own template in `settings/cast.yml` using the
`script` variable. The `start-cast.sh` file logs this template in `fw-cast/logs/cast.log`. When
troubleshooting an HPC gear, it is convenient to print only the last 60 lines of this
log file, since it can grow quite long.
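
For example:

```bash
# Print the last 60 lines of the Cast log.
tail -60 fw-cast/logs/cast.log
```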