Biowulf and High-Performance Computing at NIH

In SFIM, we use the NIH Biowulf cluster to do most of our data analysis. This is a GNU/Linux parallel processing system designed and built at the NIH that permits running a large number of simultaneous jobs with high memory and processing requirements. This document collates resources for using Biowulf and HPC. Much of this is useful for setting up your account on Biowulf to benefit our typical workflows.

Set Up

Install Python on Biowulf

Python on Biowulf can be tricky to set up. HPC has a fairly comprehensive guide here, which covers the default versions of Python installed on Biowulf and common pitfalls. They have also created a guide on how to use conda and mamba to manage environments on Biowulf; several of the steps are important to get right, so we have additional step-by-step instructions for setting up conda on Biowulf.

It is important to note that there can be a naming conflict ("dbus") when using conda on Biowulf. This conflict can cause NoMachine to fail intermittently. The solution is to first try removing the lines inserted by conda into your .bashrc, so that conda does not load by default. If this does not work, try removing dbus with

conda uninstall dbus

which will remove the dbus package. This can cause issues if you have packages which depend on it.
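As a rough sketch of the first step (removing the conda lines from .bashrc): the block that `conda init` writes is delimited by `# >>> conda initialize >>>` and `# <<< conda initialize <<<` marker comments, so it can be filtered out with `sed`. The function name `strip_conda_init` is made up for this example, and it only prints the cleaned file so that nothing changes until you have inspected the result.

```shell
# Hypothetical helper: print a bashrc-style file without the block that
# `conda init` manages (everything between its two marker comments).
strip_conda_init() {
    sed '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' "$1"
}

# Example usage (run by hand, after checking that the output looks right):
#   cp ~/.bashrc ~/.bashrc.bak
#   strip_conda_init ~/.bashrc.bak > ~/.bashrc
```

Working from a backup copy means the original .bashrc can be restored if anything else in it turns out to depend on conda being loaded.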

Automount Helix Drives

These are instructions for setting Biowulf drives to automount on MacOS. This allows you to see your Biowulf drives in Finder as though they are a local directory on your laptop. In order to automatically mount Biowulf drives, you should create a script using Script Editor called BiowulfAutoMount.scpt that looks like this:

tell application "Finder"
    mount volume "smb://hpcdrive.nih.gov/SFIM_100RUNS" as user name "USERNAME"
    mount volume "smb://hpcdrive.nih.gov/NIMH_SFIM" as user name "USERNAME"
    mount volume "smb://hpcdrive.nih.gov/USERNAME" as user name "USERNAME"
    mount volume "smb://hpcdrive.nih.gov/SFIMLBC" as user name "USERNAME"
    mount volume "smb://hpcdrive.nih.gov/SFIM" as user name "USERNAME"
    mount volume "smb://hpcdrive.nih.gov/data" as user name "USERNAME"
end tell

where USERNAME is your NIH username. The USERNAME address will mount to /home/USERNAME on Biowulf. The /data address will mount to /data/USERNAME. All others will go to /data/DIRNAME, where DIRNAME is the directory name. Repeat for as many directories as you need. Then export it as an application in Script Editor. When you run it, it will mount the directories. You should be able to access them in MacOS under /Volumes/ in Terminal.
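A quick way to confirm that the mounts succeeded is to look for each share under /Volumes from Terminal. This is a minimal sketch: the function name `missing_mounts` is hypothetical, and it only checks that a directory with the share's name exists, which is a reasonable but not airtight sign that the share is mounted.

```shell
# Hypothetical helper: print each share name that is NOT present as a
# directory under the given base directory (/Volumes on MacOS).
missing_mounts() {
    local base=$1
    shift
    for share in "$@"; do
        [ -d "$base/$share" ] || printf '%s\n' "$share"
    done
}

# Example usage on a Mac, with share names taken from the script above:
#   missing_mounts /Volumes SFIM NIMH_SFIM data
```

No output means every share you listed was found.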

Follow additional instructions here for MacOS in order to improve mount performance.

SSH Keys

To connect from a local laptop

You may find it useful to create an SSH key to connect to Biowulf without having to type in your password every time. To do so, you can follow the instructions from HPC. The main difference here is that you need to create your key on your local laptop, then add the public key to your ~/.ssh/authorized_keys file on Biowulf.
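The last step, adding the public key to authorized_keys, might look like the sketch below once the `.pub` file has been copied to Biowulf. The function name `add_authorized_key` and its optional second argument are assumptions made for this example; the HPC instructions remain the reference.

```shell
# Hypothetical helper: append a public key to authorized_keys unless the same
# line is already present. The second argument (the .ssh directory) defaults
# to ~/.ssh and exists mainly so the function can be tried on a scratch copy.
add_authorized_key() {
    local pubkey_file=$1
    local ssh_dir=${2:-$HOME/.ssh}
    local auth_file=$ssh_dir/authorized_keys

    # sshd can refuse keys when these are writable by other users.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$auth_file" && chmod 600 "$auth_file"

    # -x: whole-line match, -F: fixed string, -f: read the pattern from a file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Example usage on Biowulf, after copying the key over from your laptop:
#   add_authorized_key ~/id_ed25519.pub
```

Checking for an existing line first keeps the file from accumulating duplicates if you run it more than once. Where it is installed on your laptop, `ssh-copy-id` performs the copy and the append in a single step.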

For GitHub

You will want an SSH key on Biowulf to connect with GitHub via the command line. You can follow the same instructions used for setting up your laptop.

If you are using Git on Biowulf, you might get some weird fatal errors when you start a new session. If this happens, try to restart the ssh-agent and re-add the key (instructions here from GitHub).

Using Biowulf

HPC Tutorials

The HPC team has put together an incredible amount of helpful tutorials, including a user guide for common tasks and commands. There are many of them, but the following may be particularly useful as you get started:

  • Connect to NIH HPC systems on your Mac or via NoMachine
    • You can connect to Biowulf through the Terminal using ssh, but using NoMachine may be necessary if you are using graphical applications.
  • Using Jupyter Notebooks on Biowulf
    • If you want to run a Jupyter Notebook that connects to a compute node, you must create an SSH Tunnel. This requires a few specific steps outlined in the HPC documentation.
  • NIMH-specific resources
    • Specifically, more information about spersist sessions. These can be useful for setting up SSH tunneling when you want to have a longer session.
  • Swarm guide
    • Swarm simplifies submitting a group of commands to the batch system on Biowulf.

In addition to the HPC user guide and tutorials, the Data Sharing and Science Team has helpfully created additional Biowulf resources. Several key tools are the ability to store an environment in an spersist node on the cluster, and the ability to easily run BIDS and fMRIPrep validation.
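To make the Swarm entry above more concrete: a swarm file is a plain text file with one command per line, and each line runs as its own job. The sketch below builds such a file for a list of subjects. The function name `write_swarm_file`, the script `preprocess.sh`, and the subject IDs are all placeholders, and the resource values in the usage comment are illustrative only; check the Swarm guide for the options that suit your job.

```shell
# Hypothetical helper: write one command line per subject ID to a swarm file.
# Arguments: output file, then any number of subject IDs.
# `preprocess.sh` is a placeholder for your own per-subject script.
write_swarm_file() {
    local swarm_file=$1
    shift
    : > "$swarm_file"                      # start from an empty file
    for subject in "$@"; do
        printf 'bash preprocess.sh %s\n' "$subject" >> "$swarm_file"
    done
}

# Example usage on Biowulf (resource values are illustrative):
#   write_swarm_file preproc.swarm sub-01 sub-02 sub-03
#   swarm -f preproc.swarm -g 8 -t 4 --module afni
```

Generating the file from a list keeps the per-subject commands identical, which makes a failed job easier to rerun on its own.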

Using Jupyterlab

Creating an spersist session with two tunnels

When using Jupyterlab, you need to create two SSH tunnels. First, open a terminal and connect to Biowulf:

ssh biowulf.nih.gov

Next, create a tmux session:

module load tmux
tmux new

Now, you can create an spersist session. The command below will also start a VNC server, which is useful if you're using graphically demanding applications (e.g., AFNI), but it's not necessary. CPUs and memory can also be edited to suit your needs. The important thing here is that there are two --tunnel flags, which will allow you to connect to Jupyterlab:

spersist --tunnel --tunnel --cpus-per-task=16 --mem=32g --vnc

Copy the SSH command it gives you, then open a new terminal and paste it. It will look something like this:

ssh  -L 00000:localhost:00000 -L 11111:localhost:11111 -L 22222:localhost:22222 USERNAME@biowulf.nih.gov

Make sure you save this command, as you will need to input it whenever you lose connection.
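One way to keep the command handy is to store it in a text file and pull the first forwarded port out of it when needed. This is only a sketch: the function name `first_tunnel_port` and the file path in the usage comment are made up for illustration.

```shell
# Hypothetical helper: print the first local port named by a -L option in an
# ssh command string like the one spersist prints.
first_tunnel_port() {
    printf '%s\n' "$1" | grep -oE -- '-L +[0-9]+' | head -n 1 | grep -oE '[0-9]+'
}

# Example usage: paste your own command into a file (the path is arbitrary),
# then read the first port back from it.
#   PORT1=$(first_tunnel_port "$(cat ~/spersist_ssh_command.txt)")
#   echo "$PORT1"
```

With the example command shown above, this prints 00000.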

Opening Jupyterlab

Back in the tmux session, cd to the directory you will be working in and activate your Jupyterlab environment. Then, execute this command, replacing ${PORT1} with the first port in your SSH command. In this example, it is 00000.

jupyter-lab --port ${PORT1} --ip localhost --no-browser

Paste the URL it gives you into your browser and bookmark it. Now you can use Jupyterlab!

Close the tmux window by pressing the 'X' button – do not type 'exit' or it will end the session.

Managing your tmux session

If you ever want to reopen your tmux session, run:

module load tmux
tmux ls

The ls command will output the session number. Using it, you can open your session:

tmux attach -t <session number>
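If you have several sessions, it can help to reduce the `tmux ls` listing to just the session names. By default each line starts with the name followed by a colon, so a minimal sketch (the function name `tmux_session_names` is hypothetical) is:

```shell
# Hypothetical helper: read `tmux ls` output on stdin and print only the
# session names, i.e. the text before the first colon on each line.
tmux_session_names() {
    cut -d: -f1
}

# Example usage on Biowulf:
#   module load tmux
#   tmux ls | tmux_session_names
#   tmux attach -t "$(tmux ls | tmux_session_names | head -n 1)"
```

Starting the session with `tmux new -s NAME` gives it a name of your choosing instead of a number, which is easier to recognize later.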

To end your session (rather than just close the window), you can simply run:

exit

Useful Modules

Biowulf has a module system that is sometimes useful for loading common programs that aren't loaded by default. You can use it to load either the newest version on Biowulf or an older version for a range of programs. Several modules that may be particularly useful for neuroimagers are

module load afni # usually kept up-to-date
module load R
module load git # default git is a very old version. This will load a more up-to-date version
module load fsl
module load fmriprep
module load mriqc
module load matlab

It's a bit messy, but some additional common programs are in /data/NIMH_SFIM/CommonScripts and some common anatomical parcellations are in /data/NIMH_SFIM/CommonParcellations. These might be out-of-date, but if you want to install a newer version there so that others can benefit, just ask around before updating.

Directory Permissions

If you're collaborating with others in a directory on Biowulf, you may need to change the permissions to allow others to write or read content. On Biowulf, each file or directory is part of a group. That group should be SFIM or the name of the /data/[group] directory. New files sometimes have the group as an individual's user ID, which means others won't be able to see it. chgrp -R SFIM directory will change the group for directory and all of the files inside. You can then adjust group access with chmod -R 2770 directory. The first 2 means that new files within a directory will (theoretically) inherit the same group name and permissions. The next 3 digits define file owner, group, and world permissions. 7 means a file is read/write/executable, 4 is read only, and 0 is no permissions. You can also consider adding umask 007 to your ~/.bashrc on Biowulf, which will allow all members of the directory's group to have write access to new subdirectories automatically.
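To see what a umask does before putting it in ~/.bashrc, you can try it in a subshell. The sketch below is a demonstration only: the function name `show_umask_effect` is made up, and it relies on GNU `stat -c`, which Linux systems such as Biowulf provide.

```shell
# Hypothetical demo: create a file and a directory under a given umask inside
# a scratch directory and print their resulting permission bits.
# The subshell keeps the umask change from leaking into your current shell.
show_umask_effect() {
    (
        umask "$1"
        scratch=$(mktemp -d)
        touch "$scratch/file"
        mkdir "$scratch/dir"
        printf 'file=%s dir=%s\n' \
            "$(stat -c '%a' "$scratch/file")" \
            "$(stat -c '%a' "$scratch/dir")"
        rm -r "$scratch"
    )
}

# show_umask_effect 007   # prints: file=660 dir=770
# show_umask_effect 022   # prints: file=644 dir=755
```

With umask 007, new files and directories are writable by the group and closed to everyone else, which is the behavior described above.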

Here are some useful options

chmod -R 2770 directory # Owner and group can read/write/execute
chmod -R 2700 directory # Only owner can read/write/execute
chmod -R 2740 directory # Owner can read/write/execute and group members can read, but not alter files
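A variation on `chmod -R 2770` is sketched below. It is not the command from this guide: it uses symbolic modes so that ordinary data files are not marked executable, and it applies the setgid bit to directories only. The function name `share_with_group` is hypothetical.

```shell
# Hypothetical helper: give a group read/write access to a directory tree.
# Arguments: directory, group name (e.g. SFIM).
share_with_group() {
    local dir=$1 group=$2
    chgrp -R "$group" "$dir"
    # Capital X adds execute only to directories and to files that are
    # already executable, so plain data files end up 660 rather than 770.
    chmod -R u+rwX,g+rwX,o-rwx "$dir"
    # setgid on directories so that new files inherit the group
    find "$dir" -type d -exec chmod g+s {} +
}

# Example usage:
#   share_with_group directory SFIM
```

The end result for directories is the same 2770 as above; only regular files differ, keeping whatever execute bits they already had.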

Never make files world-readable on Biowulf unless you are sharing files that contain no personally identifiable information (PII) with a Biowulf user outside your group. If you have an ongoing collaboration with someone outside of SFIM, request a new group where you can define who has access. For example, SFIMLBC has been used for some collaborations between SFIM and other LBC lab members.

For more details and guidance from the HPC team, see their documentation here.

Using Git on Biowulf

Every cloned GitHub repo has a named owner. That means, if multiple people are working on the same code in the same Biowulf directory, it will be impossible to see who made which edit. Having multiple people work on code in one location can be convenient if one person is making most of the edits with some paired coding support from someone else. If this is the case, make sure to set chmod permissions for the code, including the .git directory so that they both have write access.

If a repo has more than one active contributor, storing the codebase in a single directory can cause problems. In this case, it is better for each person to have their own clone of the repo where they edit and commit changes to GitHub while specifying one location for where code is run from. Concretely, this might look like:

SFIM_shared_dir/ # shared directory on Biowulf
├── data # store data here
├── personA # person A's clone of a git repo
│   └── .git
└── personB # person B's clone of the git repo
    └── .git

If there is a shared directory and the permissions of some files (particularly those in the .git directory) get changed so that they are associated with a single user rather than the group, you may get an error where Git does not realize a repo exists. If this is the case, you can try to change the group of the directory. You may have to run this command specifically for the .git directory if chgrp does not touch the hidden directories.

chgrp -R SFIM directory/.git
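To check whether anything was missed, `find` can list every path whose group differs from the one you expect, hidden directories included. A minimal sketch, with the hypothetical function name `list_wrong_group`:

```shell
# Hypothetical helper: list everything under a directory (hidden files and
# .git included) whose group is not the expected one.
# Arguments: directory, expected group name.
list_wrong_group() {
    find "$1" ! -group "$2"
}

# Example usage:
#   list_wrong_group directory SFIM
# No output means every file already belongs to the group.
```

Running it before and after `chgrp` shows which files were affected and confirms that the .git directory was covered.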

If you aren't sure what the best structure for your project is, talk to the Scientific Programmer at the beginning of your project to get things set up.

Special note when using Git with VSCode

In order to interact with GitHub repos when you're connected to a compute node on VSCode via the Remote Developer Extension, you must have added the following to ~/.ssh/config on Biowulf, replacing $username with your Biowulf username. Note that for this to work, the repository must be cloned using https not ssh.

Host github.com
  User git
  ProxyCommand /usr/bin/ssh -o ForwardAgent=yes $username@biowulf.nih.gov nc -w 120ms %h %p

For additional information on using VSCode, check out our VSCode Guide.