diff --git a/mkdocs/docs/HPC/FAQ.md b/mkdocs/docs/HPC/FAQ.md index 9c4df0328e3..4e70ff712b7 100644 --- a/mkdocs/docs/HPC/FAQ.md +++ b/mkdocs/docs/HPC/FAQ.md @@ -283,10 +283,11 @@ of files so other users can access the data. For example, the following command will enable a user named "otheruser" to read the file named `dataset.txt`. See -
$ setfacl -m u:otheruser:r dataset.txt
-$ ls -l dataset.txt
+```
+$ setfacl -m u:otheruser:r dataset.txt
+$ ls -l dataset.txt
 -rwxr-x---+ 2 {{userid}} mygroup      40 Apr 12 15:00 dataset.txt
-
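+# To check which ACLs are currently in effect, or to revoke the access again
+# later, the standard getfacl / "setfacl -x" commands can be used:
+$ getfacl dataset.txt
+$ setfacl -x u:otheruser dataset.txt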
+``` For more information about `chmod` or `setfacl`, see [Linux tutorial](linux-tutorial/manipulating_files_and_directories.md#changing-permissions-chmod). diff --git a/mkdocs/docs/HPC/HOD.md b/mkdocs/docs/HPC/HOD.md index ec51b4d3e22..01313600f4b 100644 --- a/mkdocs/docs/HPC/HOD.md +++ b/mkdocs/docs/HPC/HOD.md @@ -16,8 +16,9 @@ Before using HOD, you first need to load the `hod` module. We don't specify a version here (this is an exception, for most other modules you should, see [Using explicit version numbers](../running_batch_jobs/#using-explicit-version-numbers)) because newer versions might include important bug fixes. -
$ module load hod
-
+``` +module load hod +``` ### Compatibility with login nodes @@ -31,16 +32,17 @@ cluster module before loading the `hod` module and subsequently running For example, this will work as expected: -
$ module swap cluster/{{othercluster}}
-$ module load hod
-$ hod
+```
+$ module swap cluster/{{othercluster}}
+$ module load hod
+$ hod
 hanythingondemand - Run services within an HPC cluster
 usage: hod  [subcommand options]
 Available subcommands (one of these must be specified!):
     batch           Submit a job to spawn a cluster on a PBS job controller, run a job script, and tear down the cluster when it's done
     clean           Remove stale cluster info.
 ...
-
+``` Note that also modules named `hanythingondemand/*` are available. These should however not be used directly, since they may not be compatible @@ -52,13 +54,14 @@ for). The `hod` module will also put a basic configuration in place for HOD, by defining a couple of `$HOD_*` environment variables: -
$ module load hod
-$ env | grep HOD | sort
+```
+$ module load hod
+$ env | grep HOD | sort
 HOD_BATCH_HOD_MODULE=hanythingondemand/3.2.2-intel-2016b-Python-2.7.12
 HOD_BATCH_WORKDIR=$VSC_SCRATCH/hod
 HOD_CREATE_HOD_MODULE=hanythingondemand/3.2.2-intel-2016b-Python-2.7.12
 HOD_CREATE_WORKDIR=$VSC_SCRATCH/hod
-
+``` By defining these environment variables, we avoid that you have to specify `--hod-module` and `--workdir` when using `hod batch` or @@ -85,26 +88,27 @@ will be marked as ``. You should occasionally clean this up using `hod clean`: -
$ module list
+```
+$ module list
 Currently Loaded Modulefiles:
   1) cluster/{{defaultcluster}}(default)   2) pbs_python/4.6.0            3) vsc-base/2.4.2              4) hod/3.0.0-cli
 
-$ hod list
+$ hod list
 Cluster label	Job ID		   State                Hosts
 example1        {{jobid}}         <job-not-found>     <none>
 
-$ hod clean
+$ hod clean
 Removed cluster localworkdir directory /user/scratch/gent/vsc400/vsc40000/hod/hod/{{jobid}} for cluster labeled example1
 Removed cluster info directory /user/home/gent/vsc400/vsc40000/.config/hod.d/wordcount for cluster labeled example1
 
-$ module swap cluster/{{othercluster}}
+$ module swap cluster/{{othercluster}}
+$ hod list
 Cluster label	Job ID				            State              	Hosts
 example2		98765.master19.{{othercluster}}.gent.vsc	<job-not-found>     <none>
 
-$ hod clean
+$ hod clean
 Removed cluster localworkdir directory /user/scratch/gent/vsc400/vsc40000/hod/hod/98765.master19.{{othercluster}}.gent.vsc for cluster labeled example2
 Removed cluster info directory /user/home/gent/vsc400/vsc40000/.config/hod.d/wordcount for cluster labeled example2
-
+``` Note that **only HOD clusters that were submitted to the currently loaded `cluster` module will be cleaned up**. ## Getting help diff --git a/mkdocs/docs/HPC/MATLAB.md b/mkdocs/docs/HPC/MATLAB.md index 84109fe6b24..2fdd910c85d 100644 --- a/mkdocs/docs/HPC/MATLAB.md +++ b/mkdocs/docs/HPC/MATLAB.md @@ -37,11 +37,12 @@ To access the MATLAB compiler, the `MATLAB` module should be loaded first. Make sure you are using the same `MATLAB` version to compile and to run the compiled MATLAB program. -
$ module avail MATLAB/
+```
+$ module avail MATLAB/
 ----------------------/apps/gent/RHEL8/zen2-ib/modules/all----------------------
    MATLAB/2021b    MATLAB/2022b-r5 (D)
-$ module load MATLAB/2021b
-
+$ module load MATLAB/2021b +``` After loading the `MATLAB` module, the `mcc` command can be used. To get help on `mcc`, you can run `mcc -?`. @@ -53,12 +54,14 @@ flag means verbose output). To show how `mcc` can be used, we use the First, we copy the `magicsquare.m` example that comes with MATLAB to `example.m`: -
$ cp $EBROOTMATLAB/extern/examples/compiler/magicsquare.m example.m
-
+``` +cp $EBROOTMATLAB/extern/examples/compiler/magicsquare.m example.m +``` To compile a MATLAB program, use `mcc -mv`: -
mcc -mv example.m
+```
+mcc -mv example.m
 Opening log file:  {{homedir}}/java.log.34090
 Compiler version: 8.3 (R2021b)
 Dependency analysis by REQUIREMENTS.
@@ -67,7 +70,7 @@ Parsing file "{{homedir}}/example.m"
 Deleting 0 temporary MEX authorization files.
 Generating file "{{homedir}}/readme.txt".
 Generating file "run\_example.sh".
-
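+# (illustrative) the standalone program can then be started via the generated
+# wrapper script; here we assume the loaded MATLAB installation ($EBROOTMATLAB)
+# is used as the runtime root, with 5 passed as example input:
+./run_example.sh $EBROOTMATLAB 5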
+``` ### Libraries @@ -90,8 +93,9 @@ MATLAB program on the login nodes, consider tweaking the default maximum heap size (128M) of Java using the `_JAVA_OPTIONS` environment variable with: -
$ export _JAVA_OPTIONS="-Xmx64M"
-
+``` +export _JAVA_OPTIONS="-Xmx64M" +``` The MATLAB compiler spawns multiple Java processes. Because of the default memory limits that are in effect on the login nodes, this might @@ -102,14 +106,16 @@ to fit in memory. Another possible issue is that the heap size is too small. This could result in errors like: -
Error: Out of memory
-
+``` +Error: Out of memory +``` A possible solution to this is by setting the maximum heap size to be bigger: -
$ export _JAVA_OPTIONS="-Xmx512M"
-
+``` +export _JAVA_OPTIONS="-Xmx512M" +``` ## Multithreading @@ -130,8 +136,7 @@ you requested when submitting your job script (the `ppn` value, see [Generic res You can determine the right number of workers to use via the following code snippet in your MATLAB program: -
-- parpool.m --
-```matlab +```matlab title="parpool.m" {% include "./examples/MATLAB/parpool.m" %} ``` @@ -143,22 +148,25 @@ documentation](https://nl.mathworks.com/help/distcomp/parpool.html). Each time MATLAB is executed, it generates a Java log file in the users home directory. The output log directory can be changed using: -
$ MATLAB_LOG_DIR=<OUTPUT_DIR>
-
+``` +MATLAB_LOG_DIR=<OUTPUT_DIR> +``` where `<OUTPUT_DIR>` is the name of the desired output directory. To create and use a temporary directory for these logs: -
# create unique temporary directory in $TMPDIR (or /tmp/$USER if
+```
+# create unique temporary directory in $TMPDIR (or /tmp/$USER if
 $TMPDIR is not defined)
 # instruct MATLAB to use this directory for log files by setting $MATLAB_LOG_DIR
-$  export MATLAB_LOG_DIR=$ (mktemp -d -p $TMPDIR:-/tmp/$USER)
-
+$ export MATLAB_LOG_DIR=$(mktemp -d -p ${TMPDIR:-/tmp/$USER}) +``` You should remove the directory at the end of your job script: -
$  rm -rf $MATLAB_LOG_DIR
-
+``` +rm -rf $MATLAB_LOG_DIR +``` ## Cache location @@ -169,9 +177,10 @@ location and size of this cache can be changed through the The snippet below would set the maximum cache size to 1024MB and the location to `/tmp/testdirectory`. -
$ export MATLAB_CACHE_ROOT=/tmp/testdirectory 
-$ export MATLAB_CACHE_SIZE=1024M 
-
+``` +export MATLAB_CACHE_ROOT=/tmp/testdirectory +export MATLAB_CACHE_SIZE=1024M +``` So when MATLAB is running, it can fill up to 1024MB of cache in `/tmp/testdirectory`. @@ -182,7 +191,6 @@ All of the tweaks needed to get MATLAB working have been implemented in an example job script. This job script is also available on the HPC. -
-- jobscript.sh --
-```bash +```bash title="jobscript.sh" {% include "./examples/MATLAB/jobscript.sh" %} ``` diff --git a/mkdocs/docs/HPC/VNC.md b/mkdocs/docs/HPC/VNC.md index e42fdae5437..9bce16bf93d 100644 --- a/mkdocs/docs/HPC/VNC.md +++ b/mkdocs/docs/HPC/VNC.md @@ -15,21 +15,22 @@ infrastructure from your own computer. First login on the login node (see [First time connection to the HPC infrastructure](../connecting/#first-time-connection-to-the-hpc-infrastructure), then start `vncserver` with: -
$ vncserver -geometry 1920x1080 -localhost
+```
+$ vncserver -geometry 1920x1080 -localhost
 You will require a password to access your desktops.
 
-Password:<enter a secure password>
-Verify:<enter the same password>
-Would you like to enter a view-only password (y/n)? n
+Password: <enter a secure password>
+Verify: <enter the same password>
+Would you like to enter a view-only password (y/n)? n
 A view-only password is not used
 
-New '{{loginhost}}:6 ({{userid}})' desktop is {{loginhost}}:6
+New '{{loginhost}}:6 ({{userid}})' desktop is {{loginhost}}:6
 
 Creating default startup script {{homedir}}.vnc/xstartup
 Creating default config {{homedir}}.vnc/config
 Starting applications specified in {{homedir}}.vnc/xstartup
 Log file is {{homedir}}.vnc/{{loginhost}}:6.log
-
+``` **When prompted for a password, make sure to enter a secure password: if someone can guess your password, they will be able to do anything with your account you can!** @@ -46,12 +47,13 @@ each time you want to connect. You can get a list of running VNC servers on a node with -
$ vncserver -list
+```
+$ vncserver -list
 TigerVNC server sessions:
 
 X DISPLAY #	PROCESS ID
 :6		    30713
-
+``` This only displays the running VNC servers on **the login node you run the command on**. @@ -59,11 +61,12 @@ To see what login nodes you are running a VNC server on, you can run the `ls .vnc/*.pid` command in your home directory: the files shown have the hostname of the login node in the filename: -
$ cd $HOME
-$ ls .vnc/*.pid
+```
+$ cd $HOME
+$ ls .vnc/*.pid
 .vnc/{{loginhost}}:6.pid
 .vnc/{{altloginhost}}:8.pid
-
+``` This shows that there is a VNC server running on `{{loginhost}}` on port 5906 and another one running `{{altloginhost}}` on port 5908 (see also [Determining the source/destination port](./#determining-the-sourcedestination-port)). @@ -153,8 +156,9 @@ tunnel, by entering the settings in the and fields in [SSH tunnel](../running_in {% else %} Execute the following command to set up the SSH tunnel. -
$ ssh -L 5906:localhost:12345  {{userid}}@{{loginnode}}
-
+``` +ssh -L 5906:localhost:12345 {{userid}}@{{loginnode}} +``` **Replace the source port `5906`, destination port `12345` and user ID {{userid}} with your own!** @@ -172,9 +176,9 @@ you have picked is actually still available (see [Picking an intermediate port t You can check using the following command (**do not forget to replace `12345` the value you picked for your intermediate port): -
$ netstat -an | grep -i listen | grep tcp | grep 12345
-$
-
+``` +netstat -an | grep -i listen | grep tcp | grep 12345 +``` If you see no matching lines, then the port you picked is still available, and you can continue. @@ -182,11 +186,12 @@ available, and you can continue. If you see one or more matching lines as shown below, **you must disconnect the first SSH tunnel, pick a different intermediate port, and set up the first SSH tunnel again using the new value**. -
$ netstat -an | grep -i listen | grep tcp | grep 12345
+```
+$ netstat -an | grep -i listen | grep tcp | grep 12345
 tcp        0      0 0.0.0.0:12345           0.0.0.0:*               LISTEN
 tcp6       0      0 :::12345                :::*                    LISTEN
 $
-
+``` #### Setting up the second SSH tunnel to the correct login node @@ -197,10 +202,11 @@ running (`{{loginhost}}` in our running example, see [Starting a VNC server](./# To do this, run the following command: -
$ ssh -L 12345:localhost:5906 {{loginhost}}
-$ hostname
+```
+$ ssh -L 12345:localhost:5906 {{loginhost}}
+$ hostname
 {{loginhost}}
-
+``` With this, we are forwarding port `12345` on the login node we are connected to (which is referred to as `localhost`) through to port @@ -251,18 +257,20 @@ When prompted for default or empty panel, choose default. If you have an empty panel, you can reset your settings with the following commands: -
$ xfce4-panel --quit ; pkill xfconfd
-$ mkdir ~/.oldxfcesettings
-$ mv ~/.config/xfce4 ~/.oldxfcesettings
-$ xfce4-panel
-
+``` +xfce4-panel --quit ; pkill xfconfd +mkdir ~/.oldxfcesettings +mv ~/.config/xfce4 ~/.oldxfcesettings +xfce4-panel +``` ## Stopping the VNC server The VNC server can be killed by running -
vncserver -kill :6
-
+``` +vncserver -kill :6 +``` where `6` is the port number we noted down earlier. If you forgot, you can get it with `vncserver -list` (see [List running VNC servers](./#list-running-vnc-servers)). diff --git a/mkdocs/docs/HPC/account.md b/mkdocs/docs/HPC/account.md index a6292be41fb..c33c829231c 100644 --- a/mkdocs/docs/HPC/account.md +++ b/mkdocs/docs/HPC/account.md @@ -201,9 +201,10 @@ On all popular Linux distributions, the OpenSSH software is readily available, and most often installed by default. You can check whether the OpenSSH software is installed by opening a terminal and typing: -
$ ssh -V
+```
+$ ssh -V
 OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
-
+``` To access the clusters and transfer your files, you will use the following commands: @@ -222,17 +223,20 @@ A key pair might already be present in the default location inside your home directory. Therefore, we first check if a key is available with the "list short" ("ls") command: -
$ ls ~/.ssh
-
+``` +ls ~/.ssh +``` If a key-pair is already available, you would normally get: -
authorized_keys     id_rsa      id_rsa.pub      known_hosts
-
+``` +authorized_keys id_rsa id_rsa.pub known_hosts +``` Otherwise, the command will show: -
ls: .ssh: No such file or directory
-
+``` +ls: .ssh: No such file or directory +``` You can recognise a public/private key pair when a pair of files has the same name except for the extension ".pub" added to one of them. In this @@ -260,13 +264,9 @@ private and should stay private. You should not even copy it to one of your other machines, instead, you should create a new public/private key pair for each machine. -
$ ssh-keygen -t rsa -b 4096
-Generating public/private rsa key pair. Enter file in which to save the
-key (/home/user/.ssh/id_rsa): Enter passphrase (empty for no
-passphrase): Enter same passphrase again: Your identification has been
-saved in /home/user/.ssh/id_rsa. Your public key has been saved in
-/home/user/.ssh/id_rsa.pub.
-
+``` +ssh-keygen -t rsa -b 4096 +``` This will ask you for a file name to store the private and public key, and a passphrase to protect your private key. It needs to be emphasised @@ -335,15 +335,17 @@ to keep and manage the user SSH keys. If you use one of these derivatives you ** the SSH manager keyring to be able to connect to the HPC cluster. If not, SSH client will display an error message (see [Connecting](../connecting)) similar to this: -
Agent admitted failure to sign using the key. 
+```
+Agent admitted failure to sign using the key. 
 Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
-
+``` This could be fixed using the `ssh-add` command. You can include the new private keys' identities in your keyring with: -
$ ssh-add
-
+```shell +ssh-add +``` !!! tip Without extra options `ssh-add` adds any key located at `$HOME/.ssh` @@ -353,8 +355,9 @@ private keys' identities in your keyring with: Check that your key is available from the keyring with: -
$ ssh-add -l
-
+``` +ssh-add -l +``` After these changes the key agent will keep your SSH key to connect to the clusters as usual. @@ -466,7 +469,8 @@ account. Within one day, you should receive a Welcome e-mail with your VSC account details. -
Dear (Username), 
+```
+Dear (Username), 
 Your VSC-account has been approved by an administrator.
 Your vsc-username is {{ userid }}
 
@@ -479,7 +483,7 @@ For further info please visit https://www.vscentrum.be/user-portal
 
 Kind regards,
 -- The VSC administrators
-
+``` Now, you can start using the {{ hpc }}. You can always look up your VSC id later by visiting . diff --git a/mkdocs/docs/HPC/alphafold.md b/mkdocs/docs/HPC/alphafold.md index 9c502992620..4fbd3d26d79 100644 --- a/mkdocs/docs/HPC/alphafold.md +++ b/mkdocs/docs/HPC/alphafold.md @@ -199,7 +199,7 @@ To run the job scripts you need to create a file named `T1050.fasta` with the fo >T1050 A7LXT1, Bacteroides Ovatus, 779 residues| MASQSYLFKHLEVSDGLSNNSVNTIYKDRDGFMWFGTTTGLNRYDGYTFKIYQHAENEPGSLPDNYITDIVEMPDGRFWINTARGYVLFDKERDYFITDVTGFMKNLESWGVPEQVFVDREGNTWLSVAGEGCYRYKEGGKRLFFSYTEHSLPEYGVTQMAECSDGILLIYNTGLLVCLDRATLAIKWQSDEIKKYIPGGKTIELSLFVDRDNCIWAYSLMGIWAYDCGTKSWRTDLTGIWSSRPDVIIHAVAQDIEGRIWVGKDYDGIDVLEKETGKVTSLVAHDDNGRSLPHNTIYDLYADRDGVMWVGTYKKGVSYYSESIFKFNMYEWGDITCIEQADEDRLWLGTNDHGILLWNRSTGKAEPFWRDAEGQLPNPVVSMLKSKDGKLWVGTFNGGLYCMNGSQVRSYKEGTGNALASNNVWALVEDDKGRIWIASLGGGLQCLEPLSGTFETYTSNNSALLENNVTSLCWVDDNTLFFGTASQGVGTMDMRTREIKKIQGQSDSMKLSNDAVNHVYKDSRGLVWIATREGLNVYDTRRHMFLDLFPVVEAKGNFIAAITEDQERNMWVSTSRKVIRVTVASDGKGSYLFDSRAYNSEDGLQNCDFNQRSIKTLHNGIIAIGGLYGVNIFAPDHIRYNKMLPNVMFTGLSLFDEAVKVGQSYGGRVLIEKELNDVENVEFDYKQNIFSVSFASDNYNLPEKTQYMYKLEGFNNDWLTLPVGVHNVTFTNLAPGKYVLRVKAINSDGYVGIKEATLGIVVNPPFKLAAALQHHHHHH ``` -source: +source: ### Job script for running AlphaFold on GPU @@ -212,9 +212,7 @@ Swap to the `joltik` GPU before submitting it: module swap cluster/joltik ``` -
-- AlphaFold-gpu-joltik.sh --
- -```bash +```bash title="AlphaFold-gpu-joltik.sh" {% include "./examples/AlphaFold/AlphaFold-gpu-joltik.sh" %} ``` @@ -222,9 +220,7 @@ module swap cluster/joltik Jobscript that runs AlphaFold on CPU using 24 cores on one node. -
-- AlphaFold-cpu-doduo.sh --
- -```bash +```bash title="AlphaFold-cpu-doduo.sh" {% include "./examples/AlphaFold/AlphaFold-cpu-doduo.sh" %} ``` diff --git a/mkdocs/docs/HPC/apptainer.md b/mkdocs/docs/HPC/apptainer.md index f06d943b90a..4d26cdeeab1 100644 --- a/mkdocs/docs/HPC/apptainer.md +++ b/mkdocs/docs/HPC/apptainer.md @@ -84,8 +84,9 @@ We strongly recommend the use of Docker Hub, see Copy testing image from `/apps/gent/tutorials/Singularity` to `$VSC_SCRATCH`: -
$ cp /apps/gent/tutorials/Singularity/CentOS7_EasyBuild.img $VSC_SCRATCH/
-
+``` +cp /apps/gent/tutorials/Singularity/CentOS7_EasyBuild.img $VSC_SCRATCH/ +``` Create a job script like: @@ -118,8 +119,9 @@ Apptainer/Singularity image yourself Copy testing image from `/apps/gent/tutorials` to `$VSC_SCRATCH`: -
$ cp /apps/gent/tutorials/Singularity/Ubuntu14.04_tensorflow.img $VSC_SCRATCH/
-
+``` +cp /apps/gent/tutorials/Singularity/Ubuntu14.04_tensorflow.img $VSC_SCRATCH/ +``` ```bash #!/bin/sh @@ -158,19 +160,21 @@ following requirements apply: Copy the testing image from `/apps/gent/tutorials/Singularity` to `$VSC_SCRATCH` -
$ cp /apps/gent/tutorials/Singularity/Debian8_UGentMPI.img $VSC_SCRATCH/
-
+``` +cp /apps/gent/tutorials/Singularity/Debian8_UGentMPI.img $VSC_SCRATCH/ +``` For example to compile an [MPI example](https://github.com/open-mpi/ompi/blob/master/examples/ring_c.c): -
$ module load intel
-$ apptainer shell $VSC_SCRATCH/Debian8_UGentMPI.img
-$ export LANG=C
-$ export C_INCLUDE_PATH=/usr/include/x86_64-linux-gnu/:$C_INCLUDE_PATH
-$ mpiicc ompi/examples/ring_c.c -o ring_debian
-$ exit
-
+``` +module load intel +apptainer shell $VSC_SCRATCH/Debian8_UGentMPI.img +export LANG=C +export C_INCLUDE_PATH=/usr/include/x86_64-linux-gnu/:$C_INCLUDE_PATH +mpiicc ompi/examples/ring_c.c -o ring_debian +exit +``` Example MPI job script: diff --git a/mkdocs/docs/HPC/compiling_your_software.md b/mkdocs/docs/HPC/compiling_your_software.md index c01b1b7ef88..5cf5e27ed3d 100644 --- a/mkdocs/docs/HPC/compiling_your_software.md +++ b/mkdocs/docs/HPC/compiling_your_software.md @@ -90,9 +90,11 @@ A typical process looks like: We assume you've copied your software to the {{hpc}}. The next step is to request your private compute node. -
$ qsub -I
+
+```bash
+$ qsub -I
 qsub: waiting for job {{jobid}} to start
-
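+# (variant) specific resources can be requested for the interactive session
+# too, e.g. 4 cores on 1 node for at most 2 hours:
+$ qsub -I -l nodes=1:ppn=4 -l walltime=2:00:00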
+``` ### Compiling a sequential program in C @@ -100,23 +102,23 @@ qsub: waiting for job {{jobid}} to start Go to the examples for chapter [Compiling and testing your software on the HPC](compiling_your_software.md#compiling-and-building-on-the-hpc) and load the foss module: -
$ cd ~/{{exampledir}}
-$ module load foss
-
+``` +cd ~/{{exampledir}} +module load foss +``` We now list the directory and explore the contents of the "*hello.c*" program: -
$ ls -l
+```
+$ ls -l
 total 512
 -rw-r--r-- 1 {{userid}} 214 Sep 16 09:42 hello.c
 -rw-r--r-- 1 {{userid}} 130 Sep 16 11:39 hello.pbs*
 -rw-r--r-- 1 {{userid}} 359 Sep 16 13:55 mpihello.c
 -rw-r--r-- 1 {{userid}} 304 Sep 16 13:55 mpihello.pbs
-
- -

hello.c

+``` -```shell +```shell title="hello.c" {% include "examples/Compiling-and-testing-your-software-on-the-HPC/hello.c" %} ``` @@ -130,14 +132,15 @@ First, check the command line options for *"gcc" (GNU C-Compiler)*, then we compile. the `O2` option enables a moderate level of optimization when compiling the code. It instructs the compiler to optimize the code for better performance without significantly increasing compilation time. Finally, list the contents of the directory again: -
$ gcc -help
-$ gcc -O2 -o hello hello.c
-$ ls -l
+```
+$ gcc --help
+$ gcc -O2 -o hello hello.c
+$ ls -l
 total 512
 -rwxrwxr-x 1 {{userid}} 7116 Sep 16 11:43 hello*
 -rw-r--r-- 1 {{userid}}  214 Sep 16 09:42 hello.c
 -rwxr-xr-x 1 {{userid}}  130 Sep 16 11:39 hello.pbs*
-
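+# (optional check) confirm that the result is indeed an executable binary:
+$ file hello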
+``` A new file "hello" has been created. Note that this file has "execute" rights, i.e., it is an executable. More often than not, calling gcc -- @@ -152,35 +155,39 @@ that produces a warning does not go unnoticed. Let's test this program on the local compute node, which is at your disposal after the `qsub --I` command: -
$ ./hello
+```
+$ ./hello
 Hello #0
 Hello #1
 Hello #2
 Hello #3
 Hello #4
 ...
-
+``` It seems to work, now run it on the {{hpc}} -
$ qsub hello.pbs
+``` +qsub hello.pbs +``` ### Compiling a parallel program in C/MPI -
$ cd ~/{{exampledir}}
+``` +cd ~/{{exampledir}} +``` List the directory and explore the contents of the "*mpihello.c*" program: -
$ ls -l
+```
+$ ls -l
 total 512
 total 512
 -rw-r--r-- 1 {{userid}} 214 Sep 16 09:42 hello.c
 -rw-r--r-- 1 {{userid}} 130 Sep 16 11:39 hello.pbs*
 -rw-r--r-- 1 {{userid}} 359 Sep 16 13:55 mpihello.c
 -rw-r--r-- 1 {{userid}} 304 Sep 16 13:55 mpihello.pbs
-
- -

mpihello.c

+``` -```shell +```shell title="mpihello.c" {% include "examples/Compiling-and-testing-your-software-on-the-HPC/mpihello.c" %} ``` @@ -191,21 +198,27 @@ Then, check the command line options for *"mpicc" (GNU C-Compiler with MPI extensions)*, then we compile and list the contents of the directory again: -
$ mpicc --help
-$ mpicc -o mpihello mpihello.c
-$ ls -l
+``` +mpicc --help +mpicc -o mpihello mpihello.c +ls -l +``` A new file "hello" has been created. Note that this program has "execute" rights. Let's test this program on the "login" node first: -
$ ./mpihello
-Hello World from Node 0.
+``` +$ ./mpihello +Hello World from Node 0. +``` It seems to work, now run it on the {{hpc}}. -
$ qsub mpihello.pbs
+``` +qsub mpihello.pbs +``` ### Compiling a parallel program in Intel Parallel Studio Cluster Edition @@ -213,30 +226,39 @@ We will now compile the same program, but using the Intel Parallel Studio Cluster Edition compilers. We stay in the examples directory for this chapter: -
$ cd ~/{{exampledir}}
+``` +cd ~/{{exampledir}} +``` We will compile this C/MPI -file into an executable with the Intel Parallel Studio Cluster Edition. First, clear the modules (purge) and then load the latest "intel" module: -
$ module purge
-$ module load intel
-
+``` +module purge +module load intel +``` Then, compile and list the contents of the directory again. The Intel equivalent of mpicc is mpiicc. -
$ mpiicc -o mpihello mpihello.c
-$ ls -l
+``` +mpiicc -o mpihello mpihello.c +ls -l +``` Note that the old "mpihello" file has been overwritten. Let's test this program on the "login" node first: -
$ ./mpihello
-Hello World from Node 0.
+``` +$ ./mpihello +Hello World from Node 0. +``` It seems to work, now run it on the {{hpc}}. -
$ qsub mpihello.pbs
+``` +qsub mpihello.pbs +``` Note: The {{association}} only has a license for the Intel Parallel Studio Cluster Edition for a fixed number of users. As such, it might happen that you @@ -246,44 +268,10 @@ for your use. Note: The Intel Parallel Studio Cluster Edition contains equivalent compilers for all GNU compilers. Hereafter the overview for C, C++ and Fortran compilers. -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Sequential ProgramParallel Program (with MPI)
GNUIntelGNUIntel
Cgcciccmpiccmpiicc
C++g++icpcmpicxxmpiicpc
Fortrangfortranifortmpif90mpiifort
-
+ +| | **Sequential Program** | | **Parallel Program (with MPI)** | | +|-------------|------------------------|-----------|---------------------------------|-----------| +| | **GNU** | **Intel** | **GNU** | **Intel** | +| **C** | gcc | icc | mpicc | mpiicc | +| **C++** | g++ | icpc | mpicxx | mpiicpc | +| **Fortran** | gfortran | ifort | mpif90 | mpiifort | diff --git a/mkdocs/docs/HPC/connecting.md b/mkdocs/docs/HPC/connecting.md index e7035511fbe..86f1f443a01 100644 --- a/mkdocs/docs/HPC/connecting.md +++ b/mkdocs/docs/HPC/connecting.md @@ -56,8 +56,9 @@ Trying to establish an SSH connection from an IP address that does not adhere to these restrictions will result in an immediate failure to connect, with an error message like: -
ssh_exchange_identification: read: Connection reset by peer
-
+``` +ssh_exchange_identification: read: Connection reset by peer +``` ## First Time connection to the HPC infrastructure @@ -164,11 +165,12 @@ to the {{ hpc }} cluster via the login node "***{{ loginnode }}***". 10. To check you can now "Print the Working Directory" (pwd) and check the name of the computer, where you have logged in (hostname): -
    $ pwd
+    ```
+    $ pwd
     {{ homedir }}
-    $ hostname -f
+    $ hostname -f
     {{ loginhost }}
-    
+ ``` 11. For future PuTTY sessions, just select your saved session (i.e. "**{{ hpcname }}**") from the list, ++"Load"++ it and press ++"Open"++. @@ -183,8 +185,9 @@ Open up a terminal and enter the following command to connect to the {{ hpc }}. You can open a terminal by navigation to Applications and then Utilities in the finder and open Terminal.app, or enter Terminal in Spotlight Search. {% endif %} -
$ ssh {{ userid }}@{{ loginnode }}
-
+``` +ssh {{ userid }}@{{ loginnode }} +``` Here, user {{ userid }} wants to make a connection to the "{{ hpcname }}" cluster at {{ university }} via the login node "{{ loginnode }}", so replace {{ userid }} with your own VSC id in the above command. @@ -197,32 +200,36 @@ A possible error message you can get if you previously saved your private key somewhere else than the default location (`$HOME/.ssh/id_rsa`): -
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
-
+``` +Permission denied (publickey,gssapi-keyex,gssapi-with-mic). +``` In this case, use the `-i` option for the `ssh` command to specify the location of your private key. For example: -
$ ssh -i /home/example/my_keys
-
+``` +ssh -i /home/example/my_keys {{ userid }}@{{ loginnode }} +``` {% endif %} **Congratulations, you're on the {{ hpc }} infrastructure now!** To find out where you have landed you can print the current working directory: -
$ pwd
+```
+$ pwd
 {{ homedir }}
-
+``` Your new private home directory is "{{ homedir }}". Here you can create your own subdirectory structure, copy and prepare your applications, compile and test them and submit your jobs on the {{ hpc }}. -
$ cd {{ tutorialdir }}
-$ ls
+```
+$ cd {{ tutorialdir }}
+$ ls
 Intro-HPC/
-
+``` This directory currently contains all training material for the ***Introduction to the {{ hpc }}***. More relevant training material to work with the {{ hpc }} can always be added later in @@ -233,8 +240,9 @@ You can now explore the content of this directory with the "ls --l" (**l**ist**s As we are interested in the use of the ***HPC***, move further to ***Intro-HPC*** and explore the contents up to 2 levels deep: -
$ cd Intro-HPC
-$ tree -L 2
+```
+$ cd Intro-HPC
+$ tree -L 2
 .
 '-- examples
     |-- Compiling-and-testing-your-software-on-the-HPC
@@ -248,7 +256,7 @@ contents up to 2 levels deep:
     |-- example.pbs
     '-- example.sh
 9 directories, 5 files
-
+``` This directory contains: @@ -257,8 +265,9 @@ This directory contains: 2. An ***examples*** subdirectory, containing all the examples that you need in this Tutorial, as well as examples that might be useful for your specific applications. -
$ cd examples
-
+``` +cd examples +``` !!! tip Typing `cd ex` followed by ++tab++ (the Tab-key) will generate the `cd examples` @@ -275,20 +284,23 @@ your home directory, so that you have your own personal copy and that you can start using the examples. The "-r" option of the copy command will also copy the contents of the sub-directories "*recursively*". -
$ cp -r {{ examplesdir }} ~/
-
+``` +cp -r {{ examplesdir }} ~/ +``` {%- if site == gent %} Go to your home directory, check your own private examples directory, ... and start working. -
$ cd
-$ ls -l
-
+``` +cd +ls -l +``` Upon connecting you will see a login message containing your last login time stamp and a basic overview of the current cluster utilisation. -
Last login: Thu Mar 18 13:15:09 2021 from gligarha02.gastly.os
+```
+Last login: Thu Mar 18 13:15:09 2021 from gligarha02.gastly.os
 
  STEVIN HPC-UGent infrastructure status on Mon, 19 Feb 2024 10:00:01
       cluster         - full - free -  part - total - running - queued
@@ -305,14 +317,15 @@ Upon connecting you will see a login message containing your last login time sta
 For a full view of the current loads and queues see:
 https://hpc.ugent.be/clusterstate/
 Updates on current system status and planned maintenance can be found on https://www.ugent.be/hpc/en/infrastructure/status
-
+``` {% endif %} {%- if site == brussel %} Upon connecting you will see a login message containing your last login time stamp, some useful environment variable definitions and the message of the day (MOTD). -
Last login: Thu Nov  6 16:05:21 2014 from example.vub.ac.be
+```
+Last login: Thu Nov  6 16:05:21 2014 from example.vub.ac.be
 
 Initialising your working environment...
 System variables to use in your scripts/programs:
@@ -333,14 +346,15 @@ Message of the day:
         \  (oo)____
            (__)    )\
               ||--||
-
+``` {% endif %} {%- if site == leuven %} Upon connecting you will see a login message containing your last login time stamp and some useful links. -
Last login: Mon Jan 12 18:52:20 2015 from example.kuleuven.be
+```
+Last login: Mon Jan 12 18:52:20 2015 from example.kuleuven.be
 **********************************************
 *                                            *
 * Please check the following site for        *
@@ -355,13 +369,14 @@ Upon connecting you will see a login message containing your last login time sta
 *                                            *
 *                                            *
 **********************************************
-
+``` {% endif %} {%- if site == antwerpen %} Upon connection, you will get a welcome message containing your last login timestamp and some pointers to information about the system. On Leibniz, the system will also show your disk quota. -
Last login: Mon Feb  2 17:58:13 2015 from mylaptop.uantwerpen.be
+```
+Last login: Mon Feb  2 17:58:13 2015 from mylaptop.uantwerpen.be
 
 ---------------------------------------------------------------
 
@@ -396,15 +411,16 @@ Your quota is:
    small            1389     100000     110000     none
 
 ---------------------------------------------------------------
-
+``` {% endif %} You can exit the connection at anytime by entering: -
$ exit
+```
+$ exit
 logout
 Connection to {{ loginnode }} closed.
-
+``` !!! tip "tip: Setting your Language right" You may encounter a warning message similar to the following one during connecting: @@ -449,14 +465,15 @@ Connection to {{ loginnode }} closed. Open the `.bashrc` on your local machine with your favourite editor and add the following lines: -
$ nano ~/.bashrc
+    ```
+    $ nano ~/.bashrc
     ...
     export LANGUAGE="en_US.UTF-8"
     export LC_ALL="en_US.UTF-8"
     export LC_CTYPE="en_US.UTF-8"
     export LANG="en_US.UTF-8"
     ...
-    
+ ``` !!! tip "tip: vi" To start entering text in vi: move to the place you want to start @@ -467,11 +484,12 @@ Connection to {{ loginnode }} closed. or alternatively (if you are not comfortable with the Linux editors), again on your local machine: -
$ echo "export LANGUAGE=\"en_US.UTF-8\"" >> ~/.profile
-    $ echo "export LC_ALL=\"en_US.UTF-8\"" >> ~/.profile
-    $ echo "export LC_CTYPE=\"en_US.UTF-8\"" >> ~/.profile
-    $ echo "export LANG=\"en_US.UTF-8\"" >> ~/.profile
-    
+ ``` + echo "export LANGUAGE=\"en_US.UTF-8\"" >> ~/.profile + echo "export LC_ALL=\"en_US.UTF-8\"" >> ~/.profile + echo "export LC_CTYPE=\"en_US.UTF-8\"" >> ~/.profile + echo "export LANG=\"en_US.UTF-8\"" >> ~/.profile + ``` You can now log out, open a new terminal/shell on your local machine and reconnect to the login node, and you should not get these warnings anymore. @@ -587,9 +605,10 @@ you have symlinks to them in your home directory. See {{ LinuxManualURL.LinuxMan Open an additional terminal window and check that you're working on your local machine. -
$ hostname
+```
+$ hostname
 
-
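+# (the output should be the hostname of your own machine, not of a login node)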
+``` If you're still using the terminal that is connected to the {{ hpc }}, close the connection by typing "exit" in the terminal window. @@ -602,29 +621,31 @@ end: if you forget it, it will just create a file named {{ userid }}@{{ loginnod local filesystem. You can even specify where to save the file on the remote filesystem by putting a path after the colon. -
$ echo "Hello" > localfile.txt
-$ ls -l 
+```
+$ echo "Hello" > localfile.txt
+$ ls -l 
 ...
 -rw-r--r-- 1 user  staff   6 Sep 18 09:37 localfile.txt
 $ scp localfile.txt {{ userid }}@{{ loginnode }}:
 localfile.txt     100%   6     0.0KB/s     00:00
-
+``` Connect to the {{ hpc }} via another terminal, print the working directory (to make sure you're in the home directory) and check whether the file has arrived: -
$ pwd
+```
+$ pwd
 {{ homedir }}
-$ ls -l 
+$ ls -l 
 total 1536
 drwxrwxr-x 2
 drwxrwxr-x 2
 drwxrwxr-x 10
 -rw-r--r-- 1
-$ cat localfile.txt
+$ cat localfile.txt
 Hello
-
+``` The **scp** command can also be used to copy files from the cluster to your local machine. Let us copy the remote file "intro-HPC-{{ OS }}-{{ site }}.pdf" from your "docs" @@ -633,22 +654,24 @@ subdirectory on the cluster to your local computer. First, we will confirm that the file is indeed in the "docs" subdirectory. In the terminal on the login node, enter: -
$ cd ~/docs
-$ ls -l
+```
+$ cd ~/docs
+$ ls -l
 total 1536
 -rw-r--r-- 1 {{ userid }} Sep 11 09:53 intro-HPC-{{ OS }}-{{ site }}.pdf
-
+``` Now we will copy the file to the local machine. On the terminal on your own local computer, enter: -
$ scp {{ userid }}@{{ loginnode }}:./docs/intro-HPC-{{ OS }}-{{ site }}.pdf .
+```
+$ scp {{ userid }}@{{ loginnode }}:./docs/intro-HPC-{{ OS }}-{{ site }}.pdf .
 intro-HPC-{{ OS }}-{{ site }}.pdf 100% 725KB 724.6KB/s 00:01
-$ ls -l
+$ ls -l
 total 899
 -rw-r--r-- 1 user staff 741995 Sep 18 09:53
 -rw-r--r-- 1 user staff      6 Sep 18 09:37 localfile.txt
-
+``` The file has been copied from the HPC to your local computer. @@ -657,15 +680,17 @@ the `-r` flag. For example, if we want to copy the local directory `dataset` to `$VSC_SCRATCH`, we can use the following command (assuming you've created the `scratch` symlink): -
$ scp -r dataset {{ userid }}@{{ loginnode }}:scratch
-
+``` +scp -r dataset {{ userid }}@{{ loginnode }}:scratch +``` If you don't use the `-r` option to copy a directory, you will run into the following error: -
$ scp -r dataset {{ userid }}@{{ loginnode }}:scratch
+```
+$ scp dataset {{ userid }}@{{ loginnode }}:scratch
 dataset: not a regular file
-
+``` ### Using sftp @@ -682,8 +707,9 @@ it uses the secure ssh protocol to connect to the clusters. One easy way of starting a sftp session is -
$ sftp {{ userid }}@{{ loginnode }}
-
+``` +sftp {{ userid }}@{{ loginnode }} +``` Typical and popular commands inside an sftp session are: @@ -772,13 +798,14 @@ ssh {{loginhost}} This is also possible the other way around. If you want to find out which login host you are connected to, you can use the `hostname` command. -
$ hostname
+```
+$ hostname
 {{loginhost}}
-$ ssh {{altloginhost}}
+$ ssh {{altloginhost}}
 
-$ hostname
+$ hostname
 {{altloginhost}}
-
+``` Rather than always starting a new session on the HPC, you can also use a terminal multiplexer like `screen` or `tmux`. diff --git a/mkdocs/docs/HPC/crontab.md b/mkdocs/docs/HPC/crontab.md index d259f52fb60..78a2f6f6d50 100644 --- a/mkdocs/docs/HPC/crontab.md +++ b/mkdocs/docs/HPC/crontab.md @@ -13,24 +13,26 @@ as usual with your vsc user's account (see section Check if any cron script is already set in the current login node with: -
$ crontab -l
-
+``` +crontab -l +``` At this point you can add/edit (with `vi` editor) any cron script running the command: -
$ crontab -e
-
+``` +crontab -e +``` #### Example cron job script -
 15 5 * * * ~/runscript.sh >& ~/job.out
-
+``` + 15 5 * * * ~/runscript.sh >& ~/job.out +``` where `runscript.sh` has these lines in this example: -
-- runscript.sh --
-```bash +```bash title="runscript.sh" {% include "./examples/Cron-scripts/runscript.sh" %} ``` @@ -42,5 +44,6 @@ Please note that you should login into the same login node to edit your previously generated crontab tasks. If that is not the case you can always jump from one login node to another with: -
$ ssh gligar<id>
-
+``` +ssh gligar07 # or gligar08 +``` diff --git a/mkdocs/docs/HPC/easybuild.md b/mkdocs/docs/HPC/easybuild.md index 8ae366880d5..668eb9ad767 100644 --- a/mkdocs/docs/HPC/easybuild.md +++ b/mkdocs/docs/HPC/easybuild.md @@ -34,8 +34,9 @@ Before you use EasyBuild, you need to configure it: This is where EasyBuild can find software sources: -
$ EASYBUILD_SOURCEPATH=$VSC_DATA/easybuild/sources:/apps/gent/source 
-
+``` +export EASYBUILD_SOURCEPATH=$VSC_DATA/easybuild/sources:/apps/gent/source +``` - the first directory `$VSC_DATA/easybuild/sources` is where EasyBuild will (try to) automatically download sources if they're not @@ -50,8 +51,9 @@ This is where EasyBuild can find software sources: This directory is where EasyBuild will build software in. To have good performance, this needs to be on a fast filesystem. -
$ export EASYBUILD_BUILDPATH=${TMPDIR:-/tmp/$USER}
-
+``` +export EASYBUILD_BUILDPATH=${TMPDIR:-/tmp/$USER} +``` On cluster nodes, you can use the fast, in-memory `/dev/shm/$USER` location as a build directory. @@ -63,8 +65,9 @@ modules) to. For example, to let it use `$VSC_DATA/easybuild`, use: -
$ export EASYBUILD_INSTALLPATH=$VSC_DATA/easybuild/$VSC_OS_LOCAL/$VSC_ARCH_LOCAL$VSC_ARCH_SUFFIX
-
+``` +export EASYBUILD_INSTALLPATH=$VSC_DATA/easybuild/$VSC_OS_LOCAL/$VSC_ARCH_LOCAL$VSC_ARCH_SUFFIX +``` Using the `$VSC_OS_LOCAL`, `$VSC_ARCH` and `$VSC_ARCH_SUFFIX` environment variables ensures that your install software to a location @@ -84,8 +87,9 @@ We don't specify a version here (this is an exception, for most other modules you should see [Using explicit version numbers](../running_batch_jobs/#using-explicit-version-numbers)) because newer versions might include important bug fixes. -
module load EasyBuild
-
+``` +module load EasyBuild +``` ### Installing supported software @@ -95,18 +99,20 @@ versions, combined with a particular toolchain version. Use the (build recipes, see ) are available: -
$ eb -S example-1.2
+```
+$ eb -S example-1.2
 CFGS1=/apps/gent/CO7/sandybridge/software/EasyBuild/3.6.2/lib/python2.7/site-packages/easybuild_easyconfigs-3.6.2-py2.7.egg/easybuild/easyconfigs
  * $CFGS1/e/example/example-1.2.1-foss-{{ current_year}}a.eb
  * $CFGS1/e/example/example-1.2.3-foss-{{ current_year}}b.eb
  * $CFGS1/e/example/example-1.2.5-intel-{{ current_year}}a.eb
-
+``` For readily available easyconfigs, just specify the name of the easyconfig file to build and install the corresponding software package: -
$ eb example-1.2.1-foss-{{ current_year}}a.eb --robot
-
+``` +eb example-1.2.1-foss-2024a.eb --robot +``` ### Installing variants on supported software @@ -117,13 +123,15 @@ corresponding `--try-X` options: To try to install `example v1.2.6`, based on the easyconfig file for `example v1.2.5`: -
$ eb example-1.2.5-intel-{{ current_year}}a.eb --try-software-version=1.2.6
-
+``` +eb example-1.2.5-intel-{{ current_year}}a.eb --try-software-version=1.2.6 +``` To try to install example v1.2.5 with a different compiler toolchain: -
$ eb example-1.2.5-intel-{{ current_year}}a.eb --robot --try-toolchain=intel,{{ current_year}}b
-
+``` +eb example-1.2.5-intel-{{ current_year}}a.eb --robot --try-toolchain=intel,{{ current_year}}b +``` ### Install other software @@ -137,8 +145,9 @@ for more information. To use the modules you installed with EasyBuild, extend `$MODULEPATH` to make them accessible for loading: -
$ module use $EASYBUILD_INSTALLPATH/modules/all
-
+``` +module use $EASYBUILD_INSTALLPATH/modules/all +``` It makes sense to put this `module use` command and all `export` commands in your `.bashrc` login script. @@ -149,4 +158,3 @@ See also [the section on `.bashrc` in the "Beyond the basics" chapter of the intro to Linux](../linux-tutorial/beyond_the_basics/#bashrc-login-script) -{{ site_url }} diff --git a/mkdocs/docs/HPC/fine_tuning_job_specifications.md b/mkdocs/docs/HPC/fine_tuning_job_specifications.md index fa6849ceae1..20499a091ef 100644 --- a/mkdocs/docs/HPC/fine_tuning_job_specifications.md +++ b/mkdocs/docs/HPC/fine_tuning_job_specifications.md @@ -49,8 +49,11 @@ This chapter shows you how to measure: 5. Network bottlenecks First, we allocate a compute node and move to our relevant directory: -
$ qsub -I
-$ cd ~/examples/Fine-tuning-Job-Specifications
+ +``` +qsub -I +cd ~/examples/Fine-tuning-Job-Specifications +``` ## Specifying Walltime [//]: # (sec:specifying-walltime-requirements) @@ -67,10 +70,13 @@ execute your executable to the standard error stream. The calculated times are reported in seconds. Test the time command: -
$ time sleep 75
+
+```
+$ time sleep 75
 real 1m15.005s
 user 0m0.001s
-sys 0m0.002s
+sys 0m0.002s +``` It is a good practice to correctly estimate and specify the run time (duration) of an application. Of course, a margin of 10% to 20% can be @@ -82,10 +88,15 @@ should be appropriate in case your application will run on the "slowest" (oldest) compute nodes. The walltime can be specified in a job scripts as: -
#PBS -l walltime=3:00:00:00
+ +``` +#PBS -l walltime=3:00:00:00 +``` or on the command line -
$ qsub -l walltime=3:00:00:00
+``` +qsub -l walltime=3:00:00:00 +``` It is recommended to always specify the walltime for a job. @@ -106,9 +117,12 @@ parameter, the amount of gigabytes of memory which needs to be allocated. First compile the program on your machine and then test it for 1 GB: -
$ gcc -o eat_mem eat_mem.c
-$ ./eat_mem 1
-Consuming 1 gigabyte of memory.
+ +``` +$ gcc -o eat_mem eat_mem.c +$ ./eat_mem 1 +Consuming 1 gigabyte of memory. +``` {% endif %} ### Available Memory on the machine @@ -118,12 +132,15 @@ computer. The "*free*" command displays the total amount of free and used physical and swap memory in the system, as well as the buffers used by the kernel. We also use the options "-m" to see the results expressed in Mega-Bytes and the "-t" option to get totals. -
$ free -m -t
+
+```
+$ free -m -t
                 total   used   free  shared  buffers  cached
 Mem:            16049   4772  11277       0      107     161
 -/+ buffers/cache:      4503  11546
 Swap:           16002   4185  11816
-Total:          32052   8957  23094
+Total: 32052 8957 23094 +``` Important is to note the total amount of memory available in the machine (i.e., 16 GB in this example) and the amount of used and free memory @@ -146,14 +163,18 @@ release. To start using monitor, first load the appropriate module. Then we study the "eat_mem.c" program and compile it: -
$ module load monitor
-$ cat eat_mem.c
-$ gcc -o eat_mem eat_mem.c
-
+ +``` +$ module load monitor +$ cat eat_mem.c +$ gcc -o eat_mem eat_mem.c +``` Starting a program to monitor is very straightforward; you just add the "monitor" command before the regular command line. -
$ monitor ./eat_mem 3
+
+```
+$ monitor ./eat_mem 3
 time (s) size (kb) %mem %cpu
 Consuming 3 gigabyte of memory.
 5  252900 1.4 0.6
@@ -172,7 +193,7 @@ Consuming 3 gigabyte of memory.
 70  3167280 19.2 0.2
 75  9264  0 0.5
 80  9264  0 0.4
-
+``` Whereby: @@ -198,24 +219,33 @@ This is the rate at which monitor samples the program's metrics. Since monitor's output may interfere with that of the program to monitor, it is often convenient to use a log file. The latter can be specified as follows: -
$ monitor -l test1.log eat_mem 2
+
+```
+$ monitor -l test1.log eat_mem 2
 Consuming 2 gigabyte of memory.
-$ cat test1.log
+$ cat test1.log +``` For long-running programs, it may be convenient to limit the output to, e.g., the last minute of the programs' execution. Since monitor provides metrics every 5 seconds, this implies we want to limit the output to the last 12 values to cover a minute: -
$ monitor -l test2.log -n 12 eat_mem 4
-Consuming 4 gigabyte of memory.
+ +``` +$ monitor -l test2.log -n 12 eat_mem 4 +Consuming 4 gigabyte of memory. +``` Note that this option is only available when monitor writes its metrics to a log file, not when standard error is used. The interval at which monitor will show the metrics can be modified by specifying delta, the sample rate: -
$ monitor -d 1 ./eat_mem
-Consuming 3 gigabyte of memory.
+ +``` +$ monitor -d 1 ./eat_mem +Consuming 3 gigabyte of memory. +``` Monitor will now print the program's metrics every second. Note that the minimum delta value is 1 second. {% endif %} @@ -239,8 +269,6 @@ htop the machine and allows to scroll the list vertically and horizontally to see all processes and their full command lines. -
$ top
-$ htop
### Setting the memory parameter {: #pbs_mem } @@ -252,19 +280,25 @@ a margin of about 10%. The maximum amount of physical memory used by the job per node can be specified in a job script as: {% else %} -Sequential or single-node applications: +Sequential or single-node applications: The maximum amount of physical memory used by the job can be specified in a job script as: {% endif %} -
#PBS -l mem=4gb
+ +``` +#PBS -l mem=4gb +``` or on the command line -
$ qsub -l mem=4gb
+ +``` +qsub -l mem=4gb +``` {% if site!=gent %} This setting is ignored if the number of nodes is not 1. -Parallel or multi-node applications: +Parallel or multi-node applications: When you are running a parallel application over multiple cores, you can also specify the memory requirements per processor (pmem). This @@ -273,10 +307,16 @@ process in the job. For example, if the job would run four processes and each would use up to 2 GB (gigabytes) of memory, then the memory directive would read: -
#PBS -l pmem=2gb
+ +``` +#PBS -l pmem=2gb +``` or on the command line -
$ qsub -l pmem=2gb
+ +``` +$ qsub -l pmem=2gb +``` (and of course this would need to be combined with a CPU cores directive such as nodes=1:ppn=4). In this example, you request 8 GB of memory in @@ -306,7 +346,9 @@ The */proc/cpuinfo* stores info about your CPU architecture like number of CPUs, threads, cores, information about CPU caches, CPU family, model and much more. So, if you want to detect how many cores are available on a specific machine: -
$ less /proc/cpuinfo
+
+```
+$ less /proc/cpuinfo
 processor       : 0
 vendor_id       : GenuineIntel
 cpu family      : 6
@@ -315,10 +357,13 @@ model name      : Intel(R) Xeon(R) CPU  E5420  @ 2.50GHz
 stepping        : 10
 cpu MHz         : 2500.088
 cache size      : 6144 KB
-...
+... +``` Or if you want to see it in a more readable format, execute: -
$ grep processor /proc/cpuinfo
+
+```
+$ grep processor /proc/cpuinfo
 processor : 0
 processor : 1
 processor : 2
@@ -326,7 +371,8 @@ processor : 3
 processor : 4
 processor : 5
 processor : 6
-processor : 7
+processor : 7 +``` !!! note Unless you want information of the login nodes, you'll have to issue @@ -335,10 +381,16 @@ processor : 7 In order to specify the number of nodes and the number of processors per node in your job script, use: -
#PBS -l nodes=N:ppn=M
+ +``` +#PBS -l nodes=N:ppn=M +``` or with equivalent parameters on the command line -
$ qsub -l nodes=N:ppn=M
+ +``` +qsub -l nodes=N:ppn=M +``` This specifies the number of nodes (nodes=N) and the number of processors per node (ppn=M) that the job should use. PBS treats a @@ -347,11 +399,15 @@ node can have ppn=8 as its maximum ppn request. {% if site !=antwerpen %} You can also use this statement in your job script: -
#PBS -l nodes=N:ppn=all
+``` +#PBS -l nodes=N:ppn=all +``` to request all cores of a node, or -
#PBS -l nodes=N:ppn=half
+``` +#PBS -l nodes=N:ppn=half +``` to request half of them. {% endif %} @@ -369,12 +425,17 @@ The previously used "monitor" tool also shows the overall CPU-load. The We first load the monitor modules, study the "eat_cpu.c" program and compile it: -
$ module load monitor
-$ cat eat_cpu.c
-$ gcc -o eat_cpu eat_cpu.c
+ +``` +$ module load monitor +$ cat eat_cpu.c +$ gcc -o eat_cpu eat_cpu.c +``` And then start to monitor the *eat_cpu* program: -
$ monitor -d 1 ./eat_cpu
+
+```
+$ monitor -d 1 ./eat_cpu
 time  (s) size (kb) %mem %cpu
 1  52852  0.3 100
 2  52852  0.3 100
@@ -383,7 +444,8 @@ time  (s) size (kb) %mem %cpu
 5  52852  0.3  99
 6  52852  0.3 100
 7  52852  0.3 100
-8  52852  0.3 100
+8 52852 0.3 100 +``` We notice that it the program keeps its CPU nicely busy at 100%. @@ -399,7 +461,11 @@ programs of this type are running on a computer with n cores, the CPU usage can go up to \(\text{n} \times 100\%\). {% endif %} This could also be monitored with the _**htop**_ command: -
$ htop
+ +``` +htop +``` +Example output: ``` {% include "examples/Fine-tuning-Job-Specifications/htop-output" %} ``` @@ -512,19 +578,23 @@ conventionally appears in the form of three numbers, which represent the system load during the last **one**-, **five**-, and **fifteen**-minute periods. The **uptime** command will show us the average load -
$ uptime
+
+```
+$ uptime
 10:14:05 up 86 days, 12:01, 11 users, load average: 0.60, 0.41, 0.41
-
+``` Now, compile and start a few instances of the "*eat_cpu*" program in the background, and check the effect on the load again: -
$ gcc -O2 eat_cpu.c -o eat_cpu
-$ ./eat_cpu&
-$ ./eat_cpu&
-$ ./eat_cpu&
-$ uptime
+
+```
+$ gcc -O2 eat_cpu.c -o eat_cpu
+$ ./eat_cpu&
+$ ./eat_cpu&
+$ ./eat_cpu&
+$ uptime
 10:14:42 up 86 days, 12:02, 11 users, load average: 2.60, 0.93, 0.58
-
+``` You can also read it in the **htop** command. ### Fine-tuning your executable and/or job script @@ -565,14 +635,18 @@ by the *$VSC_SCRATCH_NODE* environment variable. {% if site != gent %} We first load the monitor modules, study the "eat_disk.c" program and compile it: -
$ module load monitor
-$ cat eat_disk.c
-$ gcc -o eat_disk eat_disk.c
-
+ +``` +$ module load monitor +$ cat eat_disk.c +$ gcc -o eat_disk eat_disk.c +``` The *monitor* tool provides an option (-f) to display the size of one or more files: -
$ monitor -f $VSC_SCRATCH/test.txt ./eat_disk
+
+```
+$ monitor -f $VSC_SCRATCH/test.txt ./eat_disk
 time (s) size (kb) %mem %cpu
 5  1276  0 38.6 168820736
 10  1276  0 24.8 238026752
@@ -581,7 +655,7 @@ time (s) size (kb) %mem %cpu
 25  1276  0 26.9 614465536
 30  1276  0 27.7 760217600
 ...
-
+``` Here, the size of the file "*test.txt*" in directory $VSC_SCRATCH will be monitored. Files can be specified by absolute as well as relative @@ -623,11 +697,17 @@ specialised high bandwidth, low latency network that enables large parallel jobs to run as efficiently as possible. The parameter to add in your job script would be: -
#PBS -l ib
+ +``` +#PBS -l ib +``` If for some other reasons, a user is fine with the gigabit Ethernet network, he can specify: -
#PBS -l gbe
+ +``` +#PBS -l gbe +``` {% if site!=gent %} ## Some more tips on the Monitor tool @@ -637,7 +717,10 @@ network, he can specify: Many programs, e.g., MATLAB, take command line options. To make sure these do not interfere with those of monitor and vice versa, the program can for instance be started in the following way: -
$ monitor -delta 60 -- matlab -nojvm -nodisplay computation.m
+ +``` +$ monitor -delta 60 -- matlab -nojvm -nodisplay computation.m +``` The use of `--` will ensure that monitor does not get confused by MATLAB's `-nojvm` and `-nodisplay` options. @@ -660,7 +743,10 @@ value. It is also possible to "attach" monitor to a program or process that is already running. One simply determines the relevant process ID using the ps command, e.g., 18749, and starts monitor: -
$ monitor -p 18749
+ +``` +$ monitor -p 18749 +``` Note that this feature can be (ab)used to monitor specific sub-processes. {% endif %} diff --git a/mkdocs/docs/HPC/getting_started.md b/mkdocs/docs/HPC/getting_started.md index 8fe33ebc513..861b73df10e 100644 --- a/mkdocs/docs/HPC/getting_started.md +++ b/mkdocs/docs/HPC/getting_started.md @@ -45,11 +45,13 @@ it is recommended to make use of the `ssh` command in a terminal to get the most Assuming you have already generated SSH keys in the previous step ([Getting Access](#getting-access)), and that they are in a default location, you should now be able to login by running the following command: -
ssh {{userid}}@{{loginnode}}
+```shell +ssh {{userid}}@{{loginnode}} +``` !!! Warning "User your own VSC account id" - Replace {{userid}} with your VSC account id (see ) + Replace **{{userid}}** with your VSC account id (see ) !!! Tip @@ -94,17 +96,23 @@ Upload both files (`run.sh` and `tensorflow-mnist.py`) to your **home directory* {%- else %} On your local machine you can run: -
curl -OL https://raw.githubusercontent.com/hpcugent/vsc_user_docs/main/{{exampleloc}}/tensorflow_mnist.py
+```shell
+curl -OL https://raw.githubusercontent.com/hpcugent/vsc_user_docs/main/{{exampleloc}}/tensorflow_mnist.py
 curl -OL https://raw.githubusercontent.com/hpcugent/vsc_user_docs/main/{{exampleloc}}/run.sh
-
+``` Using the `scp` command, the files can be copied from your local host to your *home directory* (`~`) on the remote host (HPC). -
scp tensorflow_mnist.py run.sh {{userid}}{{ loginnode }}:~ 
-
ssh  {{userid}}@{{ loginnode }} 
+```shell +scp tensorflow_mnist.py run.sh {{userid}}{{ loginnode }}:~ +``` + +```shell +ssh {{userid}}@{{ loginnode }} +``` !!! Warning "User your own VSC account id" - Replace {{userid}} with your VSC account id (see ) + Replace **{{userid}}** with your VSC account id (see ) !!! Info @@ -114,7 +122,7 @@ Using the `scp` command, the files can be copied from your local host to your *h When running `ls` in your session on the {{hpcinfra}}, you should see the two files listed in your home directory (`~`): -```shell +``` $ ls ~ run.sh tensorflow_mnist.py ``` @@ -131,17 +139,14 @@ and the steps that should be executed to run the calculation. Our job script looks like this: -
-- run.sh --
- -```bash +```bash title="run.sh" #!/bin/bash module load TensorFlow/2.11.0-foss-2022a python tensorflow_mnist.py - ``` -As you can see this job script will run the Python script named **tensorflow_mnist.py**. +As you can see this job script will run the Python script named **tensorflow_mnist.py**. The jobs you submit are per default executed on **cluser/{{defaultcluster}}**, you can swap to another cluster by issuing the following command. @@ -162,7 +167,7 @@ module swap cluster/{{othercluster}} This job script can now be submitted to the cluster's job system for execution, using the qsub (**q**ueue **sub**mit) command: -```shell +``` $ qsub run.sh {{jobid}} ``` @@ -186,18 +191,20 @@ Your job is put into a queue before being executed, so it may take a while befor (see [when will my job start?](running_batch_jobs.md#when-will-my-job-start) for scheduling policy). You can get an overview of the active jobs using the `qstat` command: -
$ qstat
+```
+$ qstat
 Job ID     Name             User            Time Use S Queue
 ---------- ---------------- --------------- -------- - -------
-{{jobid}}     run.sh           {{userid}}        0:00:00  Q {{othercluster}}
-
+{{jobid}} run.sh {{userid}} 0:00:00 Q {{othercluster}} +``` Eventually, after entering `qstat` again you should see that your job has started running: -
$ qstat
+```
+$ qstat
 Job ID     Name             User            Time Use S Queue
 ---------- ---------------- --------------- -------- - -------
-{{jobid}}     run.sh           {{userid}}        0:00:01  R {{othercluster}}
-
+{{jobid}} run.sh {{userid}} 0:00:01 R {{othercluster}} +``` If you don't see your job in the output of the `qstat` command anymore, your job has likely completed. @@ -220,7 +227,7 @@ By default located in the directory where you issued `qsub`. {%- endif %} -In our example when running ls in the current directory you should see 2 new files: +In our example when running `ls` in the current directory you should see 2 new files: - **run.sh.o{{jobid}}**, containing *normal output messages* produced by job {{jobid}}; - **run.sh.e{{jobid}}**, containing *errors and warnings* produced by job {{jobid}}. @@ -231,7 +238,7 @@ In our example when running ls in the current directory you should !!! Warning "Use your own job ID" - Replace {{jobid}} with the jobid you got from the `qstat` command (see above) or simply look for added files in your current directory by running `ls`. + Replace **{{jobid}}** with the jobid you got from the `qstat` command (see above) or simply look for added files in your current directory by running `ls`. When examining the contents of ``run.sh.o{{jobid}}`` you will see something like this: ``` diff --git a/mkdocs/docs/HPC/gpu.md b/mkdocs/docs/HPC/gpu.md index 53f4ab1ea09..46257f80c64 100644 --- a/mkdocs/docs/HPC/gpu.md +++ b/mkdocs/docs/HPC/gpu.md @@ -5,14 +5,16 @@ To submit jobs to the `joltik` GPU cluster, where each node provides 4 NVIDIA V100 GPUs (each with 32GB of GPU memory), use: -
$ module swap cluster/joltik
-
+``` +module swap cluster/joltik +``` To submit to the `accelgor` GPU cluster, where each node provides 4 NVIDIA A100 GPUs (each with 80GB GPU memory), use: -
$ module swap cluster/accelgor
-
+``` +module swap cluster/accelgor +``` Then use the familiar `qsub`, `qstat`, etc. commands, taking into account the guidelines outlined in @@ -158,8 +160,7 @@ versions. #### Example TensorFlow job script -
-- TensorFlow_GPU.sh --
-```bash +```bash title="TensorFlow_GPU.sh" {% include "./examples/HPC-UGent-GPU-clusters/TensorFlow_GPU.sh" %} ``` diff --git a/mkdocs/docs/HPC/index.md b/mkdocs/docs/HPC/index.md index bf7ce9393c3..3554133a8ed 100644 --- a/mkdocs/docs/HPC/index.md +++ b/mkdocs/docs/HPC/index.md @@ -10,7 +10,8 @@ hide: "toc" Use the menu on the left to navigate, or use the search box on the top right. -You are viewing documentation intended for people using **{{ OS }}**.
+You are viewing documentation intended for people using **{{ OS }}**. + Use the OS dropdown in the top bar to switch to a different operating system. **Quick links** diff --git a/mkdocs/docs/HPC/interactive_debug.md b/mkdocs/docs/HPC/interactive_debug.md index 11b3a84dfaf..d4cf3d506ba 100644 --- a/mkdocs/docs/HPC/interactive_debug.md +++ b/mkdocs/docs/HPC/interactive_debug.md @@ -40,8 +40,9 @@ typical workloads for this cluster should be limited to: To submit jobs to the HPC-UGent interactive and debug cluster nicknamed `donphan`, first use: -
$ module swap cluster/donphan
-
+``` +module swap cluster/donphan +``` Then use the familiar `qsub`, `qstat`, etc. commands (see chapter [Running batch jobs](../running_batch_jobs/#running-batch-jobs)). diff --git a/mkdocs/docs/HPC/introduction.md b/mkdocs/docs/HPC/introduction.md index ad8d2e98994..da660ef93e8 100644 --- a/mkdocs/docs/HPC/introduction.md +++ b/mkdocs/docs/HPC/introduction.md @@ -59,15 +59,12 @@ extremely fast solution for all their data processing needs. The {{ hpc }} consists of: -
+| In technical terms | ... in human terms | +|:--------------------------------------------|:----------------------------------------------| +| over 280 nodes and over 11000 cores | ... or the equivalent of 2750 quad-core PCs | +| over 500 Terabyte of online storage | ... or the equivalent of over 60000 DVDs | +| up to 100 Gbit InfiniBand fiber connections | ... or allowing to transfer 3 DVDs per second | -|In technical terms | ... in human terms | -|:--------------------------------------------|:---------------------------------------------- | -| over 280 nodes and over 11000 cores | ... or the equivalent of 2750 quad-core PCs | -| over 500 Terabyte of online storage | ... or the equivalent of over 60000 DVDs | -| up to 100 Gbit InfiniBand fiber connections | ... or allowing to transfer 3 DVDs per second | - -
{%- endif %} diff --git a/mkdocs/docs/HPC/jobscript_examples.md b/mkdocs/docs/HPC/jobscript_examples.md index 7fd6faaafd4..1ada64baee7 100644 --- a/mkdocs/docs/HPC/jobscript_examples.md +++ b/mkdocs/docs/HPC/jobscript_examples.md @@ -17,18 +17,15 @@ The basic parameters should always be used. Some notes on the situational parame * To use a situational parameter, remove one '`#`' at the beginning of the line. -
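Note that the `#PBS` lines in a job script only set defaults: the same resource requests can also be passed to `qsub` on the command line, where they override what is in the script. A small sketch (the walltime and core count are just illustrative values):

```
$ qsub -l walltime=02:00:00 -l nodes=1:ppn=4 simple_jobscript.sh
```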

simple_jobscript.sh

- -```shell +```shell title="simple_jobscript.sh" {% include "examples/Job-script-examples/simple_jobscript.sh" %} ``` ## Single-core job Here's an example of a single-core job script: -

single_core.sh

-```shell +```shell title="single_core.sh" {% include "examples/Job-script-examples/single_core.sh" %} ``` @@ -52,9 +49,7 @@ Here's an example of a single-core job script: Here's an example of a multi-core job script that uses `mympirun`: -

multi_core.sh

- -```shell +```shell title="multi_core.sh" {% include "examples/Job-script-examples/multi_core.sh" %} ``` @@ -73,17 +68,13 @@ This can be done with the `timeout` command. This command sets a limit of time a program can run for, and when this limit is exceeded, it kills the program. Here's an example job script using `timeout`: -

timeout.sh

- -```shell +```shell title="timeout.sh" {% include "examples/Job-script-examples/timeout.sh" %} ``` The example program used in this script is a dummy script that simply sleeps a specified amount of minutes: -

example_program.sh

- -```shell +```shell title="example_program.sh" {% include "examples/Job-script-examples/example_program.sh" %} ``` diff --git a/mkdocs/docs/HPC/linux-tutorial/beyond_the_basics.md b/mkdocs/docs/HPC/linux-tutorial/beyond_the_basics.md index ca3f916a8be..2810bab0fe5 100644 --- a/mkdocs/docs/HPC/linux-tutorial/beyond_the_basics.md +++ b/mkdocs/docs/HPC/linux-tutorial/beyond_the_basics.md @@ -21,24 +21,26 @@ channels: `>` writes the (`stdout`) output of a command to a file and *overwrites* whatever was in the file before. -
$ echo hello > somefile
-$ cat somefile
+```
+$ echo hello > somefile
+$ cat somefile
 hello
-$ echo hello2 > somefile
-$ cat somefile
+$ echo hello2 > somefile
+$ cat somefile
 hello2
-
+``` `>>` appends the (`stdout`) output of a command to a file; it does not clobber whatever was in the file before: -
$ echo hello > somefile
-$ cat somefile 
+```
+$ echo hello > somefile
+$ cat somefile 
 hello
-$ echo hello2 >> somefile
+$ echo hello2 >> somefile
 $ cat somefile
 hello
 hello2
-
+``` ### Reading from `stdin` @@ -51,29 +53,32 @@ command and store the results in a file, so you don't have to repeat it while you refine your command line. For example, if you have a large directory structure you might save a list of all the files you're interested in and then reading in the file list when you are done: -
$ find . -name .txt > files
-$ xargs grep banana < files
-
+``` +$ find . -name .txt > files +$ xargs grep banana < files +``` ### Redirecting `stderr` To redirect the `stderr` output (warnings, messages), you can use `2>`, just like `>` -
$ ls one.txt nosuchfile.txt 2> errors.txt
+```
+$ ls one.txt nosuchfile.txt 2> errors.txt
 one.txt
-$ cat errors.txt
+$ cat errors.txt
 ls: nosuchfile.txt: No such file or directory
-
+``` ### Combining `stdout` and `stderr` To combine both output channels (`stdout` and `stderr`) and redirect them to a single file, you can use `&>` -
$ ls one.txt nosuchfile.txt &> ls.out
-$ cat ls.out
+```
+$ ls one.txt nosuchfile.txt &> ls.out
+$ cat ls.out
 ls: nosuchfile.txt: No such file or directory
 one.txt
-
+``` ## Command piping @@ -82,20 +87,27 @@ together to create useful results. The core of these is the pipe: `|`. For example, to see the number of files in a directory, we can pipe the (`stdout`) output of `ls` to `wc` (**w**ord **c**ount, but can also be used to count the number of lines with the `-l` flag). -
$ ls | wc -l
+```
+$ ls | wc -l
     42
-
+``` A common pattern is to pipe the output of a command to `less` so you can examine or search the output: -
$ find . | less
+``` +$ find . | less +``` Or to look through your command history: -
$ history | less
+``` +$ history | less +``` You can put multiple pipes in the same line. For example, which `cp` commands have we run? -
$ history | grep cp | less
+``` +$ history | grep cp | less +``` ## Shell expansion @@ -124,13 +136,19 @@ The shell will expand certain things, including: `ps` lists processes running. By default, it will only show you the processes running in the local shell. To see all of your processes running on the system, use: -
$ ps -fu $USER
+``` +$ ps -fu $USER +``` To see all the processes: -
$ ps -elf
+``` +$ ps -elf +``` To see all the processes in a forest view, use: -
$ ps auxf
+``` +$ ps auxf +``` The last two will spit out a lot of data, so get in the habit of piping it to `less`. @@ -148,17 +166,19 @@ together as we will see in the next section. this using the `kill` command. Kill will send a message ([SIGINT](https://en.wikipedia.org/wiki/Unix_signal#POSIX_signals)) to the process to ask it to stop. -
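To find the process ID (PID) that `kill` expects, you can filter the `ps` output for the program's name, or use `pgrep`; a quick sketch (the process name is just an example):

```
$ ps -fu $USER | grep misbehaving_process
$ pgrep -u $USER misbehaving_process
```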
$ kill 1234
-$ kill $(pgrep misbehaving_process)
-
+``` +$ kill 1234 +$ kill $(pgrep misbehaving_process) +``` Usually, this ends the process, giving it the opportunity to flush data to files, etc. However, if the process ignored your signal, you can send it a different message ([SIGKILL](https://en.wikipedia.org/wiki/Unix_signal#POSIX_signals)) which the OS will use to unceremoniously terminate the process: -
$ kill -9 1234
-
+``` +$ kill -9 1234 +``` ### `top` @@ -186,22 +206,26 @@ performance analysis](http://brendangregg.com). `ulimit` is a utility to get or set user limits on the machine. For example, you may be limited to a certain number of processes. To see all the limits that have been set, use: -
$ ulimit -a
+``` +$ ulimit -a +``` ## Counting: `wc` To count the number of lines, words, and characters (or bytes) in a file, use `wc` (**w**ord **c**ount): -
$ wc example.txt
+```
+$ wc example.txt
       90     468     3189   example.txt
-
+``` The output indicates that the file named `example.txt` contains 90 lines, 468 words, and 3189 characters/bytes. To only count the number of lines, use `wc -l`: -
$ wc -l example.txt
+```
+$ wc -l example.txt
       90    example.txt
-
+``` ## Searching file contents: `grep` @@ -209,10 +233,11 @@ To only count the number of lines, use `wc -l`: "globally search a regular expression and print" but it's entered the common computing lexicon and people use 'grep' to mean searching for anything. To use grep, you give a pattern and a list of files. -
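Besides the plain pattern searches shown below, a few `grep` options that often come in handy are `-i` (ignore case), `-n` (show line numbers) and `-r` (search recursively through a directory); a short sketch reusing the fruit files from the examples:

```
$ grep -i BANANA fruit.txt
$ grep -n banana fruit_bowl1.txt
$ grep -r banana .
```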
$ grep banana fruit.txt
-$ grep banana fruit_bowl1.txt fruit_bowl2.txt
-$ grep banana fruit*txt
-
+``` +$ grep banana fruit.txt +$ grep banana fruit_bowl1.txt fruit_bowl2.txt +$ grep banana fruit*txt +``` `grep` also lets you search for [Regular Expressions](https://en.wikipedia.org/wiki/Regular_expression), but @@ -226,7 +251,9 @@ lines where a string occurs and `cut` can pull out a particular field. For example, to pull the first column (`-f 1`, the first **f**ield) from (an unquoted) CSV (comma-separated values, so `-d ','`: **d**elimited by `,`) file, you can use the following: -
$ cut -f 1 -d ',' mydata.csv
+``` +$ cut -f 1 -d ',' mydata.csv +``` ## `sed` @@ -235,7 +262,9 @@ piped stream. In this way, it works like grep, but instead of just searching, it can also edit files. This is like "Search and Replace" in a text editor. `sed` has a lot of features, but almost everyone uses the extremely basic version of string replacement: -
$ sed 's/oldtext/newtext/g' myfile.txt
+``` +$ sed 's/oldtext/newtext/g' myfile.txt +``` By default, sed will just print the results. If you want to edit the file inplace, use `-i`, but be very careful that the results will be @@ -254,12 +283,16 @@ whitespace. For example, if you have padded fields then might be an uncertain number of spaces between each field. `awk` does better whitespace splitting. So, pulling out the fourth field in a whitespace delimited file is as follows: -
$ awk '{print $4}' mydata.dat
+``` +$ awk '{print $4}' mydata.dat +``` You can use `-F ':'` to change the delimiter (F for field separator). The next example is used to sum numbers from a field: -
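As an aside on the `-F` option mentioned above: with a colon-delimited file such as `/etc/passwd`, the first field (the username) can be pulled out as in this sketch:

```
$ awk -F ':' '{print $1}' /etc/passwd
```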
$ awk -F ',' '{sum += $1} END {print sum}' mydata.csv
+``` +$ awk -F ',' '{sum += $1} END {print sum}' mydata.csv +``` ## Basic Shell Scripting The basic premise of a script is to automate the execution of @@ -356,9 +389,10 @@ done Subcommands are used all the time in shell scripts. They store the output of a command in a variable, which can later be used in a conditional or a loop, for example. -
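For instance, a value captured from a subcommand can be tested in a conditional; a minimal sketch (the comparison is just an example):

```
CURRENTDIR=$(pwd)
if [ "$CURRENTDIR" = "$HOME" ]; then
    echo "You are in your home directory"
fi
```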
CURRENTDIR=`pwd`  # using backticks
+```
+CURRENTDIR=`pwd`  # using backticks
 CURRENTDIR=$(pwd)  # recommended (easier to type)
-
+``` In the above example you can see the 2 different methods of using a subcommand. `pwd` will output the current working directory, and its @@ -372,14 +406,18 @@ error. How do you properly deal with these situations? Firstly a useful thing to know for debugging and testing is that you can run any command like this: -
command 2>&1 output.log   # one single output file, both output and errors
+``` +command > output.log 2>&1 # one single output file, both output and errors +``` If you add `> output.log 2>&1` at the end of any command, it will combine `stdout` and `stderr`, outputting it into a single file named `output.log`. If you want regular and error output separated you can use: -
command > output.log 2> output.err  # errors in a separate file
+``` +command > output.log 2> output.err # errors in a separate file +``` this will write regular output to `output.log` and error output to `output.err`. @@ -388,7 +426,9 @@ You can then look for the errors with `less` or search for specific text with `grep`. In scripts, you can use: -
set -e
+``` +set -e +``` This will tell the shell to stop executing any subsequent commands when diff --git a/mkdocs/docs/HPC/linux-tutorial/common_pitfalls.md b/mkdocs/docs/HPC/linux-tutorial/common_pitfalls.md index 6dd0a8dccd5..fa2cc852955 100644 --- a/mkdocs/docs/HPC/linux-tutorial/common_pitfalls.md +++ b/mkdocs/docs/HPC/linux-tutorial/common_pitfalls.md @@ -21,29 +21,33 @@ different `$VSC_*` variables. Filenames should **not** contain any spaces! If you have a long filename you should use underscores or dashes (e.g., `very_long_filename`). -
$ cat some file
+```
+$ cat some file
 No such file or directory 'some'
-
+``` Spaces are permitted, however they result in surprising behaviour. To cat the file `'some file'` as above, you can escape the space with a backslash ("`\ `") or you can put the filename in quotes: -
$ cat some\ file
+```
+$ cat some\ file
 ...
-$ cat "some file"
+$ cat "some file"
 ...
-
+``` This is especially error-prone if you are piping results of `find`: -
$ find . -type f | xargs cat
+```
+$ find . -type f | xargs cat
 No such file or directory name ’some’
 No such file or directory name ’file’
-
+``` This can be worked around using the `-print0` flag: -
$ find . -type f -print0 | xargs -0 cat
+```
+$ find . -type f -print0 | xargs -0 cat
 ...
-
+``` But, this is tedious, and you can prevent errors by simply colouring within the lines and not using spaces in filenames. @@ -54,14 +58,18 @@ be careful to make sure that the environment variable exists. If you mistype an environment variable then it will resolve into a blank string. This means the following resolves to `rm -r ~/*` which will remove every file in your home directory! -
$ rm -r ~/$PROJETC/*
+``` +$ rm -r ~/$PROJETC/* +``` ### Typing dangerous commands A good habit when typing dangerous commands is to precede the line with `#`, the comment character. This will let you type out the command without fear of accidentally hitting enter and running something unintended. -
$ #rm -r ~/$POROJETC/*
+``` +$ #rm -r ~/$POROJETC/* +``` Then you can go back to the beginning of the line (`Ctrl-A`) and remove the first character (`Ctrl-D`) to run the command. You can also just press enter to put the command in your history so you can come back to @@ -74,31 +82,36 @@ variables). After copying files from a windows machine, a file might look funny when looking at it on the cluster. -
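Related to the dangerous-command habits above: prefixing a line with `echo` is another way to check what the shell would actually run, since it prints the command after variable and wildcard expansion instead of executing it; a sketch:

```
$ echo rm -r ~/$PROJETC/*
```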
$ cat script.sh
+```
+$ cat script.sh
 #!/bin/bash^M
 #PBS -l nodes^M
 ...
-
+``` Or you can get errors like: -
$ qsub fibo.pbs
+```
+$ qsub fibo.pbs
 qsub: script is written in DOS/Windows text format
-
+``` See section [dos2unix](uploading_files.md#dos2unix) to fix these errors with `dos2unix`. {% endif %} ### Permissions -
$ ls -l script.sh # File with correct permissions
+```
+$ ls -l script.sh # File with correct permissions
 -rwxr-xr-x 1 vsc40000 vsc40000 2983 Jan 30 09:13 script.sh
-$ ls -l script.sh # File with incorrect permissions
+$ ls -l script.sh # File with incorrect permissions
 -rw-r--r-- 1 vsc40000 vsc40000 2983 Jan 30 09:13 script.sh
-
+``` Before submitting the script, you'll need to add execute permissions to make sure it can be executed: -
$ chmod +x script_name.sh
+``` +$ chmod +x script_name.sh +``` ## Help @@ -111,13 +124,15 @@ consider the error as one thing, and won't show results just containing these words in random order. If you need help about a certain command, you should consult its so-called "man page": -
$ man command
+``` +$ man command +``` This will open the manual of this command. This manual contains detailed explanation of all the options the command has. Exiting the manual is done by pressing 'q'. -**Don't be afraid to contact {{hpcinfo}}. They are here to help and will do so for even the +**Don't be afraid to contact <{{hpcinfo}}>. They are here to help and will do so for even the smallest of problems!** # More information diff --git a/mkdocs/docs/HPC/linux-tutorial/getting_started.md b/mkdocs/docs/HPC/linux-tutorial/getting_started.md index 1a9a20aeeac..5335b374b28 100644 --- a/mkdocs/docs/HPC/linux-tutorial/getting_started.md +++ b/mkdocs/docs/HPC/linux-tutorial/getting_started.md @@ -27,7 +27,7 @@ To get help: ``` 2. use Google -3. contact {{hpcinfo}} in case +3. contact [{{hpcinfo}}](mailto:{{hpcinfo}}) in case of problems or questions (even for basic things!) ### Errors @@ -37,7 +37,7 @@ will be error output or a message explaining you this. Read this carefully and try to act on it. Try googling the error first to find any possible solution, but if you can't come up with something in 15 minutes, don't hesitate to mail -{{hpcinfo}}. +[{{hpcinfo}}](mailto:{{hpcinfo}}) ## Basic terminal usage @@ -46,8 +46,11 @@ The basic interface is the so-called shell prompt, typically ending with You use the shell by executing commands, and hitting ``. For example: -
$ echo hello 
- hello 
+ +``` +$ echo hello +hello +``` You can go to the start or end of the command line using `Ctrl-A` or `Ctrl-E`. @@ -60,12 +63,13 @@ To go through previous commands, use `` and A powerful feature is that you can "search" through your command history, either using the `history` command, or using `Ctrl-R`: -
$ history
+```
+$ history
     1 echo hello
 
 # hit Ctrl-R, type 'echo' 
 (reverse-i-search)`echo': echo hello
-
+``` ### Stopping commands @@ -85,9 +89,10 @@ They can be thought of as placeholders for things we need to remember. For example, to print the path to your home directory, we can use the shell variable named `HOME`: -
$ echo $HOME 
+```
+$ echo $HOME 
 /user/home/gent/vsc400/vsc40000
-
+``` This prints the value of this variable. @@ -101,29 +106,37 @@ For a full overview of defined environment variables in your current session, you can use the `env` command. You can sort this output with `sort` to make it easier to search in: -
$ env | sort 
+```
+$ env | sort 
 ...
 HOME=/user/home/gent/vsc400/vsc40000 
-... 
+... +``` You can also use the `grep` command to search for a piece of text. The following command will output all VSC-specific variable names and their values: -
$ env | sort | grep VSC
+``` +$ env | sort | grep VSC +``` But we can also define our own. This is done with the `export` command (note: variables are always all-caps as a convention): -
$ export MYVARIABLE="value"
+``` +$ export MYVARIABLE="value" +``` It is important you don't include spaces around the `=` sign. Also note the lack of `$` sign in front of the variable name. If we then do -
$ echo $MYVARIABLE
+``` +$ echo $MYVARIABLE +``` this will output `value`. Note that the quotes are not included, they were only used when defining the variable to escape @@ -135,16 +148,20 @@ You can change what your prompt looks like by redefining the special-purpose variable `$PS1`. For example: to include the current location in your prompt: -
$ export PS1='\w $'
+```
+$ export PS1='\w $'
 ~ $ cd test 
-~/test $ 
+~/test $ +``` Note that `~` is a short representation of your home directory. To make this persistent across sessions, you can define this custom value for `$PS1` in your `.profile` startup script: -
$ echo 'export PS1="\w $ " ' >> ~/.profile
+``` +$ echo 'export PS1="\w $ " ' >> ~/.profile +``` ### Using non-defined variables @@ -153,11 +170,13 @@ Contrary to what you may expect, this does *not* result in error messages, but the variable is considered to be *empty* instead. This may lead to surprising results, for example: -
$ export WORKDIR=/tmp/test 
-$ pwd
+```
+$ export WORKDIR=/tmp/test 
+$ pwd
+/user/home/gent/vsc400/vsc40000 
+$ echo $HOME
 /user/home/gent/vsc400/vsc40000 
-$ echo $HOME
-/user/home/gent/vsc400/vsc40000 
+``` To understand what's going on here, see the section on `cd` below. @@ -189,17 +208,20 @@ Basic information about the system you are logged into can be obtained in a variety of ways. We limit ourselves to determining the hostname: -
$ hostname 
+```
+$ hostname 
 gligar01.gligar.os
 
-$ echo $HOSTNAME 
+$ echo $HOSTNAME 
 gligar01.gligar.os 
-
+``` And querying some basic information about the Linux kernel: -
$ uname -a 
+```
+$ uname -a 
 Linux gligar01.gligar.os 2.6.32-573.8.1.el6.ug.x86_64 #1 SMP Mon Nov 16 15:12:09
-	CET 2015 x86_64 x86_64 x86_64 GNU/Linux 
+ CET 2015 x86_64 x86_64 x86_64 GNU/Linux +``` ## Exercises diff --git a/mkdocs/docs/HPC/linux-tutorial/hpc_infrastructure.md index 764e42208f9..2de27c6f5db 100644 --- a/mkdocs/docs/HPC/linux-tutorial/hpc_infrastructure.md +++ b/mkdocs/docs/HPC/linux-tutorial/hpc_infrastructure.md @@ -20,9 +20,10 @@ Space is limited on the cluster's storage. To check your quota, see section To figure out where your quota is being spent, the `du` (**d**isk **u**sage) command can come in useful: -
$ du -sh test
+```
+$ du -sh test
 59M test
-
+``` Do *not* (frequently) run `du` on directories where large amounts of data are stored, since that will: @@ -68,7 +69,7 @@ Hint: `python -c "print(sum(range(1, 101)))"` - How many modules are available for Python version 3.6.4? - How many modules get loaded when you load the `Python/3.6.4-intel-2018a` module? - Which `cluster` modules are available? - + - What's the full path to your personal home/data/scratch directories? - Determine how large your personal directories are. - What's the difference between the size reported by `du -sh $HOME` and by `ls -ld $HOME`? diff --git a/mkdocs/docs/HPC/linux-tutorial/manipulating_files_and_directories.md b/mkdocs/docs/HPC/linux-tutorial/manipulating_files_and_directories.md index 627bf9e9ef7..32ed9395d67 100644 --- a/mkdocs/docs/HPC/linux-tutorial/manipulating_files_and_directories.md +++ b/mkdocs/docs/HPC/linux-tutorial/manipulating_files_and_directories.md @@ -10,21 +10,22 @@ commands short to type. To print the contents of an entire file, you can use `cat`; to only see the first or last N lines, you can use `head` or `tail`: -
$ cat one.txt
+```
+$ cat one.txt
 1
 2
 3
 4
 5
 
-$ head -2 one.txt
+$ head -2 one.txt
 1
 2
 
-$ tail -2 one.txt
+$ tail -2 one.txt
 4
 5
-
+``` To check the contents of long text files, you can use the `less` or `more` commands which support scrolling with "<up>", "<down>", @@ -32,17 +33,20 @@ To check the contents of long text files, you can use the `less` or ## Copying files: "cp" -
$ cp source target
-
+``` +$ cp source target +``` This is the `cp` command, which copies a file from source to target. To copy a directory, we use the `-r` option: -
$ cp -r sourceDirectory target
-
+``` +$ cp -r sourceDirectory target +``` A last more complicated example: -
$ cp -a sourceDirectory target
-
+``` +$ cp -a sourceDirectory target +``` Here we used the same `cp` command, but instead we gave it the `-a` option which tells cp to copy all the files and keep timestamps and @@ -50,26 +54,29 @@ permissions. ## Creating directories: "mkdir" -
$ mkdir directory
-
+``` +$ mkdir directory +``` which will create a directory with the given name inside the current directory. ## Renaming/moving files: "mv" -
$ mv source target
-
+``` +$ mv source target +``` `mv` will move the source path to the destination path. Works for both directories and files. ## Removing files: "rm" -Note: there are NO backups, there is no 'trash bin'. If you -remove files/directories, they are gone. -
$ rm filename
-
+Note: there are NO backups, there is no 'trash bin'. If you +remove files/directories, they are gone. +``` +$ rm filename +``` `rm` will remove a file or directory. (`rm -rf directory` will remove every file inside a given directory). WARNING: files removed will be lost forever, there are no backups, so beware when using this command! @@ -80,8 +87,9 @@ You can remove directories using `rm -r directory`, however, this is error-prone and can ruin your day if you make a mistake in typing. To prevent this type of error, you can remove the contents of a directory using `rm` and then finally removing the directory with: -
$ rmdir directory
-
+``` +$ rmdir directory +``` ## Changing permissions: "chmod" [//]: # (#sec:chmod) @@ -114,11 +122,12 @@ Any time you run `ls -l` you'll see a familiar line of `-rwx------` or similar combination of the letters `r`, `w`, `x` and `-` (dashes). These are the permissions for the file or directory. (See also the [previous section on permissions](navigating.md#permissions)) -
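Permissions can also be set numerically (octal notation), where read=4, write=2 and execute=1 are added up per role; this is equivalent to the symbolic notation used in the rest of this section. A sketch with illustrative file names:

```
$ chmod 700 script.sh    # user: rwx, group: ---, other: ---
$ chmod 644 notes.txt    # user: rw-, group: r--, other: r--
```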
$ ls -l
+```
+$ ls -l
 total 1
 -rw-r--r--. 1 vsc40000 mygroup 4283648 Apr 12 15:13 articleTable.csv
 drwxr-x---. 2 vsc40000 mygroup 40 Apr 12 15:00 Project_GoldenDragon
-
+``` Here, we see that `articleTable.csv` is a file (beginning the line with `-`) has read and write permission for the user `vsc40000` (`rw-`), and read @@ -136,12 +145,13 @@ other users have no permissions to look in the directory at all (`---`). Maybe we have a colleague who wants to be able to add files to the directory. We use `chmod` to change the modifiers to the directory to let people in the group write to the directory: -
$ chmod g+w Project_GoldenDragon
-$ ls -l
+```
+$ chmod g+w Project_GoldenDragon
+$ ls -l
 total 1
 -rw-r--r--. 1 vsc40000 mygroup 4283648 Apr 12 15:13 articleTable.csv
 drwxrwx---. 2 vsc40000 mygroup 40 Apr 12 15:00 Project_GoldenDragon
-
+``` The syntax used here is `g+w` which means group was given write permission. To revoke it again, we use `g-w`. The other roles are `u` @@ -162,10 +172,11 @@ However, this means that all users in `mygroup` can add or remove files. This could be problematic if you only wanted one person to be allowed to help you administer the files in the project. We need a new group. To do this in the HPC environment, we need to use access control lists (ACLs): -
$ setfacl -m u:otheruser:w Project_GoldenDragon
-$ ls -l Project_GoldenDragon
+```
+$ setfacl -m u:otheruser:w Project_GoldenDragon
+$ ls -l Project_GoldenDragon
 drwxr-x---+ 2 vsc40000 mygroup 40 Apr 12 15:00 Project_GoldenDragon
-
+``` This will give the **u**ser `otheruser` permissions to **w**rite to `Project_GoldenDragon` @@ -186,30 +197,34 @@ used frequently. This means they will use less space and thus you get more out of your quota. Some types of files (e.g., CSV files with a lot of numbers) compress as much as 9:1. The most commonly used compression format on Linux is gzip. To compress a file using gzip, we use: -
$ ls -lh myfile
+```
+$ ls -lh myfile
 -rw-r--r--. 1 vsc40000 vsc40000 4.1M Dec 2 11:14 myfile
-$ gzip myfile
-$ ls -lh myfile.gz
+$ gzip myfile
+$ ls -lh myfile.gz
 -rw-r--r--. 1 vsc40000 vsc40000 1.1M Dec 2 11:14 myfile.gz
-
+``` Note: if you zip a file, the original file will be removed. If you unzip a file, the compressed file will be removed. To keep both, we send the data to `stdout` and redirect it to the target file: -
$ gzip -c myfile > myfile.gz
-$ gunzip -c myfile.gz > myfile
-
+``` +$ gzip -c myfile > myfile.gz +$ gunzip -c myfile.gz > myfile +``` ### "zip" and "unzip" Windows and macOS seem to favour the zip file format, so it's also important to know how to unpack those. We do this using unzip: -
$ unzip myfile.zip
-
+``` +$ unzip myfile.zip +``` If we would like to make our own zip archive, we use zip: -
$ zip myfiles.zip myfile1 myfile2 myfile3
-
+``` +$ zip myfiles.zip myfile1 myfile2 myfile3 +``` ## Working with tarballs: "tar" @@ -218,37 +233,42 @@ bigger file. You will normally want to unpack these files more often than you make them. To unpack a `.tar` file you use: -
$ tar -xf tarfile.tar
-
+``` +$ tar -xf tarfile.tar +``` Often, you will find `gzip` compressed `.tar` files on the web. These are called tarballs. You can recognize them by the filename ending in `.tar.gz`. You can uncompress these using `gunzip` and then unpacking them using `tar`. But `tar` knows how to open them using the `-z` option: -
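It can also be useful to list the contents of an archive before unpacking it, which `tar` does with the `-t` option; a quick sketch:

```
$ tar -tf tarfile.tar
$ tar -ztf tarfile.tar.gz
```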
$ tar -zxf tarfile.tar.gz
-$ tar -zxf tarfile.tgz
-
+``` +$ tar -zxf tarfile.tar.gz +$ tar -zxf tarfile.tgz +``` ### Order of arguments Note: Archive programs like `zip`, `tar`, and `jar` use arguments in the "opposite direction" of copy commands. -
# cp, ln: <source(s)> <target>
-$ cp source1 source2 source3 target
-$ ln -s source target
+```
+# cp, ln: <source(s)> <target>
+$ cp source1 source2 source3 target
+$ ln -s source target
 
 # zip, tar: <target> <source(s)>
-$ zip zipfile.zip source1 source2 source3
-$ tar -cf tarfile.tar source1 source2 source3
-
+$ zip zipfile.zip source1 source2 source3 +$ tar -cf tarfile.tar source1 source2 source3 +``` If you use `tar` with the source files first then the first file will be overwritten. You can control the order of arguments of `tar` if it helps you remember: -
$ tar -c source1 source2 source3 -f tarfile.tar
+``` +$ tar -c source1 source2 source3 -f tarfile.tar +``` ## Exercises diff --git a/mkdocs/docs/HPC/linux-tutorial/navigating.md b/mkdocs/docs/HPC/linux-tutorial/navigating.md index 030f7b5da54..5bbfb7ba326 100644 --- a/mkdocs/docs/HPC/linux-tutorial/navigating.md +++ b/mkdocs/docs/HPC/linux-tutorial/navigating.md @@ -7,12 +7,13 @@ important skill. ## Current directory: "pwd" and "$PWD" To print the current directory, use `pwd` or `\$PWD`: -
$ cd $HOME 
-$ pwd
+```
+$ cd $HOME 
+$ pwd
 /user/home/gent/vsc400/vsc40000 
-$ echo "The current directory is: $PWD" 
+$ echo "The current directory is: $PWD" 
 The current directory is: /user/home/gent/vsc400/vsc40000
-
+``` ## Listing files and directories: "ls" @@ -20,78 +21,99 @@ A very basic and commonly used command is `ls`, which can be used to list files and directories. In its basic usage, it just prints the names of files and directories in -the current directory. For example:
$ ls
-afile.txt some_directory 
+the current directory. For example: +``` +$ ls +afile.txt some_directory +``` When provided an argument, it can be used to list the contents of a -directory:
$ ls some_directory 
-one.txt two.txt 
+directory: +``` +$ ls some_directory +one.txt two.txt +``` A couple of commonly used options include: - detailed listing using `ls -l`: -:
$ ls -l
-	total 4224 
-	-rw-rw-r-- 1 vsc40000 vsc40000 2157404 Apr 12 13:17 afile.txt 
-	drwxrwxr-x 2 vsc40000 vsc40000 512 Apr 12 12:51 some_directory
+ ``` + $ ls -l + total 4224 + -rw-rw-r-- 1 vsc40000 vsc40000 2157404 Apr 12 13:17 afile.txt + drwxrwxr-x 2 vsc40000 vsc40000 512 Apr 12 12:51 some_directory + ``` - To print the size information in human-readable form, use the `-h` flag: -:
$ ls -lh
-	total 4.1M 
-	-rw-rw-r-- 1 vsc40000 vsc40000 2.1M Apr 12 13:16 afile.txt
-	drwxrwxr-x 2 vsc40000 vsc40000 512 Apr 12 12:51 some_directory
+ ``` + $ ls -lh + total 4.1M + -rw-rw-r-- 1 vsc40000 vsc40000 2.1M Apr 12 13:16 afile.txt + drwxrwxr-x 2 vsc40000 vsc40000 512 Apr 12 12:51 some_directory + ``` - also listing hidden files using the `-a` flag: -:
$ ls -lah
-	total 3.9M 
-	drwxrwxr-x 3 vsc40000 vsc40000 512 Apr 12 13:11 .
-	drwx------ 188 vsc40000 vsc40000 128K Apr 12 12:41 .. 
-	-rw-rw-r-- 1 vsc40000 vsc40000 1.8M Apr 12 13:12 afile.txt 
-	-rw-rw-r-- 1 vsc40000 vsc40000 0 Apr 12 13:11 .hidden_file.txt 
-	drwxrwxr-x 2 vsc40000 vsc40000 512 Apr 12 12:51 some_directory
+ ``` + $ ls -lah + total 3.9M + drwxrwxr-x 3 vsc40000 vsc40000 512 Apr 12 13:11 . + drwx------ 188 vsc40000 vsc40000 128K Apr 12 12:41 .. + -rw-rw-r-- 1 vsc40000 vsc40000 1.8M Apr 12 13:12 afile.txt + -rw-rw-r-- 1 vsc40000 vsc40000 0 Apr 12 13:11 .hidden_file.txt + drwxrwxr-x 2 vsc40000 vsc40000 512 Apr 12 12:51 some_directory + ``` - ordering files by the most recent change using `-rt`: -:
$ ls -lrth
-	total 4.0M 
-	drwxrwxr-x 2 vsc40000 vsc40000 512 Apr 12 12:51 some_directory 
-	-rw-rw-r-- 1 vsc40000 vsc40000 2.0M Apr 12 13:15 afile.txt
+ ``` + $ ls -lrth + total 4.0M + drwxrwxr-x 2 vsc40000 vsc40000 512 Apr 12 12:51 some_directory + -rw-rw-r-- 1 vsc40000 vsc40000 2.0M Apr 12 13:15 afile.txt + ``` If you try to use `ls` on a file that doesn't exist, you will get a clear error message: -
$ ls nosuchfile 
+
+```
+$ ls nosuchfile 
 ls: cannot access nosuchfile: No such file or directory
-
+``` ## Changing directory: "cd" To change to a different directory, you can use the `cd` command: -
$ cd some_directory
+``` +$ cd some_directory +``` To change back to the previous directory you were in, there's a shortcut: `cd -` Using `cd` without an argument results in returning back to your home directory: -
$ cd 
-$ pwd
-/user/home/gent/vsc400/vsc40000 
+``` +$ cd +$ pwd +/user/home/gent/vsc400/vsc40000 +``` ## Inspecting file type: "file" The `file` command can be used to inspect what type of file you're dealing with: -
$ file afile.txt
+```
+$ file afile.txt
 afile.txt: ASCII text
 
-$ file some_directory 
+$ file some_directory 
 some_directory: directory
-
+``` ## Absolute vs relative file paths @@ -118,9 +140,11 @@ There are two special relative paths worth mentioning: You can also use `..` when constructing relative paths, for example: -
$ cd $HOME/some_directory 
-$ ls ../afile.txt 
-../afile.txt 
+``` +$ cd $HOME/some_directory +$ ls ../afile.txt +../afile.txt +``` ## Permissions @@ -130,8 +154,10 @@ Each file and directory has particular *permissions* set on it, which can be queried using `ls -l`. For example: -
$ ls -l afile.txt 
--rw-rw-r-- 1 vsc40000 agroup 2929176 Apr 12 13:29 afile.txt 
+``` +$ ls -l afile.txt +-rw-rw-r-- 1 vsc40000 agroup 2929176 Apr 12 13:29 afile.txt +``` The `-rwxrw-r--` specifies both the type of file (`-` for files, `d` for directories (see first @@ -164,19 +190,23 @@ later in this manual. matching given criteria. For example, to look for the file named `one.txt`: -
$ cd $HOME 
-$ find . -name one.txt
-./some_directory/one.txt 
+``` +$ cd $HOME +$ find . -name one.txt +./some_directory/one.txt +``` To look for files using incomplete names, you can use a wildcard `*`; note that you need to escape the `*` to avoid that Bash *expands* it into `afile.txt` by adding double quotes: -
$ find . -name "*.txt"
+```
+$ find . -name "*.txt"
 ./.hidden_file.txt 
 ./afile.txt 
 ./some_directory/one.txt
-./some_directory/two.txt 
+./some_directory/two.txt +``` A more advanced use of the `find` command is to use the `-exec` flag to perform actions on the found file(s), rather diff --git a/mkdocs/docs/HPC/linux-tutorial/uploading_files.md b/mkdocs/docs/HPC/linux-tutorial/uploading_files.md index 5df09b24f32..59948c9b063 100644 --- a/mkdocs/docs/HPC/linux-tutorial/uploading_files.md +++ b/mkdocs/docs/HPC/linux-tutorial/uploading_files.md @@ -24,8 +24,9 @@ sbatch: error: instead of expected UNIX line breaks (\n). To fix this problem, you should run the ``dos2unix`` command on the file: -
$ dos2unix filename
-
+``` +$ dos2unix filename +``` ## Symlinks for data/scratch [//]: # (sec:symlink-for-data) @@ -40,15 +41,16 @@ This will create 4 symbolic links {% if OS == windows %} (they're like "shortcuts" on your desktop) {% endif %} pointing to the respective storages: -
$ cd $HOME
-$ ln -s $VSC_SCRATCH scratch
-$ ln -s $VSC_DATA data
-$ ls -l scratch data
+```
+$ cd $HOME
+$ ln -s $VSC_SCRATCH scratch
+$ ln -s $VSC_DATA data
+$ ls -l scratch data
 lrwxrwxrwx 1 vsc40000 vsc40000 31 Mar 27 2009 data ->
     /user/data/gent/vsc400/vsc40000
 lrwxrwxrwx 1 vsc40000 vsc40000 34 Jun 5 2012 scratch ->
     /user/scratch/gent/vsc400/vsc40000
-
+``` @@ -83,7 +85,9 @@ Installing `rsync` is the easiest on Linux: it comes pre-installed with a lot of distributions. For example, to copy a folder with lots of CSV files: -
$ rsync -rzv testfolder vsc40000@login.hpc.ugent.be:data/
+``` +$ rsync -rzv testfolder vsc40000@login.hpc.ugent.be:data/ +``` will copy the folder `testfolder` and its contents to `$VSC_DATA` on the , assuming the `data` symlink is present in your home directory, see @@ -98,7 +102,9 @@ To copy large files using `rsync`, you can use the `-P` flag: it enables both showing of progress and resuming partially downloaded files. To copy files to your local computer, you can also use `rsync`: -
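As mentioned above, adding the `-P` flag shows progress and lets interrupted transfers be resumed, which is mostly useful for large files; a sketch based on the same upload example:

```
$ rsync -rzvP testfolder vsc40000@login.hpc.ugent.be:data/
```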
$ rsync -rzv vsc40000@login.hpc.ugent.be:data/bioset local_folder
+``` +$ rsync -rzv vsc40000@login.hpc.ugent.be:data/bioset local_folder +``` This will copy the folder `bioset` and its contents on `$VSC_DATA` to a local folder named `local_folder`. diff --git a/mkdocs/docs/HPC/multi_core_jobs.md b/mkdocs/docs/HPC/multi_core_jobs.md index 9c527db4eae..00834138cbd 100644 --- a/mkdocs/docs/HPC/multi_core_jobs.md +++ b/mkdocs/docs/HPC/multi_core_jobs.md @@ -28,79 +28,15 @@ approaches to parallel programming. In addition there are many problem specific libraries that incorporate parallel capabilities. The next three sections explore some common approaches: (raw) threads, OpenMP and MPI. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-
Parallel programming approaches
-
- Tool - - Available languages binding - - Limitations -
- Raw threads pthreads, boost:: threading, ... - - Threading libraries are available for all common programming languages - - Threading libraries are available for all common programming languages & Threads are limited to shared memory systems. They are more often used on single node systems rather than for {{ hpc }}. Thread management is hard. -
- OpenMP - - Fortran/C/C++ - - Limited to shared memory systems, but large shared memory systems for HPC are not uncommon (e.g., SGI UV). Loops and task can be parallelized by simple insertion of compiler directives. Under the hood threads are used. Hybrid approaches exist which use OpenMP to parallelize the work load on each node and MPI (see below) for communication between nodes. -
- Lightweight threads with clever scheduling, Intel TBB, Intel Cilk Plus - - C/C++ - - Limited to shared memory systems, but may be combined with MPI. Thread management is taken care of by a very clever scheduler enabling the programmer to focus on parallelization itself. Hybrid approaches exist which use TBB and/or Cilk Plus to parallelise the work load on each node and MPI (see below) for communication between nodes. -
- MPI - - Fortran/C/C++, Python - - Applies to both distributed and shared memory systems. Cooperation between different nodes or cores is managed by explicit calls to library routines handling communication routines. -
- Global Arrays library - - C/C++, Python - - Mimics a global address space on distributed memory systems, by distributing arrays over many nodes and one sided communication. This library is used a lot for chemical structure calculation codes and was used in one of the first applications that broke the PetaFlop barrier. -
+ +| **Tool** | **Available languages binding** | **Limitations** | +|--------------------------------------------|---------------------------------|-----------------| +| Raw threads (pthreads, boost::threading, ...) | Threading libraries are available for all common programming languages | Threads are limited to shared memory systems. They are more often used on single node systems rather than for {{ hpc }}. Thread management is hard. | +| OpenMP | Fortran/C/C++ | Limited to shared memory systems, but large shared memory systems for HPC are not uncommon (e.g., SGI UV). Loops and tasks can be parallelized by simple insertion of compiler directives. Under the hood threads are used. Hybrid approaches exist which use OpenMP to parallelize the work load on each node and MPI (see below) for communication between nodes. | +| Lightweight threads with clever scheduling, Intel TBB, Intel Cilk Plus | C/C++ | Limited to shared memory systems, but may be combined with MPI. Thread management is taken care of by a very clever scheduler enabling the programmer to focus on parallelization itself. Hybrid approaches exist which use TBB and/or Cilk Plus to parallelise the work load on each node and MPI (see below) for communication between nodes. | +| MPI | Fortran/C/C++, Python | Applies to both distributed and shared memory systems. Cooperation between different nodes or cores is managed by explicit calls to library routines handling communication routines. | +| Global Arrays library | C/C++, Python | Mimics a global address space on distributed memory systems, by distributing arrays over many nodes and one sided communication. This library is used a lot for chemical structure calculation codes and was used in one of the first applications that broke the PetaFlop barrier. | + !!! tip You can request more nodes/cores by adding the following line to your run script. @@ -150,28 +86,30 @@ runs a simple function that only prints "Hello from thread". Go to the example directory: -
$ cd ~/{{ exampledir }}
-
+``` +cd ~/{{ exampledir }} +``` !!! note If the example directory is not yet present, copy it to your home directory: -
$ cp -r {{ examplesdir }} ~/
+ ``` + cp -r {{ examplesdir }} ~/ + ``` Study the example first: -
-- T_hello.c --
- -```C +```C title="T_hello.c" {% include "./examples/Multi_core_jobs_Parallel_Computing/T_hello.c" %} ``` And compile it (whilst including the thread library) and run and test it on the login-node: -
$ module load GCC
-$ gcc -o T_hello T_hello.c -lpthread
-$ ./T_hello
+```
+$ module load GCC
+$ gcc -o T_hello T_hello.c -lpthread
+$ ./T_hello
 spawning thread 0
 spawning thread 1
 spawning thread 2
@@ -182,13 +120,14 @@ spawning thread 3
 spawning thread 4
 Hello from thread 3!
 Hello from thread 4!
-
+``` Now, run it on the cluster and check the output: -
$ qsub T_hello.pbs
+```
+$ qsub T_hello.pbs
 {{ jobid }}
-$ more T_hello.pbs.o{{ jobid }}
+$ more T_hello.pbs.o{{ jobid }}
 spawning thread 0
 spawning thread 1
 spawning thread 2
@@ -199,7 +138,7 @@ spawning thread 3
 spawning thread 4
 Hello from thread 3!
 Hello from thread 4!
-
+``` !!! tip If you plan engaging in parallel programming using threads, this book @@ -256,18 +195,17 @@ Parallelising for loops is really simple (see code below). By default, loop iteration counters in OpenMP loop constructs (in this case the i variable) in the for loop are set to private variables. -
-- omp1.c --
- -```C +```C title="omp1.c" {% include "./examples/Multi_core_jobs_Parallel_Computing/omp1.c" %} ``` And compile it (whilst including the "*openmp*" library) and run and test it on the login-node: -
$ module load GCC
-$ gcc -fopenmp -o omp1 omp1.c
-$ ./omp1
+```
+$ module load GCC
+$ gcc -fopenmp -o omp1 omp1.c
+$ ./omp1
 Thread 6 performed 125 iterations of the loop.
 Thread 7 performed 125 iterations of the loop.
 Thread 5 performed 125 iterations of the loop.
@@ -276,12 +214,13 @@ Thread 0 performed 125 iterations of the loop.
 Thread 2 performed 125 iterations of the loop.
 Thread 3 performed 125 iterations of the loop.
 Thread 1 performed 125 iterations of the loop.
-
+``` Now run it in the cluster and check the result again. -
$ qsub omp1.pbs
-$ cat omp1.pbs.o*
+```
+$ qsub omp1.pbs
+$ cat omp1.pbs.o*
 Thread 1 performed 125 iterations of the loop.
 Thread 4 performed 125 iterations of the loop.
 Thread 3 performed 125 iterations of the loop.
@@ -290,7 +229,7 @@ Thread 5 performed 125 iterations of the loop.
 Thread 7 performed 125 iterations of the loop.
 Thread 2 performed 125 iterations of the loop.
 Thread 6 performed 125 iterations of the loop.
-
+``` ### Critical Code @@ -301,18 +240,17 @@ you do things like updating a global variable with local results from each thread, and you don't have to worry about things like other threads writing to that global variable at the same time (a collision). -
-- omp2.c --
- -```C +```C title="omp2.c" {% include "./examples/Multi_core_jobs_Parallel_Computing/omp2.c" %} ``` And compile it (whilst including the "*openmp*" library) and run and test it on the login-node: -
$ module load GCC
-$ gcc -fopenmp -o omp2 omp2.c
-$ ./omp2
+```
+$ module load GCC
+$ gcc -fopenmp -o omp2 omp2.c
+$ ./omp2
 Thread 3 is adding its iterations (12500) to sum (0), total is now 12500.
 Thread 7 is adding its iterations (12500) to sum (12500), total is now 25000.
 Thread 5 is adding its iterations (12500) to sum (25000), total is now 37500.
@@ -322,12 +260,13 @@ Thread 4 is adding its iterations (12500) to sum (62500), total is now 75000.
 Thread 1 is adding its iterations (12500) to sum (75000), total is now 87500.
 Thread 0 is adding its iterations (12500) to sum (87500), total is now 100000.
 Total # loop iterations is 100000
-
+``` Now run it in the cluster and check the result again. -
$ qsub omp2.pbs
-$ cat omp2.pbs.o*
+```
+$ qsub omp2.pbs
+$ cat omp2.pbs.o*
 Thread 2 is adding its iterations (12500) to sum (0), total is now 12500.
 Thread 0 is adding its iterations (12500) to sum (12500), total is now 25000.
 Thread 1 is adding its iterations (12500) to sum (25000), total is now 37500.
@@ -337,7 +276,7 @@ Thread 3 is adding its iterations (12500) to sum (62500), total is now 75000.
 Thread 5 is adding its iterations (12500) to sum (75000), total is now 87500.
 Thread 6 is adding its iterations (12500) to sum (87500), total is now 100000.
 Total # loop iterations is 100000
-
+``` ### Reduction @@ -349,27 +288,27 @@ example above, where we used the "critical code" directive to accomplish this. The map-reduce paradigm is so common that OpenMP has a specific directive that allows you to more easily implement this. -
-- omp3.c --
- -```C +```C title="omp3.c" {% include "./examples/Multi_core_jobs_Parallel_Computing/omp3.c" %} ``` And compile it (whilst including the "*openmp*" library) and run and test it on the login-node: -
$ module load GCC
-$ gcc -fopenmp -o omp3 omp3.c
-$ ./omp3
+```
+$ module load GCC
+$ gcc -fopenmp -o omp3 omp3.c
+$ ./omp3
 Total # loop iterations is 100000
-
+``` Now run it in the cluster and check the result again. -
$ qsub omp3.pbs
-$ cat omp3.pbs.o*
+```
+$ qsub omp3.pbs
+$ cat omp3.pbs.o*
 Total # loop iterations is 100000
-
+``` ### Other OpenMP directives @@ -439,38 +378,36 @@ return the results to the main process, and print the messages. Study the MPI-programme and the PBS-file: -
-- mpi_hello.c --
- -```C +```C title="mpi_hello.c" {% include "./examples/Multi_core_jobs_Parallel_Computing/mpi_hello.c" %} ``` -
-- mpi_hello.pbs --
- -```bash +```bash title="mpi_hello.pbs" {% include "./examples/Multi_core_jobs_Parallel_Computing/mpi_hello.pbs" %} ``` and compile it: -
$ module load intel
-$ mpiicc -o mpi_hello mpi_hello.c
-
+``` +$ module load intel +$ mpiicc -o mpi_hello mpi_hello.c +``` mpiicc is a wrapper of the Intel C++ compiler icc to compile MPI programs (see [the chapter on compilation](./compiling_your_software.md) for details). Run the parallel program: -
$ qsub mpi_hello.pbs
-$ ls -l
+```
+$ qsub mpi_hello.pbs
+$ ls -l
 total 1024
 -rwxrwxr-x 1 {{ userid }} 8746 Sep 16 14:19 mpi_hello*
 -rw-r--r-- 1 {{ userid }} 1626 Sep 16 14:18 mpi_hello.c
 -rw------- 1 {{ userid }}    0 Sep 16 14:22 mpi_hello.o{{ jobid }}
 -rw------- 1 {{ userid }}    0 Sep 16 14:22 mpi_hello.e{{ jobid }}
 -rw-r--r-- 1 {{ userid }}  304 Sep 16 14:22 mpi_hello.pbs
-$ cat mpi_hello.o{{ jobid }}
+$ cat mpi_hello.o{{ jobid }}
 0: We have 16 processors
 0: Hello 1! Processor 1 reporting for duty
 0: Hello 2! Processor 2 reporting for duty
@@ -487,7 +424,7 @@ total 1024
 0: Hello 13! Processor 13 reporting for duty
 0: Hello 14! Processor 14 reporting for duty
 0: Hello 15! Processor 15 reporting for duty
-
+``` The runtime environment for the MPI implementation used (often called mpirun or mpiexec) spawns multiple copies of the program, with the total diff --git a/mkdocs/docs/HPC/multi_job_submission.md b/mkdocs/docs/HPC/multi_job_submission.md index 5177d79fa33..d336959cb85 100644 --- a/mkdocs/docs/HPC/multi_job_submission.md +++ b/mkdocs/docs/HPC/multi_job_submission.md @@ -48,30 +48,32 @@ scenario that can be reduced to a **MapReduce** approach.[^1] ## The worker Framework: Parameter Sweeps First go to the right directory: -
$ cd ~/examples/Multi-job-submission/par_sweep
+ +``` +cd ~/examples/Multi-job-submission/par_sweep +``` Suppose the user wishes to run the "*weather*" program, which takes three parameters: a temperature, a pressure and a volume. A typical call of the program looks like: -
$ ./weather -t 20 -p 1.05 -v 4.3
-T: 20  P: 1.05  V: 4.3
+ +``` +$ ./weather -t 20 -p 1.05 -v 4.3 +T: 20 P: 1.05 V: 4.3 +``` For the purpose of this exercise, the weather program is just a simple bash script, which prints the 3 variables to the standard output and waits a bit: -

par_sweep/weather

- -```shell +```shell title="par_sweep/weather" {% include "examples/Multi-job-submission/par_sweep/weather" %} ``` A job script that would run this as a job for the first parameters (p01) would then look like: -

par_sweep/weather_p01.pbs

- -```shell +```shell title="par_sweep/weather_p01.pbs" {% include "examples/Multi-job-submission/par_sweep/weather_p01.pbs" %} ``` @@ -80,7 +82,10 @@ particular instance of the parameters, i.e., temperature = 20, pressure = 1.05, and volume = 4.3. To submit the job, the user would use: -
$ qsub weather_p01.pbs
+ +``` + $ qsub weather_p01.pbs +``` However, the user wants to run this program for many parameter instances, e.g., he wants to run the program on 100 instances of temperature, pressure and volume. The 100 parameter instances can be @@ -88,14 +93,17 @@ stored in a comma separated value file (.csv) that can be generated using a spreadsheet program such as Microsoft Excel or RDBMS or just by hand using any text editor (do **not** use a word processor such as Microsoft Word). The first few lines of the file "*data.csv*" would look like: -
$ more data.csv
+
+```
+$ more data.csv
 temperature, pressure, volume
 293, 1.0e5, 107
 294, 1.0e5, 106
 295, 1.0e5, 105
 296, 1.0e5, 104
 297, 1.0e5, 103
-...
+... +``` It has to contain the names of the variables on the first line, followed by 100 parameter instances in the current example. @@ -103,9 +111,7 @@ by 100 parameter instances in the current example. In order to make our PBS generic, the PBS file can be modified as follows: -

par_sweep/weather.pbs

- -```shell +```shell title="par_sweep/weather.pbs" {% include "examples/Multi-job-submission/par_sweep/weather.pbs" %} ``` @@ -128,10 +134,13 @@ minutes, i.e., 4 hours to be on the safe side. The job can now be submitted as follows (to check which `worker` module to use, see subsection [Using explicit version numbers](running_batch_jobs.md#using-explicit-version-numbers)): -
$ module load worker/1.6.12-foss-2021b
-$ wsub -batch weather.pbs -data data.csv
+
+```
+$ module load worker/1.6.12-foss-2021b
+$ wsub -batch weather.pbs -data data.csv
 total number of work items: 41
-{{jobid}}
+{{jobid}} +``` Note that the PBS file is the value of the -batch option. The weather program will now be run for all 100 parameter instances -- 8 @@ -140,17 +149,26 @@ a parameter instance is called a work item in Worker parlance. !!! warning When you attempt to submit a worker job on a non-default cluster, you might encounter an `Illegal instruction` error. In such cases, the solution is to use a different `module swap` command. For example, to submit a worker job to the [`donphan` debug cluster](interactive_debug.md) from the login nodes, use: -
$ module swap env/slurm/donphan
-    
+ + ``` + module swap env/slurm/donphan + ``` + instead of -
$ module swap cluster/donphan 
+ + ``` + module swap cluster/donphan + ``` We recommend using a `module swap cluster` command after submitting the jobs. Additional information about this as well as more comprehensive details concerning the 'Illegal instruction' error can be accessed [here](troubleshooting.md#multi-job-submissions-on-a-non-default-cluster). ## The Worker framework: Job arrays [//]: # (sec:worker-framework-job-arrays) First go to the right directory: -
$ cd ~/examples/Multi-job-submission/job_array
+ +``` +cd ~/examples/Multi-job-submission/job_array +``` As a simple example, assume you have a serial program called *myprog* that you want to run on various input files *input\[1-100\]*. @@ -187,7 +205,10 @@ The details are script/program to specialise for that job The job could have been submitted using: -
$ qsub -t 1-100 my_prog.pbs
+ +``` +qsub -t 1-100 my_prog.pbs +``` The effect was that rather than 1 job, the user would actually submit 100 jobs to the queue system. This was a popular feature of TORQUE, but @@ -200,9 +221,7 @@ arrays" in its own way. A typical job script for use with job arrays would look like this: -

job_array/job_array.pbs

- -```shell +```shell title="job_array/job_array.pbs" {% include "examples/Multi-job-submission/job_array/job_array.pbs" %} ``` @@ -213,14 +232,17 @@ with those parameters. Input for the program is stored in files with names such as input_1.dat, input_2.dat, ..., input_100.dat in the ./input subdirectory. -
$ ls ./input
+
+```
+$ ls ./input
 ...
-$ more ./input/input_99.dat
+$ more ./input/input_99.dat
 This is input file \#99
 Parameter #1 = 99
 Parameter #2 = 25.67
 Parameter #3 = Batch
-Parameter #4 = 0x562867
+Parameter #4 = 0x562867 +``` For the sole purpose of this exercise, we have provided a short "test_set" program, which reads the "input" files and just copies them @@ -229,18 +251,14 @@ file. The corresponding output computed by our "*test_set*" program will be written to the *"./output*" directory in output_1.dat, output_2.dat, ..., output_100.dat. files. -

job_array/test_set

- -```shell +```shell title="job_array/test_set" {% include "examples/Multi-job-submission/job_array/test_set" %} ``` Using the "worker framework", a feature akin to job arrays can be used with minimal modifications to the job script: -

job_array/test_set.pbs

- -```shell +```shell title="job_array/test_set.pbs" {% include "examples/Multi-job-submission/job_array/test_set.pbs" %} ``` @@ -253,10 +271,13 @@ Note that walltime=04:00:00). The job is now submitted as follows: -
$ module load worker/1.6.12-foss-2021b
-$ wsub -t 1-100 -batch test_set.pbs
+
+```
+$ module load worker/1.6.12-foss-2021b
+$ wsub -t 1-100 -batch test_set.pbs
 total number of work items: 100
-{{jobid}}
+{{jobid}} +``` The "*test_set*" program will now be run for all 100 input files -- 8 concurrently -- until all computations are done. Again, a computation @@ -265,16 +286,18 @@ work item in Worker speak. Note that in contrast to TORQUE job arrays, a worker job array only submits a single job. -
$ qstat
+
+```
+$ qstat
 Job id          Name          User      Time   Use S Queue
 --------------- ------------- --------- ---- ----- - -----
 {{jobid}}  test_set.pbs  {{userid}}          0 Q
 
 And you can now check the generated output files:
-$ more ./output/output_99.dat
+$ more ./output/output_99.dat
 This is output file #99
 Calculations done, no results
-
+``` ## MapReduce: prologues and epilogue @@ -299,33 +322,36 @@ is executed just once after the work on all work items has finished. Technically, the master, i.e., the process that is responsible for dispatching work and logging progress, executes the prologue and epilogue. -
$ cd ~/examples/Multi-job-submission/map_reduce
+ +``` +cd ~/examples/Multi-job-submission/map_reduce +``` The script "pre.sh" prepares the data by creating 100 different input-files, and the script "post.sh" aggregates (concatenates) the data. First study the scripts: -

map_reduce/pre.sh

-```shell +```shell title="map_reduce/pre.sh" {% include "examples/Multi-job-submission/map_reduce/pre.sh" %} ``` -

map_reduce/post.sh

- -```shell +```shell title="map_reduce/post.sh" {% include "examples/Multi-job-submission/map_reduce/post.sh" %} ``` Then one can submit a MapReduce style job as follows: -
$ wsub -prolog pre.sh -batch test_set.pbs -epilog post.sh -t 1-100
+
+```
+$ wsub -prolog pre.sh -batch test_set.pbs -epilog post.sh -t 1-100
 total number of work items: 100
 {{jobid}}
-$ cat all_output.txt
+$ cat all_output.txt
 ...
-$ rm -r -f ./output/
+$ rm -r -f ./output/ +``` Note that the time taken for executing the prologue and the epilogue should be added to the job's total walltime. @@ -356,11 +382,17 @@ from the job's name and the job's ID, i.e., it has the form `.log`. For the running example, this could be `run.pbs.log{{jobid}}`, assuming the job's ID is {{jobid}}. To keep an eye on the progress, one can use: -
$ tail -f run.pbs.log{{jobid}}
+ +``` +tail -f run.pbs.log{{jobid}} +``` Alternatively, `wsummarize`, a Worker command that summarises a log file, can be used: -
$ watch -n 60 wsummarize run.pbs.log{{jobid}}
+ +``` +watch -n 60 wsummarize run.pbs.log{{jobid}} +``` This will summarise the log file every 60 seconds. @@ -398,13 +430,19 @@ processed. Worker makes it very easy to resume such a job without having to figure out which work items did complete successfully, and which remain to be computed. Suppose the job that did not complete all its work items had ID "445948". -
$ wresume -jobid {{jobid}}
+ +``` +wresume -jobid {{jobid}} +``` This will submit a new job that will start to work on the work items that were not done yet. Note that it is possible to change almost all job parameters when resuming, specifically the requested resources such as the number of cores and the walltime. -
$ wresume -l walltime=1:30:00 -jobid {{jobid}}}
+ +``` +wresume -l walltime=1:30:00 -jobid {{jobid}} +``` Work items may fail to complete successfully for a variety of reasons, e.g., a data file that is missing, a (minor) programming error, etc. @@ -413,7 +451,10 @@ done, so resuming a job will only execute work items that did not terminate either successfully, or reporting a failure. It is also possible to retry work items that failed (preferably after the glitch why they failed was fixed). -
$ wresume -jobid {{jobid}} -retry
+ +``` +wresume -jobid {{jobid}} -retry +``` By default, a job's prologue is not executed when it is resumed, while its epilogue is. "wresume" has options to modify this default behaviour. @@ -423,7 +464,9 @@ its epilogue is. "wresume" has options to modify this default behaviour. This how-to introduces only Worker's basic features. The wsub command has some usage information that is printed when the -help option is specified: -
$ wsub -help
+
+```
+$ wsub -help
 ### usage: wsub  -batch <batch-file>          
 #                [-data <data-files>]         
 #                [-prolog <prolog-file>]      
@@ -453,7 +496,7 @@ specified:
 #   -t <array-req>        : qsub's PBS array request options, e.g., 1-10
 #   <pbs-qsub-options>    : options passed on to the queue submission
 #                           command
-
+``` ## Troubleshooting diff --git a/mkdocs/docs/HPC/mympirun.md b/mkdocs/docs/HPC/mympirun.md index 98fd91cd0c4..93a55fc44e3 100644 --- a/mkdocs/docs/HPC/mympirun.md +++ b/mkdocs/docs/HPC/mympirun.md @@ -12,8 +12,9 @@ README](https://github.com/hpcugent/vsc-mympirun/blob/master/README.md). Before using `mympirun`, we first need to load its module: -
$ module load vsc-mympirun
-
+``` +module load vsc-mympirun +``` As an exception, we don't specify a version here. The reason is that we want to ensure that the latest version of the `mympirun` script is @@ -47,14 +48,15 @@ The `--hybrid` option requires a positive number. This number specifies the number of processes started on each available physical *node*. It will ignore the number of available *cores* per node. -
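In a job script, `--hybrid` is often used to start fewer MPI processes than there are cores, for example when each process needs a large share of the node's memory. A hedged sketch (the program name and resource numbers are hypothetical):

```bash
#!/bin/bash
#PBS -l nodes=2:ppn=8
#PBS -l walltime=1:00:00
cd $PBS_O_WORKDIR
module load vsc-mympirun
# Start only 2 MPI processes per node instead of one per core,
# e.g. because each process needs a large part of the node's memory.
mympirun --hybrid 2 ./mpi_program
```

The terminal session below shows the effect of `--hybrid 2` on a job that was allocated two nodes.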
$ echo $PBS_NUM_NODES
+```
+$ echo $PBS_NUM_NODES
 2
-$ mympirun --hybrid 2 ./mpihello
+$ mympirun --hybrid 2 ./mpihello
 Hello world from processor node3400.doduo.os, rank 1 out of 4 processors 
 Hello world from processor node3401.doduo.os, rank 3 out of 4 processors 
 Hello world from processor node3401.doduo.os, rank 2 out of 4 processors 
 Hello world from processor node3400.doduo.os, rank 0 out of 4 processors
-
+``` ### Other options @@ -74,6 +76,7 @@ You can do a so-called "dry run", which doesn't have any side-effects, but just prints the command that `mympirun` would execute. You enable this with the `--dry-run` flag: -
$ mympirun --dry-run ./mpi_hello
+```
+$ mympirun --dry-run ./mpi_hello
 mpirun ... -genv I_MPI_FABRICS shm:dapl ... -np 16 ... ./mpi_hello
-
+``` diff --git a/mkdocs/docs/HPC/openFOAM.md b/mkdocs/docs/HPC/openFOAM.md index 8f83201d6a4..04ed7a29c77 100644 --- a/mkdocs/docs/HPC/openFOAM.md +++ b/mkdocs/docs/HPC/openFOAM.md @@ -71,7 +71,8 @@ First of all, you need to pick and load one of the available `OpenFOAM` modules. To get an overview of the available modules, run '`module avail OpenFOAM`'. For example: -
$ module avail OpenFOAM
+```
+$ module avail OpenFOAM
 ------------------ /apps/gent/CO7/sandybridge/modules/all ------------------
    OpenFOAM/v1712-foss-2017b     OpenFOAM/4.1-intel-2017a
    OpenFOAM/v1712-intel-2017b    OpenFOAM/5.0-intel-2017a
@@ -81,7 +82,7 @@ modules. To get an overview of the available modules, run
    OpenFOAM/2.4.0-intel-2017a    OpenFOAM/5.0-20180108-intel-2018a
    OpenFOAM/3.0.1-intel-2016b    OpenFOAM/6-intel-2018a            (D)
    OpenFOAM/4.0-intel-2016b
-
+``` To pick a module, take into account the differences between the different OpenFOAM versions w.r.t. features and API (see also [Different OpenFOAM releases](./#different-openfoam-releases)). If @@ -94,8 +95,9 @@ that includes `intel-{{ current_year}}a`. To prepare your environment for using OpenFOAM, load the `OpenFOAM` module you have picked; for example: -
$ module load OpenFOAM/4.1-intel-2017a
-
+``` +module load OpenFOAM/11-foss-2023a +``` ### Sourcing the `$FOAM_BASH` script @@ -107,8 +109,9 @@ location to this script. Assuming you are using `bash` in your shell session or job script, you should always run the following command after loading an `OpenFOAM` module: -
$ source $FOAM_BASH
-
+``` +source $FOAM_BASH +``` ### Defining utility functions used in tutorial cases @@ -117,8 +120,9 @@ If you would like to use the `getApplication`, `runApplication`, are used in OpenFOAM tutorials, you also need to `source` the `RunFunctions` script: -
$  source $WM_PROJECT_DIR/bin/tools/RunFunctions
-
+``` +source $WM_PROJECT_DIR/bin/tools/RunFunctions +``` Note that this needs to be done **after** sourcing `$FOAM_BASH` to make sure `$WM_PROJECT_DIR` is defined. @@ -129,8 +133,9 @@ If you are seeing `Floating Point Exception` errors, you can undefine the `$FOAM_SIGFPE` environment variable that is defined by the `$FOAM_BASH` script as follows: -
$ unset $FOAM_SIGFPE
-
+```
+unset FOAM_SIGFPE
+```
 
 Note that this only prevents OpenFOAM from propagating floating point
 exceptions, which then results in terminating the simulation. However,
@@ -218,8 +223,9 @@ processes used in a parallel OpenFOAM execution, the
 `$MYMPIRUN_VARIABLESPREFIX` environment variable must be defined as
 follows, prior to running the OpenFOAM simulation with `mympirun`:
 
-
$ export MYMPIRUN_VARIABLESPREFIX=WM_PROJECT,FOAM,MPI
-
+``` +export MYMPIRUN_VARIABLESPREFIX=WM_PROJECT,FOAM,MPI +``` Whenever you are instructed to use a command like `mpirun -np ...`, use `mympirun ...` instead; `mympirun` will automatically detect the @@ -236,8 +242,9 @@ make sure that the number of subdomains matches the number of processor cores that will be used by `mympirun`. If not, you may run into an error message like: -
number of processor directories = 4 is not equal to the number of processors = 16
-
+``` +number of processor directories = 4 is not equal to the number of processors = 16 +``` In this case, the case was decomposed in 4 subdomains, while the OpenFOAM simulation was started with 16 processes through `mympirun`. To @@ -264,8 +271,9 @@ by minimising the number of processor boundaries. To visualise the processor domains, use the following command: -
$ mympirun foamToVTK -parallel -constant -time 0 -excludePatches '(".*.")'
-
+``` +mympirun foamToVTK -parallel -constant -time 0 -excludePatches '(".*.")' +``` and then load the VTK files generated in the `VTK` folder into ParaView. @@ -292,7 +300,7 @@ specify in `system/controlDict` (see also of plane) rather than the entire domain; - if you do not plan to change the parameters of the OpenFOAM - simulation while it is running, **set runTimeModifiable to false** to avoid that OpenFOAM re-reads each + simulation while it is running, set **runTimeModifiable** to **false** to avoid that OpenFOAM re-reads each of the `system/*Dict` files at every time step; - if the results per individual time step are large, consider setting @@ -322,7 +330,6 @@ See . Example job script for `damBreak` OpenFOAM tutorial (see also ): -
-- OpenFOAM_damBreak.sh --
-```bash +```bash title="OpenFOAM_damBreak.sh" {% include "./examples/OpenFOAM/OpenFOAM_damBreak.sh" %} ``` diff --git a/mkdocs/docs/HPC/program_examples.md b/mkdocs/docs/HPC/program_examples.md index 96fbb42dccd..34138392886 100644 --- a/mkdocs/docs/HPC/program_examples.md +++ b/mkdocs/docs/HPC/program_examples.md @@ -2,11 +2,16 @@ # Program examples { #ch:program-examples} If you have **not done so already** copy our examples to your home directory by running the following command: -
 cp -r {{ examplesdir }} ~/
-`~`(tilde) refers to your home directory, the directory you arrive by default when you login. +``` + cp -r {{ examplesdir }} ~/ +``` + +`~`(tilde) refers to your home directory, the directory you arrive by default when you login. Go to our examples: -
cd ~/{{exampledir}}
+``` +cd ~/{{exampledir}} +``` Here, we just have put together a number of examples for your convenience. We did an effort to put comments inside the source files, @@ -36,27 +41,26 @@ so the source code files are (should be) self-explanatory. The above 2 OMP directories contain the following examples: -| C Files | Fortran Files | Description | -|-------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------| -| omp_hello.c | omp_hello.f | Hello world | -| omp_workshare1.c | omp_workshare1.f | Loop work-sharing | -| omp_workshare2.c | omp_workshare2.f | Sections work-sharing | -| omp_reduction.c | omp_reduction.f | Combined parallel loop reduction | -| omp_orphan.c | omp_orphan.f | Orphaned parallel loop reduction | -| omp_mm.c | omp_mm.f | Matrix multiply | -| omp_getEnvInfo.c | omp_getEnvInfo.f | Get and print environment information | -| omp_bug1.c
omp_bug1fix.c
omp_bug2.c
omp_bug3.c
omp_bug4.c
omp_bug4fix
omp_bug5.c
omp_bug5fix.c
omp_bug6.c | omp_bug1.f
omp_bug1fix.f
omp_bug2.f
omp_bug3.f
omp_bug4.f
omp_bug4fix
omp_bug5.f
omp_bug5fix.f
omp_bug6.f
| Programs with bugs and their solution | +| C Files | Fortran Files | Description | +|------------------|------------------|---------------------------------------| +| omp_hello.c | omp_hello.f | Hello world | +| omp_workshare1.c | omp_workshare1.f | Loop work-sharing | +| omp_workshare2.c | omp_workshare2.f | Sections work-sharing | +| omp_reduction.c | omp_reduction.f | Combined parallel loop reduction | +| omp_orphan.c | omp_orphan.f | Orphaned parallel loop reduction | +| omp_mm.c | omp_mm.f | Matrix multiply | +| omp_getEnvInfo.c | omp_getEnvInfo.f | Get and print environment information | +| omp_bug* | omp_bug* | Programs with bugs and their solution | Compile by any of the following commands: - - - - - - - - - -
C:icc -openmp omp_hello.c -o hello\newline pgcc -mp omp_hello.c -o hello\newline gcc -fopenmp omp_hello.c -o hello
Fortran:ifort -openmp omp_hello.f -o hello\newline pgf90 -mp omp_hello.f -o hello\newline gfortran -fopenmp omp_hello.f -o hello
+ +| **Language** | **Commands** | +|--------------|----------------------------------------| +| **C:** | icc -openmp omp_hello.c -o hello | +| | pgcc -mp omp_hello.c -o hello | +| | gcc -fopenmp omp_hello.c -o hello | +| **Fortran:** | ifort -openmp omp_hello.f -o hello | +| | pgf90 -mp omp_hello.f -o hello | +| | gfortran -fopenmp omp_hello.f -o hello | Be invited to explore the examples. diff --git a/mkdocs/docs/HPC/quick_reference_guide.md b/mkdocs/docs/HPC/quick_reference_guide.md index 6141d038567..05de5dfeb77 100644 --- a/mkdocs/docs/HPC/quick_reference_guide.md +++ b/mkdocs/docs/HPC/quick_reference_guide.md @@ -3,282 +3,50 @@ Remember to substitute the usernames, login nodes, file names, ...for your own. - - - - - - - - - - - - - - - - - - - - - - - - -
-
Login
-
- Login - - ssh {{userid}}@{{loginnode}} -
- Where am I? - - hostname -
- Copy to {{hpc}} - - scp foo.txt {{userid}}@{{loginnode}}: -
- Copy from {{hpc}} - - scp {{userid}}@{{loginnode}}:foo.txt -
- Setup ftp session - - sftp {{userid}}@{{loginnode}} -
+| **Login** | | +|-------------------|-----------------------------------------| +| Login | `ssh {{userid}}@{{loginnode}}` | +| Where am I? | `hostname` | +| Copy to {{hpc}} | `scp foo.txt {{userid}}@{{loginnode}}:` | +| Copy from {{hpc}} | `scp {{userid}}@{{loginnode}}:foo.txt` | +| Setup ftp session | `sftp {{userid}}@{{loginnode}}` | - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-
Modules
-
- List all available modules - - Module avail -
- List loaded modules - - module list -
- Load module - - module load example -
- Unload module - - module unload example -
- Unload all modules - - module purge -
- Help on use of module - - module help -
+| **Modules** | | +|----------------------------|-----------------------| +| List all available modules | Module avail | +| List loaded modules | module list | +| Load module | module load example | +| Unload module | module unload example | +| Unload all modules | module purge | +| Help on use of module | module help | - - - - - - - - - - - - -{% if site != (gent or brussel) %} - - - - - - - - -{% endif %} - - - - - - - - - - - - - - - - -{% if site != (gent or brussel) %} - - - - -{% endif %} - - - - -
-
Jobs
-
- Submit job with job script script.pbs - - qsub script.pbs -
- Status of job with ID 12345 - - qstat 12345 -
- Possible start time of job with ID 12345 (not available everywhere) - - showstart 12345 -
- Check job with ID 12345 (not available everywhere) - - checkjob 12345 -
- Show compute node of job with ID 12345 - - qstat -n 12345 -
- Delete job with ID 12345 - - qdel 12345 -
- Status of all your jobs - - qstat -
- Detailed status of your jobs + a list nodes they are running on - - qstat -na -
- Show all jobs on queue (not available everywhere) - - showq -
- Submit Interactive job - - qsub -I -
+| Command | Description | +|-----------------------------------------------|---------------------------------------------------------| +| `qsub script.pbs` | Submit job with job script `script.pbs` | +| `qstat 12345` | Status of job with ID 12345 | +{% if site != (gent or brussel) %} | `showstart 12345` | Possible start time of job with ID 12345 (not available everywhere) | +| `checkjob 12345` | Check job with ID 12345 (not available everywhere) | +{% endif %} | `qstat -n 12345` | Show compute node of job with ID 12345 | +| `qdel 12345` | Delete job with ID 12345 | +| `qstat` | Status of all your jobs | +| `qstat -na` | Detailed status of your jobs + a list of nodes they are running on | +{% if site != (gent or brussel) %} | `showq` | Show all jobs on queue (not available everywhere) | +{% endif %} | `qsub -I` | Submit Interactive job | - - - - -{% if site == gent %} - - - - -{% else %} - - - - - - - - -{% endif %} - - - - -
-
Disk quota
-
- Check your disk quota - - see https://account.vscentrum.be -
- Check your disk quota - - mmlsquota -
- Check your disk quota nice - - show_quota.py -
- Disk usage in current directory (.) - - du -h -
- - - - - - - - - - - - - - - - - - - - -
-
Worker Framework
-
- Load worker module - - module load worker/1.6.12-foss-2021b Don't forget to specify a version. To list available versions, use module avail worker/ -
- Submit parameter sweep - - wsub -batch weather.pbs -data data.csv -
- Submit job array - - wsub -t 1-100 -batch test_set.pbs -
- Submit job array with prolog and epilog - - wsub -prolog pre.sh -batch test_set.pbs -epilog post.sh -t 1-100 -
+| **Disk quota** | | +|-----------------------------------------------|-------------------------------------------------| +{% if site == gent %} | Check your disk quota | see [https://account.vscentrum.be](https://account.vscentrum.be) | +{% else %} | Check your disk quota | `mmlsquota` | +| Check your disk quota nice | `show_quota.py` | +{% endif %} | Disk usage in current directory (`.`) | `du -h` | + + + +| **Worker Framework** | | +|-----------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------| +| Load worker module | `module load worker/1.6.12-foss-2021b` Don't forget to specify a version. To list available versions, use `module avail worker/` | +| Submit parameter sweep | `wsub -batch weather.pbs -data data.csv` | +| Submit job array | `wsub -t 1-100 -batch test_set.pbs` | +| Submit job array with prolog and epilog | `wsub -prolog pre.sh -batch test_set.pbs -epilog post.sh -t 1-100` | diff --git a/mkdocs/docs/HPC/running_batch_jobs.md b/mkdocs/docs/HPC/running_batch_jobs.md index 9eb61a2d09f..9c9eaf554bc 100644 --- a/mkdocs/docs/HPC/running_batch_jobs.md +++ b/mkdocs/docs/HPC/running_batch_jobs.md @@ -88,8 +88,9 @@ command, you can replace `module` with `ml`. A large number of software packages are installed on the {{ hpc }} clusters. A list of all currently available software can be obtained by typing: -
$ module available
-
+``` +module available +``` It's also possible to execute `module av` or `module avail`, these are shorter to type and will do the same thing. @@ -133,8 +134,9 @@ same toolchain name and version can work together without conflicts. To "activate" a software package, you load the corresponding module file using the `module load` command: -
$ module load example
-
+``` +module load example +``` This will load the most recent version of *example*. @@ -145,8 +147,9 @@ lexicographical last after the `/`). **However, you should specify a particular version to avoid surprises when newer versions are installed: -
$ module load secondexample/2.7-intel-2016b
-
+``` +module load secondexample/2.7-intel-2016b +``` The `ml` command is a shorthand for `module load`: `ml example/1.2.3` is equivalent to `module load example/1.2.3`. @@ -154,8 +157,9 @@ equivalent to `module load example/1.2.3`. Modules need not be loaded one by one; the two `module load` commands can be combined as follows: -
$ module load example/1.2.3 secondexample/2.7-intel-2016b
-
+``` +module load example/1.2.3 secondexample/2.7-intel-2016b +``` This will load the two modules as well as their dependencies (unless there are conflicts between both modules). @@ -166,14 +170,15 @@ Obviously, you need to be able to keep track of the modules that are currently loaded. Assuming you have run the `module load` commands stated above, you will get the following: -
$ module list
+```
+$ module list
 Currently Loaded Modulefiles: 
 1) example/1.2.3                                        6) imkl/11.3.3.210-iimpi-2016b 
 2) GCCcore/5.4.0                                        7) intel/2016b 
 3) icc/2016.3.210-GCC-5.4.0-2.26                        8) examplelib/1.2-intel-2016b 
 4) ifort/2016.3.210-GCC-5.4.0-2.26                      9) secondexample/2.7-intel-2016b 
 5) impi/5.1.3.181-iccifort-2016.3.210-GCC-5.4.0-2.26
-
+``` You can also just use the `ml` command without arguments to list loaded modules. @@ -193,15 +198,16 @@ However, the dependencies of the package are NOT automatically unloaded; you will have to unload the packages one by one. When the `secondexample` module is unloaded, only the following modules remain: -
$ module unload secondexample
-$ module list
+```
+$ module unload secondexample
+$ module list
 Currently Loaded Modulefiles: 
 1) example/1.2.3                        5) impi/5.1.3.181-iccifort-2016.3.210-GCC-5.4.0-2.26 
 2) GCCcore/5.4.0                        6) imkl/11.3.3.210-iimpi-2016b 
 3) icc/2016.3.210-GCC-5.4.0-2.26        7) intel/2016b 
 4) ifort/2016.3.210-GCC-5.4.0-2.26      8) examplelib/1.2-intel-2016b
-
+``` To unload the `secondexample` module, you can also use `ml -secondexample`. @@ -217,8 +223,9 @@ loaded will *not* result in an error. In order to unload all modules at once, and hence be sure to start in a clean state, you can use: -
$ module purge
-
+``` +module purge +``` {% if site == gent -%} This is always safe: the `cluster` module (the module that specifies which cluster jobs will get submitted to) will not be unloaded (because @@ -244,13 +251,15 @@ Consider the following example: the user decides to use the `example` module and at that point in time, just a single version 1.2.3 is installed on the cluster. The user loads the module using: -
$ module load example
-
+``` +module load example +``` rather than -
$ module load example/1.2.3
-
+``` +module load example/1.2.3 +``` Everything works fine, up to the point where a new version of `example` is installed, 4.5.6. From then on, the user's `load` command will load @@ -259,28 +268,23 @@ unexpected problems. See for example [the following section on Module Conflicts] Consider the following `example` modules: -
$ module avail example/
+```
+$ module avail example/
 example/1.2.3 
 example/4.5.6
-
+``` Let's now generate a version conflict with the `example` module, and see what happens. -
$ module av example/
+```
+$ module av example/
 example/1.2.3       example/4.5.6
-$ module load example/1.2.3  example/4.5.6
+$ module load example/1.2.3  example/4.5.6
 Lmod has detected the following error: A different version of the 'example' module is already loaded (see output of 'ml').
-$ module swap example/4.5.6
-
- - +$ module swap example/4.5.6 +``` Note: A `module swap` command combines the appropriate `module unload` and `module load` commands. @@ -289,7 +293,8 @@ and `module load` commands. With the `module spider` command, you can search for modules: -
$ module spider example
+```
+$ module spider example
 --------------------------------------------------------------------------------
   example:
 --------------------------------------------------------------------------------
@@ -305,11 +310,12 @@ With the `module spider` command, you can search for modules:
 
     module spider example/1.2.3
 --------------------------------------------------------------------------------
-
+``` It's also possible to get detailed information about a specific module: -
$ module spider example/1.2.3
+```
+$ module spider example/1.2.3
 ------------------------------------------------------------------------------------------
   example: example/1.2.3
 ------------------------------------------------------------------------------------------
@@ -337,21 +343,23 @@ It's also possible to get detailed information about a specific module:
         More information 
         ================ 
          - Homepage: https://example.com
-
+``` ### Get detailed info To get a list of all possible commands, type: -
$ module help
-
+``` +module help +``` Or to get more information about one specific module package: -
$ module help example/1.2.3
+```
+$ module help example/1.2.3
 ----------- Module Specific Help for 'example/1.2.3' --------------------------- 
   This is just an example - Homepage: https://example.com/
-
+``` ### Save and load collections of modules @@ -364,52 +372,59 @@ In each `module` command shown below, you can replace `module` with First, load all modules you want to include in the collections: -
$ module load example/1.2.3 secondexample/2.7-intel-2016b
-
+``` +module load example/1.2.3 secondexample/2.7-intel-2016b +``` Now store it in a collection using `module save`. In this example, the collection is named `my-collection`. -
$ module save my-collection
-
+``` +module save my-collection +``` Later, for example in a jobscript or a new session, you can load all these modules with `module restore`: -
$ module restore my-collection
-
+``` +module restore my-collection +``` You can get a list of all your saved collections with the `module savelist` command: -
$ module savelistr
+```
+$ module savelist
 Named collection list (For LMOD_SYSTEM_NAME = "CO7-sandybridge"):
   1) my-collection
-
+``` To get a list of all modules a collection will load, you can use the `module describe` command: -
$ module describe my-collection
+```
+$ module describe my-collection
 1) example/1.2.3                                        6) imkl/11.3.3.210-iimpi-2016b 
 2) GCCcore/5.4.0                                        7) intel/2016b 
 3) icc/2016.3.210-GCC-5.4.0-2.26                        8) examplelib/1.2-intel-2016b 
 4) ifort/2016.3.210-GCC-5.4.0-2.26                      9) secondexample/2.7-intel-2016b 
 5) impi/5.1.3.181-iccifort-2016.3.210-GCC-5.4.0-2.26
-
+``` To remove a collection, remove the corresponding file in `$HOME/.lmod.d`: -
$ rm $HOME/.lmod.d/my-collection
-
+``` +rm $HOME/.lmod.d/my-collection +``` ### Getting module details To see how a module would change the environment, you can use the `module show` command: -
$ module show Python/2.7.12-intel-2016b
+```
+$ module show Python/2.7.12-intel-2016b
 whatis("Description: Python is a programming language that lets youwork more quickly and integrate your systems more effectively. - Homepage: http://python.org/ ") 
 conflict("Python")
 load("intel/2016b") 
@@ -417,7 +432,7 @@ load("bzip2/1.0.6-intel-2016b")
 ...
 prepend_path(...)
 setenv("EBEXTSLISTPYTHON","setuptools-23.1.0,pip-8.1.2,nose-1.3.7,numpy-1.11.1,scipy-0.17.1,ytz-2016.4", ...)
-
+``` It's also possible to use the `ml show` command instead: they are equivalent. @@ -428,20 +443,6 @@ bunch of extensions: `numpy`, `scipy`, ... You can also see the modules the `Python/2.7.12-intel-2016b` module loads: `intel/2016b`, `bzip2/1.0.6-intel-2016b`, ... - - - If you're not sure what all of this means: don't worry, you don't have to know; just load the module and try to use the software. ## Getting system information about the HPC infrastructure @@ -455,7 +456,8 @@ information about scheduled downtime, status of the system, ... To check how much jobs are running in what queues, you can use the `qstat -q` command: -
$ qstat -q
+```
+$ qstat -q
 Queue            Memory CPU Time Walltime Node  Run Que Lm  State
 ---------------- ------ -------- -------- ----  --- --- --  -----
 default            --      --       --      --    0   0 --   E R
@@ -466,7 +468,7 @@ q1h                --      --    01:00:00   --    0   1 --   E R
 q24h               --      --    24:00:00   --    0   0 --   E R
                                                ----- -----
                                                 337  82
-
+``` Here, there are 316 jobs running on the `long` queue, and 77 jobs queued. We can also see that the `long` queue allows a maximum wall time @@ -482,8 +484,9 @@ filled with jobs, completely filled with jobs, .... You can also get this information in text form (per cluster separately) with the `pbsmon` command: -
$ module swap cluster/donphan
-$ pbsmon
+```
+$ module swap cluster/donphan
+$ pbsmon
  4001 4002 4003 4004 4005 4006 4007
     _    j    j    j    _    _    .
 
@@ -501,7 +504,7 @@ with the `pbsmon` command:
 
 Node type:
  ppn=36, mem=751GB
-
+``` `pbsmon` only outputs details of the cluster corresponding to the currently loaded `cluster` module see [the section on Specifying the cluster on which to run](./#specifying-the-cluster-on-which-to-run). @@ -526,14 +529,16 @@ to your home directory, so that you have your **own personal** copy (editable an over-writable) and that you can start using the examples. If you haven't done so already, run these commands now: -
$ cd
-$ cp -r {{ examplesdir }} ~/
-
+``` +cd +cp -r {{ examplesdir }} ~/ +``` First go to the directory with the first examples by entering the command: -
$ cd ~/examples/Running-batch-jobs
-
+``` +cd ~/examples/Running-batch-jobs +``` Each time you want to execute a program on the {{ hpc }} you'll need 2 things: @@ -564,11 +569,12 @@ provided for you in the examples subdirectories. List and check the contents with: -
$ ls -l
+```
+$ ls -l
 total 512
 -rw-r--r-- 1 {{ userid }} 193 Sep 11 10:34 fibo.pbs
 -rw-r--r-- 1 {{ userid }} 609 Sep 11 10:25 fibo.pl
-
+``` In this directory you find a Perl script (named "fibo.pl") and a job script (named "fibo.pbs"). @@ -584,7 +590,8 @@ login-node), so that you can see what the program does. On the command line, you would run this using: -
$ ./fibo.pl
+```
+$ ./fibo.pl
 [0] -> 0
 [1] -> 1
 [2] -> 1
@@ -615,9 +622,9 @@ On the command line, you would run this using:
 [27] -> 196418
 [28] -> 317811
 [29] -> 514229
-
+``` -Remark: Recall that you have now executed the Perl script locally on one of +Remark: Recall that you have now executed the Perl script locally on one of the login-nodes of the {{ hpc }} cluster. Of course, this is not our final intention; we want to run the script on any of the compute nodes. Also, it is not considered as good practice, if you "abuse" the login-nodes @@ -630,9 +637,7 @@ since these jobs require very little computing power. The job script contains a description of the job by specifying the command that need to be executed on the compute node: -
-- fibo.pbs --
- -```bash +```bash title="fibo.pbs" {% include "./examples/Running_batch_jobs/fibo.pbs" %} ``` @@ -644,15 +649,16 @@ specified on the command line. This job script can now be submitted to the cluster's job system for execution, using the qsub (Queue SUBmit) command: -
$ qsub fibo.pbs
+```
+$ qsub fibo.pbs
 {{ jobid }}
-
+``` The qsub command returns a job identifier on the HPC cluster. The important part is the number (e.g., "{{ jobid }} "); this is a unique identifier for the job and can be used to monitor and manage your job. -Remark: the modules that were loaded when you submitted the job will *not* be +Remark: the modules that were loaded when you submitted the job will *not* be loaded when the job is started. You should always specify the `module load` statements that are required for your job in the job script itself. @@ -669,19 +675,21 @@ monitor jobs in the queue. After your job was started, and ended, check the contents of the directory: -
$ ls -l
+```
+$ ls -l
 total 768
 -rw-r--r-- 1 {{ userid }} {{ userid }}   44 Feb 28 13:33 fibo.pbs
 -rw------- 1 {{ userid }} {{ userid }}    0 Feb 28 13:33 fibo.pbs.e{{ jobid }}
 -rw------- 1 {{ userid }} {{ userid }} 1010 Feb 28 13:33 fibo.pbs.o{{ jobid }}
 -rwxrwxr-x 1 {{ userid }} {{ userid }}  302 Feb 28 13:32 fibo.pl
-
+``` Explore the contents of the 2 new files: -
$ more fibo.pbs.o{{ jobid }}
-$ more fibo.pbs.e{{ jobid }}
-
+``` +$ more fibo.pbs.o{{ jobid }} +$ more fibo.pbs.e{{ jobid }} +``` These files are used to store the standard output and error that would otherwise be shown in the terminal window. By default, they have the @@ -766,8 +774,9 @@ the environment so you get access to all modules installed on the `{{ otherclust cluster, and to be able to submit jobs to the `{{ othercluster }}` scheduler so your jobs will start on `{{ othercluster }}` instead of the default `{{ defaultcluster }}` cluster. -
$ module swap cluster/{{ othercluster }}
-
+``` +module swap cluster/{{ othercluster }} +``` Note: the `{{ othercluster }}` modules may not work directly on the login nodes, because the login nodes do not have the same architecture as the `{{ othercluster }}` cluster, they have @@ -778,7 +787,8 @@ this. To list the available cluster modules, you can use the `module avail cluster/` command: -
$ module avail cluster/
+```
+$ module avail cluster/
 --------------------------------------- /etc/modulefiles/vsc ----------------------------------------
    cluster/accelgor (S)    cluster/doduo   (S,L)    cluster/gallade (S)    cluster/skitty  (S)
    cluster/default         cluster/donphan (S)      cluster/joltik  (S)
@@ -789,8 +799,8 @@ To list the available cluster modules, you can use the
    D:  Default Module
 
 If you need software that is not listed, 
-request it via https://www.ugent.be/hpc/en/support/software-installation-request
-
+request it via https://www.ugent.be/hpc/en/support/software-installation-request +``` As indicated in the output above, each `cluster` module is a so-called sticky module, i.e., it will not be unloaded when `module purge` (see [the section on purging modules](./#purging-all-modules)) @@ -841,8 +851,9 @@ Using the job ID that `qsub` returned, there are various ways to monitor the status of your job. In the following commands, replace `12345` with the job ID `qsub` returned. -
$ qstat 12345
-
+``` +qstat 12345 +``` {% if site != (gent or brussel) %} To show an estimated start time for your job (note that this may be very @@ -867,25 +878,28 @@ error messages that may prevent your job from starting: To show on which compute nodes your job is running, at least, when it is running: -
$ qstat -n 12345
-
+```
+qstat -n 12345
+```
 
 To remove a job from the queue so that it will not run, or to stop a
 job that is already running, use:
 
-
$ qdel 12345
-
+``` +qdel 12345 +``` When you have submitted several jobs (or you just forgot about the job ID), you can retrieve the status of all your jobs that are submitted and are not yet finished using: -
$ qstat
+```
+$ qstat
 :
 Job ID      Name    User      Time Use S Queue
 ----------- ------- --------- -------- - -----
 {{ jobid }} ....     mpi  {{ userid }}     0    Q short
-
+``` Here: @@ -1068,8 +1082,9 @@ properly. The **qsub** command takes several options to specify the requirements, of which we list the most commonly used ones below. -
$ qsub -l walltime=2:30:00
-
+``` +qsub -l walltime=2:30:00 ... +``` For the simplest cases, only the amount of maximum estimated execution time (called "walltime") is really important. Here, the job requests 2 @@ -1088,8 +1103,9 @@ before the walltime kills your main process, you have to kill the main command yourself before the walltime runs out and then copy the file back. See [the section on Running a command with a maximum time limit](../jobscript_examples/#running-a-command-with-a-maximum-time-limit) for how to do this. -
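A hedged sketch of that idea (the linked section is the authoritative recipe; the program name, file name and time margins below are made up): run the main command under `timeout` so that it is stopped well before the walltime is reached, leaving enough time to copy results to permanent storage.

```bash
#!/bin/bash
#PBS -l walltime=2:30:00
cd $PBS_O_WORKDIR
# Stop the main program after 2 hours, keeping roughly 30 minutes to save results.
timeout 2h ./my_long_running_program
# Copy the (intermediate) results to permanent storage before the walltime runs out.
cp result_file.txt $VSC_DATA/
```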
$ qsub -l mem=4gb
-
+``` +qsub -l mem=4gb ... +``` The job requests 4 GB of RAM memory. As soon as the job tries to use more memory, it will be "killed" (terminated) by the job scheduler. @@ -1106,15 +1122,17 @@ per node" and "number of cores in a node" please consult . {% endif %} -
$ qsub -l nodes=5:ppn=2
-
+``` +qsub -l nodes=5:ppn=2 ... +``` The job requests 5 compute nodes with two cores on each node (ppn stands for "processors per node", where "processors" here actually means "CPU cores"). -
$ qsub -l nodes=1:westmere
-
+``` +qsub -l nodes=1:westmere +``` The job requests just one node, but it should have an Intel Westmere processor. A list with site-specific properties can be found in the next @@ -1123,8 +1141,9 @@ website. These options can either be specified on the command line, e.g. -
$ qsub -l nodes=1:ppn,mem=2gb fibo.pbs
-
+```
+qsub -l nodes=1:ppn=1,mem=2gb fibo.pbs
+```
 
 or in the job script itself using the #PBS-directive, so "fibo.pbs"
 could be modified to:
 
@@ -1193,13 +1212,14 @@ located by default in the directory where you issued the *qsub* command.
 When you navigate to that directory and list its contents, you should
 see them:
 
-
$ ls -l
+```
+$ ls -l
 total 1024
 -rw-r--r-- 1 {{ userid }}  609 Sep 11 10:54 fibo.pl
 -rw-r--r-- 1 {{ userid }}   68 Sep 11 10:53 fibo.pbs
 -rw------- 1 {{ userid }}   52 Sep 11 11:03 fibo.pbs.e{{ jobid }}
 -rw------- 1 {{ userid }} 1307 Sep 11 11:03 fibo.pbs.o{{ jobid }}
-
+``` In our case, our job has created both output ('fibo.pbs.') and error files ('fibo.pbs.') containing info written to *stdout* and *stderr* @@ -1207,11 +1227,12 @@ respectively. Inspect the generated output and error files: -
$ cat fibo.pbs.o{{ jobid }}
+```
+$ cat fibo.pbs.o{{ jobid }}
 ...
-$ cat fibo.pbs.e{{ jobid }}
+$ cat fibo.pbs.e{{ jobid }}
 ...
-
+``` ## E-mail notifications {% if site != gent %} @@ -1259,15 +1280,17 @@ or These options can also be specified on the command line. Try it and see what happens: -
$ qsub -m abe fibo.pbs
-
+``` +qsub -m abe fibo.pbs +``` The system will use the e-mail address that is connected to your VSC account. You can also specify an alternate e-mail address with the `-M` option: -
$ qsub -m b -M john.smith@example.com fibo.pbs
-
+``` +qsub -m b -M john.smith@example.com fibo.pbs +``` will send an e-mail to john.smith@example.com when the job begins. @@ -1279,9 +1302,10 @@ might be a problem as they might both be run at the same time. So the following example might go wrong: -
$ qsub job1.sh
-$ qsub job2.sh
-
+``` +$ qsub job1.sh +$ qsub job2.sh +``` You can make jobs that depend on other jobs. This can be useful for breaking up large jobs into smaller jobs that can be run in a pipeline. @@ -1289,9 +1313,10 @@ The following example will submit 2 jobs, but the second job (`job2.sh`) will be held (`H` status in `qstat`) until the first job successfully completes. If the first job fails, the second will be cancelled. -
$ FIRST_ID=$ (qsub job1.sh)
-$ qsub -W depend=afterok:$FIRST_ID job2.sh
-
+``` +$ FIRST_ID=$(qsub job1.sh) +$ qsub -W depend=afterok:$FIRST_ID job2.sh +``` `afterok` means "After OK", or in other words, after the first job successfully completed. diff --git a/mkdocs/docs/HPC/running_interactive_jobs.md b/mkdocs/docs/HPC/running_interactive_jobs.md index f33d546de70..8b97fa70d3b 100644 --- a/mkdocs/docs/HPC/running_interactive_jobs.md +++ b/mkdocs/docs/HPC/running_interactive_jobs.md @@ -21,8 +21,9 @@ the computing resources. The syntax for *qsub* for submitting an interactive PBS job is: -
$ qsub -I <... pbs directives ...>
-
+``` +$ qsub -I <... pbs directives ...> +``` ## Interactive jobs, without X support @@ -31,18 +32,20 @@ The syntax for *qsub* for submitting an interactive PBS job is: First of all, in order to know on which computer you're working, enter: -
$ hostname -f
+```
+$ hostname -f
 {{ loginhost }}
-
+``` This means that you're now working on the login node ` {{ loginhost }} ` of the cluster. The most basic way to start an interactive job is the following: -
$ qsub -I
+```
+$ qsub -I
 qsub: waiting for job {{ jobid }} to start
 qsub: job {{ jobid }} ready
-
+``` There are two things of note here. @@ -57,9 +60,10 @@ There are two things of note here. In order to know on which compute-node you're working, enter again: -
$ hostname -f
+```
+$ hostname -f
 {{ computenode }}
-
+``` Note that we are now working on the compute-node called "*{{ computenode }}*". This is the compute node, which was assigned to us by the scheduler after issuing @@ -87,10 +91,11 @@ Now, go to the directory of our second interactive example and run the program "primes.py". This program will ask you for an upper limit ($> 1$) and will print all the primes between 1 and your upper limit: -
$ cd ~/{{ exampledir }}
-$ ./primes.py
+```
+$ cd ~/{{ exampledir }}
+$ ./primes.py
 This program calculates all primes between 1 and your upper limit.
-Enter your upper limit (>1): 50
+Enter your upper limit (>1): 50
 Start Time:  2013-09-11 15:49:06
 [Prime#1] = 1
 [Prime#2] = 2
@@ -110,12 +115,13 @@ Start Time:  2013-09-11 15:49:06
 [Prime#16] = 47
 End Time:  2013-09-11 15:49:06
 Duration:  0 seconds.
-
+``` You can exit the interactive session with: -
$ exit
-
+``` +$ exit +``` Note that you can now use this allocated node for 1 hour. After this hour you will be automatically disconnected. You can change this "usage @@ -125,8 +131,9 @@ watching the clock on the wall.) You can work for 3 hours by: -
$ qsub -I -l walltime=03:00:00
-
+``` +qsub -I -l walltime=03:00:00 +``` If the walltime of the job is exceeded, the (interactive) job will be killed and your connection to the compute node will be closed. So do @@ -160,9 +167,7 @@ Download the latest version of the XQuartz package on: and install the XQuartz.pkg package. -
![image](img/img0512.png) -
The installer will take you through the installation procedure, just continue clicking ++"Continue"++ on the various screens that will pop-up until your @@ -171,9 +176,7 @@ installation was successful. A reboot is required before XQuartz will correctly open graphical applications. -
![image](img/img0513.png) -
{% endif %} {% if OS == windows %} ##### Install Xming @@ -191,9 +194,7 @@ The first task is to install the Xming software. 4. When selecting the components that need to be installed, make sure to select "*XLaunch wizard*" and "*Normal PuTTY Link SSH client*". -
![image](img/img0500.png) -
5. We suggest to create a Desktop icon for Xming and XLaunch. @@ -206,28 +207,20 @@ And now we can run Xming: 2. Select ++"Multiple Windows"++. This will open each application in a separate window. -
![image](img/img0501.png) -
3. Select ++"Start no client"++ to make XLaunch wait for other programs (such as PuTTY). -
![image](img/img0502.png) -
4. Select ++"Clipboard"++ to share the clipboard. -
![image](img/img0503.png) -
5. Finally ++"Save configuration"++ into a file. You can keep the default filename and save it in your Xming installation directory. -
![image](img/img0504.png) -
6. Now Xming is running in the background ... and you can launch a graphical application in your PuTTY terminal. @@ -237,27 +230,27 @@ And now we can run Xming: 8. In order to test the X-server, run "*xclock*". "*xclock*" is the standard GUI clock for the X Window System. -
$ xclock
-
+``` +xclock +``` You should see the XWindow clock application appearing on your Windows machine. The "*xclock*" application runs on the login-node of the {{ hpc }}, but is displayed on your Windows machine. -
![image](img/img0505.png) -
You can close your clock and connect further to a compute node with again your X-forwarding enabled: -
$ qsub -I -X
+```
+$ qsub -I -X
 qsub: waiting for job {{ jobid }} to start
 qsub: job {{ jobid }} ready
-$ hostname -f
+$ hostname -f
 {{ computenode }}
-$ xclock
-
+$ xclock +``` and you should see your clock again. @@ -309,9 +302,7 @@ the cluster 2. In the "*Category*" pane, expand ++"Connection>SSh"++, and select as show below: -
![image](img/img0506.png) -
3. In the ++"Source port"++ field, enter the local port to use (e.g., *5555*). @@ -334,41 +325,40 @@ running on a compute node on the {{ hpc }}) transferred to your personal screen, you will need to reconnect to the {{ hpc }} with X-forwarding enabled, which is done with the "-X" option. -
![image](img/ch5-interactive-mode.png) -
First exit and reconnect to the {{ hpc }} with X-forwarding enabled: -
$ exit
-$ ssh -X {{ userid }}@{{ loginnode }}
-$ hostname -f
+```
+$ exit
+$ ssh -X {{ userid }}@{{ loginnode }}
+$ hostname -f
 {{ loginhost }}
-
+``` We first check whether our GUIs on the login node are decently forwarded to your screen on your local machine. An easy way to test it is by running a small X-application on the login node. Type: -
$ xclock
-
+``` +$ xclock +``` And you should see a clock appearing on your screen. -
![image](img/img0507.png) -
You can close your clock and connect further to a compute node with again your X-forwarding enabled: -
$ qsub -I -X
+```
+$ qsub -I -X
 qsub: waiting for job {{ jobid }} to start
 qsub: job {{ jobid }} ready
-$ hostname -f
+$ hostname -f
 {{ computenode }}
-$ xclock
-
+$ xclock +``` and you should see your clock again. {% endif %} @@ -380,15 +370,14 @@ screen, but also asks you to click a button. Now run the message program: -
$ cd ~/{{ exampledir }}
-./message.py
-
+``` +cd ~/{{ exampledir }} +./message.py +``` You should see the following message appearing. -
![image](img/img0508.png) -
Click any button and see what happens. diff --git a/mkdocs/docs/HPC/running_jobs_with_input_output_data.md b/mkdocs/docs/HPC/running_jobs_with_input_output_data.md index c8393da45de..af9bb1bfecd 100644 --- a/mkdocs/docs/HPC/running_jobs_with_input_output_data.md +++ b/mkdocs/docs/HPC/running_jobs_with_input_output_data.md @@ -13,18 +13,22 @@ and where that you can collect your results. First go to the directory: -
$ cd ~/{{ exampledir }}
-
+``` +cd ~/{{ exampledir }} +``` !!! note If the example directory is not yet present, copy it to your home directory: -
$ cp -r {{ examplesdir }} ~/
+ ``` +cp -r {{ examplesdir }} ~/ + ``` List and check the contents with: -
ls -l
+```
+$ ls -l
 total 2304
 -rwxrwxr-x 1 {{ userid }}   682 Sep 13 11:34 file1.py
 -rw-rw-r-- 1 {{ userid }}   212 Sep 13 11:54 file1a.pbs
@@ -34,13 +38,12 @@ total 2304
 -rwxrwxr-x 1 {{ userid }}  2393 Sep 13 10:40 file2.py
 -rw-r--r-- 1 {{ userid }}  1393 Sep 13 10:41 file3.pbs
 -rwxrwxr-x 1 {{ userid }}  2393 Sep 13 10:40 file3.py
-
+``` Now, let us inspect the contents of the first executable (which is just a Python script with execute permission). -
-- file1.py --
-```python +```python title="file1.py" {% include "./examples/Running_jobs_with_input_output_data/file1.py" %} ``` @@ -55,8 +58,7 @@ The code of the Python script, is self explanatory: Check the contents of the first job script: -
-- file1a.pbs --
-```bash +```bash title="file1a.pbs" {% include "./examples/Running_jobs_with_input_output_data/file1a.pbs" %} ``` @@ -66,13 +68,15 @@ paths. Submit it: -
$ qsub file1a.pbs
-
+``` +qsub file1a.pbs +``` After the job has finished, inspect the local directory again, i.e., the directory where you executed the *qsub* command: -
ls -l
+```
+$ ls -l
 total 3072
 -rw-rw-r-- 1 {{ userid }}   90 Sep 13 13:13 Hello.txt
 -rwxrwxr-x 1 {{ userid }}  693 Sep 13 13:03 file1.py*
@@ -85,7 +89,7 @@ total 3072
 -rwxrwxr-x 1 {{ userid }} 2393 Sep 13 10:40 file2.py*
 -rw-r--r-- 1 {{ userid }} 1393 Sep 13 10:41 file3.pbs
 -rwxrwxr-x 1 {{ userid }} 2393 Sep 13 10:40 file3.py*
-
+``` Some observations: @@ -99,11 +103,12 @@ Some observations: Inspect their contents ... and remove the files -
$ cat Hello.txt
-$ cat file1a.pbs.o{{ jobid }}
-$ cat file1a.pbs.e{{ jobid }}
-$ rm Hello.txt file1a.pbs.o{{ jobid }} file1a.pbs.e{{ jobid }}
-
+``` +$ cat Hello.txt +$ cat file1a.pbs.o{{ jobid }} +$ cat file1a.pbs.e{{ jobid }} +$ rm Hello.txt file1a.pbs.o{{ jobid }} file1a.pbs.e{{ jobid }} +``` !!! tip Type `cat H` and press the Tab button (looks like ++tab++), and it will **expand** into @@ -113,18 +118,18 @@ Inspect their contents ... and remove the files Check the contents of the job script and execute it. -
-- file1b.pbs --
-```bash +```bash title="file1b.pbs" {% include "./examples/Running_jobs_with_input_output_data/file1b.pbs" %} ``` Inspect the contents again ... and remove the generated files: -
$ ls
+```
+$ ls
 Hello.txt file1a.pbs file1c.pbs file2.pbs file3.pbs my_serial_job.e{{ jobid }}
 file1.py* file1b.pbs file2.py* file3.py* my_serial_job.o{{ jobid }}
-$ rm Hello.txt my_serial_job.*
-
+$ rm Hello.txt my_serial_job.* +``` Here, the option "`-N`" was used to explicitly assign a name to the job. This overwrote the JOBNAME variable, and resulted in a different name @@ -137,8 +142,7 @@ defaults to the name of the job script. You can also specify the name of *stdout* and *stderr* files explicitly by adding two lines in the job script, as in our third example: -
-- file1c.pbs --
-```bash +```bash title="file1c.pbs" {% include "./examples/Running_jobs_with_input_output_data/file1c.pbs" %} ``` @@ -156,98 +160,21 @@ store your data depends on the purpose, but also the size and type of usage of the data. The following locations are available: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -{% if site == gent %} - - - - - - - - + +| **Variable** | **Description** | +|-------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| | *Long-term storage slow filesystem, intended for smaller files* | +| `$VSC_HOME` | For your configuration files and other small files, see [the section on your home directory.](./#your-home-directory-vsc_home) The default directory is `user/{{ site }}/xxx/{{ userid }}`. The same file system is accessible from all sites, i.e., you'll see the same contents in $VSC_HOME on all sites. | +| `$VSC_DATA` | A bigger "workspace", for **datasets**, results, logfiles, etc. see [the section on your data directory.](./#your-data-directory-vsc_data) The default directory is `data/{{ site }}/xxx/{{ userid }}`. The same file system is accessible from all sites. | +| | *Fast temporary storage* | +| `$VSC_SCRATCH_NODE` | For **temporary** or transient data on the local compute node, where fast access is important; see [the section on your scratch space.](./#your-scratch-space-vsc_scratch) This space is available per node. The default directory is `/tmp`. On different nodes, you'll see different content. | +| `$VSC_SCRATCH` | For **temporary** or transient data that has to be accessible from all nodes of a cluster (including the login nodes). The default directory is `scratch/{{ site }}/xxx/{{ userid }}`. This directory is cluster- or site-specific: On different sites, and sometimes on different clusters on the same site, you'll get a different directory with different content. | +| `$VSC_SCRATCH_SITE` | Currently the same as $VSC_SCRATCH, but could be used for a scratch space shared across all clusters at a site in the future. See [the section on your scratch space.](./#your-scratch-space-vsc_scratch) | +| `$VSC_SCRATCH_GLOBAL` | Currently the same as $VSC_SCRATCH, but could be used for a scratch space shared across all clusters of the VSC in the future. See [the section on your scratch space.](./#your-scratch-space-vsc_scratch) | + {% if site == gent %} | `$VSC_SCRATCH_CLUSTER` | The scratch filesystem closest to the cluster. | +| `$VSC_SCRATCH_ARCANINE` | A separate (smaller) shared scratch filesystem, powered by SSDs. This scratch filesystem is intended for very I/O-intensive workloads. | {% endif %} -
- Variable - - Description -
-
Long-term storage slow filesystem, intended for smaller files
-
- $VSC_HOME - - For your configuration files and other small files, see the section on your home directory. - The default directory is user/{{ site }}/xxx/{{ userid }}. - The same file system is accessible from all sites, i.e., you'll see the same contents in $VSC_HOME on all sites. -
- $VSC_DATA - - A bigger "workspace", for datasets, results, logfiles, etc. see the section on your data directory. - The default directory is data/{{ site }}/xxx/{{ userid }}. - The same file system is accessible from all sites. -
-
Fast temporary storage
-
- $VSC_SCRATCH_NODE - - For temporary or transient data on the local compute node, where fast access is important; see the section on your scratch space. - This space is available per node. The default directory is /tmp. On different nodes, you'll see different content. -
- $VSC_SCRATCH - - For temporary or transient data that has to be accessible from all nodes of a cluster (including the login nodes)
- The default directory is scratch/{{ site }}/xxx/{{ userid }}. This directory is cluster- or site-specific: On different sites, and sometimes on different clusters on the same site, you'll get a different directory with different content. -
- $VSC_SCRATCH_SITE - - Currently the same as $VSC_SCRATCH, but could be used for a scratch space shared accross all clusters at a site in the future. See the section on your scratch space. -
- $VSC_SCRATCH_GLOBAL - - Currently the same as $VSC_SCRATCH, but could be used for a scratch space shared accross all clusters of the VSC in the future. See the section on your scratch space. -
- $VSC_SCRATCH_CLUSTER - - The scratch filesystem closest to the cluster. -
- $VSC_SCRATCH_ARCANINE - - A separate (smaller) shared scratch filesystem, powered by SSDs. This scratch filesystem is intended for very I/O-intensive workloads. -
+ Since these directories are not necessarily mounted on the same locations over all sites, you should always (try to) use the environment @@ -379,15 +306,17 @@ access your UGent home drive and shares. To allow this you need a ticket. This requires that you first authenticate yourself with your UGent username and password by running: -
$ kinit yourugentusername@UGENT.BE
+```
+$ kinit yourugentusername@UGENT.BE
 Password for yourugentusername@UGENT.BE:
-
+``` Now you should be able to access your files running -
$ ls /UGent/yourugentusername
+```
+$ ls /UGent/yourugentusername
 home shares www
-
+``` Please note the shares will only be mounted when you access this folder. You should specify your complete username - tab completion will not @@ -396,48 +325,54 @@ work. If you want to use the UGent shares longer than 24 hours, you should ask a ticket for up to a week by running -
$ kinit yourugentusername@UGENT.BE -r 7d
-
+```
+kinit yourugentusername@UGENT.BE -r 7d
+```
 
 You can verify your authentication ticket and expiry dates yourself by
 running klist
 
-
$ klist
+```
+$ klist
 ...
 Valid starting     Expires            Service principal
 14/07/20 15:19:13  15/07/20 01:19:13  krbtgt/UGENT.BE@UGENT.BE
 	renew until 21/07/20 15:19:13
 
-
+``` Your ticket is valid for 10 hours, but you can renew it before it expires. To renew your tickets, simply run -
$ kinit -R
-
+``` +kinit -R +``` If you want your ticket to be renewed automatically up to the maximum expiry date, you can run -
$ krenew -b -K 60
-
+``` +krenew -b -K 60 +``` Each hour the process will check if your ticket should be renewed. We strongly advise to disable access to your shares once it is no longer needed: -
$ kdestroy
-
+``` +kdestroy +``` If you get an error "*Unknown credential cache type while getting default ccache*" (or similar) and you use conda, then please deactivate conda before you use the commands in this chapter. -
$ conda deactivate
-
+``` +conda deactivate +``` ### UGent shares with globus @@ -447,7 +382,8 @@ endpoint. To do that, you have to ssh to the globus endpoint from a loginnode. You will be prompted for your UGent username and password to authenticate: -
$ ssh globus
+```
+$ ssh globus
 UGent username:ugentusername
 Password for ugentusername@UGENT.BE:
 Shares are available in globus endpoint at /UGent/ugentusername/
@@ -460,16 +396,17 @@ Valid starting     Expires            Service principal
 	renew until 05/08/20 15:56:40
 Tickets will be automatically renewed for 1 week
 Connection to globus01 closed.
-
+``` Your shares will then be available at /UGent/ugentusername/ under the globus VSC tier2 endpoint. Tickets will be renewed automatically for 1 week, after which you'll need to run this again. We advise to disable access to your shares within globus once access is no longer needed: -
$ ssh globus01 destroy
+```
+$ ssh globus01 destroy
 Succesfully destroyed session
-
+``` {% endif %} ### Pre-defined quotas @@ -568,15 +505,16 @@ Check the Python and the PBS file, and submit the job: Remember that this is already a more serious (disk-I/O and computational intensive) job, which takes approximately 3 minutes on the {{ hpc }}. -
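The essential pattern in `file2.pbs` is that the (potentially large) output is written to the shared scratch filesystem rather than to the submission directory. A hedged, simplified sketch of that pattern (the real `file2.pbs` you inspect below may differ in the details):

```bash
#!/bin/bash
#PBS -l walltime=00:30:00
cd $PBS_O_WORKDIR
# Write the results to $VSC_SCRATCH, which is intended for fast, temporary job I/O.
./file2.py > $VSC_SCRATCH/primes_1.txt
```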
$ cat file2.py
-$ cat file2.pbs
-$ qsub file2.pbs
-$ qstat
-$ ls -l
-$ echo $VSC_SCRATCH
-$ ls -l $VSC_SCRATCH
-$ more $VSC_SCRATCH/primes_1.txt
-
+``` +$ cat file2.py +$ cat file2.pbs +$ qsub file2.pbs +$ qstat +$ ls -l +$ echo $VSC_SCRATCH +$ ls -l $VSC_SCRATCH +$ more $VSC_SCRATCH/primes_1.txt +``` ## Reading Input files @@ -601,13 +539,14 @@ In this exercise, you will Check the Python and the PBS file, and submit the job: -
$ cat file3.py
-$ cat file3.pbs
-$ qsub file3.pbs
-$ qstat
-$ ls -l
-$ more $VSC_SCRATCH/primes_2.txt
-
+``` +$ cat file3.py +$ cat file3.pbs +$ qsub file3.pbs +$ qstat +$ ls -l +$ more $VSC_SCRATCH/primes_2.txt +``` ## How much disk space do I get? ### Quota @@ -694,23 +633,25 @@ into the login nodes of that VSC site). {% else %} The "`show_quota`" command has been developed to show you the status of your quota in a readable format: -
$ show_quota
+```
+$ show_quota
 VSC_DATA:    used 81MB (0%)  quota 25600MB
 VSC_HOME:    used 33MB (1%)  quota 3072MB
 VSC_SCRATCH:   used 28MB (0%)  quota 25600MB
 VSC_SCRATCH_GLOBAL: used 28MB (0%)  quota 25600MB
 VSC_SCRATCH_SITE:   used 28MB (0%)  quota 25600MB
-
+``` or on the UAntwerp clusters -
$ module load scripts
-$ show_quota
+```
+$ module load scripts
+$ show_quota
 VSC_DATA:    used 81MB (0%)  quota 25600MB
 VSC_HOME:    used 33MB (1%)  quota 3072MB
 VSC_SCRATCH:   used 28MB (0%)  quota 25600MB
 VSC_SCRATCH_GLOBAL: used 28MB (0%)  quota 25600MB
 VSC_SCRATCH_SITE:   used 28MB (0%)  quota 25600MB
-
+``` With this command, you can follow up the consumption of your total disk quota easily, as it is expressed in percentages. Depending of on which @@ -725,14 +666,15 @@ directories are responsible for the consumption of your disk space. You can check the size of all subdirectories in the current directory with the "`du`" (**Disk Usage**) command: -
$ du
+```
+$ du
 256 ./ex01-matlab/log
 1536 ./ex01-matlab
 768 ./ex04-python
 512 ./ex02-python
 768 ./ex03-python
 5632
-
+``` This shows you first the aggregated size of all subdirectories, and finally the total size of the current directory "." (this includes files @@ -741,28 +683,31 @@ stored in the current directory). If you also want this size to be "human-readable" (and not always the total number of kilobytes), you add the parameter "-h": -
$ du -h
+```
+$ du -h
 256K ./ex01-matlab/log
 1.5M ./ex01-matlab
 768K ./ex04-python
 512K ./ex02-python
 768K ./ex03-python
 5.5M .
-
+``` If the number of lower level subdirectories starts to grow too big, you may not want to see the information at that depth; you could just ask for a summary of the current directory: -
$ du -s
+```
+$ du -s
 5632 .
-$ du -s -h
-
+$ du -s -h +``` If you want to see the size of any file or top-level subdirectory in the current directory, you could use the following command: -
$ du -h --max-depth 1
+```
+$ du -h --max-depth 1
 1.5M ./ex01-matlab
 512K ./ex02-python
 768K ./ex03-python
@@ -770,7 +715,7 @@ current directory, you could use the following command:
 256K ./example.sh
 1.5M ./intro-HPC.pdf
 700M ./.cache
-
+``` Finally, if you don't want to know the size of the data in your current directory, but in some other directory (e.g., your data directory), you @@ -778,13 +723,14 @@ just pass this directory as a parameter. The command below will show the disk use in your home directory, even if you are currently in a different directory: -
$ du -h --max-depth 1 $VSC_HOME
+```
+$ du -h --max-depth 1 $VSC_HOME
 22M {{ homedir }}/dataset01
 36M {{ homedir }}/dataset02
 22M {{ homedir }}/dataset03
 3.5M {{ homedir }}/primes.txt
 24M {{ homedir }}/.cache
-
+``` {% if site == gent %} {% else %} @@ -796,8 +742,9 @@ listing of files. Try: -
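If you mainly want to spot the largest subdirectories, the same `du` output can be sorted; a sketch assuming GNU coreutils, whose `sort -h` understands the human-readable sizes:

```
du -h --max-depth 1 $VSC_DATA 2>/dev/null | sort -h | tail -n 5   # the five biggest entries are listed last
```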
$ tree -s -d
-
+``` +$ tree -s -d +``` However, we urge you to only use the `du` and `tree` commands when you really need them as they can put a heavy strain on the file system and @@ -816,8 +763,9 @@ infrastructure. To change the group of a directory and its underlying directories and files, you can use: -
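A slightly fuller sketch of the same idea, including a way to check the result afterwards; the group name `mygroup` and the directory `project_data` are made-up examples:

```
chgrp -R mygroup project_data   # recursively hand the files over to the group
chmod -R g+rX project_data      # optionally give group members read (and directory-traverse) access
ls -ld project_data             # the group column should now show mygroup
```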
$ chgrp -R groupname directory
-
+``` +chgrp -R groupname directory +``` ### Joining an existing group @@ -864,9 +812,10 @@ You can get details about the current state of groups on the HPC infrastructure with the following command (`example` is the name of the group we want to inspect): -
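To see which groups your own account currently belongs to, the standard tools are enough (a small sketch; `vsc40000` is just an example VSC id):

```
groups           # group memberships of the current user
id -Gn vsc40000  # the same information for an explicitly named account
```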
$ getent group example
+```
+$ getent group example
 example:*:1234567:vsc40001,vsc40002,vsc40003
-
+``` We can see that the VSC id number is 1234567 and that there are three members in the group: `vsc40001`, `vsc40002` and `vsc40003`. diff --git a/mkdocs/docs/HPC/sites/antwerpen/available-modules.md b/mkdocs/docs/HPC/sites/antwerpen/available-modules.md index 474d3ab5d1b..71c8382b85c 100644 --- a/mkdocs/docs/HPC/sites/antwerpen/available-modules.md +++ b/mkdocs/docs/HPC/sites/antwerpen/available-modules.md @@ -1,4 +1,5 @@ -
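Following up on the `getent` output above, a quick way to test whether a particular member is listed is a sketch like this (the names are the example values from above):

```
getent group example | tr ',' '\n' | grep -c vsc40002   # prints 1 if vsc40002 is listed, 0 otherwise
```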
$ module av 2>&1 | more
+```
+$ module av 2>&1 | more
 ------------- /apps/antwerpen/modules/hopper/2015a/all ------------
 ABINIT/7.10.2-intel-2015a
 ADF/2014.05
@@ -9,16 +10,17 @@ Boost/1.57.0-intel-2015a-Python-2.7.9
 bzip2/1.0.6-foss-2015a
 bzip2/1.0.6-intel-2015a
 ...
-
+``` Or when you want to check whether some specific software, some compiler or some application (e.g., LAMMPS) is installed on the {{hpc}}. -
$ module av 2>&1 | grep -i -e "LAMMPS"
+```
+$ module av 2>&1 | grep -i -e "LAMMPS"
 LAMMPS/9Dec14-intel-2015a
 LAMMPS/30Oct14-intel-2014a
 LAMMPS/5Sep14-intel-2014a
-
+``` As you may not be aware of the exact capitalisation of the module name, we looked for a case-insensitive name with the "-i" option. diff --git a/mkdocs/docs/HPC/sites/gent/available-modules.md index 13956ab2346..fc246514453 100644 --- a/mkdocs/docs/HPC/sites/gent/available-modules.md +++ b/mkdocs/docs/HPC/sites/gent/available-modules.md @@ -1,4 +1,5 @@ -
$ module av | more
+```
+module avail
 --- /apps/gent/RHEL8/zen2-ib/modules/all ---
    ABAQUS/2021-hotfix-2132
    ABAQUS/2022-hotfix-2214
@@ -6,16 +7,17 @@
    ABAQUS/2023
    ABAQUS/2024-hotfix-2405                                                (D)
    ...
-
+``` Or when you want to check whether some specific software, some compiler or some application (e.g., MATLAB) is installed on the {{hpc}}. -
$ module av matlab
+```
+module avail matlab
 --- /apps/gent/RHEL8/zen2-ib/modules/all ---
    LIBSVM-MATLAB/3.30-GCCcore-11.3.0-MATLAB-2022b-r5
    MATLAB/2019b
    MATLAB/2021b
    MATLAB/2022b-r5                                   (D)
    SPM/12.5_r7771-MATLAB-2021b
-
+``` diff --git a/mkdocs/docs/HPC/torque_options.md b/mkdocs/docs/HPC/torque_options.md index 55c6e0709f5..cc7da6c4812 100644 --- a/mkdocs/docs/HPC/torque_options.md +++ b/mkdocs/docs/HPC/torque_options.md @@ -4,20 +4,20 @@ Below is a list of the most common and useful directives. -| Option | System type | Description| -|:------:|:-----------:|:----------| -| -k | All | Send "stdout" and/or "stderr" to your home directory when the job runs
**#PBS -k o** or **#PBS -k e** or **#PBS -koe**
| -| -l | All | Precedes a resource request, e.g., processors, wallclock | -| -M | All | Send an e-mail messages to an alternative e-mail address
**#PBS -M me@mymail.be**
| -| -m | All | Send an e-mail address when a job **b**egins execution and/or **e**nds or **a**borts
**#PBS -m b** or **#PBS -m be** or **#PBS -m ba** | -| mem | Shared Memory | Memory & Specifies the amount of memory you need for a job.
**#PBS -I mem=90gb** | -| mpiproces | Clusters | Number of processes per node on a cluster. This should equal number of processors on a node in most cases.
**#PBS -l mpiprocs=4** | -| -N | All | Give your job a unique name
**#PBS -N galaxies1234**
| -| -ncpus | Shared Memory | The number of processors to use for a shared memory job.
**#PBS ncpus=4**
| -| -r | All | ontrol whether or not jobs should automatically re-run from the start if the system crashes or is rebooted. Users with check points might not wish this to happen.
**#PBS -r n**

**#PBS -r y**
| -| select | Clusters | Number of compute nodes to use. Usually combined with the mpiprocs directive
**#PBS -l select=2**
| -| -V | All | Make sure that the environment in which the job **runs** is the same as the environment in which it was **submitted
#PBS -V
** -| Walltime | All | The maximum time a job can run before being stopped. If not used a default of a few minutes is used. Use this flag to prevent jobs that go bad running for hundreds of hours. Format is HH:MM:SS
**#PBS -l walltime=12:00:00**
| +| Option | System type | Description | +|:---------:|:-------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| -k | All | Send "stdout" and/or "stderr" to your home directory when the job runs
**#PBS -k o** or **#PBS -k e** or **#PBS -koe**
| +| -l        | All           | Precedes a resource request, e.g., processors, wallclock | +| -M        | All           | Send an e-mail message to an alternative e-mail address 
**#PBS -M me@mymail.be**
| +| -m        | All           | Send an e-mail message when a job **b**egins execution and/or **e**nds or **a**borts 
**#PBS -m b** or **#PBS -m be** or **#PBS -m ba** | +| mem       | Shared Memory | Specifies the amount of memory you need for a job. 
**#PBS -l mem=90gb** | +| mpiprocs  | Clusters      | Number of processes per node on a cluster. This should equal the number of processors on a node in most cases. 
**#PBS -l mpiprocs=4** | +| -N | All | Give your job a unique name
**#PBS -N galaxies1234**
| +| -ncpus | Shared Memory | The number of processors to use for a shared memory job.
**#PBS -l ncpus=4**
| +| -r        | All           | Control whether or not jobs should automatically re-run from the start if the system crashes or is rebooted. Users with checkpoints might not want this to happen. 
**#PBS -r n**

**#PBS -r y**
| +| select | Clusters | Number of compute nodes to use. Usually combined with the mpiprocs directive
**#PBS -l select=2**
| +| -V | All | Make sure that the environment in which the job **runs** is the same as the environment in which it was **submitted
#PBS -V
** | +| Walltime  | All           | The maximum time a job can run before being stopped. If not set, a default of a few minutes is used. Use this flag to prevent runaway jobs from running for hundreds of hours. Format is HH:MM:SS 
**#PBS -l walltime=12:00:00**
| ## Environment Variables in Batch Job Scripts @@ -55,25 +55,25 @@ When a batch job is started, a number of environment variables are created that can be used in the batch job script. A few of the most commonly used variables are described here. -| Variable | Description | -|:--------:|:-----------| -| PBS_ENVIRONMENT | set to PBS_BATCH to indicate that the job is a batch job; otherwise, set to PBS_INTERACTIVE to indicate that the job is a PBS interactive job. | -| PBS_JOBID | the job identifier assigned to the job by the batch system. This is the same number you see when you do *qstat*. | -| PBS_JOBNAME | the job name supplied by the user | -| PBS_NODEFILE | the name of the file that contains the list of the nodes assigned to the job . Useful for Parallel jobs if you want to refer the node, count the node etc. | -| PBS_QUEUE | the name of the queue from which the job is executed -| PBS_O_HOME | value of the HOME variable in the environment in which *qsub* was executed | -| PBS_O_LANG | value of the LANG variable in the environment in which *qsub* was executed | -| PBS_O_LOGNAME | value of the LOGNAME variable in the environment in which *qsub* was executed | -| PBS_O_PATH | value of the PATH variable in the environment in which *qsub* was executed | -| PBS_O_MAIL | value of the MAIL variable in the environment in which *qsub* was executed | -| PBS_O_SHELL | value of the SHELL variable in the environment in which *qsub* was executed | -| PBS_O_TZ | value of the TZ variable in the environment in which *qsub* was executed | -| PBS_O_HOST | the name of the host upon which the *qsub* command is running | -| PBS_O_QUEUE | the name of the original queue to which the job was submitted | -| PBS_O_WORKDIR | the absolute path of the current working directory of the *qsub* command. This is the most useful. Use it in every job script. The first thing you do is, cd $PBS_O_WORKDIR after defining the resource list. This is because, pbs throw you to your $HOME directory. | -| PBS_VERSION | Version Number of TORQUE, e.g., TORQUE-2.5.1 | -| PBS_MOMPORT | active port for mom daemon | -| PBS_TASKNUM | number of tasks requested | -| PBS_JOBCOOKIE | job cookie | -| PBS_SERVER | Server Running TORQUE | +| Variable | Description | +|:---------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| PBS_ENVIRONMENT | set to PBS_BATCH to indicate that the job is a batch job; otherwise, set to PBS_INTERACTIVE to indicate that the job is a PBS interactive job. | +| PBS_JOBID | the job identifier assigned to the job by the batch system. This is the same number you see when you do *qstat*. | +| PBS_JOBNAME | the job name supplied by the user | +| PBS_NODEFILE | the name of the file that contains the list of the nodes assigned to the job . Useful for Parallel jobs if you want to refer the node, count the node etc. 
| +| PBS_QUEUE       | the name of the queue from which the job is executed | +| PBS_O_HOME      | value of the HOME variable in the environment in which *qsub* was executed | +| PBS_O_LANG      | value of the LANG variable in the environment in which *qsub* was executed | +| PBS_O_LOGNAME   | value of the LOGNAME variable in the environment in which *qsub* was executed | +| PBS_O_PATH      | value of the PATH variable in the environment in which *qsub* was executed | +| PBS_O_MAIL      | value of the MAIL variable in the environment in which *qsub* was executed | +| PBS_O_SHELL     | value of the SHELL variable in the environment in which *qsub* was executed | +| PBS_O_TZ        | value of the TZ variable in the environment in which *qsub* was executed | +| PBS_O_HOST      | the name of the host upon which the *qsub* command is running | +| PBS_O_QUEUE     | the name of the original queue to which the job was submitted | +| PBS_O_WORKDIR   | the absolute path of the current working directory of the *qsub* command. This is the most useful one: use it in every job script. The first thing to do after defining the resource list is cd $PBS_O_WORKDIR, because PBS starts the job in your $HOME directory. | +| PBS_VERSION     | version number of TORQUE, e.g., TORQUE-2.5.1 | +| PBS_MOMPORT     | active port for the mom daemon | +| PBS_TASKNUM     | number of tasks requested | +| PBS_JOBCOOKIE   | job cookie | +| PBS_SERVER      | server running TORQUE | diff --git a/mkdocs/docs/HPC/troubleshooting.md index 462cb839ccf..fd65e6ecad8 100644 --- a/mkdocs/docs/HPC/troubleshooting.md +++ b/mkdocs/docs/HPC/troubleshooting.md @@ -98,8 +98,9 @@ and thus requesting multiple cores and/or nodes will only result in wasted resou If your job output contains an error message similar to this: -
=>> PBS: job killed: walltime <value in seconds> exceeded limit  <value in seconds>
-
+``` +=>> PBS: job killed: walltime exceeded limit +``` This occurs when your job did not complete within the requested walltime. See @@ -119,7 +120,6 @@ option is to request extra quota for your VO to the VO moderator/s. See section on [Pre-defined user directories](../running_jobs_with_input_output_data/#pre-defined-user-directories) and [Pre-defined quotas](../running_jobs_with_input_output_data/#pre-defined-quotas) for more information about quotas and how to use the storage endpoints in an efficient way. {% endif %} - ## Issues connecting to login node { #sec:connecting-issues} @@ -128,8 +128,9 @@ the key/lock analogy in [How do SSH keys work?](../account/#how-do-ssh-keys-work If you have errors that look like: -
{{ userid }}@{{ loginnode }}: Permission denied
-
+``` +{{ userid }}@{{ loginnode }}: Permission denied +``` or you are experiencing problems with connecting, here is a list of things to do that should help: @@ -150,12 +151,13 @@ things to do that should help: 2. Use `ssh-add` (see section [Using an SSH agent](../account/#using-an-ssh-agent-optional)) *OR;* 3. Specify the location of the key in `$HOME/.ssh/config`. You will need to replace the VSC login id in the `User` field with your own: -
                Host {{ hpcname }}
+                ```
+                Host {{ hpcname }}
                     Hostname {{ loginnode }}
-                    IdentityFile /path/to/private/key
-                    User {{ userid }}
-                
- Now you can just connect with ssh {{ hpcname }}. + IdentityFile /path/to/private/key + User {{ userid }} + ``` + Now you can connect with `ssh {{ hpcname }}`. {% endif %} 4. Please double/triple check your VSC login ID. It should look @@ -193,8 +195,9 @@ things to do that should help: {% if OS == windows %} If you are using PuTTY and get this error message: -
server unexpectedly closed network connection
-
+``` +server unexpectedly closed network connection +``` it is possible that the PuTTY version you are using is too old and doesn't support some required (security-related) features. @@ -217,49 +220,35 @@ and include it in the email. 2. Single click on the saved configuration -
![image](img/831change01.png) -
3. Then click ++"Load"++ button -
![image](img/831change02.png) -
4. Expand SSH category (on the left panel) clicking on the "+" next to SSH -
![image](img/831change03.png) -
5. Click on Auth under the SSH category -
![image](img/831change04.png) -
6. On the right panel, click ++"Browse"++ button -
![image](img/831change05.png) -
7. Then search your private key on your computer (with the extension ".ppk") 8. Go back to the top of category, and click Session -
![image](img/831change06.png) -
9. On the right panel, click on ++"Save"++ button -
![image](img/831change07.png) -
### Check whether your private key in PuTTY matches the public key on the accountpage @@ -269,28 +258,20 @@ Follow the instructions in [Change PuTTY private key for a saved configuration]( then select all text (push ++"Ctrl"++ + ++"a"++ ), then copy the location of the private key (push ++"Ctrl"++ + ++"c"++) -
![image](img/832check05.png) -
2. Open PuTTYgen -
![image](img/832check06.png) -
3. Enter menu item "File" and select "Load Private key" -
![image](img/832check07.png) -
4. On the "Load private key" popup, click in the textbox next to "File name:", then paste the location of your private key (push ++"Ctrl"++ + ++"v"++), then click ++"Open"++ -
![image](img/832check08.png) -
5. Make sure that your Public key from the "Public key for pasting into OpenSSH authorized_keys file" textbox is in your "Public @@ -298,15 +279,14 @@ Follow the instructions in [Change PuTTY private key for a saved configuration]( (Scroll down to the bottom of "View Account" tab, you will find there the "Public keys" section) -
![image](img/832check09.png) -
{% else %} Please add `-vvv` as a flag to `ssh` like: -
ssh -vvv {{ userid }}@{{ loginnode }}
-
+``` +ssh -vvv {{ userid }}@{{ loginnode }} +``` and include the output of that command in the message. {% endif %} @@ -320,7 +300,8 @@ system you are connecting to has changed. {% if OS == (linux or macos) %} -
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 
+```
+@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 
 @     WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!    @
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 
 IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! 
@@ -331,10 +312,10 @@ The fingerprint for the ECDSA key sent by the remote host is
 SHA256:1MNKFTfl1T9sm6tTWAo4sn7zyEfiWFLKbk/mlT+7S5s. 
 Please contact your system administrator. 
 Add correct host key in  ~/.ssh/known_hosts to get rid of this message. 
-Offending ECDSA key in  ~/.ssh/known_hosts:21
+Offending ECDSA key in  ~/.ssh/known_hosts:21
 ECDSA host key for {{ loginnode }} has changed and you have requested strict checking.
 Host key verification failed.
-
+``` You will need to remove the line it's complaining about (in the example, line 21). To do that, open `~/.ssh/known_hosts` in an editor, and remove the @@ -344,8 +325,9 @@ to. Alternatively you can use the command that might be shown by the warning under `remove with:` and it should be something like this: -
ssh-keygen -f "~/.ssh/known_hosts" -R "{{loginnode}}"
-
+``` +ssh-keygen -f "~/.ssh/known_hosts" -R "{{loginnode}}" +``` If the command is not shown, take the file from the "Offending ECDSA key in", and the host name from "ECDSA host key for" lines. @@ -356,8 +338,9 @@ After you've done that, you'll need to connect to the {{ hpc }} again. See [Warn You will need to verify that the fingerprint shown in the dialog matches one of the following fingerprints: -
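If you prefer to remove the offending line by its number instead of in an editor, a sketch with `sed` can do the same (line 21 matches the example warning above; adjust it to the number in your own warning):

```
sed -i '21d' ~/.ssh/known_hosts   # delete line 21 of the known_hosts file in place
```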
{{ puttyFirstConnect }}
-
+``` +{{ puttyFirstConnect }} +``` **Do not click "Yes" until you verified the fingerprint. Do not press "No" in any case.** @@ -367,9 +350,7 @@ If it doesn't (like in the example) or you are in doubt, take a screenshot, pres {% include "../macros/sshedfingerprintnote.md" %} -
![image](img/putty_security_alert.jpg) -
{% if site == gent %} If you use X2Go client, you might get one of the following fingerprints: @@ -389,14 +370,16 @@ If it doesn't, or you are in doubt, take a screenshot, press "Yes" and contact { If you get errors like: -
qsub fibo.pbs
+```
+$ qsub fibo.pbs
 qsub: script is written in DOS/Windows text format
-
+``` or -
sbatch: error: Batch script contains DOS line breaks (\r\n)
-
+``` +sbatch: error: Batch script contains DOS line breaks (\r\n) +``` It's probably because you transferred the files from a Windows computer. See the [section about `dos2unix` in Linux tutorial](../linux-tutorial/uploading_files/#dos2unix) to fix this error. @@ -405,17 +388,20 @@ See the [section about `dos2unix` in Linux tutorial](../linux-tutorial/uploading {% if OS == (linux or macos) %} -
ssh {{userid}}@{{loginnode}}
-The authenticity of host {{loginnode}} (<IP-adress>) can't be established. 
-<algorithm> key fingerprint is <hash>
+```
+$ ssh {{userid}}@{{loginnode}}
+The authenticity of host {{loginnode}} () can't be established. 
+ key fingerprint is 
 Are you sure you want to continue connecting (yes/no)?
-
+``` Now you can check the authenticity by verifying that the key fingerprint shown on that line matches one of the following lines: -
{{opensshFirstConnect}}
+``` +{{opensshFirstConnect}} +``` {% endif %} {% if site == gent %} @@ -473,8 +459,6 @@ you via the `ulimit -v` command *in your job script*. See [Generic resource requirements](../running_batch_jobs/#generic-resource-requirements) to set memory and other requirements, see [Specifying memory requirements](../fine_tuning_job_specifications/#specifying-memory-requirements) to finetune the amount of memory you request. - {% if site == gent %} ## Module conflicts @@ -496,7 +480,6 @@ While processing the following module(s): Module fullname Module Filename --------------- --------------- HMMER/3.1b2-intel-2017a /apps/gent/CO7/haswell-ib/modules/all/HMMER/3.1b2-intel-2017a.lua - ``` This resulted in an error because we tried to load two modules with different @@ -539,11 +522,12 @@ As a rule of thumb, toolchains in the same row are compatible with each other: Another common error is: -
$ module load cluster/{{othercluster}}
+```
+$ module load cluster/{{othercluster}}
 Lmod has detected the following error: A different version of the 'cluster' module is already loaded (see output of 'ml').
 
 If you don't understand the warning or error, contact the helpdesk at hpc@ugent.be
-
+``` This is because there can only be one `cluster` module active at a time. The correct command is `module swap cluster/{{othercluster}}`. See also [Specifying the cluster on which to run](../running_batch_jobs/#specifying-the-cluster-on-which-to-run). @@ -556,22 +540,24 @@ The correct command is `module swap cluster/{{othercluster}}`. See also [Specify When running software provided through modules (see [Modules](../running_batch_jobs/#modules)), you may run into errors like: -
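A minimal illustration of the difference between the two commands (the cluster name is just an example):

```
module load cluster/donphan   # fails if another cluster module is already active
module swap cluster/donphan   # replaces the active cluster module, which is what you want
```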
$ module swap cluster/donphan
+```
+$ module swap cluster/donphan
 The following have been reloaded with a version change:
   1) cluster/doduo => cluster/donphan         3) env/software/doduo => env/software/donphan
   2) env/slurm/doduo => env/slurm/donphan     4) env/vsc/doduo => env/vsc/donphan
 
-$ module load Python/3.10.8-GCCcore-12.2.0
-$ python
+$ module load Python/3.10.8-GCCcore-12.2.0
+$ python
 Please verify that both the operating system and the processor support
 Intel(R) MOVBE, F16C, FMA, BMI, LZCNT and AVX2 instructions.
-
+``` or errors like: -
$ python
+```
+$ python
 Illegal instruction
-
+``` When we swap to a different cluster, the available modules change so they work for that cluster. That means that if the cluster and the login @@ -587,8 +573,9 @@ all our modules will get reloaded. This means that all current modules will be unloaded and then loaded again, so they'll work on the newly loaded cluster. Here's an example of how that would look: -
$ module load Python/3.10.8-GCCcore-12.2.0
-$ module swap cluster/donphan
+```
+$ module load Python/3.10.8-GCCcore-12.2.0
+$ module swap cluster/donphan
 
 Due to MODULEPATH changes, the following have been reloaded:
   1) GCCcore/12.2.0                   8) binutils/2.39-GCCcore-12.2.0
@@ -602,7 +589,7 @@ Due to MODULEPATH changes, the following have been reloaded:
 The following have been reloaded with a version change:
   1) cluster/doduo => cluster/donphan         3) env/software/doduo => env/software/donphan
   2) env/slurm/doduo => env/slurm/donphan     4) env/vsc/doduo => env/vsc/donphan
-
+``` This might result in the same problems as mentioned above. When swapping to a different cluster, you can run `module purge` to unload all modules @@ -613,9 +600,10 @@ to avoid problems (see [Purging all modules](../running_batch_jobs/#purging-all- When using a tool that is made available via modules to submit jobs, for example [Worker](multi_job_submission.md), you may run into the following error when targeting a non-default cluster: -
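Put together, a hedged sketch of the safer sequence could look like this (the module names are the examples used above):

```
module swap cluster/donphan               # switch the session to the target cluster
module purge                              # drop the software modules loaded for the previous cluster
module load Python/3.10.8-GCCcore-12.2.0  # reload what you need against the new cluster's stack
```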
$  wsub
-/apps/gent/.../.../software/worker/.../bin/wsub: line 27: 2152510 Illegal instruction     (core dumped) ${PERL} ${DIR}/../lib/wsub.pl "$@"
-
+``` +$ wsub +/apps/gent/.../.../software/worker/.../bin/wsub: line 27: 2152510 Illegal instruction (core dumped) ${PERL} ${DIR}/../lib/wsub.pl "$@" +``` When executing the `module swap cluster` command, you are not only changing your session environment to submit to that specific cluster, but also to use the part of the central software stack that is specific to that cluster. @@ -630,9 +618,13 @@ The same goes for the other clusters as well of course. !!! Tip To submit a Worker job to a specific cluster, like the [`donphan` interactive cluster](interactive_debug.md) for instance, use: -
$ module swap env/slurm/donphan 
+ ``` + $ module swap env/slurm/donphan + ``` instead of -
$ module swap cluster/donphan 
+ ``` + $ module swap cluster/donphan + ``` We recommend using a `module swap cluster` command after submitting the jobs. diff --git a/mkdocs/docs/HPC/useful_linux_commands.md b/mkdocs/docs/HPC/useful_linux_commands.md index 258dfa70d25..afaa87575e9 100644 --- a/mkdocs/docs/HPC/useful_linux_commands.md +++ b/mkdocs/docs/HPC/useful_linux_commands.md @@ -6,73 +6,32 @@ All the {{hpc}} clusters run some variant of the "{{operatingsystembase}}" opera that, when you connect to one of them, you get a command line interface, which looks something like this: -
{{userid}}@ln01[203] $
-
+``` +{{userid}}@ln01[203] $ +``` When you see this, we also say you are inside a "shell". The shell will accept your commands, and execute them. - - - - - - - - - - - - - -{% if site == gent %} - - - - -{% else %} - - - - +| Command | Description | +|---------|----------------------------------------------------| +| `ls` | Shows you a list of files in the current directory | +| `cd` | Change current working directory | +| `rm` | Remove file or directory | +| `echo` | Prints its parameters to the screen | +{% if site == gent %} | `nano` | Text editor | +{% else %} | `joe` | Text editor | {% endif %} - - - - -
- ls - - Shows you a list of files in the current directory -
- cd - - Change current working directory -
- rm - - Remove file or directory -
- nano - - Text editor -
- joe - - Text editor -
- echo - - Prints its parameters to the screen -
+ Most commands will accept or even need parameters, which are placed after the command, separated by spaces. A simple example with the "echo" command: -
$ echo This is a test
+```
+$ echo This is a test
 This is a test
-
+``` Important here is the "$" sign in front of the first line. This should not be typed, but is a convention meaning "the rest of this line should @@ -84,10 +43,11 @@ explained then if necessary. If not, you can usually get more information about a command, say the item or command "ls", by trying either of the following: -
$ ls --help 
-$ man ls
-$  info ls 
-
+``` +$ ls --help +$ man ls +$ info ls +``` (You can exit the last two "manuals" by using the "q" key.) For more exhaustive tutorials about Linux usage, please refer to the following @@ -125,38 +85,43 @@ hostname You can type both lines at your shell prompt, and the result will be the following: -
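If you do not know the name of a command to begin with, the manual pages can also be searched by keyword (a hedged example; the output differs per system):

```
apropos compress   # list man pages whose short description mentions "compress"
man -k compress    # the traditional spelling of the same search
```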
$ echo "Hello! This is my hostname:"
+```
+$ echo "Hello! This is my hostname:"
 Hello! This is my hostname:
-$ hostname
+$ hostname
 {{loginhost}}
-
+``` Suppose we want to call this script "foo". You open a new file for editing, and name it "foo", and edit it with your favourite editor {% if site == gent %} -
$ nano foo
-
+``` +nano foo +``` {% else %} -
$ vi foo
-
+``` +$ vi foo +``` {% endif %} or use the following commands: -
$ echo "echo Hello! This is my hostname:" > foo
-$ echo hostname >> foo
-
+``` +echo "echo Hello! This is my hostname:" > foo +echo hostname >> foo +``` The easiest way to run a script is to start the interpreter and pass the script as a parameter. In case of our script, the interpreter may either be "sh" or "bash" (which are the same on the cluster). So start the script: -
$ bash foo
+```
+$ bash foo
 Hello! This is my hostname:
 {{loginhost}}
-
+``` Congratulations, you just created and started your first shell script! @@ -171,9 +136,10 @@ the following line on top of your shell script You can find this path with the "which" command. In our case, since we use bash as an interpreter, we get the following path: -
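For reference, a sketch of what the finished script could look like once the interpreter line discussed here is added on top (assuming bash is indeed found at /bin/bash, as on the clusters):

```
#!/bin/bash
# print a short message, then the name of the machine we are running on
echo "Hello! This is my hostname:"
hostname
```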
$ which bash
+```
+$ which bash
 /bin/bash
-
+``` We edit our script and change it with this information: @@ -188,15 +154,17 @@ script. Finally, we tell the operating system that this script is now executable. For this we change its file attributes: -
$  chmod +x foo
-
+``` +chmod +x foo +``` Now you can start your script by simply executing it: -
$ ./foo
+```
+$ ./foo
 Hello! This is my hostname:
 {{loginhost}}
-
+``` The same technique can be used for all other scripting languages, like Perl and Python. @@ -209,400 +177,99 @@ not ignore these lines, you may get strange results ... ### Archive Commands - - - - - - - - - - - - - - - - - -
- tar - - An archiving program designed to store and extract files from an archive known as a tar file. -
- tar -cvf foo.tar foo/ - - compress the contents of foo folder to foo.tar -
- tar -xvf foo.tar - - extract foo.tar -
- tar -xvzf foo.tar.gz - - extract gzipped foo.tar.gz -
+| Command | Description | +|-------------------------|-----------------------------------------------------------------------------------------------| +| `tar` | An archiving program designed to store and extract files from an archive known as a tar file. | +| `tar -cvf foo.tar foo/` | Compress the contents of `foo` folder to `foo.tar` | +| `tar -xvf foo.tar` | Extract `foo.tar` | +| `tar -xvzf foo.tar.gz` | Extract gzipped `foo.tar.gz` | + ### Basic Commands - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- ls - - Shows you a list of files in the current directory -
- cd - - Change the current directory -
- rm - - Remove file or directory -
- mv - - Move file or directory -
- echo - - Display a line or text -
- pwd - - Print working directory -
- mkdir - - Create directories -
- rmdir - - Remove directories -
+| Command | Description | +|---------|----------------------------------------------------| +| `ls` | Shows you a list of files in the current directory | +| `cd` | Change the current directory | +| `rm` | Remove file or directory | +| `mv` | Move file or directory | +| `echo` | Display a line or text | +| `pwd` | Print working directory | +| `mkdir` | Create directories | +| `rmdir` | Remove directories | ### Editor - - - - - - - - - - - - - -
- emacs - - -
- nano - - Nano's ANOther editor, an enhanced free Pico clone -
- vi - - A programmers text editor -
+| Command | Description | +|---------|----------------------------------------------------| +| `emacs` | | +| `nano` | Nano's ANOther editor, an enhanced free Pico clone | +| `vi` | A programmer's text editor | + ### File Commands - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- cat - - Read one or more files and print them to standard output -
- cmp - - Compare two files byte by byte -
- cp - - Copy files from a source to the same or different target(s) -
- du - - Estimate disk usage of each file and recursively for directories -
- find - - Search for files in directory hierarchy -
- grep - - Print lines matching a pattern -
- ls - - List directory contents -
- mv - - Move file to different targets -
- rm - - Remove files -
- sort - - Sort lines of text files -
- wc - - Print the number of new lines, words, and bytes in files -
+| Command | Description | +|---------|------------------------------------------------------------------| +| `cat` | Read one or more files and print them to standard output | +| `cmp` | Compare two files byte by byte | +| `cp` | Copy files from a source to the same or different target(s) | +| `du` | Estimate disk usage of each file and recursively for directories | +| `find` | Search for files in directory hierarchy | +| `grep` | Print lines matching a pattern | +| `ls` | List directory contents | +| `mv` | Move file to different targets | +| `rm` | Remove files | +| `sort` | Sort lines of text files | +| `wc` | Print the number of new lines, words, and bytes in files | + ### Help Commands - - - - - -
- man - - Displays the manual page of a command with its name, synopsis, description, author, copyright etc. -
+| Command | Description | +|---------|-----------------------------------------------------------------------------------------------------| +| `man` | Displays the manual page of a command with its name, synopsis, description, author, copyright, etc. | + ### Network Commands - - - - - - - - - - - - - -
- hostname - - show or set the system's host name -
- ifconfig - - Display the current configuration of the network interface. It is also useful to get the information about IP address, subnet mask, set remote IP address, netmask etc. -
- ping - - send ICMP ECHO_REQUEST to network hosts, you will get back ICMP packet if the host responds. This command is useful when you are in a doubt whether your computer is connected or not. -
+| Command | Description | +|------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `hostname` | Show or set the system's host name | +| `ifconfig` | Display the current configuration of the network interface. It is also useful to get the information about IP address, subnet mask, set remote IP address, netmask, etc. | +| `ping` | Send ICMP ECHO_REQUEST to network hosts. You will get back an ICMP packet if the host responds. This command is useful to check whether your computer is connected or not. | + ### Other Commands - - - - - - - - - - - - - - - - - -
- logname - - Print user's login name -
- quota - - Display disk usage and limits -
- which - - Returns the pathnames of the files that would be executed in the current environment -
- whoami - - Displays the login name of the current effective user -
+| Command | Description | +|-----------|--------------------------------------------------------------------------------------| +| `logname` | Print user's login name | +| `quota` | Display disk usage and limits | +| `which` | Returns the pathnames of the files that would be executed in the current environment | +| `whoami` | Displays the login name of the current effective user | + ### Process Commands - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- & - - In order to execute a command in the background, place an ampersand (&) on the command line at the end of the command. A user job number (placed in brackets) and a system process number are displayed. A system process number is the number by which the system identifies the job whereas a user job number is the number by which the user identifies the job -
- at - - executes commands at a specified time -
- bg - - Places a suspended job in the background -
- crontab - - crontab is a file which contains the schedule of entries to run at specified times -
- fg - - A process running in the background will be processed in the foreground -
- jobs - - Lists the jobs being run in the background -
- kill - - Cancels a job running in the background, it takes argument either the user job number or the system process number -
- ps - - Reports a snapshot of the current processes -
- top - - Display Linux tasks -
+| Command | Description | +|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `&` | In order to execute a command in the background, place an ampersand (`&`) at the end of the command line. A user job number (in brackets) and a system process number are displayed. The system process number identifies the job, while the user job number is used by the user. | +| `at` | Executes commands at a specified time | +| `bg` | Places a suspended job in the background | +| `crontab` | A file which contains the schedule of entries to run at specified times | +| `fg` | A process running in the background will be processed in the foreground | +| `jobs` | Lists the jobs being run in the background | +| `kill` | Cancels a job running in the background; it takes either the user job number or the system process number as an argument | +| `ps` | Reports a snapshot of the current processes | +| `top` | Displays Linux tasks | + ### User Account Commands - - - - - -
- chmod - - Modify properties for users -
\ No newline at end of file +| Command | Description | +|---------|------------------------------------| +| `chmod` | Modify properties for users | diff --git a/mkdocs/docs/HPC/web_portal.md b/mkdocs/docs/HPC/web_portal.md index 9c95a20f72d..2ea0b15bd78 100644 --- a/mkdocs/docs/HPC/web_portal.md +++ b/mkdocs/docs/HPC/web_portal.md @@ -12,11 +12,7 @@ required to connect to your VSC account via this web portal.\ Please note that we do recommend to use our interactive and debug cluster (see chapter [interactive and debug cluster](./interactive_debug.md)) with `OoD`. -To connect to the HPC-UGent infrastructure via the web portal, visit: - -
- -
+To connect to the HPC-UGent infrastructure via the web portal, visit Note that you may only see a "*Submitting...*" message appear for a couple of seconds, which is perfectly normal. @@ -66,9 +62,7 @@ requested to let the web portal access some of your personal information (VSC login ID, account status, login shell and institute name), as shown in this screenshot below: -
![image](img/ood_permission.png) -
**Please click "Authorize" here.** @@ -79,9 +73,7 @@ afterwards. Once logged in, you should see this start page: -
![image](img/ood_start.png) -
This page includes a menu bar at the top, with buttons on the left providing access to the different features supported by the web portal, @@ -92,9 +84,7 @@ high-level overview of the HPC-UGent Tier-2 clusters. If your browser window is too narrow, the menu is available at the top right through the "hamburger" icon: -
![image](img/ood_hamburger.png) -
## Features @@ -112,9 +102,7 @@ The drop-down menu provides short-cuts to the different `$VSC_*` directories and filesystems you have access to. Selecting one of the directories will open a new browser tab with the *File Explorer*: -
![image](img/ood_file_explorer.png) -
Here you can: @@ -188,9 +176,7 @@ Jobs* menu item under *Jobs*. A new browser tab will be opened that shows all your current queued and/or running jobs: -
![image](img/ood_active_jobs.png) -
You can control which jobs are shown using the *Filter* input area, or select a particular cluster from the drop-down menu *All Clusters*, both @@ -213,9 +199,7 @@ To submit new jobs, you can use the *Job Composer* menu item under *Jobs*. This will open a new browser tab providing an interface to create new jobs: -
![image](img/ood_job_composer.png) -
This extensive interface allows you to create jobs from one of the available templates, or by copying an existing job. @@ -232,9 +216,7 @@ Don't forget to actually submit your job to the system via the green In addition, you can inspect provided job templates, copy them or even create your own templates via the *Templates* button on the top: -
![image](img/ood_job_templates.png) -
### Shell access @@ -242,9 +224,7 @@ Through the *Shell Access* button that is available under the *Clusters* menu item, you can easily open a terminal (shell) session into your VSC account, straight from your browser! -
![image](img/ood_shell.png) -
Using this interface requires being familiar with a Linux shell environment (see @@ -263,9 +243,7 @@ terminal multiplexer tool like `screen` or `tmux`). To create a graphical desktop environment, use one of the *desktop on... node* buttons under the *Interactive Apps* menu item. For example: -
![image](img/ood_launch_desktop.png) -
You can either start a desktop environment on a login node for some lightweight tasks, or on a workernode of one of the HPC-UGent Tier-2 @@ -279,9 +257,7 @@ To access the desktop environment, click the *My Interactive Sessions* menu item at the top, and then use the *Launch desktop on \... node* button if the desktop session is *Running*: -
![image](img/ood_desktop_running.png) -
#### Jupyter notebook @@ -289,48 +265,36 @@ Through the web portal you can easily start a [Jupyter notebook](https://jupyter.org/) on a workernode, via the *Jupyter Notebook* button under the *Interactive Apps* menu item. -
![image](img/ood_start_jupyter.png) -
After starting the Jupyter notebook using the *Launch* button, you will see it being added in state *Queued* in the overview of interactive sessions (see *My Interactive Sessions* menu item): -
![image](img/ood_jupyter_queued.png) -
When your job hosting the Jupyter notebook starts running, the status will first change to *Starting*: -
![image](img/ood_jupyter_starting.png) -
and eventually the status will change to *Running*, and you will be able to connect to the Jupyter environment using the blue *Connect to Jupyter* button: -
![image](img/ood_jupyter_running.png) -
This will launch the Jupyter environment in a new browser tab, where you can open an existing notebook by navigating to the directory where it is located and clicking it, or by using the *New* menu on the top right: -
![image](img/ood_jupyter_new_notebook.png) -
Here's an example of a Jupyter notebook in action. Note that several non-standard Python packages (like *numpy*, *scipy*, *pandas*, *matplotlib*) are readily available: -
![image](img/ood_jupyter_notebook_example.png) -
## Restarting your web server in case of problems @@ -340,9 +304,7 @@ web server running in your VSC account. You can do this via the *Restart Web Server* button under the *Help* menu item: -
![image](img/ood_help_restart_web_server.png) -
Of course, this only affects your own web portal session (not those of others). diff --git a/mkdocs/docs/HPC/x2go.md b/mkdocs/docs/HPC/x2go.md index f2daea36ff9..eb9af197d15 100644 --- a/mkdocs/docs/HPC/x2go.md +++ b/mkdocs/docs/HPC/x2go.md @@ -54,9 +54,7 @@ There are two ways to connect to the login node: This is the easier way to setup X2Go, a direct connection to the login node. -
![image](img/ch19-x2go-configuration-gent.png) -
1. Include a session name. This will help you to identify the session @@ -76,9 +74,7 @@ node. 1. Click on the "Use RSA/DSA.." folder icon. This will open a file browser. -
![image](img/ch19-x2go-ssh-key.png) -
{% if OS == (macos or linux) %} 2. You should look for your **private** SSH key generated in [Generating a public/private key pair](../account/#generating-a-publicprivate-key-pair). This file has been stored in the directory "*~/.ssh/*" (by default "**id_rsa**"). @@ -109,9 +105,7 @@ node. copy-pasting support. {% endif %} -
![image](img/ch19-x2go-configuration-xterm.png) -
1. **[optional]:** Change the session icon. @@ -123,9 +117,7 @@ This option is useful if you want to resume a previous session or if you want to set explicitly the login node to use. In this case you should include a few more options. Use the same **Option A** setup but with these changes: -
![image](img/ch19-x2go-configuration-gent-proxy.png) -
1. Include a session name. This will help you to identify the session if you have more than one (in our example "HPC UGent proxy login"). @@ -146,9 +138,7 @@ include a few more options. Use the same **Option A** setup but with these chang did for the server configuration (The "RSA/DSA key" field must be set in both sections) -
![image](img/ch19-x2go-proxy-key.png) -
4. Click the ++"OK"++ button after these changes. @@ -161,9 +151,7 @@ open session or if you click on the "shutdown" button from X2Go. If you want to suspend your session to continue working with it later just click on the "pause" icon. -
![image](img/ch19-x2go-pause.png) -
X2Go will keep the session open for you (but only if the login node is not rebooted). @@ -175,8 +163,9 @@ session, you should know which login node were used at first place. You can get this information before logging out from your X2Go session. Just open a terminal and execute: -
$ hostname
-
+``` +hostname +``` ![image](img/ch19-x2go-xterm.png) diff --git a/mkdocs/docs/HPC/xdmod.md b/mkdocs/docs/HPC/xdmod.md index 725f3aa7996..4c510bae823 100644 --- a/mkdocs/docs/HPC/xdmod.md +++ b/mkdocs/docs/HPC/xdmod.md @@ -4,11 +4,7 @@ The XDMoD web portal provides information about completed jobs, storage usage and the HPC UGent cloud infrastructure usage. To connect to the XDMoD portal, turn on your VPN connection to UGent and -visit - -
- -
+visit . Note that you may need to authorise XDMoD to obtain information from your VSC account through the VSC accountpage. @@ -18,8 +14,5 @@ web application shows you several features through a series of tips. Located in the upper right corner of the web page is the help button, taking you to the XDMoD User Manual. As things may change, we recommend -checking out the provided documenation for information on XDMoD use: - -
- -
+checking out the provided documentation for information on XDMoD use +. diff --git a/mkdocs/docs/macros/firsttimeconnection.md index 22b390a34a5..0e0f3d0ba63 100644 --- a/mkdocs/docs/macros/firsttimeconnection.md +++ b/mkdocs/docs/macros/firsttimeconnection.md @@ -3,7 +3,10 @@ Alert will appear and you will be asked to verify the authenticity of the login node. Make sure the fingerprint in the alert matches one of the following: -
{{ puttyFirstConnect }}
+ +``` +{{ puttyFirstConnect }} +``` If it does, press ***Yes***; if it doesn't, please contact {{ hpcinfo }}.