
Added conda section #649

Open
wants to merge 9 commits into
base: main
1 change: 1 addition & 0 deletions config/templates/hpc.template
@@ -31,6 +31,7 @@ nav:
- Best Practices: best_practices.md
- Graphical applications with VNC: VNC.md
- Graphical applications with X2Go: x2go.md
- Conda: conda.md
{%- if site == 'Gent' %}
- GPU clusters: gpu.md
- Cron scripts: crontab.md
69 changes: 69 additions & 0 deletions mkdocs/docs/HPC/conda.md
@@ -0,0 +1,69 @@
# Conda

Conda is a package manager, popular in data science, that lets unprivileged (non-administrative) users search for,
fetch, install, upgrade, use, and manage supported open-source software packages and
programming languages, libraries, and environments in a directory they have write access to.
It also lets users create reproducible software environments.

We do not recommend using conda environments on HPC clusters due to several common issues,
which we will explain in this section. However, we do not forbid the use of conda entirely. In certain situations,
such as testing software, creating new workflows, or teaching a course, it is perfectly acceptable to use it.

## Performance and Optimization

Conda's packages are pre-compiled binaries that are designed to work on a wide range of systems.
This means they are not optimized for the specific architecture of HPC clusters,
leading to potential performance drawbacks compared to modules compiled specifically for the HPC environment.

HPC modules, on the other hand, are compiled for the specific architecture of the cluster
and are optimized for performance.
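
As a rough illustration of what "specific architecture" means here, the sketch below lists which of a few SIMD instruction sets the current CPU advertises. The helper name `cpu_simd_flags` and the short list of flags are assumptions made for this example. It only reads a cpuinfo-style file (`/proc/cpuinfo` by default, so Linux only) and says nothing about how any particular package was compiled.

```bash
# Hypothetical helper: print which SIMD instruction sets (from a fixed list)
# the CPU advertises. Reads a cpuinfo-style file, /proc/cpuinfo by default.
cpu_simd_flags() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    local flags flag
    # Take the "flags" line of the first core and drop the "flags :" prefix
    flags="$(grep -m1 '^flags' "$cpuinfo" | cut -d: -f2)"
    for flag in sse4_2 avx avx2 avx512f; do
        # Pad with spaces so that "avx" does not match "avx2"
        case " $flags " in
            *" $flag "*) echo "$flag" ;;
        esac
    done
}
```

Running `cpu_simd_flags` on a compute node gives an idea of the instruction sets that software built for that node could use, and that a generic binary may leave unused.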

## Compatibility and Dependency

Using Conda in conjunction with centrally installed modules can lead to conflicts and unexpected errors,
making it difficult to debug and manage dependencies.

If you do wish to use conda, we recommend installing all packages through conda itself, even those that are also available as modules.
This may cost some performance, but it avoids conflicts between conda and module dependencies.
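
One way to notice an accidental mix is to check the environment before starting a job. The sketch below is a minimal example, assuming that an active conda environment sets `CONDA_PREFIX` and that the module system records loaded modules in `LOADEDMODULES`; the function name `check_mixed_env` is made up for this illustration. If your site loads some modules by default, the check would need to ignore those.

```bash
# Hypothetical helper: warn and return non-zero when a conda environment
# is active while modules are loaded at the same time.
check_mixed_env() {
    if [ -n "${CONDA_PREFIX:-}" ] && [ -n "${LOADEDMODULES:-}" ]; then
        echo "warning: conda environment '$CONDA_PREFIX' is active together with modules: $LOADEDMODULES" >&2
        return 1
    fi
    return 0
}
```

In a job script, `check_mixed_env || exit 1` would stop the job early instead of failing later with a harder-to-read dependency error.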

## Package Availability

Conda has a smaller repository of available packages compared to PyPI, the repository used by pip.
This can limit the availability of specific tools or libraries needed for certain workflows.

## Environment and Installation Issues

### Home Directory Usage

By default, conda installs packages in the user's home directory,
which can quickly fill up disk quotas due to the large number of files and directories it creates.
This is particularly problematic in the HPC environment, where home directory quotas are limited.

If you do wish to use conda, we recommend setting the conda package directory to a location with more storage
space. This can be done by running:

```bash
export CONDA_PKGS_DIRS="$VSC_DATA/conda_pkgs"
```

before installing packages into a conda environment.
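
The package directory only holds conda's download cache, so the environments themselves may still end up in the home directory. The sketch below is one possible way to move both, assuming conda honours the `CONDA_ENVS_PATH` variable for the location of named environments; the helper names `use_conda_dirs` and `count_files` are invented for this example.

```bash
# Hypothetical helper: keep conda's package cache and named environments
# under a single base directory (for example $VSC_DATA).
use_conda_dirs() {
    local base="$1"
    export CONDA_PKGS_DIRS="$base/conda_pkgs"
    # Assumption: conda reads CONDA_ENVS_PATH to decide where named environments go
    export CONDA_ENVS_PATH="$base/conda_envs"
}

# Hypothetical helper: count the regular files below a directory,
# to see how much a conda directory contributes to a file-count quota.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Example (assumes $VSC_DATA is set):
#   use_conda_dirs "$VSC_DATA"
#   count_files "$CONDA_PKGS_DIRS"
```

Calling `use_conda_dirs "$VSC_DATA"` before creating an environment, and `count_files` afterwards, shows how many files a single environment adds.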


### Modification of Configuration Files

Conda modifies the `.bashrc` file in the user's home directory (for example when `conda init` is run),
which can lead to conflicts and unintended side effects in the user's environment setup.

You can avoid changing the `.bashrc` file by activating the environment with `source activate full/path/to/myenv`
instead of `conda activate myenv`.
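
To find out whether a `.bashrc` file has already been modified, you can look for the marker comment that `conda init` writes around its block. The following is a minimal sketch; the function name `has_conda_init_block` is hypothetical, and it only detects the standard `# >>> conda initialize >>>` marker, not manual edits.

```bash
# Hypothetical helper: succeed if the given shell startup file
# (default: ~/.bashrc) contains the block written by `conda init`.
has_conda_init_block() {
    local rcfile="${1:-$HOME/.bashrc}"
    [ -f "$rcfile" ] && grep -q '^# >>> conda initialize >>>' "$rcfile"
}

# Example:
#   if has_conda_init_block; then
#       echo "conda has modified ~/.bashrc"
#   fi
```

If the block is present and causes trouble, it can be removed by deleting everything between the `# >>> conda initialize >>>` and `# <<< conda initialize <<<` lines.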

## Anaconda Licensing

If your organization uses Conda packages from Anaconda channels,
be aware that Anaconda may now be actively enforcing its licensing terms.
Academic, non-profit, and research institutions that previously used Anaconda software for free may now be required to
purchase a commercial license if they have over 200 employees or contractors.
Anaconda has sent legal notices to institutions demanding compliance and threatening backdated billing.
This shift follows changes in Anaconda’s terms of service,
updated in March 2024, and may affect your access to Anaconda’s resources.
Ensure your organization reviews its usage and licensing agreements to avoid potential legal issues.