This project brings together a collection of utilities that have been factored out from different projects. In certain cases we need specific functionality that does not appear to be available in existing packages. We therefore develop the required code in-house, and if it turns out to be general enough, we add it to this collection in the hope that it may be useful to others.
This library has moved from https://gitlab.jsc.fz-juelich.de/hpc4ns/hpc4neuro to GitHub and will, where possible, be supported by the Multiscale team.
This library was written by Fahad Khalid at Forschungszentrum Juelich GmbH.
MIT License Copyright (c) 2019 Forschungszentrum Juelich GmbH
Questions can be posted at the main repository, https://github.com/multiscale-cosim/EBRAINS-cosim, or sent by email to [email protected]
The hpc4neuro package requires Python 3.6 or above. To install, please use the following command:
python -m pip install git+https://github.com/multiscale-cosim/HPC4Neuro.git
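As a quick, purely illustrative sanity check, the installed package should then be importable from Python:

# Verify that the installed package can be imported
import hpc4neuro
print(hpc4neuro.__name__)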
The following modules are available at this time:
- hpc4neuro.distribution
Note: This module requires mpi4py. To install mpi4py, please follow the installation instructions available here.
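In many environments mpi4py can alternatively be installed via pip, assuming a working MPI implementation and build tools are already available on the system:

python -m pip install mpi4py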
This module exposes the following two classes:
- DataDistributor
- ErrorHandler
Any function that returns a sized iterable (i.e., an object that supports iter() and len(), e.g., list) can be decorated by DataDistributor to seamlessly distribute the items of the resulting object across all participating MPI ranks. Moreover, ErrorHandler implements exception-handling functions that ensure graceful application termination via synchronization of all MPI ranks.
The primary motivation for creating this module was to hide the details of distributing training/validation data amongst MPI ranks when training deep artificial neural networks in a data-parallel fashion using Horovod. Even though Horovod hides the intricate details of distributed training, proper distribution of training/validation data is only possible via MPI programming.
The hpc4neuro.distribution module provides a high-level interface for data distribution with MPI, without requiring the user to write any explicit MPI code. The following examples show what the module does and how it can be useful.
Note: All examples are available in the hpc4neuro.examples.distribution package.
Consider the following code, which defines a simple function that returns a list of the filenames in a given directory.
import os

def get_filenames(path):
    return os.listdir(path)

# List of the filenames in the 'hpc4neuro' directory
filenames = get_filenames('./hpc4neuro')
Now consider a scenario in which we need to run this code on multiple processors across multiple nodes in a cluster, and distribute the returned filenames across all the processes. The following example shows how the hpc4neuro.distribution module can help with that.
import os
from mpi4py import MPI
from hpc4neuro.distribution import DataDistributor

@DataDistributor(MPI.COMM_WORLD)
def get_filenames(path):
    return os.listdir(path)

# List of rank-local file names
filenames = get_filenames('./hpc4neuro')
DataDistributor decorates the get_filenames function such that calling the function returns only a subset of filenames that are to be processed by the local MPI rank. All the MPI communication required for distribution of filenames is hidden from the user.
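For illustration only, the following two lines (not part of the library) can be appended to the example above; when the script is launched on several MPI ranks, e.g., via mpirun -np 4 python example.py (the script name is a placeholder), each rank reports its own portion of the directory listing:

# Report the rank-local subset of filenames (illustrative output only)
rank = MPI.COMM_WORLD.Get_rank()
print(f'Rank {rank} received {len(filenames)} file(s): {filenames}')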
In certain scenarios it is not possible to statically decorate a function using the decorator syntax, e.g., when the MPI communicator object is not available at the time of function definition. The following example demonstrates the use of DataDistributor in such cases.
import os
from mpi4py import MPI
from hpc4neuro.distribution import DataDistributor

# Initialize the decorator
dist_decorator = DataDistributor(MPI.COMM_WORLD)

# Decorate the function that reads a list of filenames
get_rank_local_filenames = dist_decorator(os.listdir)

# Use the decorated function to get the rank-local list of filenames
filenames = get_rank_local_filenames('./hpc4neuro')
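The same pattern also covers the case where the communicator only becomes available as a function argument. The following sketch is purely illustrative (the wrapper function is not part of the library):

import os
from mpi4py import MPI
from hpc4neuro.distribution import DataDistributor

def get_rank_local_filenames(path, comm):
    # The communicator is only known here, so the decorator is applied dynamically
    return DataDistributor(comm)(os.listdir)(path)

filenames = get_rank_local_filenames('./hpc4neuro', MPI.COMM_WORLD)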
A function to be decorated by DataDistributor, such as os.listdir in the examples above, may raise an exception. Moreover, exceptions may be raised by DataDistributor itself due to other errors. In both cases, if an exception is raised by one MPI rank, the other MPI ranks may get stuck in a waiting state, unaware of the raised exception. To handle such a scenario and ensure graceful termination of the application, a flag can be set in the DataDistributor initializer to enable a graceful shutdown on error. The following code examples illustrate how to enable this feature with both the static and dynamic decoration syntax:
- Static: @DataDistributor(MPI.COMM_WORLD, shutdown_on_error=True)
- Dynamic: dist_decorator = DataDistributor(MPI.COMM_WORLD, shutdown_on_error=True)
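As a minimal sketch of the static variant (the non-existent path is chosen deliberately so that os.listdir raises an exception):

import os
from mpi4py import MPI
from hpc4neuro.distribution import DataDistributor

# Enable a graceful shutdown so that an exception does not leave other ranks
# blocked in MPI communication
@DataDistributor(MPI.COMM_WORLD, shutdown_on_error=True)
def get_filenames(path):
    return os.listdir(path)

# The invalid path makes os.listdir() raise; with shutdown_on_error=True all
# ranks terminate gracefully instead of potentially hanging
filenames = get_filenames('./no/such/directory')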
API documentation for hpc4neuro.distribution is available here.
- Clone this repository
- Change to the cloned repository directory
- Create and activate a virtual environment
- If you use poetry, run poetry install to install all the required dependencies
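For example, assuming a bash-compatible shell and the repository URL given above, these steps might look as follows (the directory and environment names are illustrative):

git clone https://github.com/multiscale-cosim/HPC4Neuro.git
cd HPC4Neuro
python -m venv venv
source venv/bin/activate
poetry install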
To generate the API documentation using sphinx, issue the following commands from the repository root:
sphinx-build -b html doc doc/html
sphinx-build -b text doc doc/text
pytest is required for running and working with the test code for this project.
Use the following command to run tests:
mpirun -np <n> python -m pytest
where <n> should be replaced with the number of MPI ranks to use for testing.