HCCL (Habana Collective Communication Library) supports inter-node communication based on OFI libfabric.
The HCCL OFI wrapper was introduced to act as a thin layer between HCCL and the libfabric APIs.
To use HCCL over OFI libfabric:
- libfabric should be downloaded and installed.
- HCCL OFI Wrapper should be downloaded and built.
libfabric must be downloaded and installed before it can be used.
Follow the instructions below:
- Define the libfabric version to install (for example, 1.22.0)
export REQUIRED_VERSION=<version>
- Download the libfabric tarball from https://github.com/ofiwg/libfabric/releases
wget https://github.com/ofiwg/libfabric/releases/download/v$REQUIRED_VERSION/libfabric-$REQUIRED_VERSION.tar.bz2 -P /tmp/libfabric
- Push the temporary download directory onto the directory stack
pushd /tmp/libfabric
- Extract the tarball
tar -xf libfabric-$REQUIRED_VERSION.tar.bz2
- Define the libfabric root location
export LIBFABRIC_ROOT=<libFabric library location>
- Create the libfabric folder
mkdir -p ${LIBFABRIC_ROOT}
- Change permissions for the libfabric folder
chmod 777 ${LIBFABRIC_ROOT}
- Change directory to the libfabric folder created by extracting the tarball
cd libfabric-$REQUIRED_VERSION/
- Configure libfabric
./configure --prefix=$LIBFABRIC_ROOT --with-synapseai=/usr
- Build and install libfabric
make -j 32 && make install
- Pop the directory stack to return to the previous directory
popd
- Delete the temporary download directory
rm -rf /tmp/libfabric
- Include LIBFABRIC_ROOT in LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=$LIBFABRIC_ROOT/lib:$LD_LIBRARY_PATH
Installation can be verified by running:
fi_info --version
For more information please see: https://github.com/ofiwg/libfabric
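As a quick sanity check after the installation steps above, the following sketch reports whether the libfabric library directory is on LD_LIBRARY_PATH and whether the fi_info binary exists under the install prefix. The function names path_list_contains and check_libfabric_install are hypothetical helpers, not part of libfabric or HCCL. The bin/ and lib/ layout under LIBFABRIC_ROOT is an assumption based on the --prefix passed to ./configure.

```shell
#!/bin/sh
# Hypothetical helper (not part of libfabric or HCCL): returns 0 when
# directory $2 is an entry of the colon-separated path list $1.
# A trailing slash on either side is ignored.
path_list_contains() {
    plc_dir="${2%/}"
    case ":$1:" in
        *":$plc_dir:"* | *":$plc_dir/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on a libfabric install rooted at $1.
# Assumption: the install has bin/ and lib/ directly under the prefix.
check_libfabric_install() {
    cli_root="${1%/}"
    cli_rc=0
    if path_list_contains "${LD_LIBRARY_PATH:-}" "$cli_root/lib"; then
        echo "OK: $cli_root/lib is on LD_LIBRARY_PATH"
    else
        echo "MISSING: $cli_root/lib is not on LD_LIBRARY_PATH"
        cli_rc=1
    fi
    if [ -x "$cli_root/bin/fi_info" ]; then
        echo "OK: found $cli_root/bin/fi_info"
    else
        echo "MISSING: $cli_root/bin/fi_info not found or not executable"
        cli_rc=1
    fi
    return "$cli_rc"
}
```

After sourcing these definitions, calling check_libfabric_install "$LIBFABRIC_ROOT" prints one line per check and returns non-zero if either check fails. If fi_info is not found on PATH, it may still be runnable as $LIBFABRIC_ROOT/bin/fi_info under the same layout assumption.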
To use the libfabric library with HCCL, the HCCL OFI wrapper must be built.
Follow the instructions below:
- Clone wrapper from https://github.com/HabanaAI/hccl_ofi_wrapper
git clone https://github.com/HabanaAI/hccl_ofi_wrapper.git
- Define LIBFABRIC_ROOT
export LIBFABRIC_ROOT=<libFabric library location>
- Change directory to hccl_ofi_wrapper
cd hccl_ofi_wrapper
- Build wrapper
make
- Copy wrapper to /usr/lib/habanalabs/
cp libhccl_ofi_wrapper.so /usr/lib/habanalabs/libhccl_ofi_wrapper.so
- Run ldconfig utility
ldconfig
- Include libhccl_ofi_wrapper.so location in LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/habanalabs/