Merge branch 'develop' into patch_tensorflow
ashao authored Oct 17, 2024
2 parents ce42622 + 5c2de47 commit d4360b1
Showing 3 changed files with 93 additions and 2 deletions.
9 changes: 7 additions & 2 deletions doc/changelog.md
@@ -9,13 +9,12 @@ Jump to:

## SmartSim

### develop

To be released at some point in the future

Description

- Implement workaround for TensorFlow that allows RedisAI to build with GCC-14
- Add instructions for installing SmartSim on PML's Scylla

Detailed Notes

@@ -26,6 +25,12 @@ Detailed Notes
Future versions of TensorFlow may fix this problem, but for now this seems to be
the best workaround.
([SmartSim-PR738](https://github.com/CrayLabs/SmartSim/pull/738))
- PML's Scylla is still under development. The usual SmartSim
build instructions do not apply because the GPU dependencies
have yet to be installed at a system-wide level. Scylla has
its own entry in the documentation.
([SmartSim-PR733](https://github.com/CrayLabs/SmartSim/pull/733))


### 0.8.0

2 changes: 2 additions & 0 deletions doc/installation_instructions/platform.rst
@@ -20,6 +20,8 @@ that SmartSim may be used on.

.. include:: platform/olcf-summit.rst

.. include:: platform/pml-scylla.rst

.. _site_installation:

.. include:: site-install.rst
84 changes: 84 additions & 0 deletions doc/installation_instructions/platform/pml-scylla.rst
@@ -0,0 +1,84 @@
PML Scylla
==========

.. warning::

    As of September 2024, the software stack on Scylla is still being finalized.
    Therefore, please consider these instructions as preliminary for now.

One-time Setup
--------------

To install SmartSim on Scylla, follow these steps:

**Step 1:** Create and activate a Python virtual environment for SmartSim:

.. code:: bash

    module use /scyllapfs/hpe/ashao/smartsim_dependencies/modulefiles
    module load cudatoolkit cudnn git
    python -m venv /scyllafps/scratch/$USER/venvs/smartsim
    source /scyllafps/scratch/$USER/venvs/smartsim/bin/activate

**Step 2:** Build the SmartRedis C++ and Fortran libraries:

.. code:: bash

    git clone https://github.com/CrayLabs/SmartRedis.git
    cd SmartRedis
    make lib-with-fortran
    pip install .
    cd ..

**Step 3:** Install SmartSim in the virtual environment:

.. code:: bash

    pip install git+https://github.com/CrayLabs/SmartSim.git

**Step 4:** Build Redis, RedisAI, the backends, and all the Python packages:

.. code:: bash

    export TORCH_CUDA_ARCH_LIST="8.0 8.6 8.9 9.0"  # Workaround for a PyTorch problem
    smart build --device=cuda-12
    module unload cudnn  # Workaround for a PyTorch problem

.. note::

    The first workaround is needed because the autodetection of CUDA
    architectures is not consistent with one of PyTorch's dependencies.
    This seems to be unique to this machine, as we do not see it on other
    platforms.

    The second workaround is needed because PyTorch 2.3 (and possibly 2.2)
    will attempt to load the version of cuDNN found in ``LD_LIBRARY_PATH``
    instead of the version shipped with PyTorch itself. This results in
    unresolved symbols.
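
Because the second workaround depends on cuDNN being absent from ``LD_LIBRARY_PATH``, it may help to confirm that before running SmartSim. The following is only a sketch and is not part of SmartSim: the function name ``has_cudnn_in_path`` is hypothetical, and it assumes that the cuDNN module's library directory contains the string ``cudnn`` (in any letter case), which may not hold on every system.

```shell
# Hypothetical helper (not part of SmartSim): succeed if any entry of a
# colon-separated library path mentions cuDNN.
# Assumption: the cuDNN install directory has "cudnn" somewhere in its name.
has_cudnn_in_path() {
    # Use the first argument if given, otherwise the current LD_LIBRARY_PATH.
    search_path="${1:-${LD_LIBRARY_PATH:-}}"
    # Lower-case the path so that "cuDNN" and "CUDNN" also match.
    case "$(printf '%s' "$search_path" | tr '[:upper:]' '[:lower:]')" in
        *cudnn*) return 0 ;;
        *)       return 1 ;;
    esac
}

if has_cudnn_in_path; then
    echo "cuDNN found in LD_LIBRARY_PATH; run 'module unload cudnn' first"
else
    echo "No cuDNN entry found in LD_LIBRARY_PATH"
fi
```

A plain substring match is used here for simplicity; it cannot tell a system cuDNN apart from any other directory that happens to contain the same string.
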

**Step 5:** Check that SmartSim has been installed and built correctly:

.. code:: bash

    srun -n 1 -p gpu --gpus=1 --pty smart validate --device gpu

The following output indicates a successful install:

.. code:: bash

    [SmartSim] INFO Verifying Tensor Transfer
    [SmartSim] INFO Verifying Torch Backend
    [SmartSim] INFO Verifying ONNX Backend
    [SmartSim] INFO Verifying TensorFlow Backend
    16:26:35 login SmartSim[557020:MainThread] INFO Success!

Post-installation
-----------------

After completing the above steps to install SmartSim in a Python virtual
environment, you can reactivate the environment by running the following commands:

.. code:: bash

    module load cudatoolkit/12.4.1 git  # cudnn should NOT be loaded
    source /scyllafps/scratch/$USER/venvs/smartsim/bin/activate
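
In a fresh shell it may be worth checking that the environment was actually restored before launching any work. The following is a sketch under stated assumptions, not part of SmartSim: the function name ``smartsim_venv_active`` is hypothetical, the default path is the one used in Step 1, and the check relies on the ``VIRTUAL_ENV`` variable that the standard ``venv`` activate script exports.

```shell
# Hypothetical check (not part of SmartSim): succeed if the active virtual
# environment is the one created in Step 1.
smartsim_venv_active() {
    # Use the first argument if given, otherwise the path from Step 1.
    expected="${1:-/scyllafps/scratch/${USER:-}/venvs/smartsim}"
    # VIRTUAL_ENV is exported by the venv activate script.
    [ "${VIRTUAL_ENV:-}" = "$expected" ]
}

if smartsim_venv_active; then
    echo "SmartSim virtual environment is active"
else
    echo "SmartSim virtual environment is not active; source its activate script"
fi
```

The comparison is an exact string match, so it may report a false negative if the environment was created through a symlinked or differently spelled path.
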
