From e0bf2c9602f9e811426a260517e90d18cd593f74 Mon Sep 17 00:00:00 2001 From: Tom Vander Aa Date: Tue, 25 Jun 2024 09:31:34 +0200 Subject: [PATCH 1/4] docs: update workflow badge --- README.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/README.rst b/README.rst index 57a4d404..f32d9bbb 100644 --- a/README.rst +++ b/README.rst @@ -1,7 +1,7 @@ SMURFF - Scalable Matrix Factorization Framework ================================================ -|Azure Build Status| |Anaconda-Server Badge| +|GitHub Build Status| |Anaconda-Server Badge| What is Bayesian Matrix Factorization ------------------------------------- @@ -76,17 +76,17 @@ Citing SMURFF ------------- If you are using SMURFF in a scientific publication, please cite the following preprint plus the paper describing the corresponding algorithm: - + SMURFF: a High-Performance Framework for Matrix Factorization arXiv preprint `arXiv:1904:02514 `_ - + When using pure Bayesian Probabilistic Matrix Factorization, please also cite: - Salakhutdinov R, Mnih A. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th international conference on Machine learning (ICML '08), 2008. ACM, New York, NY, USA, 880-887. - + Salakhutdinov R, Mnih A. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th international conference on Machine learning (ICML '08), 2008. ACM, New York, NY, USA, 880-887. + When using Bayesian Factorization with Side Information, please also cite: Simm J, Arany Á, Zakeri P, Haber T, Wegner JK, Chupakhin V, Ceulemans H, Moreau Y. Macau: Scalable Bayesian Factorization with High-Dimensional Side Information Using MCMC Proc. of the Machine Learning for Signal Processing (MLSP), 2017 IEEE 27th International Workshop on MLSP; 2017; Vol. 2017-September; pp. 1 - 6. Tokyo, Japan. 
- + When using Group Factor Analysis, please also cite: Klami A, Virtanen S, Leppäaho E, Kaski S., "Group Factor Analysis," in IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 9, pp. 2136-2147, Sept. 2015. @@ -98,8 +98,8 @@ Acknowledgements Over the course of the last 5 years, this work has been supported by the EU H2020 FET-HPC projects EPEEC (contract #801051), ExCAPE (contract #671555) and EXA2CT (contract #610741), and the Flemish Exaptation project. -.. |Azure Build Status| image:: https://dev.azure.com/ExaScience/smurff/_apis/build/status/ExaScience.smurff?branchName=master - :target: https://dev.azure.com/ExaScience/smurff/_build +.. |GitHub Build Status| image:: https://github.com/ExaScience/smurff/actions/workflows/build_linux.yml/badge.svg + :target: https://github.com/ExaScience/smurff .. |Anaconda-Server Badge| image:: https://anaconda.org/vanderaa/smurff/badges/version.svg :target: https://conda.anaconda.org/vanderaa From 0a292216b70033e0d99f9c5cfaf02cf40516af36 Mon Sep 17 00:00:00 2001 From: Tom Vander Aa Date: Fri, 28 Jun 2024 08:56:53 +0200 Subject: [PATCH 2/4] docs: update install docs --- docs/INSTALL.rst | 74 ++++++++++++++++++++---------------------------- 1 file changed, 30 insertions(+), 44 deletions(-) diff --git a/docs/INSTALL.rst b/docs/INSTALL.rst index 8e698b97..4621a1f6 100644 --- a/docs/INSTALL.rst +++ b/docs/INSTALL.rst @@ -1,13 +1,17 @@ Compilation of SMURFF ===================== -Note: the easiest way to install SMURFF is not to build it yourself. Install the binary +Note: the easiest way to install SMURFF is not to build it yourself. Install the binary `Conda `__ package: .. code:: bash conda install -c vanderaa smurff + +When compiling SMURFF yourself, you have 3 options: + + Compilation using `conda build` ------------------------------- @@ -15,67 +19,49 @@ Conda build works on Linux, macOS and Windows. Execute .. 
code:: bash - conda build smurff - + conda build -c conda-forge -c vanderaa smurff + in the `conda-recipes` directory. -Compilation using CMake ------------------------ +Compile the standalone binary `smurff` using CMake +-------------------------------------------------- + +This will not create the `smurff` python package. C++ Requirements ~~~~~~~~~~~~~~~~ -- CMake 3.6 or later -- Eigen3 version 3.3.7 or later -- HighFive 2.2. from https://github.com/BlueBrain/HighFive/ +- CMake 3.15 or later +- Eigen3 version 3.3.7 or later +- HighFive 2.9. from https://github.com/BlueBrain/HighFive/ - Boost 1.5x or newer -Python Requirements -~~~~~~~~~~~~~~~~~~~ - -As in setup.py: - - install_requires = [ 'numpy', 'scipy', 'pandas', 'scikit-learn', 'h5sparse-tensor' ], - setup_requires = ['setuptools_scm', 'pybind11' ], +CMake Options +~~~~~~~~~~~~~ -Compile using setup.py -~~~~~~~~~~~~~~~~~~~~~~ +- Build type switches: + - `-DCMAKE\_BUILD\_TYPE` - Debug/Release -Running - setup.py install +- Algebra library: you can specify + - `-DENABLE\_BLAS` - ON/OFF: BLAS acceleration for Eigen is enabled by default. + - `-DBLA_VENDOR` allows you to specify which BLAS implementation to use. -will run CMake to configure, compile and install SMURFF. -Extra arguments to CMake can be passed with +- Other: look in the top-level `CMakeLists.txt` for more options. - setup.py --extra-cmake-args <...> install - -or by setting the `CMAKE_ARGS` environment variables. +Python package using pip +------------------------ -CMake Options -~~~~~~~~~~~~~ +The python package is built using scikit-build-core , +which calls CMake to compile the C++ extension. Hence you can simply run: -- Build type switches: - - CMAKE\_BUILD\_TYPE - Debug/Release +.. code:: bash -- Algebra library switches (select only one): - - When no switches are specified, CMake will try to find - any LAPACK and BLAS library on your system. - - ENABLE\_OPENBLAS - ON/OFF (should include openblas - library when linking. 
openblas also contains - implementation of lapack called relapack) - - ENABLE\_MKL - ON/OFF: tries to find the `MKL single dynamic - library `_. + pip install . -- Python: - - ENABLE\_PYTHON -Linux and macOs Specific -~~~~~~~~~~~~~~~~~~~~~~~~ +Linux and macOs Specific +------------------------ Have a look in `ci/ <../ci/>`__ for Docker build scripts and for Linux+macOS wheel scripts. These scripts should give you a good idea on how to compiler on an Ubuntu and macOS system. -Windows Specific -~~~~~~~~~~~~~~~~ - -Work for a vcpkg-based build is in progress. From dca46080412a3af496d4b2caeee558b7b0abb3d2 Mon Sep 17 00:00:00 2001 From: Tom Vander Aa Date: Fri, 28 Jun 2024 08:57:26 +0200 Subject: [PATCH 3/4] notebooks: do not use direct method with ecfp. This will lead to an out-of-memory error. --- docs/notebooks/different_noise_models.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/notebooks/different_noise_models.ipynb b/docs/notebooks/different_noise_models.ipynb index a214fba0..9dc3d459 100644 --- a/docs/notebooks/different_noise_models.ipynb +++ b/docs/notebooks/different_noise_models.ipynb @@ -166,7 +166,7 @@ "\n", "## using activity threshold pIC50 > 6. 
to binarize train data\n", "trainSession.addTrainAndTest(ic50_train, ic50_test, smurff.ProbitNoise(ic50_threshold))\n", - "trainSession.addSideInfo(0, ecfp, direct = True)\n", + "trainSession.addSideInfo(0, ecfp, direct = False)\n", "predictions = trainSession.run()\n", "print(\"RMSE = %.2f\" % smurff.calc_rmse(predictions))\n", "print(\"AUC = %.2f\" % smurff.calc_auc(predictions, ic50_threshold))" From 4b42322266008087fa5c7e52ef23c43cd84ebce3 Mon Sep 17 00:00:00 2001 From: Tom Vander Aa Date: Fri, 28 Jun 2024 09:11:04 +0200 Subject: [PATCH 4/4] docs: update sphinx conf and deps --- docs/conf.py | 3 +-- docs/requirements.txt | 1 + 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/conf.py b/docs/conf.py index b550bb40..a8e393bd 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -20,7 +20,6 @@ # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. sys.path.insert(0, os.path.abspath('.')) -sys.path.insert(0, os.path.abspath('../python/matrix_io')) # Exclude build directory and Jupyter backup files: exclude_patterns = ['_build', '**.ipynb_checkpoints'] @@ -116,7 +115,7 @@ # # This is also used if you do content translation via gettext catalogs. # Usually you set "language" from the command line for these cases. -language = None +language = "en" # There are two options for replacing |today|: either, you set today to some # non-false value, then it is used: diff --git a/docs/requirements.txt b/docs/requirements.txt index a5023af6..99f7f5bb 100644 --- a/docs/requirements.txt +++ b/docs/requirements.txt @@ -1,5 +1,6 @@ ipython sphinx>=1.4 +sphinx_rtd_theme ipykernel nbsphinx pygments>=2.6.1
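Note on PATCH 2/4: the CMake options documented in the updated docs/INSTALL.rst (`CMAKE_BUILD_TYPE`, `ENABLE_BLAS`, `BLA_VENDOR`) can be combined into a single configure call. The following is a minimal sketch, not part of the patch, that only assembles such a command line. The helper name `cmake_configure_command`, the source directory `.`, the build directory `build`, and the vendor value `OpenBLAS` are assumptions chosen for illustration; the `-S`/`-B` flags are standard in CMake 3.13 and later, which fits the 3.15 minimum stated in the docs.

```python
import shlex


def cmake_configure_command(build_type="Release", enable_blas=True,
                            bla_vendor=None, source_dir=".", build_dir="build"):
    """Assemble a CMake configure command from the options named in INSTALL.rst.

    Hypothetical helper for illustration only; it does not run CMake.
    """
    # INSTALL.rst lists Debug/Release as the build type switches.
    if build_type not in ("Debug", "Release"):
        raise ValueError("build_type must be 'Debug' or 'Release'")

    args = [
        "cmake",
        "-S", source_dir,   # assumed location of the top-level CMake file
        "-B", build_dir,    # assumed out-of-source build directory
        f"-DCMAKE_BUILD_TYPE={build_type}",
        f"-DENABLE_BLAS={'ON' if enable_blas else 'OFF'}",
    ]
    # BLA_VENDOR is optional; pass it only when a specific BLAS is requested.
    if bla_vendor is not None:
        args.append(f"-DBLA_VENDOR={bla_vendor}")
    return args


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand.
    print(shlex.join(cmake_configure_command(bla_vendor="OpenBLAS")))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` later without shell quoting concerns.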