Support for ESMfold with newer Ubuntu and CUDA version #500

Open
Wan8uq1 opened this issue Nov 2, 2024 · 8 comments

@Wan8uq1

Wan8uq1 commented Nov 2, 2024

Hi,
First of all, thanks for all the work you have put in.
My system is Ubuntu 22.04.
My issue is as follows. I am trying to use ESMfold, which depends on OpenFold. I installed https://github.com/aqlaboratory/openfold/tree/pl_upgrades into my environment and then installed ESMfold via
pip install "fair-esm[esmfold]"
I ran into this error when running the ESMFold Structure Prediction test script from the ESMfold README:
RuntimeError: Keys 'trunk.structure_module.ipa.linear_q_points.linear.weight, trunk.structure_module.ipa.linear_q_points.linear.bias, trunk.structure_module.ipa.linear_kv_points.linear.bias, trunk.structure_module.ipa.linear_kv_points.linear.weight' are missing.
From there I found this thread on the issue: facebookresearch/esm#435
and the resolution seems to be to use OpenFold v1.
However, OpenFold v1 uses CUDA 11.3, which is not supported on Ubuntu 22.04. This really seems like a dead end.
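
For readers trying to interpret the error above: it lists parameter names that the model expects but the loaded checkpoint does not provide. The sketch below reproduces that comparison with plain lists of key names. The expected names are copied from the error message; the `checkpoint` list is a hypothetical illustration that assumes the older layout simply lacks the extra `.linear` level, which has not been verified against the real ESMfold weights.

```python
def missing_keys(expected_keys, checkpoint_keys):
    """Return the names the model expects that the checkpoint lacks, sorted."""
    return sorted(set(expected_keys) - set(checkpoint_keys))


# Copied from the RuntimeError above.
expected = [
    "trunk.structure_module.ipa.linear_q_points.linear.weight",
    "trunk.structure_module.ipa.linear_q_points.linear.bias",
    "trunk.structure_module.ipa.linear_kv_points.linear.weight",
    "trunk.structure_module.ipa.linear_kv_points.linear.bias",
]

# HYPOTHETICAL checkpoint contents: assumes the older parameter layout
# has no ".linear" level under the point projections.
checkpoint = [name.replace(".linear.", ".") for name in expected]

if __name__ == "__main__":
    for name in missing_keys(expected, checkpoint):
        print("missing:", name)
```

With real objects, the two lists would come from `model.state_dict().keys()` and from the keys of the loaded checkpoint dictionary.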

It is frustrating, as I have struggled for a week and couldn't find a solution. I would appreciate any help.
Sincere thanks!

@JoseEspinosa

JoseEspinosa commented Nov 15, 2024

Hi, thanks for your effort in maintaining this project! 😄
I hit the same dead end. In my case, I am trying to make ESMfold work with CUDA 12. For this reason, I followed this comment on this other issue and installed OpenFold from the pl_upgrades branch as suggested. This lets me create the environment (Docker image), but I am also struggling with the error reported by @Wan8uq1. I know this is probably more of a problem on the ESMfold side, which is archived now, but if you could suggest a workaround, that would be great!

@vaclavhanzl
Contributor

@JoseEspinosa @Wan8uq1 Maybe you could just use OpenFold itself in Soloseq mode? It works fine for me with the pl_upgrades branch, provides the ESMfold functionality (and more), and is supported.

@Wan8uq1
Author

Wan8uq1 commented Nov 15, 2024

Hi @vaclavhanzl, thanks so much for the reply! I found the solution myself earlier. The trick seems to be to separate the conda dependencies from the pip packages when setting up the conda env, and to make sure the environment uses PyTorch 1.12.1 + CUDA 11.3 + cuDNN 8.9 + GCC 7. Indeed, Ubuntu 22.04 doesn't officially support CUDA 11.3, but with the right GCC version it can be compiled. @JoseEspinosa, I hope this helps.
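
Since the compiler version matters here, it may help to check which GCC is first on the PATH before building anything. The sketch below is a minimal check and not part of the original recipe; the helper names `major_version` and `gcc_major` are made up for illustration, and it assumes that the `gcc` found on PATH is the one the build will actually use, which may not hold in every setup.

```python
import re
import shutil
import subprocess


def major_version(version_text):
    """Return the leading integer of a version string such as '7.5.0', or None."""
    match = re.match(r"\s*(\d+)", version_text)
    return int(match.group(1)) if match else None


def gcc_major(executable="gcc"):
    """Return the major version of `executable`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    # `gcc -dumpversion` prints only the version, e.g. "7" or "7.5.0".
    result = subprocess.run([path, "-dumpversion"], capture_output=True, text=True)
    return major_version(result.stdout)


if __name__ == "__main__":
    print("gcc major version on PATH:", gcc_major(), "(this thread used GCC 7)")
```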

My environment is as follows: Ubuntu 22.04, with nvidia-smi reporting CUDA 12.2. Here is what I did:

Save the following as conda_environment.yml

name: esmfold
channels:
 - conda-forge
 - bioconda
 - pytorch
dependencies:
 - conda-forge::python=3.7
 - conda-forge::setuptools=59.5.0
 - conda-forge::pip
 - conda-forge::openmm=7.5.1
 - conda-forge::pdbfixer
 - conda-forge::cudatoolkit==11.3.*
 - conda-forge::cudatoolkit-dev==11.3.*
 - conda-forge::einops==0.6.1
 - conda-forge::fairscale
 - conda-forge::omegaconf
 - conda-forge::hydra-core
 - conda-forge::pandas
 - conda-forge::pytest
 - bioconda::hmmer==3.3.2
 - bioconda::hhsuite==3.3.0
 - bioconda::kalign2==2.04
 - pytorch::pytorch=1.12.*

Run the following command to create the esmfold environment:

conda env create -f conda_environment.yml

Save the following as pip_requirements.txt

biopython==1.79
deepspeed==0.5.9
dm-tree==0.1.6
ml-collections==0.1.0
numpy==1.21.2
PyYAML==5.4.1
requests==2.26.0
scipy==1.7.1
tqdm==4.62.2
typing-extensions==3.10.0.2
pytorch_lightning==1.5.10
wandb==0.12.21
biotite==0.39.0
matplotlib
joblib

Activate the conda environment (esmfold, or whatever you named it), then install the pip packages with the following command:

pip install -r pip_requirements.txt

Then run the installation commands provided by ESMfold:

pip install fair-esm           # Latest release
pip install git+https://github.com/facebookresearch/esm.git  # Main branch
pip install "fair-esm[esmfold]"
# OpenFold and its dependencies
pip install 'dllogger @ git+https://github.com/NVIDIA/dllogger.git'
pip install 'openfold @ git+https://github.com/aqlaboratory/openfold.git@4b41059694619831a7db195b7e0988fc4ff3a307'

Then it should be good to go.
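
A quick sanity check after these steps is to confirm that the environment actually picked up the intended PyTorch and CUDA builds. The following is a minimal sketch, not part of the original recipe: the helper names `matches_prefix` and `report` are made up for illustration, and the expected prefixes (PyTorch 1.12, CUDA 11.3) are taken from the versions mentioned in this thread.

```python
import importlib

# Version prefixes taken from this thread; adjust if your setup differs.
EXPECTED = {"torch": "1.12", "cuda": "11.3"}


def matches_prefix(actual, expected_prefix):
    """True if the dotted version `actual` starts with the components of `expected_prefix`."""
    if actual is None:
        return False
    wanted = expected_prefix.split(".")
    # Drop a local build tag such as "+cu113" before comparing components.
    found = actual.split("+")[0].split(".")
    return found[: len(wanted)] == wanted


def report():
    """Collect the installed torch and CUDA versions and compare them with EXPECTED."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return {"torch": None, "cuda": None, "ok": False}
    found = {"torch": torch.__version__, "cuda": torch.version.cuda}
    found["ok"] = all(matches_prefix(found[key], prefix) for key, prefix in EXPECTED.items())
    return found


if __name__ == "__main__":
    print(report())
```

Run it inside the activated environment. If `ok` is false, the printed versions show which part of the stack differs from the combination described above.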

Wan8uq1 closed this as completed Nov 15, 2024
Wan8uq1 reopened this Nov 15, 2024
@vaclavhanzl
Contributor

Happy to hear you made it work @Wan8uq1, and thanks a lot for sharing your solution.

I'd still love to hear what ESMfold provides that OpenFold's Soloseq does not. I guess it uses a different set of weights, so you might sometimes get slightly different predictions?

@Wan8uq1
Author

Wan8uq1 commented Nov 15, 2024

Hi, my reason for using ESMfold over OpenFold Soloseq is specific to my project: I am trying to do structural clustering for some proteins, and some of their structures come from the ESMfold repo, so I just want to be consistent for the ones I have to predict myself.

@vaclavhanzl
Contributor

OK, thanks for the explanation @Wan8uq1 and good luck!

@JoseEspinosa

Thanks both for the feedback! 😄
Regarding why we use ESMfold: we maintain the nf-core/proteinfold pipeline, which implements several structure prediction tools. We wanted to keep supporting ESMfold, but it is true that this may be tricky if it is no longer maintained. Regarding OpenFold, we have actually had an issue open for a long time to implement it in the pipeline, so it is probably time to finally do so.

@lee-t

lee-t commented Nov 18, 2024

I don't see a way to run Soloseq for multimer prediction.
