Training Configuration of pre-trained MPNN_CNN #58

Open
pykao opened this issue Dec 2, 2020 · 12 comments
Labels
enhancement New feature or request

Comments

@pykao
Contributor

pykao commented Dec 2, 2020

Hi Kexin Huang,

I am using the provided pre-trained MPNN_CNN model. When I looked into its model configuration file, it looked weird to me.

{'input_dim_drug': 1024,
'input_dim_protein': 8420,
'hidden_dim_drug': 128,
'hidden_dim_protein': 256,
'cls_hidden_dims': [1024, 1024, 512],
'batch_size': 16,
'train_epoch': 1,
'LR': 0.001,
'drug_encoding': 'MPNN',
'target_encoding': 'CNN',
'result_folder': './result/',
'binary': False,
'mpnn_hidden_size': 128,
'mpnn_depth': 3,
'cnn_target_filters': [32, 64, 96],
'cnn_target_kernels': [4, 8, 12],
'num_workers': 0,
'decay': 0}

Did you train this model for only 1 epoch with a batch size of 16?
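
A check like the one raised here can be scripted. The following is a minimal sketch in plain Python, not part of DeepPurpose: `find_training_red_flags` is a hypothetical helper, and the `min_epochs=5` threshold is an arbitrary assumption chosen only for illustration. The values in `config` are copied (abridged) from the configuration above.

```python
# Values copied (abridged) from the configuration listed above.
config = {
    "batch_size": 16,
    "train_epoch": 1,
    "LR": 0.001,
    "drug_encoding": "MPNN",
    "target_encoding": "CNN",
}


def find_training_red_flags(config, min_epochs=5):
    """Return warnings for settings that hint at an under-trained model.

    Hypothetical helper; `min_epochs` is an assumed threshold, not a
    value recommended by the library.
    """
    flags = []
    epochs = config.get("train_epoch")
    if epochs is None:
        flags.append("train_epoch is missing from the config")
    elif epochs < min_epochs:
        flags.append(
            f"train_epoch={epochs} is below the assumed minimum of {min_epochs}"
        )
    return flags


for flag in find_training_red_flags(config):
    print(flag)  # prints: train_epoch=1 is below the assumed minimum of 5
```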

Best regards,
Po-Yu Kao

@kexinhuang12345
Owner

That's weird, I must have stored the wrong model. Let me double-check and I will upload the correct model.

@kexinhuang12345
Owner

kexinhuang12345 commented Dec 3, 2020

Hey, it seems this model is wrong. You can use "MPNN_CNN_BindingDB_IC50" instead. It is trained on a much larger training set (~10^5 -> 10^6) and should have higher quality. Do note that the unit now switches from Kd to IC50.

@pykao
Contributor Author

pykao commented Dec 4, 2020

Did you use the latest BindingDB to train this model?

@kexinhuang12345
Owner

Hey, it uses an older version, 2020m2. There should be some minor differences from the most up-to-date version in the number of training points.

@pykao
Contributor Author

pykao commented Dec 4, 2020

Thank you for your reply 👍🏽 Please let me know if you want the MPNN_CNN model trained on BindingDB using Kd.

pykao closed this as completed Dec 4, 2020
@kexinhuang12345
Owner

No problem! Did you mean you managed to train the model? If so, it would be great if you could share it with me ([email protected]), thanks!

@kexinhuang12345
Owner

You can simply use the model.save('XXX') function and then send me the model file; I will upload it to the server and update the link. Thanks again!
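
If the save step writes a directory rather than a single file, bundling it into one archive makes it easier to send. The sketch below uses only the Python standard library; it assumes the saved model lives in a directory (here 'XXX') holding files such as `model.pt` and `config.pkl`, which is an assumption about the on-disk layout and not something confirmed in this thread. `package_model_dir` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def package_model_dir(model_dir):
    """Zip a saved-model directory and return the path of the archive.

    Hypothetical helper. The archive is written next to the directory,
    e.g. 'XXX' becomes 'XXX.zip', and keeps the directory name as the
    top-level entry inside the zip.
    """
    model_dir = Path(model_dir).resolve()
    if not model_dir.is_dir():
        raise FileNotFoundError(f"{model_dir} is not a directory")
    return shutil.make_archive(
        str(model_dir),              # archive path without the .zip suffix
        "zip",
        root_dir=model_dir.parent,   # paths in the archive are relative to this
        base_dir=model_dir.name,     # only this directory is archived
    )


# Usage, assuming the model was saved to ./XXX beforehand:
# archive_path = package_model_dir("XXX")
```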

@chemlove

chemlove commented Jul 10, 2021

Hi Kexin,
It seems that the pre-trained MPNN_CNN model downloaded using pretrained_dir = download_pretrained_model('pretrained_models') in oneliner.py still shows the old configuration:

{'input_dim_drug': 1024,
'input_dim_protein': 8420,
'hidden_dim_drug': 128,
'hidden_dim_protein': 256,
'cls_hidden_dims': [1024, 1024, 512],
'batch_size': 16,
'train_epoch': 1,
'LR': 0.001,
'drug_encoding': 'MPNN',
'target_encoding': 'CNN',
'result_folder': './result/',
'binary': False,
'mpnn_hidden_size': 128,
'mpnn_depth': 3,
'cnn_target_filters': [32, 64, 96],
'cnn_target_kernels': [4, 8, 12]}

Maybe you need to update the model file at https://dataverse.harvard.edu/api/access/datafile/

Maybe the config files corresponding to pretrained_dir = download_pretrained_model('models_configs') also need an update.
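
To see exactly how the configuration listed in this comment relates to the one posted when the issue was opened, the two dictionaries can be compared key by key. This is a minimal sketch in plain Python; `diff_configs` is a hypothetical helper, not a DeepPurpose function, and both dictionaries are copied from this thread.

```python
# Configuration reported when the issue was opened (Dec 2, 2020).
config_dec_2020 = {
    "input_dim_drug": 1024, "input_dim_protein": 8420,
    "hidden_dim_drug": 128, "hidden_dim_protein": 256,
    "cls_hidden_dims": [1024, 1024, 512],
    "batch_size": 16, "train_epoch": 1, "LR": 0.001,
    "drug_encoding": "MPNN", "target_encoding": "CNN",
    "result_folder": "./result/", "binary": False,
    "mpnn_hidden_size": 128, "mpnn_depth": 3,
    "cnn_target_filters": [32, 64, 96],
    "cnn_target_kernels": [4, 8, 12],
    "num_workers": 0, "decay": 0,
}

# Configuration reported in this comment (Jul 10, 2021).
config_jul_2021 = {
    "input_dim_drug": 1024, "input_dim_protein": 8420,
    "hidden_dim_drug": 128, "hidden_dim_protein": 256,
    "cls_hidden_dims": [1024, 1024, 512],
    "batch_size": 16, "train_epoch": 1, "LR": 0.001,
    "drug_encoding": "MPNN", "target_encoding": "CNN",
    "result_folder": "./result/", "binary": False,
    "mpnn_hidden_size": 128, "mpnn_depth": 3,
    "cnn_target_filters": [32, 64, 96],
    "cnn_target_kernels": [4, 8, 12],
}


def diff_configs(old, new):
    """Return (keys only in old, keys only in new, {key: (old, new)} for changed values)."""
    only_old = sorted(set(old) - set(new))
    only_new = sorted(set(new) - set(old))
    changed = {k: (old[k], new[k]) for k in set(old) & set(new) if old[k] != new[k]}
    return only_old, only_new, changed


only_old, only_new, changed = diff_configs(config_dec_2020, config_jul_2021)
print(only_old)  # ['decay', 'num_workers']
print(only_new)  # []
print(changed)   # {}
```

The shared keys all carry the same values, including `train_epoch: 1`; the only difference is that the later listing lacks `num_workers` and `decay`.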

@kexinhuang12345
Owner

Sounds good, do you want to contribute and train a new model for it?

@chemlove

I'd like to give it a try. Could you please give me the BindingDB Kd dataset? And what preprocessing or data cleaning is needed before I start training?
I may need further help later, since I am a complete newbie to ML :)

@kexinhuang12345
Owner

Sounds good, it should be the one in https://github.com/kexinhuang12345/DeepPurpose/blob/master/DEMO/Transformer%2BCNN_BindingDB.ipynb

Simply replacing the model and parameters should be enough.

kexinhuang12345 added the enhancement (New feature or request) label Jul 28, 2021
@Jameel9

Jameel9 commented Sep 9, 2022

Thank you for the fruitful discussion, and a big thank-you to the developers of this library. My question is:
In the latest release of DeepPurpose, has the MPNN_CNN model been corrected, and does it work fine now?


4 participants