Properties-to-molecules-Inverse-Mapping

This repository contains the code for the paper "Inverse mapping of quantum properties to structures for chemical space of small organic molecules".

Description

This repository provides the code to reproduce the main results from the paper. The code is organized into scripts and notebooks. The variable reproduce_paper is used in multiple scripts to automatically locate the data_paper folder.

Training the model with the train.py script usually takes around 3 hours, depending on your GPU. The notebook reproducing the main results should run in a few minutes, excluding the computation of RMSDs for the test set, which can take longer depending on the test set size.
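To make the RMSD step mentioned above concrete, here is a minimal NumPy sketch of an RMSD after optimal rigid superposition (the Kabsch algorithm). It is an illustration only, not the repository's implementation: the function name kabsch_rmsd is hypothetical, and it assumes that both structures list their atoms in the same order.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate arrays after optimal superposition.

    Hypothetical helper; assumes atoms in P and Q appear in the same order.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centering both structures on their centroids.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # The SVD of the covariance matrix gives the optimal rotation (Kabsch).
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    diff = P @ R - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

The cost per pair is small, but it is paid once for every molecule in the test set, which is consistent with this step dominating the notebook runtime for large test sets.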

Main Packages

The main packages needed to run the scripts and notebooks are listed below, together with the versions we tested:

ase 3.22.0
matplotlib 3.5.0
numpy 1.21.4
pandarallel 1.6.4
pandas 1.3.4
pyarrow 7.0.0
pytorch-lightning 1.5.10
rmsd 1.4
scipy 1.7.3
torch 1.12.1
tqdm 4.62.3
openbabel 3.1.1

Much of the code can be run without openbabel, dftb+ or machine learning force fields. These packages are needed, however, to add hydrogens and relax geometries. Any force field that can be used within the ase framework will do; for the one used in this work we refer to SpookyNet. For openbabel, we recommend using a conda environment.

The installation of the main packages should take a few minutes on standard hardware.
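As a convenience, the tested versions listed in this section can be compared against the current environment with a short script. This is a sketch that is not part of the repository: the function name check_versions is hypothetical, and it assumes each package is registered under the distribution name given in the list. That may not hold for every installation route; for example, a conda-installed openbabel may be reported as not installed.

```python
from importlib import metadata

# Versions reported as tested in this README.
TESTED_VERSIONS = {
    "ase": "3.22.0",
    "matplotlib": "3.5.0",
    "numpy": "1.21.4",
    "pandarallel": "1.6.4",
    "pandas": "1.3.4",
    "pyarrow": "7.0.0",
    "pytorch-lightning": "1.5.10",
    "rmsd": "1.4",
    "scipy": "1.7.3",
    "torch": "1.12.1",
    "tqdm": "4.62.3",
    "openbabel": "3.1.1",
}


def check_versions(tested=TESTED_VERSIONS):
    """Return {package: (tested, installed or None)} for every mismatch."""
    mismatches = {}
    for name, wanted in tested.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # distribution not found under this name
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions().items():
        print(f"{name}: tested {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the code will fail; it only flags where the environment differs from the tested one.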

Data

The data used for training and testing in the paper can be downloaded here (zip folder). The relevant data is located in the data_paper directory. To use the data, place the data_paper folder in the same directory as the notebooks and scripts. New data can be prepared using the initialize_data.py script, which needs to be modified as required.
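The folder layout described above, combined with the reproduce_paper variable mentioned in the Description, suggests a lookup along the following lines. This is a hedged sketch and not the repository's actual logic: the function locate_data_dir and its base_dir and custom_dir parameters are hypothetical names introduced for illustration.

```python
from pathlib import Path


def locate_data_dir(reproduce_paper=True, base_dir=None, custom_dir=None):
    """Hypothetical helper: choose the data folder to read from.

    reproduce_paper=True  -> <base_dir>/data_paper (base_dir defaults to cwd)
    reproduce_paper=False -> custom_dir, which must then be provided
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    if reproduce_paper:
        data_dir = base / "data_paper"
    elif custom_dir is not None:
        data_dir = Path(custom_dir)
    else:
        raise ValueError("custom_dir is required when reproduce_paper is False")
    # Fail early with a clear message if the folder is missing.
    if not data_dir.is_dir():
        raise FileNotFoundError(f"Data folder not found: {data_dir}")
    return data_dir
```

Checking for the folder up front gives a readable error when data_paper has not been placed next to the notebooks and scripts, rather than a failure deep inside data loading.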

Models

The model architectures are defined in the models_old.py script (alternatively, models.py for testing alternatives). The PyTorch Lightning model definition is provided in the Model.py file. Pre-trained models are available in the models_saved folder.

Notebooks

testing.ipynb: Reproduces the main results of the paper.
mol_gen_test.ipynb: Demonstrates targeted molecule generation using functions from molecular_generation_utils.py.

Miscellaneous

The remaining scripts are utilities for various applications. The interpolation procedure used in the paper is implemented in interpolator.py. The NEB part is covered by the NEB_interp.ipynb notebook; before running it, change the SpookyNet checkpoint (or force field) to the one you want to use.

General Remarks

While the code could be better organized and structured, the current organization serves the purpose of scientifically presenting and prototyping the novel methodology outlined in the paper.

Licensing

Notebooks (*.ipynb): Licensed under the GNU General Public License, Version 2 (GPL-2.0). See the LICENSE_GPL file for more details.
All Other Files: Licensed under the MIT License. See the LICENSE file for more details.

AleFalla/Properties-to-molecules-Inverse-Mapping