Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MC Docking Cation-Anion Poses #11

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open

MC Docking Cation-Anion Poses #11

wants to merge 9 commits into from

Conversation

pafervi
Copy link
Collaborator

@pafervi pafervi commented Nov 8, 2024

This pull request includes updates introduced for the Monte Carlo docking creating a MinDistanceCationAnionFitness class that generates poses with the positive charge of docked molecules close to the acid site of the catalyst. Hence creating more chemically realistic processes

Pau Ferri-Vicedo added 9 commits May 28, 2024 16:39
…tan_distances now combines MinDistanceFitness + MinDistanceCationAnionFitness, works
…tan in def __init__ for ThresholdFitness class
…n-anion index identifiers moved out from VOID and parsed as external arguments
…itness and MinCationAnionFitness so --threshold_catan, --cation_index and --acid_sites will be ignored if any other kind of Fitness is called
Copy link

@ajhoffman1229 ajhoffman1229 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most of this code looks very good. Everything is documented properly. I had a few clarifying questions and one critical change (in success.py) that probably need to made before merging, but otherwise this has my stamp of approval.

Comment on lines +52 to +60
def rescale(self, complex):
"""Rescale the complex to the 0-1 range so results can be visualized in direct and xyz format.

Args:
complex (Complex): The host-guest complex object containing the host and guest molecules.

Returns:
Complex: The rescaled host-guest complex object.
"""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure that I'm understanding correctly: this function is translating Cartesian coordinates to fractional coordinates using Pymatgen's Structure class?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I knew this might bring some trouble. The reason for the use of this function is that the software works as a periodic crystal, but for every iteration the coordinates of the guest are displaced over the previous guest coordinates, so it was generating a correct pose but with an unvisualizable format (see example below). Though for running later calculations this shouldn't matter, I introduced this function that rescales the output once the optimal pose is achieved and allows visualization inside and outside htvs by different softwares.

Output without reescaling function:
Host coords: 1 1 1
2 2 2
3 3 3
Guest coords: 235 432 274
123 456 789

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think I've seen issues like what you describe. When you pull up the structure in VESTA or ASE (for example), the molecules are far outside the unit cell that is being shown. And this function will translate those positions to the fractional coordinates, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is only translating the molecule from where it ends up outside the unit cell range to where it is placed in the unit cell range. Not changing the result of the docked position achieved at all and allowing visualization.

Comment on lines +88 to +102
complex.host = Structure(
lattice=complex.host.lattice,
species=species[:num_host_atoms],
coords=cart_coords[:num_host_atoms],
coords_are_cartesian=True,
site_properties=complex.host.site_properties,
)

complex.guest = Structure(
lattice=complex.host.lattice,
species=species[num_host_atoms:],
coords=cart_coords[num_host_atoms:],
coords_are_cartesian=True,
site_properties=complex.guest.site_properties,
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code might not follow best practices: normally, you don't want to modify the inputs to a function. It's not as efficient, but it would be better to create a copy of the input, modify the copy, and return the copy on the last line.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that I completely get what you are saying. You mean substituting the self.rescale(cpx) in dock by the whole process inside the function? or ammending the rescale function so it does not modify the input cpx?
(I'm mostly sure you mean option B, but better double check)


    def dock(self, attempts):
        cpx = Complex(self.host.copy(), self.guest.copy())

        for trial in range(attempts):
            cpx = self.run(cpx, 1)

            if self.fitness(cpx) >= 0:
                print(f"{trial + 1} attempts to success")
                cpx = self.rescale(cpx)
                return [cpx]

        return []

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did mean the second thing you asked. Right now, the rescale function you wrote modifies the input complex. You have something like this:

def rescale(self, complex):
    ...
    complex.host = Structure(...)
    ...
    return complex

Best coding practices suggest that you should make a copy of the input and modify that instead, like this:

def rescale(self, complex):
    cpx = complex.copy() # this might not work for Complex objects, you should be able to use copy.deepcopy() as an alternative
    cpx.host = Structure(...)
    return cpx

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I'll ammend this and give it a test

@@ -74,6 +106,77 @@ def __call__(self, complex):
return self.normalize(self.get_distances(complex).min() - self.threshold)


class MinDistanceCationAnionFitness(ThresholdFitness):
PARSER_NAME = "min_catan_distance"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I saw a parser or function with this name. Is it located elsewhere? (maybe I am misunderstanding what the parser is and does)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min_catan_distance parses the THRESHOLD_CATAN value, if not default is set to 3.5 A now.

this is a whole working line outside htvs:
python3 ../../scripts/dock.py LTA_2Al_AlH.cif triethylamine.xyz -d mcsuccess -s random -f min_catan_distance -o mcdocked --threshold_catan 2.0 --threshold 1.3 --attempts 10000

I'll create another folder at examples and add something to the README.md too, because it is not super intuitive

VOID/fitness/threshold.py Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants