-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MC Docking Cation-Anion Poses #11
base: master
Are you sure you want to change the base?
Conversation
…tan_distances now combines MinDistanceFitness + MinDistanceCationAnionFitness, works
…tan in def __init__ for ThresholdFitness class
…n-anion index identifiers moved out from VOID and parsed as external arguments
…MinDistanceCationFitness can be cleaner
…itness and MinCationAnionFitness so --threshold_catan, --cation_index and --acid_sites will be ignored if any other kind of Fitness is called
… multicationic molecules
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think most of this code looks very good. Everything is documented properly. I had a few clarifying questions and one critical change (in success.py
) that probably need to made before merging, but otherwise this has my stamp of approval.
def rescale(self, complex): | ||
"""Rescale the complex to the 0-1 range so results can be visualized in direct and xyz format. | ||
|
||
Args: | ||
complex (Complex): The host-guest complex object containing the host and guest molecules. | ||
|
||
Returns: | ||
Complex: The rescaled host-guest complex object. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure that I'm understanding correctly: this function is translating Cartesian coordinates to fractional coordinates using Pymatgen's Structure
class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I knew this might bring some trouble. The reason for the use of this function is that the software works as a periodic crystal, but for every iteration the coordinates of the guest are displaced over the previous guest coordinates, so it was generating a correct pose but with an unvisualizable format (see example below). Though for running later calculations this shouldn't matter, I introduced this function that rescales the output once the optimal pose is achieved and allows visualization inside and outside htvs by different softwares.
Output without reescaling function:
Host coords: 1 1 1
2 2 2
3 3 3
Guest coords: 235 432 274
123 456 789
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think I've seen issues like what you describe. When you pull up the structure in VESTA or ASE (for example), the molecules are far outside the unit cell that is being shown. And this function will translate those positions to the fractional coordinates, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is only translating the molecule from where it ends up outside the unit cell range to where it is placed in the unit cell range. Not changing the result of the docked position achieved at all and allowing visualization.
complex.host = Structure( | ||
lattice=complex.host.lattice, | ||
species=species[:num_host_atoms], | ||
coords=cart_coords[:num_host_atoms], | ||
coords_are_cartesian=True, | ||
site_properties=complex.host.site_properties, | ||
) | ||
|
||
complex.guest = Structure( | ||
lattice=complex.host.lattice, | ||
species=species[num_host_atoms:], | ||
coords=cart_coords[num_host_atoms:], | ||
coords_are_cartesian=True, | ||
site_properties=complex.guest.site_properties, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code might not follow best practices: normally, you don't want to modify the inputs to a function. It's not as efficient, but it would be better to create a copy of the input, modify the copy, and return the copy on the last line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that I completely get what you are saying. You mean substituting the self.rescale(cpx)
in dock by the whole process inside the function? or ammending the rescale function so it does not modify the input cpx?
(I'm mostly sure you mean option B, but better double check)
def dock(self, attempts):
cpx = Complex(self.host.copy(), self.guest.copy())
for trial in range(attempts):
cpx = self.run(cpx, 1)
if self.fitness(cpx) >= 0:
print(f"{trial + 1} attempts to success")
cpx = self.rescale(cpx)
return [cpx]
return []
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did mean the second thing you asked. Right now, the rescale
function you wrote modifies the input complex
. You have something like this:
def rescale(self, complex):
...
complex.host = Structure(...)
...
return complex
Best coding practices suggest that you should make a copy of the input and modify that instead, like this:
def rescale(self, complex):
cpx = complex.copy() # this might not work for Complex objects, you should be able to use copy.deepcopy() as an alternative
cpx.host = Structure(...)
return cpx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I'll ammend this and give it a test
@@ -74,6 +106,77 @@ def __call__(self, complex): | |||
return self.normalize(self.get_distances(complex).min() - self.threshold) | |||
|
|||
|
|||
class MinDistanceCationAnionFitness(ThresholdFitness): | |||
PARSER_NAME = "min_catan_distance" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I saw a parser or function with this name. Is it located elsewhere? (maybe I am misunderstanding what the parser is and does)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
min_catan_distance
parses the THRESHOLD_CATAN
value, if not default is set to 3.5 A now.
this is a whole working line outside htvs:
python3 ../../scripts/dock.py LTA_2Al_AlH.cif triethylamine.xyz -d mcsuccess -s random -f min_catan_distance -o mcdocked --threshold_catan 2.0 --threshold 1.3 --attempts 10000
I'll create another folder at examples and add something to the README.md too, because it is not super intuitive
This pull request includes updates introduced for the Monte Carlo docking creating a MinDistanceCationAnionFitness class that generates poses with the positive charge of docked molecules close to the acid site of the catalyst. Hence creating more chemically realistic processes