allow mdpow-solvationenergy to select alchemlyb estimators #148

VOD555 · 2021-06-02T14:26:09Z

Fix #135
-allow solvationenergy to use alchemlyb estimators

orbeckst

Let's make mdpow-solvationenergy a model for the scripts.

Replace optparse with argparse so that we can use better option processing.
The logic for force/no data is convoluted and should be simplified to avoid code duplication.
Please add more logger INFO output.

orbeckst · 2021-06-02T21:10:31Z

mdpow/fep.py

                if G.method != "TI":
                    errmsg = "Method %s is not implemented in MDPOW, use estimator='alchemlyb'" % G.method
                    logger.error(errmsg)
                    raise ValueError(errmsg)
+                G.start = kwargs.pop('start', 0)
+                G.stop = kwargs.pop('stop', None)
+                G.SI = kwargs.pop('stop', False)


Looks wrong — the kwarg should not be "start" but .. SI?

scripts/mdpow-solvationenergy

orbeckst · 2021-06-02T21:15:25Z

scripts/mdpow-solvationenergy

+    if not estimator in ('mdpow', 'alchemlyb'):
+        errmsg = "estimator = %r is not supported, must be 'mdpow' or 'alchemlyb'" % estimator
+        logger.error(errmsg)
+        raise ValueError(errmsg)


add INFO logger that states which estimator is used

scripts/mdpow-solvationenergy

orbeckst · 2021-06-02T21:26:41Z

scripts/mdpow-solvationenergy

+    parser.add_option('--noSI', dest='SI',
+                      action="store_false",
+                      help="Disable statistical inefficiency analysis"
+                      "Statitical inefficiency analysis is performed by default when using"
+                      "alchemlyb estimators and is disabled when using mdpow estimator.")


make this a BooleanOptionalAction action option with argparse so that we have --SI and --no-SI (helps with #147 )

scripts/mdpow-solvationenergy

orbeckst · 2021-06-03T04:15:14Z

scripts/mdpow-solvationenergy

-                        force=False, stride=10, permissive=False, solvent='water')
+    import argparse
+
+    parser = argparse.ArgumentParser(usage=__doc__)


IIRC you need to do something special so that argparse shows defaults, something like ArgumentParserWithDefaults or so — check the docs. Then remove the [%default] everywhere (IIRC...)

Should be setting argparse.ArgumentDefaultsHelpFormatter and then the default values would be shown in the help message automatically.

orbeckst · 2021-06-03T04:16:23Z

scripts/mdpow-solvationenergy

+    parser.add_argument('--noSI', dest='SI',
+                        action="store_false",
+                        help="Disable statistical inefficiency analysis"
+                        "Statitical inefficiency analysis is performed by default when using"
+                        "alchemlyb estimators and is disabled when using mdpow estimator.")


use the bool action

I check the python documentation, BooleanOptionalAction action option is added in 3.9, we can't use it in 2.7.

VOD555 · 2021-06-03T06:26:45Z

Meet an unexpected error about the usage messsage
ValueError: unsupported format character 'p' (0x70) at index 1 and found the solution here.
https://stackoverflow.com/questions/24180527/argparse-required-arguments-listed-under-optional-arguments

…nputs

VOD555 · 2021-06-04T04:02:27Z

Local tests output:

 mdpow-solvationenergy SM25 ../SM26/SM26/ -S octanol --start 1 --stop 20 --force
mdpow       : INFO     MDPOW 0.7.0-dev starting.
mdpow       : INFO     Copyright (c) 2010-2016 Ian Kenney, Bogdan Iorga, and Oliver Beckstein
mdpow       : INFO     Released under the GNU Public Licence, version 3.
mdpow       : INFO     For bug reports and help: https://github.com/Becksteinlab/MDPOW/issues
mdpow.config: INFO     Using the bundled force fields from GMXLIB='/nfs/homes4/sfan/miniconda3/envs/py27/lib/python2.7/site-packages/mdpow/top'.
mdpow.config: INFO     If required, override this behaviour by setting the environment variable GMXLIB yourself.
mdpow       : INFO     Analyzing directory 'SM25'... (can take a while)
mdpow       : INFO     [SM25] Reading octanol data 'SM25/FEP/octanol/Goct.fep'
mdpow.fep   : INFO     Solvation free energy calculation for molecule UNK in solvent octanol.
mdpow.fep   : INFO     Base directory is '/nfs/homes4/sfan/Projects/SAMPL7/gaff/SM25/SM25'
mdpow.fep   : INFO     Using setup directories under 'FEP/octanol': {'vdw': 'FEP/octanol/VDW', 'coulomb': 'FEP/octanol/Coulomb'}
mdpow.fep   : INFO     Default checkpoint file is '/nfs/homes4/sfan/Projects/SAMPL7/gaff/SM25/SM25/FEP/octanol/Goct.fep'
mdpow       : INFO     The solvent is octanol .
mdpow       : INFO     Analysis stride is 1 .
mdpow       : INFO     [SM25] Forcing (re)analysis of hydration free energy data
mdpow       : INFO     Estimator is alchemlyb .
mdpow       : INFO     Free energy calculation method is MBAR .
mdpow       : INFO     Analysis starts from frame 0 .
mdpow       : INFO     Analysis stops at frame None .
mdpow       : INFO     Statistial inefficiency analysis will be performed.
mdpow.fep   : INFO     [FEP/octanol] Finding dgdl xvg files, reading with stride=1 permissive=False.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 50.0223.
mdpow.fep   : INFO     The data are subsampled every 51 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 63.1531.
mdpow.fep   : INFO     The data are subsampled every 64 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 69.1584.
mdpow.fep   : INFO     The data are subsampled every 70 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 95.2881.
mdpow.fep   : INFO     The data are subsampled every 96 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 154.2010.
mdpow.fep   : INFO     The data are subsampled every 155 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 71.3375.

VOD555 · 2021-06-04T04:43:44Z

output from mdpow-solvationenergy -h

mdpow       : INFO     MDPOW 0.7.0-dev starting.
mdpow       : INFO     Copyright (c) 2010-2016 Ian Kenney, Bogdan Iorga, and Oliver Beckstein
mdpow       : INFO     Released under the GNU Public Licence, version 3.
mdpow       : INFO     For bug reports and help: https://github.com/Becksteinlab/MDPOW/issues
mdpow.config: INFO     Using the bundled force fields from GMXLIB='/nfs/homes4/sfan/miniconda3/envs/py27/lib/python2.7/site-packages/mdpow/top'.
mdpow.config: INFO     If required, override this behaviour by setting the environment variable GMXLIB yourself.
usage: mdpow-solvationenergy [options] DIRECTORY [DIRECTORY ...]

Run the free energy analysis for a solvent  in <DIRECTORY>/FEP
and return DeltaG.

DIRECTORY should contain all the files resulting from running
``mdpow.fep.Ghyd.setup()`` (or the corresponding ``Goct.setup()`` or
``Gcyclohexane.setup()`` and the results of the MD FEP
simulations. It relies on the canonical naming scheme (basically: just
use the defaults as in the tutorial).

The dV/dlambda plots can be produced automatically (--plot=auto). If multiple
DIRECTORY arguments are provided then one has to choose the auto option (or
None).

The total solvation free energy is calculated as

  DeltaG* = -(DeltaG_coul + DeltaG_vdw)

Note that the standard state refers to the "Ben-Naim" standard state
of transferring 1 M of compound in the gas phase to 1 M in the aqueous
phase.

Results are *appended* to a data file with **Output file format**::

          .                 ---------- kJ/mol ---
          molecule solvent  DeltaG*  coulomb  vdw

All observables are quoted with an error estimate, which is derived
from the correlation time and error propagation through Simpson's rule
(see :meth:`mdpow.fep.Gsolv`). It ultimately comes from the error on
<dV/dlambda>, which is estimated as the error to determine the
average.

molecule
    molecule name as used in the itp
DeltaG*
    solvation free energy vacuum --> solvent, in kJ/mol
coulomb
    discharging contribution to the DeltaG*
vdw
    decoupling contribution to the DeltaG*
directory
    folder in which the simulations were stored

positional arguments:
  directory             directory or directories which contain all the files resulting from running
                        mdpow.fep.Ghyd.setup()

optional arguments:
  -h, --help            show this help message and exit
  --plotfile FILE       plot dV/dlambda to FILE; use png or pdf suffix to determine the file type. 'auto' generates a pdf
                        file DIRECTORY/dVdl_<molname>_<solvent>.pdf and 'None' disables it The plot function is only
                        available for mdpow estimator,and is disabled when using alchemlyb estimators. (default: none)
  --solvent NAME, -S NAME
                        solvent NAME for compound, 'water', 'octanol', or 'cyclohexane' (default: water)
  -o FILE, --outfile FILE
                        append one-line results summary to FILE (default: gsolv.txt)
  -e FILE, --energies FILE
                        append solvation free energies to FILE (default: energies.txt)
  --estimator {mdpow,alchemlyb}
                        choose the estimator to be used, alchemlyb or mdpow estimators (default: alchemlyb)
  --method {TI,MBAR,BAR}
                        choose the method to calculate free energy (default: MBAR)
  --force               force rereading all data (default: False)
  --noSI                Disable statistical inefficiency analysisStatitical inefficiency analysis is performed by default
                        when usingalchemlyb estimators and is disabled when using mdpow estimator. (default: True)
  -s N, --stride N      Use every N-th datapoint from the original dV/dlambda data. (default: 1)
  --start START         Start point for the data used from the original dV/dlambda data. (default: None)
  --stop STOP           Stop point for the data used from the original dV/dlambda data. (default: None)
  --ignore-corrupted    skip lines in the md.xvg files that cannot be parsed. WARNING: Other lines in the file might have
                        been corrupted in such a way that they appear correct but are in fact wrong. WRONG RESULTS CAN
                        OCCUR! USE AT YOUR OWN RISK (default: False)

iorga · 2021-06-04T07:59:28Z

Local tests output:

 mdpow-solvationenergy SM25 ../SM26/SM26/ -S octanol --start 1 --stop 20 --force
mdpow       : INFO     MDPOW 0.7.0-dev starting.
mdpow       : INFO     Copyright (c) 2010-2016 Ian Kenney, Bogdan Iorga, and Oliver Beckstein
mdpow       : INFO     Released under the GNU Public Licence, version 3.
mdpow       : INFO     For bug reports and help: https://github.com/Becksteinlab/MDPOW/issues
mdpow.config: INFO     Using the bundled force fields from GMXLIB='/nfs/homes4/sfan/miniconda3/envs/py27/lib/python2.7/site-packages/mdpow/top'.
mdpow.config: INFO     If required, override this behaviour by setting the environment variable GMXLIB yourself.
mdpow       : INFO     Analyzing directory 'SM25'... (can take a while)
mdpow       : INFO     [SM25] Reading octanol data 'SM25/FEP/octanol/Goct.fep'
mdpow.fep   : INFO     Solvation free energy calculation for molecule UNK in solvent octanol.
mdpow.fep   : INFO     Base directory is '/nfs/homes4/sfan/Projects/SAMPL7/gaff/SM25/SM25'
mdpow.fep   : INFO     Using setup directories under 'FEP/octanol': {'vdw': 'FEP/octanol/VDW', 'coulomb': 'FEP/octanol/Coulomb'}
mdpow.fep   : INFO     Default checkpoint file is '/nfs/homes4/sfan/Projects/SAMPL7/gaff/SM25/SM25/FEP/octanol/Goct.fep'
mdpow       : INFO     The solvent is octanol .
mdpow       : INFO     Analysis stride is 1 .
mdpow       : INFO     [SM25] Forcing (re)analysis of hydration free energy data
mdpow       : INFO     Estimator is alchemlyb .
mdpow       : INFO     Free energy calculation method is MBAR .
mdpow       : INFO     Analysis starts from frame 0 .
mdpow       : INFO     Analysis stops at frame None .
mdpow       : INFO     Statistial inefficiency analysis will be performed.
mdpow.fep   : INFO     [FEP/octanol] Finding dgdl xvg files, reading with stride=1 permissive=False.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 50.0223.
mdpow.fep   : INFO     The data are subsampled every 51 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 63.1531.
mdpow.fep   : INFO     The data are subsampled every 64 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 69.1584.
mdpow.fep   : INFO     The data are subsampled every 70 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 95.2881.
mdpow.fep   : INFO     The data are subsampled every 96 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 154.2010.
mdpow.fep   : INFO     The data are subsampled every 155 frames.
mdpow.fep   : INFO     Perform statistical inefficiency analysis.
mdpow.fep   : INFO     The statistical inefficiency value is 71.3375.

Thanks @VOD555 , it looks good, but can you also add before each SI analysis the window which is analyzed ? Thanks.

iorga · 2021-06-04T08:08:51Z

positional arguments:
directory directory or directories which contain all the files resulting from running
mdpow.fep.Ghyd.setup()

optional arguments:
-h, --help show this help message and exit
--plotfile FILE plot dV/dlambda to FILE; use png or pdf suffix to determine the file type. 'auto' generates a pdf
file DIRECTORY/dVdl__.pdf and 'None' disables it The plot function is only
available for mdpow estimator,and is disabled when using alchemlyb estimators. (default: none)
--solvent NAME, -S NAME
solvent NAME for compound, 'water', 'octanol', or 'cyclohexane' (default: water)
-o FILE, --outfile FILE
append one-line results summary to FILE (default: gsolv.txt)
-e FILE, --energies FILE
append solvation free energies to FILE (default: energies.txt)
--estimator {mdpow,alchemlyb}
choose the estimator to be used, alchemlyb or mdpow estimators (default: alchemlyb)
--method {TI,MBAR,BAR}
choose the method to calculate free energy (default: MBAR)
--force force rereading all data (default: False)
--noSI Disable statistical inefficiency analysisStatitical inefficiency analysis is performed by default
when usingalchemlyb estimators and is disabled when using mdpow estimator. (default: True)
-s N, --stride N Use every N-th datapoint from the original dV/dlambda data. (default: 1)
--start START Start point for the data used from the original dV/dlambda data. (default: None)
--stop STOP Stop point for the data used from the original dV/dlambda data. (default: None)
--ignore-corrupted skip lines in the md.xvg files that cannot be parsed. WARNING: Other lines in the file might have
been corrupted in such a way that they appear correct but are in fact wrong. WRONG RESULTS CAN
OCCUR! USE AT YOUR OWN RISK (default: False)

@VOD555 Looks good. At the noSI option, please add dots at the end of sentences et correct the typo in Statitical.

iorga · 2021-06-04T09:26:38Z

Thanks @VOD555 !

orbeckst · 2021-06-08T00:22:16Z

@VOD555 is this ready for another review?

@iorga if you have any comments could you please also do a review? You can also do a review that says "LGTM' (looks good to me) and approve. Thanks.

VOD555 · 2021-06-08T02:51:50Z

@orbeckst It's ready for review.
If the current mdpow-solvationenergy is ok, I can do the same changes of to mdpow-pow and mdpow-pcw (in this pr or another one?)

orbeckst

Thanks for all the changes. Looking good but see comments

The logic of the for G in (G1, G2) loop is broken and needs to be fixed — that's not your fault but it needs to be fixed and this PR is a good place.
I'd like the option for SI to be called --SI and --no-SI. I hope @iorga won't be too upset. I'd like it to work like a proper bool option.
Let the script fail right away if SI == True and estimator == "mdpow" just so that there's never any confusion of what is and isn't applied. If the user wants SI and then they need to use alchemlyb. This will work by default.

orbeckst · 2021-06-08T04:18:53Z

mdpow/fep.py

@@ -1366,17 +1372,24 @@ def p_transfer(G1, G2, **kwargs):

        # for this version. use the method given instead of the one in the input cfg file
        G.method = kwargs.pop('method', 'MBAR')


The whole loop logic needs to be fixed.

for G in (G1, G2): G_kwargs = kwargs.copy() # then use G_kwargs.pop() etc in this loop. ...

At the moment, G2 gets an almost empty kwargs and uses defaults for everything.

scripts/mdpow-pow

orbeckst · 2021-06-08T04:22:12Z

scripts/mdpow-pow

@@ -208,7 +208,7 @@ if __name__ == "__main__":
                      help="force rereading all data [%default]")
    parser.add_option('--noSI', dest='SI',


Rename as --no-SI so that when we use argparse bool options, it has the same name.

To make things clear, also add

parser.add_option('--SI', dest='SI', action="store_true", ...)

as the on switch. Set the default for SI to True.

scripts/mdpow-solvationenergy

orbeckst · 2021-06-08T04:41:28Z

scripts/mdpow-solvationenergy

+    if len(opts.directory) == 0:
        logger.fatal("A directory is required. See --help.")
        sys.exit(1)


I think this can be omitted thanks to narg+ — or not?

Yes, narg+ can automatically raise an error when given no input.

orbeckst · 2021-06-08T04:48:48Z

Fix mdpow-solvationenergy here and then do a PR for the others. Smaller PRs are better!

iorga · 2021-06-08T12:45:10Z

@VOD555 @orbeckst LGTM. I'm not upset about --SI, I very much prefer this way.

Once this PR is merged, I will test mdpow-solvationenergy and report back to you.

- changed `--noSI` to `--no-SI` - added `--SI` (the default)

orbeckst

Let the script fail right away if SI == True and estimator == "mdpow" just so that there's never any confusion of what is and isn't applied. If the user wants SI and then they need to use alchemlyb. This will work by default.
update CHANGES

(Add --SI / --no-SI to mdpow-pow in other PR).

mdpow/fep.py

scripts/mdpow-solvationenergy

orbeckst · 2021-06-09T01:34:31Z

Add notes to CHANGES and do whatever check you can think of (run the script at least locally... test would be nice but I don't think we test any of the scripts :-p ). When you're happy, squash merge (or tell me to do the merge). Thank youm, @VOD555 !

VOD555 · 2021-06-09T01:40:54Z

Local tests show no problem with --SI and --no-SI.
I'm working on the CHANGES.

VOD555 · 2021-06-09T03:52:28Z

I'm going to merge the PR

VOD555 added 4 commits May 1, 2021 12:54

add default setting for SI option

0ebfb59

update docs

092b983

merge develop

4a9cb1b

add new options for solvationenergy

f74ff82

VOD555 requested a review from orbeckst June 2, 2021 14:27

VOD555 self-assigned this Jun 2, 2021

orbeckst requested changes Jun 2, 2021

View reviewed changes

VOD555 added 2 commits June 2, 2021 20:14

use argparse instead of optionparser

33d1511

remove duplicate initial settings

3851606

orbeckst reviewed Jun 3, 2021

View reviewed changes

VOD555 added 3 commits June 3, 2021 05:58

use positional args for directory input

905aa5e

make opts.plotfile unchanged during the loop for multiple directory i…

a8e3787

…nputs

add INFO loggers

1dba7da

VOD555 added 2 commits June 4, 2021 02:23

fix description for noSI

d3f78d8

print out lambda window info when doing SI analysis

ca2ef03

VOD555 added 3 commits June 4, 2021 02:53

disable alchemlyb loggers when using mdpow estimator

eabc7ea

add loggers for calculation setting in fep.py

3a62bcd

print stride start and stop values within analysis functions

c8a4ea4

orbeckst requested changes Jun 8, 2021

View reviewed changes

use copy of kwargs in fep

5223563

change SI option for mdpow-solvationenergy

55e4ceb

- changed `--noSI` to `--no-SI` - added `--SI` (the default)

orbeckst requested changes Jun 8, 2021

View reviewed changes

mdpow/fep.py Outdated Show resolved Hide resolved

mdpow/fep.py Outdated Show resolved Hide resolved

scripts/mdpow-solvationenergy Outdated Show resolved Hide resolved

scripts/mdpow-solvationenergy Outdated Show resolved Hide resolved

orbeckst and others added 4 commits June 8, 2021 15:27

Apply suggestions from code review

2f4d54d

Raise error when SI=Ture and estimator=mdpow

82527bb

fix error logger

b81ee0a

pop keywords for final output before analysis

650c3cf

orbeckst approved these changes Jun 9, 2021

View reviewed changes

update CHANGES

202f4c9

VOD555 merged commit 985f20f into Becksteinlab:develop Jun 9, 2021

VOD555 deleted the ghyd_solvationenergy branch December 6, 2022 19:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

allow mdpow-solvationenergy to select alchemlyb estimators #148

allow mdpow-solvationenergy to select alchemlyb estimators #148

VOD555 commented Jun 2, 2021

orbeckst left a comment

orbeckst Jun 2, 2021

orbeckst Jun 2, 2021

orbeckst Jun 2, 2021

orbeckst Jun 3, 2021

VOD555 Jun 3, 2021

orbeckst Jun 3, 2021

VOD555 Jun 3, 2021

VOD555 commented Jun 3, 2021

VOD555 commented Jun 4, 2021

VOD555 commented Jun 4, 2021

iorga commented Jun 4, 2021

iorga commented Jun 4, 2021

iorga commented Jun 4, 2021

orbeckst commented Jun 8, 2021

VOD555 commented Jun 8, 2021

orbeckst left a comment

orbeckst Jun 8, 2021

orbeckst Jun 8, 2021

orbeckst Jun 8, 2021

orbeckst Jun 8, 2021

VOD555 Jun 8, 2021

orbeckst commented Jun 8, 2021

iorga commented Jun 8, 2021

orbeckst left a comment •

edited

Loading

orbeckst commented Jun 9, 2021

VOD555 commented Jun 9, 2021

VOD555 commented Jun 9, 2021

		@@ -1366,17 +1372,24 @@ def p_transfer(G1, G2, **kwargs):

		# for this version. use the method given instead of the one in the input cfg file
		G.method = kwargs.pop('method', 'MBAR')

		@@ -208,7 +208,7 @@ if __name__ == "__main__":
		help="force rereading all data [%default]")
		parser.add_option('--noSI', dest='SI',

allow mdpow-solvationenergy to select alchemlyb estimators #148

allow mdpow-solvationenergy to select alchemlyb estimators #148

Conversation

VOD555 commented Jun 2, 2021

orbeckst left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

VOD555 commented Jun 3, 2021

VOD555 commented Jun 4, 2021

VOD555 commented Jun 4, 2021

iorga commented Jun 4, 2021

iorga commented Jun 4, 2021

iorga commented Jun 4, 2021

orbeckst commented Jun 8, 2021

VOD555 commented Jun 8, 2021

orbeckst left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

orbeckst commented Jun 8, 2021

iorga commented Jun 8, 2021

orbeckst left a comment • edited Loading

Choose a reason for hiding this comment

orbeckst commented Jun 9, 2021

VOD555 commented Jun 9, 2021

VOD555 commented Jun 9, 2021

orbeckst left a comment •

edited

Loading