[DOCUMENTATION] Estimating the error when using histogram reweighting to estimate melting temperature #130

Open
alec-cook opened this issue Oct 2, 2024 · 1 comment
Labels
documentation Improvements or additions to documentation

Comments

@alec-cook

Hello,

I am attempting to simulate the melting of a 9bp DNA duplex; I would like to estimate its melting temperature under a variety of conditions. I have been basing my simulations on the RNA Duplex Melt example provided with oxDNA.

How many simulation steps are required to make a precise estimate of the melting temperature? Is there a well-established method for estimating the error of such a measurement? In Sengar et al. (2021), for example, figure 8 shows error bars on the duplex fractions produced by this technique, but I cannot find a description of how these error bars were produced. Should the simulation be repeated several times to provide an estimate of its uncertainty? Is there an appropriate resampling technique?

I ran a few trial simulations with a low number (1e6) of simulation steps to make certain the simulation was working, but the melting temperatures extrapolated from the produced histograms varied by more than 10 degrees. I recognize that the RNA melting example uses 2e9 simulation steps for an 8-mer; would a 9-mer require a similar number of conformations? Or are the populations of the histogram bins more important? In my trial simulations, the least frequently visited bin was visited ~1e4 times; how high does this need to be? Millions? Billions?

The oxDNA primer also mentions that it is important to equilibrate the duplex for some time before collecting VMMC statistics; however, the RNA melting example does not include any equilibration steps. Is the best practice to conduct a separate MD simulation, confirm that all base pairs are closed, and then use this conformation for the VMMC simulation? Or is it appropriate to simply include an equilibration_steps statement in the VMMC input file?

Thank you so much for your help,
Alec

https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2021.693710/full

@alec-cook alec-cook added the documentation Improvements or additions to documentation label Oct 2, 2024
@lorenzo-rovigatti
Owner

Hi! It's hard to come up with rules when using enhanced sampling techniques such as umbrella sampling, so you'll most likely find my answers rather unsatisfactory.

I don't remember how the error bars in Fig. 8 of the primer were estimated, but I invite you to ask the corresponding author. If I had to guess, I'd say that he used multiple simulations.
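For the multiple-simulations route, here is a minimal sketch of how the spread between independent runs could be turned into an error bar. It is not the procedure used in the primer (which is unknown here). It assumes you already have, for each run, a melting curve of duplex fraction versus temperature, and that the melting temperature is taken as the point where the duplex fraction crosses 0.5. The function names are hypothetical and not part of oxDNA.

```python
import math
import statistics


def melting_temperature(temps, duplex_fractions):
    """Temperature at which the duplex fraction crosses 0.5.

    Uses linear interpolation between the two neighbouring temperatures
    that bracket 0.5. Assumes the curve is sampled finely enough that a
    straight line between neighbouring points is a fair approximation.
    """
    points = list(zip(temps, duplex_fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        brackets_half = (f0 - 0.5) * (f1 - 0.5) <= 0
        if brackets_half and f0 != f1:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("duplex fraction never crosses 0.5 in the sampled range")


def mean_and_standard_error(values):
    """Mean and standard error of the mean over independent runs."""
    if len(values) < 2:
        raise ValueError("need at least two independent runs")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem
```

You would call `melting_temperature` once per independent run (different seeds) and pass the resulting list to `mean_and_standard_error`. The standard error is only meaningful if the runs are truly independent and each one is long enough to sample both bound and unbound states.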

As for the number of steps, it depends greatly on the system and conditions you are simulating, so there are no rules of thumb you can rely on. However, you can do some a posteriori checks by looking at the histograms: every bin should have been visited "many" times (hundreds or thousands) if you want decent statistics. If that's not the case, you need to either lengthen the simulation or change the weights.
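The a posteriori check can be sketched as follows. This is an illustration only: it works on plain lists of per-bin visit counts and weights that you have already read from your own output, and makes no claim about the oxDNA file layout. The function names and the default threshold are hypothetical. The weight update assumes the common umbrella-sampling convention that the biased count in a bin is proportional to its weight times its unbiased probability, so dividing each weight by the observed count pushes the next run toward a flatter histogram.

```python
def undersampled_bins(counts, min_visits=1000):
    """Indices of bins visited fewer than min_visits times (biased counts)."""
    return [i for i, c in enumerate(counts) if c < min_visits]


def suggest_flatter_weights(weights, counts):
    """Propose new weights that should flatten the biased histogram.

    Assumes biased_count[i] is proportional to weight[i] * p_unbiased[i],
    so new_weight[i] = weight[i] / biased_count[i], rescaled so that the
    smallest new weight is 1. Bins with zero visits give no information,
    so they are rejected rather than guessed.
    """
    if any(c <= 0 for c in counts):
        raise ValueError("cannot update weights for bins that were never visited")
    ratios = [w / c for w, c in zip(weights, counts)]
    smallest = min(ratios)
    return [r / smallest for r in ratios]
```

If `undersampled_bins` returns a non-empty list, either run longer or feed the output of `suggest_flatter_weights` into a new run and check again. Visit counts overstate the amount of independent information when successive configurations are correlated, so treat the threshold as a lower bound rather than a guarantee.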

About equilibration, it is not that important for small systems (where decorrelation times are small), but it gets increasingly important as the system becomes more and more complicated.
