Feature/bands wc #235
Along the lines of the common structure relaxation workflow, we implement the abstract classes for creating the bands workflow with a common interface (inputs and outputs). Each code will need to provide its own implementation. This commit includes the siesta implementation.
Thanks @bosonie! Following up from our discussion this morning, I added a comment to the issue with a slightly different suggestion. Here I summarise some answers to your points:
Question: the |
# K points for bands, possible change of structure due to SeeK-Path
if bands_kpoints is None:
    res = seekpath_explicit_kp_path(structure, orm.Dict(dict=seekpath_parameters))
This dilemma has come up before, but I really think we shouldn't be calling calcfunctions in input generator calls. It should be possible to call `get_builder` without the database being affected. On the other hand, I think changing the input structure is something that should be captured in the workchain.
I am wondering whether we should simply make the `seekpath` analysis an actual step of the `WorkChain`. This is what I already do in the `PwBandsWorkChain` to address the problem of keeping provenance while not doing it in the generators. If we agree that using `seekpath` to normalize the structure and determine the high-symmetry k-point path for the bands should be done for all implementations, it only makes sense that this is done in the common workflow itself. I am just not sure it is possible, because the generic workchain class cannot know where in the input namespace the `structure` input is located and where the `kpoints` should be defined. I will see if there is a way around this.
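A minimal sketch of the separation argued for here, in plain Python rather than AiiDA: the input generator only collects inputs, and the structure normalization runs as an explicit first step of the workflow. All names (`get_builder_inputs`, `run_bands_workflow`, `fake_normalize`) are hypothetical stand-ins, and `fake_normalize` does not call seekpath.

```python
from typing import Callable, Dict, List, Optional, Tuple

Normalizer = Callable[[str], Tuple[str, List[str]]]


def get_builder_inputs(structure: str, bands_kpoints: Optional[List[str]] = None) -> Dict:
    """Collect inputs only: nothing is computed or stored at this point."""
    return {"structure": structure, "bands_kpoints": bands_kpoints}


def run_bands_workflow(inputs: Dict, normalize: Normalizer) -> Dict:
    """Run the workflow; normalization happens here, so it is a recorded step."""
    steps: List[str] = []
    structure = inputs["structure"]
    kpoints = inputs["bands_kpoints"]
    if kpoints is None:
        # The structure may change here, so the step is recorded explicitly.
        structure, kpoints = normalize(structure)
        steps.append("normalize_structure")
    steps.append("bands")
    return {"structure": structure, "kpoints": kpoints, "steps": steps}


def fake_normalize(structure: str) -> Tuple[str, List[str]]:
    """Toy stand-in for a seekpath-like analysis (hypothetical)."""
    return structure + "-primitive", ["GAMMA", "X"]
```

Calling `get_builder_inputs` has no side effects in this sketch; the structure is only modified once `run_bands_workflow` executes.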
1. We agreed on the fact that two possibilities are foreseen: the user specifies a full list of kpoints in input or uses `seekpath`. In the implementation, the use of seekpath is the default, unless the input `bands_kpoints` is specified. Should we make this choice more explicit? How?
Okay, so did we agree on how complex the common-workflow bands data should be? E.g. should each plugin possibly utilize its own band generation workchain stack? I see pros and cons for both. From the VASP side I think we would like to run our own bands workchains, as we need those anyway to run outside the common-workflows.
So we use `seekpath` as part of the workflow if the user wants to, and in doing so, when for instance running something outside of regular DFT, it is sometimes required to pass the original grid in addition to the line extraction points. For us it would not be either-or, but both. Assuming of course that we would not utilize our own workchain stack. Also, we should not lock into `seekpath` in case users want to supply their own line sets. As you know, for some symmetries you have a choice and `seekpath` follows one of those. In order to honour reproducibility I think we should generally opt for user choice here.
it only makes sense that this is done in the common workflow itself
Yes, but only if you look at the common workflows project. It would probably be important for codes to offer their own bands workchain to run outside of common workflows if need be.
This dilemma has come up before, but I really think we shouldn't be calling calcfunctions in input generator calls. It should be possible to call `get_builder` without the database being affected. On the other hand, I think changing the input structure is something that should be captured in the workchain.
Yes, I also fully support this. Also, users are more likely to inspect a well-defined workchain than to dive into the code base. Maybe documentation is also easier to do this way. Generally I think it is good practice to keep everything related to the numerics/science in workchains.
Yes, but only if you look at the common workflows project. It would probably be important for codes to offer their own bands workchain to run outside of common workflows if need be.
Sure, but then it probably is optional as well in your workchain, or maybe it should be. The `PwBandsWorkChain` allows skipping the relaxation and seekpath normalization entirely. The one thing I still need to address is that it currently enforces running an SCF step. Maybe we could add more flexibility there to skip this as well and just take an existing SCF, but that comes with its own complications in terms of validating the compatibility of inputs.
Okay, so did we agree on how complex the common-workflow bands data should be?
This is a good point, and going through the text in #222 I don't see a discussion of this (I didn't attend the meetings). The current implementation in this PR also doesn't seem to deal with the fact that, when using seekpath, the structure cannot be assumed to remain the same, so any existing `parent_folder` needs to be considered useless and a new SCF has to be done first before computing the bands.
The possible scenarios that I can see from the user's perspective:
- User just has a structure and just wants to compute the bands
  - Required inputs
    - `structure`
  - Workflow logic:
    - Structure is primitivized and k-points are determined
    - An SCF is run for the primitivized structure
    - A bands calculation is performed on results of SCF and the k-points determined in first step
- User has a structure and a `remote_folder` containing restart files of a previously completed calculation
  - Required inputs
    - `structure`
    - `kpoints`
    - `remote_folder`
  - Workflow logic:
    - Just run bands calculation restarting from `remote_folder` using `structure` and `kpoints` as input (note, here it is not guaranteed that the inputs are compatible but it is not trivial to see that the workflow can check this and maybe we should just accept whatever and run it. User is responsible for making sure the inputs make sense)
One potential third option I see is where a user has a `structure` and `remote_folder` but no `kpoints` yet and would want them to be generated, but that would require `seekpath` (or some other method) to be able to compute them without modifying the structure.
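The scenarios above can be sketched as a small dispatch function. This is only an illustration in plain Python and not part of the PR: `plan_bands_steps` and the step labels are hypothetical names, and treating `structure` plus `kpoints` without a `remote_folder` as "SCF, then bands" is an assumption that is not discussed in the thread.

```python
from typing import List, Optional


def plan_bands_steps(
    structure: object,
    kpoints: Optional[object] = None,
    remote_folder: Optional[object] = None,
) -> List[str]:
    """Return the ordered workflow steps for the given inputs (hypothetical helper)."""
    if structure is None:
        raise ValueError("`structure` is required in every scenario")

    if kpoints is None and remote_folder is None:
        # Scenario 1: primitivize and determine k-points, then SCF, then bands.
        return ["seekpath", "scf", "bands"]

    if kpoints is not None and remote_folder is not None:
        # Scenario 2: restart directly; input compatibility is the user's responsibility.
        return ["bands"]

    if remote_folder is not None:
        # Potential third option: needs k-points generated without modifying
        # the structure, which this sketch leaves unsupported.
        raise ValueError("`remote_folder` without `kpoints` is not supported")

    # Assumption (not in the thread): explicit k-points but no restart files.
    return ["scf", "bands"]
```

For example, `plan_bands_steps("Si")` yields the full three-step sequence, while passing both `kpoints` and `remote_folder` reduces it to the bands calculation alone.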
This is a good point and going through the text in #222 I don't see a discussion of this (I didn't attend the meetings). The current implementation in this PR also doesn't seem to be dealing with the fact that when using seekpath, the structure cannot be assumed to remain the same and so any existing `parent_folder` needs to be considered useless and a new SCF has to be done first before computing the bands.
I was also thinking about this in terms of whether we should rely on bands workchains on the plugin side. I think we would prefer to run those, for different reasons. Obviously `structure` and `k-points` need to be determined on common grounds, leading to the unavoidable fact that `seekpath` needs to be executed in the common workflows, to avoid having to verify that the codes run it in the same way. Or does one foresee that we should be able to just run the existing workchain pipelines in the common workflows for this?
My maintenance point here is that many plugins need a bands workchain outside of common-workflows anyway, so it would be nice not to have to maintain two slightly different pipelines and the user support that comes with them.
My maintenance point here is that many plugins need a bands workchain outside of common-workflows anyway, so it would be nice not to have to maintain two slightly different pipelines and the user support that comes with them.
I see your point; however, I think for the common workflows having some level of consistency in how the various implementations perform the bands is important. Here I am talking about consistency on a high level, such as making sure that k-point generation, and any structure modification that comes with it, is identical for all implementations. I think you agree with this though. So I definitely think this needs to go in the common workflows. The problem you describe of plugins having fully-fledged band workchains of their own also applies to `aiida-quantumespresso`, but I don't think it necessarily adds a maintenance burden. All we need to do is make those workflows flexible enough to make parts of the logic, such as structure normalization and k-point generation, optional. But this is valuable for the plugin users anyway, so personally I don't think this should be a problem.
The most important part of the design process here is to define exactly the high-level logic of the common bands workflow and which parts should be common. This is what I tried to sketch with my previous comment, and we should probably work it out before continuing with the implementation.
I think you agree with this though.
Absolutely. And I think there is no escaping having `seekpath` as an option in common workflows. And that should be part of its own workchain in that common workflow band stack. We then pipe `structure` and `kpoints` down to the band workchain of the respective plugins, which handle the sequence of calcs necessary.
One could also imagine generalizing this a bit to offer more of a symmetry workchain, where also, say, `spglib` is used so that we ensure some kind of consistency (by default; it can be overridden by the user). I know from previous experience that these details are often overlooked by users and/or cause a lot of lost time and frustration. `seekpath` would then naturally fall into that symmetry workchain, which is something we could make the common workflow band workchain depend on. I think that would be useful in other contexts downstream as well (and is something we plan on doing plugin side anyway).
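The layering described here could look roughly like the following plain-Python sketch: a common step fixes `structure` and `kpoints` once, and the result is piped to whichever plugin-specific bands runner is registered. The registry, the runner signature and `toy_symmetry_step` are all hypothetical; no real plugin, seekpath or spglib API is used.

```python
from typing import Callable, Dict, List, Tuple

BandsRunner = Callable[[str, List[str]], str]

# Hypothetical registry mapping a code name to its own bands pipeline.
PLUGIN_RUNNERS: Dict[str, BandsRunner] = {}


def register(code: str) -> Callable[[BandsRunner], BandsRunner]:
    """Decorator that records a plugin's bands runner under `code`."""
    def decorator(runner: BandsRunner) -> BandsRunner:
        PLUGIN_RUNNERS[code] = runner
        return runner
    return decorator


def toy_symmetry_step(structure: str) -> Tuple[str, List[str]]:
    """Stand-in for the common symmetry analysis, identical for all codes."""
    return structure + "-primitive", ["GAMMA", "X", "L"]


def common_bands(code: str, structure: str) -> str:
    """Run the common step once, then delegate to the plugin's own pipeline."""
    structure, kpoints = toy_symmetry_step(structure)
    return PLUGIN_RUNNERS[code](structure, kpoints)


@register("code_a")
def run_code_a(structure: str, kpoints: List[str]) -> str:
    """Toy plugin pipeline; a real one would run its sequence of calculations."""
    return f"code_a bands for {structure} along {'-'.join(kpoints)}"
```

The point of the design is that every registered runner receives the same normalized structure and k-point path, while keeping its own internal sequence of calculations.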
Not to derail, but do you have any info you can point to which discusses the future of this?
If latter we should possibly not use |
I think at some point it makes sense to do nested protocols, but we are not there yet. Until then, let us not make the code base too complex in order to save a few lines. The protocol is considered a recipe and is also clear documentation for the user. It is thus important that the user can quickly look at it to get an idea of what is done, instead of spending time going back and forth in the code base. At least that is my opinion on this.
There have never been any discussions on changing any of this (at least that I am aware of), so there is nothing to point to as far as I can tell.
100% agree, but this should be indeed just the |
Just to clarify my original suggestion - a bit different than what's implemented here:
Thanks all for the feedback. A few observations:
Overall I'm getting to like @giovannipizzi's suggestion and I will try to implement it. However, I would leave time for some other feedback in #222 before asking for a new review.
Implements the abstract classes for the common workflow calculating the band structure, as discussed in #222.
Some design choices that can be debated:
1. We agreed on the fact that two possibilities are foreseen: the user specifies a full list of kpoints in input or uses `seekpath`. In the implementation, the use of seekpath is the default, unless the input `bands_kpoints` is specified. Should we make this choice more explicit? How?
2. […] `BandsData` […] aiida-core#5032) or return the Fermi energy as a direct output.
3. […] `RemoteData` folder. Will this concept be present in aiida 2.0?
4. […] `RelaxInputGenerator` and `BandsInputGenerator`. Moreover, the protocol list (yaml file) is duplicated. Any better way to do that?

Also the siesta implementation is provided as a reference for other codes.