-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multi step MD flow #489
Multi step MD flow #489
Conversation
@gpetretto thanks for the PR. I will be able to take a look at this and get back to you soon if I have comments. |
Hi @gpetretto. I've taken a look the new flow. The question in the |
(@mjwen I think you need to actually publish the review) |
src/atomate2/vasp/flows/core.py
Outdated
else: | ||
md_structure = md_job.output.structure | ||
md_prev_vasp_dir = md_job.output.dir_name | ||
md_job = self.md_maker.make(md_structure, prev_vasp_dir=md_prev_vasp_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From line 561, a next run will take the structure from the previous run as input, and additional copy files from the previous dir.
Will this ensure that the MD will continue from the last stopped point. Positions of atoms will of course be, but how about others? For example, if one uses a thermostat, will the state of the thermostate be preserved? In other words, if I have a single MD will 100 steps and run 5 MD each with 20 steps using the new flow, will I get exactly the same results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to look into this and sorry for the delay.
My understanding is that there is no way of preserving the exact state of the thermostat. I also tried to look into VASP input and output to check if any information about the state of thermostat could be set, but it does not look like that. Splitting a calculation over chunks will not produce exactly the same trajectory as if the calculation was not split, as the total energy of system composed of the structure+thermostat will not be preserved over a restart, however this should still lead to meaninful result.
As far as I know, splitting the calculation over several chunks is a standard procedure, if all the required steps do not fit inside a single run. The VASP wiki also describes this as the procedure to follow: https://www.vasp.at/wiki/index.php/Molecular_dynamics_calculations
If a continuation run is performed copy CONTCAR to POSCAR or possibly deliver initial velocities in the POSCAR file.
So it should be fine to proceed in this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
site_properties = structure.site_properties | ||
poscar = Poscar( | ||
structure, | ||
velocities=site_properties.get("velocities"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will these guarantee the continuation of the MD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I have discovered that if an NpT simulation is performed also the lattice velocities are added to the CONTCAR (see https://www.vasp.at/wiki/index.php/POSCAR) and should be preserved. At the moment they are completely ignored from the pymatgen Poscar object. I will add them.
As an additional point, we have further discussed about how to best set variables like class MultiMDMaker(Maker):
@classmethod
def from_parameters(cls, nsteps, time_steps, n_runs, ensamble, start_temp, end_temp):
generator = MDSetGenerator(nsteps=nsteps, time_step=time_step, enamble=ensamble, start_temp=start_temp, end_temp=end_temp)
md_maker = MDMaker(input_set_generator=generator)
return cls(md_maker=md_maker, n_runs=n_runs).make(structure) It seems that this approach is not used in other Flow Makers, but it would be particularly useful in this case, since the default values of these parameters will almost never fit the users's need and will require defining multiple objects manually. @utf do you think this might be a suitable approach? |
Hello, We have discussed with @gpetretto about one non-critical point but still to be decided. I have the feeling that "MultiMDMaker" is a bit odd and I would prefer to call this flow maker "MDMaker". The point is the job maker also has the same name. We see four options for this: 1/ Keep MultiMDMaker for the flow and MDMaker for the job (i.e. as it is now) 2/ Call the flow maker an "MDMaker" and keep MDMaker for the job 3/ Call the flow maker an "MDMaker" and rename MDMaker to SingleMDMaker for the job 4/ Call the flow maker an "MDMaker" and rename MDMaker to SingleMDMaker for the job, but keeping an "MDMaker" wrapper to SingleMDMaker, with a warning or deprecation (?) warning mentioning that it is better to use the flow My personal preference is on 4/ (as you may have guessed) but I am perfectly ok to any of the 3 other options (or yet another one). Pinging @mjwen @JaGeo @utf to get your comments/preference/decision on this. We would like to have this merged as soon as possible once it is ready (@gpetretto is doing the latest changes and adding tests as of now) but we would like to have the above decided. We fixed ourselves next Tuesday so if by next Tuesday we don't have any replies or if it leads to too much discussion around it because it's too tricky, we would keep it as it is now (option 1/). Thanks a lot! |
I have restructured the
Tests will thus require the latest version of emmet-core to pass. Not sure if I should set it in this PR. As David mentioned above, as far as we are concerned this is ready to be reviewed and it would be convenient for us if it could be merged. |
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #489 +/- ##
==========================================
+ Coverage 76.00% 76.03% +0.03%
==========================================
Files 83 86 +3
Lines 7027 7111 +84
Branches 1045 1055 +10
==========================================
+ Hits 5341 5407 +66
- Misses 1364 1381 +17
- Partials 322 323 +1
|
Hi @utf Just to know if you had time to look into this PR and if it can be merged ? I've started to implement an MLFF-based MD flow maker and I would like to reuse the MultiMDMaker as one piece of it, so it would be nice to have it merged when you feel it's ok. If there is anything left to be done for this PR, just let us know. (maybe if you wanted to comment or tell your preference on my above comment on the MultiMDMaker name, otherwise we keep it as it is now) Thanks a lot! Best, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidwaroquiers I left a few minor comments.
Thanks for your comments @janosh. I updated the code and tests. |
Thanks @gpetretto and @davidwaroquiers! |
This PR introduces a flow that allows to split an MD simulation over multiple steps and to continue from a previously terminated flow.
The reason for introducing this is that long MD calculations would usually not fit inside a single submission and that the amount of steps required to achive a good convergence might not be know a priori. The idea is that should be easyto split/join MD flows and then reconstruct a full trajectory.
A few points to describe the changes:
n_runs
parameter.MultiMDFlow
.allow_external_references=True
inrun_locally
or inflow_to_workflow
Also some questions/points that could be discussed:
nsteps
,time_step
andn_runs
in one place. Are there better ways to do this?properties
. However, when passing aStructure
to thePoscar
objectsite_properties
andproperties
in the structure are ignored. At the moment I modified the InputSet to take this into account when writing the POSCAR. Would it be better to change pymatgen so that it considers the structure properties when generating aPoscar
?md_output
contain more data (e.g. be a subclass of emmet'sStructureMetadata
)Tests are still missing.
Also pinging @mjwen as the original author of the
MDMaker
.