
How to judge if the training is convergent #19

Open

bingsimon opened this issue Mar 26, 2024 · 2 comments

@bingsimon

Hello, I have recently been learning to use jax-reaxff, and I don't know how to tell whether the force field training has converged. I have the following three questions and hope you can answer them:

  1. In the SI of your article, there are convergence plots for three examples from Datasets. Does the ordinate in those figures refer to the 'True total loss' value reported for each iteration in the output file?
  2. I ran the program with the ffield_lit, geo, params, and trainset.in files from Datasets/disulfide and got the convergence curve shown in the attached screenshot. It does not seem to converge, but I don't know what the problem is.
    [attached screenshot: convergence plot]
    Here is the command I used:
    jaxreaxff --init_FF ffield_lit \
        --params params \
        --geo geo \
        --train_file trainset.in \
        --num_e_minim_steps 200 \
        --e_minim_LR 1e-3 \
        --out_folder ffields \
        --save_opt all \
        --num_trials 10 \
        --num_steps 20 \
        --init_FF_type fixed
  3. Is there a clear convergence criterion for judging whether the training results have converged?

I am looking forward to your answer and thank you for your time!

@cagrikymk
Owner

cagrikymk commented Mar 30, 2024

Hello,
There is nothing wrong with the way you run the code, and yes, the figures from the SI report the true loss value.
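For illustration, here is a minimal, self-contained sketch of how the per-iteration true loss could be plotted and checked for stagnation. The loss array below is synthetic, and the `window`/`tol` thresholds are arbitrary placeholders rather than anything defined by jax-reaxff:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data: in practice these would be the per-iteration "True total loss"
# values copied from the jaxreaxff output; a synthetic decaying curve is used here
# so the script runs on its own.
true_loss = 5000.0 * np.exp(-0.05 * np.arange(100)) + 300.0

# Convergence plot (log scale helps when the loss drops by orders of magnitude).
plt.plot(true_loss)
plt.xlabel("Iteration")
plt.ylabel("True total loss")
plt.yscale("log")
plt.show()

# One simple, informal stagnation check: has the best loss improved by more than a
# small relative tolerance over the last `window` iterations?
window, tol = 20, 1e-3
best_recent = true_loss[-window:].min()
best_earlier = true_loss[:-window].min()
converged = (best_earlier - best_recent) / best_earlier < tol
print(f"stagnated/converged: {converged}")
```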

I recently extensively rewrote the code, so the results may differ from those in the paper. However, you should still be able to achieve similar fitness scores overall.

Assuming you have the most up-to-date code from the master branch, I believe the issue you are encountering is related to the noise that is added to the parameters when the optimizer gets stuck. I have incorporated logic for scenarios where the optimizer becomes trapped in a local minimum and fails to make progress: in such cases, I introduce slight noise to the parameters and continue the optimization. This can cause fluctuations in the current true loss value, but since I keep track of the best loss encountered during optimization, it should not negatively affect the overall performance of the optimizer (the final true loss value).

If you want to reduce the amount of noise added, you can modify the following line in your local copy: driver.py, line 153.

I hand-tuned and adjusted some parameters to simplify overall usage. In the original paper, I set that value to 0.01, whereas now it is 0.04, which is a more aggressive noise setting. I might make this parameter modifiable through an argument to the driver.
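For intuition only, here is a minimal sketch of the pattern described above: keep the best parameters and best true loss seen so far, and perturb the parameters with small noise when progress stalls. This is not the actual driver.py code; `perturb_params`, `gradient_step`, `optimize_with_restarts`, and the `patience` threshold are hypothetical names used only for this illustration:

```python
import jax
import jax.numpy as jnp

NOISE_SCALE = 0.04  # per the comment above: 0.01 in the paper, 0.04 in the current code


def perturb_params(key, params, scale=NOISE_SCALE):
    """Add small multiplicative uniform noise to escape a local minimum."""
    noise = jax.random.uniform(key, shape=params.shape, minval=-scale, maxval=scale)
    return params * (1.0 + noise)


def gradient_step(loss_fn, params, lr=1e-3):
    """Plain gradient-descent update, standing in for the real optimizer."""
    return params - lr * jax.grad(loss_fn)(params)


def optimize_with_restarts(loss_fn, params, key, num_steps=200, patience=10):
    """Track the best parameters seen; inject noise after `patience` steps without progress."""
    best_params, best_loss = params, loss_fn(params)
    stall = 0
    for _ in range(num_steps):
        params = gradient_step(loss_fn, params)
        current = loss_fn(params)
        if current < best_loss:
            best_params, best_loss, stall = params, current, 0
        else:
            stall += 1
        if stall >= patience:  # stuck: perturb the best parameters and continue
            key, sub = jax.random.split(key)
            params = perturb_params(sub, best_params)
            stall = 0
    return best_params, best_loss


if __name__ == "__main__":
    # Toy quadratic loss purely for demonstration.
    toy_loss = lambda p: jnp.sum((p - 3.0) ** 2)
    best_p, best_l = optimize_with_restarts(toy_loss, jnp.zeros(5), jax.random.PRNGKey(0))
    print(best_l)
```

Because the best parameters are restored before each perturbation, the noise can temporarily increase the reported loss without losing the best result found so far, which matches the fluctuations described above.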

I hope this answers your question; let me know if anything is not clear.

@bingsimon
Author

Thank you very much for your answer.
