
Add ALIGNN FF (aborted) #47

Merged: janosh merged 11 commits from alignn-ff into main on Aug 9, 2023
Conversation

@janosh (Owner) commented on Jul 12, 2023

All credit for this PR goes to @pbenner. Thanks for this submission! 🙏

This PR copies over a subset of files from commit pbenner/matbench-discovery@621d4f1a6.

Update 2023-08-09

This model submission was aborted due to ongoing technical challenges. See the model readme for details.

@janosh added the "new model" (Model submission) label on Jul 12, 2023
doi: https://doi.org/10.1039/D2DD00096B
preprint: https://arxiv.org/abs/2209.05554
requirements:
  ase: 3.22.0
@janosh (Owner, Author) commented on Jul 12, 2023

@pbenner Could you check if these package version numbers are correct (i.e. match the ones you were using for this submission)?

@pbenner (Collaborator) replied

Yes, the versions are correct, except for ase, which should be 3.22.1.

  pandas: 2.0.1
  scikit-learn: 1.2.2
  torch: 1.9.0+cu111
trained_for_benchmark: false
@janosh (Owner, Author) commented

@pbenner Just to clarify, you used alignnff_wt10?

@pbenner (Collaborator) replied

Exactly, it was called best_model.pt before:
alignn/ff/best_model.pt → alignn/ff/alignnff_wt10/best_model.pt
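For context, a minimal sketch of loading this checkpoint as an ASE calculator, assuming the alignn.ff API of the time (AlignnAtomwiseCalculator plus default_path, which resolves to the alignnff_wt10 directory):

from alignn.ff.ff import AlignnAtomwiseCalculator, default_path
from ase.build import bulk

model_path = default_path()  # directory containing best_model.pt for alignnff_wt10
atoms = bulk("Cu")  # toy structure purely for illustration
atoms.calc = AlignnAtomwiseCalculator(path=model_path)
print(atoms.get_potential_energy())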

@pbenner (Collaborator) commented

Not sure if it makes sense to test some of the newly added models. I can give it a try and at least check whether the convergence is better.


import numpy as np
import pandas as pd
from pqdm.processes import pqdm
@janosh (Owner, Author) commented

Can we work around this new dependency? I'm not sure how it differs from tqdm. Is tqdm lacking parallel-process support?

@pbenner (Collaborator) replied

Yes, you can just use tqdm. For parallel processing, pqdm has a bit nicer output, that's all.
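A minimal sketch of that swap, wrapping a standard process pool with tqdm instead of pqdm (the worker function and inputs are illustrative placeholders, not code from this PR):

from concurrent.futures import ProcessPoolExecutor

from tqdm import tqdm

def process_one(item: int) -> int:
    # placeholder for the real per-structure work (e.g. an ASE relaxation)
    return item**2

if __name__ == "__main__":
    items = list(range(100))
    with ProcessPoolExecutor() as pool:
        # tqdm wraps the ordered result iterator, giving a progress bar much like pqdm's
        results = list(tqdm(pool.map(process_one, items), total=len(items)))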

@janosh (Owner, Author) commented on Jul 14, 2023

@pbenner I started evaluating the ALIGNN FF predictions and they look bad enough that I think we may have made a mistake. Are we sure ALIGNN FF was trained on the MP legacy correction scheme (MaterialsProjectCompatibility)? Perhaps they trained on raw VASP energies? We'll have to consult the paper or check the code base.

[figure: alignn-ff rolling hull distance MAE]
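For reference, a sketch of how the legacy scheme adjusts an entry's energy via pymatgen; the entry metadata below is illustrative, whereas the actual evaluation processes WBM/MP ComputedStructureEntries:

from pymatgen.entries.compatibility import MaterialsProjectCompatibility
from pymatgen.entries.computed_entries import ComputedEntry

# a GGA+U Fe2O3 entry carrying the metadata the legacy scheme inspects (illustrative values)
entry = ComputedEntry(
    "Fe2O3",
    energy=-67.0,
    parameters={
        "run_type": "GGA+U",
        "is_hubbard": True,
        "hubbards": {"Fe": 5.3, "O": 0.0},
        "potcar_symbols": ["PBE Fe_pv", "PBE O"],
    },
    data={"oxide_type": "oxide"},
)
corrected = MaterialsProjectCompatibility().process_entries([entry])
if corrected:  # entries failing pymatgen's compatibility checks are dropped
    print(corrected[0].correction)  # summed anion + Hubbard U corrections in eV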

Commit: 2023-07-11-alignn-ff-wbm-IS2RE.csv.gz rename pred_col
@pbenner (Collaborator) commented on Jul 15, 2023

I think I forgot to remove the energy corrections for the custom ALIGNN model. I uploaded two new files:

  • 2023-07-15-custom-alignn-relaxed-wbm-IS2RE.csv.gz: custom model without energy correction
  • 2023-07-15-mp_e_form_alignn-relaxed-wbm-IS2RE.csv.gz: pre-trained model with correction applied

But still, I would like to check the ALIGNN-FF model in detail. I will do that next week. Maybe it makes sense to train it from scratch.

@janosh (Owner, Author) commented on Jul 15, 2023

Just to clarify, the model you trained from scratch that produced 2023-07-15-custom-alignn-relaxed-wbm-IS2RE.csv.gz was trained on MP ComputedStructureEntries that already include the 2020 MP energy correction scheme. So the predictions it makes are also corrected energies? By "custom model without energy correction", do you mean you subtracted the energy corrections again from every model prediction?

@pbenner (Collaborator) commented on Jul 16, 2023

By "custom model without energy correction" I meant that I didn't apply them. It's this if-statement that I forgot to add to your script:
https://github.com/pbenner/matbench-discovery/blob/25616eeba7e71b0909feccddfef88dfc63c60314/models/alignn/test_alignn_relaxed.py#L118
You can find the changes here in this commit:
pbenner@25616ee
And I re-executed the script twice with the corresponding model_name variable set. Hope this makes sense.

Just also added a few comments to the script.
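A hedged reconstruction of the kind of branch described above; the real one is in the linked commit, and all names and numbers here are placeholders:

# Placeholder reconstruction, not the actual test_alignn_relaxed.py logic.
model_name = "mp_e_form_alignn"  # the pre-trained model; "custom_alignn" was trained on corrected energies

e_pred = {"wbm-1-1": -1.23, "wbm-1-2": -0.87}  # model-predicted energies, eV/atom (dummy values)
corrections = {"wbm-1-1": -0.50, "wbm-1-2": 0.0}  # per-entry MP energy corrections, eV/atom (dummy values)

if model_name == "mp_e_form_alignn":
    # the pre-trained model predicts uncorrected energies, so apply corrections here;
    # the custom model already learned corrected energies, so it skips this branch
    for key, corr in corrections.items():
        e_pred[key] += corr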

@pbenner (Collaborator) commented on Jul 21, 2023

Since ALIGNN-FF was pre-trained on the JARVIS data, which is DFT simulation data computed with the OptB88vdW functional (with van der Waals correction), the results are not comparable and I would drop it. Would you consider adding training data (forces/stresses) to Matbench Discovery that allows the user to train a force field model? I think this would be very valuable and would make the benchmark applicable to a wider class of models for which no good pre-trained model exists.

@janosh (Owner, Author) commented on Jul 21, 2023

> Since ALIGNN-FF was pre-trained on the JARVIS data, which is DFT simulation data computed with the OptB88vdW functional (with van der Waals correction), the results are not comparable and I would drop it.

That's one option. We could also add it anyway with a big disclaimer just to highlight the importance of homogeneous DFT settings between train and test set.

> Would you consider adding training data (forces/stresses) to Matbench Discovery that allows the user to train a force field model?

Yes, that data is definitely part of the MBD training set. I have a local copy of the energies, forces, stresses and magmoms for every ionic step in the MP database. It's ~14.7 GB. I was planning to upload it to figshare at some point.

But considering my local data is identical to the MPtraj dataset used to train CHGNet (~11.3 GB on Figshare), except for some cleaning that most people will benefit from, I think we might as well just point to that.

@JaGeo commented on Jul 21, 2023

> That's one option. We could also add it to the benchmark anyway with a big disclaimer just to highlight how important homogeneous DFT settings between train and test set are.

I wouldn't do this. I think it should be obvious to anyone using DFT that without a compatibility scheme you should not do any kind of comparison.

@pbenner (Collaborator) commented on Jul 23, 2023

I converted the MPtraj dataset to an ALIGNN-compatible format:
https://github.com/pbenner/matbench-discovery/blob/alignn/models/alignn/make_train_data_ff.py
It will probably take too long to train ALIGNN-FF, but I'm just giving it a try.
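Roughly, the conversion turns each MPtraj frame into the id/atoms/energy/forces/stresses records that ALIGNN-FF training consumes. A sketch under assumed field names (the actual logic is in make_train_data_ff.py linked above; neither the MPtraj keys nor the output keys are copied from it):

# Sketch only: the MPtraj field names ("structure", "uncorrected_total_energy", "force",
# "stress") and the output record keys are assumptions, not taken from make_train_data_ff.py.
import json

from jarvis.core.atoms import pymatgen_to_atoms
from pymatgen.core import Structure

with open("MPtrj_2022.9_full.json") as file:  # the CHGNet training-data release on Figshare
    mp_trj = json.load(file)

records = []
for material_id, frames in mp_trj.items():
    for frame_id, frame in frames.items():
        records.append({
            "jid": frame_id,
            "atoms": pymatgen_to_atoms(Structure.from_dict(frame["structure"])).to_dict(),
            "total_energy": frame["uncorrected_total_energy"],
            "forces": frame["force"],
            "stresses": frame["stress"],
        })

with open("id_prop.json", "w") as file:
    json.dump(records, file)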

@janosh (Owner, Author) commented on Jul 25, 2023

> I wouldn't do this. I think it should be obvious to anyone using DFT that without a compatibility scheme you should not do any kind of comparison.

Thanks for your input @JaGeo! Always good to hear others' thoughts. I'm leaning the same way.

> It will probably take too long to train ALIGNN-FF, but I'm just giving it a try.

@pbenner That's great! Thanks for the update!

Even if that doesn't pan out, I might still merge this PR just to keep a record that we tried to add ALIGNN-FF and show how far we got, as well as explain in a readme why we aborted.

@pbenner (Collaborator) commented on Aug 6, 2023

I tried to optimize the ALIGNN-FF training. Unfortunately, the model is quite resource-hungry: the 12 GB of CHGNet training data turns into 600 GB of graph data. That still fits into memory, but it prevents me from using the accelerator for training on multiple GPUs, since that creates a copy for each process. Also, a batch size of 16 already fills all 80 GB of GPU memory, and training is quite slow with such a small batch size (roughly a day per epoch).

I'm now fine-tuning the ALIGNN-FF WT10 model on the CHGNet data, which hopefully takes fewer epochs than training a new model from scratch.
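A generic PyTorch sketch of that setup: initialize from the pre-trained checkpoint rather than random weights, then continue training on the new data. The tiny model below is a stand-in, not the ALIGNN-FF architecture or its training code:

import torch
from torch import nn

class TinyModel(nn.Module):
    """Stand-in for the ALIGNN-FF architecture; real fine-tuning would load alignnff_wt10/best_model.pt."""
    def __init__(self) -> None:
        super().__init__()
        self.layer = nn.Linear(4, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layer(x)

# pretend this checkpoint holds the JARVIS-trained WT10 weights
torch.save(TinyModel().state_dict(), "best_model.pt")

model = TinyModel()
model.load_state_dict(torch.load("best_model.pt"))  # start from pre-trained weights, not random init
# fine-tuning typically uses a smaller learning rate than training from scratch
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)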

@janosh (Owner, Author) commented on Aug 8, 2023

> I'm now fine-tuning the ALIGNN-FF WT10 model on the CHGNet data, which hopefully takes fewer epochs than training a new model from scratch.

Thanks for trying @pbenner! I think we've done everything that can reasonably be expected here to get ALIGNN FF into MBD. Rather than spend more time on it, how about we call it quits and move on to wrapping up the preprint?

@pbenner (Collaborator) commented on Aug 9, 2023

Yes, agreed. I also saw that the WT10 model trained on JARVIS has a worse initial loss on the CHGNet data than the untrained model, which indicates that these two datasets are very much incompatible. I'll let it rest for now...

janosh merged commit d324b02 into main on Aug 9, 2023
janosh deleted the alignn-ff branch on Aug 9, 2023
janosh changed the title from "Add ALIGNN FF" to "Add ALIGNN FF (aborted)" on Aug 10, 2023