Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update POTCAR summary stats to include 6.4 POTCARs and add dev_script utils for future updates #3370

Merged
merged 17 commits into from
Oct 3, 2023

Conversation

esoteric-ephemera
Copy link
Contributor

@esoteric-ephemera esoteric-ephemera commented Oct 2, 2023

Major changes:

  • Follow-up to issue #3342 and closed PR #3351
  • Adds support for version 64 POTCARs
  • Adds hidden function for PMG devs to continuously update potcar_summary_stats.json.gz used to validate potcars, pymatgen.io.vasp.inputs._gen_potcar_summary_stats
  • Added functionality to generate fake POTCARs from real POTCARs (by randomizing numeric and boolean values in them) for tests in dev_scripts/potcar_scrambler.py

Comment on lines 2286 to 2290
def _gen_potcar_summary_stats():
"""
This function solely intended to be used for PMG development to regenerate the
potcar_summary_stats.json.gz file used to validate POTCARs
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect! This is great to have. Even though it's marked as private, we should add a test for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a Test_gen_potcar_summary_stats to tests/io/vasp/test_inputs.py. Requires generating a library of fake POTCARs, which are in tests/fake_POTCAR_library. Code to generate this library from a real library of POTCARs is in dev_scripts/potcar_scrambler.py

@janosh janosh added enhancement A new feature or improvement to an existing one io Input/output functionality vasp Vienna Ab initio Simulation Package labels Oct 2, 2023
@@ -2280,6 +2283,43 @@ def __repr__(self) -> str:
return f"{cls_name}({symbol=}, {functional=}, {TITEL=}, {VRHFIN=}, {n_valence_elec=:.0f})"


def _gen_potcar_summary_stats():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine here, but noting there is also a top-level dev_scripts folder (largely undocumented). On balance it's probably better here for discoverability for anyone curious how this file was generated.

Minor comments: the bare return is unnecessary, and monty.serialization.dumpfn can be used directly with dumpfn(new_summary_stats, "filename.json.gz") which is a convenience function we use a lot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Much better way of doing this. For placement, you and @janosh probably have a better sense of where this belongs. _gen_potcar_summary_stats is a "destructive" function, in that it overwrites the file that's distributed with PMG. dev_scripts might then be a better place for it

Copy link
Member

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work, thanks @esoteric-ephemera! 👍

@janosh janosh changed the title Updating POTCAR summary stats to include 64 version POTCARs Update POTCAR summary stats to include 6.4 POTCARs and add dev_script utils for future updates Oct 3, 2023
@janosh janosh enabled auto-merge (squash) October 3, 2023 23:00
@janosh janosh disabled auto-merge October 3, 2023 23:11
@janosh janosh merged commit 8c594d7 into materialsproject:master Oct 3, 2023
@lan496
Copy link
Contributor

lan496 commented Oct 4, 2023

Hi. I think pymatgen/io/vasp/potcar_summary_stats.json.gz should be added to setup.package_data to use it via pip-installed version.

pymatgen/setup.py

Lines 95 to 119 in 44bf383

package_data={
"pymatgen.analysis": ["*.yaml", "*.json", "*.csv"],
"pymatgen.analysis.chemenv": [
"coordination_environments/coordination_geometries_files/*.json",
"coordination_environments/coordination_geometries_files/*.txt",
"coordination_environments/strategy_files/ImprovedConfidenceCutoffDefaultParameters.json",
],
"pymatgen.analysis.structure_prediction": ["*.yaml", "data/*.json"],
"pymatgen.analysis.diffraction": ["*.json"],
"pymatgen.analysis.magnetism": ["default_magmoms.yaml"],
"pymatgen.analysis.solar": ["am1.5G.dat"],
"pymatgen.entries": ["*.json.gz", "*.yaml", "data/*.json"],
"pymatgen.core": ["*.json"],
"pymatgen": ["py.typed"],
"pymatgen.io.vasp": ["*.yaml", "*.json"],
"pymatgen.io.feff": ["*.yaml"],
"pymatgen.io.cp2k": ["*.yaml"],
"pymatgen.io.lobster": ["lobster_basis/*.yaml"],
"pymatgen.command_line": ["*"],
"pymatgen.util": ["structures/*.json", "*.json"],
"pymatgen.vis": ["*.yaml"],
"pymatgen.io.lammps": ["CoeffsDataType.yaml", "templates/*.template"],
"pymatgen.symmetry": ["*.yaml", "*.json", "*.sqlite"],
"cmd_line": ["**/*"],
},

@janosh
Copy link
Member

janosh commented Oct 4, 2023

Ah crap!

@mkhorton
Copy link
Member

mkhorton commented Oct 4, 2023

Not the first time don’t worry! I once added a catch-all to package all .json, .yaml files by default but this was vetoed for reasons I forget, but perhaps worth reconsidering

janosh added a commit that referenced this pull request Oct 4, 2023
….package_data` (#3372)

* fix bare URLS

they don't work in https://pymatgen.org/CHANGES.html

* add pymatgen/io/vasp/potcar_summary_stats.json.gz to setup.package_data

thanks #3370 (comment)
@janosh
Copy link
Member

janosh commented Oct 4, 2023

I once added a catch-all to package all .json, .yaml files by default but this was vetoed for reasons I forget, but perhaps worth reconsidering

I think it's worth a try. @shyuep Any concerns?

If we get false positives, we can always revert the change.

@shyuep
Copy link
Member

shyuep commented Oct 4, 2023

My concern is that data files should be few and far between and EXPLICIT.

@janosh
Copy link
Member

janosh commented Oct 4, 2023

My concern is that data files should be few and far between and EXPLICIT.

Exactly, which is why the ones we deliberately place in the pymatgen pkg dir should be there for a good reason and hence also be included in the PyPI release. At least that's the case right now as you can see from d8151e5 passing.

@shyuep
Copy link
Member

shyuep commented Oct 4, 2023

Yes, that is why it should be explicitly specified. For the same reason, you do not usually do from package import *.

@janosh
Copy link
Member

janosh commented Oct 4, 2023

The biggest reason not to do from package import * imo is that it breaks tooling. IDEs can trace where things come from so no doc string glancing, no type hints and worse code completions. But none of those apply here, there's no tooling for pkg data anyway. So far, there's been a 1-to-1 map from data files in pymatgen to things included in SOURCES.txt so we might as well make the map automatic imo.

@shyuep
Copy link
Member

shyuep commented Oct 4, 2023

There are reasons. Say someone decides to add a very_big_file.yaml for some internal testing purposes. That file will automatically be added to package data, even if it has no real purpose in actual function of the main code. Like I said, it is better to be explicit what files are needed and what are not.

@janosh
Copy link
Member

janosh commented Oct 4, 2023

I don't merge very big files into master. Not even in test files, let alone in pkg dir. Test files should never be placed in pkg dir.

@shyuep
Copy link
Member

shyuep commented Oct 4, 2023

You don't != Someone else don't. Anyway, we don't add data files all the time. So just leave it as is. I don;'t want to go round and round on an issue that has occurred once in two years.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement to an existing one io Input/output functionality vasp Vienna Ab initio Simulation Package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants