Why POTCAR hash from mp_api doesn't match local POTCAR file hash: explained by md5_header_hash vs md5_computed_file_hash. #944

Closed
hongyi-zhao opened this issue Nov 26, 2024 · 4 comments

Comments

@hongyi-zhao

Here is what I checked:

In [1]: from mp_api.client import MPRester
   ...: 
   ...: material_id = "mp-126"
   ...: with MPRester() as mpr:
   ...:     # Query the elasticity data to obtain the task ID
   ...:     elasticity_doc = mpr.materials.elasticity.search(material_ids=[material_id])
   ...:     opt_id = elasticity_doc[0].fitting_data.optimization_task.string
   ...: 
   ...:     # Use the task ID to fetch the detailed task document
   ...:     opt_doc = mpr.materials.tasks.search([opt_id],
   ...:         fields=["input", "calcs_reversed"])
   ...: 
   ...:     # Extract the potcar_spec
   ...:     potcar_spec = opt_doc[0].calcs_reversed[0].input.potcar_spec
   ...:     print(f"POTCAR specifications: {potcar_spec}")
   ...: 
Retrieving ElasticityDoc documents: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 10618.49it/s]
Retrieving TaskDoc documents: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 12557.80it/s]
POTCAR specifications: [PotcarSpec(titel='PAW_PBE Pt 05Jan2001', hash='a604ea3c6a9cc23c739b762f625cf449', summary_stats=None)]

The matching POTCAR files on my machine:

werner@x13dai-t:~/Desktop/pot$ ug 'PAW_PBE Pt 05Jan2001' -l | xargs md5sum
b1f52f99a6ad883e9383660a9fc1eebf  potpaw_PBE/Pt_ZORA/POTCAR
079fdad3df93dbceac18c56a69852e29  potpaw_PBE/Pt/POTCAR

As you can see, neither local MD5 sum matches the hash returned by mp_api. So which exact POTCAR version does MP use?

Regards,
Zhao

@hongyi-zhao hongyi-zhao changed the title About the potcar used by MP. About the exact potcar version used by MP. Nov 26, 2024
@tschaume
Member

That's a good question but it's better asked on our forum for the benefit of the whole community. Please search for existing explanations/solutions on the forum before posting your question there. Thanks!

@hongyi-zhao
Author

hongyi-zhao commented Nov 27, 2024

See here for the related discussion.

@hongyi-zhao
Author

hongyi-zhao commented Nov 27, 2024

I understand the issue now. There is nothing wrong with the POTCAR. The hash returned by the mp_api query is the MD5 hash of the metadata defining the PotcarSingle (PotcarSingle.md5_header_hash), not the MD5 hash of the entire PotcarSingle (PotcarSingle.md5_computed_file_hash), which is what md5sum computes on the file. Both kinds of hash are stored in pymatgen, as the following searches show:

# The `MD5 hash of the metadata defining the PotcarSingle`:
werner@x13dai-t:~/Public/repo/github.com/materialsproject/pymatgen.git$ ug 'a604ea3c6a9cc23c739b762f625cf449' -l
tests/files/surfaces/ucell_entries.txt
docs/pymatgen.io.vasp.html
docs/searchindex.js
src/pymatgen/io/vasp/MITRelaxSet.yaml
src/pymatgen/io/vasp/vasp_potcar_pymatgen_hashes.json
tests/files/io/vasp/fixtures/grid_data_files/vasp_inputs_for_grid_check.json

# The `MD5 hash of the entire PotcarSingle`:
werner@x13dai-t:~/Public/repo/github.com/materialsproject/pymatgen.git$ ug 'b1f52f99a6ad883e9383660a9fc1eebf' -l
src/pymatgen/io/vasp/vasp_potcar_file_hashes.json
werner@x13dai-t:~/Public/repo/github.com/materialsproject/pymatgen.git$ ug '079fdad3df93dbceac18c56a69852e29' -l
src/pymatgen/io/vasp/vasp_potcar_file_hashes.json
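
To illustrate why the two kinds of hash cannot be compared with each other, here is a minimal standard-library sketch. It is not pymatgen's implementation: the helper names `parse_header`, `header_hash`, and `file_hash`, the toy file contents, and the way the metadata is serialized before hashing are all assumptions made for illustration. It only shows that a hash over the metadata ignores the numerical body, while a hash over the whole file does not.

```python
import hashlib


def parse_header(text: str) -> dict:
    """Collect 'KEY = value' lines as metadata (toy parser, hypothetical)."""
    keywords = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            keywords[key.strip()] = value.strip()
    return keywords


def header_hash(keywords: dict) -> str:
    """MD5 over the sorted metadata key/value pairs only."""
    joined = "".join(f"{k}={keywords[k]};" for k in sorted(keywords))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def file_hash(text: str) -> str:
    """MD5 over the complete file contents, like md5sum."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Two toy "POTCAR-like" files: identical metadata, different numerical body.
toy_a = "TITEL = PAW_PBE Xx 01Jan2000\nZVAL = 10.0\n0.1 0.2 0.3\n"
toy_b = "TITEL = PAW_PBE Xx 01Jan2000\nZVAL = 10.0\n0.1 0.2 0.4\n"

print(header_hash(parse_header(toy_a)) == header_hash(parse_header(toy_b)))  # True
print(file_hash(toy_a) == file_hash(toy_b))  # False
```

In this toy setting, the metadata hash and the whole-file hash of the same file are simply different quantities, so comparing one against the other will never succeed.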

The PotcarSpec used by the MP database and retrieved by mp_api can be generated as follows:

In [15]: from pymatgen.io.vasp import Potcar
    ...: from collections import namedtuple
    ...: import os
    ...: 
    ...: PotcarSpec = namedtuple('PotcarSpec', ['titel', 'hash', 'summary_stats'])
    ...: 
    ...: # Expand the ~ in the path
    ...: filepath = os.path.expanduser("~/Public/hpc/vasp/pot/pmg_potcar/POT_GGA_PAW_PBE/POTCAR.Pt.gz")
    ...: 
    ...: # Read POTCAR and create specs
    ...: potcar = Potcar.from_file(filepath)
    ...: 
    ...: specs = [
    ...:     PotcarSpec(
    ...:         titel=p.titel,
    ...:         hash=p.md5_header_hash,
    ...:         summary_stats=p._summary_stats
    ...:     )
    ...:     for p in potcar
    ...: ]
    ...: 
    ...: print(f"POTCAR specifications: {specs}")
POTCAR specifications: [PotcarSpec(titel='PAW_PBE Pt 05Jan2001', hash='a604ea3c6a9cc23c739b762f625cf449', summary_stats={'keywords': {'header': ['dexc', 'eatom', 'eaug', 'enmax', 'enmin', 'iunscr', 'lcor', 'lexch', 'lpaw', 'lultra', 'ndata', 'orbitaldescriptions', 'pomass', 'qcut', 'qgam', 'raug', 'rcloc', 'rcore', 'rdep', 'rmax', 'rpacor', 'rrkj', 'rwigs', 'step', 'titel', 'vrhfin', 'zval'], 'data': ['localpart', 'gradientcorrectionsusedforxc', 'corecharge-density(partial)', 'atomicpseudocharge-density', 'nonlocalpart', 'reciprocalspacepart', 'realspacepart', 'reciprocalspacepart', 'realspacepart', 'nonlocalpart', 'reciprocalspacepart', 'realspacepart', 'reciprocalspacepart', 'realspacepart', 'nonlocalpart', 'reciprocalspacepart', 'realspacepart', 'reciprocalspacepart', 'realspacepart', 'pawradialsets', '(5e20.12)', 'augmentationcharges(nonsperical)', 'uccopanciesinatom', 'grid', 'aepotential', 'corecharge-density', 'kineticenergy-density', 'pspotential', 'corecharge-density(pseudized)', 'pseudowavefunction', 'aewavefunction', 'pseudowavefunction', 'aewavefunction', 'pseudowavefunction', 'aewavefunction', 'pseudowavefunction', 'aewavefunction', 'pseudowavefunction', 'aewavefunction', 'pseudowavefunction', 'aewavefunction', 'endofdataset']}, 'stats': {'header': {'MEAN': 49.75002565203251, 'ABSMEAN': 49.81755410731707, 'VAR': 11430.679203375377, 'MIN': -4.114, 'MAX': 729.1171}, 'data': {'MEAN': 101824.1204566872, 'ABSMEAN': 101837.40806970962, 'VAR': 565262541893.0613, 'MIN': -175.707504926, 'MAX': 7968615.30025}}})]

The following code snippet can be used to compute the two hashes mentioned above:

In [31]: from pymatgen.io.vasp.inputs import PotcarSingle

In [32]: psingle = PotcarSingle.from_file('/home/werner/Desktop/pot/potpaw_PBE/Pt/POTCAR')
In [33]: psingle.md5_header_hash
Out[33]: 'a604ea3c6a9cc23c739b762f625cf449'

In [34]: psingle.md5_computed_file_hash
Out[34]: '079fdad3df93dbceac18c56a69852e29'

In [35]: psingle
Out[35]: PotcarSingle(symbol='Pt', functional='PBE', TITEL='PAW_PBE Pt 05Jan2001', VRHFIN='Pt: s1d9', n_valence_elec=10)
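
With both values in hand, a small helper can report which of the two local hashes a value returned by mp_api corresponds to. This is a sketch only: the function name `identify_hash` is hypothetical, and the hash strings are the ones quoted in this thread rather than values read from a file.

```python
def identify_hash(reported: str, header_hash: str, file_hash: str) -> str:
    """Return which local hash the reported value equals, if any."""
    reported = reported.strip().lower()
    if reported == header_hash.lower():
        return "md5_header_hash"
    if reported == file_hash.lower():
        return "md5_computed_file_hash"
    return "no match"


# Values quoted above for potpaw_PBE/Pt/POTCAR.
reported = "a604ea3c6a9cc23c739b762f625cf449"      # hash in the mp_api PotcarSpec
local_header = "a604ea3c6a9cc23c739b762f625cf449"  # psingle.md5_header_hash
local_file = "079fdad3df93dbceac18c56a69852e29"    # psingle.md5_computed_file_hash

print(identify_hash(reported, local_header, local_file))  # md5_header_hash
```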

See here for the corresponding source code implementation. See here and here for the related discussion.

@tschaume
Member

Thanks for following up!

@hongyi-zhao hongyi-zhao changed the title About the exact potcar version used by MP. Why POTCAR hash from mp_api doesn't match local POTCAR file hash: explained by md5_header_hash vs md5_computed_file_hash. Nov 28, 2024