fetch_process_wbm_dataset.py: bad JSON file checksum #66

Closed
pbenner opened this issue Nov 29, 2023 · 1 comment
Labels
bug (Something isn't working), data (Data loading and processing)

Comments

@pbenner
Collaborator

pbenner commented Nov 29, 2023

matbench-discovery/data/wbm > python fetch_process_wbm_dataset.py
[...]
From: https://drive.google.com/u/0/uc?id=1639IFUG7poaDE2uB6aISUOi65ooBwCIg
To: /home/pbenner/Source/tmp/matbench-discovery-pbenner/data/wbm/raw/wbm-summary.txt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 14.3M/14.3M [00:00<00:00, 118MB/s]
step=1

  File "/home/pbenner/Source/tmp/matbench-discovery-pbenner/data/wbm/fetch_process_wbm_dataset.py", line 113, in <module>
    assert checksum == wbm_struct_json_checksums[step - 1], f"bad JSON file checksum, expected {wbm_struct_json_checksums[step - 1]} but got {checksum}"
AssertionError: bad JSON file checksum, expected -7815922250032563359 but got 10630821823676988257
@janosh
Owner

janosh commented Nov 29, 2023

Thanks for reporting! It's a bit concerning that pandas.util.hash_pandas_object is apparently unstable across versions, but given that those negative checksums came from pandas v1, I guess the v2 major release allowed them to make breaking changes.
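
A side note on the two numbers in the AssertionError: they differ by exactly 2^64, which is what you would see if the same 64-bit value were read once as a signed and once as an unsigned integer. Whether that is what actually changed between pandas versions is an assumption and is not confirmed in this thread. A minimal sketch to check the arithmetic (the helper names `as_uint64` and `as_int64` are hypothetical and not part of fetch_process_wbm_dataset.py):

```python
# Values copied from the AssertionError in the traceback.
expected = -7815922250032563359  # stored checksum for step 1
got = 10630821823676988257       # checksum computed in the failing run

MASK_64 = (1 << 64) - 1


def as_uint64(value: int) -> int:
    """Reinterpret an integer as an unsigned 64-bit value (hypothetical helper)."""
    return value & MASK_64


def as_int64(value: int) -> int:
    """Reinterpret an integer as a signed 64-bit value (hypothetical helper)."""
    value &= MASK_64
    return value - (1 << 64) if value >= (1 << 63) else value


# The two values are the same number modulo 2**64.
print(got - expected == 1 << 64)   # True
print(as_uint64(expected) == got)  # True
print(as_int64(got) == expected)   # True
```

If this reading is right, comparing checksums after normalizing both sides with something like `as_uint64` would accept either representation. This is only a sketch of one possible comparison, not a description of the fix in e901031.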

@janosh janosh added the bug (Something isn't working) and data (Data loading and processing) labels Nov 29, 2023
@janosh janosh closed this as completed in e901031 Nov 29, 2023