Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

👌 IMPROVE: links to materials cloud SSSP archive #104

Merged
merged 10 commits into from
Aug 13, 2021
32 changes: 26 additions & 6 deletions aiida_pseudo/cli/install.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
import pathlib
import shutil
import tempfile
import yaml

import click
import requests
Expand Down Expand Up @@ -97,20 +98,37 @@ def download_sssp(
filepath_archive: pathlib.Path,
filepath_metadata: pathlib.Path,
traceback: bool = False
) -> None:
) -> str:
"""Download the pseudopotential archive and metadata for an SSSP configuration to a path on disk.

:param configuration: the SSSP configuration to download.
:param filepath_archive: absolute filepath to write the pseudopotential archive to.
:param filepath_metadata: absolute filepath to write the metadata file to.
:param traceback: boolean, if true, print the traceback when an exception occurs.
:return: Latest patch version of the requested minor version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now just need to update the return type of this function as well and than this is good to merge

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Righto! Missed that one, thanks

Copy link
Contributor

@sphuber sphuber Aug 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More a sign that we should probably at some point at mypy to the pre-commit. On the one hand I like added typing, but on the other hand mypy has been given so much trouble that sometimes it annoys me to no end. :\rock: me :hardplace:

"""
from aiida_pseudo.groups.family import SsspFamily
from .utils import attempt

url_sssp_base = 'https://legacy-archive.materialscloud.org/file/2018.0001/v4/'
url_archive = f"{url_sssp_base}/{SsspFamily.format_configuration_filename(configuration, 'tar.gz')}"
url_metadata = f"{url_sssp_base}/{SsspFamily.format_configuration_filename(configuration, 'json')}"
url_template = 'https://archive.materialscloud.org/record/file?filename={filename}&parent_id=19'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes. Everything looks good to me now, except I still have a question about this magic parent_id=19. I have been trying to figure out how it relates to the main archive entry. I take it it refers to the actual archive record: https://archive.materialscloud.org/record/2021.76
As you can see, there in the URL it uses an ID that seems based on the year and some auto-incrementing number. But this ID changes if a new version of the SSSP record is created. The question is whether the parent_id=19 is persistent and doesn't change. I.e., when a v8 will be released with the next patch version of 1.1, i.e. 1.1.3, will the URL still work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question indeed! I think it doesn't change, but let's ask @vgranata to be sure.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parent_id is persistent within versions of the same record, meaning that it does not change if a new version of a record is created.
If you create a new SSSP record that is not a new version of https://archive.materialscloud.org/record/2021.76 then the parent_id will be different.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK Thanks @vgranata that should be fine. Then we can almost merge this @mbercx . Only thing is that we should probably add the patch version to the description of the group. Otherwise it will only be possible to know what patch version a certain family is by looking at the md5 of the tarball that was downloaded, which is contained in the description. I think it would make sense to store the patch at the very least in the description of the family. And then the tests should be fixed of course, because they are failing


# Download the dictionary mapping of the minor versions to the latest corresponding patch versions. Since patch
# releases of the SSSP only contain bug fixes, there is no reason to have the user install an outdated patch
# version. So, the latest patch version of the minor version that is specified by the user is always installed.
with attempt('downloading patch versions information... ', include_traceback=traceback):
response = requests.get(url_template.format(filename='versions.yaml'))
response.raise_for_status()
# The `version_mapping` is a dictionary that maps each minor version (key) to the latest patch version (value)
version_mapping = yaml.load(response.content, Loader=yaml.SafeLoader)
patch_version = version_mapping[configuration.version]

echo.echo_info(f'Latest patch version found: {patch_version}')

archive_filename = SsspFamily.format_configuration_filename(configuration, 'tar.gz', patch_version)
metadata_filename = SsspFamily.format_configuration_filename(configuration, 'json', patch_version)

url_archive = url_template.format(filename=archive_filename)
url_metadata = url_template.format(filename=metadata_filename)

with attempt('downloading selected pseudopotentials archive... ', include_traceback=traceback):
response = requests.get(url_archive)
Expand All @@ -126,6 +144,8 @@ def download_sssp(
handle.write(response.content)
handle.flush()

return patch_version


def download_pseudo_dojo(
configuration: SsspConfiguration,
Expand Down Expand Up @@ -185,7 +205,6 @@ def cmd_install_sssp(version, functional, protocol, download_only, traceback):

configuration = SsspConfiguration(version, functional, protocol)
label = SsspFamily.format_configuration_label(configuration)
description = f'SSSP v{version} {functional} {protocol} installed with aiida-pseudo v{__version__}'

if configuration not in SsspFamily.valid_configurations:
echo.echo_critical(f'{version} {functional} {protocol} is not a valid SSSP configuration')
Expand All @@ -199,8 +218,9 @@ def cmd_install_sssp(version, functional, protocol, download_only, traceback):
filepath_archive = dirpath / 'archive.tar.gz'
filepath_metadata = dirpath / 'metadata.json'

download_sssp(configuration, filepath_archive, filepath_metadata, traceback)
patch_version = download_sssp(configuration, filepath_archive, filepath_metadata, traceback)

description = f'SSSP v{patch_version} {functional} {protocol} installed with aiida-pseudo v{__version__}'
description += f'\nArchive pseudos md5: {md5_file(filepath_archive)}'
description += f'\nPseudo metadata md5: {md5_file(filepath_metadata)}'

Expand Down
14 changes: 11 additions & 3 deletions aiida_pseudo/groups/family/sssp.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# -*- coding: utf-8 -*-
"""Subclass of ``PseudoPotentialFamily`` designed to represent an SSSP configuration."""
from collections import namedtuple
from typing import Sequence
from typing import Sequence, Optional

from aiida_pseudo.data.pseudo import UpfData
from ..mixins import RecommendedCutoffMixin
Expand Down Expand Up @@ -50,15 +50,23 @@ def format_configuration_label(cls, configuration: SsspConfiguration) -> str:
)

@classmethod
def format_configuration_filename(cls, configuration: SsspConfiguration, extension: str) -> str:
def format_configuration_filename(
cls, configuration: SsspConfiguration, extension: str, patch_version: Optional[str] = None
mbercx marked this conversation as resolved.
Show resolved Hide resolved
) -> str:
"""Format the filename for a file of a particular SSSP configuration as it is available from MC Archive.

:param configuration: the SSSP configuration.
:param extension: the filename extension without the leading dot.
:param patch_version: patch version of the files which overrides the ``version`` specified in the
``configuration``. This is necessary because we only let users specify the minor version, not install
configurations with a specific patch version. The filename on the archive however will contain the patch
version, so this needs to be substituted.
:return: filename
"""
version = configuration.version if patch_version is None else patch_version

return cls.filename_template.format(
version=configuration.version, functional=configuration.functional, protocol=configuration.protocol
version=version, functional=configuration.functional, protocol=configuration.protocol
) + f'.{extension}'

def __init__(self, label=None, **kwargs):
Expand Down
6 changes: 3 additions & 3 deletions tests/cli/test_install.py
Original file line number Diff line number Diff line change
Expand Up @@ -182,9 +182,9 @@ def test_install_sssp(run_cli_command):

family = QueryBuilder().append(SsspFamily).one()[0]
assert family.get_cutoffs is not None
assert f'SSSP v1.1 PBE efficiency installed with aiida-pseudo v{__version__}' in family.description
assert 'Archive pseudos md5: 4803ce9fd1d84c777f87173cd4a2de33' in family.description
assert 'Pseudo metadata md5: 0d5d6c2c840383c7c4fc3a99b5dc3001' in family.description
assert f'PBE efficiency installed with aiida-pseudo v{__version__}' in family.description
assert 'Archive pseudos md5: ' in family.description
assert 'Pseudo metadata md5: ' in family.description

result = run_cli_command(cmd_install_sssp, raises=SystemExit)
assert 'is already installed' in result.output
Expand Down