👌 IMPROVE: links to materials cloud SSSP archive #104

Merged
merged 10 commits into from
Aug 13, 2021

Conversation

Member

@mbercx mbercx commented Aug 11, 2021

Fixes #13

The current links used to download the SSSP pseudopotentials still point
to the legacy Materials Cloud archive, which is hosted on a server that
they want to shut down. Next to this, the archive now also contains new
patch versions for minor version 1.1, with 1.1.2 being the latest. Since
these patch versions usually implement bug fixes, there is no reason to
install an outdated patch version. So instead of allowing the user to
install specific patch versions, we'll simply stick to only using minor
versions and make sure that the latest patch version is installed.

Here we update the links to the new servers. However, we also add some
extra functionality to retrieve a mapping of the latest patch version
corresponding to the minor version of the configuration. Based on this
mapping, the correct url is constructed for the archive and metadata.

@mbercx mbercx added this to the v0.6.2 milestone Aug 11, 2021
@mbercx mbercx requested a review from sphuber August 11, 2021 13:07
Member Author

mbercx commented Aug 11, 2021

@sphuber the links in this PR are still for the staging version of the archive, since we wanted to test this before @vgranata puts this in production.

If you agree with the implementation, I'll ask @vgranata to move forward, then update the final links and fix the tests.

@mbercx mbercx force-pushed the fix/13/materialscloud-links branch from 91ce0b3 to a0dc52c Compare August 11, 2021 13:11
Contributor

@sphuber sphuber left a comment

Thanks @mbercx. I have some suggestions and questions.

aiida_pseudo/cli/install.py (4 resolved threads, outdated)
aiida_pseudo/groups/family/sssp.py (outdated)
@@ -50,15 +53,17 @@ def format_configuration_label(cls, configuration: SsspConfiguration) -> str:
         )
 
     @classmethod
-    def format_configuration_filename(cls, configuration: SsspConfiguration, extension: str) -> str:
+    def format_configuration_filename(cls, configuration: SsspConfiguration, extension: str, patch_version: Optional[str] = None) -> str:
Contributor

Not sure if it is a good idea to keep the patch version as a separate argument. Shouldn't we try to integrate this in the SsspConfiguration? It already contains a version attribute. If that is 1.2.3, then the patch version is 3 by definition. So having a separate argument that can cause inconsistencies seems weird.

Member Author

The reason I did this is because I figured we'd stick to only having the minor versions as valid_configurations. Else we still have to update the code in case a new patch version is released. So really all we need is to make sure the filename is updated to the correct patch version, not the actual SsspConfiguration.
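
The trade-off under discussion can be sketched as follows. This is an illustration only: the `Configuration` tuple and the `SSSP_<version>_<functional>_<protocol>.<extension>` filename pattern are assumptions made for the example, not the plugin's actual code.

```python
from typing import NamedTuple, Optional


class Configuration(NamedTuple):
    """Hypothetical stand-in for ``SsspConfiguration``; it stores only the minor version."""

    version: str  # minor version, e.g. '1.1'
    functional: str
    protocol: str


def format_filename(configuration: Configuration, extension: str, patch_version: Optional[str] = None) -> str:
    """Build a filename, using ``patch_version`` in place of the minor version when it is given."""
    version = patch_version or configuration.version
    return f'SSSP_{version}_{configuration.functional}_{configuration.protocol}.{extension}'


config = Configuration('1.1', 'PBE', 'efficiency')
print(format_filename(config, 'tar.gz'))                       # SSSP_1.1_PBE_efficiency.tar.gz
print(format_filename(config, 'json', patch_version='1.1.2'))  # SSSP_1.1.2_PBE_efficiency.json
```

With this shape, the set of valid configurations stays keyed on minor versions, so a new patch release needs no code change. The cost is the one raised in the review: nothing prevents a caller from passing a `patch_version` that does not belong to the configuration's minor version.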

Contributor

> Thanks for the review, @sphuber! I made some first corrections, but have a look at this comment and see if you agree with my reasoning here before I continue.

I can remember the discussion about not fixing the patch number explicitly, as that would require updates of the plugin whenever a new patch of the family is released, but I am having difficulty seeing how best to integrate this limitation. Can you share the content of the YAML mapping? Then I can form a better idea of what to do.

Member Author

Sure! It's a very simple mapping of the minor version to the latest patch version:

---
# This maps each minor version to the latest patch version
# corresponding to that minor version
'1.0': '1.0'
'1.1': '1.1.2'
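
A minimal sketch of how such a mapping could be used, assuming it has already been parsed into a plain dictionary. The name `resolve_patch_version` is hypothetical and not taken from the PR.

```python
# Hypothetical sketch: the mapping above as it would look once parsed into a dict.
LATEST_PATCH_VERSIONS = {
    '1.0': '1.0',
    '1.1': '1.1.2',
}


def resolve_patch_version(minor_version: str, mapping: dict) -> str:
    """Return the latest patch version for ``minor_version``.

    Raises ``ValueError`` if the minor version is not present in the mapping.
    """
    try:
        return mapping[minor_version]
    except KeyError as exception:
        known = ', '.join(sorted(mapping))
        raise ValueError(f'unknown minor version `{minor_version}`, expected one of: {known}') from exception


print(resolve_patch_version('1.1', LATEST_PATCH_VERSIONS))  # prints 1.1.2
```

Because the lookup happens at install time, publishing a new patch version only requires updating the mapping on the archive side, not the plugin.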

Member Author

mbercx commented Aug 11, 2021

Thanks for the review, @sphuber! I made some first corrections, but have a look at this comment and see if you agree with my reasoning here before I continue.

Member Author

mbercx commented Aug 11, 2021


(aiida-pseudo) mbercx@theospc46:~/envs/aiida-pseudo/code/aiida-pseudo$ git commit -a
Fix double quoted strings................................................Passed
Fix End of Files.........................................................Passed
Fix python encoding pragma...............................................Passed
Mixed line ending........................................................Passed
Trim Trailing Whitespace.................................................Passed
flynt....................................................................Passed
yapf.....................................................................Passed
pylint...................................................................Passed
pydocstyle...............................................................Passed
[fix/13/materialscloud-links 8874770] Apply more corrections from prime ape instance
 2 files changed, 4 insertions(+), 6 deletions(-)

-url_sssp_base = 'https://legacy-archive.materialscloud.org/file/2018.0001/v4/'
-url_archive = f"{url_sssp_base}/{SsspFamily.format_configuration_filename(configuration, 'tar.gz')}"
-url_metadata = f"{url_sssp_base}/{SsspFamily.format_configuration_filename(configuration, 'json')}"
+url_template = 'https://archive.materialscloud.org/record/file?filename={filename}&parent_id=19'
Contributor

Thanks for the changes. Everything looks good to me now, except that I still have a question about this magic parent_id=19. I have been trying to figure out how it relates to the main archive entry. I take it that it refers to the actual archive record: https://archive.materialscloud.org/record/2021.76
As you can see, the URL there uses an ID that seems to be based on the year and some auto-incrementing number. But this ID changes if a new version of the SSSP record is created. The question is whether parent_id=19 is persistent and doesn't change. I.e., when a v8 is released with the next patch version of 1.1, i.e. 1.1.3, will the URL still work?

Member Author

That's a good question indeed! I think it doesn't change, but let's ask @vgranata to be sure.

@vgranata

The parent_id is persistent within versions of the same record, meaning that it does not change if a new version of a record is created.
If you create a new SSSP record that is not a new version of https://archive.materialscloud.org/record/2021.76 then the parent_id will be different.
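
As a rough illustration of why a persistent `parent_id` matters, the sketch below fills in the `url_template` quoted in the diff above. The helper `build_url` and the example filenames are assumptions made for this example.

```python
from urllib.parse import parse_qs, urlsplit

# Template quoted in the diff under review. Per the explanation above, ``parent_id=19``
# stays the same across all versions of the same record.
URL_TEMPLATE = 'https://archive.materialscloud.org/record/file?filename={filename}&parent_id=19'


def build_url(filename: str) -> str:
    """Return the download URL for ``filename`` (hypothetical helper)."""
    return URL_TEMPLATE.format(filename=filename)


# The filenames below are assumed for illustration only.
url_archive = build_url('SSSP_1.1.2_PBE_efficiency.tar.gz')
url_metadata = build_url('SSSP_1.1.2_PBE_efficiency.json')

# Only the ``filename`` query parameter differs; the record identifier is identical.
assert parse_qs(urlsplit(url_archive).query)['parent_id'] == ['19']
assert parse_qs(urlsplit(url_metadata).query)['parent_id'] == ['19']
print(url_archive)
```

If a future patch version is added as a new version of the same record, only the filename passed to the template would need to change.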

Contributor

OK, thanks @vgranata, that should be fine. Then we can almost merge this, @mbercx. The only thing is that we should probably add the patch version to the description of the group. Otherwise, the only way to know which patch version a certain family corresponds to is by looking at the md5 of the downloaded tarball, which is contained in the description. I think it would make sense to store the patch version in the description of the family at the very least. And then the tests should be fixed of course, because they are failing.
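
One possible shape for such a description, shown purely as a sketch: the function name `build_description` and the description layout are hypothetical, not what the plugin stores.

```python
import hashlib


def build_description(label: str, patch_version: str, archive_content: bytes) -> str:
    """Return a family description that records the patch version and the archive checksum."""
    md5 = hashlib.md5(archive_content).hexdigest()
    return f'{label} (patch version: {patch_version})\nArchive md5: {md5}'


# Placeholder bytes stand in for the downloaded tarball.
print(build_description('SSSP v1.1 PBE efficiency', '1.1.2', b'placeholder archive content'))
```

Storing the patch version next to the checksum means the installed version can be read directly, without comparing md5 values against the archive.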

Contributor

sphuber commented Aug 12, 2021

Fixed the CI issues in #105

mbercx and others added 7 commits August 13, 2021 15:48
Co-authored-by: Sebastiaan Huber <[email protected]>
The release `psycopg2-binary==2.9` breaks the unit test manager of
`aiida-core`. This was fixed in v1.6, however, that version no
longer supports Python 3.6, which falls back on `aiida-core==1.5.2` and
that doesn't apply the upper limit on `psycopg2-binary`, so we are
forced to do it here. This limit can be removed once we drop support for
Python 3.6.

Also fixes two minor things brought up by `pylint`.
@mbercx mbercx force-pushed the fix/13/materialscloud-links branch from 83eb4c3 to 665206d Compare August 13, 2021 13:50
Member Author

mbercx commented Aug 13, 2021

@sphuber I've removed the md5 hashes and the version from the test_sssp_install test, since these depend on the patch version. Otherwise, the test would start to fail as soon as the patch version is updated on the archive.
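
A check that stays valid across patch releases could look roughly like the sketch below. The description text and the helper `check_description` are assumptions for illustration; the actual test in the PR simply drops these assertions.

```python
import hashlib
import re


def check_description(description: str, minor_version: str) -> bool:
    """Return True if the description contains an md5 and a version within ``minor_version``."""
    # Accept the minor version itself or any patch version of it, e.g. 1.1 or 1.1.2.
    version = re.search(rf'\b{re.escape(minor_version)}(\.\d+)?\b', description)
    # An md5 hex digest is exactly 32 lowercase hexadecimal characters.
    md5 = re.search(r'\b[0-9a-f]{32}\b', description)
    return version is not None and md5 is not None


# Hypothetical description; the checksum is computed from placeholder bytes.
example = f'SSSP 1.1.2 PBE efficiency\nArchive md5: {hashlib.md5(b"placeholder").hexdigest()}'
assert check_description(example, '1.1')
assert not check_description(example, '1.0')
```

Matching on the pattern rather than on exact values keeps the test independent of which patch version the archive currently serves.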

@@ -104,13 +105,30 @@ def download_sssp(
:param filepath_archive: absolute filepath to write the pseudopotential archive to.
:param filepath_metadata: absolute filepath to write the metadata file to.
:param traceback: boolean, if true, print the traceback when an exception occurs.
:return: Latest patch version of the requested minor version
Contributor

Now we just need to update the return type of this function as well, and then this is good to merge.

Member Author

Righto! Missed that one, thanks

Contributor

@sphuber sphuber Aug 13, 2021

More a sign that we should probably add mypy to the pre-commit at some point. On the one hand I like the added typing, but on the other hand mypy has been giving so much trouble that sometimes it annoys me to no end. :\rock: me :hardplace:

@sphuber sphuber merged commit 03645e9 into aiidateam:master Aug 13, 2021
Contributor

sphuber commented Aug 13, 2021

Thanks @mbercx

@mbercx mbercx deleted the fix/13/materialscloud-links branch August 13, 2021 15:03
Development

Successfully merging this pull request may close these issues.

Update materials cloud archive URL's
3 participants