Download check for cache directory #559

Open
wants to merge 1 commit into
base: master

gerhardol
Contributor

Add a .download marker file to validate the contents of cache directories. Previously only the existence of the directory was checked, so if a download was aborted, the cache directory had to be deleted manually (likely after a cryptic error message). If the .download marker file does not exist, the directory is deleted and downloaded again.
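
The marker-file idea can be sketched outside CMake as follows. This is a minimal Python illustration under stated assumptions, not CPM's implementation: the function `ensure_cached` and the `download` callback are hypothetical names, and only the `.download` marker name comes from this PR.

```python
import shutil
from pathlib import Path
from typing import Callable

MARKER_NAME = ".download"  # marker name taken from the PR description


def ensure_cached(cache_dir: Path, download: Callable[[Path], None]) -> bool:
    """Return True if the cache was reused, False if it was (re)downloaded.

    `download` is a hypothetical callback that populates `cache_dir`
    and raises an exception on failure.
    """
    marker = cache_dir / MARKER_NAME
    if marker.is_file():
        return True  # a previous download ran to completion

    # The directory exists without a marker: treat it as an aborted download.
    if cache_dir.exists():
        shutil.rmtree(cache_dir)

    cache_dir.mkdir(parents=True)
    download(cache_dir)  # if this raises, no marker is written
    marker.touch()       # written last, only after success
    return False
```

Writing the marker as the very last step is what makes the check meaningful: a directory without it is either incomplete or predates the marker scheme, and both cases are handled by downloading again.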

It is also possible to check the contents with a checksum. If the checksum does not match, the directory will be deleted and downloaded again.
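
As a rough illustration of a directory checksum, here is a minimal Python sketch. It is not the algorithm used by this PR or by test/unit/checksum_directory.sh: the function name `directory_checksum`, the choice of SHA-256, and the decision to skip the marker file are all assumptions of this sketch.

```python
import hashlib
from pathlib import Path


def directory_checksum(root: Path, skip: frozenset = frozenset({".download"})) -> str:
    """Hash relative paths and file contents in a stable order (SHA-256).

    Skipping the marker file is an assumption of this sketch, so the
    result is the same before and after the marker is written.
    """
    digest = hashlib.sha256()
    # Sort by POSIX-style relative path so the order does not depend on the OS.
    files = sorted(
        (p.relative_to(root).as_posix(), p) for p in root.rglob("*") if p.is_file()
    )
    for rel, path in files:
        if rel in skip:
            continue
        digest.update(rel.encode("utf-8"))
        digest.update(b"\0")  # separator between the name and the content
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

A caller would compare the result against an expected value and delete the directory on mismatch. Reading every file is the reason such a check is costly on large caches.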

For Git repositories, the directory can be deleted if the status is not clean; a checksum is not relevant there (but one is used in the tests).

Note: The important part of this PR is the .download file, which will cover the majority of the issues.

@kyle-verdant

Does this handle submodules well? The failures I most often saw were not that the main directory was empty, but that some required submodule didn't exist - most often with aws-sdk-cpp (don't use aws-sdk-cpp with CPM, it's too big).

@gerhardol
Contributor Author

> Does this handle submodules well? The failures I most often saw were not that the main directory was empty, but that some required submodule didn't exist - most often with aws-sdk-cpp (don't use aws-sdk-cpp with CPM, it's too big).

This currently changes nothing if the download is aborted. CMake's FetchContent retries a few times, then the job fails. If CMake reports the failure correctly, the marker file is not created.
I do not know whether that is really an issue; I do not have good statistics on why the core problem occurs.

The change concerns the cache and the reuse of downloaded data. If there is no marker file, the directory is considered incomplete and is deleted and downloaded again.
I assume CMake detects failures correctly; this should be the core change.

If a checksum is used, failed checksums can be handled as well.
With Git you do not need a checksum. If you set CPM_CHECK_CACHE_CHECKSUM, the directory is removed if git status fails.
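
The Git variant of the check might look like the following Python sketch. It is an illustration only, not how CPM implements it: the function names are hypothetical, and the use of `git status --porcelain` (which prints nothing for a clean work tree) is an assumption about what "clean" means here.

```python
import subprocess
from pathlib import Path


def is_clean(porcelain_output: str) -> bool:
    """`git status --porcelain` prints one line per change; empty means clean."""
    return porcelain_output.strip() == ""


def git_repo_is_clean(repo: Path) -> bool:
    """Return False if the work tree is dirty or if `git status` itself fails."""
    try:
        result = subprocess.run(
            ["git", "status", "--porcelain"],
            cwd=repo,
            capture_output=True,
            text=True,
            check=True,  # a non-zero exit code raises CalledProcessError
        )
    except (OSError, subprocess.CalledProcessError):
        # Missing directory, missing git executable, or not a repository.
        return False
    return is_clean(result.stdout)
```

A cache directory for which `git_repo_is_clean` returns False would then be removed and cloned again.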

@ScottBailey
Contributor

@gerhardol This looks like something that was missing from CPM! Maybe it's jammed up because it fails the tests? IDK. You can fix that pretty easy, the instructions are in the last few lines of CONTRIBUTING.md.

LMK when you get that done and I'll try to get a review in (but I can only really do a big review on the weekend, I think). Then we can ping a maintainer to get this merged.

@gerhardol
Contributor Author

> @gerhardol This looks like something that was missing from CPM! Maybe it's jammed up because it fails the tests? IDK. You can fix that pretty easy, the instructions are in the last few lines of CONTRIBUTING.md.
>
> LMK when you get that done and I'll try to get a review in (but I can only really do a big review on the weekend, I think). Then we can ping a maintainer to get this merged.

When I apply the style, it also changes many lines that I have not touched.
I pushed a tmp branch for now; I am on vacation and cannot investigate this further right now.

@ScottBailey
Contributor

> When I apply the style, it also changes many lines that I have not touched. I pushed a tmp branch for now; I am on vacation and cannot investigate this further right now.

Enjoy your vacation! This can wait. :-)

@gerhardol force-pushed the feature/download-check branch 4 times, most recently from 1539d3e to 7d7113f (September 9, 2024 17:00)
@gerhardol
Contributor Author

The tests fail because the example script test/unit/checksum_directory.sh is used in the tests, and macOS takes different arguments.
macOS can be fixed, but Windows is harder; I would like to skip the checksum tests there.
(The checksum is not essential.)

@gerhardol force-pushed the feature/download-check branch 3 times, most recently from bc31cc5 to b641804 (September 10, 2024 07:27)
@gerhardol force-pushed the feature/download-check branch from b641804 to 8fceb72 (September 10, 2024 07:31)
@gerhardol
Contributor Author

macOS is handled. A Windows checksum example is not provided: it depends on whether MSYS, Cygwin, or some native program is used, so it is up to Windows users to provide such an example.
Calculating the checksum natively in CMake is too slow, and I expect users who really want the checksum to tweak the algorithm anyway.
The important part of this change is the .download marker file, so that an aborted download is not treated as a valid cache entry.

@christopherbate

Can this be merged?

@gerhardol
Contributor Author

> Can this be merged?

I cannot do much myself. But reviews and usage stories make maintainers more inclined to merge and more comfortable doing so.


4 participants