Read zipped input data #391

Merged
merged 5 commits into from
May 30, 2024
Conversation

alexdewar
Contributor

@alexdewar alexdewar commented May 29, 2024

This PR adds the option to provide the input data as a path to a zip file rather than to a directory, e.g.:

HealthGPS.Console -f example_new/Config.json -s data_file.zip

Note that the index.json file must be in the root folder of the zip file; otherwise it won't work.

While this feature isn't particularly useful by itself, it's a necessary step towards being able to automatically download input data from a URL (#364).

I haven't added tests yet because the input files are likely to be moved out of this repo soon (#366), so I'll wait until we know what the final project structure looks like first. I didn't want to commit a zip file to the repo either if its contents are likely to change soon.

I've also broken out the schema-related code into its own .cpp file as datamanager.cpp was getting a bit bloated.

Closes #363.

Contributor

clang-tidy review says "All clean, LGTM! 👍"


@alexdewar alexdewar force-pushed the read_zipped_input_data branch from 20a246f to c8dd65a Compare May 29, 2024 12:56
Contributor

clang-tidy review says "All clean, LGTM! 👍"


@alexdewar alexdewar marked this pull request as ready for review May 29, 2024 13:30
Contributor

@jamesturner246 jamesturner246 left a comment

It looks good. Just one question about how well temporary directory creation behaves on the HPC. We should consider whether we want to be a bit careful here, or else do some testing.
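
To make the concern concrete, here is a minimal sketch of creating a unique temporary directory with the standard library. It is an assumption about the general approach, not the PR's implementation, and `make_unique_temp_dir` is a hypothetical helper. On POSIX systems `std::filesystem::temp_directory_path()` consults environment variables such as `TMPDIR`, which HPC schedulers often point at node-local scratch space, so the resulting location can differ between a login node and a compute node.

```cpp
#include <filesystem>
#include <random>
#include <stdexcept>
#include <string>

// Hypothetical helper: create a fresh directory under the system temp location.
std::filesystem::path make_unique_temp_dir(const std::string &prefix) {
    const auto base = std::filesystem::temp_directory_path();
    std::random_device rd;
    std::mt19937_64 rng{rd()};

    for (int attempt = 0; attempt < 100; ++attempt) {
        const auto candidate = base / (prefix + std::to_string(rng()));
        // create_directory returns true only if this call created the directory,
        // so a name collision with an existing directory simply triggers a retry.
        if (std::filesystem::create_directory(candidate)) {
            return candidate;
        }
    }
    throw std::runtime_error{"Could not create a unique temporary directory"};
}
```

Any other failure, such as a read-only or missing temp location, surfaces as a `std::filesystem::filesystem_error` from `create_directory` or `temp_directory_path`, which is the kind of behaviour worth testing on the cluster.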

src/HealthGPS.Datastore/zip_file.cpp (resolved)
@alexdewar alexdewar force-pushed the read_zipped_input_data branch from c8dd65a to fb9e4e2 Compare May 30, 2024 11:22
@alexdewar alexdewar force-pushed the read_zipped_input_data branch from fb9e4e2 to c836cbe Compare May 30, 2024 11:28
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Contributor

@dalonsoa dalonsoa left a comment

I have a few suggestions, mainly related to error handling and explicitness. Maybe I'm in a Pythonic mindset, but I thought they were worth highlighting.

src/HealthGPS.Datastore/CMakeLists.txt (resolved)
src/HealthGPS.Datastore/datamanager.cpp (resolved)
src/HealthGPS.Datastore/datamanager.cpp (resolved)
src/HealthGPS.Datastore/datamanager.cpp (outdated, resolved)
Comment on lines +40 to +49
        if (!std::filesystem::create_directories(out_path)) {
            throw std::runtime_error{
                fmt::format("Failed to create directory: {}", out_path.string())};
        }
    } else {
        std::ofstream ofs{out_path};
        if (!ofs) {
            throw std::runtime_error{
                fmt::format("Failed to create file: {}", out_path.string())};
        }
Contributor

Would it be useful to attach the original error to get an idea of why these fail, so the user can investigate?

Contributor Author

The only reason it'll fail here is because of an IO error and C++ won't give any information beyond that, unfortunately.
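
For reference, the standard library can report somewhat more than a bare failure in both cases. The sketch below is not the PR's code: `create_dir_or_throw` and `create_file_or_throw` are hypothetical helpers. It uses the `std::error_code` overload of `std::filesystem::create_directories`, and reads `errno` after a failed `std::ofstream` open, which common implementations set but the C++ standard does not guarantee.

```cpp
#include <cerrno>
#include <cstring>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical helper: create a directory tree, reporting the OS error on failure.
void create_dir_or_throw(const std::filesystem::path &out_path) {
    std::error_code ec;
    std::filesystem::create_directories(out_path, ec);
    // Check ec rather than the return value: the function also returns false
    // when the directory already exists, which is not an error.
    if (ec) {
        throw std::runtime_error{"Failed to create directory: " + out_path.string() +
                                 ": " + ec.message()};
    }
}

// Hypothetical helper: open a file for writing, attaching errno text if available.
std::ofstream create_file_or_throw(const std::filesystem::path &out_path) {
    errno = 0;
    std::ofstream ofs{out_path};
    if (!ofs) {
        // Assumption: the implementation leaves errno set by the underlying open call.
        const std::string reason =
            errno != 0 ? std::strerror(errno) : "unknown I/O error";
        throw std::runtime_error{"Failed to create file: " + out_path.string() +
                                 ": " + reason};
    }
    return ofs;
}
```

Checking `ec` instead of the boolean result also avoids treating an already-existing directory as a failure, since `create_directories` returns `false` in that case without setting an error.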

@alexdewar alexdewar requested a review from dalonsoa May 30, 2024 12:08
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@alexdewar alexdewar merged commit c1a6316 into main May 30, 2024
4 checks passed
@alexdewar alexdewar deleted the read_zipped_input_data branch May 30, 2024 13:10
Development

Successfully merging this pull request may close these issues.

Allow for reading data (and configs?) from zip file
3 participants