Risk factor refactor #195
The list of diseases can be long (30+). The current validation lists all the invalid disease keys, if any, before exiting. The exception alternative will break at the first invalid key, so the user will need to keep trying until all are correct. In the end, both methods accomplish the required validation; it is just a matter of style. Mapping the config to the back-end in a case-sensitive world is never user-friendly; the identifier type tries to minimize the pain by making all keys lowercase.

Case-sensitivity: fair point. In which case, the task then becomes removing the parts of the codebase that are case-sensitive. I've spotted a few. Diseases: agree, a style choice. My concern in these cases has always been that warning and continuing on dodgy input makes the problem slightly less obvious to diagnose when it fails later on in the program than if it were to fail immediately on loading. Perhaps this can be replaced with a function which reads all diseases and detects missing ones with printed debug info, before throwing an exception for any missing inputs, as opposed to a case-by-case exception.

I'm with @israel-vieira on this - collecting all that is invalid and providing a full list to the user (with a meaningful indication of what is wrong) results in a better user experience than having to try again and again until all issues are sorted. We have another project where this is the approach and it is indeed much better.
All in all, having strong validation and input pre-processing steps where things can fail early and inconsistencies can be fixed (casing, proxies, etc.) is a good thing that will result in simpler, easier to understand (and diagnose) code.

Personally, I don't consider user input validation an exceptional circumstance. The validation should focus on helping users fix the wrong or missing information provided; perhaps the system should show valid options before exiting gracefully. In general, users don't read or understand stack traces with a message hidden inside a large block of text; this kind of information is more relevant for developers.

I think there are two issues here:
My view is:
The problem with at least some of the config loading code atm is that if validation fails, it doesn't abort immediately, but it also doesn't abort after the validation step either. So while you may get a warning on the console, the program will blithely plough on using default values, which is not good (see #161). I'm a big believer in using exceptions to propagate errors in C++; it can make code much cleaner and safer. I'll give an example of how you might do "lazy failure" when loading disease info using exceptions:

    std::vector<core::DiseaseInfo> get_diseases_info(core::Datastore &data_api, Configuration &config) {
        const auto diseases = data_api.get_diseases();
        fmt::print("\nThere are {} diseases in storage, {} selected.\n", diseases.size(),
                   config.diseases.size());

        std::vector<core::DiseaseInfo> result;
        std::set<std::string> missing_diseases;

        // Try every requested disease, recording failures instead of stopping at the first one
        for (const auto &code : config.diseases) {
            try {
                result.emplace_back(data_api.get_disease_info(code));
            } catch (const std::invalid_argument &) {
                missing_diseases.emplace(code);
            }
        }

        // Report all missing diseases together, then abort loading
        if (!missing_diseases.empty()) {
            std::cerr << "ERROR: The following diseases were not found:\n";
            for (const auto &disease : missing_diseases) {
                std::cerr << "\t" << disease << "\n";
            }
            throw std::runtime_error{"Some invalid diseases were given"};
        }

        return result;
    }

(There might be things we could tweak -- e.g. I haven't used the [...]
This code assumes that the caller will appropriately handle any thrown [...]
Note that this code is also safer, because a [...]

Agree with this ^. The user gets helpful, human-readable messages noting which keys were missing, and execution stops before there's a chance of accidental continuation on default values. I'm happy to merge, and happy to leave it with you if you already have the idea in mind.
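To make the points raised in this thread concrete, here is a minimal, self-contained sketch of the "collect everything, then fail once" approach, combined with lowercase key normalisation and a top-level handler that prints a short message rather than a stack trace. The names `to_lower`, `validate_keys` and `run` are hypothetical and not part of the codebase; the sketch also assumes the set of known keys is already stored in lowercase.

```cpp
#include <algorithm>
#include <cctype>
#include <iostream>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: lowercase a key so that lookups are case-insensitive.
std::string to_lower(std::string key) {
    std::transform(key.begin(), key.end(), key.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return key;
}

// Hypothetical validator: checks every requested key against the known set
// (assumed to be lowercase), collects all failures, and throws once with a
// message that lists every invalid key plus the valid options.
std::vector<std::string> validate_keys(const std::vector<std::string> &requested,
                                       const std::set<std::string> &known) {
    std::vector<std::string> valid;
    std::set<std::string> invalid;
    for (const auto &key : requested) {
        const auto normalised = to_lower(key);
        if (known.count(normalised) != 0) {
            valid.push_back(normalised);
        } else {
            invalid.insert(key); // keep the user's spelling for the report
        }
    }

    if (!invalid.empty()) {
        std::string message = "Invalid disease keys:";
        for (const auto &key : invalid) {
            message += " " + key;
        }
        message += "; valid options:";
        for (const auto &key : known) {
            message += " " + key;
        }
        throw std::runtime_error{message};
    }
    return valid;
}

// Hypothetical top-level handler: reports the error in plain text and returns
// a non-zero exit code, so the program cannot continue on default values.
int run(const std::vector<std::string> &requested, const std::set<std::string> &known) {
    try {
        const auto keys = validate_keys(requested, known);
        std::cout << keys.size() << " diseases selected\n";
        return 0;
    } catch (const std::exception &e) {
        std::cerr << "ERROR: " << e.what() << "\n";
        return 1;
    }
}
```

A `main` function could simply return `run(requested, known)`. The design choice here is that the validator itself never prints anything; it only builds the full message, which leaves the caller free to decide how to present it.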
This file is autogenerated from a folder with 50+ files and the country data file. I'm not sure we gain much by hand-editing it; changing the script that generates this file would perhaps be more beneficial. Alternatively, you could remove this section from the config file and read from here instead.
Agreed, the respective healthgps-tools script should also be updated. I'm considering a Python rewrite of the scripts, to sidestep the Windows R binding dependency.
Good point @israel-vieira.
Though we do want the config files in this repo to be up to date and working with the current version of the code for testing purposes. Whether we edit it directly or just the script for generating it isn't so important I think, as long as it's correct.
I've made an issue for later.