Release candidate v0.1.17rc1 #663
Merged
Merge back master in develop after release
Why: During conflict resolution, the user's script configuration file is parsed for any marked resolution. This parsing does not properly handle values that are not strings, which causes Oríon to crash. How: Values should only be handled if they are strings containing '~', as this is the only way to mark a resolution. All other values can be safely ignored.
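A minimal sketch of the filtering described above, assuming the script configuration is already loaded as a nested dict. The helper name `marked_resolutions` and the sample values are hypothetical and are not taken from Oríon's code base.

```python
def marked_resolutions(config, prefix=""):
    """Yield (dotted_key, value) for every string value that contains '~'.

    Hypothetical helper: non-string values (ints, floats, lists, None, ...)
    are skipped instead of being searched for the marker.
    """
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections of the configuration.
            yield from marked_resolutions(value, prefix=f"{name}.")
        elif isinstance(value, str) and "~" in value:
            yield name, value


# Illustrative configuration mixing strings with other types.
config = {
    "lr": "some~marker",
    "epochs": 10,
    "layers": [32, 64],
    "optimizer": {"momentum": 0.9, "name": "other~marker", "nesterov": None},
}
found = dict(marked_resolutions(config))
```

Checking `isinstance(value, str)` before testing for `'~'` is what keeps integers, lists and `None` from raising during the scan.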
Why: Starting with v0.1.16 the EVC is disabled by default. It must be activated for the tests starting from v0.1.16.
Handle properly all types in config during branch
Why: Children should not be listed multiple times. If they appear in the EVC tree of a parent experiment, they should not also be listed as roots. How: The problem was that all experiments matching the name were fetched and printed with their trees. When a name is specified, the version queried should default to 1 unless the user specifies one.
Why: If the HEAD of the repo is in an invalid state (no branch, no commit), gitpython will crash when attempting to fetch the information required for the EVC. How: First check whether the HEAD state is valid; if it is not, raise a warning and ignore the repo.
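The guard described above can be sketched as follows. The function name `fetch_vcs_metadata` and the returned keys are hypothetical. The sketch assumes a GitPython-like `repo` object exposing `repo.head.is_valid()`, `repo.head.commit.hexsha` and `repo.is_dirty()`; simple stand-in objects are used here so the example runs without a real repository.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def fetch_vcs_metadata(repo):
    """Return VCS metadata for `repo`, or None when HEAD is invalid."""
    if not repo.head.is_valid():
        # No branch or no commit: warn and ignore the repository
        # instead of crashing while reading the commit information.
        logger.warning("HEAD of the repository is in an invalid state; ignoring it.")
        return None
    return {"HEAD_sha": repo.head.commit.hexsha, "is_dirty": repo.is_dirty()}


# Stand-ins mimicking the attributes used above (assumption, not real repos).
broken = SimpleNamespace(head=SimpleNamespace(is_valid=lambda: False))
healthy = SimpleNamespace(
    head=SimpleNamespace(is_valid=lambda: True, commit=SimpleNamespace(hexsha="abc123")),
    is_dirty=lambda: False,
)

broken_metadata = fetch_vcs_metadata(broken)
healthy_metadata = fetch_vcs_metadata(healthy)
```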
List EVC tree once only when querying with name
Handle repo with invalid HEAD state
Support to configure executor for benchmark
Why: The GMM for real values could not sample efficiently in narrow spaces. It would often lead to a RuntimeError because the number of attempts allowed would be exhausted. We could increase the default number of attempts allowed, but that would increase the computational cost for any space, even those that are easy to sample. How: Use a numpy array instead of manipulating a list, which is more efficient. Also, increase the number of attempts as needed, up to a maximum number of attempts. This way easy samples do not take more time, while difficult ones are allowed more attempts.
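A rough sketch of the idea, not the actual TPE implementation: draw candidates as one numpy array, keep those inside the bounds, and grow the batch only when none are valid. The function name, the growth factor and the default limits are assumptions chosen for illustration, and a single normal distribution stands in for the mixture.

```python
import numpy as np


def sample_in_bounds(rng, mu, sigma, low, high, attempts=10, max_attempts=10000, factor=10):
    """Rejection-sample one normal draw inside [low, high].

    Starts with a small batch and multiplies its size by `factor` after each
    failed batch, so easy intervals stay cheap and narrow ones get more tries.
    """
    while attempts <= max_attempts:
        draws = rng.normal(mu, sigma, size=attempts)  # one vectorized batch
        valid = draws[(draws >= low) & (draws <= high)]
        if valid.size > 0:
            return float(valid[0])
        attempts *= factor
    raise RuntimeError(f"Failed to sample in interval [{low}, {high}]")


rng = np.random.default_rng(0)
# Wide interval: almost always succeeds with the first small batch.
easy = sample_in_bounds(rng, mu=0.0, sigma=1.0, low=-3.0, high=3.0)
# Narrow interval: usually needs one or more larger batches.
narrow = sample_in_bounds(rng, mu=0.0, sigma=1.0, low=1.0, high=1.05)
```

The mask-based filtering replaces appending to a Python list one sample at a time, which is where the efficiency gain mentioned in the commit would come from.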
Fix TPE sampling in narrow spaces
Add missing documentation for EVC enable option
Why: When trying to branch from an experiment that already has a child with the same name, Oríon will crash with a RaceCondition error. The problem is that this issue and a real race condition are indistinguishable, as they lead to the same state. The only thing we can do is clarify the error message and warn that this error can also be caused by branching from a parent experiment that already has a child with the same name. Note: This issue only arises if the user specifies the version of the parent experiment; otherwise the EVC will use the child or branch from the child without any issue.
Why: The shape of categorical dimensions was not included in `get_prior_string`, causing the loss of the shape during branching. How: Add the shape to `get_prior_string` of Categorical dimensions and add tests to catch this issue. Also add tests for Conflicts and Resolutions of priors with different shapes. The adaptor for a change of prior no longer raises an error when shapes are different and logs a warning instead. The trials are all ignored in this case.
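To illustrate why the shape has to be part of the prior string, here is a simplified stand-in for a categorical `get_prior_string`. The function name `categorical_prior_string` and the exact output format are assumptions for the sketch; Oríon's real method may format its string differently.

```python
def categorical_prior_string(categories, shape=None):
    """Build a prior string for a categorical dimension (illustrative format)."""
    args = [repr(list(categories))]
    if shape:
        # Without this argument, two dimensions that differ only in shape
        # would serialize to the same string and the shape would be lost.
        args.append(f"shape={shape}")
    return f"choices({', '.join(args)})"


scalar_prior = categorical_prior_string(["a", "b"])
shaped_prior = categorical_prior_string(["a", "b"], shape=(2,))
```

Because the two strings differ, a comparison of prior strings can now detect that the shape changed between the parent and the branched experiment.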
Why: The option `user_script_config` is part of the worker configuration, not the EVC. As such, this option is not part of the `branching` group of options and does not find its way to the Conflict objects of the EVC. The value of `user_script_config` is already available inside the experiment configuration anyway, so there is no need to pass it. Furthermore, it may differ between past experiments and the new one, and each experiment should be handled based on its own `user_script_config`, not the new one. For these reasons, it is better to use the information available in the experiment configurations.
Use user_script_config from parser in EVC
…_dup Clear error message for dup branching error
Fix shape argument in categorical dims
Why: A log integer dimension would be cast to real, linearized, then cast back to integer. This dramatically reduces the number of possible values that can be sampled, since exp(int(log(x))) yields the same integer for many different values of x. An algorithm that needs linearization should be able to handle a real space or otherwise state a requirement for integers. This should be handled separately, and quantization of linearized log integers should not be applied by default. Note: Because Quantize uses floor instead of rounding, the values of int(exp(log(x))) would still clash for close values of x. Using rounding instead solves the issue. Rounding may be problematic, however, for algorithms that require integer type, as rounding may push values out of bounds. For now there are no foreseeable algorithms that may require integer type, so I avoid fixing the issue and leave it for later if the need ever arises (which I highly doubt).
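The collapse described above is easy to check numerically. The snippet below is plain arithmetic with the standard library, not Oríon's transformer code; the range 1 to 1000 is an arbitrary choice for illustration.

```python
import math

values = range(1, 1001)

# Quantizing in log space: int(log(x)) only takes a handful of values,
# so mapping back with exp() leaves very few distinct integers.
quantized = {round(math.exp(int(math.log(x)))) for x in values}

# Without quantization in log space, the round trip exp(log(x)) is only
# off by floating-point error. Flooring can turn 2.9999... into 2,
# while rounding always recovers the original integer on this range.
floored = [int(math.exp(math.log(x))) for x in values]
rounded = [round(math.exp(math.log(x))) for x in values]

n_quantized = len(quantized)
n_floored = len(set(floored))
n_rounded = len(set(rounded))
```

On this range `n_quantized` is 7 while `n_rounded` is 1000. `n_floored` can fall short of 1000 whenever the round trip lands just below an integer.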
Support linearize log integer properly
Why: The new default behavior is confusing for users. It is also difficult to determine a good default max_trials, so having too few or too many trials sampled by default at the start of HPO can be annoying for many users. Using inf by default and iterating with pool-size may be the best alternative. Now that we have support for n-workers, the pool-size argument we previously deprecated actually makes sense. By default, pool-size should be equal to the number of workers. We have n-workers set to 1 by default, so by default we are back to the previous behavior: sampling 1 trial at a time, until max_trials. How: The producer now takes a pool size as an argument when producing. The same applies to ExperimentClient.suggest() and ExperimentClient.workon(). The pool size is used to sample multiple trials at a time and increase I/O efficiency. The producer now keeps track of the number of new trials, so that if multiple workers are producing new trials with a non-seeded algorithm (hence they produce different trials and there are no conflicts leading to backoff), they will stop once they have generated up to `pool_size` trials together. Note: Pool-size is moved to the worker configuration instead. Since pool-size relates to n-workers, which is part of the worker configuration, having pool-size in the worker configuration makes more sense.
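As a simplified, single-worker illustration of how pool size and max_trials interact, the generator below yields how many trials would be sampled at each production round. The name `production_rounds` is hypothetical, and the sketch ignores concurrent workers, conflicts and backoff, which the real producer has to handle.

```python
import itertools
import math


def production_rounds(pool_size=1, max_trials=math.inf):
    """Yield the number of trials sampled at each round until max_trials."""
    total = 0
    while total < max_trials:
        # Never sample more than what is left before reaching max_trials.
        batch = min(pool_size, max_trials - total)
        total += batch
        yield batch


# Defaults (pool size 1, infinite max_trials): one trial at a time, forever.
default_rounds = list(itertools.islice(production_rounds(), 3))

# Pool size of 4 with max_trials=10: two full pools, then the remainder.
pooled_rounds = list(production_rounds(pool_size=4, max_trials=10))
```

With these inputs `default_rounds` is `[1, 1, 1]` and `pooled_rounds` is `[4, 4, 2]`, which matches the "back to previous behavior by default" description in the commit message.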
Why: There was a bug in the tests. The functions that generate trials would generate more than requested because of the producer's new behavior of attempting to produce all trials at once, since the value of `max_trials` conflicted with the number of trials requested from the trial-generating function for tests (`orion.testing.evc.generate_trials`). Fortunately, the bug in the tests did not seem to hide any bugs in the code they were testing. How: Adjust the expected numbers based on the corrected behavior. The numbers do make more sense now.
Signed-off-by: Fabrice Normandin <[email protected]>
Co-authored-by: Lin Dong <[email protected]>
… into feature/back_to_pool_size
Revert to default inf max_trials and pool-size of 1
Fix grouping of plots in legend
🏗 Enhancements
🐛 Bug Fixes
📜 Documentation