
Hyperparameter optimization #264

Closed
lars-reimann opened this issue May 4, 2023 · 1 comment · Fixed by #843
Labels: enhancement 💡 New feature or request · released Included in a release

Comments

@lars-reimann
Copy link
Member

lars-reimann commented May 4, 2023

Is your feature request related to a problem?

Finding appropriate values for hyperparameters by hand is tedious. There should be automation to try different combinations of values.

Desired solution

  1. For all hyperparameters of models of type `T`, it should also be possible to pass a `Choice[T]` (see feat: add `Choice` class for possible values of hyperparameter #325). Example:

```python
# Before
class KNearestNeighbors(Classifier):
    def __init__(self, number_of_neighbors: int) -> None:
        ...

# After
class KNearestNeighbors(Classifier):
    def __init__(self, number_of_neighbors: int | Choice[int]) -> None:
        ...

# Usage
KNearestNeighbors(number_of_neighbors=Choice(1, 10, 100))
```
  2. Adjust the getters (Getters for hyperparameters of models #260) accordingly.
  3. When a user tries to call `fit` on a model that contains a `Choice` at any level (choices can be nested), raise an exception that also points to the correct method (see 4. and 5.).
  4. Add a new method `fit_by_exhaustive_search` to `Classifier` and its subclasses with the parameter:
    • `optimization_metric`: The metric to use to find the best model. It should have type `ClassifierMetric`, an enum with one value for each classifier metric we have available:

```python
class ClassifierMetric(Enum):
    ACCURACY = "accuracy"
    PRECISION = "precision"
    RECALL = "recall"
    F1_SCORE = "f1_score"
```

    The parameter should be required.
  5. Add a new method `fit_by_exhaustive_search` to `Regressor` and its subclasses with the parameter:
    • `optimization_metric`: The metric to use to find the best model. It should have type `RegressorMetric`, an enum with one value for each regressor metric we have available:

```python
class RegressorMetric(Enum):
    MEAN_SQUARED_ERROR = "mean_squared_error"
    MEAN_ABSOLUTE_ERROR = "mean_absolute_error"
```

    The parameter should be required.
  6. Both methods should collect the `Choice`s inside the model and its children, and for each possible combination create a model without choices, fit it, and compute the selected metric on it. They should keep track of the best fitted model according to the metric and return it at the end. scikit-learn's `GridSearchCV` can be useful for this.
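Taken together, the steps above amount to an exhaustive grid search over all `Choice`-typed hyperparameters. A minimal, self-contained sketch of the idea (the `Choice` stand-in, the `expand_choices` helper, and the simplified `fit`/`score` signatures are illustrative assumptions, not Safe-DS's actual API):

```python
from itertools import product


class Choice:
    """Stand-in for the Choice class from #325: a set of candidate values."""

    def __init__(self, *values):
        self.values = values


def expand_choices(hyperparameters):
    """Yield one plain hyperparameter dict per combination of Choice values."""
    keys = list(hyperparameters)
    candidate_lists = [
        hp.values if isinstance(hp, Choice) else (hp,)
        for hp in hyperparameters.values()
    ]
    for combination in product(*candidate_lists):
        yield dict(zip(keys, combination))


def fit_by_exhaustive_search(model_factory, hyperparameters, score):
    """Fit one model per combination and return the best fitted model.

    `score` maps a fitted model to a number; higher is better.
    (The real method would also take training data and a metric enum value.)
    """
    best_model, best_score = None, float("-inf")
    for params in expand_choices(hyperparameters):
        model = model_factory(**params)  # model without any Choice left
        model.fit()
        current = score(model)
        if current > best_score:
            best_model, best_score = model, current
    return best_model
```

In practice the fitting loop could be delegated to scikit-learn's `GridSearchCV`, as the issue suggests, since it already handles the cross-product of parameter candidates and the bookkeeping for the best estimator.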
@lars-reimann lars-reimann added the enhancement 💡 New feature or request label May 4, 2023
@github-project-automation github-project-automation bot moved this to Backlog in Library May 4, 2023
@lars-reimann lars-reimann self-assigned this May 5, 2023
@lars-reimann lars-reimann moved this from Backlog to 🧱 Blocked in Library May 7, 2023
lars-reimann added a commit that referenced this issue May 26, 2023
### Summary of Changes

Add a class to represent possible choices for the value of a
hyperparameter. This is in preparation for #264.
@lars-reimann lars-reimann removed their assignment May 26, 2023
@lars-reimann lars-reimann moved this from 🧱 Blocked to Backlog in Library May 26, 2023
lars-reimann pushed a commit that referenced this issue Jun 1, 2023
## [0.13.0](v0.12.0...v0.13.0) (2023-06-01)

### Features

* add `Choice` class for possible values of hyperparameter ([#325](#325)) ([d511c3e](d511c3e)), closes [#264](#264)
* Add `RangeScaler` transformer ([#310](#310)) ([f687840](f687840)), closes [#141](#141)
* Add methods that tell which columns would be affected by a transformer ([#304](#304)) ([3933b45](3933b45)), closes [#190](#190)
* Getters for hyperparameters of Regression and Classification models ([#306](#306)) ([5c7a662](5c7a662)), closes [#260](#260)
* improve error handling of table ([#308](#308)) ([ef87cc4](ef87cc4)), closes [#147](#147)
* Remove warnings thrown in new `Transformer` methods ([#324](#324)) ([ca046c4](ca046c4)), closes [#323](#323)
@sibre28 sibre28 linked a pull request Jun 19, 2024 that will close this issue
@github-project-automation github-project-automation bot moved this from Backlog to ✔️ Done in Library Aug 31, 2024
lars-reimann pushed a commit that referenced this issue Sep 17, 2024
## [0.28.0](v0.27.0...v0.28.0) (2024-09-17)

### Features

* hyperparameter optimization for classical models ([#843](#843)) ([d8f7491](d8f7491)), closes [#264](#264)
* hyperparameter optimization for RNNs and CNNs ([#923](#923)) ([b1e8933](b1e8933)), closes [#912](#912)
@lars-reimann
Member Author

🎉 This issue has been resolved in version 0.28.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@lars-reimann lars-reimann added the released Included in a release label Sep 17, 2024