Add autologging for scikit-learn #3287
Signed-off-by: harupy <[email protected]>
@harupy This is awesome! Left some initial comments. Great work!
branches:
  - master
paths:
  - mlflow/sklearn/**
For a pull request, run this action only when files under mlflow/sklearn
have changed.
I found that the doc for https://scikit-learn.org/stable/modules/generated/sklearn.utils.all_estimators.html
* Add autologging for scikit-learn
* Update sklearn's version
* Remove unrelated file
* rename
* rename load_model
* Remove blank line
* Reorder imports
* fix
* DRY
* Emit warning on older versions of sklearn
* Use warnings.warn
* Remove unused argument
* Revert changes on requirements
* Use LooseVersion
* Fix _get_all_estimators
* rename vars
* Use backported all_estimators
* Fix lint errors
* Remove print
* simplify code
* Create sklearn directory
* Move _all_estimators to utils
* Remove link
* fix
* Fix active_run_exists' condition
* Verify no children
* Add experiment_id
* Specify stacklevel
* Wrap fit with try-except
* Remove use_caplog
* rename test
* Remove temp_tracking_uri
* Remove unused imports
* Add docstring for _all_estimators
* Fix assertions
* Wrap score with try-except
* fix
* Add log assertion to test_autolog_marks_run_as_failed_when_fit_fails
* indent
* simplify code
* Add failure reasons
* Fix assertion order
* Assert metrics is empty when score fails
* minor fix
* Assert after with
* Use readable class name
* Throw when fit fails
* Rename active_run_exists to should_start_run
* nit
* fix if condition
* pass sample_weight if both fit and score have it
* Exclude property methods from patching
* Chunk params to avoid hitting log_batch API limit
* Fix args handling
* Use all_estimators if sklearn.utils.all_estimators exists
* Fix lint errors
* Remove useless ()
* Fix test name
* nit
* Use model.fit if not parametrized
* Temporarily add sklearn job
* Add pytest
* rerun
* fix config
* Fix install
* do not run install-common.sh
* Disable fail-fast
* Remove sklearn.datasets
* Add 0.22.2
* Try print_changed_only=True
* Truncate dict value
* Add test for value truncation
* De-hardcode tests
* Remove set_config
* Use try_mlflow_log for mlflow.end_run
* Use try-catch for _all_estimators
* Fix waring message for scoring error
* Mark autolog as experimental
* Mark tests for autolog as large
* Add test_fit_takes_Xy_as_keyword_arguments
* Emit warning message when truncating key or value
* De-hardcode x and y names
* Fix mangled-signatures link
* Add comments
* Add doc for _get_args_for_score
* Add large option
* Apply truncation to expected dict
* DRY
* nit
* De-hardcode sklearn version
* Fix lint
* De-hardcode model dir
* Fix patch target
* Use called_once_with
* Add a new test case to test_both_fit_and_score_contain_sample_weight
* nit
* Remove unused function
* nit
* nit
* DRY
* Fix func order
* Fix lint
* Capitalize x
* Override unbound methods
* Introduce throw_if_try_mlflow_log_has_emitted_warnings fixture
* Fix test_fit_takes_Xy_as_keyword_arguments
* Add assertions for logged data
* Add assert_called_once_with to test_call_fit_with_arguments_score_does_not_accept
* Split Xy
* Move log_metric to else clause
* Fix lint errros
* nit
* Add docstring for autolog
* Move pylint disable comments
* nit
* Fix fixture for try_mlflow_log
* Add test_autolog_does_not_throw_when_mlflow_logging_fails
* Fix lint
* Replace key_is_none with val_is_none
* Use MAX_ENTITY_KEY_LENGTH
* Add comment for _MIN_SKLEARN_VERSION
* Enhance comment for prop methods exclusion
* Add todo for wrap & patch
* bump sklearn
* Update action config
* Rename is_old_version
* test _is_supported_version
* Fix command
* Fix _is_supported_version
* Add continue-on-error
* Emit a warning if test fail on an unsupported version
* Add warning step
* Fix workflow
* Use set +e
* debug
* Fix condition
* Simplify workflow
* nit
* Add comment on why include unsupported version
* Update doc
* black
* nit
* nit
* fix comment
* nit
* Fix syntax
* lint
Signed-off-by: harupy [email protected]
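Two of the commits above mention chunking params to stay under the log_batch API limit and truncating long values with a warning. The following is a minimal sketch of that idea, not the code from this PR: the helper names `truncate_values` and `chunk_params` and both limit constants are hypothetical and chosen only for illustration.

```python
import warnings

# Hypothetical limits, for illustration only; the real limits are defined by MLflow.
MAX_PARAMS_PER_BATCH = 100
MAX_VALUE_LENGTH = 250


def truncate_values(params, max_length=MAX_VALUE_LENGTH):
    """Stringify each value and cut it to max_length, warning when a cut happens."""
    truncated = {}
    for key, value in params.items():
        text = str(value)
        if len(text) > max_length:
            warnings.warn(
                f"Truncating value of param {key!r} to {max_length} characters"
            )
            text = text[:max_length]
        truncated[key] = text
    return truncated


def chunk_params(params, chunk_size=MAX_PARAMS_PER_BATCH):
    """Yield dicts of at most chunk_size params, preserving insertion order."""
    items = list(params.items())
    for start in range(0, len(items), chunk_size):
        yield dict(items[start:start + chunk_size])


# Small limits so the behaviour is easy to see.
params = {f"p{i}": "x" * (i + 1) for i in range(5)}
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the truncation warnings in this demo
    batches = list(chunk_params(truncate_values(params, max_length=3), chunk_size=2))
```

Each dict in `batches` could then be sent in its own logging call, so no single request exceeds the per-request limit.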
What changes are proposed in this pull request?
Add autologging for scikit-learn
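As a rough illustration of what autologging by patching `fit` can look like, here is a self-contained sketch. It is not the implementation in this PR: `TinyEstimator`, `LOGGED_RUNS`, and this `autolog` function are hypothetical stand-ins, and an in-memory list replaces the MLflow tracking store. It follows the behaviour the commit log describes, where a failure in `fit` propagates and a failure in `score` leaves the metrics empty.

```python
import functools


class TinyEstimator:
    """Hypothetical stand-in for a scikit-learn estimator."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self):
        return {"alpha": self.alpha}

    def fit(self, X, y):
        # "Training" just memorises the mean of the targets.
        self.mean_ = sum(y) / len(y)
        return self

    def score(self, X, y):
        # Negative mean squared error of the constant prediction.
        return -sum((v - self.mean_) ** 2 for v in y) / len(y)


LOGGED_RUNS = []  # hypothetical in-memory replacement for a tracking store


def autolog(estimator_cls):
    """Patch estimator_cls.fit so each call records params and a training score."""
    original_fit = estimator_cls.fit

    @functools.wraps(original_fit)
    def patched_fit(self, X, y, *args, **kwargs):
        run = {"params": dict(self.get_params()), "metrics": {}}
        # Exceptions raised by fit are deliberately not swallowed.
        result = original_fit(self, X, y, *args, **kwargs)
        try:
            run["metrics"]["training_score"] = self.score(X, y)
        except Exception as exc:
            # A scoring failure should not break training; metrics stay empty.
            run["score_error"] = repr(exc)
        LOGGED_RUNS.append(run)
        return result

    estimator_cls.fit = patched_fit


autolog(TinyEstimator)
model = TinyEstimator(alpha=0.5).fit([[0], [1]], [1.0, 3.0])
print(LOGGED_RUNS)
```

Patching at the class level means user code keeps calling `fit` as usual, and the logging happens as a side effect.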
How is this patch tested?
Unit tests
Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/projects: MLproject format, project running backends
area/scoring: Local serving, model deployment tools, spark UDFs
area/server-infra: MLflow server, JavaScript dev server
area/tracking: Tracking Service, tracking client APIs, autologging
Interface
area/uiux: Front-end, user experience, JavaScript, plotting
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support
Language
language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages
Integrations
integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations
How should the PR be classified in the release notes? Choose one:
rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes