
Log evaluation results to MLflow #2337

Merged (53 commits) on Apr 25, 2022

Conversation

@tstadel (Member) commented Mar 21, 2022

Proposed changes:

  • add execute_eval_run() to Pipeline, which encapsulates pipeline.eval() and logs to MLflow via an MLflowTrackingHead
  • log metadata about the pipeline, evaluation set, and corpus
  • remove the MLflow version constraint
  • rework MLFlowLogger:
    • introduce a Tracker facade that can delegate to different tracking heads via Tracker.set_tracking_head() (see the sketch after this list)
    • implement MLflowTrackingHead with an auto_track_environment param
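
For illustration, a minimal sketch of wiring up the reworked facade by hand. The import path matches haystack/utils/experiment_tracking from the review below, but the exact constructor arguments are assumptions based on this description, not verified signatures:

# Hedged sketch: point the Tracker facade at an MLflow tracking head.
# Class and method names follow the PR description; argument names are assumptions.
from haystack.utils.experiment_tracking import MLflowTrackingHead, Tracker

tracking_head = MLflowTrackingHead(
    tracking_uri="http://localhost:5000",  # MLflow server to log to
    auto_track_environment=True,           # also log environment metadata
)
Tracker.set_tracking_head(tracking_head)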

Usage:

# Compare three candidate query pipelines against the same index and
# evaluation set; each run is logged to the configured MLflow server.
for idx, query_pipeline in enumerate([pipe1, pipe2, pipe3]):
    eval_result = Pipeline.execute_eval_run(
        index_pipeline=index_pipeline,
        query_pipeline=query_pipeline,
        evaluation_set_labels=labels,
        corpus_file_paths=file_paths,
        corpus_file_metas=file_metas,
        experiment_tracking_tool="mlflow",
        experiment_tracking_uri="http://localhost:5000",
        experiment_name="my-query-pipeline-experiment",
        experiment_run_name=f"run_{idx+1}",
        pipeline_meta={"name": f"my-pipeline-{idx+1}"},
        evaluation_set_meta={"name": "my-evalset"},
        corpus_meta={"name": "my-corpus"},
        add_isolated_node_eval=True,
        reuse_index=False,
    )
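
The returned eval_result can still be inspected locally on top of what gets logged to MLflow; for example, via Haystack's pre-existing eval API (not something this PR adds):

# EvaluationResult.calculate_metrics() is existing Haystack API, shown here
# only to illustrate what can be done with the returned object.
metrics = eval_result.calculate_metrics()
print(metrics)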

Status (please check what you already did):

  • First draft (up for discussions & feedback)
  • Final code

closes #2188

@tstadel tstadel requested a review from julian-risch March 23, 2022 13:52
@tstadel tstadel marked this pull request as ready for review March 23, 2022 13:53
@tstadel tstadel marked this pull request as draft March 24, 2022 09:11
@julian-risch (Member) left a comment:

Great draft. I made many comments that are best to discuss in a call I believe.

Review threads:
  • haystack/modeling/training/base.py (resolved)
  • haystack/nodes/retriever/dense.py (resolved)
  • haystack/pipelines/base.py (4 threads, outdated, resolved)
  • setup.cfg (resolved)
  • haystack/utils/experiment_tracking.py (resolved)
  • haystack/schema.py (2 threads, outdated, resolved)
@tstadel tstadel marked this pull request as ready for review April 21, 2022 11:29
@julian-risch (Member) commented:

Seems like the test test_get_all_documents_large_quantities[elasticsearch] is causing problems. I haven't seen that before:

E       elasticsearch.exceptions.TransportError: TransportError(429, 'circuit_breaking_exception', '[parent] Data too large, data for [<http_request>] would be [131020332/124.9mb], which is larger than the limit of [127506841/121.5mb], real usage: [131020280/124.9mb], new bytes reserved: [52/52b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=52/52b, model_inference=0/0b, accounting=10064/9.8kb]')

And mypy:

haystack/pipelines/base.py:964: error: Item "None" of "Optional[BaseDocumentStore]" has no attribute "index"
haystack/pipelines/base.py:965: error: Item "None" of "Optional[BaseDocumentStore]" has no attribute "index"
haystack/pipelines/base.py:966: error: Item "None" of "Optional[BaseDocumentStore]" has no attribute "delete_index"
haystack/pipelines/base.py:966: error: Item "None" of "Optional[BaseDocumentStore]" has no attribute "index"

@tstadel (Member, Author) commented Apr 25, 2022

> Seems like the test test_get_all_documents_large_quantities[elasticsearch] is causing problems. I haven't seen that before:
>
> E       elasticsearch.exceptions.TransportError: TransportError(429, 'circuit_breaking_exception', '[parent] Data too large, data for [<http_request>] would be [131020332/124.9mb], which is larger than the limit of [127506841/121.5mb], real usage: [131020280/124.9mb], new bytes reserved: [52/52b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=52/52b, model_inference=0/0b, accounting=10064/9.8kb]')
>
> And mypy:
>
> haystack/pipelines/base.py:964: error: Item "None" of "Optional[BaseDocumentStore]" has no attribute "index"
> haystack/pipelines/base.py:965: error: Item "None" of "Optional[BaseDocumentStore]" has no attribute "index"
> haystack/pipelines/base.py:966: error: Item "None" of "Optional[BaseDocumentStore]" has no attribute "delete_index"
> haystack/pipelines/base.py:966: error: Item "None" of "Optional[BaseDocumentStore]" has no attribute "index"

The mypy issues are fixed. The other failure was probably a glitch in the (Elasticsearch) test environment: it looks like the scroll over 1000 docs ran out of memory. As this test has been in place for quite a while, I would consider this a very rare event that probably won't happen again. However, we should monitor the tests, and if it does happen again we can decrease the batch_size during get_all_documents within this test.
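
For reference, the usual way to satisfy mypy here is to narrow the Optional before attribute access. A minimal sketch of that pattern (the actual fix in haystack/pipelines/base.py may differ; get_document_store() is used purely as an illustrative accessor):

# Narrow Optional[BaseDocumentStore] before using .index / .delete_index,
# which is what the reported errors at base.py:964-966 complain about.
document_store = index_pipeline.get_document_store()
if document_store is None:
    raise ValueError("Pipeline has no DocumentStore, cannot run evaluation.")
# From here on, mypy treats document_store as a plain BaseDocumentStore.
document_store.delete_index(index=document_store.index)

The batch-size mitigation mentioned above would then be a one-line change in the test, e.g. document_store.get_all_documents(batch_size=1_000), using the batch_size parameter the comment refers to.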

@julian-risch (Member) left a comment:

LGTM! 👍

@julian-risch (Member) commented:

@tstadel Let's make sure this new feature is properly documented on the website. Maybe you could sync on that topic with @brandenchan?

Development

Successfully merging this pull request may close these issues:
  • Logging to MLflow in pipeline.eval() (#2188)

3 participants