Improve experiment stats #1038

notoraptor · 2022-12-08T16:48:34Z

Description

Hi @bouthilx! This PR extends experiment stats. The latest commit reverts the dashboard changes, so this PR contains only backend/Python modifications.

Changes

Add new entries to experiment stats.
Add a new web API entry to get experiment stats.
Return trial status in the trial web API endpoint.

Checklist

Tests

I added corresponding tests for bug fixes and new features. If possible, the tests fail without the changes
All new and existing tests are passing ($ tox -e py38; replace 38 by your Python version if necessary)

Documentation

I have updated the relevant documentation related to my changes

Quality

I have read the CONTRIBUTING doc
My commit messages follow this format
My code follows the style guidelines ($ tox -e lint)

Further comments

NB: After checking the doc, it seems the API endpoint /experiments/:name also returns various experiment info, but not everything that is in stats. I wonder whether we could merge both entries in a future PR (i.e. /experiments/:name and the newly added /experiments/status/:name).

- Add a new property Trial.duration to compute trial duration
- Compute and return only trials count per status and total trials count
- Use Experiment.stats.duration as current_execution_time
- Use Experiment.stats.duration to compute ETA
- Compute whole clock time directly into Experiment.stats

Dashboard:
- Use a specific color for each trial status in progress bar
- Add tooltips for bar info

…al execution
- Rename ExperimentStats.to_dict() to to_json()
- Compute experiment duration using all trials that have an execution interval
- Compute ETA using only completed trials, and also in corner cases
- Set max_trials to 200 for testing experiment `uncompleted_experiment`

…ment changes. Previously, the same component was reused and reloaded each time an experiment was selected. If we selected an experiment and then another one right away, without waiting for the former to load, the former could replace the latter in the component once its loading request completed. To prevent this, we make sure a brand-new component is created each time an experiment is selected.

bouthilx · 2022-12-09T14:52:11Z

docs/src/user/web_api.rst

+        "duration": "2 days, 5:11:24.006755",
+        "whole_clock_time": "8 days, 23:15:15.594405",


It would be good to normalize these names with the labels in the frontend before we release it.

OK, done! duration -> elapsed_time and whole_clock_time -> sum_of_trials_time. Note that duration was already used elsewhere in the code (e.g. in module format_terminal), so I updated it everywhere.
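For reference, the sample values in the diff above look like the default string form of `datetime.timedelta`. Assuming that is how the stats are serialized (this PR thread does not confirm it), the renamed fields could be produced as sketched here. `stats_to_json` is a hypothetical helper, not part of Orion, and the timedelta values are chosen only to reproduce the documentation sample.

```python
import datetime


def stats_to_json(elapsed_time, sum_of_trials_time):
    """Hypothetical helper: serialize the two renamed fields as strings.

    Assumption: values are rendered with str(timedelta), which matches
    the look of the sample in docs/src/user/web_api.rst.
    """
    return {
        "elapsed_time": str(elapsed_time),
        "sum_of_trials_time": str(sum_of_trials_time),
    }


# Illustrative values only, picked to match the documentation sample.
payload = stats_to_json(
    datetime.timedelta(days=2, hours=5, minutes=11, seconds=24, microseconds=6755),
    datetime.timedelta(days=8, hours=23, minutes=15, seconds=15, microseconds=594405),
)
print(payload)
```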

bouthilx · 2022-12-09T14:53:35Z

src/orion/core/worker/experiment.py

@@ -1,4 +1,4 @@
-# pylint:disable=protected-access,too-many-public-methods,too-many-lines
+# pylint:disable=protected-access,too-many-public-methods,too-many-lines,too-many-branches


Is this exception specific to the stats property? If so, it should be added there specifically.

bouthilx · 2022-12-09T14:59:05Z

src/orion/core/worker/experiment.py

+            # If max_trials is None, 0 or infinite, we cannot compute ETA
+            eta = None
+        elif len(completed_trials) > self.max_trials:
+            # If there are more completed trials than max trials, then ETA should be 0 (?)


Suggested change

-# If there are more completed trials than max trials, then ETA should be 0 (?)
+# If there are more completed trials than max trials, then ETA should be 0

Yep!
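The branches discussed above can be sketched as a standalone function. This is a minimal sketch, not the code in experiment.py: `estimate_eta` and its parameters are hypothetical names, the linear extrapolation (average time per completed trial times remaining trials) is an assumption, and so is returning `None` when no trial has completed yet.

```python
import datetime
import math


def estimate_eta(elapsed_time, nb_completed, max_trials):
    """Hypothetical ETA estimate based on completed trials only."""
    # If max_trials is None, 0 or infinite, we cannot compute ETA.
    if max_trials is None or max_trials == 0 or math.isinf(max_trials):
        return None
    # More completed trials than max_trials: nothing left to run.
    if nb_completed > max_trials:
        return datetime.timedelta(0)
    # Assumption: without any completed trial there is nothing to extrapolate.
    if nb_completed == 0:
        return None
    # Assumption: remaining trials take the average time observed so far.
    return (elapsed_time / nb_completed) * (max_trials - nb_completed)


eta = estimate_eta(datetime.timedelta(minutes=6), nb_completed=3, max_trials=5)
print(eta)
```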

src/orion/core/worker/experiment.py

tests/functional/serving/test_experiments_resource.py

tests/unittests/core/worker/test_trial.py

bouthilx · 2022-12-09T15:08:31Z

tests/unittests/core/worker/test_experiment.py

    NUM_COMPLETED = 3
    statuses = (["completed"] * NUM_COMPLETED) + (["reserved"] * 2)


Suggested change

-NUM_COMPLETED = 3
-statuses = (["completed"] * NUM_COMPLETED) + (["reserved"] * 2)
+NUM_COMPLETED = 3
+NUM_RESERVED = 2
+statuses = (["completed"] * NUM_COMPLETED) + (["reserved"] * NUM_RESERVED)

Same thing for tests above, setting number of reserved trials using a variable.
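To illustrate why the named constant helps, here is a small self-contained sketch of the pattern. `collections.Counter` stands in for the real `stats.trial_status_count`, which is an assumption made only so the example runs without an Orion experiment.

```python
from collections import Counter

NUM_COMPLETED = 3
NUM_RESERVED = 2
statuses = (["completed"] * NUM_COMPLETED) + (["reserved"] * NUM_RESERVED)

# Stand-in for stats.trial_status_count and stats.nb_trials (assumption).
trial_status_count = dict(Counter(statuses))
nb_trials = len(statuses)

# Expected values are derived from the constants, so changing one
# constant updates every assertion consistently.
assert trial_status_count == {"completed": NUM_COMPLETED, "reserved": NUM_RESERVED}
assert nb_trials == NUM_COMPLETED + NUM_RESERVED
```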

bouthilx · 2022-12-09T15:08:45Z

tests/unittests/core/worker/test_experiment.py

+        assert stats.duration == datetime.timedelta(seconds=3)
+        assert stats.whole_clock_time == datetime.timedelta(seconds=3)
+        assert stats.nb_trials == NUM_COMPLETED + 2
+        assert stats.trial_status_count == {"completed": NUM_COMPLETED, "reserved": 2}


Suggested change

-assert stats.trial_status_count == {"completed": NUM_COMPLETED, "reserved": 2}
+assert stats.trial_status_count == {"completed": NUM_COMPLETED, "reserved": NUM_RESERVED}

bouthilx · 2022-12-09T15:08:53Z

tests/unittests/core/worker/test_experiment.py

-        assert stats.duration == stats.finish_time - stats.start_time
+        assert stats.duration == datetime.timedelta(seconds=3)
+        assert stats.whole_clock_time == datetime.timedelta(seconds=3)
+        assert stats.nb_trials == NUM_COMPLETED + 2


Suggested change

-assert stats.nb_trials == NUM_COMPLETED + 2
+assert stats.nb_trials == NUM_COMPLETED + NUM_RESERVED

bouthilx · 2022-12-09T15:12:57Z

tests/unittests/core/worker/test_experiment.py

+        assert stats.trial_status_count == {"completed": NUM_COMPLETED, "reserved": 2}
+        # If max trials < completed trials, then ETA is 0, and progress is relative to nb trials
+        assert stats.max_trials == 2
+        assert stats.progress == 0.6


This test should include new, broken and interrupted trials as well. The progress should not take broken trials into account, because they will not be executed again, but it should count all the others.
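A rough sketch of the requested behaviour, under stated assumptions: `compute_progress` is a hypothetical function, the denominator follows the commit note about using max(nb_trials, max_trials), broken trials are dropped from the trial count, and `None` is returned when max_trials is unknown or infinite. The real property in experiment.py may differ.

```python
import math
from collections import Counter


def compute_progress(statuses, max_trials):
    """Hypothetical progress ratio that ignores broken trials."""
    counts = Counter(statuses)
    # Broken trials will not run again, so they are left out of the total.
    nb_counted = sum(n for status, n in counts.items() if status != "broken")
    # Assumption: progress is unknown without a finite max_trials.
    if max_trials is None or math.isinf(max_trials):
        return None
    denominator = max(nb_counted, max_trials)
    if denominator == 0:
        return None
    return counts["completed"] / denominator


statuses = (
    ["completed"] * 3
    + ["reserved"] * 2
    + ["broken"] * 4  # must not change the result
)
progress = compute_progress(statuses, max_trials=2)
print(progress)
```

With 3 completed and 2 reserved trials and max_trials set to 2, this gives 3 / 5, matching the 0.6 asserted in the test under review.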

Fix comment. Add a web API test to get experiment stats when max_trials is infinite. Test that trial.execution_interval uses heartbeat if end_time is None. In test_experiment, use a variable NUM_RESERVED to set number of reserved trials.
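The heartbeat fallback mentioned above could look roughly like this. `TrialTimes` is a hypothetical stand-in for the timing fields of a trial, not Orion's `Trial` class, and returning a zero duration when no interval exists is an assumption.

```python
import datetime
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TrialTimes:
    """Hypothetical container for the timing fields of a trial."""

    start_time: Optional[datetime.datetime] = None
    end_time: Optional[datetime.datetime] = None
    heartbeat: Optional[datetime.datetime] = None

    @property
    def execution_interval(self) -> Optional[Tuple[datetime.datetime, datetime.datetime]]:
        """Return (start, end), using heartbeat if end_time is None."""
        if self.start_time is None:
            return None
        end = self.end_time if self.end_time is not None else self.heartbeat
        if end is None:
            return None
        return (self.start_time, end)

    @property
    def duration(self) -> datetime.timedelta:
        """Length of the execution interval (zero if there is none)."""
        interval = self.execution_interval
        if interval is None:
            return datetime.timedelta(0)  # assumption for the missing case
        return interval[1] - interval[0]


start = datetime.datetime(2022, 12, 9, 10, 0, 0)
running = TrialTimes(start_time=start, heartbeat=start + datetime.timedelta(minutes=2))
print(running.duration)
```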

bouthilx

LGTM, thanks! :)

notoraptor added 30 commits December 2, 2022 09:55

Add web api entry to get experiment status.

0f4f079

Add a new component ExperimentStatusBar.

066739c

[web api] return trial status in /trial

3b3fabf

[dashboard] Display trial status

5ea026c

Fix typo in frontend

15aa04f

Reformat code

d091ce3

Use only executed trials to compute current execution time and whole …

585eb6a

…clock time.

Re-use experiment status bar in experiments nav bar

c1c722c

Do not display time info around status bar in experiments nav bar. Show gray animated striped bar when loading. Show red bar on error.

Add a copy of test DB to be extended with uncompleted experiment.

e43c344

Add uncompleted experiment into new test database.

c762498

Rename info

a9bea8f

Store info into Experiment.stats

Update script add_uncompleted_experiment:

7f3cd1f

- Clean up database before new insertion
- Add a start time for all non-new trials
- Set experiment metadata datetime
- Set each completed trial's duration to 2 minutes

Make bar clickable

401d30a

When we click on a part of the bar, the corresponding trials are filtered in the trials table. When we click again on the same part, it is deselected and all trials are displayed again in the trials table. The selected part is displayed with a striped bar.

In script add_uncompleted_experiment, generate experiment UNCOMPLETED…

3ba9c0d

…_EXPERIMENT from experiment 2-dim-shape-exp.1 read from db_dashboard_full.pkl

[dashboard] In progress bar:

fcabfa6

- Change "duration" to "elapsed time"
- Change "current execution time for all completed trials" to "Time elapsed since the beginning of the HPO execution"

[dashboard] Display both ETA (estimated remaining duration) and estima…

7c01198

…ted end date. We need to add `eta_milliseconds` in ExperimentStats and send it to dashboard to compute estimated end date.

- Add more uncompleted experiments for corner cases related to max_tr…

0db2c34

…ials
- Display correct ETA for each corner case
- Generate an ExperimentStats object even if there are no completed trials

Add a color legend for progress bar in database page

216e254

[dashboard]

3b84803

Remove unused file. Display progress as unknown if null. Add TODO.

Add max_trials and documentation to experiment stats.

041b7a8

Handle infinite max_trials in experiment stats

588527a

[dashboard] Display max_trials and additional experiment stats above …

b07db38

…progress bar

Compute progress part and progress using max(nb_trials, max_trials) i…

a79eed5

…nstead of just nb_trials. If max_trials is infinite, display empty bar with label `N/A`

[dashboard] Fix old tests.

f45cc84

Add documentation for new endpoint.

e3d513f

PS: It seems the `/experiments/:name` endpoint has a similar purpose to `/experiments/status/:name`. Should we merge both?

Test new web API entry /experiments/status

077619b

Test that trial status is well returned in web API endpoint `/trials/…

2ef1dac

…:experiment_name/:trial_id`

notoraptor added 7 commits December 8, 2022 10:34

Test experiment.stats

64ff509

Test trial properties execution_time and duration

d09ead9

Fix pylint

d553a52

Recreate new test DB with uncompleted experiments.

61b89a2

Revert dashboard related changes.

bb042f5

Pass start to sum() as positional argument instead of keyword, to mak…

ef660f1

…e it work with Python 3.7

Replace boolean stats check with boolean stats.trials_completed check

8d125df
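On the Python 3.7 fix in the commits above: the built-in `sum()` only accepts `start` as a keyword argument from Python 3.8 onward, so passing it positionally works on both versions. A minimal sketch with illustrative durations (the values are made up; only the calling convention matters):

```python
import datetime

# Illustrative trial durations (hypothetical values).
durations = [datetime.timedelta(minutes=2)] * 3

# Positional start value: works on Python 3.7 and later.
# sum(durations, start=datetime.timedelta()) raises TypeError on 3.7.
total = sum(durations, datetime.timedelta())
print(total)
```

The explicit `timedelta()` start is needed in any case, because the default start of `0` cannot be added to a `timedelta`.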

bouthilx reviewed Dec 9, 2022

src/orion/core/worker/experiment.py

bouthilx reviewed Dec 9, 2022

tests/functional/serving/test_experiments_resource.py

bouthilx reviewed Dec 9, 2022

tests/unittests/core/worker/test_trial.py

bouthilx reviewed Dec 9, 2022

notoraptor added 6 commits December 9, 2022 11:05

In ExperimentStats, rename duration -> elapsed_time and whole_clock_t…

eab42e6

…ime -> sum_of_trials_time

Move experiment progress in a specific property.

ea16bb6

Fix test_coverage_of_access_tests

da60151

Do not count broken trials when computing experiment progress.

a28c66b

Test Trial.duration when end_time is None and heartbeat is available

babbf8b

bouthilx approved these changes Dec 9, 2022

bouthilx added the enhancement Improves a feature or non-functional aspects (e.g., optimization, prettify, technical debt) label Dec 19, 2022

bouthilx merged commit f6ba53b into Epistimio:develop Dec 19, 2022

notoraptor deleted the experiment-progress-bar-backend branch January 19, 2023 17:34

notoraptor mentioned this pull request Mar 2, 2023

Release 0.2.7rc #1087

Merged
