
Improve unschedulable task warning messages by integrating with the autoscaler #18724

Merged: 50 commits into ray-project:master on Sep 24, 2021

Conversation

@ericl (Contributor) commented Sep 17, 2021

Why are these changes needed?

Today, we raise warnings when tasks / actors are not immediately schedulable. These warnings are confusing since they don't take possible future autoscaling into account, and hence can be false positives. False positives are bad since:

  • The user is often confused ("shouldn't my cluster autoscale?" "this warning doesn't make sense")
  • We can't raise exceptions since it could be a false positive ("user ignores warning and is confused when their app hangs")

Library users like Serve today disable these warnings by using placement groups, which is not ideal.

This PR eliminates these false positives via integration with the autoscaler. Instead of the raylet printing messages when resources are not schedulable, it defers to the autoscaler. The autoscaler can determine if a task will be infeasible even after autoscaling. This PR:

  • Makes the autoscaler always active (in "readonly" mode for laptop / manually set up clusters)
  • Defers responsibility for scheduler warning messages to autoscaler

In future PRs, we can close the loop by raising exceptions for "permanently infeasible" tasks. This would require the autoscaler to send statuses back to the scheduler about what task types are infeasible.
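
For illustration, a minimal sketch of this deferral (hypothetical names, assuming a simple per-node-type feasibility check; this is not Ray's actual implementation):

```python
# Hypothetical sketch of the warning deferral described above. The names
# ResourceDemand, fits_on_some_node_type, and warn_on_infeasible are
# illustrative, not Ray's internal APIs.
from typing import Dict, List

ResourceDemand = Dict[str, float]


def fits_on_some_node_type(demand: ResourceDemand,
                           node_types: List[Dict[str, float]]) -> bool:
    """True if some launchable (or already present) node type can
    satisfy the demand on its own."""
    return any(
        all(node.get(resource, 0.0) >= amount
            for resource, amount in demand.items())
        for node in node_types)


def warn_on_infeasible(demands: List[ResourceDemand],
                       node_types: List[Dict[str, float]]) -> None:
    # Warn only for demands that stay infeasible even after autoscaling;
    # demands that are merely waiting on a scale-up produce no warning.
    for demand in demands:
        if not fits_on_some_node_type(demand, node_types):
            print(f"No available node types can fulfill resource "
                  f"request {demand}.")
```

With `node_types = [{"CPU": 16.0}]` (matching the 4 x 16-CPU laptop cluster in the sample below), the `{'CPU': 30.0}` demand would be flagged while `{'CPU': 4.0}` would not.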

PRD doc: https://docs.google.com/document/d/1OT6m4xQDN8UtsBgnAMpX6nhXpNAfdeHJVve-iGhw1WI/edit#

Sample `ray status` output on a laptop (autoscaler output format is unchanged):

======== Autoscaler status: 2021-09-23 17:45:56.525566 ========
Node status
---------------------------------------------------------------
Healthy:
 1 node_777cd260045578b90970679657460908a1ef8285ed248a093e79cc72
 1 node_2509c7c51cfb659b77700cb34c2035df2cf016a67a1868864da6d4b4
 1 node_50973ed2e3eb5a30f64a6e107ec76d9aec46b7cfdd502a66135aa2a4
 1 node_a7cf546146ef87737902b042ed0fb73619913d197617bca7899beb73
Pending:
 (no pending nodes)
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 60.0/64.0 CPU
 0.00/106.382 GiB memory
 0.00/0.586 GiB object_store_memory

Demands:
 {'CPU': 4.0}: 92+ pending tasks/actors
 {'CPU': 3.0}: 193+ pending tasks/actors
 {'CPU': 1.0, 'foo': 1.0}: 2+ pending tasks/actors
 {'CPU': 30.0}: 2+ pending tasks/actors
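
The report above comes from the `ray status` CLI mentioned later in this thread; running it against a live cluster is all that's needed:

```bash
# Print the autoscaler status report shown above for the current cluster.
ray status
```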

Related issue number

Closes #15933: [core] Better error message for task/actors when unschedulable (integrate with autoscaler)

TODO:

  • Add integration tests for emitted log messages
  • Add integration test for ray status with readonly provider
  • Update unit tests for resource demand scheduler
  • Add unit test for use of readonly node provider

@sasha-s (Contributor) left a comment

Reviewed 17 of 17 files at r1, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @AmeerHajAli, @DmitriGekhtman, @ericl, @ijrsvt, @pcmoritz, @raulchen, @robertnishihara, and @wuisawesome)


python/ray/worker.py, line 1091 at r1 (raw file):

                yield ("Tip: use `ray status` to view detailed "
                       "cluster status. To disable these "
                       "messages, set RAY_SCHEDULER_EVENTS=0.")

Right now AUTOSCALER_EVENTS is not documented.
Do we want to document RAY_SCHEDULER_EVENTS?
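
For context, a minimal usage sketch of the flag from the snippet above (the script name is a placeholder):

```bash
# Disable the scheduler warning messages, per the tip quoted above.
# my_script.py stands in for any Ray program.
RAY_SCHEDULER_EVENTS=0 python my_script.py
```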


python/ray/autoscaler/_private/autoscaler.py, line 255 at r1 (raw file):

        def schedule_node_termination(node_id: NodeID,
                                      reason_opt: Optional[str]) -> None:
            if self.provider.is_readonly():

It is a bit confusing that we allow mutating operations on readonly providers and silently ignore them.
Also, I think we re-check is_readonly() below, which becomes redundant if we skip here.
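
To make the concern concrete, a simplified sketch of the pattern under discussion (only `provider.is_readonly()` comes from the diff; the class shell and comments are illustrative):

```python
# Illustrative only, not the PR's actual code: an early-return guard that
# makes any later is_readonly() re-check unreachable.
from typing import Optional


class StandardAutoscaler:  # minimal stand-in for the real class
    def __init__(self, provider):
        self.provider = provider

    def schedule_node_termination(self, node_id: str,
                                  reason_opt: Optional[str]) -> None:
        if self.provider.is_readonly():
            # Readonly providers (laptop / manually managed clusters)
            # cannot terminate nodes, so the request is dropped here.
            return
        # ... real termination logic would go here; an is_readonly()
        # check past this point can never fire.
```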


python/ray/autoscaler/_private/monitor.py, line 237 at r1 (raw file):

        mirror_node_types = {}
        resource_deadlock = False

resource_deadlock sounds like a bug; maybe rename it to something like
not_enough_resources?

@ericl (Contributor, Author) commented Sep 23, 2021

Agreed on supporting the legacy logs for now; I'll add a feature flag.

@ericl (Contributor, Author) commented Sep 24, 2021

Done with a pass over the comments. @AmeerHajAli, I attached a `ray status` output in the PR description for readonly cluster status (autoscaler status is unchanged). You can also check out the asserts in test_cli.py and test_output.py.

@ericl removed the @author-action-required label (Sep 24, 2021)
@DmitriGekhtman (Contributor) left a comment

Looks great!
There are some tests to patch up.

@DmitriGekhtman added the @author-action-required label (Sep 24, 2021)
@ericl merged commit 11a2dfc into ray-project:master (Sep 24, 2021)
@ericl (Contributor, Author) commented Sep 24, 2021

Windows build seems to not trigger correctly, but yolo
