Introduce CuratorTaskManager for make an active job be curated by only one scheduler #130

Closed
yahoNanJing opened this issue Aug 12, 2022 · 0 comments · Fixed by #153
Labels
enhancement New feature or request

yahoNanJing commented Aug 12, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Since #59 was introduced, the previous cache layer has been almost ineffective. This significantly degrades task scheduling performance, especially when scheduling thousands of tasks.

Describe the solution you'd like

It would be better to introduce a CuratorTaskManager so that each active job is curated by only one scheduler. We can then cache the active jobs to avoid the serialization and deserialization cost.

To achieve this, we need the following things:

  1. Introduce a scheduler ID on the execution graph to identify its curator
  2. Extract task status from ExecutionStage
  3. Extract job status from ExecutionGraph
  4. Introduce a cache for the active execution graphs in TaskManager
  5. Make the executor gRPC server able to know which scheduler each request comes from
  6. Make the executor able to report task status updates to its curator scheduler
  7. Error handling:
    • When a scheduler dies and executors can no longer send task updates to it, another scheduler may receive those task status update requests instead. As a first step, that scheduler can simply mark the related job as failed. Later this can be improved with stage-based recovery.
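
The steps above could be sketched roughly as follows. This is only an illustration of the idea, not the actual design: `SharedState`, the simplified `ExecutionGraph`, the status enums, and all method names here are hypothetical stand-ins, and a plain `HashMap` takes the place of the real persistent state backend. The curator keeps its active graphs in memory and touches shared state only when a job starts or finishes, while a scheduler that receives an update for a job it does not curate marks that job as failed.

```rust
use std::collections::HashMap;

// Hypothetical status types, kept separate from the graph (items 2 and 3).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStatus { Running, Completed, Failed }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobStatus { Running, Completed, Failed }

// Simplified stand-in for an execution graph that records its curator (item 1).
#[derive(Debug)]
pub struct ExecutionGraph {
    pub job_id: String,
    pub curator: String,
    pub tasks: Vec<TaskStatus>,
}

// Stand-in for the persistent state shared by all schedulers (an assumption).
#[derive(Default)]
pub struct SharedState {
    pub job_status: HashMap<String, JobStatus>,
}

pub struct CuratorTaskManager {
    scheduler_id: String,
    // In-memory cache of the graphs this scheduler curates (item 4).
    active: HashMap<String, ExecutionGraph>,
}

impl CuratorTaskManager {
    pub fn new(scheduler_id: &str) -> Self {
        Self { scheduler_id: scheduler_id.to_string(), active: HashMap::new() }
    }

    // Register a job with this scheduler as its only curator.
    pub fn submit_job(&mut self, job_id: &str, num_tasks: usize, state: &mut SharedState) {
        let graph = ExecutionGraph {
            job_id: job_id.to_string(),
            curator: self.scheduler_id.clone(),
            tasks: vec![TaskStatus::Running; num_tasks],
        };
        state.job_status.insert(job_id.to_string(), JobStatus::Running);
        self.active.insert(job_id.to_string(), graph);
    }

    // Apply a task status update reported by an executor (item 6).
    pub fn update_task_status(
        &mut self,
        job_id: &str,
        task_idx: usize,
        status: TaskStatus,
        state: &mut SharedState,
    ) -> JobStatus {
        let graph = match self.active.get_mut(job_id) {
            Some(graph) => graph,
            None => {
                // Not curated here: the curator is presumably dead, so fail the job (item 7).
                state.job_status.insert(job_id.to_string(), JobStatus::Failed);
                return JobStatus::Failed;
            }
        };
        // Update the cached graph only; out-of-range task indexes are ignored.
        if let Some(slot) = graph.tasks.get_mut(task_idx) {
            *slot = status;
        }
        let job_status = if graph.tasks.iter().any(|t| *t == TaskStatus::Failed) {
            JobStatus::Failed
        } else if graph.tasks.iter().all(|t| *t == TaskStatus::Completed) {
            JobStatus::Completed
        } else {
            JobStatus::Running
        };
        // Write to shared state and evict from the cache only when the job finishes.
        if job_status != JobStatus::Running {
            state.job_status.insert(job_id.to_string(), job_status);
            self.active.remove(job_id);
        }
        job_status
    }
}
```

In this sketch, intermediate task updates never leave the curator's memory, which is where the saving on serialization and deserialization would come from. Failing the whole job when an update reaches a non-curator is the simple first-stage behavior described in item 7.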

Describe alternatives you've considered

Additional context

@yahoNanJing yahoNanJing added the enhancement New feature or request label Aug 12, 2022
@yahoNanJing yahoNanJing changed the title refine the scheduler state cache layer Introduce CuratorTaskManager for make an active job be curated by only one scheduler Aug 17, 2022