Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AIP-44 Remove job_runner from BaseJob as a final step of decoupling #30325

Closed
potiuk opened this issue Mar 27, 2023 · 0 comments · Fixed by #30376
Closed

AIP-44 Remove job_runner from BaseJob as a final step of decoupling #30325

potiuk opened this issue Mar 27, 2023 · 0 comments · Fixed by #30376
Assignees
Labels
AIP-44 Airflow Internal API

Comments

@potiuk
Copy link
Member

potiuk commented Mar 27, 2023

As discussed in #30255 (comment) and #30255 (comment) - I think eventually (after #30308 especially, we should be able to get rid of the job_runner back-reference to the job_runner in the BaseJob - this will simplify the relation.

It will be easier however to do it in those steps we have now (each of those changes is relatively small (in terms of base code) and separately reviewable - and we can make sure Airlfow continues to work after each of those steps which makes it easier to implement and merge those changes faster):

#30255 -> #30302 -> #30308 -> this one

CC: @uranusjr @mhenc

@potiuk potiuk converted this from a draft issue Mar 27, 2023
@potiuk potiuk changed the title AIP-44 Remove runner from BaseJob as a final step of decoupling AIP-44 Remove job_runner from BaseJob as a final step of decoupling Mar 27, 2023
@potiuk potiuk added the AIP-44 Airflow Internal API label Mar 27, 2023
@potiuk potiuk self-assigned this Mar 27, 2023
@potiuk potiuk moved this from 📋 Backlog to 🔖 Ready in AIP-44 - Internal API Mar 27, 2023
@potiuk potiuk moved this from 🔖 Ready to 🏗 In progress in AIP-44 - Internal API Mar 30, 2023
potiuk added a commit to potiuk/airflow that referenced this issue Apr 11, 2023
This is the final step of decoupling of the job runner from ORM
based BaseJob. After this change, finally we rich the state that
the BaseJob is just a state of the Job being run, but all
the logic is kept in separate "JobRunner" entity which just
keeps the reference to the job. Also it makes sure that
job in each runner is defined as appropriate for each job type:

* SchedulerJobRunner, BackfillJobRunner can only use BaseJob
* DagProcessorJobRunner, TriggererJobRunner and especially the
  LocalTaskJobRunner can keep both BaseJob and it's Pydantic
  BaseJobPydantic representation - for AIP-44 usage.

The highlights of this change:

* Job does not have job_runner reference any more
* Job is a mandatory parameter when creating each JobRunner
* run_job method takes as parameter the job (i.e. where the state
  of the job is called) and executor_callable - i.e. the method
  to run when the job gets executed
* heartbeat callback is also passed a generic callable in order
  to execute the post-heartbeat operation of each of the job
  type
* there is no more need to specify job_type when you create
  BaseJob, the job gets its type by a simply creating a runner
  with the job

This is the final stage of refactoring that was split into
reviewable stages: apache#30255 -> apache#30302 -> apache#30308 -> this PR.

Closes: apache#30325
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in AIP-44 - Internal API Apr 11, 2023
potiuk added a commit that referenced this issue Apr 11, 2023
This is the final step of decoupling of the job runner from ORM
based BaseJob. After this change, finally we rich the state that
the BaseJob is just a state of the Job being run, but all
the logic is kept in separate "JobRunner" entity which just
keeps the reference to the job. Also it makes sure that
job in each runner is defined as appropriate for each job type:

* SchedulerJobRunner, BackfillJobRunner can only use BaseJob
* DagProcessorJobRunner, TriggererJobRunner and especially the
  LocalTaskJobRunner can keep both BaseJob and it's Pydantic
  BaseJobPydantic representation - for AIP-44 usage.

The highlights of this change:

* Job does not have job_runner reference any more
* Job is a mandatory parameter when creating each JobRunner
* run_job method takes as parameter the job (i.e. where the state
  of the job is called) and executor_callable - i.e. the method
  to run when the job gets executed
* heartbeat callback is also passed a generic callable in order
  to execute the post-heartbeat operation of each of the job
  type
* there is no more need to specify job_type when you create
  BaseJob, the job gets its type by a simply creating a runner
  with the job

This is the final stage of refactoring that was split into
reviewable stages: #30255 -> #30302 -> #30308 -> this PR.

Closes: #30325
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AIP-44 Airflow Internal API
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

1 participant