Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split procrastinate_finish_job into two functions #336

Merged
merged 3 commits into from
Nov 13, 2020
Merged

Conversation

elemoine
Copy link
Contributor

Closes #332

This PR splits the procrastinate_finish_job SQL functions into two functions: procrastinate_finish_job and procrastinate_retry_job.

To be discussed:

For backward compatibility reasons I've kept procrastinate_finish_job as-is, and added procrastinate_finish_job_1. In this way one can apply the schema migration (the delta_0.16.0_001 SQL file in this PR) before upgrading the Procrastinate code.

I think that the schema of Procrastinate version N should work for any previous version (< N) of Procrastinate. This is to make it easy to upgrade Procrastinate, and in a "blue/green" way. And I think we should do that until 1.0. In 1.0 we would remove all the backward compatibility stuff from the schema. For example, we would rename procrastinate_finish_job_1 to procrastinate_finish_job, and remove the old procrastinate_finish_job function. And we would apply the same backward compatibility principle within the 1.x.y series (until 2.0.0).

Successful PR Checklist:

  • Tests
    • (not applicable?)
  • Documentation
    • (not applicable?)
  • Had a good time contributing?

@codecov
Copy link

codecov bot commented Oct 16, 2020

Codecov Report

Merging #336 (ac0807a) into master (3c376dd) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #336   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           29        29           
  Lines         1570      1575    +5     
  Branches       173       172    -1     
=========================================
+ Hits          1570      1575    +5     
Impacted Files Coverage Δ
procrastinate/manager.py 100.00% <100.00%> (ø)
procrastinate/testing.py 100.00% <100.00%> (ø)
procrastinate/worker.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3c376dd...ac0807a. Read the comment docs.

await self.job_manager.finish_job(
job=job, status=status, scheduled_at=next_attempt_scheduled_at
)
if retry_at:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

juste une question svp :) : est-ce que cette gestion faite en python pourrait pas être faite plutôt dans le schema ?
et même que la procédure stockée en db fasse elle même un pg cron pour faire un notify à la date voulue ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey :) This is a public project, I know you know I speak French, but it's easier for other people if we try and stick to English ;)

As of today, determining if a task should be retried or not is done in Python, because it is an operation that depends on the Python outcome of running the task. Could you tell us more on what you're suggesting ? Also, I'm not really sure it belongs to this PR, but maybe I misunderstood your comment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello. yes sorry ;) just wondering if the logic to retry of not could be handled in the
procrastinate_finish_job stored procedure. So that, depending if next_scheduled_at is null or not , the procedure would call procrastinate_finish_job_1 or procrastinate_retry_job .
Moreother, inside the procrastinate_retry_job, we could imagine that a notify (call to procrastinate_notify_queue()) could be done at the right next_scheduled_at thanks to a pg cron ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wondering if the logic to retry of not could be handled in the procrastinate_finish_job stored procedure. So that, depending if next_scheduled_at is null or not , the procedure would call procrastinate_finish_job_1 or procrastinate_retry_job

Well, that's basically how it's implemented currently (before this PR). Except that the the status should always be passed, even when scheduled_at is specified. See https://github.com/peopledoc/procrastinate/blob/ddd2d19794227b4178084d59f56f3f38f09873c8/procrastinate/sql/schema.sql#L175-L185.

And we thought that it would make for a cleaner and more explicit API if we had two separate functions. We want to distinguish the action of "finishing a job", which involves changing its status to either "succeeded" or "failed", from the action of "retrying a job", which involves resetting its status to "todo", with a scheduling in the future.

Moreother, inside the procrastinate_retry_job, we could imagine that a notify (call to procrastinate_notify_queue()) could be done at the right next_scheduled_at thanks to a pg cron ?

Are you referring to pg_cron?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello. thanks for your answer. yes, i'm refering to pg_cron.
What i m' suggesting is something like :

CREATE FUNCTION procrastinate_finish_job (job_id integer, end_status procrastinate_job_status, next_scheduled_at timestamp with time zone) RETURNS void
BEGIN
   IF (next_scheduled_at IS NULL) 
         CALL procrastinate_finish_job_1(job_id, end_status)
  ELSE
         CALL procrastinate_retry_job(job_id, next_scheduled_at)
END

so that to minimize the code that the client has to do. The client only has to call procrastinate_finish_job
Remark : if guess we want to retry only on on failed status ?
Remark2: calling only the retry procedure with status='todo', you won't have anymore the procrastinate_events failed status triggered by ref

@elemoine elemoine force-pushed the ele_finish-job branch 6 times, most recently from 2434177 to 0016474 Compare November 13, 2020 13:29
@elemoine elemoine marked this pull request as ready for review November 13, 2020 13:30
@elemoine
Copy link
Contributor Author

This is now ready for review!

@ewjoachim
Copy link
Member

ewjoachim commented Nov 13, 2020

@@ -48,8 +48,17 @@ WHERE id IN (

-- finish_job --
-- Stop a job, free the lock and record the relevant events
-- Old version, kept for backward compatibility reasons
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- Old version, kept for backward compatibility reasons
-- Deprecated, remove after 1.0.0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No... It's actually not necessary to keep it.

queries.sql is at the version of the code, not the DB. We'll never need that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes. Good catch!

@elemoine
Copy link
Contributor Author

@ewjoachim ewjoachim merged commit 9553b9e into master Nov 13, 2020
@ewjoachim ewjoachim deleted the ele_finish-job branch November 13, 2020 14:13
@ewjoachim ewjoachim mentioned this pull request Nov 27, 2020
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Split the procrastinate_finish_job SQL function into two functions
3 participants