Skip to content
This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

RFC: Add RetryAfter to dbworker.StoreOptions #13457

Merged
merged 2 commits into from
Sep 1, 2020
Merged

Conversation

mrnugget
Copy link
Contributor

@mrnugget mrnugget commented Aug 28, 2020

In campaigns we need the ability to retry jobs multiple times. (See https://github.com/sourcegraph/sourcegraph/issues/12700#issuecomment-671798531 for additional context.)

This is what I think is the easiest-to-understand and simplest solution.

I did have another solution that involved a PreDequeue hook (that
returned the custom conditions you see here now) and boolean in the
StoreOptions to switch between AND'ing or OR'ing the custom conditions
to the selectCandidateQuery.

This felt a bit hacky. It was less code, but also easier to miss and
misudnerstand.

What do you think of this?

In campaigns we need the ability to retry jobs multiple times. (See
https://github.com/sourcegraph/sourcegraph/issues/12700#issuecomment-671798531
for additional context.)

This is what I think is the easiest-to-understand and simplest solution.

I did have another solution that involved a PreDequeue hook (that
returned the custom conditions you see here now) and boolean in the
StoreOptions to switch between AND'ing or OR'ing the custom conditions
to the selectCandidateQuery.

This felt a bit hacky. It was less code, but also easier to miss and
misudnerstand.

What do you think of this?
@mrnugget mrnugget requested review from efritz and a team August 28, 2020 15:01
@mrnugget mrnugget requested a review from emidoots as a code owner August 28, 2020 15:01
@mrnugget mrnugget removed the request for review from emidoots August 28, 2020 15:01
@codecov
Copy link

codecov bot commented Aug 28, 2020

Codecov Report

Merging #13457 into main will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main   #13457   +/-   ##
=======================================
  Coverage   51.50%   51.51%           
=======================================
  Files        1496     1496           
  Lines       83303    83315   +12     
  Branches     6798     6798           
=======================================
+ Hits        42905    42918   +13     
  Misses      36805    36805           
+ Partials     3593     3592    -1     
Flag Coverage Δ *Carryforward flag
#go 52.60% <100.00%> (+0.01%) ⬆️
#integration 29.38% <ø> (ø) Carriedforward from 4e02361
#typescript 48.56% <ø> (ø) Carriedforward from 4e02361
#unit 33.85% <ø> (ø) Carriedforward from 4e02361

*This pull request uses carry forward flags. Click here to find out more.

Impacted Files Coverage Δ
internal/workerutil/dbworker/store/store.go 73.07% <100.00%> (+2.73%) ⬆️
internal/db/helpers.go 100.00% <0.00%> (ø)
internal/gosrc/stdlib.go 100.00% <0.00%> (ø)
internal/conf/platform.go 100.00% <0.00%> (ø)
internal/routevar/util.go 100.00% <0.00%> (ø)
internal/timeutil/week.go 100.00% <0.00%> (ø)
internal/vcs/git/mocks.go 100.00% <0.00%> (ø)
internal/db/query/query.go 100.00% <0.00%> (ø)
internal/highlight/mocks.go 100.00% <0.00%> (ø)
internal/routevar/regexp.go 100.00% <0.00%> (ø)
... and 737 more

Copy link
Contributor

@efritz efritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm think I'm good with this approach. Can we add a test that hits the new query?

@mrnugget
Copy link
Contributor Author

I'm think I'm good with this approach. Can we add a test that hits the new query?

Yeah, absolutely. I played around with this manually/interactively (prefer that over tests when doing SQL), so I can confirm it Works On My Machine At Least ™️.

If you think this approach is good, though, and I shouldn't move this to somewhere else, I'm going to add tests.

Copy link
Member

@eseliger eseliger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach LGTM, but have one inline question re fairness

internal/workerutil/dbworker/store/store.go Show resolved Hide resolved
internal/workerutil/dbworker/store/store.go Show resolved Hide resolved
@efritz
Copy link
Contributor

efritz commented Aug 28, 2020

If you think this approach is good, though, and I shouldn't move this to somewhere else, I'm going to add tests.

Nah I think the behavior described here is good. It may be worth it to try to merge the dequeue queries together (but maybe this is more difficult than just leaving it as-is), but I think tests should be added here first because the code is currently clear.

// more than RetryAfter ago.
// If RetryAfter is a non-zero duration, the store dequeues records where
// - the state is 'errored'
// - the failed attempts counter hasn't reached MaxNumResets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_resets does not track failures but how many times the handling process loses the transaction holding the record (from processing -> queued). Maybe we should add a new field that increments each time it goes to errored?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had the same thought: retries would be better. I'm happy to add the field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can also be a post-merge enhancement since this is probably a good heuristic for now. Should definitely be done at some point but no need to pile it onto this v1 effort.

@mrnugget
Copy link
Contributor Author

It may be worth it to try to merge the dequeue queries together (but maybe this is more difficult than just leaving it as-is)

I tried this and... it wasn't nice :(

@mrnugget mrnugget force-pushed the mrn/worker-retry-after branch from 93ffcab to c417c46 Compare August 31, 2020 10:10
@mrnugget
Copy link
Contributor Author

Added tests and changed the code to only include dbworker things. I'll add the campaign specific things in another PR based on this one.

@mrnugget
Copy link
Contributor Author

mrnugget commented Sep 1, 2020

Not sure whether I can/should merge now, @efritz, so I'll hold off until I get explicit approval.

@mrnugget mrnugget added the batch-changes Issues related to Batch Changes label Sep 1, 2020
@mrnugget mrnugget added this to the 3.20 milestone Sep 1, 2020
@chrispine chrispine mentioned this pull request Sep 1, 2020
52 tasks
@mrnugget mrnugget merged commit bdfd096 into main Sep 1, 2020
@mrnugget mrnugget deleted the mrn/worker-retry-after branch September 1, 2020 12:10
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
batch-changes Issues related to Batch Changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants