Skip to content
This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

Use RetryAfter in campaigns workers #13478

Merged
merged 4 commits into from
Sep 1, 2020
Merged

Conversation

mrnugget
Copy link
Contributor

@mrnugget mrnugget commented Aug 31, 2020

This is based on and requires #13457.

In combination with #13457 this PR fixes the first and most important part of https://github.com/sourcegraph/sourcegraph/issues/12700 by adding automatic retrying of failed changesets to the reconciler.

It retries failed changesets every 5 seconds, if there are no newer, non-errored changesets to be dequeued.

The new order clause is strictly speaking not necessary yet, but I think it's a bug that we don't update a changeset's UpdatedAt when saving it back to the database. I want to tackle that in another PR.

In campaigns we need the ability to retry jobs multiple times. (See
https://github.com/sourcegraph/sourcegraph/issues/12700#issuecomment-671798531
for additional context.)

This is what I think is the easiest-to-understand and simplest solution.

I did have another solution that involved a PreDequeue hook (that
returned the custom conditions you see here now) and boolean in the
StoreOptions to switch between AND'ing or OR'ing the custom conditions
to the selectCandidateQuery.

This felt a bit hacky. It was less code, but also easier to miss and
misudnerstand.

What do you think of this?
@codecov
Copy link

codecov bot commented Aug 31, 2020

Codecov Report

Merging #13478 into mrn/worker-retry-after will decrease coverage by 0.00%.
The diff coverage is 0.00%.

@@                    Coverage Diff                     @@
##           mrn/worker-retry-after   #13478      +/-   ##
==========================================================
- Coverage                   51.51%   51.50%   -0.01%     
==========================================================
  Files                        1496     1496              
  Lines                       83315    83316       +1     
  Branches                     6798     6798              
==========================================================
- Hits                        42918    42911       -7     
- Misses                      36805    36811       +6     
- Partials                     3592     3594       +2     
Flag Coverage Δ *Carryforward flag
#go 52.58% <0.00%> (-0.02%) ⬇️
#integration 29.38% <ø> (ø) Carriedforward from c417c46
#typescript 48.56% <ø> (ø) Carriedforward from c417c46
#unit 33.85% <ø> (ø) Carriedforward from c417c46

*This pull request uses carry forward flags. Click here to find out more.

Impacted Files Coverage Δ
enterprise/internal/campaigns/workers.go 0.00% <0.00%> (ø)
.../internal/codeintel/resolvers/graphql/locations.go 81.81% <0.00%> (-3.64%) ⬇️
cmd/frontend/graphqlbackend/zoekt.go 76.08% <0.00%> (-0.82%) ⬇️
web/src/org/OrgAvatar.tsx 100.00% <0.00%> (ø)
web/src/util/features.tsx 100.00% <0.00%> (ø)
web/src/backend/graphql.ts 100.00% <0.00%> (ø)
web/src/marketing/util.tsx 100.00% <0.00%> (ø)
web/src/enterprise/main.tsx 100.00% <0.00%> (ø)
web/src/marketing/Toast.tsx 100.00% <0.00%> (ø)
shared/src/backend/errors.ts 100.00% <0.00%> (ø)
... and 735 more

@mrnugget mrnugget requested a review from a team August 31, 2020 12:08
@mrnugget mrnugget marked this pull request as ready for review August 31, 2020 12:12
OrderByExpression: sqlf.Sprintf("reconciler_state = 'errored', changesets.updated_at DESC"),

StalledMaxAge: 60 * time.Second,
MaxNumResets: 60,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will a specific error be logged that tells that retry won't happen anymore? If not, should we Wrap the last error in a crashloop error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean logged to the failure_message column, I assume? No. But I think that's possible and I'll take a look at it in when I fix the updated_at.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks!

@mrnugget mrnugget added the batch-changes Issues related to Batch Changes label Sep 1, 2020
@mrnugget mrnugget added this to the 3.20 milestone Sep 1, 2020
@chrispine chrispine mentioned this pull request Sep 1, 2020
52 tasks
Base automatically changed from mrn/worker-retry-after to main September 1, 2020 12:10
@mrnugget mrnugget requested a review from emidoots as a code owner September 1, 2020 12:10
@mrnugget mrnugget merged commit ab1da6f into main Sep 1, 2020
@mrnugget mrnugget deleted the mrn/campaigns-retry-after branch September 1, 2020 12:55
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
batch-changes Issues related to Batch Changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants