Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record failed jobs when saving the result fails (dont retry) #28629

Merged
merged 9 commits into from
Jan 23, 2019
Merged

Record failed jobs when saving the result fails (dont retry) #28629

merged 9 commits into from
Jan 23, 2019

Conversation

joelgriffith
Copy link
Contributor

@joelgriffith joelgriffith commented Jan 11, 2019

Summary

This was a fun one to dig up, for only a one-line PR :/

The TL;DR is that ES-QUEUE was swallowing errors when the final output of the work was saved back into ES. Even more precarious is when content-bodies are bigger than ES is allowed (set via http.max_content_length), and those errors couldn't be saved since the error message included the output body and that failed to save since it was too big (again).

Anywho, this marks those issues as failed immediately and save the response without re-trying. We could fork the retry-action based upon the HTTP code, but that feels pre-carious and has my spidey-sense tingling. I think it's better to fail quickly here and let folks know vs reporting retrying again and again for the same outcome.

I'm currently trying to test this in a spec, but it's turning into a rabbit-hole given the lengthy promise chain, so I wanted to gather feedback before spending more time on it!

screen shot 2019-01-14 at 8 59 36 am

Checklist

Use strikethroughs to remove checklist items you don't feel are applicable to this PR.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@tsullivan
Copy link
Member

those errors couldn't be saved since the error message included the output body and that failed to save since it was too big (again).

OMG 🙈 :D

@tsullivan
Copy link
Member

I think it's better to fail quickly here and let folks know vs reporting retrying again and again for the same outcome.

I agree with this as well

@joelgriffith
Copy link
Contributor Author

Added a test for this corner-case, should be good to go now

@joelgriffith joelgriffith added review (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead v6.7.0 v7.0.0 labels Jan 15, 2019
@elasticmachine
Copy link
Contributor

💔 Build Failed

Copy link
Member

@tsullivan tsullivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💔 Build Failed

@joelgriffith
Copy link
Contributor Author

retest

@elasticmachine
Copy link
Contributor

💔 Build Failed

@joelgriffith
Copy link
Contributor Author

I'm unclear as to why this job is hanging, but going to try and get it figured out + merged

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

const failStub = getFailStub(worker);

await worker._performJob(job);
worker.destroy();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was the missing piece.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@joelgriffith
Copy link
Contributor Author

Backport: #29211

@joelgriffith
Copy link
Contributor Author

Backported! #29211

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport pending backported (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead review v6.7.0 v7.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants