Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-475] Fail-fast writes no run_results.json on error #3600

Closed
Tracked by #7301
jtcohen6 opened this issue Jul 20, 2021 · 9 comments
Closed
Tracked by #7301

[CT-475] Fail-fast writes no run_results.json on error #3600

jtcohen6 opened this issue Jul 20, 2021 · 9 comments

Comments

@jtcohen6
Copy link
Contributor

dbt run --fail-fast does not write run_results.json if it encounters any error, for any model node, during its run.

I know --fail-fast is common in some development + CI setups. Metadata in those environments is lower-stakes than metadata from prod, but I believe it still has value.

We need to decide on and document the expected behavior. Perhaps it's right to consider --fail-fast like a keyboard interrupt mid-run, such that the invocation exits without completing.

This will also be less pressing in a future where we've instrumented dbt with a streaming event system, because a single JSON file at the end of a batch process will no longer be our only principal source of metadata.

@jtcohen6 jtcohen6 added bug Something isn't working artifacts labels Jul 20, 2021
@jtcohen6 jtcohen6 changed the title Fail-fast writes no run_results.json Fail-fast writes no run_results.json on error Jul 20, 2021
@jtcohen6
Copy link
Contributor Author

It would be desirable to write run_results.json for runs that are cancelled partway through.

Where we write those artifacts them currently, within the run step:

if flags.WRITE_JSON:
self.write_manifest()
self.write_result(result)

That result is the returned value from execute_with_hooks, which calls (in turn) execute_nodes. That's the method that both collect results and handles cancellation due to --fail-fast/KeyboardInterrupt:

except KeyboardInterrupt:
self._cancel_connections(pool)
print_run_end_messages(self.node_results, keyboard_interrupt=True)
raise
pool.close()
pool.join()
return self.node_results

Is there a way we can replumb this to write partially complete results? I think the right next step might be to spike the difficulty of that replumbing.

@jtcohen6 jtcohen6 added the jira label Apr 11, 2022
@github-actions github-actions bot changed the title Fail-fast writes no run_results.json on error [CT-475] Fail-fast writes no run_results.json on error Apr 11, 2022
@iknox-fa iknox-fa removed the bug Something isn't working label May 2, 2022
@iknox-fa
Copy link
Contributor

iknox-fa commented May 2, 2022

I think the right next step might be to spike the difficulty of that replumbing.

In BLG we all agreed. See: #5204

@iknox-fa
Copy link
Contributor

iknox-fa commented May 2, 2022

No estimate 5/2 /22 BLG, awaiting spike

@github-actions
Copy link
Contributor

github-actions bot commented Nov 2, 2022

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Nov 2, 2022
@github-actions
Copy link
Contributor

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers.

@kieronellis
Copy link

Having errors included in the run_results.json when there's a fail fast would be super helpful for my team as we are using fail fast in our CI setup - Is there any chance of this being implemented? Thanks.

@jtcohen6 jtcohen6 reopened this Feb 17, 2023
@jtcohen6
Copy link
Contributor Author

Discussing with @ChenyuLInx, an answer that doesn't feel so terrible: write run_results.json after every node completes. That way, we don't need to gracefully handle graceless exits (whether due to --fail-fast, keyboard interrupt, process termination...)

@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Apr 10, 2023

Closing in favor of #7302

Update: Reopening per #7302 (comment)

@jtcohen6 jtcohen6 reopened this Apr 24, 2023
@iknox-fa
Copy link
Contributor

Closed per #7302 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants