[Task][Go SDK]: Dataflow Go PostCommits timing out #23422

Closed
lostluck opened this issue Sep 29, 2022 · 4 comments

Comments

@lostluck
Contributor

What needs to happen?

Currently the postcommits are failing. We need to investigate where the time is being spent and, if possible, either parallelize the tests better (with respect to the Beam project's Dataflow quotas), consolidate some test pipelines into single jobs (to handle quota better), or reduce the number of tests we run against Dataflow in the postcommits (e.g. remove low-signal tests).

As a stopgap, double the 2.5h timeout to 5h.

Issue Priority

Priority: 1

Issue Component

Component: sdk-go

@lostluck
Contributor Author

lostluck commented Sep 29, 2022

Specifically, we could reduce the number of pipelines by consolidating them into a single pipeline for each specific area. Then we have Dataflow filter out the individual pipelines wherever a consolidated one is used instead. This may require some test prefix renaming to simplify the filters.

var dataflowFilters = []string{

E.g. all the State pipelines could be consolidated into a single pipeline:
TestConsolidated_StateAPIPipeline

The State tests would then be renamed and prefixed, e.g. TestStateAPI_MapState.

This allows the other runners to filter out TestConsolidated.* and Dataflow to filter out TestStateAPI.* (along with an indication that those tests are covered by the consolidated test).

And similarly across common areas in the integration tests.
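
As a rough illustration of the prefix scheme above, here is a minimal, self-contained sketch. It is not the actual Beam integration filter code: the names dataflowSkips, otherRunnerSkips, and skipped are hypothetical, and it assumes each filter is a regular expression matched against the full test name.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical skip lists; the patterns mirror the prefixes proposed above.
var (
	// Dataflow skips the individual state tests because the
	// consolidated pipeline already covers them.
	dataflowSkips = []string{"TestStateAPI.*"}
	// Other runners skip the consolidated pipelines and keep
	// running the individual tests.
	otherRunnerSkips = []string{"TestConsolidated.*"}
)

// skipped reports whether testName matches any of the skip patterns.
// Each pattern is anchored so it has to match the whole test name.
func skipped(testName string, patterns []string) bool {
	for _, p := range patterns {
		if regexp.MustCompile("^(?:" + p + ")$").MatchString(testName) {
			return true
		}
	}
	return false
}

func main() {
	tests := []string{
		"TestConsolidated_StateAPIPipeline",
		"TestStateAPI_MapState",
	}
	for _, name := range tests {
		fmt.Printf("%-36s dataflow skip=%-5v other skip=%v\n",
			name, skipped(name, dataflowSkips), skipped(name, otherRunnerSkips))
	}
}
```

With a naming convention like this, each runner needs only one pattern per area, rather than one filter entry per test.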

@Abacn
Contributor

Abacn commented Sep 29, 2022

I've seen this before and it was transient: #23311

@lostluck
Contributor Author

Yeah, transient or not, it's a flake, which means it interrupts velocity.

Since our apache-beam-testing project for Dataflow is a shared resource and we have only so many simultaneous job slots, I'd rather get the post commit back down under an hour.

@github-actions github-actions bot added the stale label Nov 29, 2022
@damccorm damccorm removed the stale label Dec 2, 2022
@lostluck
Contributor Author

lostluck commented Sep 19, 2023

Looks like this has stabilized.

@github-actions github-actions bot added this to the 2.51.0 Release milestone Sep 19, 2023