The Go tests job is flaky #32627

github-actions · 2024-10-02T15:35:36Z

The Go tests job is failing over 50% of the time.
Please visit https://github.com/apache/beam/actions/workflows/go_tests.yml?query=is%3Afailure+branch%3Amaster to see all failed workflow runs.
See also Grafana statistics: http://metrics.beam.apache.org/d/CTYdoxP4z/ga-post-commits-status?orgId=1&viewPanel=10&var-Workflow=Go%20tests

damccorm · 2024-10-03T16:00:41Z

@lostluck @damondouglas would you mind taking a look at this one?

lostluck · 2024-10-03T17:56:42Z

There are a bunch that were failing due to staticcheck being out of date, but that's already fixed by #32614. This is probably why this issue was filed, since that made all runs fail for a day or two.

At least one run was failing due to timing out after 10m. We can always extend that.

https://github.com/apache/beam/actions/runs/11144947852/job/30973377990

TestMatchAll/Error_-_no_matches_for_glob_without_wildcard: I haven't seen this one before. It's supposed to fail with an error but didn't for some reason. Not sure what happened there. Worth looking into where prism dropped the ball here.

https://github.com/apache/beam/actions/runs/11136655545/job/30948804004

--- FAIL: TestElementChan (0.00s)
--- FAIL: TestElementChan/FillBufferThenAbortThenRead (0.00s)
datamgr_test.go:412: got sum 13, count 13, want sum 20, count 20

This one is known to be flaky. It's trying to test the harness/DataManager logic (e.g. how data gets to the actual DataSource) for dealing with certain weirder failure conditions that can't be exercised at a higher abstraction level. This leads to inconsistent data.

I can't remember the specifics, but those tests could probably be rewritten to not rely on the precise counts. If that isn't possible, they should be deleted, since it's unlikely we'd take another look at them.
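A rough sketch of what "not relying on the precise counts" could look like, assuming every element has value 1 (which the "sum 13, count 13" output suggests but does not prove). The real datamgr tests are not reproduced here; `produce`, `drain`, and `checkBounds` are hypothetical stand-ins. The idea is to assert only what must hold regardless of when the abort lands: the count stays between what was already read and what was sent, and the sum agrees with the count.

```go
package main

import "fmt"

// produce sends n elements of value 1 on the returned channel and stops
// early if abort is closed. Hypothetical stand-in for the data source.
func produce(n int, abort <-chan struct{}) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 0; i < n; i++ {
			select {
			case out <- 1:
			case <-abort:
				return
			}
		}
	}()
	return out
}

// drain reads ch until it is closed and returns the sum and count seen.
func drain(ch <-chan int) (sum, count int) {
	for v := range ch {
		sum += v
		count++
	}
	return sum, count
}

// checkBounds verifies invariants that hold for any interleaving, instead
// of one exact total: lo <= count <= hi, and sum == count because every
// element is assumed to be 1.
func checkBounds(sum, count, lo, hi int) error {
	if count < lo || count > hi {
		return fmt.Errorf("count %d outside [%d, %d]", count, lo, hi)
	}
	if sum != count {
		return fmt.Errorf("sum %d does not match count %d", sum, count)
	}
	return nil
}

func main() {
	abort := make(chan struct{})
	ch := produce(20, abort)

	// Read a known prefix, then abort while the producer is still sending.
	sum, count := 0, 0
	for i := 0; i < 5; i++ {
		sum += <-ch
		count++
	}
	close(abort)

	// How many more elements arrive after the abort depends on scheduling.
	restSum, restCount := drain(ch)
	sum, count = sum+restSum, count+restCount

	if err := checkBounds(sum, count, 5, 20); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok: totals within bounds")
}
```

The trade-off is that a bounds check catches less than an exact-count check, so this only makes sense if the exact totals really are scheduling-dependent.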

https://github.com/apache/beam/actions/runs/11050905935/job/30699726834

--- FAIL: TestElementChan (0.00s)
--- FAIL: TestElementChan/SomeTimersAndADataThenReaderThenCleanup (0.00s)
datamgr_test.go:412: got sum 3, count 2, want sum 6, count 3

https://github.com/apache/beam/actions/runs/10926875777/job/30331762994

--- FAIL: TestServer_RunThenCancel (0.00s)
server_test.go:142: server.GetState() = CANCELLING, want CANCELLED

Neat. This is the cancellation test.
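One plausible cause, not confirmed anywhere in this thread, is that the test reads the state while the cancel is still in flight. If that is what's happening, polling for the terminal state with a deadline would be more robust than a single read. A minimal sketch under that assumption: `State`, `fakeJob`, and `waitForState` are hypothetical and are not the actual server_test.go code.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// State is a hypothetical stand-in for the job state enum.
type State int32

const (
	Running State = iota
	Cancelling
	Cancelled
)

// fakeJob is a hypothetical job whose cancellation completes asynchronously.
type fakeJob struct{ state int32 }

func (j *fakeJob) GetState() State { return State(atomic.LoadInt32(&j.state)) }

// Cancel moves the job to Cancelling right away and to Cancelled after delay.
func (j *fakeJob) Cancel(delay time.Duration) {
	atomic.StoreInt32(&j.state, int32(Cancelling))
	go func() {
		time.Sleep(delay)
		atomic.StoreInt32(&j.state, int32(Cancelled))
	}()
}

// waitForState polls get until it returns want or the timeout elapses.
// It returns the last observed state and whether want was reached.
func waitForState(get func() State, want State, timeout time.Duration) (State, bool) {
	deadline := time.Now().Add(timeout)
	for {
		got := get()
		if got == want {
			return got, true
		}
		if time.Now().After(deadline) {
			return got, false
		}
		time.Sleep(time.Millisecond)
	}
}

func main() {
	j := &fakeJob{}
	j.Cancel(20 * time.Millisecond)

	// A single GetState() call here would likely observe Cancelling.
	got, ok := waitForState(j.GetState, Cancelled, 2*time.Second)
	fmt.Println("reached Cancelled:", ok, "last state:", got)
}
```

If the job is instead supposed to be CANCELLED synchronously by the time the call returns, polling would only hide a real bug, so that's worth checking first.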

Recommended actions:

We should bump the test timeout in the coverage action to 25m (it defaults to 10m).
And then investigate the ElementChan and cancellation flakes a bit.

kennknowles · 2024-11-26T16:52:12Z

@damondouglas I noticed after I opened my PR that you hold the mutex on this. Apologies. Hopefully my trivial change does not negatively impact you. If you actually aren't active on it, you could release it.

damondouglas · 2024-11-26T19:23:53Z

I had assigned myself from the last interrupts and then got pulled into another area. I'll take a look. If I can't get any insight by next week, I will release the ticket but either way submit any notes on my findings / solution here.

kennknowles · 2024-12-02T15:44:00Z

Increasing the timeout appears to have deflaked it.

github-actions bot added bug flaky_test P1 workflow_id: 14199170 labels Oct 2, 2024

damondouglas self-assigned this Oct 3, 2024

kennknowles mentioned this issue Nov 26, 2024 in "Increase Go test coverage run timeout to 25 minutes" #33223 (merged)

kennknowles closed this as completed Dec 2, 2024

github-actions bot added this to the 2.62.0 Release milestone Dec 2, 2024
