[Failing Test]: :sdks:go:test:ulrValidatesRunner appears to be very flaky at master #26061
Comments
cc: @lostluck
Note that this is happening on the Python Portable Runner, but not on any of the other validates runner suites, so it doesn't seem to be on the Go side. I was chasing this down from a different series. It's failing on artifact upload, which hasn't had any recent work on the Go side (https://github.com/apache/beam/tree/master/sdks/go/pkg/beam/artifact). I'd be concerned it was the datalayer rewrite (#25982), but that wasn't merged until the 28th, and this started happening on the 25th. (The datalayer rewrite does have a flake, but it's in a unit test, #26057, not a runner test.) But Python's portable runner doesn't seem to have anything new there either. Very confusing. But then I'd expect to see non-infra flakes in the other runner suite tests.
Same thing for some of the others. But it's very odd that it's happening to the Python runner and not Flink/Spark/Samza. It feels sort of like a gRPC thing, but again, I'm not sure why it only started recently.
That path isn't doing the correct thing with respect to the error on Send. An EOF from Send means the client should close the stream and check what the server is returning. Typically the EOF means that the server side closed the stream for some reason.
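For illustration, here is a minimal sketch of the Send handling described above. It is not the Beam artifact code: `uploadStream`, `fakeStream`, `newFailingStream`, and `sendAll` are hypothetical names, and the fake stream only imitates the gRPC-Go behaviour where a send on a stream the server has already ended returns `io.EOF` and the real status comes from the receive side.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// uploadStream is a hypothetical stand-in for a gRPC client-streaming handle.
type uploadStream interface {
	Send(chunk []byte) error
	CloseAndRecv() (string, error)
}

// sendAll sends each chunk in order. If Send reports io.EOF, the server has
// already ended the stream, so it stops sending and asks CloseAndRecv for the
// real status instead of returning the bare EOF.
func sendAll(s uploadStream, chunks [][]byte) (string, error) {
	for _, c := range chunks {
		if err := s.Send(c); err != nil {
			if errors.Is(err, io.EOF) {
				break // the real error is on the receive side
			}
			return "", fmt.Errorf("send failed: %w", err)
		}
	}
	resp, err := s.CloseAndRecv()
	if err != nil {
		return "", fmt.Errorf("upload rejected by server: %w", err)
	}
	return resp, nil
}

// fakeStream simulates a server that ends the stream after `limit` chunks
// when serverErr is set.
type fakeStream struct {
	sent      int
	limit     int
	serverErr error
}

func (f *fakeStream) Send(chunk []byte) error {
	if f.serverErr != nil && f.sent >= f.limit {
		return io.EOF
	}
	f.sent++
	return nil
}

func (f *fakeStream) CloseAndRecv() (string, error) {
	if f.serverErr != nil {
		return "", f.serverErr
	}
	return "ok", nil
}

// newFailingStream builds a fake whose server fails with msg after limit chunks.
func newFailingStream(limit int, msg string) *fakeStream {
	return &fakeStream{limit: limit, serverErr: errors.New(msg)}
}

func main() {
	s := newFailingStream(1, "hypothetical server-side failure")
	_, err := sendAll(s, [][]byte{[]byte("a"), []byte("b"), []byte("c")})
	fmt.Println(err)
}
```

With this shape, the caller sees the server's reason for closing the stream rather than a bare EOF.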
Well that's unexpected:
So the server doesn't seem to have it implemented when the SDK connects? Very strange.
That's because if the "Portable" artifact upload fails, we don't log any of those errors; we just log that the old legacy method doesn't exist. Adding that logging in to see why it's failing...
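As a rough sketch of the idea in the comment above (keeping the first failure visible when falling back), assuming two hypothetical staging strategies: `stager`, `stageWithFallback`, `failWith`, and `succeed` are illustrative names, not Beam APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// stager is a hypothetical signature for one artifact-staging strategy.
type stager func() error

// stageWithFallback tries the portable path first and then the legacy path.
// If both fail, the returned error carries both causes, so the portable
// failure is not hidden behind the legacy one.
func stageWithFallback(portable, legacy stager) error {
	pErr := portable()
	if pErr == nil {
		return nil
	}
	lErr := legacy()
	if lErr == nil {
		return nil
	}
	return fmt.Errorf("portable staging failed: %v; legacy staging failed: %w", pErr, lErr)
}

// failWith returns a stager that always fails with msg.
func failWith(msg string) stager {
	return func() error { return errors.New(msg) }
}

// succeed is a stager that always succeeds.
func succeed() error { return nil }

func main() {
	err := stageWithFallback(failWith("EOF"), failWith("method not implemented"))
	fmt.Println(err)
}
```

The error strings used in `main` are placeholders, not output captured from the failing job.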
I'm not having any luck replicating that failure right now. That's always the way once debugging is added. I suspect it's related to Jenkins machine load, so this will have to wait until next week.
What happened?
Sample failure runs: https://ci-beam.apache.org/job/beam_PreCommit_GoPortable_Cron/2996
Also failed on some in-flight PRs.
Issue Failure
Failure: Test is flaky
Issue Priority
Priority: 1 (unhealthy code / failing or flaky postcommit so we cannot be sure the product is healthy)
Issue Components