beam_PostCommit_XVR_GoUsingJava_Dataflow fails on some test transforms #21645
Comments
The previous error message appears to be a red herring, as it occurs in test logs that have successful runs. These seem to continue to execute until the container is started up. This message, on the other hand, appears to be the common denominator in failing runs:
The containerID in question is the one reported as started in the preceding log message.
@jrmccluskey is this still an issue? Are you working on it, or should we unassign?
This went on the back burner for other work, hoping to dedicate some time to it next week.
Is this fixed? Or perhaps so back-burner that it should be unassigned for someone else to grab? Is there a mitigation so that it would not impact test signal and not be a stale P1?
This suite is perma-red. Should we disable a test? I don't think we should just burn Jenkins CPU and person time triaging this repeatedly, considering how long it has been going on.
Yeah, we're probably at that point.
@chamikaramj - could you please triage this? Thanks!
There are currently 3 failing tests:
In particular, the worker log for TestXLang_Partition has
The xlang harness fails at "logging message over FnAPI".
Cham - should this be disabled? I am happy to do the PR if you want to comment and assign to me.
An example failed job:
Check failed: absl::OkStatus() == ::dist_proc::dax::PrintableStatus(status) (OK vs. generic::failed_precondition: PaneInfo truncated
Seems like UW is failing. I think there was a recent fix to UW related to this that is not in prod yet. cc: @robertwb
Any update on this P1?
Seems like just the Kafka test is failing now, while the other tests are passing. I have to look more closely to see why the Kafka Go test is failing.
14:25:33 --- PASS: TestBigtableIO_BasicWriteRead (632.92s)
It has been perma-red for a long time, though (https://ci-beam.apache.org/job/beam_PostCommit_XVR_GoUsingJava_Dataflow/). I think we should downgrade this to P2 since it seems to be more of a "new feature" and is not monitored by a human.
Closing this issue since we're off Jenkins and different issues occur now. #28339 tracks the more recent failures.
Example failure: https://ci-beam.apache.org/job/beam_PostCommit_XVR_GoUsingJava_Dataflow/7/
I couldn't find precise details about why the tests are failing, but TestXLang_Prefix, TestXLang_Multi, and TestXLang_Partition fail while running. The Dataflow logs show SDK harnesses failing to connect. For example:
However, I haven't been able to find any further details showing why the harness fails, and the tests keep running for a while after that with other errors that are also pretty inscrutable.
Imported from Jira BEAM-14214. Original Jira may contain additional context.
Reported by: danoliveira.