Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pubsub: many tests failed #8520

Closed
flaky-bot bot opened this issue Sep 2, 2023 · 4 comments · Fixed by #8526
Closed

pubsub: many tests failed #8520

flaky-bot bot opened this issue Sep 2, 2023 · 4 comments · Fixed by #8526
Assignees
Labels
api: pubsub Issues related to the Pub/Sub API. flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@flaky-bot
Copy link

flaky-bot bot commented Sep 2, 2023

Many tests failed at the same time in this package.

  • I will close this issue when there are no more failures in this package and
    there is at least one pass.
  • No new issues will be filed for this package until this issue is closed.
  • If there are already issues for individual test cases, I will close them when
    the corresponding test passes. You can close them earlier, if you prefer, and
    I won't reopen them while this issue is still open.

Here are the tests that failed:


commit: a05b6ed
buildURL: Build Status, Sponge
status: failed

@flaky-bot flaky-bot bot added flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Sep 2, 2023
@product-auto-label product-auto-label bot added the api: pubsub Issues related to the Pub/Sub API. label Sep 2, 2023
@flaky-bot flaky-bot bot added the flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. label Sep 2, 2023
@flaky-bot
Copy link
Author

flaky-bot bot commented Sep 2, 2023

Looks like this issue is flaky. 😟

I'm going to leave this open and stop commenting.

A human should fix and close this.


When run at the same commit (a05b6ed), this test passed in one build (Build Status, Sponge) and failed in another build (Build Status, Sponge).

@hongalex
Copy link
Member

hongalex commented Sep 5, 2023

This seems to be failing on one test panic that causes the remaining tests to fail:

=== RUN   TestIntegration_OrderedKeys_JSON
    integration_test.go:1311: CreateTopic error: key18: got -, want +
        	  []string{
          	"message6019",
        + 	"message6019",
          }
panic: sync: negative WaitGroup counter

@hongalex
Copy link
Member

hongalex commented Sep 5, 2023

In the above, the error string is incorrectly named, it should actually be related to VerifyKeyOrdering.

In addition, the panic is caused by the negative WaitGroup, which itself was caused by a message redelivery. This makes sense given that Pub/Sub is at-least once delivery by default. However, it's strange that this is happening, given that messages should be deduped by this CL.

@hongalex
Copy link
Member

hongalex commented Sep 5, 2023

Some additional findings

  1. This has happened before in pubsub: many tests failed #8333, but that occurred in July which means it happens fairly infrequently
  2. Running this continually locally does not reproduce the issue, so there might be additional concerns with tests running in noisier machines that doesn't happen locally
  3. As mentioned above, message redelivery is possible for Pub/Sub, but we are deduping via message ID (which is unique per message and doesn't change during redeliveries). The fact that we are receiving multiple messages with the same data but different ID is potentially because Publish was called twice with the same data, perhaps because of retries.

For now, I will resolve this issue by cleaning up this test and deduping via message data rather than ID (which is guaranteed to be unique per the test data). We should also log any Publish fails inside a goroutine (can't use test failure inside goroutine, but logging should be sufficient).

@hongalex hongalex added priority: p2 Moderately-important priority. Fix may not be included in next release. and removed priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Sep 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: pubsub Issues related to the Pub/Sub API. flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant