[Bug]: JMSIO drops messages when autoscaling down. #30054

Closed
3 of 16 tasks
jlampek opened this issue Jan 19, 2024 · 1 comment

Comments


jlampek commented Jan 19, 2024

What happened?

Steps to Reproduce

  1. Create a Dataflow job that uses JMSIO to read from an MQ queue.
  2. Configure the job to use Streaming Engine, with the initial number of workers set to 5 and the minimum set to 1.
  3. Use the DefaultAutoscaler so that the workers scale down from 5 to 1 when Dataflow cannot determine the backlog.
  4. Publish messages to the MQ topic.

Expected Behavior
Workers should scale down without dropping any messages.

Current Behavior

  1. The Dataflow job consistently drops transactions at the moment the workers autoscale down.
  2. The GCP Dataflow logs do not indicate that the runner is downscaling.

Issue Priority

Priority: 3 (minor)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
Contributor

Abacn commented Jan 30, 2024

May or may not be related to #25945. I suspected that using a connection pool that is closed only in tearDown could cause problems (see #25945 (comment)). If published messages are cached and only flushed when the session is closed, data loss might happen, because teardown is not guaranteed to be called.
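
To illustrate the concern, here is a minimal sketch in plain Java. It is not Beam's JMSIO code: `BufferingSession`, `TeardownOnlyWriter`, `BundleFlushingWriter`, and `TeardownDemo` are hypothetical names, and the session that buffers until flush or close is an assumption made for the simulation. The method names `processElement`, `finishBundle`, and `teardown` only mirror the DoFn lifecycle steps. The sketch models a worker that is removed after a bundle finishes but before teardown runs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a session that buffers sends until flush or close. */
class BufferingSession {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> broker; // messages that actually reached the broker

    BufferingSession(List<String> broker) {
        this.broker = broker;
    }

    void send(String message) {
        buffer.add(message);
    }

    /** Pushes everything buffered so far to the broker. */
    void flush() {
        broker.addAll(buffer);
        buffer.clear();
    }

    /** Closing also flushes, but only if close is actually called. */
    void close() {
        flush();
    }
}

/** Writer that relies on teardown alone to flush (the suspected problem). */
class TeardownOnlyWriter {
    private final BufferingSession session;

    TeardownOnlyWriter(BufferingSession session) {
        this.session = session;
    }

    void processElement(String message) {
        session.send(message);
    }

    void finishBundle() {
        // Nothing is flushed here; buffered messages wait for teardown.
    }

    void teardown() {
        session.close();
    }
}

/** Writer that flushes at the end of every bundle. */
class BundleFlushingWriter {
    private final BufferingSession session;

    BundleFlushingWriter(BufferingSession session) {
        this.session = session;
    }

    void processElement(String message) {
        session.send(message);
    }

    void finishBundle() {
        session.flush();
    }

    void teardown() {
        session.close();
    }
}

class TeardownDemo {
    /** Simulates a worker removed after finishBundle; teardown is never called. */
    static List<String> runTeardownOnly(List<String> input) {
        List<String> broker = new ArrayList<>();
        TeardownOnlyWriter writer = new TeardownOnlyWriter(new BufferingSession(broker));
        for (String message : input) {
            writer.processElement(message);
        }
        writer.finishBundle();
        return broker; // worker disappears here, so the buffer is lost
    }

    /** Same scenario, but the writer flushes in finishBundle. */
    static List<String> runBundleFlushing(List<String> input) {
        List<String> broker = new ArrayList<>();
        BundleFlushingWriter writer = new BundleFlushingWriter(new BufferingSession(broker));
        for (String message : input) {
            writer.processElement(message);
        }
        writer.finishBundle();
        return broker;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "b", "c");
        System.out.println("teardown-only delivered: " + runTeardownOnly(input));
        System.out.println("bundle-flush delivered:  " + runBundleFlushing(input));
    }
}
```

In this simulation the teardown-only writer delivers nothing when the worker goes away, while the bundle-flushing writer delivers every message. Whether JMSIO actually buffers this way is exactly the open question in the comment above.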
