[Bug]: KinesisIO source on FlinkRunner initializes the same splits twice #31313
Adding a dump of the replication Flink code here: pom.xml and the Java class.
I suppose this is a (similar, but) different issue, probably caused by the same underlying bug. #30903 fixed Impulse only. Does using --experiments=beam_fn_api help?
@je-ik Hi, yes, that did fix the issue. Thank you! For my understanding, what does this option do exactly? And should I expect any performance degradation?
I am actually noticing a lot of backpressure using this approach, despite downstream operators having low CPU usage. Is the fix for the root cause relatively straightforward, in which case I could implement it in a forked version of the repo, or is it more involved?
I don't know the root cause; it seems that Flink does not send the snapshot state after restore from savepoint. I observed this on the Impulse (I suspected it affects only bounded sources running in unbounded mode, but it seems that is not the case). It might be a Beam bug or a Flink bug.
The flag turns on a different expansion of the Read transform - it uses splittable DoFn (SDF), which uses Impulse, which was fixed earlier. Performance should be similar to the classical Read.
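For illustration, a minimal sketch of enabling that expansion programmatically, assuming the option discussed above is the beam_fn_api experiment named later in this thread (it can equally be passed on the command line as --experiments=beam_fn_api):

```java
import org.apache.beam.runners.flink.FlinkPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.ExperimentalOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class EnableSdfRead {
  public static void main(String[] args) {
    FlinkPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(FlinkPipelineOptions.class);
    // Assumption: beam_fn_api is the experiment that switches Read to the SDF expansion.
    ExperimentalOptions.addExperiment(options.as(ExperimentalOptions.class), "beam_fn_api");
    Pipeline pipeline = Pipeline.create(options);
    // ... build the rest of the pipeline here ...
    pipeline.run();
  }
}
```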
Can you please provide a minimal example and setup to reproduce the behavior?
You can drain the Pipeline; see https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/cli/#terminating-a-job
This is related to how Flink computes target splits. It is affected by the maximal parallelism (which is computed automatically if not specified). You can try increasing it via …
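A minimal sketch of one way to do that, assuming the elided setting is the maxParallelism option on FlinkPipelineOptions (Flink itself exposes a related pipeline.max-parallelism configuration):

```java
import org.apache.beam.runners.flink.FlinkPipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class SetMaxParallelism {
  public static void main(String[] args) {
    FlinkPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(FlinkPipelineOptions.class);
    // Assumption: pin the pipeline-wide maximum parallelism instead of letting
    // Flink derive it automatically; 128 is only an illustrative value.
    options.setMaxParallelism(128);
    options.setParallelism(4);
    // ... create and run the pipeline with these options ...
  }
}
```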
Thanks for the suggestions, I will give them a try. I believe the first comment of the ticket provides a simple pipeline that exhibits this behavior on the Flink runner, but if that doesn't work, I'm happy to provide another. The example also submits the job in detached mode, which may be related, although I have seen similar behavior without it. I appreciate your help looking into this; if there's anything I can assist with, please let me know.
Just to mimic the local setup I used: I ran …, then stopped the job with a savepoint, and then restarted using run with the savepoint path. When doing this, I looked inside the task manager logs and searched for …. I also switched to Kafka and noticed the same behavior, so it seems to be related to the runner. I was unable to fix the performance issues with beam_fn_api and noticed the backpressure was causing my data to come in waves. Looking at a CPU chart, it was very cyclic, with peaks of 99% CPU and troughs of 8% CPU, leading me to believe that this pipeline option was causing some sort of build-up and then a rush of data that spiked the CPU. I can make do with Kafka offset commits for now, but if there are any pointers on how to fix this in the Beam source code, I'd be happy to take a look and even submit a PR to be included in version 2.57. Although I'm still hoping the issue is somewhere on my end and can be fixed fairly easily.
Hi @akashk99, just to be sure, do you observe the same behavior when not using …?
Hi @je-ik, I was just able to reproduce the issue by manually running the jar file. I started the job by running …
This was a few seconds after the job was submitted. I trimmed the output, but these two logs were there for all of my shards.
Hi, we are seeing the same behavior on our pipeline.
In our case we are using: …
We hit the same issue; is there any workaround available?
@weijiequ Do you use …?
Hi @je-ik, I tried with …. Taking a further look into the stack trace, the first time splits are added is at …. Then the normal code path of adding splits (the same as a run without a savepoint) will add a duplicate split again. How about adding a duplicate check by split id here?
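For illustration, a hypothetical sketch of such a duplicate check; the class and interface names here are made up for the example and are not the actual Beam Flink source-reader API:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper that drops splits whose id has already been seen, so a split
// restored from savepoint state is not added to the reader a second time.
public class DeduplicatingSplitAdder {

  /** Minimal split abstraction for the sketch; the real split type exposes its own id. */
  public interface HasSplitId {
    String splitId();
  }

  private final Set<String> knownSplitIds = new HashSet<>();

  /** Returns only the splits that have not been added before, in input order. */
  public <T extends HasSplitId> List<T> filterNewSplits(List<T> splits) {
    List<T> fresh = new ArrayList<>();
    for (T split : splits) {
      if (knownSplitIds.add(split.splitId())) { // true only the first time this id is seen
        fresh.add(split);
      }
      // else: duplicate split id, skip it (a real implementation would log this)
    }
    return fresh;
  }
}
```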
I just found a possibly related SO question: https://stackoverflow.com/questions/79088562/flink-job-processes-kafka-messages-twice-after-jobmanager-failover-in-ha-mode Could you verify whether the behavior is the same in your case? I.e., does killing the TMs after the savepoint initialize the splits only once (because they then start from a checkpoint, not a savepoint)?
Looks like a very similar issue. We do a normal shutdown with a savepoint and then restore from the savepoint. It causes the duplicate-message issue (all new messages produced after the restart are consumed twice; existing messages are fine).
@je-ik Verified, it works, thank you!!
@je-ik After applying this change, I noticed a side effect: when I scale out my job (for example, increasing the parallelism from 4 to 8) and then restart from a savepoint, the additional splits (indexes 4, 5, 6, 7) never start, while the original four splits (indexes 0, 1, 2, 3) continue running as expected.
Good catch! 👍
Thanks @je-ik, once you have the updated patch, I can also verify locally.
@weijiequ can you try setting …?
@je-ik Confirmed with …
You can actually use a lower number, something that can fit the maximal scale you expect to reach.
Got it. To double-confirm the suggested solution at the moment: …
Yes. |
What happened?
Bug description

Setup details:
- KinesisIO source (beam-sdks-java-io-amazon-web-services2)

Bug details:
org.apache.beam.sdk.io.aws2.kinesis.KinesisReader is assigned the same splits twice, once with snapshot state and once without. This leads to duplicate data being processed.

Replication steps: …
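For illustration only, a minimal sketch of the kind of pipeline involved; the stream name and the logging step are placeholders, not the actual replication code attached in the first comment:

```java
import org.apache.beam.runners.flink.FlinkPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.aws2.kinesis.KinesisIO;
import org.apache.beam.sdk.io.aws2.kinesis.KinesisRecord;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import software.amazon.kinesis.common.InitialPositionInStream;

public class KinesisDuplicateSplitsRepro {
  public static void main(String[] args) {
    FlinkPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(FlinkPipelineOptions.class);
    Pipeline pipeline = Pipeline.create(options);

    pipeline
        .apply("ReadKinesis",
            KinesisIO.read()
                .withStreamName("my-stream") // placeholder stream name
                .withInitialPositionInStream(InitialPositionInStream.LATEST))
        .apply("LogRecords", ParDo.of(new DoFn<KinesisRecord, Void>() {
          @ProcessElement
          public void process(@Element KinesisRecord record) {
            // After restoring from a savepoint, duplicates show up as the same
            // sequence number being printed twice.
            System.out.println(record.getSequenceNumber());
          }
        }));

    pipeline.run();
  }
}
```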
Logs:
Shards shardId-000000000000 to shardId-000000000003 are first initialized with checkpoint state AFTER_SEQUENCE_NUMBER (correct), and then again with AT_TIMESTAMP (not correct).

Issue Priority
Priority: 3 (minor)
Issue Components