Sync job fails/retries itself after successfully transferring all the data. #5870
Comments
@sherifnada I'm also experiencing this randomly now on subsequent syncs.
Reading the logs, I can't see anything wrong with the sync itself, which makes it look like a worker coordination issue. @danieldiamond can you share logs from your failing instance? cc @jrhizor do you have any ideas about what might be going on here?
Looking back at these logs, they actually start with the error,
but the job continues ahead and appears to "succeed" whilst still causing a retry.
@danieldiamond what's the sync frequency on this one? This sounds like the sync waited "too long" and now the data is no longer in the binlog. Alternatively, are you re-using the same CDC source in multiple connections, potentially having overlapping consumers of that binlog?
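The "waited too long" failure mode described above can be sketched simply: CDC resumes from a saved binlog position, and if the gap since the last successful sync exceeds the server's binlog retention window, the needed events have already been purged. A minimal illustration in Python (the retention and interval values are hypothetical, not taken from this issue):

```python
from datetime import timedelta

# Sketch of the failure mode: a CDC source resumes from a saved binlog
# position. If the last successful sync is older than the binlog retention
# window, the events it needs have been purged and the read fails.
def binlog_position_available(time_since_last_sync: timedelta,
                              binlog_retention: timedelta) -> bool:
    return time_since_last_sync <= binlog_retention

# Hypothetical values: one-day retention; syncs 6 hours vs. 3 days apart.
print(binlog_position_available(timedelta(hours=6), timedelta(days=1)))  # True
print(binlog_position_available(timedelta(days=3), timedelta(days=1)))   # False
```

In other words, either the sync frequency must stay comfortably inside the retention window, or retention must be raised on the source database.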
attached sync timestamps
Separately, on point 2: I do have multiple connections with the same CDC source to the same destination. Is this not allowed? I run these multiple connections at the same time and they appeared to work as expected, although I run into the "hanging" issue, where it doesn't actually COPY the data after reading it.
They shouldn't send into the same schema, otherwise you'll get overwrites, but it shouldn't cause the source connector to fail in either case.
@sherifnada @karinakuz are there any plans to fix this?
Seeing the same error with BigQuery as a destination. The data is in the BQ dataset, and it looks like dbt was also run; however, I'm seeing logs like this in the attempts marked as failed:
I had the same thing with Klaviyo - MongoDB. Here are the logs:
Seeing the same with Zendesk on a medium-size sync of 70k records (although slow, 1h41m).
FYI you should be able to fix this in the current state by upping the
@lukeolson13 thanks, we'll give it a try by manually applying the fix from #10614
Another failure. It has to be something else that's deleting the pod. We're running on Kubernetes in GCP; maybe there is another default sweeper there.
I had this happening in different jobs.
The problem is not solved by changing that setting.
@alvaroqueiroz for us it ended up being because of the GKE GC kicking in: #10934 |
Facing the same issue: the second retry finished successfully, but it caused duplications in the destination, since the cursor didn't update once the first job failed despite having successfully ingested the data... Any estimate of when this will get priority? Or any suggestions for a workaround?
It seems I'm facing the same issue in the latest version of Airbyte (0.40.18). I am using the SQL Server (mssql) connector. The retry succeeds, and I am handling duplicates using dbt in Snowflake. Error: 2022-08-10 22:17:50 INFO i.a.w.p.KubePodProcess(destroy):662 - (pod: product-analytics / source-mssql-read-16787-0-jgvhm) - Destroying Kube process.
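For readers dealing with the duplicates described above (a failed-then-retried sync re-ingesting rows the destination already has), the deduplication a dbt model performs can be sketched in plain Python: keep only the latest copy of each row per primary key, using the emission timestamp. The field names `id` and `_emitted_at` are hypothetical stand-ins, not the actual schema from this issue:

```python
# Sketch of dedup-by-latest-emission, the logic a dbt dedup model applies
# after a retried sync re-ingests rows. Keys and field names are hypothetical.
def dedupe(rows):
    latest = {}
    for row in rows:
        key = row["id"]
        # Keep whichever copy of this primary key was emitted most recently.
        if key not in latest or row["_emitted_at"] > latest[key]["_emitted_at"]:
            latest[key] = row
    return list(latest.values())

rows = [
    {"id": 1, "_emitted_at": 100, "v": "a1"},
    {"id": 1, "_emitted_at": 200, "v": "a2"},  # duplicate from the retry
    {"id": 2, "_emitted_at": 100, "v": "b1"},
]
print(dedupe(rows))
```

In dbt this is typically expressed with a window function (`row_number() over (partition by id order by _emitted_at desc)` keeping row 1), but the effect is the same.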
I think the error is misleading. I increased the CPU of the pod and no longer see the error message/failed attempt.
This should have been resolved long ago.
Environment
Current Behavior
Sync job fails after successfully transferring all the data. The Kubernetes pods are terminated gracefully with Completed status.
Expected Behavior
Sync job should not fail after successfully transferring all data.
Logs
logs-185-0.txt
Steps to Reproduce
Are you willing to submit a PR?