refactor: Respect source transaction boundaries #2335

Merged
merged 3 commits into from
Feb 9, 2024
Conversation

chubei
Contributor

@chubei chubei commented Jan 19, 2024

This PR changes the mechanism of pipeline Commit message generation.

Connector behaviour change

- A connector's state used to be defined per table, as HashMap<String, TableState>. It is now a single SourceState for the whole source.
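
A rough sketch of the shape change, for readers who have not opened the diff. Only the names `TableState`, `SourceState` and the `HashMap<String, TableState>` type come from the description above; the fields, the variants and the `collapse` helper are hypothetical and are not taken from this PR.

```rust
use std::collections::HashMap;

/// Hypothetical per-table checkpoint (old shape). The `position` field is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct TableState {
    pub position: u64,
}

/// Old shape: one checkpoint per table name.
pub type PerTableState = HashMap<String, TableState>;

/// New shape: one checkpoint for the whole source. The variants are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum SourceState {
    NotStarted,
    At { position: u64 },
}

/// Illustration only: collapse per-table checkpoints into a single one by
/// taking the earliest position, so a resume would not skip any table's data.
pub fn collapse(old: &PerTableState) -> SourceState {
    match old.values().map(|t| t.position).min() {
        Some(position) => SourceState::At { position },
        None => SourceState::NotStarted,
    }
}
```

With a single state per source, a transaction that touches several tables maps to one resume position instead of several.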

Pipeline behaviour change

Commit used to be generated whenever a source had ingested a certain number of operations, or a certain amount of time had passed. Now it is generated only when all sources are at a transaction boundary at the same time.

Note that if there are multiple sources and data keeps flowing in, it's possible that a pipeline Commit is never generated, because when source A is at a transaction boundary, source B isn't necessarily at one too.
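
The rule above can be sketched as a small gate that tracks, per source, whether the last thing it saw closed a transaction. This is a minimal illustration under stated assumptions, not the code in this PR: `CommitGate` and its methods are hypothetical names, and sources are assumed to start at a boundary.

```rust
use std::collections::HashMap;

/// Hypothetical tracker for the "all sources at a boundary" rule.
#[derive(Debug, Default)]
pub struct CommitGate {
    // source name -> true while the source is between transactions
    at_boundary: HashMap<String, bool>,
}

impl CommitGate {
    /// Assumption: a source that has ingested nothing yet counts as being at a boundary.
    pub fn new<I, S>(sources: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        Self {
            at_boundary: sources.into_iter().map(|s| (s.into(), true)).collect(),
        }
    }

    /// The source ingested an operation, so it is now inside a transaction.
    pub fn on_operation(&mut self, source: &str) {
        if let Some(flag) = self.at_boundary.get_mut(source) {
            *flag = false;
        }
    }

    /// The source finished a transaction. Returns true when every source is
    /// at a boundary, i.e. when a pipeline Commit may be generated.
    pub fn on_transaction_end(&mut self, source: &str) -> bool {
        if let Some(flag) = self.at_boundary.get_mut(source) {
            *flag = true;
        }
        self.at_boundary.values().all(|at| *at)
    }
}
```

If sources "a" and "b" keep starting new transactions before the other one finishes, `on_transaction_end` keeps returning false, which is the starvation case described above.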

@karolisg I need your review on the postgres connector.
@abcpro1 I need your review on the mysql connector.

@karolisg
Contributor

For mysql, I think we need to send a commit on XID_EVENT.

For example
[Image: Screenshot 2024-01-19 at 17 31 31]
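
For context, MySQL writes an XID_EVENT to the binlog when a transaction on a transactional engine such as InnoDB commits, so it is a natural marker for a transaction boundary. A minimal sketch of the idea follows, using hypothetical types rather than the connector's real ones: `BinlogEvent`, `IngestionMessage` and `to_messages` are all made up for illustration.

```rust
/// Hypothetical, simplified view of binlog events; not the connector's real types.
#[derive(Debug, Clone, PartialEq)]
pub enum BinlogEvent {
    RowChange { table: String },
    Xid(u64),
    Other,
}

/// Hypothetical messages a connector could hand to the pipeline.
#[derive(Debug, Clone, PartialEq)]
pub enum IngestionMessage {
    Operation { table: String },
    TransactionEnd { xid: u64 },
}

/// Forward row changes as operations and turn each XID event into a
/// transaction boundary; every other event is ignored in this sketch.
pub fn to_messages(events: &[BinlogEvent]) -> Vec<IngestionMessage> {
    events
        .iter()
        .filter_map(|event| match event {
            BinlogEvent::RowChange { table } => Some(IngestionMessage::Operation {
                table: table.clone(),
            }),
            BinlogEvent::Xid(xid) => Some(IngestionMessage::TransactionEnd { xid: *xid }),
            BinlogEvent::Other => None,
        })
        .collect()
}
```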

@chubei
Contributor Author

chubei commented Jan 19, 2024

> For mysql, I think we need to send a commit on XID_EVENT.

Can you help fix that?

@karolisg
Contributor

> > For mysql, I think we need to send a commit on XID_EVENT.
>
> Can you help fix that?

Sure

@chubei chubei linked an issue Jan 19, 2024 that may be closed by this pull request
@chubei chubei enabled auto-merge January 22, 2024 13:06
@Jesse-Bakker
Contributor

In stateless replication, every DAG will be a path, so the case of commits never happening won't occur, right?

@chubei
Contributor Author

chubei commented Jan 22, 2024

> In stateless replication, every DAG will be a path, so the case of commits never happening won't occur, right?

Not if we start doing denormalization.

auto-merge was automatically disabled February 3, 2024 14:24

Merge queue setting changed

@chubei chubei merged commit 02c44ea into getdozer:main Feb 9, 2024
4 checks passed
@chubei chubei deleted the refactor/transaction branch February 9, 2024 04:31
Development

Successfully merging this pull request may close these issues.

Transactional pipeline