Add initial cdc options for sql databases #1643
stop_event: Event,
identifier: 'str' = '',
timeout: t.Optional[float] = None,
strategy: t.Dict = {'strategy': 'polling', 'options': {'frequency': 3600, 'auto_increment_field': None}}
How does this get injected into the class? From CFG?
Uhm... good question.
No, it's not injected from CFG.
So, is the strategy to use the current CDC code, but simply with a new producer?
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1643       +/-   ##
===========================================
- Coverage   80.33%   67.42%   -12.92%
===========================================
  Files          95      118       +23
  Lines        6602     8371     +1769
===========================================
+ Hits         5304     5644      +340
- Misses       1298     2727     +1429
@blythed exactly!
Great work.
I think we should add a simple integration test for the fault tolerance of CDC tasks, for example how a task behaves when it has fetched the latest batch of data and prediction then fails on two of the records.
It doesn't have to be added now; it can be tracked as a TODO item.
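One possible shape for such a test, as a rough sketch only: `process_batch` and `flaky_predict` below are hypothetical stand-ins, not part of superduperdb, and the sketch assumes the desired behaviour is that a failing record is recorded and skipped rather than aborting the whole batch. The PR does not say which behaviour is intended, so the assertions would need to follow whatever is decided.

```python
def process_batch(records, predict):
    """Apply `predict` to each record; collect failures instead of aborting.

    Hypothetical helper standing in for the per-batch work of a CDC task.
    """
    results, failed = {}, {}
    for record_id, payload in records:
        try:
            results[record_id] = predict(payload)
        except Exception as exc:  # isolate the failure to this one record
            failed[record_id] = repr(exc)
    return results, failed


def flaky_predict(payload):
    """Hypothetical model call that fails on two specific inputs."""
    if payload in ("bad-1", "bad-2"):
        raise ValueError(f"cannot predict on {payload}")
    return payload.upper()


# A freshly fetched batch in which two records will fail at prediction time.
batch = [(1, "ok-1"), (2, "bad-1"), (3, "ok-2"), (4, "bad-2")]
results, failed = process_batch(batch, flaky_predict)
```

An integration test would replace the in-memory batch with rows inserted into a real table and check the same two things: the healthy records were processed, and the failing ones were reported without stopping the task.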
Looking really good!
Some questions:
How do we activate this class via configuration?
Can we document the developer journey for setting up this class to work with some SQL database?
@blythed so the following is the basic usage:
from superduperdb import superduper
db = superduper('SQL/URI')
db.cdc.start()
# or
table = Table('my_table')
strategy = PollingStrategy(type='incremental', frequency=0.5, auto_incremental_field='id_field')
# Here `type` can be either:
# `incremental`, meaning the user has an incremental field in the table, or
# `join_id`, meaning the user does not have an incremental field in the table, so we create a separate
# metadata table where we store processed ids and do a left anti join against the user table.
db.cdc.listen(on=table, strategy=strategy)
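To make the two polling modes concrete, here is a minimal sketch of the queries they imply, written against the standard-library `sqlite3` module rather than superduperdb. The function names, the `_processed_ids` metadata table, and the `txt` column are assumptions made for illustration; the PR's actual implementation may differ.

```python
import sqlite3


def poll_incremental(conn, table, id_field, last_seen):
    """'incremental' style: fetch rows whose id is above the last one processed."""
    rows = conn.execute(
        f"SELECT * FROM {table} WHERE {id_field} > ? ORDER BY {id_field}",
        (last_seen,),
    ).fetchall()
    # Advance the high-water mark only if something new arrived.
    new_last = rows[-1][id_field] if rows else last_seen
    return rows, new_last


def poll_join_id(conn, table, id_field, meta_table="_processed_ids"):
    """'join_id' style: anti join against a metadata table of processed ids."""
    conn.execute(f"CREATE TABLE IF NOT EXISTS {meta_table} (id PRIMARY KEY)")
    # LEFT JOIN ... WHERE m.id IS NULL keeps only rows with no match in meta_table.
    rows = conn.execute(
        f"SELECT t.* FROM {table} AS t "
        f"LEFT JOIN {meta_table} AS m ON t.{id_field} = m.id "
        f"WHERE m.id IS NULL"
    ).fetchall()
    # Record the ids so the next poll skips them.
    conn.executemany(
        f"INSERT INTO {meta_table} (id) VALUES (?)",
        [(row[id_field],) for row in rows],
    )
    return rows


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows row["column_name"] access
conn.execute("CREATE TABLE my_table (id_field INTEGER PRIMARY KEY, txt TEXT)")
conn.executemany("INSERT INTO my_table (txt) VALUES (?)", [("a",), ("b",)])

first, last_seen = poll_incremental(conn, "my_table", "id_field", last_seen=0)
conn.execute("INSERT INTO my_table (txt) VALUES ('c')")
second, last_seen = poll_incremental(conn, "my_table", "id_field", last_seen)

joined_first = poll_join_id(conn, "my_table", "id_field")
joined_second = poll_join_id(conn, "my_table", "id_field")
```

The incremental variant only needs to remember one value between polls, while the anti-join variant works without a monotonic column at the cost of an extra table that grows with the number of processed rows.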
Great work.
Description
#1614
Related Issues
Checklist
make unit-testing and make integration-testing successfully?
Additional Notes or Comments