-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: don't start queries when corruption is detected during startup #7821
feat: don't start queries when corruption is detected during startup #7821
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I am not sure the Create state always needs to be mapped to error but I think it should work.
a6d494b
to
6f02540
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a small concern about mapping CREATED to ERROR: the transition of CREATED to REBALANCING is timing dependent, i.e. it would transit after it reached the broker (group coordinator) for the first time since starting up. In practice it is usually very fast, maybe in a few ms, but still before that the CREATED state is actually valid.
For list-queries / query-status purposes, if the broker was temporarily unavailable after the KS runtime is up and running, then the state may stuck in CREATED for a while, causing them to report false positives.
Some background context about this on JIRA: https://issues.apache.org/jira/browse/KAFKA-6520?src=confmacro
ksqldb-common/src/main/java/io/confluent/ksql/util/KsqlConstants.java
Outdated
Show resolved
Hide resolved
a397978
to
b683053
Compare
@guozhangwang @wcarlson5 I updated the PR so we're not updating the Streams state to Query state mapping. Instead, I just set a corruptionDetection variable in QueryMetadataImpl and the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea
Description
If ksqlDB detects corruption during a restore it shouldn't start queries, because the queries may be terminated by a command following the offset at which we detect corruption. Instead, we should only apply DDLs and all queries should be reported as "failing/error" from the API.
Testing done
Unit test
Also, started up a server with backup enables, and created 3 queries. Stopped the server and modified the backup file so that the last query created was different from what's in the command topic. Started up server again and verified that the corruption log was present and the
show queries
reported the 2 pre-corruption queries in ERROR state.Reviewer checklist