-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: change default exception handling for timestamp extractors #4632
Conversation
...s/src/main/java/io/confluent/ksql/execution/streams/timestamp/LoggingTimestampExtractor.java
Outdated
Show resolved
Hide resolved
if (failOnError) { | ||
throw e; | ||
} | ||
return -1L; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this allow KSQL to display the row with ROWTIME = -1 in case of a failure?
I'm trying to understand how the FailOnInvalidTimestamp
will work in this situation. If a -1 is found on the row, then the FailOnInvalidTimestamp
will throw an exception which this code will catch and log it to the processing log. But it returns a -1 back to stream, so the stream will set it to as ROWTIME, won't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any negative timestamp causes the event to be ignored by Kafka Streams. That means that if this wraps FailOnInvalidTimestamp
but you set ksql.timestamp.throw.on.invalid
to false
then it will suppress the throw and just ignore the message
ksql-streams/src/main/java/io/confluent/ksql/execution/streams/SourceBuilder.java
Outdated
Show resolved
Hide resolved
return message(exception, () -> record.map(Base64.getEncoder()::encodeToString).orElse(null)); | ||
} | ||
|
||
public static Function<ProcessingLogConfig, SchemaAndValue> deserializationErrorMsg( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use a different message type for this error. Deserialization error messages generally mean the serde couldn't deserialize the record in the topic, and contain a base64 encoded representation of the data inside (in a field called recordB64
). The data written out here represents something different - KSQL's view of the row once it's been deserialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to use the existing record processing error (EngineProcessingLogMessageFactory.recordProcessingError
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was debating which I should do since this felt like a type of deserialization error to me, but I'll update it to a new type (I can see both sides)
bd10cfb
to
ba8cea0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
571760b
to
a28de97
Compare
a28de97
to
2272e45
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
fixes #2609
Description
ksqlDB currently fails violently whenever a timestamp cannot be extracted from a record. Perhaps more fatally, it also does not log anything to the processing log meaning users that are using the cloud offering of ksql will not be able to see what happened to their queries.
This PR addresses both of those concerns and gives users the ability to toggle whether they want the failure to be violent or graceful so that they can choose not to lose data.
Testing done
New unit tests and end to end testing:
Reviewer checklist