fix: Fixes the single host lag reporting case #4494
ksql-rest-app/src/main/java/io/confluent/ksql/rest/server/HeartbeatAgent.java
ksql-rest-app/src/main/java/io/confluent/ksql/rest/server/MaximumLagFilter.java
This looks good to me overall, in the context of the issue. Another pair of eyes from @vpapavas on the review and we can add a test before merging?
Thank you @AlanConfluent! Does it make sense to add a functional test with a single server that catches this corner case (and maybe others we have not found yet)?
Changes in this PR itself sg to me. +1 to @vpapavas on adding more cases to functional test. Wondering if that can happen in another PR though..
Also could you share any local testing you may have done around this?
@@ -223,6 +223,9 @@ private void processHeartbeats(final long windowStart, final long windowEnd) {
}
return status;
});
for (HostStatusListener listener : hostStatusListeners) {
  listener.onHostStatusUpdated(getHostsStatus());
Any exception handling needed here to ensure we don't stop notifying if one of the callbacks throws an error?
Done. Factored out notification to a method where I now catch and log any errors.
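For readers following the thread, the pattern described in the reply can be sketched roughly as below. This is a minimal illustration, not the code from the PR: the class `ListenerNotifierSketch`, the method `notifyListeners`, and the simplified `Map<String, Boolean>` status type are all assumptions made for the example, and the real `HostStatusListener` signature may differ.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: notify every listener, isolating failures per callback.
final class ListenerNotifierSketch {

  // Assumed stand-in for the PR's HostStatusListener; the status type is simplified.
  interface HostStatusListener {
    void onHostStatusUpdated(Map<String, Boolean> hostsStatus);
  }

  private final List<HostStatusListener> listeners = new CopyOnWriteArrayList<>();

  void addListener(final HostStatusListener listener) {
    listeners.add(listener);
  }

  // Returns how many listeners completed without throwing.
  int notifyListeners(final Map<String, Boolean> hostsStatus) {
    int succeeded = 0;
    for (final HostStatusListener listener : listeners) {
      try {
        listener.onHostStatusUpdated(hostsStatus);
        succeeded++;
      } catch (final Exception e) {
        // A real server would use its logger; stderr keeps the sketch self-contained.
        System.err.println("Error notifying listener: " + e);
      }
    }
    return succeeded;
  }
}
```

Catching inside the loop, rather than around it, is what keeps one failing callback from preventing later listeners from being notified.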
aad6b64 to 9c126ea
I have a functional test I'm working on. It won't take long, but I can make it another PR so we're not blocked on it. I tested by modifying one of my functional tests to have one host and verified that it reported lag. I also built the server and ran it locally with a single host, then looked at the clusterStatus as I produced messages to a topic and saw that the lag increased.
Description
Currently, when a single host exists, lag doesn't get reported. This ensures lag always gets reported. Also changes the default to not filter if there is no lag data.
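The "do not filter if there is no lag data" default can be illustrated with a small sketch. Everything here is an assumption for illustration: the class `LagFilterSketch`, the method `shouldInclude`, and the flat per-host offset map are hypothetical and do not reflect the actual `MaximumLagFilter` API.

```java
import java.util.Map;

// Hypothetical sketch of a lag filter that passes hosts through when lag data is missing.
final class LagFilterSketch {

  private LagFilterSketch() {
  }

  // currentOffsetByHost: assumed map from host name to its last known current offset.
  // endOffset: assumed latest known end offset for the partition.
  static boolean shouldInclude(
      final String host,
      final Map<String, Long> currentOffsetByHost,
      final long endOffset,
      final long maxAllowedLag) {
    final Long currentOffset = currentOffsetByHost.get(host);
    if (currentOffset == null) {
      // No lag data for this host: default to not filtering it out.
      return true;
    }
    final long lag = Math.max(0L, endOffset - currentOffset);
    return lag <= maxAllowedLag;
  }
}
```

With this default, a host is excluded only when lag data exists and shows it is too far behind, so missing data never removes a host from consideration.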