Improve docs of leaderelection configuration #20601
Conversation
Signed-off-by: chrismark <[email protected]>
Pinging @elastic/integrations-platforms (Team:Platforms)
Following up #20512 (comment)
I'm not sure we should chase complex use cases; we should focus on onboarding instead. With the current approach, users don't benefit from the simplified configuration provided by leader election out of the box: on the contrary, they have to learn the specifics of the feature through the comments and change the manifest in several places to simplify the setup, which is a bit of a paradox. If I'm running a big, complex cluster that requires fine-tuning observability, chances are I won't even use the manifests as they are in the repo, because the default resources provided to the standalone pods won't fit every possible case. To sum up, I think with the current setup we're not providing a turnkey solution to anybody, while with leader election we have the chance to build a good user story that simplifies onboarding.
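For context, a minimal sketch of what the leader-election based configuration in the DaemonSet's ConfigMap looks like; the metricset list and the kube-state-metrics host below are illustrative, not the exact content of the manifests in this PR:

```yaml
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      scope: cluster
      node: ${NODE_NAME}
      # Only the elected leader among the DaemonSet Pods applies these templates,
      # so cluster-wide metricsets are collected exactly once.
      unique: true
      templates:
        - config:
            - module: kubernetes
              hosts: ["kube-state-metrics:8080"]
              period: 10s
              metricsets:
                - state_node
                - state_deployment
```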
💚 Build Succeeded
@exekias @jsoriano what do you think about @masci's proposal of completely removing the old manifests that include the Deployment and keeping only the new approach, with the leader election configuration within the DaemonSet? I'm ok with this; my only concern is users that already use the Deployment manifests (breaking change?).
Maybe we have to check how big a deployment needs to be to require a dedicated pod. There are two main reasons why this may be needed:
If we are more or less sure that in most cases this is not going to be a problem, then I am ok with keeping only the new approach.
We can keep a dummy Deployment with the same name and zero replicas to replace the previous one, avoiding a breaking change.
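As a rough sketch of that idea, assuming the previous manifests used a Deployment named metricbeat in kube-system (the name, labels, and image tag below are assumptions for illustration):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metricbeat
  namespace: kube-system
spec:
  # Zero replicas: applying this object overwrites the old Deployment
  # without scheduling any Pods, so nothing keeps collecting duplicated metrics.
  replicas: 0
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      containers:
        - name: metricbeat
          image: docker.elastic.co/beats/metricbeat:7.10.0
```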
My feeling is that this kind of performance issue will hit users with really big clusters, and if they have to fine-tune their setup to resolve these issues it shouldn't be a big deal for them to evaluate the solution of the singleton Pod using a Deployment. For most of our users, keeping only the new approach from now on would improve their onboarding experience.
I think that having a dummy Deployment spec inside our manifests would make things more complicated for users that are not so familiar with the whole setup. On the other hand, I'm not sure this can be considered a breaking change, since docs are linked to the versions of the manifests, and only old users that want to adopt the new approach might end up with "orphan" Deployment Pods on their clusters; this is not so unusual and it's not that bad. So I'm leaning towards keeping only the new approach.
It sounds like the general case would benefit from going only with the DaemonSet, and then we can document how to switch to a cluster-wide Deployment for big clusters. This is not really a breaking change, as it won't break existing deployments; we are just changing how new ones will happen.
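For big clusters, the documented alternative would roughly be a dedicated single-replica Deployment whose ConfigMap carries the cluster-scope metricsets directly; the metricset list and host below are assumptions for illustration, not the final docs:

```yaml
# Metricbeat configuration for a dedicated cluster-scope Deployment.
# A single replica makes leader election unnecessary for these metricsets.
metricbeat.modules:
  - module: kubernetes
    hosts: ["kube-state-metrics:8080"]
    period: 10s
    metricsets:
      - state_node
      - state_deployment
      - event
```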
Signed-off-by: chrismark <[email protected]>
Ok, should we mention something in the docs in case someone upgrades by using the new manifests?
Well, these orphan deployments will be collecting duplicated metrics, with an old version.
New docs will only point to the new manifests and only refer to the new approach. And wouldn't it be overkill to mention this in the old versions' docs?
I meant to mention only something like "if you deployed these manifests before 7.10, you may need to manually remove the deployment". But we are not saying anything about upgrades so far, so maybe it is ok not to say anything and let users just change the versions in the manifests they are using.
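For the record, the manual cleanup that note would point to is a single command; the Deployment name and namespace below are assumptions based on the previous manifests:

```sh
# Remove the orphan cluster-scope Deployment left behind by pre-7.10 manifests
kubectl delete deployment metricbeat -n kube-system
```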
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Left a few wording suggestions for the docs, but this LGTM!
Co-authored-by: Massimiliano Pippi <[email protected]>
(cherry picked from commit 6d7213f)
…ne-2.0 * upstream/master: (87 commits)
  [packaging] Normalise GCP bucket folder structure (elastic#20903)
  [Metricbeat] Add billing metricset into googlecloud module (elastic#20812)
  Include python docs in devguide index (elastic#20917)
  Avoid generating incomplete configurations in autodiscover (elastic#20898)
  Improve docs of leaderelection configuration (elastic#20601)
  Document how to set the ES host and Kibana URLs in Ingest Manager (elastic#20874)
  docs: Update beats for APM (elastic#20881)
  Adding cborbeat to community beats (elastic#20884)
  Bump kibana version to 7.9.0 in x-pack/metricbeat (elastic#20899)
  Kubernetes state_daemonset metricset for Metricbeat (elastic#20649)
  [Filebeat][zeek] Add new x509 fields to zeek (elastic#20867)
  [Filebeat][Gsuite] Add note about admin in gsuite docs (elastic#20855)
  Ensure kind cluster has RFC1123 compliant name (elastic#20627)
  Setup python paths in test runner configuration (elastic#20832)
  docs: Add `processor.event` info to Logstash output (elastic#20721)
  docs: update cipher suites (elastic#20697)
  [ECS] Update ecs to 1.6.0 (elastic#20792)
  Fix path in hits docs (elastic#20447)
  Update filebeat azure module documentation (elastic#20815)
  Remove duplicate ListGroupsForUsers in aws/cloudtrail (elastic#20788)
  ...
Follow up of #20512.