Add NodePool support to Cluster resource #188
Conversation
kind ping for review @chrisseto
// It returns 1 when not initialized (as fresh clusters start from 1 replica)
func (r *Cluster) GetCurrentReplicas() int32 {
	if r == nil {
func (r *Cluster) SumNodePoolReplicas() int32 {
nit: This doesn't indicate if it's returning the current replicas or desired replicas. Please add a comment or rename the method.
Renamed to GetDesiredReplicas. It looks like a replacement in the diff, but this function is not a replacement for GetCurrentReplicas; this can be seen in metric_controller.go, where a reference to Spec.Replicas is replaced with a call to this function.
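For illustration, here is a minimal sketch of what such a method could look like, assuming simplified, hypothetical types and field names (Spec.Replicas for the default pool, Spec.NodePools for the rest); the real CRD types in the operator differ:

```go
// Hypothetical, simplified types for illustration only; not the operator's CRD.
type NodePoolSpec struct {
	Name     string
	Replicas int32
}

type ClusterSpec struct {
	Replicas  int32 // replicas of the default NodePool
	NodePools []NodePoolSpec
}

type Cluster struct {
	Spec ClusterSpec
}

// GetDesiredReplicas sums the desired replicas across all NodePools, including
// the default pool driven by spec.replicas. It is distinct from
// GetCurrentReplicas, which reflects the observed replica count.
func (r *Cluster) GetDesiredReplicas() int32 {
	if r == nil {
		return 0 // assumption: nil-receiver behavior is not specified in this thread
	}
	total := r.Spec.Replicas
	for _, np := range r.Spec.NodePools {
		total += np.Replicas
	}
	return total
}
```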
@@ -1422,3 +1419,41 @@ func (r *Cluster) GetDecommissionBrokerID() *int32 {
func (r *Cluster) SetDecommissionBrokerID(id *int32) {
	r.Status.DecommissioningNode = id
}

// getNodePoolsFromSpec returns the NodePools defined in the spec.
// This contains the primary NodePool (driven by spec.replicas for example), and the
Suggested change:
- // This contains the primary NodePool (driven by spec.replicas for example), and the
+ // This contains the default NodePool (driven by spec.replicas for example), and the
Not sure what the terminology is, but let's be consistent. If the primary pool has the name "default", let's rename the constant; otherwise, update this comment.
Right, this is an oversight. "Default nodepool" is the correct terminology. Updated the comment.
operator/pkg/nodepools/pools.go (outdated)
"k8s.io/apimachinery/pkg/api/resource" | ||
"k8s.io/utils/ptr" | ||
"sigs.k8s.io/controller-runtime/pkg/client" | ||
|
nit: extra newline (we'll need to reconfigure our linters).
- role: worker
  image: kindest/node:v1.29.8@sha256:d46b7aa29567e93b27f7531d258c372e829d7224b25e3fc6ffdefed12476d3aa
- role: worker
  image: kindest/node:v1.29.8@sha256:d46b7aa29567e93b27f7531d258c372e829d7224b25e3fc6ffdefed12476d3aa
- role: worker
  image: kindest/node:v1.29.8@sha256:d46b7aa29567e93b27f7531d258c372e829d7224b25e3fc6ffdefed12476d3aa
+1 to Rafal. This isn't super sustainable. Can we not just schedule multiple brokers onto a single node for the purpose of testing?
operator/pkg/admin/admin.go (outdated)
@@ -17,11 +17,14 @@ import (
	"io"
	"time"

	corev1 "k8s.io/api/core/v1"

nit: extra newline
fixed
operator/pkg/labels/labels.go (outdated)
// NodePoolKey is used to document the node pool associated with the StatefulSet.
NodePoolKey = "cluster.redpanda.com/nodepool"

// PodLabelNodeIDKey
nit: incomplete comment. Seems like PodNodeIDKey would be more consistent with the naming convention here? Label is already implied by the package.
done
operator/pkg/labels/labels.go (outdated)
@@ -30,6 +30,11 @@ const (
	PartOfKey = "app.kubernetes.io/part-of"
	// The tool being used to manage the operation of an application
	ManagedByKey = "app.kubernetes.io/managed-by"
	// NodePoolKey is used to document the node pool associated with the StatefulSet.
Suggested change:
- // NodePoolKey is used to document the node pool associated with the StatefulSet.
+ // NodePoolKey is used to denote the node pool associated with the StatefulSet.
Struggling to explain why document isn't the right word here. I think document implies a separation between the information and the thing being informed about. A label on something wouldn't document something about itself but rather denote something about itself.
updated
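Combining the two nits, the constant would presumably end up along these lines (a sketch based on the suggestion above, not the exact final diff):

```go
// NodePoolKey is used to denote the node pool associated with the StatefulSet.
NodePoolKey = "cluster.redpanda.com/nodepool"
```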
Thanks for the ping :)
The function is about desired replicas. Rename it to make this clear.
The correct term is "default".
It's not required to use status anymore, as GetNodePools also contains deleted pools.
Thanks for the review! Addressed most of the comments. Mostly, these are open and awaiting further feedback:
Also waiting for tests to finish. There is some problem with flakiness (here, but also in main).
- Change param to non-pointer; do the conversion from pointer at the caller's site, where they already checked for nil
- Use a loop to simplify
- The term Label is obsolete, as it's in the labels package.
- Update comment
@chrisseto kind ping for the next round. It would be really important to bring this over the finish line this week!
Approving so I'm not in a blocking path. Some refactoring of GetNodePools before merging would be greatly appreciated.
I can't seem to find the comment again but I'm still curious about the management of the seed servers list and why it's populated from Status instead of the Spec.
Before merging could you either point me to the answer or comment again?
})
}

// Also add "virtual NodePools" based on StatefulSets found.
AH. Wow, I really misread this function the first time around. I think what's really throwing me for a loop (aside from using side-by-side view in GitHub) is the inconsistency of breaking loops within this loop, and the number of them at that.
Could we pick a single method of iterating and short-circuiting, be it labeled breaks, slices.Contains, or a helper findFunc[T any]([]T, func(T) bool) (*T, bool) function?
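For reference, a sketch of the suggested helper; the signature comes from the comment above, while the name and exact placement are hypothetical:

```go
// findFunc returns a pointer to the first element of items satisfying pred,
// and whether such an element was found. A single helper like this keeps the
// iteration and short-circuiting style consistent across the nested loops.
func findFunc[T any](items []T, pred func(T) bool) (*T, bool) {
	for i := range items {
		if pred(items[i]) {
			return &items[i], true
		}
	}
	return nil, false
}
```

A caller could then express, for example, the owner-reference lookup or the redpanda-container lookup as a single findFunc call instead of a labeled break.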
operator/pkg/nodepools/pools.go (outdated)
for i := range stsList.Items {
	sts := stsList.Items[i]

	for _, ownerRef := range sts.OwnerReferences {
I find this check and the later check for the existence of a redpanda container a bit odd. We're already using a label selector to filter our list of Sts's; is there a reason we're performing these extra checks? Are they checking for different things? If they're not, why does one error but the other does not?
I like to avoid "anxious" code when possible as it creates an ethical dilemma when it comes time to refactor. I'll have all the above questions with potentially no one to answer them nor test cases to validate them.
If these checks are in place for a reason, document what exactly they're checking for and ideally the situations in which they might trigger.
If these checks are just anxiety or paranoia that you can't shake, group them together and label them as such. A simple // Paranoid check to make sure we don't delete other STSs can save so much head-aching in the future.
- Ownerref check: I replaced it with slices.ContainsFunc and improved the comment.
- Redpanda container check: there is no guarantee that a redpanda container exists. It's unlikely, but the STS could just not have it for some reason, so I check for it and error out. It's not specifically paranoid; the container is required to build the NodePoolSpec of that deleted pool (resource requirements are extracted from it). I'll keep that unchanged.
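A rough sketch of what the two checks might look like after the refactor; the container name "redpanda", the owner Kind, the function names, and the error handling are assumptions rather than the actual implementation:

```go
package nodepools // hypothetical placement

import (
	"errors"
	"slices"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ownedByCluster reports whether the StatefulSet carries an owner reference to
// the given Cluster resource (the ownerref check, via slices.ContainsFunc).
func ownedByCluster(sts *appsv1.StatefulSet, clusterName string) bool {
	return slices.ContainsFunc(sts.OwnerReferences, func(ref metav1.OwnerReference) bool {
		return ref.Kind == "Cluster" && ref.Name == clusterName
	})
}

// redpandaContainer returns the redpanda container, which is required to build
// the NodePoolSpec of a deleted pool (its resource requirements are read from
// it). If the container is missing, callers error out rather than guessing.
func redpandaContainer(sts *appsv1.StatefulSet) (*corev1.Container, error) {
	containers := sts.Spec.Template.Spec.Containers
	for i := range containers {
		if containers[i].Name == "redpanda" {
			return &containers[i], nil
		}
	}
	return nil, errors.New("statefulset has no redpanda container")
}
```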
Based on your comment, I refactored it. It's not based on status anymore. It used to use status because the spec does not contain deleted nodepools; however, a helper function was later added that always returns all nodepools, including these "virtual pools" based on the deleted ones. Now we're using that, so status is not required anymore.
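To make that flow concrete, here is a hedged sketch of how a GetNodePools-style helper could merge the spec pools with "virtual" pools reconstructed from leftover StatefulSets. All type and field names are simplified assumptions; only the label key is taken from the labels diff above:

```go
package nodepools // hypothetical placement

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// NodePoolSpec is a simplified stand-in for the operator's NodePool type.
type NodePoolSpec struct {
	Name     string
	Replicas int32
	Deleted  bool // true for "virtual" pools reconstructed from StatefulSets
}

// GetNodePools returns the pools defined in the spec plus a virtual pool for
// every matching StatefulSet whose pool no longer appears in the spec, so that
// callers (e.g. seed-server construction) no longer need to consult status.
func GetNodePools(ctx context.Context, c client.Client, specPools []NodePoolSpec, selector client.MatchingLabels) ([]NodePoolSpec, error) {
	pools := append([]NodePoolSpec{}, specPools...)

	var stsList appsv1.StatefulSetList
	if err := c.List(ctx, &stsList, selector); err != nil {
		return nil, err
	}

	inSpec := func(name string) bool {
		for _, p := range specPools {
			if p.Name == name {
				return true
			}
		}
		return false
	}

	for i := range stsList.Items {
		sts := stsList.Items[i]
		poolName := sts.Labels["cluster.redpanda.com/nodepool"] // label key from the diff above
		if poolName == "" || inSpec(poolName) {
			continue
		}
		// The pool was removed from the spec but its StatefulSet still exists:
		// surface it as a virtual, deleted pool so it can be scaled down.
		replicas := int32(0)
		if sts.Spec.Replicas != nil {
			replicas = *sts.Spec.Replicas
		}
		pools = append(pools, NodePoolSpec{Name: poolName, Replicas: replicas, Deleted: true})
	}
	return pools, nil
}
```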
Operator v1: support for multiple Node Pools
Overall design notes
This avoids problems with multiple node pools: a bytes-based value behaves very differently if the disk size changes (moving between different tiers in cloud, for example), whereas a percentage-based value stays mostly stable (often exactly the same, typically 15%) and will not cause immediate churn on the old NodePool if the disk size goes down.
Clusters <22.3.0 are not supported anymore. This significantly reduces the complexity of multi-nodepool support: we would have to start only the first node during initial startup and wait for it, and with multiple nodepools, all nodepools would have to coordinate this. Since this is a very old version that is no longer supported anyway, support is removed to simplify.
Changes that are not directly related, included to reduce test flakiness: