-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support scale up of nodepool with zero nodes with ephemeral storage r… #6607
Support scale up of nodepool with zero nodes with ephemeral storage r… #6607
Conversation
1d86b7a
to
25b927f
Compare
node.Status.Capacity[apiv1.ResourceCPU] = *resource.NewQuantity(int64(shape.CPU), resource.DecimalSI) | ||
node.Status.Capacity[apiv1.ResourceMemory] = *resource.NewQuantity(int64(shape.MemoryInBytes), resource.DecimalSI) | ||
node.Status.Capacity[ipconsts.ResourceGPU] = *resource.NewQuantity(int64(shape.GPU), resource.DecimalSI) | ||
|
||
ephemeralStorage, err := getEphemeralResourceRequestsInBytes(freeformTags) | ||
if err != nil { | ||
klog.V(4).Info(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, curious, why do we log the error, but not return the error? And why is this Info level? Shouldn't it at least be a warning?
If we are doing this on purpose, can we add a comment explaining why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking was that, since it is not mandatory for users to add the required freeform tag, so we could just leave it as an info log with higher verbosity level so it does not spam the logs. This would only be needed if customers want CA to spin up nodes with a specific ephemeral-storage size (especially when they want to scale up the pool from 0).
Let me know if you would still like me to make modifications here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if a user is specifying a value, but it is not getting parsed correctly, I personally think we should error out so that the user can quickly know that something is wrong and correct it.
Otherwise, they will think everything is successful, even when it's not.
However, if we don't do this, at the very least we should have it as a warn level. I could see arguments where it would be okay to warn the user. For example, the autoscaler is working, then the user mistypes a free form tag which will cause a failure, but doesn't see a failure until days later when a scale operation actually occurs. Or the fact that specifying one wrong value will cause scaling to not happen at all.
Overall, I'd be okay with erroring out or warning (but slightly prefer erroring).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, thanks. I've changed the logic here.
} | ||
unit := value[valueLength-3:] | ||
if unit != "GiB" && unit != "MiB" { | ||
return 0, fmt.Errorf("invalid value set for ephemeral-storage requests") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, we should make the error measurement easier for the user to know what to do.
"I put a wrong value, but the error doesn't indicate what the value should be"
This comment might be irrelevant if we use the kubernetes resource library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, yes, I have made the changes to use the k8s library.
if err != nil { | ||
return 0, err | ||
} | ||
if unit == "GiB" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://pkg.go.dev/k8s.io/apimachinery/pkg/api/resource
We should be using this resource library, so that it will match everything kubernetes does.
Test file has examples: https://github.com/kubernetes/apimachinery/blob/master/pkg/api/resource/quantity_test.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, updated this.
@@ -25,4 +25,7 @@ const ( | |||
// ToBeDeletedByClusterAutoscaler is the taint used to ensure that after a node has been called to be deleted | |||
// no more pods will schedule onto it | |||
ToBeDeletedByClusterAutoscaler = "ignore-taint.cluster-autoscaler.kubernetes.io/oke-impending-node-termination" | |||
|
|||
// EphemeralStorageSize is the freeform tag key that would be used to determine the ephemeral-storage size of the node | |||
EphemeralStorageSize = "cluster-autoscaler/node-ephemeral-storage" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is an OKE specific implementation, I'm on the fence whether we should make the name indicate that it's custom/OKE specific. 🤔
25b927f
to
80cd80f
Compare
func getEphemeralResourceRequestsInBytes(tags map[string]string) (int64, error) { | ||
for key, value := range tags { | ||
if key == npconsts.EphemeralStorageSize { | ||
klog.V(0).Infof("ephemeral-storage size set with value : %v", value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps change this to klog 4 (and the log below)? Do we we need to see these logs every time (log level 0)?
} | ||
}() | ||
|
||
resourceSize := resource.MustParse(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://pkg.go.dev/k8s.io/apimachinery/pkg/api/resource#ParseQuantity
We should use ParseQuantity instead of MustParse so it doesn't panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, made the changes.
eba5612
to
6d1a03f
Compare
Changes look good to me. |
@@ -25,4 +25,7 @@ const ( | |||
// ToBeDeletedByClusterAutoscaler is the taint used to ensure that after a node has been called to be deleted | |||
// no more pods will schedule onto it | |||
ToBeDeletedByClusterAutoscaler = "ignore-taint.cluster-autoscaler.kubernetes.io/oke-impending-node-termination" | |||
|
|||
// EphemeralStorageSize is the freeform tag key that would be used to determine the ephemeral-storage size of the node | |||
EphemeralStorageSize = "cluster-autoscaler/node-ephemeral-storage" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using free-form tags to provide local disk information? Is there something special about the scale from 0
scenario that prevents the OKE and InstancePool implementations from using the LocalDisksTotalSizeInGBs
and LocalDisks
fields of core.shape when populating the node template in buildNodeFromTemplate
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming ephemeral storage (like memory and CPU) can be determined from the shape, it is still useful to optionally support using free-form tags to override shape specific values like CPU, memory, ephemeral-storage, GPU, etc..
However, in my opinion, the values from core.shape should remain the default source resource information for a node when they are available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK... after looking into this more, it looks like most shapes use "block storage only". This means that most of the time LocalDisksTotalSizeInGBs
and LocalDisks
will not be set.
I still think it's a good idea for both the node-pool and instance-pool implementations to check for and use shape.LocalDisksTotalSizeInGBs
from core.shape by default. However, given that it's going to be unset most of the time, it seems like a good idea to add the ability to receive these values via free-form tags.
@@ -542,12 +545,22 @@ func (m *ociManagerImpl) buildNodeFromTemplate(nodePool *oke.NodePool) (*apiv1.N | |||
}) | |||
} | |||
|
|||
freeformTags, err := m.ociTagsGetter.GetFreeformTags(nodePool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the discussion in the other thread, I would expect our shape struct to have a new field, say, EphemeralStorageInGB
, which may or may not be set after the call to m.ociShapeGetter.GetNodePoolShape(nodePool)
based on disk info of the node shape.
It is at this point that buildNodeFromTemplate()
would set the value of node.Status.Capacity[apiv1.ResourceEphemeralStorage]
to either shape.EphemeralStorageInGB
or the value from the free-form tag.
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We get the LocalDiskSize of a shape from the Core Service/Compute API - (LINK).
But this is not part of the NodeShapeConfig.
We would need to work on adding the changes to oci-go-sdk so that the NodeShapeConfig has the LocalDiskSize, but this would be a whole api level change and would involve a lot of time and effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Since there are two implementations of this provider (and because the core API may evolve to offer ephemeral storage), I think it still makes sense to expose this attribute to both implementation using our common.Shape struct alongside the other capacity attributes (CPU, memory, GPU).
So, for now, you still fetch the value from the free-form tag using a common.getEphemeralResourceRequestsInBytes(freeformTags)
function, but instead of setting a local / private variable, you'd set shape.EphemeralStorageGB.
cluster-autoscaler/cloudprovider/oci/common/oci_shape.go
type Shape struct {
Name string
CPU float32
GPU int
MemoryInBytes float32
++ EphemeralStorageGB float32
}
As a followup MR, we'd need to ensure the instancepool implementation is also using the common.getEphemeralResourceRequestsInBytes(freeformTags)
function to check for ans set shape.EphemeralStorageGB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I have updated the PR with the above suggested changes.
Please take a look.
6d1a03f
to
cc73806
Compare
1c02dfc
to
4b9cba3
Compare
48ea7e7
to
5c7dc04
Compare
|
||
// TagsGetter returns the oci tags for the pool. | ||
type TagsGetter interface { | ||
GetFreeformTags(*oke.NodePool) (map[string]string, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there will be two separate implementations, please rename this function to GetNodePoolFreeformTags(*oke.NodePool) (map[string]string, error)
.
Later, we can add a second function to the interface called GetInstancePoolFreeformTags(*core.InstancePool) (map[string]string, error)
and the associated implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the requested rename of GetFreeformTags
to GetNodePoolFreeformTags
, this change is looking good.
5c7dc04
to
cd077ce
Compare
cd077ce
to
e6291c1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Vijay.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jlamillan, vbhargav875 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
What this PR does / why we need it:
With OCI Nodepools current implementation, we do not support scaling up of a nodepool with zero nodes when the pod spec was ephemeral-storage requests since does not know what ephemeral-storage size should be set to the node. With this PR we allow users to set the ephemeral-storage of a nodepool using OCI freeform tags.