
📖 Add cluster autoscaler scale from zero ux proposal #2530

Conversation

michaelgugino
Contributor

What this PR does / why we need it:
Add proposal to document expected scale from zero semantics for integration between cluster autoscaler and cluster-api.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2527

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 4, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: michaelgugino
To complete the pull request process, please assign detiber
You can assign the PR to them by writing /assign @detiber in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 4, 2020
@elmiko
Contributor

elmiko commented Mar 4, 2020

subscribing to this, thanks @michaelgugino !

@vincepri vincepri added the kind/proposal Issues or PRs related to proposals. label Mar 4, 2020
@vincepri vincepri added this to the Next milestone Mar 4, 2020
Well defined API.

#### Cons
Users can't easily update this field if they want to set values themselves for
Contributor

Is this just that kubectl makes it hard to update status? If so, personally I'd rather tackle that (e.g. with a kubectl plugin) than work around it with the override!

Member

If it was just a matter of updating status, it'd be one thing. The other issue is that status is meant to be "recreatable" and if the resources are backed up/restored or migrated from one cluster to another the status will be lost.

like that type of thing.

### Spec Field
We can add fields to the spec of a particular object to set these items.
Contributor

@justinsb commented Mar 4, 2020

I note the clever ambiguity in "particular object" - were you thinking of the MachineSet/MachineDeployment? We could also do a "sidecar" object, similar to how we did the infrastructureRef. We could then also put additional autoscaling controls here e.g.

kind: MachineDeploymentAutoscaling
spec:
  maxReplicas: 10
  minReplicas: 0
  scaleDownDelay: 10m
  scaleUpDelay: 0s
status:
  nodeShape:
    memory: 1Gi
    gpus: 1
    labels:
      foo: bar

I'm not sure where we're currently putting minReplicas / maxReplicas? Also it's a little awkward because I would imagine I'd want to set these on the MachineDeployment level, but I'd imagine node-attributes should technically be on the MachineSet level.

Contributor Author

One out-of-the-box idea I had that seems related to this one was removing the direct integration between the cluster autoscaler and the cluster-api bits.

We could come up with a new 'external scale group' object. The autoscaler increments or decrements that object, and something on the cluster-api side watches that object and takes the appropriate actions. This would allow people to build their own automation on top of the autoscaler's recommendations, but it's not entirely clear what this should look like.

Member

@enxebre commented Mar 4, 2020

minReplicas / maxReplicas are known annotations which apply to any scalable resource (i.e. MachineSet/MachineDeployment), which makes those resources discoverable for the cluster autoscaler. They could be owned by e.g. a MachineAutoscaler resource:
#2369

cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size
cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size

apiVersion: "cluster.x-k8s.io"
kind: "MachineAutoscaler"
metadata:
  name: "worker-us-east-1a" 
  namespace: "cluster-api"
spec:
  minReplicas: 1 
  maxReplicas: 12 
  scaleTargetRef: 
    apiVersion: cluster.x-k8s.io
    kind: MachineSet 
    name: worker-us-east-1a

I would imagine I'd want to set these on the MachineDeployment level,

I would expect this info to be accessible through any scalable resource liable to be autoscaled, but decoupled from it. This is purely provider-specific compute info which does not necessarily need a replicas controller on top to be meaningful. Along those lines, I'd see the current provider machine templates, or something like a MachineClass, as good candidates to infer this information as they see fit for each specific provider and expose it in a common way.
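
For illustration, a minimal sketch (not from the proposal) of how those two discovery annotations might be applied directly to a scalable resource; the apiVersion and values here are hypothetical:

apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  name: worker-us-east-1a
  annotations:
    # the cluster autoscaler discovers this group and its scaling bounds via these annotations
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "0"
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "12"
spec:
  replicas: 0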

We have to build something that copies these values into the staus. This seems
like it could be an easily shared piece of library code.

Multiple controllers updating the same field on the status object, need to
Member

I'd like to avoid the potential for conflicting controllers attempting to manage the same field in a resource's Status field.

We also need to keep in mind that the controller for the resource "owns" the status of that resource. Ideally we'd be able to copy these values from a resource that is owned by the controller that is generating them.

Contributor Author

There's not really any place to copy the information from, unless we create some sort of new 'scale record' or some crazy thing like that, and then that's yet another CRD.

# Cluster Autoscaler Scale from Zero UX

## Summary
Cluster Autoscaler supports scale from zero for cloud-native providers. This is done via compiling CPU, GPU, and RAM info into the cluster autoscaler at build time for each supported cloud.
Member

This is done via compiling CPU, GPU, and RAM info into the cluster autoscaler at build time for each supported cloud.

I'd say this is actually a provider implementation choice in how they decide to implement the TemplateNodeInfo interface; e.g. GCE does it dynamically: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/gce/gce_cloud_provider.go#L332
https://github.com/kubernetes/autoscaler/blob/589cd2af9c25088ce3408eeb88ba9239201d005d/cluster-autoscaler/cloudprovider/gce/gce_manager.go#L491-L504
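
To make that concrete: when a node group is at zero, the provider's TemplateNodeInfo implementation has to synthesize a placeholder node for the scheduler simulation. Roughly speaking, all it needs to produce is something shaped like the following (a hand-written sketch, not the output of any particular provider; the instance type and quantities are made up):

apiVersion: v1
kind: Node
metadata:
  labels:
    node.kubernetes.io/instance-type: m5.large   # illustrative only
status:
  capacity:
    cpu: "2"
    memory: 8Gi
    nvidia.com/gpu: "0"
  allocatable:
    cpu: "2"
    memory: 8Gi
    nvidia.com/gpu: "0"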

@enxebre
Member

enxebre commented Mar 5, 2020

To synthesize my comment here #2530 (comment)
To me there are two design invariants that the final solution should comply with:
1 - This info should be accessible through any scalable resource liable to be autoscaled, but decoupled from it.
2 - This is purely provider-specific compute info (CPU, GPU, and RAM) which does not necessarily need a replicas reconciler controller on top to be meaningful.

To satisfy 1 and 2, it should live alongside the other provider-specific compute info, e.g. provider machine templates or something like a MachineClass.

@detiber
Member

detiber commented Mar 5, 2020

We could always add a Status to the InfraMachineTemplate contract and require infrastructure providers to populate the Status with the appropriate information (possibly opt in, if they want to support autoscaler integration)
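
A hedged sketch of what that could look like; AWSMachineTemplate has no Status today (as noted above), and the status field names here are purely hypothetical:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSMachineTemplate
metadata:
  name: workers
spec:
  template:
    spec:
      instanceType: m5.large
status:
  # hypothetical: populated by the infrastructure provider for the configured instance type
  capacity:
    cpu: "2"
    memory: 8Gi
    gpu: "0"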

@michaelgugino
Contributor Author

We could always add a Status to the InfraMachineTemplate contract and require infrastructure providers to populate the Status with the appropriate information

@detiber

I don't know if it needs to be a 'require', but it should be a 'should'.

The next question is: where is the appropriate place to specify override values, or otherwise populate values that aren't already present?

If we utilize annotations or even a spec field on the new object @enxebre suggests, we can sync values from the spec into the status of that new scaler object, and also sync the status fields from the infra-template. This would allow people that haven't implemented the logic or updated status fields in the infra-template to take advantage of the scale-from-zero by annotation or updating the spec on the scaler object.

@detiber
Member

detiber commented Mar 5, 2020

If we utilize annotations or even a spec field on the new object @enxebre suggests, we can sync values from the spec into the status of that new scaler object, and also sync the status fields from the infra-template. This would allow people that haven't implemented the logic or updated status fields in the infra-template to take advantage of the scale-from-zero by annotation or updating the spec on the scaler object.

I think that makes sense (with my preference being towards using a Spec field rather than an annotation).

## Summary
Cluster Autoscaler supports scale from zero for cloud-native providers. This is done via compiling CPU, GPU, and RAM info into the cluster autoscaler at build time for each supported cloud.

We are attempting to add cluster-api support to the cluster-autoscaler to scale machinesets, machinedeployments, and/or other groupings of machines as appropriate. We are trying to keep an abstraction which is isolated from particular cloud implementations. We need a way to inform the cluster-autoscaler of the value of machine resources so it can make proper scheduling decisions when there are not currently any machines in a particular scale set (machineset/machinedeployment, etc).
Contributor

machinepools?

Contributor Author

I think that's covered under 'and/or other groupings of machines,' although machinepools don't necessarily group machines?

Contributor

MachinePool is its own type of "machine"; it doesn't group individual machines. I thought it would be good to reference it and keep it in mind in the design, since some of the concepts used for machines below might not apply the same way.


We will need input from the autoscaler team, as well as input from api reviewers to ensure we obtain the best solution.

These ideas require a particular (possibly new) cluster-api cloud provider controller
Contributor

Suggested change
These ideas require a particular (possibly new) cluster-api cloud provider controller
These ideas require a particular (possibly new) cluster-api infrastructure provider controller


#### Cons
We're essentially creating an API around an annotation, and the api folks don't
like that type of thing.
Contributor

Can we elaborate on the downside of doing that? Not sure "the api folks don't
like that type of thing" counts :)

Contributor Author

I think they don't like to have specific annotations be part of a defined API. We might be able to do just annotations if this was all for a single controller, but with an external controller looking at our object(s), it's unlikely to pass API review.

Member

The main issue with annotations is that they are an informal API that is detached from the actual API, so enforcing forward/backward compatibility, supporting migrations, etc. is quite difficult.

For example, the cluster-autoscaler currently uses the deprecated cluster.k8s.io/delete-machine annotation rather than cluster.x-k8s.io/delete-machine and in order to not break the integration we need to handle the backwards/forwards compatibility both in Cluster API and also in cluster-autoscaler: kubernetes/autoscaler#3161.

If the integration was done using an API field, then conversion webhooks would help ease the migration.

Comment on lines +32 to +34
### Goals

### Non-Goals

Perhaps this is only because I haven't been following prior conversations about this (sorry!), but I think I would be better able to understand the proposal if the motivation section included some goals and non-goals.

For example, is it a goal to minimize (or not have any?) impact on any cluster API provider which does not support autoscale from zero? Is it a goal to minimize impact/knowledge burden for cluster API users who are not using this feature?

Contributor Author

The goal is to support scale from zero in an automated way. We'll be relying on providers to provide the attributes (CPU, RAM, GPU) for their respective instance types. We also want to build a mechanism to support scale from zero on cloud providers that don't have common billing (such as an OpenStack cloud or some other abstraction) by providing a field (either spec or an annotation on some object) that allows an administrator to define the attributes themselves.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 8, 2020
@elmiko
Contributor

elmiko commented Jun 8, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 8, 2020
Clean separation of discovered values and user intent.

#### Cons
We have to build something that copies these values into the staus. This seems
Contributor

Suggested change
We have to build something that copies these values into the staus. This seems
We have to build something that copies these values into the status. This seems

@CecileRobertMichon
Contributor

@michaelgugino is this proposal still active/up to date?

@ncdc
Contributor

ncdc commented Jul 15, 2020

I seem to recall we had a discussion where we talked about an infra provider providing the CPU/RAM/GPU information somewhere (InfraMachineTemplate?), having that information percolate up to the MachineSet/MachineDeployment, and then the autoscaler could look at that information to make its scaling decisions. Am I remembering that anywhere near correctly? 😄

@seh

seh commented Jul 15, 2020

Am I remembering that anywhere near correctly?

There's some oblique mention of a solution in this discussion with @elmiko: kubernetes/autoscaler#3150 (comment).

@elmiko
Contributor

elmiko commented Jul 15, 2020

I seem to recall we had a discussion where we talked about an infra provider providing the CPU/RAM/GPU information somewhere (InfraMachineTemplate?), having that information percolate up to the MachineSet/MachineDeployment, and then the autoscaler could look at that information to make its scaling decisions. Am I remembering that anywhere near correctly? 😄

There's some oblique mention of a solution in this discussion with @elmiko: kubernetes/autoscaler#3150 (comment).

yes, i believe these details might have come from the discussion about how we have implemented scale from zero for openshift. we use annotations on MachineSets to percolate the details back to the autoscaler. at the minimum we need a way to communicate the cpu/ram/gpu as well as a way for taints to be consumed from the Machine[Set|Deployment], or similar method.

@ncdc
Contributor

ncdc commented Jul 16, 2020

@elmiko what taints are you referring to here?

@k8s-ci-robot k8s-ci-robot modified the milestones: Next, v0.4.0 Sep 16, 2020
@elmiko
Contributor

elmiko commented Sep 16, 2020

given the conversation we had today in the capi meeting and re-reading this proposal again, i like the option of using the status field with an override. we could start with an implementation of the status updates for the core information and then add an override capability.

for the override, i don't yet have a strong opinion but i think if we used a field in the spec (of Machine[Set|Deployment]) then we could also give a good place for the user to add the taints as well.

i am going to continue to do some research on the providers and what details we might need to implement. i would love to hear more thoughts and ideas here.

@ncdc
Contributor

ncdc commented Sep 16, 2020

Some thoughts / let me know if this is what is being asked:

  • Request: ability to set taints on Nodes that are created from Machines when the autoscaler scales up
    • Idea: add taints to MachineSpec. Contract update: bootstrap providers MAY (SHOULD? MUST?) use the MachineSpec's taints when registering a new node
  • Request: ability to know the cpu/memory/gpu specs for Nodes that are created from Machines when the autoscaler scales up from 0
    • Idea:
      • Contract update: infrastructure providers MAY include cpu/memory/gpu information in well-defined status fields in the InfraMachineTemplate (e.g. AWSMachineTemplate.Status (which does not currently exist))
      • Add cpu/memory/gpu fields to ScalableThingStatus (e.g. MachineSetStatus)
      • The ScalableThing (e.g. MachineSet) controller MUST copy cpu/memory/gpu information from well-defined InfraMachineTemplate.Status fields if they exist
      • The autoscaler looks at ScalableThing.Status for the well-known cpu/memory/gpu information, if present
      • The user can override the status values using well-known spec fields (optional)
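
To make that flow concrete, a rough sketch with hypothetical field names (none of these fields exist today; the shape and names would be settled during API review):

apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineSet
metadata:
  name: workers
spec:
  # optional, user-provided override of the discovered values
  nodeResources:
    cpu: "4"
    memory: 16Gi
status:
  # copied by the MachineSet controller from the InfraMachineTemplate status, if present;
  # this is what the autoscaler would read when the group is at zero replicas
  nodeResources:
    cpu: "4"
    memory: 16Gi
    gpu: "0"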

@elmiko
Contributor

elmiko commented Sep 16, 2020

Some thoughts / let me know if this is what is being asked:

* Request: ability to set taints on Nodes that are created from Machines when the autoscaler scales up
  
  * Idea: add taints to MachineSpec. Contract update: bootstrap providers MAY (SHOULD? MUST?) use the MachineSpec's taints when registering a new node

this sounds ok to me, but i need to read up on the MachineSpec a little more.

* Request: ability to know the cpu/memory/gpu specs for Nodes that are created from Machines when the autoscaler scales up from 0
  
  * Idea:
    
    * Contract update: infrastructure providers MAY include cpu/memory/gpu information in well-defined status fields in the InfraMachineTemplate (e.g. AWSMachineTemplate.Status (which does not currently exist))
    * Add cpu/memory/gpu fields to ScalableThingStatus (e.g. MachineSetStatus)
    * The ScalableThing (e.g. MachineSet) controller MUST copy cpu/memory/gpu information from well-defined InfraMachineTemplate.Status fields if they exist
    * The autoscaler looks at ScalableThing.Status for the well-known cpu/memory/gpu information, if present
    * The user can override the status values using well-known spec fields (optional)

++, i think we are on the same page here. i like this option as well.

edit:

forgot to add, yes @ncdc i think you've captured the requests accurately.

@michaelgugino
Contributor Author

Request: ability to set taints on Nodes that are created from Machines when the autoscaler scales up

* Idea: add taints to MachineSpec. Contract update: bootstrap providers MAY (SHOULD? MUST?) use the MachineSpec's taints when registering a new node

Maybe we could do this with an OnCreate mutating webhook for node objects. This would require cloud provider info to be available on the node at first join; I'm not sure if that's the case today or not.
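
For reference, the registration for such a webhook would look roughly like the following (a sketch only; the webhook name, service, and path are made up, and the management/workload cluster caveat discussed below still applies):

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: node-taints-from-machines        # hypothetical
webhooks:
- name: nodetaints.cluster.x-k8s.io      # hypothetical
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["nodes"]
  clientConfig:
    service:
      name: machine-taint-webhook        # hypothetical service backing the webhook
      namespace: capi-webhook-system
      path: /mutate-node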

@ncdc
Contributor

ncdc commented Sep 17, 2020

How would that work exactly?

@michaelgugino
Contributor Author

How would that work exactly?

Nodes are objects just like anything else. Something creates the node object. If it's the kubelet itself, then this idea could work. If it's something else, and the kubelet is only given permission on a preexisting node object to pencil in the details, it won't work. I don't know the specifics off the top of my head of how the node object gets created. If it were to work this way, the webhook would look for the corresponding machine (using the same logic as the nodelink controller) and apply the relevant taints/labels from the machine object.

We might be able to do it later in the node's life cycle with UPDATE, but that seems like it might be a lot of noise and add latency to the node object. Health status is on a dedicated lease object, so it shouldn't have a terrible impact, but either way, it's less than ideal to catch all those calls.

@ncdc
Contributor

ncdc commented Sep 17, 2020

Yes, the kubelet is typically responsible for creating a Node to represent itself (although it could be created by someone/something else).

I think a webhook could work for a self-managed cluster, where Cluster API is running on the cluster itself. I don't think it would work for a management cluster/workload cluster split, which is quite common for some Cluster API consumers (because the webhook runs in the workload cluster, but it would need access to the management cluster for Machine data).

Having a bootstrap provider apply taints defined in the Machine spec when registering the node would work for both self-managed and management cluster configurations, but it does require every bootstrap provider to implement this functionality. This would be my preferred approach.
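
As a sketch of that preferred approach: the kubeadm bootstrap provider already exposes kubeadm's node registration options, so taints copied from the Machine spec could end up rendered roughly like this (how the values get from the Machine spec into the bootstrap config is exactly what the contract change would define):

apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: KubeadmConfigTemplate
metadata:
  name: workers
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          # applied at node registration time, so there is no window where
          # pods without a matching toleration can land on the node
          taints:
          - key: dedicated
            value: gpu
            effect: NoSchedule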

@michaelgugino
Contributor Author

Yes, the kubelet is typically responsible for creating a Node to represent itself (although it could be created by someone/something else).

I think a webhook could work for a self-managed cluster, where Cluster API is running on the cluster itself. I don't think it would work for a management cluster/workload cluster split, which is quite common for some Cluster API consumers (because the webhook runs in the workload cluster, but it would need access to the management cluster for Machine data).

Having a bootstrap provider apply taints defined in the Machine spec when registering the node would work for both self-managed and management cluster configurations, but it does require every bootstrap provider to implement this functionality. This would be my preferred approach.

If you don't care about the taints or labels being applied immediately, we use a controller to do that today: https://github.com/openshift/machine-api-operator/blob/master/pkg/controller/nodelink/nodelink_controller.go#L244. This is probably adapted from the v1alpha1 code from here at some point. This would be a nice stand-in for bootstrap providers that don't expose kubelet options. This usually isn't an issue because the node takes a little time to go ready, so nothing will be scheduled there that shouldn't be (maybe daemonsets? TBD).

Generally speaking, I would expect bootstrap providers to support setting kubelet options on an override basis. It's a never-ending road of configuration management, though. As soon as you expose this functionality, it's another place to handle on upgrade. What options did the user set? Any deprecated options? Warn? Fail? Do their custom options conflict with the options you set by default?

My rule of thumb is don't allow any changes to configuration by the user unless it's absolutely necessary. Taints in the kubelet args would be the best way, but it's also the worst way.

@ncdc
Contributor

ncdc commented Sep 18, 2020

I think it does make sense to apply taints and labels at creation time, to avoid any possible gap where something might be able to "slip in" and get scheduled onto the node that otherwise wouldn't be allowed.

Generally speaking, I would expect bootstrap providers to support setting kubelet options on an override basis. . . [What if t]heir custom options conflict with the options you set by default?

Could you expand a bit on who the actors are in this scenario, what's being overridden, and who is "you" ("the options you set by default")? CAPI's stance on node settings such as taints and labels is that a bootstrap provider such as CAPBK sets them at node registration time, and any further administration of these things must be done by some other actor (a user, an authorized controller, etc.).

@michaelgugino
Contributor Author

Could you expand a bit on who the actors are in this scenario, what's being overridden, and who is "you" ("the options you? set by default")? CAPI's stance on node settings such as taints and labels is that a bootstrap provider such as CAPBK sets them at node registration time, and any further administration of these things must be done by some other actor (a user, an authorized controller, etc.).

The "you" here is the cluster-api project. If you provide these options, users will expect upgrades to still work. I suppose what I'm saying in regard to this is that it's not so much about taints and labels as it is other on-disk configuration and how this change might fit into that larger picture.

@elmiko
Contributor

elmiko commented Oct 7, 2020

i just wanted to update this pr with some of the ideas we have agreed upon during discussions.

for the cpu, memory, and gpu fields which are needed by the autoscaler to create new instances from zero, we agreed that the information should be stored in the status field of the MachineDeployment and MachineSet resources. they would be populated by each cloud provider as defined by the instances which are available on that platform. these values should be scoped under a specific field in the status (e.g. status.instanceSpecs).
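
A rough sketch of that agreed shape, using the example field name above (the exact field names are still to be settled in the proposal):

apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineSet
metadata:
  name: workers
spec:
  replicas: 0
status:
  # hypothetical: populated per the discussion above for the platform's instance type
  instanceSpecs:
    cpu: "4"
    memory: 16Gi
    gpu: "0"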

for the taints, we agreed that this functionality could be added in a follow up. there are some issues around how we persist these taints down to the kubelets and what contracts this imposes on the bootstrap providers. additionally, the ability to define taints on the MachineDeployments and MachineSets is a feature for CAPI, and we should separate it from the core functionality of scaling to and from zero.

@ncdc has given me some good ideas about building a poc to demonstrate how this could work, and i am working to get that in place to demonstrate at a future meeting.

i'm going to confer with @michaelgugino about updating this proposal.

@seh

seh commented Oct 7, 2020

for the taints, we agreed that this functionality could be added in a follow up.

Do you mean a follow-up posted here, or in a separate proposal? I just want to make sure that we recognize that scaling from zero in AWS can't work correctly without these taints being available for the autoscaler to consider.

@elmiko
Contributor

elmiko commented Oct 7, 2020

Do you mean a follow-up posted here, or in a separate proposal? I just want to make sure that we recognize that scaling from zero in AWS can't work correctly without these taints being available for the autoscaler to consider.

i had meant in a follow up proposal, but if taints are a hard blocker for aws we might need to discuss this more.

edit:

is it possible we could do something on the cluster-autoscaler side that would make it appear as if there are no taints (an empty set) until we could add the extra functionality to properly recognize them?

@seh

seh commented Oct 7, 2020

is it possible we could do something on the cluster-autoscaler side that would make it appear as if there are no taints (an empty set) until we could add the extra functionality to properly recognize them?

I assume that's how it would work today. In AWS, if your ASGs lack the taint-related resource tags, the autoscaler assumes that the nodes will bear no taints. It will create instances for nodes to satisfy pods based on other constraints, but could mistakenly create an instance for a pod with no tolerations, and wind up with a node with taints that repel that pod.

If you don't taint your nodes, though, of course this doesn't present a problem.
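
For context, the hint referred to above is the autoscaler's AWS tag convention for node templates; from memory it looks roughly like the following, so treat the exact key format as approximate:

# ASG resource tags read by the cluster autoscaler's AWS provider when the group is at zero
k8s.io/cluster-autoscaler/node-template/taint/dedicated: "gpu:NoSchedule"
k8s.io/cluster-autoscaler/node-template/label/foo: "bar"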

@elmiko
Contributor

elmiko commented Oct 7, 2020

I assume that's how it would work today. In AWS, if your ASGs lack the taint-related resource tags, the autoscaler assumes that the nodes will bear no taints. It will create instances for nodes to satisfy pods based on other constraints, but could mistakenly create an instance for a pod with no tolerations, and wind up with a node with taints that repel that pod.

makes sense to me, but i have a question.

if your ASGs lack the taint-related resource tags, the autoscaler assumes that the nodes will bear no taints.

does this mean that a user could not add taints after creation?

If you don't taint your nodes, though, of course this doesn't present a problem.

i guess this is the question, could we move forward with an initial implementation that does not contain taints?
and we would clearly signal to users that they are not included yet

@seh

seh commented Oct 7, 2020

could we move forward with an initial implementation that does not contain taints?
and we would clearly signal to users that they are not included yet

Yes, I think if there are no taints involved, and you either can't create nodes with taints or are warned of the dangers of doing so, then we'd still have something useful to those who don't need that capability.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 5, 2021
@unixfox

unixfox commented Jan 5, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 5, 2021
@elmiko
Contributor

elmiko commented Mar 10, 2021

for those following this effort, i have created #4283 based on a proof of concept i am working on.

@fabriziopandini
Member

@elmiko is #4283 going to replace this proposal? should we close this one?

@elmiko
Contributor

elmiko commented Mar 12, 2021

@fabriziopandini while i think there are some different ideas covered by this proposal, i am going to try and capture some of these ideas as future work in the other. so, i would be ok closing this pr, unless @michaelgugino has a strong objection.

@vincepri
Member

Hey folks, let's please dedupe in one PR

@elmiko
Contributor

elmiko commented Mar 23, 2021

i talked with @michaelgugino and he has no objection to closing this.
/close

@k8s-ci-robot
Contributor

@elmiko: Closed this PR.

In response to this:

i talked with @michaelgugino and he has no objection to closing this.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
