pkg/asset: Add asset for Worker machinesets #468

Merged

abhinavdahiya
Contributor

/cc @wking

This allows combining machine-configuration values from several sources:
1. Defaults decided by installer
2. Platform level defaults in InstallConfig
3. Per pool options.
@abhinavdahiya
Contributor Author

/cc @crawford @trawler

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 15, 2018
@abhinavdahiya abhinavdahiya changed the title pkg/asset: Add assest for Worker machinesets pkg/asset: Add asset for Worker machinesets Oct 15, 2018
@abhinavdahiya
Contributor Author

abhinavdahiya commented Oct 15, 2018

The workers did not come up because machine creation on AWS fails with this error:

error creating EC2 instance: InvalidParameterValue: Duplicate tag
        key 'kubernetes.io/cluster/ci-op-nn8nk5b5-1d3f3' specified.\n\tstatus code:
        400, request id: a22b595d-6be4-4fe1-b358-c50029d330f9

The machine object:

apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  creationTimestamp: 2018-10-15T23:02:01Z
  finalizers:
  - machine.cluster.k8s.io
  generateName: ci-op-nn8nk5b5-1d3f3-worker-0-
  generation: 1
  labels:
    sigs.k8s.io/cluster-api-cluster: ci-op-nn8nk5b5-1d3f3
    sigs.k8s.io/cluster-api-machine-role: worker
    sigs.k8s.io/cluster-api-machine-type: worker
    sigs.k8s.io/cluster-api-machineset: worker
  name: ci-op-nn8nk5b5-1d3f3-worker-0-jjpq4
  namespace: openshift-cluster-api
  ownerReferences:
  - apiVersion: cluster.k8s.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: MachineSet
    name: ci-op-nn8nk5b5-1d3f3-worker-0
    uid: 42eb4d20-d0ce-11e8-a52e-0a580a020206
  resourceVersion: "275"
  selfLink: /apis/cluster.k8s.io/v1alpha1/namespaces/openshift-cluster-api/machines/ci-op-nn8nk5b5-1d3f3-worker-0-jjpq4
  uid: 5321c7b1-d0ce-11e8-a52e-0a580a020206
spec:
  metadata:
    creationTimestamp: null
  providerConfig:
    ValueFrom: null
    value:
      ami:
        filters:
        - name: name
          values:
          - rhcos*
        - name: architecture
          values:
          - x86_64
        - name: virtualization-type
          values:
          - hvm
        - name: owner-id
          values:
          - "531415883065"
        - name: image-type
          values:
          - machine
        - name: state
          values:
          - available
      apiVersion: aws.cluster.k8s.io/v1alpha1
      iamInstanceProfile:
        id: ci-op-nn8nk5b5-1d3f3-worker-profile
      instanceType: m2.medium
      kind: AWSMachineProviderConfig
      placement:
        region: us-east-1
      securityGroups:
      - filters:
        - name: tag:Name
          values:
          - ci-op-nn8nk5b5-1d3f3_worker_sg
      subnet:
        filters:
        - name: tag:Name
          values:
          - ci-op-nn8nk5b5-1d3f3-worker-*
      tags:
      - name: expirationDate
        value: 2018-10-16T02:52+0000
      - name: kubernetes.io/cluster/ci-op-nn8nk5b5-1d3f3
        value: owned
      - name: tectonicClusterID
        value: 8c846f3f-db33-4707-bb01-e8604f6976f3
      userDataSecret:
        name: worker-user-data
  versions:
    kubelet: ""
status:
  lastUpdated: 2018-10-15T23:12:21Z
  providerStatus:
    apiVersion: aws.cluster.k8s.io/v1alpha1
    conditions:
    - lastProbeTime: 2018-10-15T23:12:21Z
      lastTransitionTime: 2018-10-15T23:02:06Z
      message: "error creating EC2 instance: InvalidParameterValue: Duplicate tag
        key 'kubernetes.io/cluster/ci-op-nn8nk5b5-1d3f3' specified.\n\tstatus code:
        400, request id: a22b595d-6be4-4fe1-b358-c50029d330f9"
      reason: MachineCreationFailed
      status: "True"
      type: MachineCreation
    instanceId: null
    instanceState: null
    kind: AWSMachineProviderStatus

It looks like this code https://github.com/openshift/cluster-api-provider-aws/blob/master/cloud/aws/actuators/machine/actuator.go#L337-L345 is not deduplicating the tags.

/cc @bison


tags := map[string]string{
"tectonicClusterID": ic.ClusterID,
fmt.Sprintf("kubernetes.io/cluster/%s", ic.ObjectMeta.Name): "owned",
Contributor

Not sure what is more correct here. The actuator needs to ensure the tag is there, so you shouldn't need to set it here, and ideally the actuator would catch the duplicate. I'd just drop it here and file a request for the cloud team to check for dupes.

@abhinavdahiya
Contributor Author

apiVersion: cluster.k8s.io/v1alpha1
kind: MachineSet
metadata:
name: {{.ClusterName}}-worker-0
Contributor

What's up with the -0 suffix on these? This will be copied into generateName by the controller, so the nodes will end up with names like openshift-worker-0-abdef.

Contributor Author

@abhinavdahiya abhinavdahiya Oct 16, 2018

What's up with the -0 suffix on these

There are going to be multiple machinesets (on AWS, one per AZ).

so the nodes will end up with names like openshift-worker-0-abdef

I think the node name is not dictated by the machine object name. It will be whatever the kubelet decides: on AWS it is usually the internal DNS name, on libvirt the host name.

From the ci:

NAME                           STATUS    ROLES       AGE       VERSION
ip-10-0-13-26.ec2.internal     Ready     master      10m       v1.11.0+d4cacc0
ip-10-0-14-47.ec2.internal     Ready     bootstrap   10m       v1.11.0+d4cacc0
ip-10-0-151-203.ec2.internal   Ready     worker      5m        v1.11.0+d4cacc0
ip-10-0-152-41.ec2.internal    Ready     worker      5m        v1.11.0+d4cacc0
ip-10-0-155-96.ec2.internal    Ready     worker      4m        v1.11.0+d4cacc0
ip-10-0-16-202.ec2.internal    Ready     master      10m       v1.11.0+d4cacc0
ip-10-0-39-160.ec2.internal    Ready     master      10m       v1.11.0+d4cacc0

Contributor

Yeah, forgot about the multiple AZ thing.

So, yeah, this is kind of confusing. The node object's name is determined by what the kubelet registers. The machine object's name is this string copied into generateName, so basically this with some extra characters.

@bison
Contributor

bison commented Oct 16, 2018

I think @trawler is working on the duplicate tags issue.

@wking
Member

wking commented Oct 16, 2018

From 16e86c941:

This depends on the cluster-apiserver to be deployed by the machine-api-operator...

I still have a pretty fuzzy grasp of how all these operators fit together :p. We could push most of these without waiting on the cluster-API server, right? We just can't push the kind: Cluster template without the cluster-API server.

Also, do we want to address "worker" vs. "compute" for new-to-this-PR names?

@abhinavdahiya
Contributor Author

We could push most of these without waiting on the cluster-API server, right? We just can't push the kind: Cluster template without the cluster-API server.

The api for all these objects is provided by the aggregated apiserver installed by MAO.

Also, do we want to address "worker" vs. "compute" for new-to-this-PR names?

We still call them workers.

filters:
- name: "name"
values:
- rhcos*
Member

We don't want all of this filtering, do we? I think we want to calculate the AMI ID in Go and inject it directly, like we're currently doing over here. That will also make it easier to allow callers to override the AMI ID if/when we restore support for that.

Contributor Author

InstallConfig does not have AMI override. So we don't support that for workers.

Member

InstallConfig does not have AMI override. So we don't support that for workers.

But we do support AMI lookup in Go. And we use that lookup to set the master/bootstrap AMI ID here. I think we want to use the same AMI for workers, which means we should pull that lookup out to a higher level so we don't have to perform it twice. InstallConfig seems like a reasonable place to put it, but I'm ok if you want to stick it somewhere else for the time being. I don't think we need to supply a user prompt for it.

Contributor Author

I think we want to use the same AMI for workers.

That's not true. New workers created by scaling the machineset should not be stuck with old AMIs.
Bootstrap and master are separate because we create them.
But in future we want masters to be adopted by something similar.

Member

I think we want to use the same AMI for workers.

That's not true. New workers created by scaling the machineset should not be stuck with old AMIs.
Bootstrap and master are separate because we create them.

And with this PR, we're effectively creating the workers as well.

But in future we want masters to be adopted by something similar.

Bootstrap doesn't matter, since it's a throw-away node. I expect we'll want a machine-set for masters too (or is there a different approach to scaling masters?). And however we handle rolling updates for workers, I expect we'll want the same handling for masters. Isn't that what the machine-config operator is going to handle? Can't it update the machine-sets when it decides to bump to a new AMI? For now, I thought we'd want to punt on all of this and just pin to the creation-time AMI for both masters and workers.

Contributor Author

Isn't that what the machine-config operator is going to handle? Can't it update the machine-sets when it decides to bump to a new AMI?

It doesn't control machine characteristics, so no.

I thought we'd want to punt on all of this and just pin to the creation-time AMI for both masters and workers.

I would still want the installer not to choose the AMI for workers.

Member

@wking wking Oct 16, 2018

Isn't that what the machine-config operator is going to handle? Can't it update the machine-sets when it decides to bump to a new AMI?

It doesn't control machine characteristics, so no.

Who is (or will be?) in charge of rolling AMI updates out across masters/workers? Or maybe with #281 in the pipe, nobody will care about AMIs at all?

I thought we'd want to punt on all of this and just pin to the creation-time AMI for both masters and workers.

I would still want the installer not to choose the AMI for workers.

But you're fine having it choose the AMI for the masters? They feel like the same thing to me. Would you rather we drop the AMI lookup from Go and just perform the lookup via tags? See also openshift/os#353 about long-term issues with the tag-based approach and release channels. It's not clear to me how the approach you have here will transition to the eventual non-tag AMI lookup.

Contributor Author

Who is (or will be?) in charge of rolling AMI updates out across masters/workers?

That is TBD.

But you're fine having it choose the AMI for the masters?

I don't want to do that for masters either, but we can't avoid it now: for various reasons we cannot run the cluster-api stack during bootstrap to give us masters.

Contributor Author

Switched to specific AMI to keep consistency with current MAO implementation.

pods:
cidrBlocks:
- {{.PodCIDR}}
serviceDomain: unused
Member

We've had this since openshift/machine-api-operator@04253cb9, but I'm not clear on why we aren't using our base domain or some such. @enxebre, do you remember why you chose this value?

Member

None of the cluster object's fields are actually used. However, the cluster object is currently coupled into the actuator interface upstream, so we need it to exist for now: https://github.com/kubernetes-sigs/cluster-api/blob/master/pkg/controller/machine/actuator.go#L25

kind: LibvirtMachineProviderConfig
domainMemory: 2048
domainVcpu: 2
ignKey: /var/lib/libvirt/images/worker.ign
Member

Can we make this just worker.ign? More on why in openshift/machine-api-operator#70, in case you want any fodder for the explanatory commit message (although we probably don't have time to explain everything in these big "drop in a bunch of stuff which was worked up somewhere else" commits).

Contributor Author

I'm fine with changing this to worker.ign.

Long term I want consistency between libvirt and AWS, as AWS uses a secret to specify the userdata.

Contributor Author

@abhinavdahiya abhinavdahiya Oct 16, 2018

@wking this doesn't work.

The actuator doesn't support this. Running locally:

Coud not create libvirt machine: error creating domain: error creating libvirt domain: virError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: 2018-10-16T17:59:15.773168Z qemu-system-x86_64: -fw_cfg name=opt/com.coreos/config,file=worker.ign: can't load worker.ign')

@abhinavdahiya abhinavdahiya force-pushed the worker_machinesets branch 3 times, most recently from a4e70fb to b9ef603 Compare October 16, 2018 18:54
Add assets that can create the machinesets for workers.
This depends on the cluster-apiserver to be deployed by the machine-api-operator,
so the tectonic.sh will block on creating these machinesets objects until then.

Using specific AMI for worker machinesets to keep consistency with
current MAO implementation.

ctx, cancel := context.WithTimeout(context.TODO(), 60*time.Second)
defer cancel()
ami, err := rhcos.AMI(ctx, rhcos.DefaultChannel, config.Region)
Member

This duplicates the later lookup for master/bootstrap AMIs. Are we ok with the redundant request (and possible race coming up with two different AMIs), or do we want to pull this lookup back to a separate location where it can be shared by both consumers?

Contributor Author

This is no different from what we have today in MAO, here

Contributor Author

Also, when we move masters to machinesets, it will end up in the same location.

@wking
Member

wking commented Oct 16, 2018

I'll follow up on the /var/lib/libvirt stuff later.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 16, 2018
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [abhinavdahiya,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 079fe51 into openshift:master Oct 16, 2018
wking added a commit to wking/openshift-installer that referenced this pull request Oct 20, 2018
Adds ClusterK8sIO from e2dc955 (pkg/asset: add ClusterK8sIO,
machines.Worker assets, 2018-10-15, openshift#468) and Master from 586ad45
(pkg/asset: Add asset for Master machines, 2018-10-18, openshift#491).  Removes
KubeCoreOperator from c9b0e2f (manifests: stop using kube core
operator, 2018-10-08, openshift#420).

Generated with:

  $ openshift-install graph | dot -Tsvg >docs/design/resource_dep.svg

using:

  $ dot -V
  dot - graphviz version 2.30.1 (20170916.1124)
wking added a commit to wking/openshift-installer that referenced this pull request Oct 24, 2018
We'd added this in 4d636d3 (asset/manifests: bootstrap manifest
generation, 2018-08-31, openshift#286) to support the machine-API operator
which had been generating worker machine-sets.  But since e2dc955
(pkg/asset: add ClusterK8sIO, machines.Worker assets, 2018-10-15, openshift#468),
we've been creating those machine-sets ourselves.  And the machine-API
operator dropped their consuming code in
openshift/machine-api-operator@c59151f6 (delete machine/cluster object
loops, 2018-10-22, openshift/machine-api-operator#286).

Dropping this dependency fixes bootstrap Ignition config edits during
a multi-step deploy.  For example, with:

  $ openshift-install --dir=wking create ignition-configs
  $ tree wking
  wking
  ├── bootstrap.ign
  ├── master-0.ign
  └── worker.ign

before this commit, any edits to bootstrap.ign were clobbered in a
subsequent 'create cluster' call, because:

1. The bootstrap Ignition config depends on the manifests.
2. The manifests depended on the worker Ignition config.
3. The worker Ignition config is on disk, so it gets marked dirty.
wking added a commit to wking/openshift-installer that referenced this pull request Dec 17, 2018
Workers have not had public IPs since (at least) we moved them to
cluster-API creation in e2dc955 (pkg/asset: add ClusterK8sIO,
machines.Worker assets, 2018-10-15, openshift#468).  But it turns out a number
of e2e tests assume SSH access to workers (e.g. [1]), and we don't
have time to fix those tests now.  We'll remove this once the tests
have been fixed.

[1]: https://github.com/kubernetes/kubernetes/blob/v1.13.1/test/e2e/node/ssh.go#L43