Initial cluster Terraform configuration #306

Merged
merged 2 commits into kubernetes:master from tf-wip
Aug 23, 2019

Conversation

cblecker
Member

@cblecker cblecker commented Jul 11, 2019

Prerequisites:

Instructions:

credentials "app.terraform.io" {
  token = "XXXX"
}
  • Clone branch, cd into k8s-tf-cluster/ folder
  • terraform plan will print changes needed to create cluster
  • terraform apply will apply them
  • terraform destroy will destroy and clean up all created resources

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 11, 2019
@k8s-ci-robot k8s-ci-robot requested review from dims and nikhita July 11, 2019 00:30
@@ -0,0 +1,47 @@
FROM alpine:3.9 as installer
Contributor

A big part of me wishes that we built deterministic Docker images with Bazel, since all we're doing is extracting a known zip and setting some ENV vars.

But I'm fine with a Dockerfile for now.

}

data "google_container_engine_versions" "us-central1" {
project = data.google_project.project.id
Member

Based on the docs, there is no attribute id. Is Terraform trying to extract this information from the GCP API?

Member Author

Docs are inaccurate. Using a data resource for this confirms that A) the project exists and B) that you have access to that project.

This value resolves to the GCP project ID (e.g. k8s-infra-dev-cluster-turnup).
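
As a rough illustration of the pattern described in this thread, the sketch below reads the project through a data source and feeds the result into the versions lookup. The `project` variable declaration and the `location` argument are assumptions added for illustration, and the format of `id` may differ between provider versions, so treat this as a sketch rather than the exact code in the PR.

```hcl
// Assumed input variable; the tfvars file in this PR sets `project`.
variable "project" {
  type = string
}

// Reading the project through a data source fails early if the project
// does not exist or the caller has no access to it.
data "google_project" "project" {
  project_id = var.project
}

data "google_container_engine_versions" "us-central1" {
  // With the provider version discussed here, `id` resolves to the plain
  // project ID. Other provider versions may format it differently.
  project  = data.google_project.project.id
  location = "us-central1" // assumption: regional lookup
}
```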


variable "cluster_name" {
type = string
description = <<EOT


Just curious, why are you using heredocs for the descriptions?

Member Author

Allows for multi-line descriptions. Only one variable uses it, but it keeps things consistent.
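
For readers unfamiliar with the syntax, a heredoc description looks roughly like the sketch below. The description text itself is made up for illustration and is not taken from the PR.

```hcl
variable "cluster_name" {
  type        = string
  description = <<EOT
Name of the GKE cluster to create.
Illustrative second line, showing that the text can span several lines.
EOT
}
```

Using the same form for single-line descriptions costs two extra lines per variable, which is the consistency trade-off mentioned above.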

@hh
Member

hh commented Jul 24, 2019

@cblecker this might be of use for patterns: https://github.com/crosscloudci/cross-cloud

@hh
Member

hh commented Jul 24, 2019

/cc @denverwilliams
Could use some terraform eyes.

@k8s-ci-robot
Contributor

@hh: GitHub didn't allow me to request PR reviews from the following users: denverwilliams.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @denverwilliams
Could use some terraform eyes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hh
Member

hh commented Jul 24, 2019

/assign

@cblecker
Member Author

@hh let me know if there's anything specific from the GKE module you'd like to bring in. Having had a very quick look at it, it seems like a pretty basic template, but it forces certain things like a hardcoded admin password (we just disable this local account), hardcoded scopes, and forced use of a single node pool.

resource "google_container_node_pool" "pool-1" {
provider = google-beta

name_prefix = "pool-1-"
Member

I would suggest making this a variable.


// Set machine type, and enable all oauth scopes tied to the service account
node_config {
machine_type = "n1-standard-4"
Member

@ameukam ameukam Jul 24, 2019

I would suggest making this a variable.
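
One way the two suggestions above (for `name_prefix` and `machine_type`) might look is sketched here. The variable names `node_pool_name_prefix` and `node_machine_type` are hypothetical; the defaults are the values currently hardcoded in the diff.

```hcl
// Hypothetical variable names; defaults match the hardcoded values.
variable "node_pool_name_prefix" {
  type    = string
  default = "pool-1-"
}

variable "node_machine_type" {
  type    = string
  default = "n1-standard-4"
}

resource "google_container_node_pool" "pool-1" {
  provider = google-beta

  name_prefix = var.node_pool_name_prefix
  // cluster, location and other arguments omitted from this sketch

  node_config {
    machine_type = var.node_machine_type
  }
}
```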

}

resource "google_bigquery_dataset" "usage_metering" {
dataset_id = "usage_metering"
Member

We should include the name of the cluster in the id so we don't end up with the same dataset for different clusters.
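
A minimal sketch of that suggestion, assuming the existing `cluster_name` variable is used as the distinguishing part. BigQuery dataset IDs accept only letters, digits, and underscores, so the hyphens in a cluster name need replacing.

```hcl
resource "google_bigquery_dataset" "usage_metering" {
  // Hyphens are not valid in dataset IDs, so swap them for underscores.
  // Other arguments omitted from this sketch.
  dataset_id = "usage_metering_${replace(var.cluster_name, "-", "_")}"
}
```

With a cluster named `k8s-services-cluster`, this would yield the dataset ID `usage_metering_k8s_services_cluster`.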

}

// Create GKE cluster, but with no node pools. Node pools can be provisioned below
resource "google_container_cluster" "cluster" {
Member

@ameukam ameukam Jul 24, 2019

There is no NetworkPolicy provider deployed. Shouldn't one be deployed?
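
If the answer is yes, enabling network policy on the cluster resource could look something like the sketch below. It is not part of the PR; Calico is the provider value the Google Terraform provider documents for this block.

```hcl
resource "google_container_cluster" "cluster" {
  // Other arguments omitted from this sketch.

  // Turn on network policy enforcement for the cluster.
  network_policy {
    enabled  = true
    provider = "CALICO"
  }

  // The matching addon has to be enabled as well.
  addons_config {
    network_policy_config {
      disabled = false
    }
  }
}
```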

@ameukam
Member

ameukam commented Jul 24, 2019

I was able to run this PR. Awesome!

google_container_cluster.cluster: Creation complete after 8m42s [id=k8s-services-cluster]
google_container_node_pool.pool-1: Creating...
google_container_node_pool.pool-1: Still creating... [10s elapsed]
google_container_node_pool.pool-1: Still creating... [20s elapsed]
google_container_node_pool.pool-1: Still creating... [30s elapsed]
google_container_node_pool.pool-1: Still creating... [40s elapsed]
google_container_node_pool.pool-1: Still creating... [50s elapsed]
google_container_node_pool.pool-1: Still creating... [1m0s elapsed]
google_container_node_pool.pool-1: Still creating... [1m10s elapsed]
google_container_node_pool.pool-1: Still creating... [1m20s elapsed]
google_container_node_pool.pool-1: Still creating... [1m30s elapsed]
google_container_node_pool.pool-1: Still creating... [1m40s elapsed]
google_container_node_pool.pool-1: Still creating... [1m50s elapsed]
google_container_node_pool.pool-1: Still creating... [2m0s elapsed]
google_container_node_pool.pool-1: Creation complete after 2m8s [id=us-central1/k8s-services-cluster/pool-1-20190724224037646600000001]

Apply complete! Resources: 3 added, 0 changed, 0 destroyed.

It would be amazing to see the description of this PR in the README. Please also add the command that generates the application_default_credentials.json file.
Since I think this PR is a Terraform version of #250, can you add the workload_metadata_config?

Edit: Also, can you please specify in the README that the cluster is only accessible through gcloud shell?


// Start with a single node
initial_node_count = 1


I believe @thockin also wanted to default max pods per node to 32 (default_max_pods_per_node = 32)
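
A sketch of where that setting would go, assuming the cluster is defined with the `google-beta` provider as the node pool is. Per the provider documentation this option only works on clusters with IP aliasing (VPC-native) enabled, which is not shown here.

```hcl
resource "google_container_cluster" "cluster" {
  provider = google-beta // assumption: same provider as the node pool

  // Other arguments omitted from this sketch.

  // Start with a single node
  initial_node_count = 1

  // Cap pods per node; requires a VPC-native (IP alias) cluster.
  default_max_pods_per_node = 32
}
```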

@scottilee

Two things I saw on https://github.com/kubernetes/k8s.io/pull/250/files that aren't here:

  1. Discussion about whether or not to make this a private cluster. This can be done with:
private_cluster_config {
  enable_private_nodes = true
}
  2. @ameukam had done BigQuery setup; is that still relevant here?

@ameukam
Member

ameukam commented Jul 25, 2019

Two things I saw on https://github.com/kubernetes/k8s.io/pull/250/files that aren't here:

AFAIR we didn't have consensus about this option. Enabling it has some limitations and requires additional configuration.

  1. Discussion about whether or not to make this a private cluster. This can be done with:
private_cluster_config {
  enable_private_nodes = true
}
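
To make the "additional configuration" point concrete, a fuller private-cluster sketch might look like the following, written as a nested block, which is the form Terraform 0.12 expects. The CIDR range and the choice to keep the public endpoint are assumptions for illustration, and private clusters also need VPC-native networking, which is not shown.

```hcl
resource "google_container_cluster" "cluster" {
  // Other arguments omitted from this sketch.

  private_cluster_config {
    enable_private_nodes = true

    // Assumption: keep the public control-plane endpoint reachable.
    enable_private_endpoint = false

    // Hypothetical /28 range reserved for the control plane.
    master_ipv4_cidr_block = "172.16.0.0/28"
  }
}
```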

@scottilee

@cblecker Is there anything else I can do to help on this ticket? If it's just setting up a GKE cluster the majority of the work seems to be done. If there's other Terraform code you need written please let me know and I'll be happy to add it.

@spiffxp
Member

spiffxp commented Aug 1, 2019

What needs to be done / where do we need to get this to for this to reach "not perfect but let's merge and iterate"?

@spiffxp
Member

spiffxp commented Aug 7, 2019

@thockin to review this one last time, and we'll need to drop the [wip] before we can reach "merge and iterate" for this

@cblecker cblecker changed the title [WIP] Terraform GKE Config Initial cluster Terraform configuration Aug 21, 2019
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 21, 2019
Member

@vincepri vincepri left a comment

Few non-blocking comments, this looks great!

EOT
}

variable "region" {
Member

I'd rework the folder structure to something like clusters/<environment>/<region>/<name>/.... This has a few benefits: for example, we can set up and generate a terraform.tfvars (https://www.terraform.io/docs/configuration/variables.html#variable-definitions-tfvars-files) from the structure and automatically populate the variables.
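
As an illustration of that idea, a generated variable file under such a layout might look like the sketch below. The path is hypothetical; the values are ones that appear elsewhere in this PR. Terraform loads a file named `terraform.tfvars` from the working directory automatically.

```hcl
// Hypothetical file:
//   clusters/dev/us-central1/k8s-services-cluster/terraform.tfvars
// Region and cluster name mirror path segments in this layout.
project      = "k8s-infra-dev-cluster-turnup"
region       = "us-central1"
cluster_name = "k8s-services-cluster"
```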

}
```
- Ensure you are logged into your GCP account with `gcloud auth application-default login`
- `terraform plan` will print changes needed to create cluster
Member

Prefer running terraform plan -out=<path> and then applying the saved plan with terraform apply <path>, so that what gets applied is exactly what was reviewed.

@@ -0,0 +1,2 @@
project = "k8s-infra-dev-cluster-turnup"
Member

Is it fair to say that k8s-infra is a prefix, dev is the environment, and cluster-turnup is what gives the project its unique name? If so, I'd split these into multiple variables and document the naming convention for the project somewhere.
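
If that reading of the name is right, the split could be sketched as below. All three variable names are hypothetical, and the defaults are simply the pieces of the current project ID.

```hcl
// Hypothetical variables, one per part of the project name.
variable "project_prefix" {
  type    = string
  default = "k8s-infra"
}

variable "environment" {
  type    = string
  default = "dev"
}

variable "project_name" {
  type    = string
  default = "cluster-turnup"
}

locals {
  // Reassembles to "k8s-infra-dev-cluster-turnup" with the defaults.
  project = "${var.project_prefix}-${var.environment}-${var.project_name}"
}
```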

Member

It's a great suggestion! I think it's better to track this in an issue, since we have no real hierarchy or convention yet and it may take some time to implement.

@ameukam
Member

ameukam commented Aug 21, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 21, 2019
@cblecker
Member Author

/approve
/hold

@thockin Any objections to this as-is for the dev cluster? If you're good, feel free to remove the hold.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 21, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cblecker

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 21, 2019
@cblecker
Member Author

/hold cancel

Merging and we can iterate!

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 23, 2019
@k8s-ci-robot k8s-ci-robot merged commit 4dd90f2 into kubernetes:master Aug 23, 2019
@cblecker cblecker deleted the tf-wip branch August 23, 2019 20:01
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
10 participants