Enabling private nodes for GKE clusters #538

Merged Aug 4, 2021 (32 commits)
Changes from 26 commits
Commits
67a168b
Enable private nodes in GKE cluster
sgibson91 Jul 21, 2021
e170da4
Disable private endpoint in GKE cluster
sgibson91 Jul 21, 2021
f0b2ecb
Set a minimum version of terraform
sgibson91 Jul 27, 2021
07efc60
Add a boolean flag to enable private clusters
sgibson91 Jul 27, 2021
894a7e4
Dynamically provision cluster attribute blocks
sgibson91 Jul 27, 2021
bf19129
Add networking support for private clusters
sgibson91 Jul 27, 2021
fe42c83
Restructure code block so terraform lint passes
sgibson91 Jul 27, 2021
51664ee
Restructure code block so terraform lint passes
sgibson91 Jul 27, 2021
c5bd53d
Merge branch 'tf-private-nodes' of github.com:sgibson91/pilot-hubs in…
sgibson91 Jul 27, 2021
8a4d9c2
Add links to module docs in network.tf
sgibson91 Jul 28, 2021
dbf3bcd
Tweak condition for ip_allocation_policy in cluster.tf
sgibson91 Jul 28, 2021
47eacfa
Set some empty labels/tags to prevent "changed outside of terraform"
sgibson91 Jul 29, 2021
dd17497
Simplify network setup
sgibson91 Aug 2, 2021
8a16a9f
Update self_links to network and subnet
sgibson91 Aug 2, 2021
f941f0c
Fix linting error
sgibson91 Aug 2, 2021
e2c9169
Update variable docstring
sgibson91 Aug 2, 2021
39d9649
Add encrypted deployer SA for pangeo-hubs
sgibson91 Aug 2, 2021
554a386
Convert deployer key to JSON format instead of YAML
sgibson91 Aug 2, 2021
20d2a38
Merge branch 'master' into tf-private-nodes
sgibson91 Aug 2, 2021
51e1c0b
Remove terraform version pinning from main.tf
sgibson91 Aug 2, 2021
405ad13
Merge branch 'master' into tf-private-nodes
sgibson91 Aug 2, 2021
d41b1c5
Merge branch 'tf-private-nodes' of github.com:sgibson91/pilot-hubs in…
sgibson91 Aug 2, 2021
da2b3c5
Move network file under GCP dir
sgibson91 Aug 2, 2021
8cd8202
Shouldn't add the deployer key in this PR
sgibson91 Aug 2, 2021
333e141
Add source IP ranges to firewall rule
sgibson91 Aug 3, 2021
87b6fb6
Deploy a Cloud Router and Cloud NAT to handle outbound traffic
sgibson91 Aug 3, 2021
96ec6b9
Change name of firewall rule to be more descriptive
sgibson91 Aug 3, 2021
d3bc70c
Use name attribute instead of self_link for consistency
sgibson91 Aug 3, 2021
a3a17c6
Document source ranges for IAP SSH ingress firewall rule
sgibson91 Aug 3, 2021
632048d
Use the project's default network and subnetwork
sgibson91 Aug 3, 2021
4eccd46
Explicitly set more "flappy" variables
sgibson91 Aug 3, 2021
48d86ee
Update doc for enable_private_cluster
yuvipanda Aug 4, 2021
3 changes: 3 additions & 0 deletions terraform/gcp/buckets.tf
@@ -7,6 +7,9 @@ resource "google_storage_bucket" "user_buckets" {
  name     = "${var.prefix}-${each.key}"
  location = var.region
  project  = var.project_id

  // Set these values explicitly so they don't "change outside terraform"
  labels = {}
}

resource "google_storage_bucket_iam_member" "member" {
39 changes: 38 additions & 1 deletion terraform/gcp/cluster.tf
@@ -9,6 +9,32 @@ resource "google_container_cluster" "cluster" {
  initial_node_count       = 1
  remove_default_node_pool = true

  // For private clusters, pass the self links of the network and subnetwork
  // defined in network.tf
  network    = var.enable_private_cluster ? google_compute_network.vpc_network[0].self_link : null
  subnetwork = var.enable_private_cluster ? google_compute_subnetwork.subnetwork[0].self_link : null

  // Dynamically provision the private cluster config when deploying a
  // private cluster
  dynamic "private_cluster_config" {
    // enable_private_cluster is a bool, so use it directly as the condition
    for_each = var.enable_private_cluster ? [1] : []

    content {
      // Decide if this CIDR block is sensible or not
      master_ipv4_cidr_block = "172.16.0.0/28"

Review comment (Member): Can we leave this unset?
Reply (sgibson91, Member Author, Aug 3, 2021): Alas, no :( #538 (comment)

      enable_private_nodes    = true
      enable_private_endpoint = false
    }
  }

  // Dynamically provision the IP allocation policy when deploying a
  // private cluster. This allows for IP aliasing and makes the cluster
  // VPC-native
  dynamic "ip_allocation_policy" {
    for_each = var.enable_private_cluster ? [1] : []
    content {}
  }

  addons_config {
    http_load_balancing {
      // FIXME: This used to not work well with websockets, and
@@ -58,6 +84,9 @@ resource "google_container_cluster" "cluster" {
    # DO NOT TOUCH THIS BLOCK, IT REPLACES ENTIRE CLUSTER LOL
    service_account = google_service_account.cluster_sa.email
  }

  // Set these values explicitly so they don't "change outside terraform"
  resource_labels = {}
}

resource "google_container_node_pool" "core" {
@@ -97,6 +126,9 @@ resource "google_container_node_pool" "core" {
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]

    // Set these values explicitly so they don't "change outside terraform"
    tags = []
  }
}

@@ -155,6 +187,9 @@ resource "google_container_node_pool" "notebook" {
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]

    // Set these values explicitly so they don't "change outside terraform"
    tags = []
  }
}

@@ -218,6 +253,8 @@ resource "google_container_node_pool" "dask_worker" {
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]

    // Set these values explicitly so they don't "change outside terraform"
    tags = []
  }
}
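
The `dynamic` blocks in cluster.tf switch a nested block on or off from a single bool. The sketch below isolates that pattern. It is not code from this PR: the `master_ipv4_cidr_block` variable, the cluster name, and the location are hypothetical, and the PR itself hard-codes the control-plane range.

```hcl
variable "enable_private_cluster" {
  type    = bool
  default = false
}

# Hypothetical variable: the PR hard-codes this range instead.
variable "master_ipv4_cidr_block" {
  type        = string
  default     = "172.16.0.0/28"
  description = "Private /28 range for the GKE control plane; must not overlap the subnet range."
}

resource "google_container_cluster" "example" {
  name               = "example-cluster" # hypothetical
  location           = "us-central1"     # hypothetical
  initial_node_count = 1

  dynamic "private_cluster_config" {
    # A one-element list renders the block once; an empty list omits it.
    for_each = var.enable_private_cluster ? [1] : []

    content {
      master_ipv4_cidr_block  = var.master_ipv4_cidr_block
      enable_private_nodes    = true
      enable_private_endpoint = false
    }
  }

  dynamic "ip_allocation_policy" {
    # Private clusters must be VPC-native, so this block follows the same flag.
    for_each = var.enable_private_cluster ? [1] : []
    content {}
  }
}
```

Because both blocks key off the same flag, a plan with `enable_private_cluster = false` should contain neither block.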

62 changes: 62 additions & 0 deletions terraform/gcp/network.tf
@@ -0,0 +1,62 @@
/**
 * Networking to support private clusters
 *
 * This config is only deployed when the enable_private_cluster variable is set
 * to true
 */

resource "google_compute_network" "vpc_network" {
  count = var.enable_private_cluster ? 1 : 0

  name                    = "${var.prefix}-vpc-network"
  project                 = var.project_id
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "subnetwork" {
  count = var.enable_private_cluster ? 1 : 0

  name                     = "${var.prefix}-subnetwork"
  project                  = var.project_id
  region                   = var.region
  network                  = google_compute_network.vpc_network[0].id
  private_ip_google_access = true

  // Decide if this is a sensible IP CIDR range or not
  ip_cidr_range = "10.2.0.0/16"
}

resource "google_compute_firewall" "firewall_rules" {
  count = var.enable_private_cluster ? 1 : 0

  name    = "allow-ssh"
  project = var.project_id
  network = google_compute_network.vpc_network[0].name

  allow {
    protocol = "tcp"
    ports    = ["22"]
  }

  source_ranges = ["35.235.240.0/20"]
}

resource "google_compute_router" "router" {

Review comment (Member): I tried to see if there exists a default router we can use too, and based on reading the docs and running gcloud compute routers list I've come to the conclusion we have to create our own. So this LGTM

  count = var.enable_private_cluster ? 1 : 0

  name    = "${var.prefix}-router"
  project = var.project_id
  region  = var.region
  network = google_compute_network.vpc_network[0].id
}

resource "google_compute_router_nat" "nat" {
  count = var.enable_private_cluster ? 1 : 0

  name                               = "${var.prefix}-router-nat"
  project                            = var.project_id
  region                             = var.region
  router                             = google_compute_router.router[0].name
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}
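
The source range in the firewall rule above, 35.235.240.0/20, is the range Google documents for Identity-Aware Proxy (IAP) TCP forwarding, which the later commits "Change name of firewall rule to be more descriptive" and "Document source ranges for IAP SSH ingress firewall rule" appear to address. One way such a rule might be made self-describing is sketched below. The local value, resource label, rule name, and description text are assumptions, not the wording that was merged, and the fragment relies on the variables and VPC resource already defined in this PR.

```hcl
locals {
  # Source range Google documents for IAP TCP forwarding.
  iap_tcp_forwarding_range = "35.235.240.0/20"
}

# Hypothetical resource label and rule name; the merged names may differ.
resource "google_compute_firewall" "iap_ssh_ingress" {
  count = var.enable_private_cluster ? 1 : 0

  name        = "${var.prefix}-allow-iap-ssh"
  project     = var.project_id
  network     = google_compute_network.vpc_network[0].name
  description = "Allow SSH to private nodes only through IAP TCP forwarding"

  allow {
    protocol = "tcp"
    ports    = ["22"]
  }

  source_ranges = [local.iap_tcp_forwarding_range]
}
```

Naming the range in a local keeps the reason for the CIDR next to its value, so a reviewer does not have to look it up.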
3 changes: 3 additions & 0 deletions terraform/gcp/registry.tf
@@ -10,4 +10,7 @@ resource "google_artifact_registry_repository" "registry" {
  repository_id = "${var.prefix}-registry"
  format        = "DOCKER"
  project       = var.project_id

  // Set these values explicitly so they don't "change outside terraform"
  labels = {}
}
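
This PR pins `labels`, `resource_labels`, and `tags` to empty values so they don't show up as "changed outside terraform". A different approach, not used in this PR, is to tell Terraform to ignore those attributes with a `lifecycle` block. The sketch below is illustrative only; the resource label and all values are hypothetical.

```hcl
# Sketch only: resource label and values are hypothetical.
resource "google_storage_bucket" "example" {
  name     = "example-prefix-example-bucket"
  location = "us-central1"
  project  = "example-project"

  lifecycle {
    # Tolerate labels set outside Terraform instead of pinning them to {}.
    ignore_changes = [labels]
  }
}
```

The trade-off is that pinning to an empty value makes Terraform plan to remove labels added elsewhere, while `ignore_changes` leaves them in place and unmanaged.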
13 changes: 13 additions & 0 deletions terraform/gcp/variables.tf
@@ -171,3 +171,16 @@ variable "user_buckets" {
  default     = []
  description = "Buckets to create for the project, they will be prefixed with {var.prefix}-"
}

variable "enable_private_cluster" {
  type        = bool
  default     = false
  description = <<-EOT
  Enable deployment of GKE into a private cluster.

  For projects that are managed by universities and such, we may find that they
  have enabled certain constraints and controls that mean our usual method of
  deployment fails. Enabling a private cluster tends to satisfy many of these
  controls.
  EOT
}
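
Since `enable_private_cluster` defaults to `false`, existing deployments are unaffected until a project opts in. A minimal sketch of opting in through a variables file follows. The file name and every value are hypothetical; `prefix`, `project_id`, and `region` are the variables the resources in this PR already reference.

```hcl
# Hypothetical file: projects/example-hub.tfvars
prefix                 = "example-hub"
project_id             = "example-project"
region                 = "us-central1"
enable_private_cluster = true
```

Running `terraform plan -var-file=projects/example-hub.tfvars` should then show the VPC network, subnetwork, firewall rule, router, and NAT being created alongside the cluster's private configuration.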