gcp, dask-worker-nodes: pangeo-hubs to use single dask worker node type #3024

Merged: 4 commits, Aug 25, 2023
2 changes: 1 addition & 1 deletion config/clusters/pangeo-hubs/cluster.yaml
@@ -1,5 +1,5 @@
 name: pangeo-hubs
-provider: gcp # https://console.cloud.google.com/kubernetes/clusters/details/us-central1-b/pangeo-hubs-cluster/nodes?project=columbia
+provider: gcp # https://console.cloud.google.com/kubernetes/clusters/details/us-central1-b/pangeo-hubs-cluster/nodes?project=pangeo-integration-te-3eea
 account: columbia
 gcp:
   key: enc-deployer-credentials.secret.json
12 changes: 11 additions & 1 deletion terraform/gcp/main.tf
@@ -30,8 +30,18 @@ provider "google" {
   # Configuration reference:
   # https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#user_project_override
   #
+  # FIXME: Erik concluded that billing_project could be set to var.project_id at
+  # least for one cluster, but it required that the project where the
+  # cluster lived first enabled the GCP API: https://console.cloud.google.com/apis/library/cloudresourcemanager.googleapis.com
+  #
+  # So, we should probably not reference a new variable here, but enable
+  # the API for all our existing GCP projects and new GCP projects, and
+  # then reference var.project_id instead.
+  #
+  # But who knows, it's hard to understand what's going on.
+  #
   user_project_override = true
-  billing_project = "two-eye-two-see"
+  billing_project = var.billing_project_id
 }

 data "google_client_config" "default" {}
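
The FIXME above describes an alternative to the new variable: enable the Cloud Resource Manager API in each cluster's own project and bill API calls to that project. A minimal sketch of what that could look like, assuming the API is enabled through the provider's `google_project_service` resource. The resource label `cloudresourcemanager` is made up for illustration, and none of this is what the PR implements:

```hcl
# Sketch only, not part of this PR.
# Assumes var.project_id is the project the cluster lives in.
provider "google" {
  user_project_override = true
  billing_project       = var.project_id
}

# Hypothetical resource label; enables the Cloud Resource Manager API
# in the cluster's own project.
resource "google_project_service" "cloudresourcemanager" {
  project            = var.project_id
  service            = "cloudresourcemanager.googleapis.com"
  disable_on_destroy = false
}
```

Enabling the API from the same configuration may not work on a first apply, since the provider itself may already need it. Enabling it up front through the console link in the FIXME would avoid that ordering problem.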
68 changes: 17 additions & 51 deletions terraform/gcp/projects/pangeo-hubs.tfvars
@@ -2,21 +2,28 @@
 # -------------------------------------------------------------------------------
 #
 # The terraform state associated with this file is stored in a dedicated GCP
-# bucket, so in order to work with this file you need to do the following after
-# clearing a local .terraform folder.
+# bucket, so a new terraform backend has to be chosen. Also, you will need to
+# authenticate with a @columbia.edu account as our @2i2c.org accounts don't have
+# access.
 #
-#   terraform init -backend-config backends/pangeo-backend.hcl
-#   terraform workspace list
-#   terraform workspace select <...>
+# This can look something like this:
 #
-# The GCP project having the bucket is https://console.cloud.google.com/?project=columbia
+#   gcloud auth login --update-adc
+#
+#   cd terraform/gcp
+#   rm -rf .terraform
+#
+#   terraform init -backend-config backends/pangeo-backend.hcl
+#   terraform workspace select pangeo-hubs
+#
+#   terraform apply --var-file projects/pangeo-hubs.tfvars
 #

 prefix = "pangeo-hubs"
 project_id = "pangeo-integration-te-3eea"
+billing_project_id = "pangeo-integration-te-3eea"
 zone = "us-central1-b"
 region = "us-central1"
-core_node_machine_type = "n2-highmem-4"
+core_node_machine_type = "n2-highmem-8"
 enable_private_cluster = true

 # Multi-tenant cluster, network policy is required to enforce separation between hubs
@@ -94,52 +101,11 @@ notebook_nodes = {
 # A not yet fully established policy is being developed about using a single
 # node pool, see https://github.com/2i2c-org/infrastructure/issues/2687.
 #
-# TODO: Transition to a single n2-highmem-16 worker node pool to be able to
-# provide standardized worker pod config for all daskhubs.
-#
-# Tracked in https://github.com/2i2c-org/infrastructure/issues/2687
-#
-# The node pool to setup should look like this:
-#
-#   "worker" : {
-#     min : 0,
-#     max : 100,
-#     machine_type : "n2-highmem-16",
-#   },
-#
 dask_nodes = {
-  "small" : {
-    min : 0,
-    max : 100,
-    machine_type : "n1-standard-4",
-    labels : {},
-    gpu : {
-      enabled : false,
-      type : "",
-      count : 0
-    }
-  },
-  "medium" : {
+  "worker" : {
     min : 0,
     max : 100,
-    machine_type : "n1-standard-8",
-    labels : {},
-    gpu : {
-      enabled : false,
-      type : "",
-      count : 0
-    }
-  },
-  "large" : {
-    min : 0,
-    max : 100,
-    machine_type : "n1-standard-16",
-    labels : {},
-    gpu : {
-      enabled : false,
-      type : "",
-      count : 0
-    }
+    machine_type : "n2-highmem-16",
   },
 }

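
The new `worker` entry omits the `labels` and `gpu` keys that the old pools set explicitly, which suggests the `dask_nodes` variable supplies defaults for them. The actual declaration is not part of this diff. A sketch of how such a variable could be declared, assuming Terraform 1.3+ optional object attributes:

```hcl
# Hypothetical declaration, for illustration only; the real one is not
# shown in this diff and may differ.
variable "dask_nodes" {
  type = map(object({
    min          = number
    max          = number
    machine_type = string
    # Omitted keys fall back to the defaults given here.
    labels = optional(map(string), {})
    gpu = optional(object({
      enabled = optional(bool, false)
      type    = optional(string, "")
      count   = optional(number, 0)
    }), {})
  }))
  default = {}
}
```

With a declaration along these lines, a pool entry only needs `min`, `max`, and `machine_type`, which matches the shape of the single `worker` pool above.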
14 changes: 14 additions & 0 deletions terraform/gcp/variables.tf
@@ -23,6 +23,20 @@ variable "project_id" {
   EOT
 }

+variable "billing_project_id" {
+  type = string
+  default = "two-eye-two-see"
+  description = <<-EOT
+  This should be a GCP Project ID, not a GCP Billing Account ID as the name
+  indicates. It should point to a project that has a GCP API called Cloud Resource
+  Manager enabled. That can be enabled on a project via the link below:
+  https://console.cloud.google.com/apis/library/cloudresourcemanager.googleapis.com
+
+  What goes on here is confusing, see the comments about the confusion in main.tf
+  for more details.
+  EOT
+}
+
 variable "k8s_version_prefixes" {
   type = set(string)
   # Available minor versions are picked from the GKE regular release channel. To