Merge pull request #1322 from yuvipanda/s3-terraform
Add scratch bucket functionality for AWS
yuvipanda authored May 25, 2022
2 parents 7d04ae9 + 39a74cb commit da5c92c
Showing 10 changed files with 241 additions and 24 deletions.
1 change: 0 additions & 1 deletion config/clusters/uwhackweeks/common.values.yaml
@@ -32,7 +32,6 @@ basehub:
            name: ICESat Hackweek
            url: https://icesat-2.hackweek.io
    singleuser:
      serviceAccountName: cloud-user-sa
      defaultUrl: /lab
      initContainers:
        # Need to explicitly fix ownership here, since EFS doesn't do anonuid
7 changes: 7 additions & 0 deletions config/clusters/uwhackweeks/prod.values.yaml
@@ -1,5 +1,12 @@
basehub:
  userServiceAccount:
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::740010314650:role/uwhackweeks-prod
  jupyterhub:
    singleuser:
      extraEnv:
        SCRATCH_BUCKET: s3://uwhackweeks-scratch/$(JUPYTERHUB_USER)
        PANGEO_SCRATCH: s3://uwhackweeks-scratch/$(JUPYTERHUB_USER)
    hub:
      config:
        GitHubOAuthenticator:
7 changes: 7 additions & 0 deletions config/clusters/uwhackweeks/staging.values.yaml
@@ -1,5 +1,12 @@
basehub:
  userServiceAccount:
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::740010314650:role/uwhackweeks-staging
  jupyterhub:
    singleuser:
      extraEnv:
        SCRATCH_BUCKET: s3://uwhackweeks-scratch-staging/$(JUPYTERHUB_USER)
        PANGEO_SCRATCH: s3://uwhackweeks-scratch-staging/$(JUPYTERHUB_USER)
    hub:
      config:
        GitHubOAuthenticator:
71 changes: 54 additions & 17 deletions docs/howto/features/cloud-access.md
@@ -9,9 +9,9 @@ improving the security posture of our hubs.
This page lists various features we offer around access to cloud resources,
and how to enable them.

## GCP
## How it works

### How it works
### GCP

On Google Cloud Platform, we use [Workload Identity](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity)
to map a particular [Kubernetes Service Account](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/)
@@ -21,8 +21,19 @@ as well as dask worker pods)
will have the permissions assigned to the Google Cloud Service Account.
This Google Cloud Service Account is managed via terraform.
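Concretely, Workload Identity surfaces this link as an annotation on the Kubernetes Service Account. A minimal sketch with hypothetical names (the real annotation comes from the `terraform output` described below):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: user-sa
  annotations:
    # Hypothetical values; take the real ones from `terraform output kubernetes_sa_annotations`
    iam.gke.io/gcp-service-account: example-hub@example-project.iam.gserviceaccount.com
```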

(howto:features:cloud-access:gcp:access-perms)=
### Enabling specific cloud access permissions
### AWS

On AWS, we use [IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html)
to map a particular [Kubernetes Service Account](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/)
to a particular [AWS IAM Role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html).
All pods using the Kubernetes Service Account (user's jupyter notebook pods
as well as dask worker pods)
will have the permissions assigned to the AWS IAM Role.
This AWS IAM Role is managed via terraform.
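As on GCP, the link is expressed as an annotation on the Kubernetes Service Account. A minimal sketch with a hypothetical role ARN:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: user-sa
  annotations:
    # Hypothetical value; take the real one from `terraform output kubernetes_sa_annotations`
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/<cluster-name>-<hub-name>
```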


(howto:features:cloud-access:access-perms)=
## Enabling specific cloud access permissions

1. In the `.tfvars` file for the project this hub is based in, create
(or modify) the `hub_cloud_permissions` variable. The config is
@@ -44,17 +55,17 @@ This Google Cloud Service Account is managed via terraform.
and the cluster name together can't be more than 29 characters. `terraform`
will complain if you go over this limit, so in general just use the name
of the hub and shorten it only if `terraform` complains.
2. `requestor_pays` enables permissions for user pods and dask worker
2. (GCP only) `requestor_pays` enables permissions for user pods and dask worker
pods to identify as the project while making requests to Google Cloud Storage
buckets marked as 'requestor pays'. More details [here](topic:features:cloud:gcp:requestor-pays).
3. `bucket_admin_access` lists bucket names (as specified in `user_buckets`
terraform variable) all users on this hub should have full read/write
access to. Used along with the [user_buckets](howto:features:cloud-access:gcp:storage-buckets)
terraform variable to enable the [scratch buckets](topic:features:cloud:gcp:scratch-buckets)
access to. Used along with the [user_buckets](howto:features:cloud-access:storage-buckets)
terraform variable to enable the [scratch buckets](topic:features:cloud:scratch-buckets)
feature.
3. `hub_namespace` is the full name of the hub, as hubs are put in Kubernetes
3. (GCP only) `hub_namespace` is the full name of the hub, as hubs are put in Kubernetes
Namespaces that are the same as their names. This is explicitly specified here
because `<hub-name-slug>` could possibly be truncated.
because `<hub-name-slug>` could possibly be truncated on GCP.
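
Putting these options together, a minimal sketch of `hub_cloud_permissions` (modeled on the `uwhackweeks` example later in this commit; on GCP you would additionally set `hub_namespace`) looks like:

```
hub_cloud_permissions = {
  "staging" : {
    requestor_pays : true,
    bucket_admin_access : ["scratch-staging"],
  }
}
```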

2. Run `terraform apply -var-file=projects/<cluster-var-file>.tfvars`, and look at the
plan carefully. It should only be creating or modifying IAM related objects (such as roles
@@ -69,12 +80,24 @@ This Google Cloud Service Account is managed via terraform.
4. Run `terraform output kubernetes_sa_annotations`; this should
show you a list of hubs and the annotation required to be set on them:

```{tabbed} GCP
<pre>
$ terraform output kubernetes_sa_annotations
{
  "prod" = "iam.gke.io/gcp-service-account: meom-ige-prod@meom-ige-cnrs.iam.gserviceaccount.com"
  "staging" = "iam.gke.io/gcp-service-account: meom-ige-staging@meom-ige-cnrs.iam.gserviceaccount.com"
}
</pre>
```

```{tabbed} AWS
<pre>
$ terraform output kubernetes_sa_annotations
{
  "prod" = "iam.gke.io/gcp-service-account: meom-ige-prod@meom-ige-cnrs.iam.gserviceaccount.com"
  "staging" = "iam.gke.io/gcp-service-account: meom-ige-staging@meom-ige-cnrs.iam.gserviceaccount.com"
  "prod" = "eks.amazonaws.com/role-arn: arn:aws:iam::740010314650:role/uwhackweeks-prod"
  "staging" = "eks.amazonaws.com/role-arn: arn:aws:iam::740010314650:role/uwhackweeks-staging"
}
</pre>
```

This shows all the annotations for all the hubs configured to provide cloud access
@@ -85,10 +108,20 @@ This Google Cloud Service Account is managed via terraform.

6. Specify the annotation from step 4, nested under `userServiceAccount.annotations`.

```yaml
```{tabbed} GCP
<pre>
userServiceAccount:
  annotations:
    iam.gke.io/gcp-service-account: meom-ige-staging@meom-ige-cnrs.iam.gserviceaccount.com
</pre>
```
```{tabbed} AWS
<pre>
userServiceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::740010314650:role/uwhackweeks-staging
</pre>
```
```{note}
@@ -98,10 +131,10 @@ This Google Cloud Service Account is managed via terraform.
7. Get this change deployed, and users should now be able to use the cloud permissions you have configured!
Currently running users might have to restart their pods for the change to take effect.
(howto:features:cloud-access:gcp:storage-buckets)=
### Creating storage buckets for use with the hub
(howto:features:cloud-access:storage-buckets)=
## Creating storage buckets for use with the hub
See [the relevant topic page](topic:features:cloud:gcp:scratch-buckets) for more information
See [the relevant topic page](topic:features:cloud:scratch-buckets) for more information
on why users want this!
1. In the `.tfvars` file for the project this hub is based in,
@@ -128,7 +161,7 @@ on why users want this!
very helpful for 'scratch' buckets that are temporary. Set to
`null` to disable this cleanup process.
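
For example, a sketch of `user_buckets` with one bucket cleaned up weekly and one kept forever (`persistent` is a hypothetical bucket name):

```
user_buckets = {
  "scratch-staging" : {
    "delete_after" : 7 # objects deleted 7 days after creation
  },
  "persistent" : {
    "delete_after" : null # never cleaned up
  }
}
```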

2. Enable access to these buckets from the hub by [editing `hub_cloud_permissions`](howto:features:cloud-access:gcp:access-perms)
2. Enable access to these buckets from the hub by [editing `hub_cloud_permissions`](howto:features:cloud-access:access-perms)
in the same `.tfvars` file. Follow all the steps listed there - this
should create the storage buckets and provide all users access to them!

@@ -142,9 +175,13 @@ on why users want this!
jupyterhub:
  singleuser:
    extraEnv:
      SCRATCH_BUCKET: gcs://<bucket-full-name>/$(JUPYTERHUB_USER)
      SCRATCH_BUCKET: <s3 or gcs>://<bucket-full-name>/$(JUPYTERHUB_USER)
      PANGEO_SCRATCH: <s3 or gcs>://<bucket-full-name>/$(JUPYTERHUB_USER)
```
```{note}
Use `s3` on AWS and `gcs` on GCP as the protocol part of these URLs
```
```{note}
If the hub is a `daskhub`, nest the config under a `basehub` key
```
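
For a `daskhub`, that nesting looks like the following sketch:

```yaml
basehub:
  jupyterhub:
    singleuser:
      extraEnv:
        SCRATCH_BUCKET: <s3 or gcs>://<bucket-full-name>/$(JUPYTERHUB_USER)
        PANGEO_SCRATCH: <s3 or gcs>://<bucket-full-name>/$(JUPYTERHUB_USER)
```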
8 changes: 4 additions & 4 deletions docs/topic/features.md
@@ -36,16 +36,16 @@ When this feature is enabled, users on a hub accessing cloud buckets from
other organizations marked as 'requestor pays' will increase our cloud bill.
Hence, this is an opt-in feature.

(topic:features:cloud:gcp:scratch-buckets)=
#### 'Scratch' Buckets on Google Cloud Storage
(topic:features:cloud:scratch-buckets)=
## 'Scratch' Buckets on object storage

Users often want one or more Google Cloud Storage [buckets](https://cloud.google.com/storage/docs/json_api/v1/buckets)
Users often want one or more object storage buckets
to store intermediate results, share big files with other users, or
to store raw data that should be accessible to everyone within the hub.
We can create one or more buckets and provide *all* users on the hub
*equal* access to these buckets, allowing users to create objects in them.
A single bucket can also be designated as a *scratch bucket*, which will
set a `SCRATCH_BUCKET` (and a deprecated `PANGEO_SCRATCH`) environment variable
of the form `gcs://<bucket-name>/<user-name>`. This can be used by individual
of the form `<s3 or gcs>://<bucket-name>/<user-name>`. This can be used by individual
users to store objects temporarily for their own use, although there is nothing
preventing other users from accessing these objects!
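
For instance, a user could stash intermediate results under their prefix from a notebook. A sketch, assuming the user image ships `fsspec` along with `s3fs` (AWS) or `gcsfs` (GCP):

```python
import os
import fsspec

# Expands to e.g. s3://uwhackweeks-scratch/<user-name> on AWS
scratch = os.environ["SCRATCH_BUCKET"]

# Write a temporary object under the per-user prefix
with fsspec.open(f"{scratch}/results.json", "w") as f:
    f.write('{"step": 1}')
```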
62 changes: 62 additions & 0 deletions terraform/aws/buckets.tf
@@ -0,0 +1,62 @@
resource "aws_s3_bucket" "user_buckets" {
for_each = var.user_buckets
bucket = "${var.cluster_name}-${each.key}"

}

resource "aws_s3_bucket_lifecycle_configuration" "user_bucket_expiry" {
for_each = var.user_buckets
bucket = "${var.cluster_name}-${each.key}"

dynamic "rule" {
for_each = each.value.delete_after != null ? [1] : []

content {
id = "delete-after-expiry"
status = "Enabled"

expiration {
days = each.value.delete_after
}
}
}
}

locals {
  # Nested for loop, thanks to https://www.daveperrett.com/articles/2021/08/19/nested-for-each-with-terraform/
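  # For example (hypothetical input):
  #   hub_cloud_permissions = { "staging" = { bucket_admin_access = ["scratch-staging"], ... } }
  # flattens to:
  #   [{ hub_name = "staging", bucket_name = "scratch-staging" }]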
  bucket_permissions = distinct(flatten([
    for hub_name, permissions in var.hub_cloud_permissions : [
      for bucket_name in permissions.bucket_admin_access : {
        hub_name    = hub_name
        bucket_name = bucket_name
      }
    ]
  ]))
}


data "aws_iam_policy_document" "bucket_access" {
for_each = { for bp in local.bucket_permissions : "${bp.hub_name}.${bp.bucket_name}" => bp }
statement {
effect = "Allow"
actions = ["s3:*"]
principals {
type = "AWS"
identifiers = [
aws_iam_role.irsa_role[each.value.hub_name].arn
]
}
resources = [
# Grant access only to the bucket and its contents
aws_s3_bucket.user_buckets[each.value.bucket_name].arn,
"${aws_s3_bucket.user_buckets[each.value.bucket_name].arn}/*"
]
}
}

resource "aws_s3_bucket_policy" "user_bucket_access" {

for_each = { for bp in local.bucket_permissions : "${bp.hub_name}.${bp.bucket_name}" => bp }
bucket = aws_s3_bucket.user_buckets[each.value.bucket_name].id
policy = data.aws_iam_policy_document.bucket_access[each.key].json
}
52 changes: 52 additions & 0 deletions terraform/aws/irsa.tf
@@ -0,0 +1,52 @@
data "aws_caller_identity" "current" {}

data "aws_partition" "current" {}

resource "aws_iam_role" "irsa_role" {
for_each = var.hub_cloud_permissions
name = "${var.cluster_name}-${each.key}"

assume_role_policy = data.aws_iam_policy_document.irsa_role_assume[each.key].json
}

data "aws_iam_policy_document" "irsa_role_assume" {
for_each = var.hub_cloud_permissions
statement {

effect = "Allow"

actions = ["sts:AssumeRoleWithWebIdentity"]

principals {
type = "Federated"

identifiers = [
"arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/${replace(data.aws_eks_cluster.cluster.identity[0].oidc[0].issuer, "https://", "")}"
]
}
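    # Only allow the `user-sa` Kubernetes ServiceAccount in this hub's
    # namespace to assume this role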
    condition {
      test     = "StringEquals"
      variable = "${replace(data.aws_eks_cluster.cluster.identity[0].oidc[0].issuer, "https://", "")}:sub"
      values = [
        "system:serviceaccount:${each.key}:user-sa"
      ]
    }
  }
}

output "kubernetes_sa_annotations" {
value = {
for k, v in var.hub_cloud_permissions :
k => "eks.amazonaws.com/role-arn: ${aws_iam_role.irsa_role[k].arn}"
}
description = <<-EOT
Annotations to apply to userServiceAccount in each hub to enable cloud permissions for them.
Helm, not terraform, control namespace creation for us. This makes it quite difficult
to create the appropriate kubernetes service account attached to the Google Cloud Service
Account in the appropriate namespace. Instead, this output provides the list of annotations
to be applied to the kubernetes service account used by jupyter and dask pods in a given hub.
This should be specified under userServiceAccount.annotations (or basehub.userServiceAccount.annotations
in case of daskhub) on a values file created specifically for that hub.
EOT
}
2 changes: 1 addition & 1 deletion terraform/aws/main.tf
@@ -2,7 +2,7 @@ terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
      version = "~> 4.15"
    }
  }
  backend "gcs" {
23 changes: 22 additions & 1 deletion terraform/aws/projects/uwhackweeks.tfvars
@@ -2,4 +2,25 @@ region = "us-west-2"

cluster_name = "uwhackweeks"

cluster_nodes_location = "us-west-2b"
cluster_nodes_location = "us-west-2b"

user_buckets = {
  "scratch-staging" : {
    "delete_after" : 7
  },
  "scratch" : {
    "delete_after" : 7
  }
}


hub_cloud_permissions = {
  "staging" : {
    requestor_pays : true,
    bucket_admin_access : ["scratch-staging"],
  },
  "prod" : {
    requestor_pays : true,
    bucket_admin_access : ["scratch"],
  }
}
32 changes: 32 additions & 0 deletions terraform/aws/variables.tf
@@ -18,3 +18,35 @@ variable "cluster_nodes_location" {
  Location of the nodes of the kubernetes cluster
  EOT
}

variable "user_buckets" {
type = map(object({ delete_after : number }))
default = {}
description = <<-EOT
GCS Buckets to be created.
The key for each entry will be prefixed with {var.prefix}- to form
the name of the bucket.
The value is a map, with 'delete_after' the only accepted key in that
map - it lists the number of days after which any content in the
bucket will be deleted. Set to null to not delete data.
EOT
}

variable "hub_cloud_permissions" {
type = map(object({ requestor_pays : bool, bucket_admin_access : set(string) }))
default = {}
description = <<-EOT
Map of cloud permissions given to a particular hub
Key is name of the hub namespace in the cluster, and values are particular
permissions users running on those hubs should have. Currently supported are:
1. requestor_pays: Identify as coming from the google cloud project when accessing
storage buckets marked as https://cloud.google.com/storage/docs/requester-pays.
This *potentially* incurs cost for us, the originating project, so opt-in.
2. bucket_admin_access: List of GCS storage buckets that users on this hub should have read
and write permissions for.
EOT
}
