Skip to content

Commit

Permalink
Allow public access to buckets via IAM, not ACL
Browse files Browse the repository at this point in the history
Previously, we were mixing ACL and IAM, which led to
basically the bucket *not* being accessible publicly - only to
authenticated users.

This switches everything to using IAM, which *does* make the
bucket properly publicly accessible.

In addition, there's now a policy that we only enable this
when 2i2c is *not* handling billing, as there can be
disastrous cost consequences.

We fix this for the LEAP bucket.

Config is moved to `user_buckets` rather than `hub_permissions`,
as the config is purely set on the bucket and not related to
which hub we are configuring.

Ref https://2i2c.freshdesk.com/a/tickets/954
  • Loading branch information
yuvipanda committed Nov 6, 2023
1 parent e6f32a1 commit 064c7bb
Show file tree
Hide file tree
Showing 6 changed files with 40 additions and 26 deletions.
28 changes: 25 additions & 3 deletions docs/howto/features/buckets.md
Original file line number Diff line number Diff line change
Expand Up @@ -73,9 +73,31 @@ on why users want this!

4. Get this change deployed, and users should now be able to use the buckets!
Currently running users might have to restart their pods for the change to take effect.


## Allowing access to buckets from outside the JupyterHub


## Allowing public, readonly to buckets from outside the JupyterHub

### GCP

Some hubs want to expose a particular bucket to the broad internet.
This can have catastrophic cost consequences, so we only allow this
on clusters where 2i2c is not paying the bill for.

This can be enabled by setting the `public_access` parameter in
`user_buckets` for the appropriate bucket, and running `terraform apply`.

Example:

```terraform
user_buckets = {
"persistent": {
"delete_after": null,
"public_access": true
}
}
```

## Allowing authenticated access to buckets from outside the JupyterHub

### GCP

Expand Down
3 changes: 0 additions & 3 deletions docs/howto/features/cloud-access.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,6 @@ This AWS IAM Role is managed via terraform.
"<hub-name-slug>": {
requestor_pays : true,
bucket_admin_access : ["bucket-1", "bucket-2"]
bucket_public_access : ["bucket-1"]
hub_namespace : "<hub-name>"
}
}
Expand All @@ -64,8 +63,6 @@ This AWS IAM Role is managed via terraform.
access to. Used along with the [user_buckets](howto:features:storage-buckets)
terraform variable to enable the [scratch buckets](topic:features:cloud:scratch-buckets)
feature.
4. `bucket_public_access` lists bucket names (as specified in `user_buckets`
terraform variable) that should be publicly accessible.
5. (GCP only) `hub_namespace` is the full name of the hub, as hubs are put in Kubernetes
Namespaces that are the same as their names. This is explicitly specified here
because `<hub-name-slug>` could possibly be truncated on GCP.
Expand Down
18 changes: 5 additions & 13 deletions terraform/gcp/buckets.tf
Original file line number Diff line number Diff line change
Expand Up @@ -55,14 +55,6 @@ locals {
]
]))

bucket_public_permissions = distinct(flatten([
for hub_name, permissions in var.hub_cloud_permissions : [
for bucket_name in permissions.bucket_public_access : {
hub_name = hub_name
bucket_name = bucket_name
}
]
]))
}

resource "google_storage_bucket_iam_member" "member" {
Expand All @@ -86,11 +78,11 @@ resource "google_storage_bucket_iam_member" "extra_admin_members" {
member = each.value.member
}

resource "google_storage_bucket_access_control" "public_rule" {
for_each = { for bp in local.bucket_public_permissions : "${bp.hub_name}.${bp.bucket_name}" => bp }
bucket = google_storage_bucket.user_buckets[each.value.bucket_name].name
role = "READER"
entity = "allUsers"
resource "google_storage_bucket_iam_member" "public_access" {
for_each = { for k, v in var.user_buckets : k => v if v.public_access }
bucket = google_storage_bucket.user_buckets[each.key].name
role = "roles/storage.objectViewer"
member = "allUsers"
}

output "buckets" {
Expand Down
7 changes: 4 additions & 3 deletions terraform/gcp/projects/leap.tfvars
Original file line number Diff line number Diff line change
Expand Up @@ -44,11 +44,13 @@ user_buckets = {
# For https://github.com/2i2c-org/infrastructure/issues/1230#issuecomment-1278183441
"persistent-ro" : {
"delete_after" : null,
"extra_admin_members" : ["group:[email protected]"]
"extra_admin_members" : ["group:[email protected]"],
"public_access" : true
},
"persistent-ro-staging" : {
"delete_after" : null,
"extra_admin_members" : ["group:[email protected]"]
"extra_admin_members" : ["group:[email protected]"],
"public_access" : true
}
}

Expand All @@ -63,7 +65,6 @@ hub_cloud_permissions = {
requestor_pays : true,
bucket_admin_access : ["scratch", "persistent"],
bucket_readonly_access : ["persistent-ro"],
bucket_public_access : ["persistent-ro"],
hub_namespace : "prod"
}
}
Expand Down
4 changes: 2 additions & 2 deletions terraform/gcp/projects/m2lines.tfvars
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,8 @@ user_buckets = {
},
"public-persistent" : {
"delete_after" : null,
"extra_admin_members" : ["group:[email protected]"]
"extra_admin_members" : ["group:[email protected]"],
"public_access" : true
},

}
Expand Down Expand Up @@ -115,7 +116,6 @@ hub_cloud_permissions = {
"prod" : {
requestor_pays : true,
bucket_admin_access : ["scratch", "persistent", "public-persistent"],
bucket_public_access : ["public-persistent"],
hub_namespace : "prod"
},
}
6 changes: 4 additions & 2 deletions terraform/gcp/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -240,7 +240,7 @@ variable "enable_network_policy" {
}

variable "user_buckets" {
type = map(object({ delete_after : number, extra_admin_members : optional(list(string), []) }))
type = map(object({ delete_after : number, extra_admin_members : optional(list(string), []), public_access : optional(bool, false) }))
default = {}
description = <<-EOT
GCS Buckets to be created.
Expand All @@ -258,6 +258,9 @@ variable "user_buckets" {
is primarily useful for moving data into and out of buckets from outside
the cloud. See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket_iam#member/members
for the format this would be specified in.
'public_access', if set to true, makes the bucket fully accessible to
the public internet, without any authentication.
EOT
}

Expand Down Expand Up @@ -379,7 +382,6 @@ variable "hub_cloud_permissions" {
requestor_pays : bool,
bucket_admin_access : set(string),
bucket_readonly_access : optional(set(string), []),
bucket_public_access : optional(set(string), []),
hub_namespace : string
})
)
Expand Down

0 comments on commit 064c7bb

Please sign in to comment.