How to write to scratch and persistent buckets on a hub? #3639

Closed
sgibson91 opened this issue Jan 25, 2024 · 7 comments · Fixed by #3663
@sgibson91 (Member) commented Jan 25, 2024

Context

I was asked via support to set up scratch and persistent buckets for the showcase hub, and I did so in:

There was some confusion in the ticket because Jenny was expecting me to send her a CLIENT_KEY and CLIENT_SECRET according to https://docs.2i2c.org/user/topics/data/cloud/, but following https://infrastructure.2i2c.org/howto/features/buckets/ I found no mention of keys and secrets, and nothing I did generated any.

As far as I am aware, adding the bucket names to the bucket_admin_access section of the hub_cloud_permissions variable in terraform grants read/write permission to the bucket through a role that is passed to the hub via a kubernetes annotation. Every user server gets this annotation, and reading/writing the bucket should "just work". However, Jenny has reported that she is unable to write to the bucket:

import os
import xarray as xr

SCRATCH_BUCKET = os.environ['SCRATCH_BUCKET']
ds = xr.tutorial.open_dataset("rasm")  # load example data
ds.to_zarr(f'{SCRATCH_BUCKET}/rasm.zarr')  # write data

Refreshing temporary credentials failed during mandatory refresh period.
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py", line 327, in _protected_refresh
    metadata = await self._refresh_using()
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py", line 385, in fetch_credentials
    return await self._get_cached_credentials()
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py", line 395, in _get_cached_credentials
    response = await self._get_credentials()
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py", line 469, in _get_credentials
    return await client.assume_role_with_web_identity(**kwargs)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/client.py", line 371, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
File /srv/conda/envs/notebook/lib/python3.10/site-packages/s3fs/core.py:113, in _error_wrapper(func, args, kwargs, retries)
    112 try:
--> 113     return await func(*args, **kwargs)
    114 except S3_RETRYABLE_ERRORS as e:

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/client.py:354, in AioBaseClient._make_api_call(self, operation_name, api_params)
    353     apply_request_checksum(request_dict)
--> 354     http, parsed_response = await self._make_request(
    355         operation_model, request_dict, request_context
    356     )
    358 await self.meta.events.emit(
    359     'after-call.{service_id}.{operation_name}'.format(
    360         service_id=service_id, operation_name=operation_name
   (...)
    365     context=request_context,
    366 )

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/client.py:379, in AioBaseClient._make_request(self, operation_model, request_dict, request_context)
    378 try:
--> 379     return await self._endpoint.make_request(
    380         operation_model, request_dict
    381     )
    382 except Exception as e:

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/endpoint.py:96, in AioEndpoint._send_request(self, request_dict, operation_model)
     95 self._update_retries_context(context, attempts)
---> 96 request = await self.create_request(request_dict, operation_model)
     97 success_response, exception = await self._get_response(
     98     request, operation_model, context
     99 )

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/endpoint.py:84, in AioEndpoint.create_request(self, params, operation_model)
     81     event_name = 'request-created.{service_id}.{op_name}'.format(
     82         service_id=service_id, op_name=operation_model.name
     83     )
---> 84     await self._event_emitter.emit(
     85         event_name,
     86         request=request,
     87         operation_name=operation_model.name,
     88     )
     89 prepared_request = self.prepare_request(request)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/hooks.py:66, in AioHierarchicalEmitter._emit(self, event_name, kwargs, stop_on_response)
     65 # Await the handler if its a coroutine.
---> 66 response = await resolve_awaitable(handler(**kwargs))
     67 responses.append((handler, response))

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/_helpers.py:15, in resolve_awaitable(obj)
     14 if inspect.isawaitable(obj):
---> 15     return await obj
     17 return obj

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/signers.py:24, in AioRequestSigner.handler(self, operation_name, request, **kwargs)
     19 async def handler(self, operation_name=None, request=None, **kwargs):
     20     # This is typically hooked up to the "request-created" event
     21     # from a client's event emitter.  When a new request is created
     22     # this method is invoked to sign the request.
     23     # Don't call this method directly.
---> 24     return await self.sign(operation_name, request)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/signers.py:73, in AioRequestSigner.sign(self, operation_name, request, region_name, signing_type, expires_in, signing_name)
     72 try:
---> 73     auth = await self.get_auth_instance(**kwargs)
     74 except UnknownSignatureVersionError as e:

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/signers.py:147, in AioRequestSigner.get_auth_instance(self, signing_name, region_name, signature_version, **kwargs)
    145 if self._credentials is not None:
    146     frozen_credentials = (
--> 147         await self._credentials.get_frozen_credentials()
    148     )
    149 kwargs['credentials'] = frozen_credentials

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py:358, in AioRefreshableCredentials.get_frozen_credentials(self)
    357 async def get_frozen_credentials(self):
--> 358     await self._refresh()
    359     return self._frozen_credentials

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py:312, in AioRefreshableCredentials._refresh(self)
    309 is_mandatory_refresh = self.refresh_needed(
    310     self._mandatory_refresh_timeout
    311 )
--> 312 await self._protected_refresh(
    313     is_mandatory=is_mandatory_refresh
    314 )
    315 return

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py:327, in AioRefreshableCredentials._protected_refresh(self, is_mandatory)
    326 try:
--> 327     metadata = await self._refresh_using()
    328 except Exception:

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py:385, in AioCachedCredentialFetcher.fetch_credentials(self)
    384 async def fetch_credentials(self):
--> 385     return await self._get_cached_credentials()

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py:395, in AioCachedCredentialFetcher._get_cached_credentials(self)
    394 if response is None:
--> 395     response = await self._get_credentials()
    396     self._write_to_cache(response)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/credentials.py:469, in AioAssumeRoleWithWebIdentityCredentialFetcher._get_credentials(self)
    468 async with self._client_creator('sts', config=config) as client:
--> 469     return await client.assume_role_with_web_identity(**kwargs)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/aiobotocore/client.py:371, in AioBaseClient._make_api_call(self, operation_name, api_params)
    370     error_class = self.exceptions.from_code(error_code)
--> 371     raise error_class(parsed_response, operation_name)
    372 else:

ClientError: An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity

The above exception was the direct cause of the following exception:

PermissionError                           Traceback (most recent call last)
Cell In[3], line 1
----> 1 ds.to_zarr(f'{SCRATCH_BUCKET}/rasm.zarr')  # write data

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xarray/core/dataset.py:2141, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, chunkmanager_store_kwargs)
   2020 """Write dataset contents to a zarr group.
   2021 
   2022 Zarr chunks are determined in the following way:
   (...)
   2137     The I/O user guide, with more details and examples.
   2138 """
   2139 from xarray.backends.api import to_zarr
-> 2141 return to_zarr(  # type: ignore[call-overload,misc]
   2142     self,
   2143     store=store,
   2144     chunk_store=chunk_store,
   2145     storage_options=storage_options,
   2146     mode=mode,
   2147     synchronizer=synchronizer,
   2148     group=group,
   2149     encoding=encoding,
   2150     compute=compute,
   2151     consolidated=consolidated,
   2152     append_dim=append_dim,
   2153     region=region,
   2154     safe_chunks=safe_chunks,
   2155     zarr_version=zarr_version,
   2156     chunkmanager_store_kwargs=chunkmanager_store_kwargs,
   2157 )

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xarray/backends/api.py:1672, in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, chunkmanager_store_kwargs)
   1670     already_consolidated = False
   1671     consolidate_on_close = consolidated or consolidated is None
-> 1672 zstore = backends.ZarrStore.open_group(
   1673     store=mapper,
   1674     mode=mode,
   1675     synchronizer=synchronizer,
   1676     group=group,
   1677     consolidated=already_consolidated,
   1678     consolidate_on_close=consolidate_on_close,
   1679     chunk_store=chunk_mapper,
   1680     append_dim=append_dim,
   1681     write_region=region,
   1682     safe_chunks=safe_chunks,
   1683     stacklevel=4,  # for Dataset.to_zarr()
   1684     zarr_version=zarr_version,
   1685 )
   1687 if mode in ["a", "r+"]:
   1688     _validate_datatypes_for_zarr_append(zstore, dataset)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xarray/backends/zarr.py:445, in ZarrStore.open_group(cls, store, mode, synchronizer, group, consolidated, consolidate_on_close, chunk_store, storage_options, append_dim, write_region, safe_chunks, stacklevel, zarr_version)
    443     zarr_group = zarr.open_consolidated(store, **open_kwargs)
    444 else:
--> 445     zarr_group = zarr.open_group(store, **open_kwargs)
    446 return cls(
    447     zarr_group,
    448     mode,
   (...)
    452     safe_chunks,
    453 )

File /srv/conda/envs/notebook/lib/python3.10/site-packages/zarr/hierarchy.py:1455, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, meta_array)
   1452         init_group(store, path=path, chunk_store=chunk_store)
   1454 elif mode in ['w-', 'x']:
-> 1455     if contains_array(store, path=path):
   1456         raise ContainsArrayError(path)
   1457     elif contains_group(store, path=path):

File /srv/conda/envs/notebook/lib/python3.10/site-packages/zarr/storage.py:108, in contains_array(store, path)
    106 prefix = _path_to_prefix(path)
    107 key = _prefix_to_array_key(store, prefix)
--> 108 return key in store

File /srv/conda/envs/notebook/lib/python3.10/site-packages/zarr/storage.py:1447, in FSStore.__contains__(self, key)
   1445 def __contains__(self, key):
   1446     key = self._normalize_key(key)
-> 1447     return key in self.map

File /srv/conda/envs/notebook/lib/python3.10/site-packages/fsspec/mapping.py:181, in FSMap.__contains__(self, key)
    179 """Does key exist in mapping?"""
    180 path = self._key_to_str(key)
--> 181 return self.fs.exists(path) and self.fs.isfile(path)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/fsspec/asyn.py:121, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    118 @functools.wraps(func)
    119 def wrapper(*args, **kwargs):
    120     self = obj or args[0]
--> 121     return sync(self.loop, func, *args, **kwargs)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/fsspec/asyn.py:106, in sync(loop, func, timeout, *args, **kwargs)
    104     raise FSTimeoutError from return_result
    105 elif isinstance(return_result, BaseException):
--> 106     raise return_result
    107 else:
    108     return return_result

File /srv/conda/envs/notebook/lib/python3.10/site-packages/fsspec/asyn.py:61, in _runner(event, coro, result, timeout)
     59     coro = asyncio.wait_for(coro, timeout=timeout)
     60 try:
---> 61     result[0] = await coro
     62 except Exception as ex:
     63     result[0] = ex

File /srv/conda/envs/notebook/lib/python3.10/site-packages/s3fs/core.py:1004, in S3FileSystem._exists(self, path)
   1001     return exists_in_cache
   1003 try:
-> 1004     await self._info(path, bucket, key, version_id=version_id)
   1005     return True
   1006 except FileNotFoundError:

File /srv/conda/envs/notebook/lib/python3.10/site-packages/s3fs/core.py:1271, in S3FileSystem._info(self, path, bucket, key, refresh, version_id)
   1269 if key:
   1270     try:
-> 1271         out = await self._call_s3(
   1272             "head_object",
   1273             self.kwargs,
   1274             Bucket=bucket,
   1275             Key=key,
   1276             **version_id_kw(version_id),
   1277             **self.req_kw,
   1278         )
   1279         return {
   1280             "ETag": out.get("ETag", ""),
   1281             "LastModified": out["LastModified"],
   (...)
   1287             "ContentType": out.get("ContentType"),
   1288         }
   1289     except FileNotFoundError:

File /srv/conda/envs/notebook/lib/python3.10/site-packages/s3fs/core.py:348, in S3FileSystem._call_s3(self, method, *akwarglist, **kwargs)
    346 logger.debug("CALL: %s - %s - %s", method.__name__, akwarglist, kw2)
    347 additional_kwargs = self._get_s3_method_kwargs(method, *akwarglist, **kwargs)
--> 348 return await _error_wrapper(
    349     method, kwargs=additional_kwargs, retries=self.retries
    350 )

File /srv/conda/envs/notebook/lib/python3.10/site-packages/s3fs/core.py:140, in _error_wrapper(func, args, kwargs, retries)
    138         err = e
    139 err = translate_boto_error(err)
--> 140 raise err

PermissionError: Not authorized to perform sts:AssumeRoleWithWebIdentity

What is going on here? I don't actually know how researchers access/write to buckets and assumed this "just worked" since no one else has had issues, and I assumed I had done everything I needed to from the terraform/hub side of things. Can @2i2c-org/engineering provide any clarity?

Proposal

No response

Updates and actions

No response

@GeorgianaElena (Member):

@sgibson91, I found this #1322 (comment) that looks relevant and want to read up on it.

@GeorgianaElena (Member):

I don't actually know how researchers access/write to buckets and assumed this "just worked" since no one else has had issues,

Also, I'm in the same boat on the above ⬆️

@jnywong (Member) commented Jan 25, 2024

@sgibson91, I found this #1322 (comment) that looks relevant and want to read up on it.

I dunno if this helps, but in reference to the above, I tried the following in the Showcase Hub and got:

(notebook) jovyan@jupyter-jnywong:~$ aws sts assume-role-with-web-identity \
 --role-arn $AWS_ROLE_ARN \
 --role-session-name $JUPYTERHUB_CLIENT_ID \
 --web-identity-token file://$AWS_WEB_IDENTITY_TOKEN_FILE

An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
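
A way to dig one level deeper (a hypothetical sketch, not something from the support ticket) is to decode the projected token and inspect its sub claim, since the role's trust policy matches against it:

import base64
import json
import os

# Decode the JWT payload of the projected web identity token (no signature
# verification; this is only for inspecting the claims).
token = open(os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"]).read().strip()
payload = token.split(".")[1]
payload += "=" * (-len(payload) % 4)  # restore base64 padding stripped by the JWT encoding
claims = json.loads(base64.urlsafe_b64decode(payload))
print(claims["sub"])  # e.g. system:serviceaccount:<namespace>:user-sa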

@damianavila (Contributor) commented Jan 29, 2024

@yuvipanda or @consideRatio, do you have any further thoughts about this one?

@consideRatio (Contributor) commented Jan 31, 2024

Issue 1 - our docs aren't explicit enough (now tracked in #3665)

I think our "working with object storage" docs should explicitly distinguish the following scenarios:

  1. it's a 2i2c hub's user environment and it's the 2i2c hub's associated cloud bucket

    In this case, writing data should work straight away, as credentials should be set up and available for the user as long as the user environment image has the relevant cloud CLI tools installed, like aws, gcloud, or az.

    I know that pangeo/pangeo-notebook is an image that includes the aws CLI, which is relevant for the showcase hub with its AWS bucket.

  2. it's their own environment / laptop and it's a 2i2c hub's associated cloud bucket

    In this case, they need to acquire and configure credentials manually first. They could either get temporary credentials by extracting them from a 2i2c user environment (see the sketch after this list), or use dedicated credentials set up for them personally in the cloud project.

    Extracting temporary credentials is discussed in Access to buckets on AWS and GCP from local computers features#22 and isn't an established and user-friendly procedure yet.

  3. it's their own environment / laptop and it's their own cloud bucket

    I don't think it's in scope for 2i2c to help with this or write docs focused on this situation.
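
As a concrete sketch of scenario 2 (hedged: the names and paths here are illustrative, and the token copied out of a user server expires, so this only yields temporary credentials):

import boto3

# AssumeRoleWithWebIdentity is an unsigned STS call, so no pre-existing AWS
# credentials are needed locally -- only the role ARN and token copied from a
# running 2i2c user server (echo $AWS_ROLE_ARN, cat $AWS_WEB_IDENTITY_TOKEN_FILE).
sts = boto3.client("sts", region_name="us-west-2")
response = sts.assume_role_with_web_identity(
    RoleArn="arn:aws:iam::<account-id>:role/<hub-role>",  # value of AWS_ROLE_ARN
    RoleSessionName="local-test",
    WebIdentityToken=open("token").read().strip(),  # the copied token file
)
creds = response["Credentials"]  # AccessKeyId, SecretAccessKey, SessionToken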

Issue 2 - access to showcase hub's scratch bucket

I've attempted to use the code snippet from our docs on scratch buckets and also got an error.

import os
import xarray as xr
SCRATCH_BUCKET = os.environ['SCRATCH_BUCKET'] 
ds = xr.tutorial.open_dataset("rasm")  # load example data
ds.to_zarr(f'{SCRATCH_BUCKET}/rasm.zarr')  # write data

Using the more direct test of aws s3 ls $SCRATCH_BUCKET I've concluded that there is an auth issue to handle for the showcase hub.
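
(For reference, a roughly equivalent check from Python, assuming the image ships s3fs:)

import os
import s3fs

fs = s3fs.S3FileSystem()  # picks up the web identity credentials via the default chain
print(fs.ls(os.environ["SCRATCH_BUCKET"]))  # fails with the same AccessDenied here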

Like @sgibson91 and @GeorgianaElena, my expectation is that permissions to access the scratch bucket should be set up automatically.

Debugging issue 2

So how do I get aws s3 ls $SCRATCH_BUCKET to run successfully in a terminal?

Explicit understanding

My understanding is that bucket permissions are provided to the user servers' users as follows:

  1. user pods should reference a k8s ServiceAccount
  2. the k8s ServiceAccount should, via an annotation, reference an AWS role or similar granting the relevant permissions
  3. behind the scenes, AWS has registered a "mutating webhook" that modifies pod specifications before they are registered: pods referencing k8s ServiceAccounts with AWS-specific annotations get credentials mounted.
     # a volume is added to the pod
     - name: aws-iam-token
       projected:
         defaultMode: 420
         sources:
         - serviceAccountToken:
             audience: sts.amazonaws.com
             expirationSeconds: 86400
             path: token
    
     # a volumeMount is added to the pod's user server container
     - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
       name: aws-iam-token
       readOnly: true
       
     # env variables are set up in the pod's user server container to detail relevant info for the `aws` cli
     - name: AWS_STS_REGIONAL_ENDPOINTS
       value: regional
     - name: AWS_DEFAULT_REGION
       value: us-west-2
     - name: AWS_REGION
       value: us-west-2
     - name: AWS_ROLE_ARN
       value: arn:aws:iam::790657130469:role/2i2c-aws-us-researchdelight
     - name: AWS_WEB_IDENTITY_TOKEN_FILE
       value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
  4. (assumption) the aws CLI will be able to make use of these credentials without further commands; this is done by exchanging the mounted token ("web identity token") for temporary credentials representing the AWS role we reference (a sanity-check sketch follows this list)
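
A minimal in-pod sanity check for points 1-4 above (a sketch; it only confirms the webhook injected everything, not that the role grants anything):

import os

# The mutating webhook should have injected these; if any are missing, the
# ServiceAccount annotation or the webhook itself is the problem.
for var in ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE", "AWS_REGION"):
    print(var, "=", os.environ.get(var))

# The projected token should exist and be non-empty.
token_file = os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"]
print("token present:", os.path.getsize(token_file) > 0)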

What fails?

  1. I've seen /var/run/secrets/eks.amazonaws.com/serviceaccount in a showcase user pod; the credentials seem to have propagated there properly, together with the environment variables.

  2. I noted that the aws CLI was version 1.x in the older pangeo/pangeo-notebook image used, so I switched to a newer image where the aws CLI was version 2.x - no difference.

  3. I think what fails is that we lack the relevant permissions, specifically permission to sts:AssumeRoleWithWebIdentity, based on the error "An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity".
    Why? I'll have a look at the terraform config and verify that the state matches it. To be continued...

  4. I went looking at the terraform config and how the various resources are created, then checked the AWS web console, and concluded that there was a reference to the user-sa ServiceAccount in the researchdelight namespace. The showcase hub doesn't live in that namespace, so I think that caused the issues.

    In practice, this terraform section in terraform/aws/irsa.tf led to it, in combination with:

    hub_cloud_permissions = {
      # ...
      "researchdelight" : {
        requestor_pays : true,
        bucket_admin_access : [
          "scratch-researchdelight",
          "persistent-showcase"
        ],
        extra_iam_policy : ""
      },
      # ...

    By updating hub_cloud_permissions.researchdelight to hub_cloud_permissions.showcase, we end up creating the relevant resources to grant permissions in the right namespace, without recreating the buckets (which would destroy data).

    terraform plan that I've applied
    terraform plan -var-file=projects/2i2c-aws-us.tfvars
    data.aws_partition.current: Reading...
    data.aws_caller_identity.current: Reading...
    data.aws_partition.current: Read complete after 0s [id=aws]
    aws_iam_user.continuous_deployer: Refreshing state... [id=hub-continuous-deployer]
    aws_efs_file_system.homedirs: Refreshing state... [id=fs-0b70db2b65209a77d]
    data.aws_eks_cluster.cluster: Reading...
    aws_s3_bucket.user_buckets["scratch-researchdelight"]: Refreshing state... [id=2i2c-aws-us-scratch-researchdelight]
    aws_s3_bucket.user_buckets["scratch-itcoocean"]: Refreshing state... [id=2i2c-aws-us-scratch-itcoocean]
    aws_s3_bucket.user_buckets["scratch-staging"]: Refreshing state... [id=2i2c-aws-us-scratch-staging]
    aws_s3_bucket.user_buckets["scratch-dask-staging"]: Refreshing state... [id=2i2c-aws-us-scratch-dask-staging]
    aws_s3_bucket.user_buckets["persistent-showcase"]: Refreshing state... [id=2i2c-aws-us-persistent-showcase]
    aws_s3_bucket.user_buckets["scratch-go-bgc"]: Refreshing state... [id=2i2c-aws-us-scratch-go-bgc]
    data.aws_caller_identity.current: Read complete after 1s [id=790657130469]
    aws_s3_bucket.user_buckets["scratch-ncar-cisl"]: Refreshing state... [id=2i2c-aws-us-scratch-ncar-cisl]
    aws_s3_bucket_lifecycle_configuration.user_bucket_expiry["scratch-dask-staging"]: Refreshing state... [id=2i2c-aws-us-scratch-dask-staging]
    data.aws_eks_cluster.cluster: Read complete after 1s [id=2i2c-aws-us]
    aws_s3_bucket_lifecycle_configuration.user_bucket_expiry["scratch-go-bgc"]: Refreshing state... [id=2i2c-aws-us-scratch-go-bgc]
    aws_s3_bucket_lifecycle_configuration.user_bucket_expiry["scratch-itcoocean"]: Refreshing state... [id=2i2c-aws-us-scratch-itcoocean]
    aws_s3_bucket_lifecycle_configuration.user_bucket_expiry["scratch-ncar-cisl"]: Refreshing state... [id=2i2c-aws-us-scratch-ncar-cisl]
    aws_s3_bucket_lifecycle_configuration.user_bucket_expiry["scratch-researchdelight"]: Refreshing state... [id=2i2c-aws-us-scratch-researchdelight]
    aws_s3_bucket_lifecycle_configuration.user_bucket_expiry["scratch-staging"]: Refreshing state... [id=2i2c-aws-us-scratch-staging]
    aws_s3_bucket_lifecycle_configuration.user_bucket_expiry["persistent-showcase"]: Refreshing state... [id=2i2c-aws-us-persistent-showcase]
    aws_iam_access_key.continuous_deployer: Refreshing state... [id=AKIA3QFWWL7SV6N5KWVD]
    data.aws_subnets.cluster_node_subnets: Reading...
    aws_iam_user_policy.continuous_deployer: Refreshing state... [id=hub-continuous-deployer:eks-readonly]
    data.aws_iam_policy_document.irsa_role_assume["staging"]: Reading...
    data.aws_iam_policy_document.irsa_role_assume["staging"]: Read complete after 0s [id=1010295039]
    data.aws_iam_policy_document.irsa_role_assume["itcoocean"]: Reading...
    data.aws_iam_policy_document.irsa_role_assume["itcoocean"]: Read complete after 0s [id=2616849819]
    data.aws_iam_policy_document.irsa_role_assume["ncar-cisl"]: Reading...
    data.aws_iam_policy_document.irsa_role_assume["ncar-cisl"]: Read complete after 0s [id=3212506434]
    data.aws_iam_policy_document.irsa_role_assume["dask-staging"]: Reading...
    data.aws_iam_policy_document.irsa_role_assume["dask-staging"]: Read complete after 0s [id=136391812]
    data.aws_iam_policy_document.irsa_role_assume["showcase"]: Reading...
    data.aws_iam_policy_document.irsa_role_assume["showcase"]: Read complete after 0s [id=236808588]
    data.aws_iam_policy_document.irsa_role_assume["go-bgc"]: Reading...
    data.aws_iam_policy_document.irsa_role_assume["go-bgc"]: Read complete after 0s [id=4000431596]
    data.aws_security_group.cluster_nodes_shared_security_group: Reading...
    aws_efs_backup_policy.homedirs: Refreshing state... [id=fs-0b70db2b65209a77d]
    aws_iam_role.irsa_role["ncar-cisl"]: Refreshing state... [id=2i2c-aws-us-ncar-cisl]
    data.aws_subnets.cluster_node_subnets: Read complete after 1s [id=us-west-2]
    aws_iam_role.irsa_role["itcoocean"]: Refreshing state... [id=2i2c-aws-us-itcoocean]
    aws_iam_role.irsa_role["researchdelight"]: Refreshing state... [id=2i2c-aws-us-researchdelight]
    data.aws_security_group.cluster_nodes_shared_security_group: Read complete after 1s [id=sg-000f5f85de16c7792]
    aws_iam_role.irsa_role["dask-staging"]: Refreshing state... [id=2i2c-aws-us-dask-staging]
    aws_iam_role.irsa_role["go-bgc"]: Refreshing state... [id=2i2c-aws-us-go-bgc]
    aws_iam_role.irsa_role["staging"]: Refreshing state... [id=2i2c-aws-us-staging]
    aws_efs_mount_target.homedirs["subnet-03b0128ec2dc8b556"]: Refreshing state... [id=fsmt-0dee9be88ac06e4cf]
    aws_efs_mount_target.homedirs["subnet-0b712f4ef2e6898ad"]: Refreshing state... [id=fsmt-0d2dcf261ec5830ea]
    aws_efs_mount_target.homedirs["subnet-0c6ba7d0e925eb6b2"]: Refreshing state... [id=fsmt-010692c2cd660e83f]
    aws_s3_bucket_policy.user_bucket_access["ncar-cisl.scratch-ncar-cisl"]: Refreshing state... [id=2i2c-aws-us-scratch-ncar-cisl]
    aws_s3_bucket_policy.user_bucket_access["go-bgc.scratch-go-bgc"]: Refreshing state... [id=2i2c-aws-us-scratch-go-bgc]
    aws_s3_bucket_policy.user_bucket_access["researchdelight.scratch-researchdelight"]: Refreshing state... [id=2i2c-aws-us-scratch-researchdelight]
    aws_s3_bucket_policy.user_bucket_access["itcoocean.scratch-itcoocean"]: Refreshing state... [id=2i2c-aws-us-scratch-itcoocean]
    aws_s3_bucket_policy.user_bucket_access["staging.scratch-staging"]: Refreshing state... [id=2i2c-aws-us-scratch-staging]
    aws_s3_bucket_policy.user_bucket_access["dask-staging.scratch-dask-staging"]: Refreshing state... [id=2i2c-aws-us-scratch-dask-staging]
    aws_s3_bucket_policy.user_bucket_access["researchdelight.persistent-showcase"]: Refreshing state... [id=2i2c-aws-us-persistent-showcase]
    
    Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
    + create
    ~ update in-place
    - destroy
    <= read (data resources)
    
    Terraform will perform the following actions:
    
    # data.aws_iam_policy_document.bucket_access["dask-staging.scratch-dask-staging"] will be read during apply
    # (depends on a resource or a module with changes pending)
    <= data "aws_iam_policy_document" "bucket_access" {
          + id   = (known after apply)
          + json = (known after apply)
    
          + statement {
             + actions   = [
                + "s3:*",
                ]
             + effect    = "Allow"
             + resources = [
                + "arn:aws:s3:::2i2c-aws-us-scratch-dask-staging",
                + "arn:aws:s3:::2i2c-aws-us-scratch-dask-staging/*",
                ]
    
             + principals {
                + identifiers = [
                      + "arn:aws:iam::790657130469:role/2i2c-aws-us-dask-staging",
                   ]
                + type        = "AWS"
                }
          }
       }
    
    # data.aws_iam_policy_document.bucket_access["go-bgc.scratch-go-bgc"] will be read during apply
    # (depends on a resource or a module with changes pending)
    <= data "aws_iam_policy_document" "bucket_access" {
          + id   = (known after apply)
          + json = (known after apply)
    
          + statement {
             + actions   = [
                + "s3:*",
                ]
             + effect    = "Allow"
             + resources = [
                + "arn:aws:s3:::2i2c-aws-us-scratch-go-bgc",
                + "arn:aws:s3:::2i2c-aws-us-scratch-go-bgc/*",
                ]
    
             + principals {
                + identifiers = [
                      + "arn:aws:iam::790657130469:role/2i2c-aws-us-go-bgc",
                   ]
                + type        = "AWS"
                }
          }
       }
    
    # data.aws_iam_policy_document.bucket_access["itcoocean.scratch-itcoocean"] will be read during apply
    # (depends on a resource or a module with changes pending)
    <= data "aws_iam_policy_document" "bucket_access" {
          + id   = (known after apply)
          + json = (known after apply)
    
          + statement {
             + actions   = [
                + "s3:*",
                ]
             + effect    = "Allow"
             + resources = [
                + "arn:aws:s3:::2i2c-aws-us-scratch-itcoocean",
                + "arn:aws:s3:::2i2c-aws-us-scratch-itcoocean/*",
                ]
    
             + principals {
                + identifiers = [
                      + "arn:aws:iam::790657130469:role/2i2c-aws-us-itcoocean",
                   ]
                + type        = "AWS"
                }
          }
       }
    
    # data.aws_iam_policy_document.bucket_access["ncar-cisl.scratch-ncar-cisl"] will be read during apply
    # (depends on a resource or a module with changes pending)
    <= data "aws_iam_policy_document" "bucket_access" {
          + id   = (known after apply)
          + json = (known after apply)
    
          + statement {
             + actions   = [
                + "s3:*",
                ]
             + effect    = "Allow"
             + resources = [
                + "arn:aws:s3:::2i2c-aws-us-scratch-ncar-cisl",
                + "arn:aws:s3:::2i2c-aws-us-scratch-ncar-cisl/*",
                ]
    
             + principals {
                + identifiers = [
                      + "arn:aws:iam::790657130469:role/2i2c-aws-us-ncar-cisl",
                   ]
                + type        = "AWS"
                }
          }
       }
    
    # data.aws_iam_policy_document.bucket_access["showcase.persistent-showcase"] will be read during apply
    # (config refers to values not yet known)
    <= data "aws_iam_policy_document" "bucket_access" {
          + id   = (known after apply)
          + json = (known after apply)
    
          + statement {
             + actions   = [
                + "s3:*",
                ]
             + effect    = "Allow"
             + resources = [
                + "arn:aws:s3:::2i2c-aws-us-persistent-showcase",
                + "arn:aws:s3:::2i2c-aws-us-persistent-showcase/*",
                ]
    
             + principals {
                + identifiers = [
                      + (known after apply),
                   ]
                + type        = "AWS"
                }
          }
       }
    
    # data.aws_iam_policy_document.bucket_access["showcase.scratch-researchdelight"] will be read during apply
    # (config refers to values not yet known)
    <= data "aws_iam_policy_document" "bucket_access" {
          + id   = (known after apply)
          + json = (known after apply)
    
          + statement {
             + actions   = [
                + "s3:*",
                ]
             + effect    = "Allow"
             + resources = [
                + "arn:aws:s3:::2i2c-aws-us-scratch-researchdelight",
                + "arn:aws:s3:::2i2c-aws-us-scratch-researchdelight/*",
                ]
    
             + principals {
                + identifiers = [
                      + (known after apply),
                   ]
                + type        = "AWS"
                }
          }
       }
    
    # data.aws_iam_policy_document.bucket_access["staging.scratch-staging"] will be read during apply
    # (depends on a resource or a module with changes pending)
    <= data "aws_iam_policy_document" "bucket_access" {
          + id   = (known after apply)
          + json = (known after apply)
    
          + statement {
             + actions   = [
                + "s3:*",
                ]
             + effect    = "Allow"
             + resources = [
                + "arn:aws:s3:::2i2c-aws-us-scratch-staging",
                + "arn:aws:s3:::2i2c-aws-us-scratch-staging/*",
                ]
    
             + principals {
                + identifiers = [
                      + "arn:aws:iam::790657130469:role/2i2c-aws-us-staging",
                   ]
                + type        = "AWS"
                }
          }
       }
    
    # aws_iam_role.irsa_role["researchdelight"] will be destroyed
    # (because key ["researchdelight"] is not in for_each map)
    - resource "aws_iam_role" "irsa_role" {
          - arn                   = "arn:aws:iam::790657130469:role/2i2c-aws-us-researchdelight" -> null
          - assume_role_policy    = jsonencode(
                {
                - Statement = [
                      - {
                         - Action    = "sts:AssumeRoleWithWebIdentity"
                         - Condition = {
                            - StringEquals = {
                                  - "oidc.eks.us-west-2.amazonaws.com/id/E2ACE6437981F58A2BA31CE7F6F85AB8:sub" = "system:serviceaccount:researchdelight:user-sa"
                               }
                            }
                         - Effect    = "Allow"
                         - Principal = {
                            - Federated = "arn:aws:iam::790657130469:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/E2ACE6437981F58A2BA31CE7F6F85AB8"
                            }
                         - Sid       = ""
                      },
                   ]
                - Version   = "2012-10-17"
                }
          ) -> null
          - create_date           = "2022-12-02T18:04:34Z" -> null
          - force_detach_policies = false -> null
          - id                    = "2i2c-aws-us-researchdelight" -> null
          - managed_policy_arns   = [] -> null
          - max_session_duration  = 3600 -> null
          - name                  = "2i2c-aws-us-researchdelight" -> null
          - path                  = "/" -> null
          - role_last_used        = [
             - {
                - last_used_date = "2023-10-23T00:23:14Z"
                - region         = "us-west-2"
                },
          ] -> null
          - tags                  = {} -> null
          - tags_all              = {} -> null
          - unique_id             = "AROA3QFWWL7S4UXO5J6T7" -> null
       }
    
    # aws_iam_role.irsa_role["showcase"] will be created
    + resource "aws_iam_role" "irsa_role" {
          + arn                   = (known after apply)
          + assume_role_policy    = jsonencode(
                {
                + Statement = [
                      + {
                         + Action    = "sts:AssumeRoleWithWebIdentity"
                         + Condition = {
                            + StringEquals = {
                                  + "oidc.eks.us-west-2.amazonaws.com/id/E2ACE6437981F58A2BA31CE7F6F85AB8:sub" = "system:serviceaccount:showcase:user-sa"
                               }
                            }
                         + Effect    = "Allow"
                         + Principal = {
                            + Federated = "arn:aws:iam::790657130469:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/E2ACE6437981F58A2BA31CE7F6F85AB8"
                            }
                         + Sid       = ""
                      },
                   ]
                + Version   = "2012-10-17"
                }
          )
          + create_date           = (known after apply)
          + force_detach_policies = false
          + id                    = (known after apply)
          + managed_policy_arns   = (known after apply)
          + max_session_duration  = 3600
          + name                  = "2i2c-aws-us-showcase"
          + name_prefix           = (known after apply)
          + path                  = "/"
          + role_last_used        = (known after apply)
          + tags_all              = (known after apply)
          + unique_id             = (known after apply)
       }
    
    # aws_s3_bucket_policy.user_bucket_access["dask-staging.scratch-dask-staging"] will be updated in-place
    ~ resource "aws_s3_bucket_policy" "user_bucket_access" {
          id     = "2i2c-aws-us-scratch-dask-staging"
          ~ policy = jsonencode(
                {
                - Statement = [
                      - {
                         - Action    = "s3:*"
                         - Effect    = "Allow"
                         - Principal = {
                            - AWS = "arn:aws:iam::790657130469:role/2i2c-aws-us-dask-staging"
                            }
                         - Resource  = [
                            - "arn:aws:s3:::2i2c-aws-us-scratch-dask-staging/*",
                            - "arn:aws:s3:::2i2c-aws-us-scratch-dask-staging",
                            ]
                         - Sid       = ""
                      },
                   ]
                - Version   = "2012-10-17"
                }
          ) -> (known after apply)
          # (1 unchanged attribute hidden)
       }
    
    # aws_s3_bucket_policy.user_bucket_access["go-bgc.scratch-go-bgc"] will be updated in-place
    ~ resource "aws_s3_bucket_policy" "user_bucket_access" {
          id     = "2i2c-aws-us-scratch-go-bgc"
          ~ policy = jsonencode(
                {
                - Statement = [
                      - {
                         - Action    = "s3:*"
                         - Effect    = "Allow"
                         - Principal = {
                            - AWS = "arn:aws:iam::790657130469:role/2i2c-aws-us-go-bgc"
                            }
                         - Resource  = [
                            - "arn:aws:s3:::2i2c-aws-us-scratch-go-bgc/*",
                            - "arn:aws:s3:::2i2c-aws-us-scratch-go-bgc",
                            ]
                         - Sid       = ""
                      },
                   ]
                - Version   = "2012-10-17"
                }
          ) -> (known after apply)
          # (1 unchanged attribute hidden)
       }
    
    # aws_s3_bucket_policy.user_bucket_access["itcoocean.scratch-itcoocean"] will be updated in-place
    ~ resource "aws_s3_bucket_policy" "user_bucket_access" {
          id     = "2i2c-aws-us-scratch-itcoocean"
          ~ policy = jsonencode(
                {
                - Statement = [
                      - {
                         - Action    = "s3:*"
                         - Effect    = "Allow"
                         - Principal = {
                            - AWS = "arn:aws:iam::790657130469:role/2i2c-aws-us-itcoocean"
                            }
                         - Resource  = [
                            - "arn:aws:s3:::2i2c-aws-us-scratch-itcoocean/*",
                            - "arn:aws:s3:::2i2c-aws-us-scratch-itcoocean",
                            ]
                         - Sid       = ""
                      },
                   ]
                - Version   = "2012-10-17"
                }
          ) -> (known after apply)
          # (1 unchanged attribute hidden)
       }
    
    # aws_s3_bucket_policy.user_bucket_access["ncar-cisl.scratch-ncar-cisl"] will be updated in-place
    ~ resource "aws_s3_bucket_policy" "user_bucket_access" {
          id     = "2i2c-aws-us-scratch-ncar-cisl"
          ~ policy = jsonencode(
                {
                - Statement = [
                      - {
                         - Action    = "s3:*"
                         - Effect    = "Allow"
                         - Principal = {
                            - AWS = "arn:aws:iam::790657130469:role/2i2c-aws-us-ncar-cisl"
                            }
                         - Resource  = [
                            - "arn:aws:s3:::2i2c-aws-us-scratch-ncar-cisl/*",
                            - "arn:aws:s3:::2i2c-aws-us-scratch-ncar-cisl",
                            ]
                         - Sid       = ""
                      },
                   ]
                - Version   = "2012-10-17"
                }
          ) -> (known after apply)
          # (1 unchanged attribute hidden)
       }
    
    # aws_s3_bucket_policy.user_bucket_access["researchdelight.persistent-showcase"] will be destroyed
    # (because key ["researchdelight.persistent-showcase"] is not in for_each map)
    - resource "aws_s3_bucket_policy" "user_bucket_access" {
          - bucket = "2i2c-aws-us-persistent-showcase" -> null
          - id     = "2i2c-aws-us-persistent-showcase" -> null
          - policy = jsonencode(
                {
                - Statement = [
                      - {
                         - Action    = "s3:*"
                         - Effect    = "Allow"
                         - Principal = {
                            - AWS = "arn:aws:iam::790657130469:role/2i2c-aws-us-researchdelight"
                            }
                         - Resource  = [
                            - "arn:aws:s3:::2i2c-aws-us-persistent-showcase/*",
                            - "arn:aws:s3:::2i2c-aws-us-persistent-showcase",
                            ]
                         - Sid       = ""
                      },
                   ]
                - Version   = "2012-10-17"
                }
          ) -> null
       }
    
    # aws_s3_bucket_policy.user_bucket_access["researchdelight.scratch-researchdelight"] will be destroyed
    # (because key ["researchdelight.scratch-researchdelight"] is not in for_each map)
    - resource "aws_s3_bucket_policy" "user_bucket_access" {
          - bucket = "2i2c-aws-us-scratch-researchdelight" -> null
          - id     = "2i2c-aws-us-scratch-researchdelight" -> null
          - policy = jsonencode(
                {
                - Statement = [
                      - {
                         - Action    = "s3:*"
                         - Effect    = "Allow"
                         - Principal = {
                            - AWS = "arn:aws:iam::790657130469:role/2i2c-aws-us-researchdelight"
                            }
                         - Resource  = [
                            - "arn:aws:s3:::2i2c-aws-us-scratch-researchdelight/*",
                            - "arn:aws:s3:::2i2c-aws-us-scratch-researchdelight",
                            ]
                         - Sid       = ""
                      },
                   ]
                - Version   = "2012-10-17"
                }
          ) -> null
       }
    
    # aws_s3_bucket_policy.user_bucket_access["showcase.persistent-showcase"] will be created
    + resource "aws_s3_bucket_policy" "user_bucket_access" {
          + bucket = "2i2c-aws-us-persistent-showcase"
          + id     = (known after apply)
          + policy = (known after apply)
       }
    
    # aws_s3_bucket_policy.user_bucket_access["showcase.scratch-researchdelight"] will be created
    + resource "aws_s3_bucket_policy" "user_bucket_access" {
          + bucket = "2i2c-aws-us-scratch-researchdelight"
          + id     = (known after apply)
          + policy = (known after apply)
       }
    
    # aws_s3_bucket_policy.user_bucket_access["staging.scratch-staging"] will be updated in-place
    ~ resource "aws_s3_bucket_policy" "user_bucket_access" {
          id     = "2i2c-aws-us-scratch-staging"
          ~ policy = jsonencode(
                {
                - Statement = [
                      - {
                         - Action    = "s3:*"
                         - Effect    = "Allow"
                         - Principal = {
                            - AWS = "arn:aws:iam::790657130469:role/2i2c-aws-us-staging"
                            }
                         - Resource  = [
                            - "arn:aws:s3:::2i2c-aws-us-scratch-staging/*",
                            - "arn:aws:s3:::2i2c-aws-us-scratch-staging",
                            ]
                         - Sid       = ""
                      },
                   ]
                - Version   = "2012-10-17"
                }
          ) -> (known after apply)
          # (1 unchanged attribute hidden)
       }
    
    Plan: 3 to add, 5 to change, 3 to destroy.
    
    Changes to Outputs:
    ~ kubernetes_sa_annotations = {
          - researchdelight = "eks.amazonaws.com/role-arn: arn:aws:iam::790657130469:role/2i2c-aws-us-researchdelight"
          + showcase        = (known after apply)
          # (5 unchanged attributes hidden)
       }
  5. Applying this, and updating basehub.userServiceAccount.annotations to reference the renamed AWS role, things work again! (A quick end-to-end check sketch follows.)
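
A quick end-to-end check after redeploying might look like this (a sketch, assuming the user image ships s3fs and SCRATCH_BUCKET is set):

import os
import s3fs

# The injected role ARN should now reference the renamed showcase role rather
# than the old researchdelight one.
print(os.environ["AWS_ROLE_ARN"])

# Writing a small object exercises the whole chain: projected token -> STS
# AssumeRoleWithWebIdentity -> S3 bucket policy.
fs = s3fs.S3FileSystem()
with fs.open(f"{os.environ['SCRATCH_BUCKET']}/write-test.txt", "wb") as f:
    f.write(b"ok")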

@jnywong (Member) commented Jan 31, 2024

Thanks for the update @consideRatio! I will definitely take the lessons learned from this to update the Service Docs accordingly.

@consideRatio (Contributor):

Thanks for the update @consideRatio! I will definitely take the lessons learned from this to update the Service Docs accordingly.

Thank you @jnywong!!! I opened #3663 to extract "Issue 1" from this issue, which I closed as part of verifying that bucket access was once again functioning.
