Update to AWS SDK v2 #286

Draft: wants to merge 4 commits into main

fredex42

What does this change?

Replaces AWS SDK v1 with AWS SDK v2 for S3 access.

How to test

How can we measure success?

Have we considered potential risks?

fredex42 marked this pull request as draft on May 22, 2023 13:47
rtyley added a commit that referenced this pull request Jan 8, 2025
This change adds these improvements:

* Facia data is only re-downloaded & re-parsed if the S3 content has _changed_ (thanks to [ETag-caching](https://github.com/guardian/etag-caching))
* Independence from AWS SDK version (v1 vs v2) _(this PR can replace #286)_

The [ETag-caching](https://github.com/guardian/etag-caching) library itself is also being used in DotCom PROD, introduced with guardian/frontend#26338.
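
To illustrate the first improvement, here is a minimal sketch of the idea behind ETag-based caching, assuming a store that supports conditional fetches. `FakeStore`, `Versioned` and `ETagCache` are hypothetical names invented for this example; they are not part of `etag-caching` or the FAPI client, and the real library will differ in detail (for instance, this sketch is synchronous and not thread-safe).

```scala
import scala.collection.mutable

// Hypothetical, simplified sketch: none of these names come from etag-caching.

// An object together with the ETag that identifies its current content.
final case class Versioned(etag: String, bytes: Array[Byte])

// Stand-in for a remote store that supports conditional fetches.
class FakeStore {
  private val objects = mutable.Map.empty[String, Versioned]
  var downloads = 0 // counts full-body downloads, for illustration only

  def put(key: String, content: String): Unit = {
    val bytes = content.getBytes("UTF-8")
    // A content hash stands in for a real ETag here.
    objects(key) = Versioned(java.util.Arrays.hashCode(bytes).toString, bytes)
  }

  // Returns None when the caller's ETag still matches, so nothing is downloaded.
  def fetchIfChanged(key: String, knownETag: Option[String]): Option[Versioned] = {
    val current = objects(key) // throws NoSuchElementException if the key is absent
    if (knownETag.contains(current.etag)) None
    else { downloads += 1; Some(current) }
  }
}

// Keeps the parsed value alongside its ETag and re-parses only on change.
class ETagCache[V](store: FakeStore, parse: Array[Byte] => V) {
  private val cached = mutable.Map.empty[String, (String, V)]
  var parses = 0 // counts parse invocations, for illustration only

  def get(key: String): V = {
    val known = cached.get(key)
    store.fetchIfChanged(key, known.map(_._1)) match {
      case Some(fresh) =>
        parses += 1
        val value = parse(fresh.bytes)
        cached(key) = (fresh.etag, value)
        value
      case None =>
        known.get._2 // unchanged: reuse the previously parsed value
    }
  }
}
```

In this sketch, repeated `get` calls for an unchanged key cost one conditional check each but no further downloads or parses; only a `put` with different content triggers both again.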

### Usage

```scala
import com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching
import com.gu.facia.client.{ApiClient, Environment}
import software.amazon.awssdk.services.s3.S3AsyncClient

val s3AsyncClient = S3AsyncClient.builder().region(...).credentialsProvider(...).build()

val apiClient = ApiClient.withCaching(
  "facia-tool-store",
  Environment.Prod,
  S3ObjectFetching.byteArraysWith(s3AsyncClient)
)
```

_PR using this updated version of the FAPI client: https://github.com/guardian/ophan/pull/6741_

### Independence from AWS SDK version (v1 vs v2)

Ideally, the whole of `facia-scala-client` would be independent of AWS SDK version - we'd _like_ consumers of this library to be able to use whatever AWS SDK version they want, without us pulling in a dependency on either SDK version.

For `facia-scala-client` this is an attainable goal, as the only AWS API action it performs is fetching from S3, and [guardian/etag-caching](https://github.com/guardian/etag-caching) provides the [`S3ByteArrayFetching`](https://github.com/guardian/etag-caching/blob/v6.0.0/aws-s3/base/src/main/scala/com/gu/etagcaching/aws/s3/package.scala#L6-L22) abstraction that encapsulates this action without tying to a specific AWS SDK version.
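
As a rough illustration of why such an abstraction removes the SDK dependency, the sketch below has library code depend only on a small trait, with the consumer supplying the implementation. `BytesFetcher`, `TextLoader` and `InMemoryFetcher` are hypothetical names for this example only; the real abstraction is `S3ByteArrayFetching`, whose signature is not reproduced here.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical stand-in for an SDK-neutral fetching abstraction.
trait BytesFetcher {
  def fetch(bucket: String, key: String): Future[Array[Byte]]
}

// Library code depends only on the trait, so no AWS SDK types appear here.
class TextLoader(fetcher: BytesFetcher)(implicit ec: ExecutionContext) {
  def loadText(bucket: String, key: String): Future[String] =
    fetcher.fetch(bucket, key).map(bytes => new String(bytes, "UTF-8"))
}

// A consumer would supply an adapter backed by whichever SDK version they use;
// this in-memory implementation stands in for such an adapter.
class InMemoryFetcher(objects: Map[(String, String), String]) extends BytesFetcher {
  def fetch(bucket: String, key: String): Future[Array[Byte]] =
    objects.get((bucket, key)) match {
      case Some(text) => Future.successful(text.getBytes("UTF-8"))
      case None       => Future.failed(new NoSuchElementException(s"$bucket/$key"))
    }
}
```

Because `TextLoader` never names an SDK class, swapping SDK v1 for v2 (or a test double) is a matter of passing a different `BytesFetcher`.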

Due to legacy code compatibility, we can't completely remove AWS SDK v1 from `fapi-client` for now, but we _have_ removed it from `fapi-client-core`, the new home of `com.gu.facia.client.ApiClient`. `ApiClient` is now a trait, with 2 constructor methods that provide different implementations:

* **`ApiClient()`** - legacy, using the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour
* **`ApiClient.withCaching()`** - provides ETag-based caching and is independent of AWS SDK version - the consumer just needs to provide an appropriate instance of `com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching` (i.e. with `"com.gu.etag-caching" %% "aws-s3-sdk-v2"` and `S3ObjectFetching.byteArraysWith(s3AsyncClient)`, introduced with guardian/etag-caching#65)
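
The "trait with two constructor methods" shape described above can be sketched as follows. Everything here is a hypothetical, simplified analogue (`Client`, `LegacyStore`, the `read` method and the path layout are invented for illustration), and naive memoisation stands in for real ETag-based caching; it is not the actual `ApiClient` code.

```scala
import scala.collection.mutable

// Hypothetical names throughout: a simplified analogue, not the real FAPI client API.

// Plays the role of the legacy S3 abstraction.
trait LegacyStore {
  def get(bucket: String, path: String): String
}

trait Client {
  def read(name: String): String
}

object Client {
  // Analogue of `ApiClient()`: delegates every read to the legacy abstraction.
  def apply(bucket: String, env: String, store: LegacyStore): Client =
    new Client {
      def read(name: String): String = store.get(bucket, s"$env/$name")
    }

  // Analogue of `ApiClient.withCaching()`: the caller injects the fetch function,
  // and naive memoisation stands in for ETag-based caching.
  def withCaching(bucket: String, env: String, fetch: (String, String) => String): Client =
    new Client {
      private val cache = mutable.Map.empty[String, String]
      def read(name: String): String =
        cache.getOrElseUpdate(name, fetch(bucket, s"$env/$name"))
    }
}
```

Callers program against the `Client` trait, so existing code can keep the legacy constructor while new code opts in to the caching one without any change to call sites that only use the trait.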

### Solved problems

* **Noisy logging associated with absent collection JSON** - the `etag-caching` library has been updated with guardian/etag-caching#56 to avoid the excessive logging that would otherwise occur in the Facia client, which typically tries to access collections that aren't yet persisted: #32.