-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't flake out on 404s #32
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
👍 |
rtyley
added a commit
to guardian/etag-caching
that referenced
this pull request
Sep 6, 2023
This is a hack, and it probably won't be able to fully conceal that fact. The facia-scala-client, in normal use, is asked for collections that don't yet have collection JSON, and may never have it (apparently, if a collection has never been edited, its collection JSON won't exist). In the past, this would be handled by generating a `FaciaNotFound` response for `com.gu.facia.client.S3Client.get()` which `ApiClient` would cheerfully _not_ throw an exception for: https://github.com/guardian/facia-scala-client/blob/f9830fd286d409662be850ff441792ec03c1c2e2/facia-json/src/main/scala/com/gu/facia/client/ApiClient.scala#L25-L27 ...so encountering the missing resource would not be logged as an exception, **which is the desired behaviour**, see guardian/facia-scala-client#32 However, as my attempt (guardian/facia-scala-client#287) to introduce ETag-caching & AWS SDK v2 support rejects the `com.gu.facia.client.S3Client` trait, in favour of adapting `com.gu.facia.client.ApiClient`, I need to provide that same exception-supressing behaviour. The task is complicated because currently `ETagCache` holds a scaffeine cache with **value** type `ETaggedData[V]` - there's no successful response for the cache loader to return for keys that can denote 'resource missing' - any successful value must have an ETag for one thing, as `ETaggedData` requires it, but when a resource is missing, we have no ETag. This commit tried to deal with the problem by replacing the `S3Exception` with a `CancellationException`, which caffeine special-cases as exception to _not_ log: ben-manes/caffeine@1e52b10 ...however this gets even yuckier than we might have thought, because it means that `facia-scala-client`, as the consumer of `etag-caching`, has to know that it must now try to catch `CancellationException`, rather than `S3Exception`, when it is trying to do it's _own_ suppression of exception-logging.
rtyley
added a commit
to guardian/etag-caching
that referenced
this pull request
Jan 5, 2024
This is a hack, and it probably won't be able to fully conceal that fact. The facia-scala-client, in normal use, is asked for collections that don't yet have collection JSON, and may never have it (apparently, if a collection has never been edited, its collection JSON won't exist). In the past, this would be handled by generating a `FaciaNotFound` response for `com.gu.facia.client.S3Client.get()` which `ApiClient` would cheerfully _not_ throw an exception for: https://github.com/guardian/facia-scala-client/blob/f9830fd286d409662be850ff441792ec03c1c2e2/facia-json/src/main/scala/com/gu/facia/client/ApiClient.scala#L25-L27 ...so encountering the missing resource would not be logged as an exception, **which is the desired behaviour**, see guardian/facia-scala-client#32 However, as my attempt (guardian/facia-scala-client#287) to introduce ETag-caching & AWS SDK v2 support rejects the `com.gu.facia.client.S3Client` trait, in favour of adapting `com.gu.facia.client.ApiClient`, I need to provide that same exception-supressing behaviour. The task is complicated because currently `ETagCache` holds a scaffeine cache with **value** type `ETaggedData[V]` - there's no successful response for the cache loader to return for keys that can denote 'resource missing' - any successful value must have an ETag for one thing, as `ETaggedData` requires it, but when a resource is missing, we have no ETag. This commit tried to deal with the problem by replacing the `S3Exception` with a `CancellationException`, which caffeine special-cases as exception to _not_ log: ben-manes/caffeine@1e52b10 ...however this gets even yuckier than we might have thought, because it means that `facia-scala-client`, as the consumer of `etag-caching`, has to know that it must now try to catch `CancellationException`, rather than `S3Exception`, when it is trying to do it's _own_ suppression of exception-logging.
rtyley
added a commit
that referenced
this pull request
Jan 8, 2025
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_ (thanks to [ETag-caching](https://github.com/guardian/etag-caching)) * Independence from AWS SDK version (v1 vs v2) _(this PR can replace #286 The [ETag-caching](https://github.com/guardian/etag-caching) library itself is also being used in DotCom PROD, introduced with guardian/frontend#26338. ### Usage ```scala import com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching import com.gu.facia.client.{ApiClient, Environment} import software.amazon.awssdk.services.s3.S3AsyncClient val s3AsyncClient = S3AsyncClient.builder().region(...).credentialsProvider(...).build() val apiClient = ApiClient.withCaching( "facia-tool-store", Environment.Prod, S3ObjectFetching.byteArraysWith(s3AsyncClient) ) ``` _PR using this updated version of the FAPI client: https://github.com/guardian/ophan/pull/6741_ ### Independence from AWS SDK version (v1 vs v2) Ideally, the whole of `facia-scala-client` would be independent of AWS SDK version - we'd _like_ consumers of this library to be able to use whatever AWS SDK version they want, without us pulling in dependency on either SDK version. For `facia-scala-client` this is an attainable goal, as the only AWS API action it performs is fetching from S3, and [guardian/etag-caching](https://github.com/guardian/etag-caching) provides the [`S3ByteArrayFetching`](https://github.com/guardian/etag-caching/blob/v6.0.0/aws-s3/base/src/main/scala/com/gu/etagcaching/aws/s3/package.scala#L6-L22) abstraction that encapsulates this action without tying to a specific AWS SDK version. Due to legacy code compatibility, we can't completely remove AWS SDK v1 from `fapi-client` for now, but we _have_ removed it from `fapi-client-core`, which is the new home of `com.gu.facia.client.ApiClient`, which is now a trait, with 2 constructor methods that provide different implementations: * **`ApiClient()`** - legacy, using the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * **`ApiClient.withCaching()`** - provides ETag-based caching and is independent of AWS SDK version - the consumer just needs to provide an appropriate instance of `com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching` (ie with `"com.gu.etag-caching" %% "aws-s3-sdk-v2"` and `S3ObjectFetching.byteArraysWith(s3AsyncClient)`, introduced with guardian/etag-caching#65) ### Solved problems * **Noisy logging associated with absent collection JSON** - the `etag-caching` library has been updated with guardian/etag-caching#56 to avoid excessive logging that would occur in the Facia client, due to it typically trying to access collections that aren't yet persisted: #32.
rtyley
added a commit
that referenced
this pull request
Jan 8, 2025
This change adds these improvements: * Facia data is only re-downloaded & re-parsed if the S3 content has _changed_ (thanks to [ETag-caching](https://github.com/guardian/etag-caching)) * Independence from AWS SDK version (v1 vs v2) _(this PR can replace #286 The [ETag-caching](https://github.com/guardian/etag-caching) library itself is also being used in DotCom PROD, introduced with guardian/frontend#26338. ### Usage ```scala import com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching import com.gu.facia.client.{ApiClient, Environment} import software.amazon.awssdk.services.s3.S3AsyncClient val s3AsyncClient = S3AsyncClient.builder().region(...).credentialsProvider(...).build() val apiClient = ApiClient.withCaching( "facia-tool-store", Environment.Prod, S3ObjectFetching.byteArraysWith(s3AsyncClient) ) ``` _PR using this updated version of the FAPI client: https://github.com/guardian/ophan/pull/6741_ ### Independence from AWS SDK version (v1 vs v2) Ideally, the whole of `facia-scala-client` would be independent of AWS SDK version - we'd _like_ consumers of this library to be able to use whatever AWS SDK version they want, without us pulling in dependency on either SDK version. For `facia-scala-client` this is an attainable goal, as the only AWS API action it performs is fetching from S3, and [guardian/etag-caching](https://github.com/guardian/etag-caching) provides the [`S3ByteArrayFetching`](https://github.com/guardian/etag-caching/blob/v6.0.0/aws-s3/base/src/main/scala/com/gu/etagcaching/aws/s3/package.scala#L6-L22) abstraction that encapsulates this action without tying to a specific AWS SDK version. Due to legacy code compatibility, we can't completely remove AWS SDK v1 from `fapi-client` for now, but we _have_ removed it from `fapi-client-core`, which is the new home of `com.gu.facia.client.ApiClient`, which is now a trait, with 2 constructor methods that provide different implementations: * **`ApiClient()`** - legacy, using the existing `com.gu.facia.client.S3Client` abstraction on S3 behaviour * **`ApiClient.withCaching()`** - provides ETag-based caching and is independent of AWS SDK version - the consumer just needs to provide an appropriate instance of `com.gu.etagcaching.aws.sdkv2.s3.S3ObjectFetching` (ie with `"com.gu.etag-caching" %% "aws-s3-sdk-v2"` and `S3ObjectFetching.byteArraysWith(s3AsyncClient)`, introduced with guardian/etag-caching#65) ### Solved problems * **Noisy logging associated with absent collection JSON** - the `etag-caching` library has been updated with guardian/etag-caching#56 to avoid excessive logging that would occur in the Facia client, due to it typically trying to access collections that aren't yet persisted: #32.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
If a collection has never been edited, its collection JSON won't exist. Clients ought to behave properly in that situation, as everything can be inferred from the config.