-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-18073. Adds in s3 client config and updates list. #4706
HADOOP-18073. Adds in s3 client config and updates list. #4706
Conversation
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good!
I've added some suggestions - to put it simply, let's do whatever we can to reduce the diff and improve legibility to make it easier to review since we're going to end up touching a lot of files across S3A in the process of this upgrade.
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/AWSClientConfig.java
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/AWSClientConfig.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/AWSClientConfig.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/MockS3ClientFactory.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestConstants.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/impl/TestRequestFactory.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some small feedback on the region determination code I missed on first review. After that, should be ready for +1.
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java
Outdated
Show resolved
Hide resolved
// build a s3 client with region eu-west-2 that can be used to get the region of the bucket. | ||
S3Client s3Client = S3Client.builder().region(Region.EU_WEST_2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eu-west-2
/ London seems a bit of a random decision.
Let's be intentional by using the AWS_GLOBAL
region value - this will result in us-east-1
. (Only works for commercial regions. Not sure if we can do anything better for China / US Gov Cloud?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
documented in later commit. "us-east-1"
/AWS_GLOBAL
can't be used. thanks Ahmar
Thanks @dannycjones, have moved those things to constants. RE using Not using I'm not sure how the region logic should behave and how it should handle failures. Returning I'm also not sure if this new region/endpoint logic is sufficient to handle third party stores, keen to know what other people think. |
@steveloughran / @mukund-thakur , I've opened a PR with the first set of changes for the SDK upgrade, could you please take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 (non-binding)
Thanks Ahmar!
🎊 +1 overall
This message was automatically generated. |
closing this, these changes will be part of a new PR I'll open once the feature branch is in sync with trunk. |
This provides the first set of changes for upgrading the s3 client. It configures the s3 client and updates the list operation.
Configuring the client
The createAwsConf method is now split into:The table below lists the configurations S3A was using and what they now map to.
Endpoint and region configuration
Previously, if no endpoint and region was configured, fall back to using us-east-1 . Set withForceGlobalBucketAccessEnabled(true) which will allow access to buckets not in this region too.
Since the SDK V2 no longer supports cross region access, we need to set the region and endpoint of the bucket. The behaviour has now been changed to:
Things not done yet:
Updating ListOperations
This was pretty straightforward. A few things to note:
Failing Tests
The following tests are failing: