-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-16599. Allow a SignerInitializer to be specified along with a #1516
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initial patch looks OK
-
I've noted where I want classes moved into .auth or .impl groups; consider the base fs.s3a module somewhere where things shouldn't be dropped in any more
-
and what scope/stability tags we should have on a new feature
-
Have you thought about how to do an integration test here? I could imagine a custom signer which just forwards to the AWS signer
-
and what about collecting metrics on this, e.g. #of signing requests made. We could have another callback under org.apache.hadoop.fs.s3a.S3AInstrumentation which the signers could use to pass this info back
* limitations under the License. | ||
*/ | ||
|
||
package org.apache.hadoop.fs.s3a; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to go into fs.s3a.impl.auth
@InterfaceAudience.LimitedPrivate("authorization-subsystems")
@InterfaceStability.Unstable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have moved it under fs.s3a.auth (not fs.s3a.auth.impl). This is an interface which is meant to be implemented by others.
Removed the interface* annotation in favor or package-info.
* | ||
*/ | ||
@Public | ||
@Evolving |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* CustomSigners -> 'CSigner1:CustomSignerClass1,CSigner2:CustomerSignerClass2 | ||
* name will be associated with this signer class in the S3 SDK. | ||
* Examples | ||
* CustomSigner -> 'CustomSigner:org.apache...CustomSignerClass' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just had to merge https://issues.apache.org/jira/browse/HADOOP-16602 in to fix the javadoc here
- merge and copy the fix
- do a test run with mvn package to see all is well (I do mvn javadoc:javadoc when I remember)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Surprised the pre-commit didn't catch this.
* limitations under the License. | ||
*/ | ||
|
||
package org.apache.hadoop.fs.s3a; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
org.apache.hadoop.fs.s3a.auth.delegation;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
On a related note - what are your thoughts on moving some of these delegation and auth interfaces to a new module - something like s3a-plugins. That makes it easier for downstream projects to have a limited dependency which doesn't pull in all of S3AFileSystem, aws-sdk etc. Would be a separate jira ofcourse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry it'd force even more version brittleness: "you are trying to use the hadoop-aws-plugins-3.2.1 with hadoop-aws-3.2.2". If you ever search for spark + s3 on stack overflow you can see that the #1 recurrent complaint is "I added hadoop-aws-3.1 to hadoop and now I get a class not found exception.
This is why the S3A troubleshooting docs start by telling people not to mix jars as all they do is move stack traces around https://hadoop.apache.org/docs/current3/hadoop-aws/tools/hadoop-aws/troubleshooting_s3a.html#Classpath_Setup
I don't want to make things worse. if you don't want the aws sdk, don't include it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aws-plugins would help break some circular dependencies.
Users would ofcourse need to make sure the versions of aws-plugins and aws are the same. In fact, users would only import 'aws' which would automatically include the 'aws-plugins' - in that sense this would not be an incompatible change.
The intent of switching 'aws-plugins' to a separate module is to allow people writing plugins to not depend on everything, and potentially help with circular dependencies.
IAC, that's unrelated to this jira an PR. Can be taken up later if it makes sense.
/** | ||
* Interface for S3A Delegation Token access. | ||
*/ | ||
@Public |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The auth.delegation package is declared as
@InterfaceAudience.LimitedPrivate("authorization-subsystems")
@InterfaceStability.Unstable
This is exactly the guarantees we should be making here. We don't know whether this is going to work, and should not make any commitments about stability.
you can just cut these lines are rely on package-info
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
import java.io.Closeable; | ||
import java.io.IOException; | ||
import java.util.LinkedList; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class has only gone in. I thought it was going into .impl, but clearly not
- Move in this patch to a subdir; s3a.auth is the most appropriate
- Sort out the imports
I don't want any new implementation classes to go into fs.s3a; I have a goal of marking the impl as private in java 11 modules. Please avoid adding things there unless unavoidable -and explain to have to justify. thx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to s3a.auth.
What needs to change in the imports?
It's using
java.*
\n
Everything other than org.apache.*
\n
org.apache.*
\n
static imports
@@ -1879,3 +1879,61 @@ To disable checksum verification in `distcp`, use the `-skipcrccheck` option: | |||
hadoop distcp -update -skipcrccheck -numListstatusThreads 40 /user/alice/datasets s3a://alice-backup/datasets | |||
``` | |||
|
|||
### <a name="customsigners"></a> Advanced - Custom Signers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably time to add a whole new authentication.md file, linked off the index.md file; index.md is a bit to big and we actually need to document things like the standard set of signers.
A new file actually makes merging easier...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explicitly keeping the signer documentation vague. This is not a feature that's going to be used by a lot of people. The default signers will change with the SDK version - and are mentioned in the documentation already.
I'd prefer not having a page which talks only about signing (the new auth page) - again don't want to call this out since it's not something that majority of users will want to touch. Delegation is already a separate page from what I can tell.
|
Updated PR based on the review comments, except for what is called out further up. Also adds unit tests and an integration test. |
🎊 +1 overall
This message was automatically generated. |
Tests run against a bucket in us-east-2 |
🎊 +1 overall
This message was automatically generated. |
patch LGTM, +1 once you fix whatever merge conflicts have crept in (Constants, inevitably) regarding instrumentation, it'd make sense to have some interface for the signers to invoke with some signed/rejected counters; we'd have an implementation in S3AInstrumentation which would be the one normally passed down. |
🎊 +1 overall
This message was automatically generated. |
Fixed the merge conflicts, and have run mvn javadoc:javadoc successfully. Thanks for the review. Merging the changes. |
Patch is missing some unit tests, which will get added soon. Posting early to solicit feedback on the interface additions - @steveloughran