-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-16759. Filesystem openFile() builder to take a FileStatus param #1761
HADOOP-16759. Filesystem openFile() builder to take a FileStatus param #1761
Conversation
tested -s3a ireland. Not tested the other stores to make sure they don't break (as they don't read the status, it's hard to see how). Could also add more failure tests (path mismatch, ...) and use s3a metrics to verify HEAD doesn't count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me overall. I can give another round review later. Thanks!
* | ||
* If/when new attributes added to the builder, this class will be extended. | ||
*/ | ||
public class OpenFileParameters { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking does it need a builder itself for easier construction and being immutable?
OpenFileParameters parameters = new OpenFileParameters();
parameters.setMandatoryKeys(getMandatoryKeys());
parameters.setOptions(getOptions());
parameters.setBufferSize(getBufferSize());
parameters.setStatus(getStatus());
to
OpenFileParameters parameters = OpenFileParameters.builder()
.mandatoryKeys(getMandatoryKeys())
.ptions(getOptions())
.bufferSize(getBufferSize())
.status(getStatus())
.build();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's just overkill IMO...just a struct we can pass round and expand in an implementation side AP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, makes sense. This is totally fine.
I was thinking if there would be more parameters to support and some parameters are optional, a builder can be better in some cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noted. No real reason not to I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gone half way with this; renamed set* to with* and return the same object for ease of chaining
getMandatoryKeys(), | ||
getOptions(), | ||
getBufferSize()); | ||
parameters); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: merge with previous line seems no longer than 80
@@ -4366,15 +4378,38 @@ private void requireSelectSupport(final Path source) throws | |||
InternalConstants.STANDARD_OPENFILE_KEYS, | |||
"for " + path + " in non-select file I/O"); | |||
} | |||
FileStatus status = parameters.getStatus(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This status
can be named providedStatus
or something. Clearer as it's referred multiple times following.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
.build().get()) { | ||
instream.read(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about fs.openFile(testPath)
without file status? It will simply fail right because S3Guard has wrong status? Do we need a test for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean with the file version changed? It shouldn't notice the version issues without s3guard, will fai on read() after a number of retriesl when s3guard version doesn't match that in the store
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two try-with tests are testing that with a good FileStatus passed, it's able to skip the S3Guard and hence no errors are reported regardless of change detection policy.
I was thinking is it worth to make sure if we don't pass file status, the forged file status is indeed used. So it will error out because of bad etag. For e.g.
try(FSDataInputStream instream = fs.openFile(testpath)
.build().get()) {
try {
instream.read();
// No exception only if we don't enforce change detection as exception
assertTrue(changeDetectionMode.equals(CHANGE_DETECT_MODE_NONE) ||
changeDetectionMode.equals(CHANGE_DETECT_MODE_WARN));
} catch (Exception ignored) {
// Ignored.
}
}
I think again and now guess this extra test perhaps is not required to demonstrate the withFileStatus is being honored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. Also a contract test to verify that the status must be non-null
final Optional<Configuration> options) | ||
final Path file, | ||
final Optional<Configuration> options, | ||
final S3AFileStatus status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, but status can be Optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
thanks for the comments. I'm on vacation until jan so will address them then. |
2f7e666
to
217a0b1
Compare
update with the recommended changes, and some extra tests I could think of. Sorry, but I rebased the branch while I wasn't paying full attention -things aren't going to match up properly. I'm going to do that again once I've done the auth mode merge as I want as much QE of that patch as I can |
* Enhanced builder + FS spec * s3a FS to use this to skip HEAD on open * and to use version/etag when opening the file works with S3AFileStatus FS and S3ALocatedFileStatus Change-Id: If80f73137643fd50a969a92ad5794d0d09e3aee6
* ~builder API for openFileParameters * S3AFS moves to Optional<S3AFileStatus> where appropriate * Which includes the select() operation too * extra test to verify pickup of invalid status when a valid one is not passed in * also contract test to verify that null value is rejected Change-Id: I391abb6030503fc6288c6494156b85391cb7c196
* approximate builder API for openFileParameters * S3AFS moves to Optional<S3AFileStatus> where appropriate, which includes the select() operation too. There's a common method to do the extraction. * Switch to objects.requireNonNull in stream builder / openFileParameters plus error text (NPE debugging...) Extra tests * verify pickup of invalid status when none is supplied * rejection of a status declaring the source is a directory * after an object is deleted, the 404 isn't picked up until the read() call initiates the GET. * And there, if you use server-side versionID, *you get the file still* * +contract test to verify that withFileStatus(null) status value is rejected Change-Id: I326f68538940f245e1af56d3b6055015fd3e1bfe
217a0b1
to
6a1a951
Compare
💔 -1 overall
This message was automatically generated. |
style
|
Address the checkstyle issues. Add some tests of the API in ITestS3GuardOutOfBandOperations, for auth and non-auth. This includes the discovery (and fix!) of the fact that with the specific s3guard retry logic of HADOOP-16490, we needed to set a different retry option to get the tests to fail fast on deleted file in auth mode. It has been like that for a few months, but we've never noticed...though even in parallel runs it would have reduced performance by using up a process for 2+ minutes. Running in the IDE I initially thought my changes had broken something. Also: ITestS3Select fails as trying to select a dir raises an FNFE, just as open() always has. Because we skip looking for dir markers in select or open, attempts to read a nonexistent file will fail faster (though still add a 404 to the S3 cache) Change-Id: I33ffc90ce470c590143cebacc6efd2d2849d2106
Latest iteration -tested s3 ireland *Address the checkstyle issues.
This includes the discovery (and fix!) of the fact that with the Also: ITestS3Select fails as trying to select a dir raises an FNFE, |
💔 -1 overall
This message was automatically generated. |
b05fae7
to
76e3d46
Compare
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
apache#1761). Contributed by Steve Loughran * Enhanced builder + FS spec * s3a FS to use this to skip HEAD on open * and to use version/etag when opening the file works with S3AFileStatus FS and S3ALocatedFileStatus
apache#1761). Contributed by Steve Loughran * Enhanced builder + FS spec * s3a FS to use this to skip HEAD on open * and to use version/etag when opening the file works with S3AFileStatus FS and S3ALocatedFileStatus
apache#1761). Contributed by Steve Loughran * Enhanced builder + FS spec * s3a FS to use this to skip HEAD on open * and to use version/etag when opening the file works with S3AFileStatus FS and S3ALocatedFileStatus
works with S3AFileStatus FS and S3ALocatedFileStatus
Change-Id: If80f73137643fd50a969a92ad5794d0d09e3aee6