Hadoop-18184. Adds support for unbuffer #4298
Closed
ahmarsuhail wants to merge 11 commits into apache:feature-HADOOP-18028-s3a-prefetch from ahmarsuhail:HADOOP-18184-support-unbuffer
This is the initial merge of the HADOOP-18028 S3A performance input stream. This patch on its own is incomplete and must be accompanied by all other commits with HADOOP-18028 in their git commit message. Consult the JIRA for that list. Contributed by Bhalchandra Pandit.
…3A prefetching stream (apache#4115) Contributed by PJ Fanning.
Contributed by Ahmar Suhail
…ache#4212) Contributed by Monthon Klongklaew
💔 -1 overall
This message was automatically generated.
@ahmarsuhail for JIRA to pick up the PR, can you update the title to "HADOOP-18184. Add support for unbuffer"?
ahmarsuhail changed the title from "Hadoop 18184. Adds support for unbuffer" to "Hadoop-18184. Adds support for unbuffer" on May 18, 2022
asfgit force-pushed the feature-HADOOP-18028-s3a-prefetch branch from f38bbe2 to b75b72b on May 30, 2022 16:50
Description of PR
This PR adds support for unbuffer.
Unbuffer is used by certain applications (e.g. Impala) that want to hold on to an input stream while freeing the resources it is using. When the application needs to read from the stream again, it does not have to reopen it, which saves on HEAD calls.
For prefetching, unbuffer needs to free up the buffer pool, delete any local files, clear state about blocks in the file, and so on. When a read follows an unbuffer, the input stream should reinitialise all of this state. It should also resume reading from the position that was last active before the read.
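To illustrate the contract described above, here is a minimal, self-contained sketch. It is not the code in this PR and does not use any Hadoop or S3A classes: `ToyPrefetchingStream`, its block cache, and the `getRemoteReads()` counter are all hypothetical names invented for the illustration, and the in-memory byte array stands in for the remote object. It only shows the intended behaviour: `unbuffer()` drops the buffered state but keeps the position, and the next read lazily rebuilds the state.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a prefetching stream. Hypothetical; NOT the S3A implementation. */
class ToyPrefetchingStream {
    private final byte[] remoteObject;        // stands in for the object in the store
    private final int blockSize;
    private Map<Integer, byte[]> blockCache;  // null while the stream is unbuffered
    private long position;                    // deliberately survives unbuffer()
    private int remoteReads;                  // counts simulated block fetches

    ToyPrefetchingStream(byte[] remoteObject, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.remoteObject = remoteObject;
        this.blockSize = blockSize;
    }

    /** Reads one byte at the current position, or returns -1 at end of object. */
    int read() {
        if (position >= remoteObject.length) {
            return -1;
        }
        ensureInitialised(); // rebuild state lazily after an unbuffer
        int blockNumber = (int) (position / blockSize);
        byte[] block = blockCache.computeIfAbsent(blockNumber, this::fetchBlock);
        int value = block[(int) (position % blockSize)] & 0xFF;
        position++;
        return value;
    }

    void seek(long newPosition) {
        if (newPosition < 0 || newPosition > remoteObject.length) {
            throw new IllegalArgumentException("position out of range: " + newPosition);
        }
        position = newPosition;
    }

    long getPos() {
        return position;
    }

    /** Frees all buffered blocks; the stream stays usable and keeps its position. */
    void unbuffer() {
        blockCache = null;
    }

    boolean isBuffered() {
        return blockCache != null;
    }

    int getRemoteReads() {
        return remoteReads;
    }

    private void ensureInitialised() {
        if (blockCache == null) {
            blockCache = new HashMap<>();
        }
    }

    private byte[] fetchBlock(int blockNumber) {
        remoteReads++;
        int start = blockNumber * blockSize;
        int end = Math.min(start + blockSize, remoteObject.length);
        return Arrays.copyOfRange(remoteObject, start, end);
    }
}
```

In this sketch a caller can read a few bytes, call `unbuffer()`, and then keep reading or seek without reopening anything; the only cost is that blocks are fetched again on demand. The real stream also has to release pooled buffers and delete local cache files, which the toy model does not attempt.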
How was this patch tested?
Tested in eu-west-1 by running
mvn -Dparallel-tests -DtestsThreadCount=16 clean verify
ITestS3AInputStreamPerformance is failing, unrelated to this PR. Created issue: https://issues.apache.org/jira/browse/HADOOP-18231
ITestS3AUnbuffer fails: the instance-of assertion and isObjectStreamOpen() fail. Similar to the above issue, there are a few different ways to fix this test, and I'm not sure which is best. Parameterized tests with different assertions depending on whether prefetching is enabled, or new tests? I've left it failing for now.
All unbuffer contract tests are passing now.
Also tested a few different read sequences. For example, seek should work after an unbuffer: