
HADOOP-19027. S3A: S3AInputStream doesn't recover from HTTP/channel exceptions #6425

Merged

Conversation

@steveloughran (Contributor) commented Jan 8, 2024

  • Add a new Path IOE subclass, AwsHttpChannelException.
  • Method ErrorTranslation.maybeExtractChannelException() to create this from the shaded/unshaded NoHttpResponseException, using a string match to avoid classpath problems (an illustrative sketch follows this description).
  • And do this for SdkClientExceptions with OpenSSL error code WFOPENSSL0035. This isn't strictly the right place for this as it's not an IOE we are remapping...
  • ErrorTranslation.maybeExtractIOException() to perform this translation as appropriate.

S3AInputStream.reopen() code now retries, except on EOFException, which is converted to a -1 response to the caller, as is done historically.

New tests for the translation, but not for the input stream logic. The place for that will be TestS3AInputStreamRetry.
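For illustration, here is a minimal, self-contained sketch of the string-matching translation described above. The class and method below are stand-ins invented for this example; the actual hadoop-aws ErrorTranslation code differs in detail.

import java.io.IOException;

// Sketch only: walk the exception chain and map channel failures to a
// dedicated IOException, matching on class name and message text so that
// neither the shaded nor the unshaded NoHttpResponseException has to be
// on the classpath.
final class ChannelErrorTranslationSketch {

  /** Hypothetical stand-in for the retryable channel exception. */
  static final class HttpChannelException extends IOException {
    HttpChannelException(String path, String message, Throwable cause) {
      super("HTTP channel failure on " + path + ": " + message, cause);
    }
  }

  static IOException maybeExtractChannelException(String path, Throwable thrown) {
    for (Throwable t = thrown; t != null; t = t.getCause()) {
      String className = t.getClass().getName();
      String message = String.valueOf(t.getMessage());
      if (className.endsWith(".NoHttpResponseException")   // shaded or unshaded
          || message.contains("WFOPENSSL0035")) {          // OpenSSL channel error
        return new HttpChannelException(path, message, thrown);
      }
    }
    return null;  // not a channel problem; caller falls back to other mappings
  }
}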

How was this patch tested?

New unit test cases; so far I have run the contract tests.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 39s trunk passed
+1 💚 compile 0m 42s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 29s trunk passed
+1 💚 mvnsite 0m 42s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 9s trunk passed
+1 💚 shadedclient 37m 46s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 36s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 20s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 2 new + 8 unchanged - 0 fixed = 10 total (was 8)
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 7s the patch passed
+1 💚 shadedclient 37m 33s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 46s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
138m 22s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/1/artifact/out/Dockerfile
GITHUB PR #6425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 7b15f999737c 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 0776a2e
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/1/testReport/
Max. process+thread count 589 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@mukund-thakur (Contributor)

Re-ran the whole test suite as well. Don't see any new failures.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 22s trunk passed
+1 💚 compile 0m 42s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 29s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 7s trunk passed
+1 💚 shadedclient 37m 32s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 37m 52s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 34s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 19s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 2 new + 8 unchanged - 0 fixed = 10 total (was 8)
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 37m 42s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 50s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
138m 0s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/2/artifact/out/Dockerfile
GITHUB PR #6425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 32e3fe1f71b0 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a22fd68
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/2/testReport/
Max. process+thread count 532 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran force-pushed the s3/HADOOP-19027-stream-error-recovery branch from a22fd68 to 4537fa2 on January 9, 2024 at 21:19
@mukund-thakur (Contributor) left a comment

Overall looks good to me. Good tests.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 48s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 18s trunk passed
+1 💚 compile 0m 42s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 9s trunk passed
+1 💚 shadedclient 37m 54s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 34s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 20s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 27 new + 8 unchanged - 0 fixed = 35 total (was 8)
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 37m 54s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 50s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
138m 35s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/3/artifact/out/Dockerfile
GITHUB PR #6425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux ac9a414217a8 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 4537fa2
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/3/testReport/
Max. process+thread count 529 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran (Contributor, Author)

Note: the AWS SDK has done something here with Netty:
aws/aws-sdk-java-v2#4655

@steveloughran (Contributor, Author)

Testing

[ERROR] Tests run: 17, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 48.129 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextURI
[ERROR] testCreateDirectory(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextURI)  Time elapsed: 13.4 s  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSS3IOException: 
Remove S3 Dir Markers on s3a://stevel-london/Users/stevel/Projects/hadoop-trunk/hadoop-tools/hadoop-aws/target/test-dir/7/testContextURI/createTest: org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteException: [S3Error(Key=Users/stevel/Projects/hadoop-trunk/hadoop-tools/hadoop-aws/target/test-dir/7/testContextURI/createTest/()&^%$#@!~_+}{><?/, Code=InternalError, Message=We encountered an internal error. Please try again.)] (Service: Amazon S3, Status Code: 200, Request ID: null):MultiObjectDeleteException: InternalError: Users/stevel/Projects/hadoop-trunk/hadoop-tools/hadoop-aws/target/test-dir/7/testContextURI/createTest/()&^%$#@!~_+}{><?/: We encountered an internal error. Please try again.
: [S3Error(Key=Users/stevel/Projects/hadoop-trunk/hadoop-tools/hadoop-aws/target/test-dir/7/testContextURI/createTest/()&^%$#@!~_+}{><?/, Code=InternalError, Message=We encountered an internal error. Please try again.)] (Service: Amazon S3, Status Code: 200, Request ID: null)
        at org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteException.translateException(MultiObjectDeleteException.java:136)
        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:347)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:163)
        at org.apache.hadoop.fs.s3a.impl.DeleteOperation.asyncDeleteAction(DeleteOperation.java:445)
        at org.apache.hadoop.fs.s3a.impl.DeleteOperation.lambda$submitDelete$2(DeleteOperation.java:403)
        at org.apache.hadoop.fs.store.audit.AuditingFunctions.lambda$callableWithinAuditSpan$3(AuditingFunctions.java:119)
        at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:88)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
        at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
        at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteException: [S3Error(Key=Users/stevel/Projects/hadoop-trunk/hadoop-tools/hadoop-aws/target/test-dir/7/testContextURI/createTest/()&^%$#@!~_+}{><?/, Code=InternalError, Message=We encountered an internal error. Please try again.)] (Service: Amazon S3, Status Code: 200, Request ID: null)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.deleteObjects(S3AFileSystem.java:3174)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeysS3(S3AFileSystem.java:3405)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeys(S3AFileSystem.java:3475)
        at org.apache.hadoop.fs.s3a.S3AFileSystem$OperationCallbacksImpl.removeKeys(S3AFileSystem.java:2491)
        at org.apache.hadoop.fs.s3a.impl.DeleteOperation.lambda$asyncDeleteAction$8(DeleteOperation.java:447)
        at org.apache.hadoop.fs.s3a.Invoker.lambda$once$0(Invoker.java:165)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
        ... 11 more

@steveloughran (Contributor, Author) left a comment

Minor nits; will fix.

if (values.length == 2) {
try {
long start = Long.parseUnsignedLong(values[0]);
long end = Long.parseUnsignedLong(values[0]);
@steveloughran (Contributor, Author):

Hey, got this wrong, as it is in the production code... will fix.
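Presumably the intended fix (an assumption based on the comment above, not a verified excerpt of the final patch) is to take the end of the range from the second value. A small self-contained version:

final class RangeParseSketch {
  /** Hypothetical corrected parse: the end of the range comes from values[1]. */
  static long[] parseRange(String[] values) {
    if (values.length == 2) {
      long start = Long.parseUnsignedLong(values[0]);
      long end = Long.parseUnsignedLong(values[1]);  // not values[0] as in the snippet above
      return new long[] {start, end};
    }
    return null;
  }
}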

.cause(cause)
.build();
}

@steveloughran (Contributor, Author):

cut this

}

},
with(Statistic.ACTION_HTTP_GET_REQUEST, 0)); // vector stats don't add this
@steveloughran (Contributor, Author):

Needs to await the result; then the count will be incremented (see the sketch below).
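For context, a sketch of what awaiting the result could look like, assuming the vectored-read API where each FileRange exposes its data as a future; the helper class below is hypothetical:

import java.nio.ByteBuffer;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.fs.FileRange;
import org.apache.hadoop.util.functional.FutureIO;

// Sketch only: block on every range's future before asserting the
// ACTION_HTTP_GET_REQUEST statistic, so the GETs have actually completed.
final class AwaitVectoredReadSketch {
  static void awaitAll(List<FileRange> ranges) throws Exception {
    for (FileRange range : ranges) {
      ByteBuffer data = FutureIO.awaitFuture(range.getData(), 30, TimeUnit.SECONDS);
      // data is fully populated here; the GET statistics have been updated
    }
  }
}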

@steveloughran (Contributor, Author)

As it stands, this PR will (see the sketch after this list):

  • map HTTP/SSL channel problems to an EOFException subclass which is retried
  • map out-of-range 416 responses to a different EOFException subclass, which is not retried
  • retry on channel failures in lazySeek
  • close the S3 input stream when it returns -1

Not done: any attempt for readFully() to recover from that -1 response.
That will take more work, possibly including clarification in the spec, as the base implementation doesn't recover either, and we should consider this together with ABFS semantics.
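A rough, self-contained sketch of the retry split listed above; the class and method names are illustrative stand-ins (using the EOFException subclasses that later appear in the commit message), not the actual S3AInputStream code:

import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Supplier;

// Sketch only: channel failures are retried (after abandoning the broken
// connection), while a 416 range error surfaces as -1 without any retry.
final class ReadRecoverySketch {

  static class HttpChannelEOFException extends EOFException { }
  static class RangeNotSatisfiableEOFException extends EOFException { }

  static int readWithRecovery(Supplier<InputStream> reopen) throws IOException {
    // the real code bounds retries with the S3A retry policy; this loop does not
    while (true) {
      try {
        return reopen.get().read();
      } catch (RangeNotSatisfiableEOFException e) {
        return -1;   // out-of-range GET: report end of stream, do not retry
      } catch (HttpChannelEOFException e) {
        // channel problem: drop this connection and reopen on the next pass
      }
    }
  }
}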

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 7s trunk passed
+1 💚 compile 0m 27s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 22s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 21s trunk passed
+1 💚 mvnsite 0m 27s trunk passed
+1 💚 javadoc 0m 17s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 23s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 0m 46s trunk passed
+1 💚 shadedclient 20m 6s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 16s the patch passed
+1 💚 compile 0m 20s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 20s the patch passed
+1 💚 compile 0m 14s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 14s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 10s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 49 new + 8 unchanged - 0 fixed = 57 total (was 8)
+1 💚 mvnsite 0m 17s the patch passed
+1 💚 javadoc 0m 9s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 15s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 0m 41s the patch passed
+1 💚 shadedclient 20m 32s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 14s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 20s The patch does not generate ASF License warnings.
83m 12s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/4/artifact/out/Dockerfile
GITHUB PR #6425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux dc2e9893b02e 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2a9f54d
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/4/testReport/
Max. process+thread count 555 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 28s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
-1 ❌ mvninstall 14m 36s /branch-mvninstall-root.txt root in trunk failed.
-1 ❌ compile 0m 29s /branch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-aws in trunk failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
+1 💚 compile 1m 48s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 35s trunk passed
+1 💚 mvnsite 0m 54s trunk passed
+1 💚 javadoc 0m 28s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 38s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 26s trunk passed
+1 💚 shadedclient 41m 35s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 36s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 19s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 51 new + 8 unchanged - 0 fixed = 59 total (was 8)
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 14s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 5s the patch passed
+1 💚 shadedclient 37m 41s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 53s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
114m 23s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/5/artifact/out/Dockerfile
GITHUB PR #6425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 2df7e4706748 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2a517eb
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/5/testReport/
Max. process+thread count 564 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 51s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 49m 11s trunk passed
+1 💚 compile 0m 43s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 8s trunk passed
+1 💚 shadedclient 38m 26s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 30s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 36s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 20s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 50 new + 8 unchanged - 0 fixed = 58 total (was 8)
+1 💚 mvnsite 0m 32s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 38m 21s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 55s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
142m 47s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/6/artifact/out/Dockerfile
GITHUB PR #6425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux e5aeef35f82e 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 92c9005
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/6/testReport/
Max. process+thread count 533 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran force-pushed the s3/HADOOP-19027-stream-error-recovery branch from 92c9005 to a46b4ee on January 11, 2024 at 17:27
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 2s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
-1 ❌ mvninstall 11m 17s /branch-mvninstall-root.txt root in trunk failed.
-1 ❌ compile 0m 26s /branch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-aws in trunk failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
+1 💚 compile 2m 13s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 36s trunk passed
+1 💚 mvnsite 0m 51s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 36s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 35s trunk passed
+1 💚 shadedclient 41m 7s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 37s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 20s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 50 new + 8 unchanged - 0 fixed = 58 total (was 8)
+1 💚 mvnsite 0m 32s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 37m 51s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 58s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
110m 20s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/7/artifact/out/Dockerfile
GITHUB PR #6425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 44e837d7add7 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a46b4ee
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/7/testReport/
Max. process+thread count 534 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

…ions

Differentiate from "EOF out of range" from "channel problems" through
two different subclasses of EOFException and input streams to always
retry on http channel errors; out of range GET requests are not retried.
Currently an EOFException is always treated as a fail-fast call in read()

This allows for all existing external code catching EOFException to handle
both, but S3AInputStream to cleanly differentiate range errors (map to -1)
from channel errors (retry)

- HttpChannelEOFException is subclass of EOFException, so all code
  which catches EOFException is still happy.
  retry policy: connectivityFailure
- RangeNotSatisfiableEOFException is the subclass of EOFException
  raised on 416 GET range errors.
  retry policy: fail
- Method ErrorTranslation.maybeExtractChannelException() to create this
  from shaded/unshaded NoHttpResponseException, using string match to
  avoid classpath problems.
- And do this for SdkClientExceptions with OpenSSL error code WFOPENSSL0035.
  This isn't strictly the right place for this as it's not an IOE we are
  remapping...
- ErrorTranslation.maybeExtractIOException() to perform this translation as
  appropriate.

S3AInputStream.reopen() code retries on EOF, except on
 RangeNotSatisfiableEOFException,
 which is converted to a -1 response to the caller
 as is done historically.

HADOOP-19027. S3A: S3AInputStream doesn't recover from HTTP/channel exceptions

S3AInputStream knows to handle these with
 read(): HttpChannelEOFException: stream aborting close then retry
 lazySeek(): Map RangeNotSatisfiableEOFException to -1, but do not map
  any other EOFException class raised.

This means that
* out of range reads map to -1
* channel problems in reopen are retried
* channel problems in read() abort the failed http connection so it
  isn't recycled

Tests for this using/abusing mocking.

Change-Id: I0a31c1ae291ea2b38b0294a50dca5e9d0d4d1fdf
…xceptions

Testing through actually raising 416 exceptions and verifying that
readFully(), char read() and vector reads are all good.

Not yet validated: PositionedReadable.read(), where we need to
* return -1 past EOF
* retry on the HTTP channel issues

Change-Id: Ia37f348ccc7d730ae439422e99dfc828612d0bfd
…xceptions

If a read() on the inner wrapped stream returns -1, then it is closed.

There is no attempt to actually recover within a readFully(); there's
a switch to turn this on, but if anyone does it, a test will spin forever
as the inner PositionedReadable.read(position, buffer, len) downgrades
all EOF exceptions to -1.
A new method would need to be added which controls whether to downgrade/rethrow
exceptions, which makes for more complex work.

What does that mean? Possibly reduced resilience to non-retried failures
on the inner stream, even though more channel exceptions are now retried.

+Split out tests of different read methods into their own test cases.

Change-Id: I8f7e75ba73a376e7dfd06c5bb9ebc4138bc80394
Change-Id: Ided5c0c3967199b7485641a40b06fba92aeea948
Change-Id: I37f05880cad808db9f53035d4d49af48a333f2c7
Change-Id: I51d860b8ecc774149307ebba2d2b6ec393671fff
Change-Id: I483d0aa29d45956a4cf9b1efd939978b550fd722
@steveloughran force-pushed the s3/HADOOP-19027-stream-error-recovery branch from 54811a8 to 776ac8b on January 15, 2024 at 12:08
@steveloughran (Contributor, Author)

Rebased to force a rebuild.

@steveloughran (Contributor, Author)

@mukund-thakur @ahmarsuhail can I get a review of this... I want to merge and ideally get it onto the 3.4 branch in case there's another RC of that.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 17m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 57s trunk passed
+1 💚 compile 0m 43s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 41s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 9s trunk passed
+1 💚 shadedclient 37m 49s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 30s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 34s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 14s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 23s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 11s the patch passed
+1 💚 shadedclient 37m 53s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 56s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
155m 57s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/9/artifact/out/Dockerfile
GITHUB PR #6425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 92a7115f8c41 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 776ac8b
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/9/testReport/
Max. process+thread count 583 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6425/9/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran (Contributor, Author)

Tested: S3 London; 12 threads and -Dscale. No failures other than the two I'm always seeing on trunk right now.

@steveloughran (Contributor, Author)

Update: one new failure, probably from having 12 threads, as it was a timeout in one of the tests.

[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 609.807 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3APrefetchingLruEviction
[ERROR] testSeeksWithLruEviction[max-blocks-2](org.apache.hadoop.fs.s3a.ITestS3APrefetchingLruEviction)  Time elapsed: 600.009 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 600000 milliseconds
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1308)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
        at org.apache.hadoop.fs.s3a.ITestS3APrefetchingLruEviction.testSeeksWithLruEviction(ITestS3APrefetchingLruEviction.java:162)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.lang.Thread.run(Thread.java:750)

Does not happen on a standalone rerun.

[INFO] Running org.apache.hadoop.fs.s3a.ITestS3APrefetchingLruEviction
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 17.108 s - in org.apache.hadoop.fs.s3a.ITestS3APrefetchingLruEviction
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0

@ahmarsuhail (Contributor) left a comment

+1, LGTM.

@steveloughran steveloughran merged commit 36198b5 into apache:trunk Jan 16, 2024
4 checks passed
asfgit pushed a commit that referenced this pull request Jan 22, 2024
…xceptions (#6425)



Differentiate from "EOF out of range/end of GET" from
"EOF channel problems" through
two different subclasses of EOFException and input streams to always
retry on http channel errors; out of range GET requests are not retried.
Currently an EOFException is always treated as a fail-fast call in read()

This allows for all existing external code catching EOFException to handle
both, but S3AInputStream to cleanly differentiate range errors (map to -1)
from channel errors (retry)

- HttpChannelEOFException is subclass of EOFException, so all code
  which catches EOFException is still happy.
  retry policy: connectivityFailure
- RangeNotSatisfiableEOFException is the subclass of EOFException
  raised on 416 GET range errors.
  retry policy: fail
- Method ErrorTranslation.maybeExtractChannelException() to create this
  from shaded/unshaded NoHttpResponseException, using string match to
  avoid classpath problems.
- And do this for SdkClientExceptions with OpenSSL error code WFOPENSSL0035.
  We believe this is the OpenSSL equivalent.
- ErrorTranslation.maybeExtractIOException() to perform this translation as
  appropriate.

S3AInputStream.reopen() code retries on EOF, except on
 RangeNotSatisfiableEOFException,
 which is converted to a -1 response to the caller
 as is done historically.

S3AInputStream knows to handle these with
 read(): HttpChannelEOFException: stream aborting close then retry
 lazySeek(): Map RangeNotSatisfiableEOFException to -1, but do not map
  any other EOFException class raised.

This means that
* out of range reads map to -1
* channel problems in reopen are retried
* channel problems in read() abort the failed http connection so it
  isn't recycled

Tests for this using/abusing mocking.

Testing through actually raising 416 exceptions and verifying that
readFully(), char read() and vector reads are all good.

There is no attempt to recover within a readFully(); there's
a boolean constant switch to turn this on, but if anyone does
it, a test will spin forever as the inner PositionedReadable.read(position, buffer, len)
downgrades all EOF exceptions to -1.
A new method would need to be added which controls whether to downgrade/rethrow
exceptions.

What does that mean? Possibly reduced resilience to non-retried failures
on the inner stream, even though more channel exceptions are now retried.

Contributed by Steve Loughran
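To make the "retry policy: connectivityFailure" / "retry policy: fail" notes above concrete, here is an assumed sketch of how such exception-to-policy wiring can be expressed with Hadoop's generic retry utilities; the real S3ARetryPolicy construction differs in detail, and the nested exception classes below are stand-ins for the ones this change adds:

import java.io.EOFException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;

// Sketch only: channel failures get a retrying policy, 416 range errors fail at once.
final class RetryPolicyWiringSketch {

  // Stand-ins for the EOFException subclasses this change introduces.
  static class HttpChannelEOFException extends EOFException { }
  static class RangeNotSatisfiableEOFException extends EOFException { }

  static RetryPolicy buildPolicy() {
    RetryPolicy connectivityFailure =
        RetryPolicies.retryUpToMaximumCountWithFixedSleep(7, 500, TimeUnit.MILLISECONDS);
    Map<Class<? extends Exception>, RetryPolicy> policyMap = new HashMap<>();
    policyMap.put(HttpChannelEOFException.class, connectivityFailure);              // retried
    policyMap.put(RangeNotSatisfiableEOFException.class, RetryPolicies.TRY_ONCE_THEN_FAIL);  // fail fast
    return RetryPolicies.retryByException(RetryPolicies.TRY_ONCE_THEN_FAIL, policyMap);
  }
}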
slfan1989 pushed a commit that referenced this pull request Jan 25, 2024
…xceptions (#6425) (same commit message as above)
jiajunmao pushed a commit to jiajunmao/hadoop-MLEC that referenced this pull request Feb 6, 2024
…xceptions (apache#6425) (same commit message as above)
steveloughran added a commit that referenced this pull request Mar 8, 2024
…t runs (#6465)



Disables the new tests added in:

HADOOP-19027. S3A: S3AInputStream doesn't recover from HTTP/channel exceptions #6425

The underlying issue here is that the block prefetch code can identify
when there's a mismatch between declared and actual length, and doesn't
store any of the incomplete buffer.

This should be addressed in HADOOP-18184.

Contributed by Steve Loughran
steveloughran added a commit that referenced this pull request Mar 8, 2024
…t runs (#6465) (same commit message as above)