SDK repeatedly complaining "Not all bytes were read from the S3ObjectInputStream" #1211
Related to #1111, #1203. Also surfacing in IMPALA-5387.

Is it possible to explicitly call abort if it's the optimal choice for the circumstances? If you call abort then the warning message is not logged, and for cases where you do want to reuse the connection you can just drain the stream. BTW, close() when there are bytes left in the stream does abort the underlying connection; this was a change we made in 1.11.x.

OK, so that's the big change: it used to stream through, but now it aborts. That's a major change. We can fix that. Will a skip(EOF-pos) do it?

@steveloughran Yes, that should be sufficient to drain the stream.
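For illustration, a minimal sketch of that drain-before-close approach, assuming a v1 SDK client and placeholder bucket/key names; reading (or skipping) to EOF before close() lets the connection go back to the pool instead of being aborted:

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.S3ObjectInputStream;

public class DrainBeforeClose {
    public static void main(String[] args) throws Exception {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        S3Object object = s3.getObject("my-bucket", "my-key"); // placeholder bucket/key
        try (S3ObjectInputStream in = object.getObjectContent()) {
            byte[] header = new byte[1024];
            in.read(header); // read only part of the object

            // Drain the remainder so close() can return the connection to the
            // pool quietly instead of aborting it and logging the warning.
            byte[] scratch = new byte[8192];
            while (in.read(scratch) != -1) {
                // discard
            }
        }
    }
}
```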
OK, we'll do that, plus also …

Current status: if we drain the stream, a lot of the messages go away, but we still seem to get told off on an …
Here's what I suspect is happening. We've invoked abort(), but a stream close() is still being called (somewhere) and that's triggering the warning, even though the stream is now aborted and shouldn't be considered live. I think you should check whether the operations in `abort()` are complete (set the HTTP channel to null? the input stream?) and make sure it's in sync with the expectations of the warning code in close(). There's nothing else we can do in our code: I'm draining the stream on a close() when the remaining byte count is in our configured skip range. (There's one flaw in my hypothesis: after we call `abort()` …)
It's happening in your abort call:

```java
@Override
public void abort() {
    if (httpRequest != null) {
        httpRequest.abort();
    }
    IOUtils.closeQuietly(in, null); // **HERE**
}
```

I think you should catch the special case "this is an abort, so don't worry about it" and not log.
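For what it's worth, an illustrative fragment of the kind of fix being suggested; this is a sketch, not the SDK's actual code, and `bytesRemaining()` / `warnNotFullyRead()` are hypothetical helpers standing in for the existing warning logic:

```java
// Illustrative fragment only, not the SDK's actual code.
private volatile boolean aborted = false;

@Override
public void abort() {
    aborted = true;                 // record the abort before anything can fail
    if (httpRequest != null) {
        httpRequest.abort();
    }
    IOUtils.closeQuietly(in, null); // closing an already-aborted stream stays silent
}

@Override
public void close() throws IOException {
    if (!aborted && bytesRemaining() > 0) {
        warnNotFullyRead();         // only warn when the caller neither drained nor aborted
    }
    super.close();
}
```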
Ah I see, yeah we can fix that. Thanks for reporting.

This was fixed in 1.11.163. Going to go ahead and close this.
I'm still getting this error in 1.11.182. I'm using the AWS SDK as the mechanism to access S3 from within Apache Spark. Has anyone else seen this issue? Spark is pulling down a 640 KB test file and throwing this error.

You will see this at the tail of reads in S3A if the version of Hadoop you are using doesn't have HADOOP-14596 in to drain the stream at the end of any non-aborted close(). Which version of the Hadoop JARs are on your CP? There have been no ASF releases with this patch (indeed, no ASF releases built & tested against 1.11.x, AFAIK).
I'm still seeing this issue; I call …

Is the call to abort throwing an exception? Looks like we set the flag after we try to abort, so if it fails we may still log the warning.

If I apply the HADOOP-14890 patch to update the AWS SDK for Hadoop, the message goes away on hadoop-3. That means the abort() path is silent; we make sure we drain the input stream when we want to stop reading before the end of the stream, rather than actually forcing an abort. I'm happy...just need to get the SDK update in.
@shorea: The error might be here: aws-sdk-java/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/model/S3ObjectInputStream.java, line 94 (commit 2931563).

The comment says "The default abort() implementation calls abort on the wrapped stream if it's an SdkFilterInputStream; otherwise we'll need to close the stream." but the code doesn't actually do that -- it calls … cc: @dhruba
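For reference, a fragment-level sketch of the behaviour that comment describes, i.e. only falling back to a quiet close when the wrapped stream can't propagate an abort (illustrative, not the SDK's actual code):

```java
// Illustrative fragment: propagate the abort when the wrapped stream supports it.
if (in instanceof SdkFilterInputStream) {
    ((SdkFilterInputStream) in).abort(); // wrapped stream knows how to abort itself
} else {
    IOUtils.closeQuietly(in, null);      // no abort support, so closing is the only option
}
```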
I'm on 1.11.221, and I'm seeing the IOUtils.closeQuietly at the end of S3ObjectInputStream.abort() printing this warning. The "aborted" flag is just not getting set correctly; it's hard to track what the intention was for setting it, though.

@davidvandebunte would you happen to have a reproducible code sample for this problem?
Also experiencing this with version 1.11.215:

```java
// try-with-resources auto-close adds a 2nd WARN to the log
try (final S3ObjectInputStream is = amazonS3.getObject(
        new GetObjectRequest(bucketName, keyName)).getObjectContent()) {
    // read most of the input stream, but not all: it's a ZIP archive and I read
    // one file inside it (which is ~99% of the object in my case)
    myService.process(is);
    is.abort(); // adds a 3rd WARN to the log
}
```

So by closing the input stream I get a 2nd warning in the log, and then by explicitly calling abort I get a 3rd. A workaround would be to consume the rest of the input stream using another library, but there ought to be a way to suppress this warning. It would also be very helpful if the WARN message included a stack trace or some other metadata, as it took a while to figure out what code was generating it.
@jhovell if you look at what we do in Hadoop's S3A connector: if we aren't that far from the EOF, we just skip() to the end; further away (meaning a few hundred KB or more) we call abort(), accept that the error text can get printed, and don't worry about it. I think we may have some coverage in our troubleshooting docs... if not, something should go in. FWIW, libraries trying to be "helpful" by downgrading close() to read-to-EOF, and abort() to close(), in order to improve HTTP connection pooling are a recurrent problem with Hadoop object store connectors. When you have a 16 GB file and want to stop reading 32 MB in, abort() is absolutely the correct strategy, and we know exactly what we are doing, at least until HTTP/2 becomes the transport.
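For anyone who wants the same policy in their own code, a minimal sketch; the 512 KB threshold and the `bytesRemaining` bookkeeping are illustrative choices, not S3A's actual implementation:

```java
import java.io.IOException;
import com.amazonaws.services.s3.model.S3ObjectInputStream;

public final class StreamClosePolicy {
    // Illustrative cut-off: below this, draining is cheaper than tearing down
    // the HTTPS connection and opening a new one.
    private static final long DRAIN_THRESHOLD_BYTES = 512 * 1024;

    /** Close cheaply: drain to EOF when near the end, abort when far from it. */
    public static void closeStream(S3ObjectInputStream in, long bytesRemaining) throws IOException {
        if (bytesRemaining <= DRAIN_THRESHOLD_BYTES) {
            byte[] scratch = new byte[8192];
            while (in.read(scratch) != -1) {
                // discard the tail so the pooled connection can be reused
            }
            in.close();
        } else {
            in.abort(); // cheaper than reading data we have no use for
        }
    }
}
```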
@ALL has there been any progress on this ticket yet?

Any update? Still encountering this issue.

Fixed for me in hadoop-aws 2.9.0 and later.

Yes, fixed on 2.9 and 3.0 with the AWS SDK update of HADOOP-14890.
I'm getting this warning when aborting on …

Still happening on 1.11.368.

@metabrain we have a small fix per #1657. This should be released today.
Hi, I still got the following message:

I only checked a sample, but the tasks producing the logs containing those messages were also the ones which completed only after an unreasonable amount of time.
For people like @PowerToThePeople111 who are seeing this with S3A: download cloudstore and run storediag with the -j option to see which JARs are actually on your CP. We shouldn't see this on recent Hadoop releases, e.g. 3.1+.
Experiencing the same in 1.11.595:

WARN S3AbortableInputStream:178 - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

@naveen09 - you need to explicitly drain the bytes left in the stream, call abort(), or do a ranged GET. It should not be showing this message on an explicit call to abort().
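A minimal sketch of the ranged-GET option, assuming a v1 SDK client; the bucket, key, and byte range are placeholders:

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.util.IOUtils;

public class RangedGetExample {
    public static void main(String[] args) throws Exception {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // Ask S3 for exactly the bytes we need (the first 1 MB here), so the
        // stream is fully consumed and close() has nothing left to warn about.
        GetObjectRequest request = new GetObjectRequest("my-bucket", "my-key") // placeholders
                .withRange(0, 1024 * 1024 - 1);
        try (S3Object object = s3.getObject(request)) {
            byte[] bytes = IOUtils.toByteArray(object.getObjectContent());
            System.out.println("Read " + bytes.length + " bytes");
        }
    }
}
```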
With a recent upgrade to the 1.11.134 SDK, tests seeking around a large CSV file are triggering a large set of repeated warnings about closing the stream early.
full log
I've seen other people complaining about this, with the issue being closed as "user gets to fix their code" or similar. However, here's why I believe the system is overreacting: it makes invalid assumptions about the remaining amount of data and the cost of reading the rest versus aborting and reconnecting, and it also fails to note that it has previously warned about this.

Hadoop supports reading files of tens to hundreds of GB, with the client code assuming it's a POSIX input stream where seek() is inefficient, but less inefficient than reading the bytes being skipped.
The first S3A release did call `close()` rather than `abort()`, leading to HADOOP-11570: seeking being pathologically bad on a long input stream. What we do now, HADOOP-13047, is provide a tunable threshold for forward seeks below which we skip bytes rather than seek. The default is 64K; for long-haul links a value of 512K works better. But above 512K, even over a long-haul connection, it is better to set up a new HTTPS connection than to try to reuse an existing HTTP/1.1 connection. Which is what we do.

Only now, every time it happens, a message appears in the log saying "This is likely an error". It's not: it's exactly what we want to do based on our benchmarking of IO performance. We do have a faster IO mechanism for when users explicitly want random access, but as that's pathological on non-seeking file reads, it's not on by default.
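For S3A users, the threshold mentioned above is the `fs.s3a.readahead.range` setting; a hedged core-site.xml sketch for a long-haul link (the 512K value is the figure quoted above, not a universal recommendation):

```xml
<!-- Forward seeks shorter than this are satisfied by reading and discarding
     bytes from the open stream; longer seeks abort and reopen the connection. -->
<property>
  <name>fs.s3a.readahead.range</name>
  <value>524288</value>
</property>
```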
I'm covering this in HADOOP-14596; I think we'll end up configuring log4j so that, even in production clusters, warning messages from `S3AbortableInputStream` are not logged. This is a bit dangerous. Here are some ways in which the log messages could be improved without having to be so drastic.
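For reference, the kind of log4j change being considered would look something like this; the logger name is an assumption based on the SDK class's package, so verify it against the SDK version on your classpath:

```properties
# Assumed logger name; check the S3AbortableInputStream class in your SDK jar.
log4j.logger.com.amazonaws.services.s3.internal.S3AbortableInputStream=ERROR
```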
thanks.