Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

s3_get_file usage of BufferStream breaks AWS.jl error handling #249

Closed
li1 opened this issue Apr 25, 2022 · 4 comments
Closed

s3_get_file usage of BufferStream breaks AWS.jl error handling #249

li1 opened this issue Apr 25, 2022 · 4 comments

Comments

@li1
Copy link

li1 commented Apr 25, 2022

Hi!

We've run into an issue where an HTTP.StatusError generated by https://github.com/JuliaCloud/AWS.jl/blob/master/src/AWSExceptions.jl#L59-L63 is getting clobbered by an exception because seekstart (https://github.com/JuliaCloud/AWS.jl/blob/master/src/AWSExceptions.jl#L60) is undefined on a BufferStream.

From what I can tell, AWS.jl currently expects this to break, as they warn against using an <:IO that doesn't support seeking: https://github.com/JuliaCloud/AWS.jl/blob/master/src/utilities/request.jl#L69-L70

BufferStreams are used in AWSS3.jl in s3_get_file: https://github.com/JuliaCloud/AWSS3.jl/blob/master/src/AWSS3.jl#L177 calls s3_get, which creates a BufferStream in turn: https://github.com/JuliaCloud/AWSS3.jl/blob/master/src/AWSS3.jl#L128.

I'm not sure whether the better plan of attack is to

  • properly handle BufferStream upstream (either by providing another overload or at wrapping the exception generation code in a try catch or something similar — Add failsafe when exception processing fails AWS.jl#478 advocates for this), or to
  • avoid the usage of BufferStream in this library — but then I'm not sure how you'd implement s3_get file.

Help much appreciated!

PS: I'll file a similar issue over at AWS.jl, will add the link here. Edit: JuliaCloud/AWS.jl#545

@iamed2
Copy link
Member

iamed2 commented Jun 21, 2022

@omus do you think it would be safe to switch to an IOBuffer instead? Or is there another way to use the new response type to avoid BufferStream? I don't know much about the new thing.

@iamed2
Copy link
Member

iamed2 commented Jun 21, 2022

Here's an example error:

julia> s3_get_file("invenia-public-datasets", "doesntexist.json")
ERROR: MethodError: no method matching seek(::Base.BufferStream, ::Int64)
Closest candidates are:
  seek(::Base.GenericIOBuffer, ::Integer) at iobuffer.jl:250
  seek(::Base.Libc.FILE, ::Integer) at libc.jl:97
  seek(::IOStream, ::Integer) at iostream.jl:127
  ...
Stacktrace:
  [1] seekstart(s::Base.BufferStream)
    @ Base ./iostream.jl:154
  [2] AWS.AWSExceptions.AWSException(e::HTTP.ExceptionRequest.StatusError, stream::Base.BufferStream)
    @ AWS.AWSExceptions ~/.julia/packages/AWS/5goc1/src/AWSExceptions.jl:60
  [3] macro expansion
    @ ~/.julia/packages/AWS/5goc1/src/utilities/request.jl:101 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/Retry/vS1bg/src/repeat_try.jl:100 [inlined]
  [5] submit_request(aws::AWS.AWSConfig, request::AWS.Request; return_headers::Nothing)
    @ AWS ~/.julia/packages/AWS/5goc1/src/utilities/request.jl:85
  [6] (::AWS.RestXMLService)(request_method::String, request_uri::String, args::Dict{String, Any}; aws_config::AWS.AWSConfig, feature_set::AWS.FeatureSet)
    @ AWS ~/.julia/packages/AWS/5goc1/src/AWS.jl:284
  [7] #get_object#90
    @ ~/.julia/packages/AWS/5goc1/src/services/s3.jl:2624 [inlined]
  [8] macro expansion
    @ ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:144 [inlined]
  [9] macro expansion
    @ ~/.julia/packages/Retry/vS1bg/src/repeat_try.jl:192 [inlined]
 [10] s3_get(aws::AWS.AWSConfig, bucket::AWS.AWSConfig, path::String; version::Nothing, retry::Bool, raw::Bool, byte_range::Nothing, headers::Dict{String, Any}, return_stream::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:126
 [11] #s3_get_file#6
    @ ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:177 [inlined]
 [12] s3_get_file(aws::AWS.AWSConfig, bucket::AWS.AWSConfig, path::String, filename::String)
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:177
 [13] s3_get_file(::AWS.AWSConfig, ::Vararg{Any, N} where N; b::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:186
 [14] s3_get_file(::AWS.AWSConfig, ::String, ::String)
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:186
 [15] s3_get_file(::String, ::Vararg{String, N} where N; b::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:186
 [16] s3_get_file(::String, ::String)
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:186
 [17] top-level scope
    @ REPL[8]:1

caused by: HTTP.ExceptionRequest.StatusError(400, "GET", "/AWS.AWSConfig%28%28AKIAIY55GILO5HWHABJA%2C%20SAM...%2C%20146138512-12-31T23%3A59%3A59%29%2C%20\"us-east-1\"%2C%20\"json\"%29/invenia-public-datasets", HTTP.Messages.Response:
"""
HTTP/1.1 400 Bad Request
x-amz-request-id: CNRP4QJCC488N6W9
x-amz-id-2: iF3N5ReRYOM/1YbTEuIRZQJLCFtmIo2/lGj3z/L5vwj6pvv0uxgnVzDftoG2AkrjhPTBHley3tw=
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Tue, 21 Jun 2022 18:14:35 GMT
Server: AmazonS3
Connection: close

[Message Body was streamed]""")
Stacktrace:
  [1] request(::Type{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer{Union{}}}}}, ::URIs.URI, ::Vararg{Any, N} where N; kw::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:iofunction, :require_ssl_verification, :response_stream), Tuple{Nothing, Bool, Base.BufferStream}}})
    @ HTTP.ExceptionRequest ~/.julia/packages/HTTP/aTjcj/src/ExceptionRequest.jl:22
  [2] request(::Type{HTTP.MessageRequest.MessageLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer{Union{}}}}}}, method::String, url::URIs.URI, headers::Vector{Pair{SubString{String}, SubString{String}}}, body::String; http_version::VersionNumber, target::String, parent::Nothing, iofunction::Nothing, kw::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:require_ssl_verification, :response_stream), Tuple{Bool, Base.BufferStream}}})
    @ HTTP.MessageRequest ~/.julia/packages/HTTP/aTjcj/src/MessageRequest.jl:66
  [3] request(::Type{HTTP.BasicAuthRequest.BasicAuthLayer{HTTP.MessageRequest.MessageLayer{HTTP.ExceptionRequest.ExceptionLayer{HTTP.ConnectionRequest.ConnectionPoolLayer{HTTP.StreamRequest.StreamLayer{Union{}}}}}}}, method::String, url::URIs.URI, headers::Vector{Pair{SubString{String}, SubString{String}}}, body::String; kw::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:require_ssl_verification, :response_stream), Tuple{Bool, Base.BufferStream}}})
    @ HTTP.BasicAuthRequest ~/.julia/packages/HTTP/aTjcj/src/BasicAuthRequest.jl:28
  [4] #request#1
    @ ~/.julia/packages/HTTP/aTjcj/src/TopRequest.jl:15 [inlined]
  [5] macro expansion
    @ ~/.julia/packages/Mocking/MsKoy/src/mock.jl:29 [inlined]
  [6] macro expansion
    @ ~/.julia/packages/AWS/5goc1/src/utilities/request.jl:165 [inlined]
  [7] macro expansion
    @ ~/.julia/packages/Retry/vS1bg/src/repeat_try.jl:192 [inlined]
  [8] _http_request(http_backend::AWS.HTTPBackend, request::AWS.Request, response_stream::Base.BufferStream)
    @ AWS ~/.julia/packages/AWS/5goc1/src/utilities/request.jl:155
  [9] macro expansion
    @ ~/.julia/packages/Mocking/MsKoy/src/mock.jl:29 [inlined]
 [10] macro expansion
    @ ~/.julia/packages/AWS/5goc1/src/utilities/request.jl:88 [inlined]
 [11] macro expansion
    @ ~/.julia/packages/Retry/vS1bg/src/repeat_try.jl:192 [inlined]
 [12] submit_request(aws::AWS.AWSConfig, request::AWS.Request; return_headers::Nothing)
    @ AWS ~/.julia/packages/AWS/5goc1/src/utilities/request.jl:85
 [13] (::AWS.RestXMLService)(request_method::String, request_uri::String, args::Dict{String, Any}; aws_config::AWS.AWSConfig, feature_set::AWS.FeatureSet)
    @ AWS ~/.julia/packages/AWS/5goc1/src/AWS.jl:284
 [14] #get_object#90
    @ ~/.julia/packages/AWS/5goc1/src/services/s3.jl:2624 [inlined]
 [15] macro expansion
    @ ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:144 [inlined]
 [16] macro expansion
    @ ~/.julia/packages/Retry/vS1bg/src/repeat_try.jl:192 [inlined]
 [17] s3_get(aws::AWS.AWSConfig, bucket::AWS.AWSConfig, path::String; version::Nothing, retry::Bool, raw::Bool, byte_range::Nothing, headers::Dict{String, Any}, return_stream::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:126
 [18] #s3_get_file#6
    @ ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:177 [inlined]
 [19] s3_get_file(aws::AWS.AWSConfig, bucket::AWS.AWSConfig, path::String, filename::String)
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:177
 [20] s3_get_file(::AWS.AWSConfig, ::Vararg{Any, N} where N; b::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:186
 [21] s3_get_file(::AWS.AWSConfig, ::String, ::String)
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:186
 [22] s3_get_file(::String, ::Vararg{String, N} where N; b::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:186
 [23] s3_get_file(::String, ::String)
    @ AWSS3 ~/.julia/packages/AWSS3/D1gQl/src/AWSS3.jl:186
 [24] top-level scope
    @ REPL[8]:1

@quinnj
Copy link
Member

quinnj commented Jun 21, 2022

Proposal here to just support Base.BufferStream explicitly when generating AWSExceptions. Any danger in this approach? I mean, we're in an error path, so something went wrong, so it seems justified that we close the buffer stream and read the body in order to see if there's an aws exception there. Or do we want to avoid touching the user's response_stream if possible?

@quinnj
Copy link
Member

quinnj commented Jun 30, 2022

Fixed upstream

@quinnj quinnj closed this as completed Jun 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants