Protocol (un)marshallers make many allocations #377

Closed
jasdel opened this issue Sep 8, 2015 · 18 comments
Labels
closed-for-staleness feature-request A feature should be added or improved.

Comments

@jasdel
Contributor

jasdel commented Sep 8, 2015

The protocol marshallers make numerous allocations when marshaling and unmarshaling structs. This is much more noticeable for larger API operations; for example, ElasticTranscoder's CreateJob does about 90k bytes across 4k allocations, whereas encoding/json marshaling the same structure does only about 4k bytes in 7 allocations.
#376 updates internal/protocol/json to perform a more efficient check of whether a field is exported, reducing the allocations caused by strings.ToLower. Several of the other protocols' marshallers perform the same strings.ToLower check and could probably be updated as well.

Improving the SDK's marshallers to be more efficient with allocations would help to increase performance and reduce memory usage.
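
For illustration, here is a minimal sketch of the kind of change involved (the function names are made up and this is not necessarily the exact code in #376): checking only the first rune of a field name avoids the per-field string work that strings.ToLower can cause.

package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// Allocation-prone style of check: strings.ToLower may allocate a new
// string for every field name it inspects.
func isUnexportedAlloc(name string) bool {
	return strings.ToLower(name[:1]) == name[:1]
}

// Allocation-free alternative: inspect only the first rune.
func isUnexported(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return !unicode.IsUpper(r)
}

func main() {
	fmt.Println(isUnexportedAlloc("metadataCreateJobInput")) // true
	fmt.Println(isUnexported("Pipeline"))                    // false
}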

@jasdel
Contributor Author

jasdel commented Oct 28, 2015

#404 added a nice improvement to the performance of marshaling for the JSONRPC, JSONUTIL, and RESTJSON protocols, improving performance by about 60%. Simple structures didn't see as much impact, but more complex input structs with nested types saw larger improvements.

make bench-protocol snip:

# BenchmarkJSONRPCBuild_Simple_dynamodbPutItem:
0.9.17             50000         32969 ns/op        9757 B/op        211 allocs/op
8236f00           100000         20773 ns/op        4194 B/op        105 allocs/op

# BenchmarkJSONUtilBuild_Simple_dynamodbPutItem:
0.9.17             50000         32709 ns/op        9693 B/op        209 allocs/op
8236f00           100000         20495 ns/op        4130 B/op        103 allocs/op

# BenchmarkRESTJSONBuild_Complex_elastictranscoderCreateJobInput
0.9.17              3000        439981 ns/op       89382 B/op       2554 allocs/op
8236f00             5000        280141 ns/op       41221 B/op       1131 allocs/op

# BenchmarkRESTJSONBuild_Simple_elastictranscoderListJobsByPipeline-8
0.9.17            100000         16040 ns/op        5104 B/op         98 allocs/op
8236f00           100000         14590 ns/op        4304 B/op         80 allocs/op

@bpot
Contributor

bpot commented Nov 17, 2015

@jasdel The interface style looks good to me.

I have one other idea. Would it be possible to hang the payload tags off of the blank identifier instead of using the unexported metadata* field/SDKShapeTraits combination? For example,

type UploadPartInput struct {
  _ struct{} `type:"structure" payload:"Body"`
  ...
}

We would still need to use FieldByName("_") to fetch the payload tag, but because the field we're looking for is on the top-level struct it would take the fast path instead of the path that triggers all of the allocations.
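
A minimal sketch of reading that tag off the blank field (the shape and field below are illustrative, not SDK code):

package main

import (
	"fmt"
	"reflect"
)

// Illustrative shape carrying its traits on the blank identifier field.
type UploadPartInput struct {
	_    struct{} `type:"structure" payload:"Body"`
	Body []byte
}

func main() {
	// "_" is a field of the top-level struct, so FieldByName scans the
	// struct's own fields directly instead of performing the
	// allocation-heavy search through embedded fields.
	if f, ok := reflect.TypeOf(UploadPartInput{}).FieldByName("_"); ok {
		fmt.Println(f.Tag.Get("payload")) // Body
	}
}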

@jasdel
Contributor Author

jasdel commented Nov 25, 2015

Hi @bpot, thanks for digging deeper into this; the pattern using _ is an interesting alternative. I'll review the PR at #453 and update you there with any feedback.

@jasdel jasdel added feature-request A feature should be added or improved. and removed enhancement labels Jan 13, 2016
@tve

tve commented Dec 7, 2016

Is there any work in progress on the XML front? The EC2 DescribeImages call seems like a good test case ;-)

@jasdel
Contributor Author

jasdel commented Dec 8, 2016

Updated in #722. We're currently investigating how the marshalers can be improved. Hopefully this will lead to replacing the memory duplication and reflection type walking with code generation.
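
To sketch the direction (purely illustrative; the shape, fields, and method below are hypothetical stand-ins, not the SDK's actual generated output), a generated per-shape marshal function writes fields straight into a buffer rather than walking the value with reflect:

package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// Hypothetical API input shape standing in for a generated type.
type CreateJobInput struct {
	PipelineId *string
	OutputKey  *string
}

// marshalJSON shows what a code generator could emit: direct field writes,
// no reflection and no intermediate maps.
func (v *CreateJobInput) marshalJSON(buf *bytes.Buffer) {
	buf.WriteByte('{')
	first := true
	writeField := func(name string, val *string) {
		if val == nil {
			return
		}
		if !first {
			buf.WriteByte(',')
		}
		first = false
		buf.WriteString(strconv.Quote(name))
		buf.WriteByte(':')
		buf.WriteString(strconv.Quote(*val))
	}
	writeField("PipelineId", v.PipelineId)
	writeField("OutputKey", v.OutputKey)
	buf.WriteByte('}')
}

func main() {
	id := "pipeline-1"
	var buf bytes.Buffer
	(&CreateJobInput{PipelineId: &id}).marshalJSON(&buf)
	fmt.Println(buf.String()) // {"PipelineId":"pipeline-1"}
}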

@jasdel jasdel added the investigating This issue is being investigated and/or work is in progress to resolve the issue. label Dec 8, 2016
@jasdel jasdel mentioned this issue Mar 10, 2017
@jasdel jasdel removed the investigating This issue is being investigated and/or work is in progress to resolve the issue. label Mar 29, 2017
@jonaskint

Any update on the status of this issue? We're currently running into the same problem as issue #1300.

@jasdel
Contributor Author

jasdel commented Nov 7, 2017

Thanks for the feedback @jonaskint. We started work on this task via #1554, targeting RESTXML and RESTJSON marshaling first. We discovered the SDK's protocol tests were not comprehensive and allowed a bug to leak to master, so we reverted the change due to an error in the code generation and marshaling of payloads. That PR represents the pending work on the RESTJSON and RESTXML marshalers. We've not yet started work on improving the unmarshalers.

As for the s3manager issue, #1554 won't fix that. I think s3manager needs to be refactored significantly to improve its memory usage and footprint. That work is still in our backlog.

@jasdel
Contributor Author

jasdel commented Nov 9, 2017

Merging #722 and this issue together to unify the discussion and work on this issue. Both can be solved by improved SDK (un)marshaling support.

@jasdel jasdel changed the title Protocol Marshallers make many allocations Protocol (un)marshallers make many allocations Nov 9, 2017
@teastburn

teastburn commented Jan 9, 2019

Hello, thanks for the hard work! Any updates on this?

I am seeing ~9x memory growth using WriteAtBuffer from a Lambda. Here I am downloading a file that is 50949808 bytes and printing memory stats via runtime.MemStats:

# Alloc = 1 MiB TotalAlloc = 1 MiB Sys = 3 MiB NumGC = 0
buffer := aws.NewWriteAtBuffer([]byte{})
n, err := downloader.Download(buffer, &s3.GetObjectInput{
    Bucket: aws.String(s3Bucket),
    Key:    aws.String(s3Path),
})
# Alloc = 250 MiB TotalAlloc = 25973 MiB Sys = 438 MiB NumGC = 346

Note the Sys memory, which is what Lambda seems to care about, went up 435 MiB.

@jasdel
Contributor Author

jasdel commented Jan 9, 2019

@teastburn have you tried preallocating the buffer used by the WriteAtBuffer? You can do this by passing a preallocated []byte slice into NewWriteAtBuffer. Without a preallocated buffer, each time the underlying buffer of WriteAtBuffer needs to grow it must allocate a new byte array and copy in the old data. The old buffer will eventually be garbage collected, but depending on the order in which the object's parts are downloaded, this can cause many allocations.

@teastburn

teastburn commented Jan 9, 2019

@jasdel thanks for the quick response! I did also try that after looking at the WriteAtBuffer code.

# Alloc = 1 MiB TotalAlloc = 1 MiB Sys = 5 MiB NumGC = 0
b := make([]byte, 50000000)
# Alloc = 49 MiB    TotalAlloc = 49 MiB    Sys = 55 MiB    NumGC = 1
buffer := aws.NewWriteAtBuffer(b)
# Alloc = 48 MiB    TotalAlloc = 49 MiB    Sys = 55 MiB    NumGC = 1
n, err := downloader.Download(buffer, &s3.GetObjectInput{
	Bucket: aws.String(s3Bucket),
	Key:    aws.String(s3Path),
})
# Alloc = 153 MiB    TotalAlloc = 5522 MiB    Sys = 525 MiB    NumGC = 55

Interestingly, the Lambda did not run out of memory (Memory Size: 512 MB Max Memory Used: 509 MB), and the NumGC is far lower when pre-allocating, along with TotalAlloc. However, the Sys memory used appears to be roughly equivalent or higher.

I'm using go version go1.x (lambda) and aws-sdk-go version v1.15.72.

@teastburn

teastburn commented Jan 15, 2019

I switched to using

	resp, err := svc.GetObject(&s3.GetObjectInput{
		Bucket: aws.String(s3Bucket),
		Key:    aws.String(s3Path),
	})

which uses a constant amount of memory when resp.Body is consumed as a streaming io.Reader.

@jasdel
Contributor Author

jasdel commented Jan 15, 2019

Thanks for the update @teastburn. Is your application using io.Copy from the resp.Body into another buffer, or is your application reading directly from the resp.Body reader?

@teastburn

teastburn commented Jan 15, 2019

@jasdel do you have a recommendation there? I am wrapping it in a bufio.NewReader so that I can easily read line by line via a Scanner. One semi-annoying issue with my solution is that the caller of my Get method still has to call Close() on the body via a separate struct method. The good thing, however, is that even loading a 256 MB file takes a max memory of 50 MB.

// Allows us to loop through s3 file line by line
type ByteReader interface {
	io.Reader
	ReadBytes(byte) ([]byte, error)
}

// Mockable interface
type S3FileGettable interface {
	Get(string, string) (ByteReader, error)
	io.Closer
}

type S3Getter struct {
	body io.ReadCloser
}

func (s3g *S3Getter) Close() error {
	return s3g.body.Close()
}

// Must call S3Getter.Close() after done with ByteReader
func (s3g *S3Getter) Get(s3Bucket, s3Path string) (ByteReader, error) {
	// ...
	svc := s3.New(session.Must(session.NewSession(s3Cfg)))
	resp, err := svc.GetObject(&s3.GetObjectInput{
		Bucket: aws.String(s3Bucket),
		Key:    aws.String(s3Path),
	})
	// ...
	s3g.body = resp.Body
	reader := bufio.NewReader(resp.Body)
	return reader, nil
}

@jasdel
Contributor Author

jasdel commented Jan 15, 2019

Using the bufio reader like you are is a good way to get the ByteReader interface. I suggest that instead of the Close method on S3Getter, the returned reader carry the Close method itself. This would allow close handling to be performed by the downstream consumer of the stream, instead of making S3Getter stateful. If S3Getter.Get is accidentally called again before S3Getter.Close is called, the previous object's body may never be closed.

Something like the following might work better for closing the body.

type ByteReadCloser interface {
    ByteReader
    io.Closer
}

type S3Getter struct {
	Client *s3.S3 // S3 client shared across all object gets.
}

func (s3g *S3Getter) Get(s3Bucket, s3Path string) (ByteReadCloser, error) {
	// ...
	resp, err := s3g.Client.GetObject(&s3.GetObjectInput{
		Bucket: aws.String(s3Bucket),
		Key:    aws.String(s3Path),
	})
	// ...
	return NewBufferedReadCloser(resp.Body), nil
}

type BufferedReadCloser struct {
    *bufio.Reader
    io.Closer
}

func NewBufferedReadCloser(reader io.ReadCloser) *BufferedReadCloser {
    return &BufferedReadCloser{
         Reader: bufio.NewReader(reader),
         Closer: reader,
    }
}

func (b *BufferedReadCloser) Close() error {
	return b.Closer.Close()
}

This can then be used similar to the following. Also, try to initialize the S3 client only once per region/account so that duplicate resources aren't created within the application.

objGetter := S3Getter{Client: s3.New(session.Must(session.NewSession(s3Cfg)))}

// Get each object, and process it

reader, err := objGetter.Get(myBucket, myKey)
// handle error
defer reader.Close()

// process reader content.

@teastburn

@jasdel That's great feedback. Thanks for the help!

@eran-levy

It's possible to provide another buffer strategy that uses sync.Pool for downloads. I just wrote the following post with more info: https://medium.com/@levyeran/high-memory-allocations-and-gc-cycles-while-downloading-large-s3-objects-using-the-aws-sdk-for-go-e776a136c5d0
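
The core of that idea, roughly (an illustrative sketch of the buffer-pooling pattern, not the SDK's API; the pool and function names are made up): reuse copy buffers from a sync.Pool when draining resp.Body instead of allocating a fresh buffer for every object.

package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

// bufPool hands out reusable 1 MiB copy buffers so each download does not
// allocate, and later garbage collect, its own.
var bufPool = sync.Pool{
	New: func() interface{} { return make([]byte, 1024*1024) },
}

// copyWithPooledBuffer drains body into dst using a pooled buffer and
// returns the buffer to the pool when finished.
func copyWithPooledBuffer(dst io.Writer, body io.Reader) (int64, error) {
	buf := bufPool.Get().([]byte)
	defer bufPool.Put(buf)
	return io.CopyBuffer(dst, body, buf)
}

func main() {
	var out bytes.Buffer
	n, err := copyWithPooledBuffer(&out, strings.NewReader("object body"))
	fmt.Println(n, err) // 11 <nil>
}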

@github-actions

github-actions bot commented Mar 8, 2021

We have noticed this issue has not received attention in 1 year. We will close this issue for now. If you think this is in error, please feel free to comment and reopen the issue.

@github-actions github-actions bot added closing-soon This issue will automatically close in 4 days unless further comments are made. closed-for-staleness and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Mar 8, 2021