Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] SDK overhaul #725

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 0 additions & 16 deletions aws/request/handlers.go
Original file line number Diff line number Diff line change
Expand Up @@ -20,22 +20,6 @@ type Handlers struct {
AfterRetry HandlerList
}

// Copy returns of this handler's lists.
func (h *Handlers) Copy() Handlers {
return Handlers{
Validate: h.Validate.copy(),
Build: h.Build.copy(),
Sign: h.Sign.copy(),
Send: h.Send.copy(),
ValidateResponse: h.ValidateResponse.copy(),
Unmarshal: h.Unmarshal.copy(),
UnmarshalError: h.UnmarshalError.copy(),
UnmarshalMeta: h.UnmarshalMeta.copy(),
Retry: h.Retry.copy(),
AfterRetry: h.AfterRetry.copy(),
}
}

// Clear removes callback functions for all handlers
func (h *Handlers) Clear() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These copy's are actually needed. HandlerList is a struct, but it contains a slice of Handlers. This slice needs to be independently mutable at each stage of a request's pipeline without impacting any upstream functionality.

Specifically this change will cause any modifications made to a request handler's list to be propagated upstream to the service's request handler list. This means that If a customization needs to modify a request handler to perform some functionality for a specific API operation, the mutation of the handler list will be propagated upstream to the service, and now all API operation made will now also include the request handler customization. A common pattern is to share session.Session between multiple service clients, and this would polute all other service client instances with the customized request handlers.

This also introduces race conditions in SDK. Which is the panic shown in the failed Travis test. Since the backing arrays of handlers would be shared between multiple instances of service clients, and API operations being made concurrently, any mutation to the handler list would introduce race conditions.

This Go blog post provides a good rundown on how Arrays vs Slices could break if down stream changes make unintended mutations to a upstream owned slice.

Copy link

@vburenin vburenin Jun 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How frequently this Copy is being called? Once per request?

Edit: Yes, once per request. It takes around 3000ns on my laptop to make a copy.

Copy link
Contributor

@jasdel jasdel Jun 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the only time Copy will be called is when a service client is created from a session.Session, and then each time an API operation is called. Outside of that it should be very rare of a HandlerList to be copied.

An alternative approach might be to implement lazy copy of the handlers, but this would require mutexes to wrap the HandlerList functions, since concurrent mutations/operation need to be supported. As a guess adding the critical section check will have more overhead over the life time of the request than the copy of slices of function pointers.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it really should be implemented somehow differently. Let say if I initialize a client for dynamo db, everything should be initialized only and only once. I can't see there is anything needs to be copied in that already prepared context.

Copy link
Contributor

@jasdel jasdel Jun 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users are always free to add or remove request handlers from the various HandlerLists based on their use cases. This can be performed both on the service client instance, and the individual request such as with, dynamodb.CreateTableRequest. All of the xxxRequest methods return a request.Request that can be modified based on the user's preferences. This is commonly done for debugging, or metrics reporting.

In addition there are customizations the SDK implements for some of the service client's and their API operations, and these are applied to the individual operation requests as they are called.

The HandlerList could be created once the first time each API operation is called by the service client, but mutex's would need to be added around the HandlerLists. This is due to API operations are allowed to be called concurrently. With that said, the handler lists still must be a unique copy per API operation request call instance, because users could choose to augment an operation's request Handlers uniformly. An upfront, or first time use copy would prevent this functionality.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like @vburenin said, this happens a lot.

I believe this is a good case where the so called convenience of user trying to add a metric or a logging as you say should be taken away and have the user add the code for monitoring manually.

In @vburenin's case, this alone takes 3ms to happen for every single time. It's a third of the "single digit millisecond" "feature" of DynamoDB.

Copy link
Contributor

@jasdel jasdel Jun 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note 3000ns is only 3us or 0.003ms not 3ms. Since requests can take multiple ms to cross the wire 0.003ms once per request is not significant.

Having user modifiable handlers is important, and some users depend on this functionality today. Removing this would be a breaking change to the SDK.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, my math is wrong in the morning.

h.Validate.Clear()
Expand Down
19 changes: 5 additions & 14 deletions aws/request/request.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,6 @@ import (
"io"
"io/ioutil"
"net/http"
"net/url"
"reflect"
"strings"
"time"

Expand Down Expand Up @@ -77,28 +75,21 @@ func New(cfg aws.Config, clientInfo metadata.ClientInfo, handlers Handlers,
p = "/"
}

httpReq, _ := http.NewRequest(method, "", nil)

var err error
httpReq.URL, err = url.Parse(clientInfo.Endpoint + p)
httpReq, err := http.NewRequest(method, clientInfo.Endpoint+p, nil)
if err != nil {
httpReq.URL = &url.URL{}
err = awserr.New("InvalidEndpointURL", "invalid endpoint uri", err)
return &Request{Error: err}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this can exit early here. It is possible for user's custom request handlers to depend on the request.Request.HTTPRequest value expecting it to be set. Returning early without the other fields could cause breaking changes because fields like HTTPRequest are no longer set. Its still an error condition, but exiting early may cause applications to panic instead of returning a API operation error.

I don't see any problem with getting ride of the url.Parse and letting NewRequest validate the URL. Both will report the error in Go 1.6. Neither 1.4 or 1.5 would return an error for invalid URLs such as http://example.com:80. For 1.4 and 1.5 the request would fail at transmission time instead of being caught early. #689.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My take on this is the following:

  • the function itself should have been returning a *Request, error pair from the beginning. It doesn't do so so I'd rather not break it yet.
  • regarding the issue in Trailing space in the Endpoint config parameter crashes the client #689, I don't see why this would make any difference to be honest. In both cases if there's an error there, the API call can't continue. If the developer can't provide a proper URL then it's ok for this to crash. We are also pretty close to Go 1.7 so there should be little to no incentive to support Go 1.4 (or really anything under 1.6 at this stage).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both cases are an error, but if the SDK were to start panicking it would break user's applications where before only an error would of been returned.

The SDK's min supported version is 1.5, and plans to continue to support this version in the future. Go 1.4 is not officially supported by the SDK, but many users still use this version of Go, and we'd like to continue unofficial support for this version as long as possible. The SDK supporting these additional Go versions does not impact the overall performance of the SDK. This does introduce some limitations on what new stdlib features the SDK can use, but this has not introduced any issues as of yet.

}

r := &Request{
Config: cfg,
ClientInfo: clientInfo,
Handlers: handlers.Copy(),

Handlers: handlers,
Retryer: retryer,
Time: time.Now(),
ExpireTime: 0,
Operation: operation,
HTTPRequest: httpReq,
Body: nil,
Params: params,
Error: err,
Data: data,
}
r.SetBufferBody([]byte{})
Expand All @@ -115,14 +106,14 @@ func (r *Request) WillRetry() bool {
// and the parameters are valid. False is returned if no parameters are
// provided or invalid.
func (r *Request) ParamsFilled() bool {
return r.Params != nil && reflect.ValueOf(r.Params).Elem().IsValid()
return r.Params != nil
}

// DataFilled returns true if the request's data for response deserialization
// target has been set and is a valid. False is returned if data is not
// set, or is invalid.
func (r *Request) DataFilled() bool {
return r.Data != nil && reflect.ValueOf(r.Data).Elem().IsValid()
return r.Data != nil
Copy link
Contributor

@jasdel jasdel Jun 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for nil of Param and Data were to insure that a interface with a nil type value wasn't provided. Since r.Param != nil may fail with a false positive in the situation where a nil type value was set to the interface [1] [2].
Though in practice this could only happen if a user where to set the value of the r.Param or r.Data them selves since all operation's should always provided a valid value for two fields.

Are you seeing these two calls impacting the performance of the SDK? In the pprof's we've been running this hasn't registered.

The portion of the overhead which is impacted by reflection will be focused more on the (un)marshaling of the API operation's request/response data. As visible in #722.

}

// SetBufferBody will set the request's body bytes that will be sent to
Expand Down
2 changes: 1 addition & 1 deletion aws/session/session.go
Original file line number Diff line number Diff line change
Expand Up @@ -90,7 +90,7 @@ func initHandlers(s *Session) {
func (s *Session) Copy(cfgs ...*aws.Config) *Session {
newSession := &Session{
Config: s.Config.Copy(cfgs...),
Handlers: s.Handlers.Copy(),
Handlers: s.Handlers,
}

initHandlers(newSession)
Expand Down
52 changes: 44 additions & 8 deletions private/waiter/waiter.go
Original file line number Diff line number Diff line change
Expand Up @@ -61,20 +61,56 @@ func (w *Waiter) Wait() error {
break
}
result = true
for _, val := range vals {
if !awsutil.DeepEqual(val, a.Expected) {
result = false
break
if exp, ok := a.Expected.(string); ok {
Copy link
Contributor

@jasdel jasdel Jun 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change to be type specific is a fine change instead of reflection, but an a.Expected can be many more types than string and int. It can actually be any value type. This is the reason awsutil.DeepEqual is used. The service's waiter will define a field that indicates a specific state of the waiter has occurred. The type of this field is pretty open ended. Such as, the a.Expected could contain complex value such as structs, maps or lists.

I don't think off hand there are any that actually use complex types, so a primitive type matching might be used without reflection. Though at any time a service's waiter could use a complex type, so the SDK would need to use the DeepEquals as a fallback.

In either case any functionality like this should be in its own independent function so its not duplicated through the code, and can be tested.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under which condition do you see this happening? I've searched the SDK and all I could find was that the waiter waits for either a string or an int. I'm rather tempted to split this into two fields actually, ExpectString and ExpectInt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoScaling's waiters are a good example of bool expected types. e.g. WaitUntilGroupInService. In generally only bool, string, and int values are used as Expected.

If the fields were split out they would be ExpectedString, ExpectedInt, ExpectedBool. This would also grow if the Expected for each additional type that were ever added such as float, slice, and maps. With that said, this would be a breaking change to the SDK because the waiter.WaitAcceptor type is an exported type. User's most likely would have no reason to use the type or its fields directly, but changing the fields would break them if they did.

There is nothing wrong with updating this section to use non-reflection for for the known cases int, string, bool, and falling back to the awsutil.DeepEquals for all other cases. In addition to the change should not be hardcodeed and duplicated inside of the waiter code, but implemented in a standalone utility function that is testable. In addition this should be discussed in its own Issue, with its own PR.

for _, val := range vals {
if got, ok := val.(*string); ok {
if exp != *got {
result = false
break
}
} else {
result = false
break
}
}
} else if exp, ok := a.Expected.(int); ok {
for _, val := range vals {
if got, ok := val.(*int); ok {
if exp != *got {
result = false
break
}
} else {
result = false
break
}
}
} else {
panic("Unexpected type for expected value")
}
case "pathAny":
// Only a single match needs to equal for the result to match
vals, _ = awsutil.ValuesAtPath(req.Data, a.Argument)
for _, val := range vals {
if awsutil.DeepEqual(val, a.Expected) {
result = true
break
if exp, ok := a.Expected.(string); ok {
for _, val := range vals {
if got, ok := val.(*string); ok {
if exp == *got {
result = true
break
}
}
}
} else if exp, ok := a.Expected.(int); ok {
for _, val := range vals {
if got, ok := val.(*int); ok {
if exp == *got {
result = true
break
}
}
}
} else {
panic("Unexpected type for expected value")
}
case "status":
s := a.Expected.(int)
Expand Down