Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduce sensitivity of fncache cancellation test #14585

Merged
merged 4 commits into from
Jul 27, 2022
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 18 additions & 7 deletions lib/utils/fncache_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -157,35 +157,46 @@ func testFnCacheSimple(t *testing.T, ttl time.Duration, delay time.Duration) {
func TestFnCacheCancellation(t *testing.T) {
t.Parallel()

const timeout = time.Millisecond * 10
const longTimeout = time.Second * 10 // should never be hit

cache, err := NewFnCache(FnCacheConfig{TTL: time.Minute})
require.NoError(t, err)

ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()

// used to artificially block the load function
blocker := make(chan struct{})

// set up a context that we can cancel from within the load function to
// simulate a scenario where the calling context is canceled or times out.
// if we actually hit the timeout, that is a bug.
ctx, cancel := context.WithTimeout(context.Background(), longTimeout)
defer cancel()

v, err := cache.Get(ctx, "key", func(context.Context) (interface{}, error) {
cancel()
<-blocker
return "val", nil
})

require.Nil(t, v)
require.Equal(t, context.DeadlineExceeded, trace.Unwrap(err))
require.Equal(t, context.Canceled, trace.Unwrap(err), "context should have been canceled immediately")

// unblock the loading operation which is still in progress
close(blocker)

ctx, cancel = context.WithTimeout(context.Background(), timeout)
// since we unblocked the loadfn, we expect the next Get to return almost
// immediately. we still use a fairly long timeout just to ensure that failure
// is due to an actual bug and not due to resource constraints in the test env.
ctx, cancel = context.WithTimeout(context.Background(), longTimeout)
defer cancel()

loadFnWasRun := atomic.NewBool(false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure, but I think this will panic on a 32-bit arch. We've already had issues like that #11822

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just use stdlib atomic int64 for now? Next version of Go will have a built-in atomic bool.

Copy link
Contributor Author

@fspmarshall fspmarshall Jul 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakule @zmb3 I think I can answer both questions at the same time:

We started using go.uber.org/atomic a couple years back precisely because of issues w/ alignment (and the generally poor standard library API). go.uber.org/atomic automatically handles alignment and protects from non-atomic access. #11822 wouldn't have happened if we'd been better about using go.uber.org/atomic everywhere. IMO we should keep preferring go.uber.org/atomic over the standard library's API until we get to go1.19 (when the standard library will start exposing an equivalent API of its own).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go.uber.org/atomic automatically handles alignment and protects from non-atomic access.

I don't think this is entirely true, ex: uber-go/atomic#105
But I checked, and this use seems to work on the 32bit arch. I didn't realize that this is uber atomic, not stdlib.
I agree with @zmb3 that we should prefer stuff that is in the stdlib, but atomics is one of those things that so far causes us a lot of problems, and Go 1.19 is not available for us now.

@fspmarshall Do you know why did we decide to stop using Uber atomics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider the linked issue to be an abuse of the API rather than a failure of the library. They are manually dereferencing the wrapper returned by NewDuration(), which breaks alignment. IMO if you're dereferencing opaque types returned by concurrency libraries, you should expect some problems 😬

Do you know why did we decide to stop using Uber atomics?

If we stopped, I certainly wasn't aware of it. I always encourage people to use it when I'm reviewing, and I use it in all the code I write 🤷 The team was a lot smaller back then. Probably just something that got lost along the way.

We do deliberately avoid it in the api package because we are a more aggressive about introducing external deps in there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we/I just lost the context. When I was looking at our code, I wasn't aware that we were using Uber atomics as most of the usage that I saw used the std library.

v, err = cache.Get(ctx, "key", func(context.Context) (interface{}, error) {
t.Fatal("this should never run!")
loadFnWasRun.Store(true)
return nil, nil
})

require.False(t, loadFnWasRun.Load(), "loadfn should not have been run")

require.NoError(t, err)
require.Equal(t, "val", v.(string))
}
Expand Down