etcd-tester: pause before compaction, fix races #5914
@xiang90 @heyitsanthony Can you check this change? I am trying to find where I am leaking stresser goroutines... This change keeps giving me these errors.
The cancel waits on the WaitGroup for all goroutines to exit, but the log shows that stresser.run goroutines are still running afterwards, and those are causing the inconsistent revisions.
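The cancel-then-wait pattern described above can be sketched roughly as follows. This is a minimal illustration, not the etcd-tester code: the `worker` type, its `start`/`run`/`stop` methods, and the `live` counter are all hypothetical names. It assumes each goroutine is registered with the WaitGroup before it is launched, which is the property that lets the wait account for every goroutine.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// worker is a hypothetical stand-in for a stresser; it is not the etcd type.
type worker struct {
	wg     sync.WaitGroup
	cancel context.CancelFunc
	live   atomic.Int64 // goroutines currently inside run
}

// start launches n goroutines. wg.Add runs before each "go" statement,
// so a later wg.Wait cannot miss a goroutine that is not yet scheduled.
func (w *worker) start(n int) {
	ctx, cancel := context.WithCancel(context.Background())
	w.cancel = cancel
	for i := 0; i < n; i++ {
		w.wg.Add(1)
		go w.run(ctx)
	}
}

// run loops until the context is canceled.
func (w *worker) run(ctx context.Context) {
	defer w.wg.Done() // runs last: deferred calls execute in LIFO order
	w.live.Add(1)
	defer w.live.Add(-1)
	for {
		select {
		case <-ctx.Done():
			return
		case <-time.After(time.Millisecond):
			// placeholder for one simulated request
		}
	}
}

// stop cancels the context, blocks until every run goroutine has
// returned, and reports how many are still live.
func (w *worker) stop() int64 {
	w.cancel()
	w.wg.Wait()
	return w.live.Load()
}

func main() {
	w := &worker{}
	w.start(8)
	time.Sleep(10 * time.Millisecond)
	fmt.Println("live after stop:", w.stop())
}
```

If goroutines are still observed after the wait returns, one thing worth checking is whether any of them were started without a matching `Add`, or started after the wait had already begun.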
@gyuho in |
@heyitsanthony We already define wg as a pointer:

    type stresser struct {
        Endpoint       string
        KeySize        int
        KeySuffixRange int
        qps            int
        N              int
        mu             sync.Mutex
        wg             *sync.WaitGroup
        conn           *grpc.ClientConn
        rateLimiter    *rate.Limiter
        cancel         func()
        canceled       bool
        success        int
    }
After these changes, does everything work fine?
@xiang90 I will try again today. I was running this patch yesterday but kept getting
I think I am still doing cancellation wrong. Thanks!
- increase default qps
- fix race condition between stresser cancel, start
- pause before doing compaction
- remove duplicate cleanup calls
- add additional atomic canceled check
@gyuho yes, you're right about the wg pointer. What about:
Should this be synchronous instead of launched in a new goroutine? Otherwise the stress cancel could happen before the stress start completes.
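To illustrate the ordering concern, here is a rough sketch of a synchronous start guarded by the same mutex as cancel. The `stressLoop` type and its `Start`/`Cancel` methods are hypothetical and only mirror the general shape of the discussion. The sketch also assumes a canceled loop is never restarted, which is a simplification compared with a tester that starts and stops stress repeatedly.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// stressLoop is a hypothetical type for illustration only.
type stressLoop struct {
	mu       sync.Mutex
	wg       sync.WaitGroup
	cancel   context.CancelFunc
	canceled bool
}

// Start runs synchronously: when it returns, either every goroutine is
// registered with wg, or nothing was started because Cancel came first.
func (s *stressLoop) Start(n int) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.canceled {
		return false
	}
	ctx, cancel := context.WithCancel(context.Background())
	s.cancel = cancel
	s.wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer s.wg.Done()
			<-ctx.Done() // placeholder for the real stress loop
		}()
	}
	return true
}

// Cancel sets the canceled flag under the same mutex as Start, then
// cancels the context (if any) and waits for the goroutines to exit.
func (s *stressLoop) Cancel() {
	s.mu.Lock()
	s.canceled = true
	cancel := s.cancel
	s.mu.Unlock()
	if cancel != nil {
		cancel()
	}
	s.wg.Wait()
}

func main() {
	s := &stressLoop{}
	s.Cancel()
	fmt.Println("started after cancel:", s.Start(4))
}
```

Because `Start` holds the mutex for its whole body, a concurrent `Cancel` either runs entirely before it (and `Start` becomes a no-op) or entirely after it (and the wait covers all n goroutines). Wrapping `Start` in `go func() { ... }()` at the call site would reintroduce the question of which one runs first.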
Looking at the code closer, |
@heyitsanthony I will try to make it more synchronous. I tried to hold the lock to block concurrent cancellation with additional but I still get this error...
Thanks!
@gyuho I don't really trust that |
@heyitsanthony Yeah it got too messy. I will try to make startStresser synchronous removing Thanks. |
For example,