WIP: fix remaining data races #10341
Closed
And remove unnecessary functions. This is done in preparation for properly locking when updating tags.
Some global variables are patched to shorter values in these tests. But the goroutines that read them can outlive the test because nothing waited for them to exit. This commit adds a Wait() method to the routine manager, so that tests can wait for the goroutines to exit. This prevents the data race because the 'reset to original value' can happen after all other goroutines have stopped.
github-actions bot added the theme/internals (Serf, Raft, SWIM, Lifeguard, Anti-Entropy, locking topics) and type/ci (Relating to continuous integration (CI) tooling for testing or releases) labels on Jun 2, 2021
dnephin added the pr/no-changelog (PR does not need a corresponding .changelog entry) label on Jun 2, 2021
The LogOutput io.Writer used by TestAgent must allow concurrent reads and writes, and a bytes.Buffer does not allow this. The bytes.Buffer must be wrapped with a lock to make this safe.
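One way to wrap a `bytes.Buffer` with a lock is sketched below. The `lockedBuffer` name and the `writeConcurrently` helper are hypothetical and only illustrate the pattern; they are not taken from the PR.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// lockedBuffer is a hypothetical wrapper that guards a bytes.Buffer
// with a mutex so it can be written and read from several goroutines.
type lockedBuffer struct {
	mu  sync.Mutex
	buf bytes.Buffer
}

// Write implements io.Writer while holding the lock.
func (b *lockedBuffer) Write(p []byte) (int, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.buf.Write(p)
}

// String returns the buffered contents while holding the lock.
func (b *lockedBuffer) String() string {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.buf.String()
}

// writeConcurrently writes one line from each of n goroutines, reading
// the buffer concurrently, and returns the total number of bytes stored.
func writeConcurrently(n int) int {
	var out lockedBuffer
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Fprintln(&out, "log line")
			_ = out.String() // concurrent read, safe because of the lock
		}()
	}
	wg.Wait()
	return len(out.String())
}

func main() {
	fmt.Println(writeConcurrently(8))
}
```

Replacing `lockedBuffer` with a bare `bytes.Buffer` in this sketch would be reported by `go test -race` or `go run -race`.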
The dnsConfig pulled from the atomic.Value is a pointer, so modifying it in place creates a data race. Use the exported ReloadConfig interface instead.
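The general hazard can be sketched as follows. This is not Consul's `ReloadConfig` implementation; the `server` type, the `reload` method, and the single-field `dnsConfig` are assumptions made for illustration. The pointer stored in an `atomic.Value` is treated as immutable, and an update publishes a modified copy instead of mutating the shared struct.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// dnsConfig is a hypothetical, trimmed-down config struct.
type dnsConfig struct {
	OnlyPassing bool
}

// server holds its config in an atomic.Value containing a *dnsConfig.
type server struct {
	cfg atomic.Value
}

func newServer() *server {
	s := &server{}
	s.cfg.Store(&dnsConfig{})
	return s
}

// reload copies the current config, applies the update to the copy,
// and publishes the copy. Readers holding the old pointer are unaffected.
func (s *server) reload(update func(*dnsConfig)) {
	old := s.cfg.Load().(*dnsConfig)
	next := *old // copy the struct; never write through the shared pointer
	update(&next)
	s.cfg.Store(&next)
}

func main() {
	s := newServer()
	before := s.cfg.Load().(*dnsConfig)
	s.reload(func(c *dnsConfig) { c.OnlyPassing = true })
	after := s.cfg.Load().(*dnsConfig)
	fmt.Println(before.OnlyPassing, after.OnlyPassing)
}
```

The sketch assumes a single writer; concurrent calls to `reload` would need their own serialization.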
The test was modifying a pointer to a struct that had been passed to another goroutine. Instead create a new struct to modify.

```
WARNING: DATA RACE
Write at 0x00c01407c3c0 by goroutine 832:
  github.com/hashicorp/consul/agent.TestServiceManager_PersistService_API()
      /home/daniel/pers/code/consul/agent/service_manager_test.go:446 +0x1d86
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202

Previous read at 0x00c01407c3c0 by goroutine 938:
  reflect.typedmemmove()
      /usr/lib/go/src/runtime/mbarrier.go:177 +0x0
  reflect.Value.Set()
      /usr/lib/go/src/reflect/value.go:1569 +0x13b
  github.com/mitchellh/copystructure.(*walker).Primitive()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/copystructure.go:289 +0x190
  github.com/mitchellh/reflectwalk.walkPrimitive()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/reflectwalk.go:252 +0x31b
  github.com/mitchellh/reflectwalk.walk()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/reflectwalk.go:179 +0x24d
  github.com/mitchellh/reflectwalk.walkStruct()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/reflectwalk.go:386 +0x4ec
  github.com/mitchellh/reflectwalk.walk()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/reflectwalk.go:188 +0x656
  github.com/mitchellh/reflectwalk.walkStruct()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/reflectwalk.go:386 +0x4ec
  github.com/mitchellh/reflectwalk.walk()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/reflectwalk.go:188 +0x656
  github.com/mitchellh/reflectwalk.Walk()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/reflectwalk.go:92 +0x164
  github.com/mitchellh/copystructure.Config.Copy()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/copystructure.go:69 +0xe7
  github.com/mitchellh/copystructure.Copy()
      /home/daniel/go/pkg/mod/github.com/mitchellh/[email protected]/copystructure.go:13 +0x84
  github.com/hashicorp/consul/agent.mergeServiceConfig()
      /home/daniel/pers/code/consul/agent/service_manager.go:362 +0x56
  github.com/hashicorp/consul/agent.(*serviceConfigWatch).handleUpdate()
      /home/daniel/pers/code/consul/agent/service_manager.go:279 +0x250
  github.com/hashicorp/consul/agent.(*serviceConfigWatch).runWatch()
      /home/daniel/pers/code/consul/agent/service_manager.go:246 +0x2d4

Goroutine 832 (running) created at:
  testing.(*T).Run()
      /usr/lib/go/src/testing/testing.go:1238 +0x5d7
  testing.runTests.func1()
      /usr/lib/go/src/testing/testing.go:1511 +0xa6
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202
  testing.runTests()
      /usr/lib/go/src/testing/testing.go:1509 +0x612
  testing.(*M).Run()
      /usr/lib/go/src/testing/testing.go:1417 +0x3b3
  main.main()
      _testmain.go:1181 +0x236

Goroutine 938 (running) created at:
  github.com/hashicorp/consul/agent.(*serviceConfigWatch).start()
      /home/daniel/pers/code/consul/agent/service_manager.go:223 +0x4e4
  github.com/hashicorp/consul/agent.(*ServiceManager).AddService()
      /home/daniel/pers/code/consul/agent/service_manager.go:98 +0x344
  github.com/hashicorp/consul/agent.(*Agent).addServiceLocked()
      /home/daniel/pers/code/consul/agent/agent.go:1942 +0x2e4
  github.com/hashicorp/consul/agent.(*Agent).AddService()
      /home/daniel/pers/code/consul/agent/agent.go:1929 +0x337
  github.com/hashicorp/consul/agent.TestServiceManager_PersistService_API()
      /home/daniel/pers/code/consul/agent/service_manager_test.go:400 +0x17c4
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202
```
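The fix described for this test, creating a new struct instead of modifying one that another goroutine may still read, might look roughly like the sketch below. The `service` type and the `withPort` helper are hypothetical and far simpler than the structs the real test uses.

```go
package main

import "fmt"

// service is a hypothetical stand-in for a service definition.
type service struct {
	Name string
	Port int
}

// withPort returns a modified copy and leaves the original untouched,
// so a goroutine that still holds the original pointer never sees a write.
func withPort(svc *service, port int) *service {
	next := *svc // shallow copy; enough here because service has no pointer fields
	next.Port = port
	return &next
}

func main() {
	orig := &service{Name: "web", Port: 8000}

	done := make(chan int)
	go func() { done <- orig.Port }() // background reader of the original

	updated := withPort(orig, 8001) // no write to *orig, so no race with the reader
	fmt.Println(<-done, updated.Port)
}
```

For structs that contain maps, slices, or pointers, a shallow copy like this would still share state, and a deep copy would be needed.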
So that the tests in command can use the consul binary
To pick up data race fixes
By setting the hash when we create the policy.

```
WARNING: DATA RACE
Read at 0x00c0028b4b10 by goroutine 1182:
  github.com/hashicorp/consul/agent/structs.(*ACLPolicy).SetHash()
      /home/daniel/pers/code/consul/agent/structs/acl.go:701 +0x40d
  github.com/hashicorp/consul/agent/structs.ACLPolicies.resolveWithCache()
      /home/daniel/pers/code/consul/agent/structs/acl.go:779 +0xfe
  github.com/hashicorp/consul/agent/structs.ACLPolicies.Compile()
      /home/daniel/pers/code/consul/agent/structs/acl.go:809 +0xf1
  github.com/hashicorp/consul/agent/consul.(*ACLResolver).ResolveTokenToIdentityAndAuthorizer()
      /home/daniel/pers/code/consul/agent/consul/acl.go:1226 +0x6ef
  github.com/hashicorp/consul/agent/consul.resolveTokenAsync()
      /home/daniel/pers/code/consul/agent/consul/acl_test.go:66 +0x5c

Previous write at 0x00c0028b4b10 by goroutine 1509:
  github.com/hashicorp/consul/agent/structs.(*ACLPolicy).SetHash()
      /home/daniel/pers/code/consul/agent/structs/acl.go:730 +0x3a8
  github.com/hashicorp/consul/agent/structs.ACLPolicies.resolveWithCache()
      /home/daniel/pers/code/consul/agent/structs/acl.go:779 +0xfe
  github.com/hashicorp/consul/agent/structs.ACLPolicies.Compile()
      /home/daniel/pers/code/consul/agent/structs/acl.go:809 +0xf1
  github.com/hashicorp/consul/agent/consul.(*ACLResolver).ResolveTokenToIdentityAndAuthorizer()
      /home/daniel/pers/code/consul/agent/consul/acl.go:1226 +0x6ef
  github.com/hashicorp/consul/agent/consul.resolveTokenAsync()
      /home/daniel/pers/code/consul/agent/consul/acl_test.go:66 +0x5c

Goroutine 1182 (running) created at:
  github.com/hashicorp/consul/agent/consul.TestACLResolver_Client.func4()
      /home/daniel/pers/code/consul/agent/consul/acl_test.go:1669 +0x459
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202

Goroutine 1509 (running) created at:
  github.com/hashicorp/consul/agent/consul.TestACLResolver_Client.func4()
      /home/daniel/pers/code/consul/agent/consul/acl_test.go:1668 +0x415
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202
```
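The trace shows two goroutines racing inside `SetHash()` on the same policy. A rough sketch of the "set the hash at creation" approach follows. The `policy` type, `newPolicy`, and `readHashes` are hypothetical, and SHA-256 from the standard library is used here purely for illustration; it is an assumption, not necessarily the hash Consul uses.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// policy is a hypothetical, simplified policy record.
type policy struct {
	Name  string
	Rules string
	Hash  []byte
}

// newPolicy computes the hash once, before the policy is shared, so
// later concurrent users only ever read the Hash field.
func newPolicy(name, rules string) *policy {
	sum := sha256.Sum256([]byte(name + "\x00" + rules))
	return &policy{Name: name, Rules: rules, Hash: sum[:]}
}

// readHashes reads p.Hash from n goroutines and returns the summed lengths.
func readHashes(p *policy, n int) int {
	lens := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			lens[i] = len(p.Hash) // read-only access; each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	total := 0
	for _, l := range lens {
		total += l
	}
	return total
}

func main() {
	p := newPolicy("example", `node_prefix "" { policy = "read" }`)
	fmt.Println(readHashes(p, 4))
}
```

The design choice is to move the only write to a point where the value has a single owner, rather than adding a lock around a lazily computed field.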
This test is super racy, and we already have test coverage for this functionality in the agent/cache package with TestCacheThrottle. Instead of spending time rewriting this test, let's remove it.

```
WARNING: DATA RACE
Read at 0x00c01de410fc by goroutine 735:
  github.com/hashicorp/consul/agent.TestCacheRateLimit.func1()
      /home/daniel/pers/code/consul/agent/agent_test.go:1024 +0x9af
  github.com/hashicorp/consul/testrpc.WaitForTestAgent()
      /home/daniel/pers/code/consul/testrpc/wait.go:99 +0x209
  github.com/hashicorp/consul/agent.TestCacheRateLimit.func1()
      /home/daniel/pers/code/consul/agent/agent_test.go:966 +0x1ad
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202

Previous write at 0x00c01de410fc by goroutine 605:
  github.com/hashicorp/consul/agent.TestCacheRateLimit.func1.2()
      /home/daniel/pers/code/consul/agent/agent_test.go:998 +0xe9

Goroutine 735 (running) created at:
  testing.(*T).Run()
      /usr/lib/go/src/testing/testing.go:1238 +0x5d7
  github.com/hashicorp/consul/agent.TestCacheRateLimit()
      /home/daniel/pers/code/consul/agent/agent_test.go:961 +0x375
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202

Goroutine 605 (finished) created at:
  github.com/hashicorp/consul/agent.TestCacheRateLimit.func1()
      /home/daniel/pers/code/consul/agent/agent_test.go:1022 +0x91e
  github.com/hashicorp/consul/testrpc.WaitForTestAgent()
      /home/daniel/pers/code/consul/testrpc/wait.go:99 +0x209
  github.com/hashicorp/consul/agent.TestCacheRateLimit.func1()
      /home/daniel/pers/code/consul/agent/agent_test.go:966 +0x1ad
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1193 +0x202
```
dnephin force-pushed the dnephin/fix-serf-tag-data-race branch from 9699aac to 40876bc on June 10, 2021 16:40
This was referenced Jun 11, 2021
Merged
This should be easier to do now that we've removed legacy ACLs, but the commits will need to be rebased on
Closing as this is over two years old.
Related to #8329
Fixes #9458 and includes a hacky fix for #9457
Enables a few more packages to run with the race detector.
There are still a few data races in serf, and a bunch in the `agent` package, but I'd like to see if the `command` and `agent/local` package will run reliably now with the race detector enabled.

TODO: