Validate etcd linearizability #14398

serathius · 2022-08-29T12:14:17Z

First draft of linearizability tests that are able to reproduce #14370 within 20 seconds with 80% accuracy. Part of #14045

This approach uses a generic way of verifying linearizability. This proof of concept only reproduces #14370; for a full solution, scenarios should be generated randomly based on preexisting failpoints.

serathius · 2022-08-29T12:14:31Z

cc @ptabor @ahrtr @spzala

codecov-commenter · 2022-09-26T08:14:43Z

Codecov Report

Merging #14398 (069e26e) into main (e24402d) will decrease coverage by 0.26%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main   #14398      +/-   ##
==========================================
- Coverage   75.70%   75.44%   -0.27%     
==========================================
  Files         457      457              
  Lines       37300    37269      -31     
==========================================
- Hits        28239    28116     -123     
- Misses       7309     7380      +71     
- Partials     1752     1773      +21

Flag	Coverage Δ
all	`75.44% <0.00%> (-0.27%)`	⬇️

Impacted Files	Coverage Δ
pkg/expect/expect.go	`72.64% <0.00%> (-1.92%)`	⬇️
client/v3/namespace/watch.go	`87.87% <0.00%> (-6.07%)`	⬇️
raft/rafttest/node.go	`95.00% <0.00%> (-5.00%)`	⬇️
client/v3/concurrency/session.go	`88.63% <0.00%> (-4.55%)`	⬇️
client/pkg/v3/testutil/leak.go	`62.83% <0.00%> (-4.43%)`	⬇️
server/proxy/grpcproxy/watch.go	`92.48% <0.00%> (-4.05%)`	⬇️
server/storage/mvcc/watchable_store.go	`85.14% <0.00%> (-3.99%)`	⬇️
server/etcdserver/api/v3rpc/member.go	`93.54% <0.00%> (-3.23%)`	⬇️
server/etcdserver/cluster_util.go	`70.35% <0.00%> (-3.17%)`	⬇️
server/etcdserver/api/v3rpc/interceptor.go	`74.47% <0.00%> (-3.13%)`	⬇️
... and 20 more

serathius · 2022-10-14T09:22:35Z

@ahrtr @spzala I think this is ready for review. This PR provides the first scenario for a new framework of linearizability tests. Using this approach I was able to reproduce both data durability and data inconsistency issues. Further work should add more diverse scenarios to cover broader types of failures and allow dynamic scenarios based on a random set of actions.

Generating random scenarios should allow us to cover a much broader space and should be our main goal. The current approach based on prepackaged scenarios is limited to testing historical or simple scenarios. By testing a generic system property like linearizability we are no longer limited to simple got == expect validation. We need to start proactively finding issues instead of just responding to them after the fact.
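To illustrate what checking a generic property instead of got == expect can look like, here is a minimal sketch of a brute-force linearizability check for a single register. It is not the porcupine API and not the model from this PR; the `op` type, its fields, and the function names are assumptions made up for this example. It searches for an ordering of the recorded operations that respects real time and is legal for a register.

```go
package main

import "fmt"

// op is one completed client operation on a single register (hypothetical type).
// call and ret are timestamps taken before sending and after receiving.
type op struct {
	isPut bool
	value string // value written (put) or value observed (get)
	call  int64
	ret   int64
}

// linearizable reports whether some total order of the history respects real
// time and is legal for a register that starts out empty.
func linearizable(history []op) bool {
	return search(history, make([]bool, len(history)), 0, "")
}

// search tries every operation that may legally come next and recurses.
func search(h []op, used []bool, done int, state string) bool {
	if done == len(h) {
		return true
	}
	for i, o := range h {
		if used[i] || !mayGoNext(h, used, i) {
			continue
		}
		next := state
		if o.isPut {
			next = o.value
		} else if o.value != state {
			continue // a get must observe the current register value
		}
		used[i] = true
		if search(h, used, done+1, next) {
			return true
		}
		used[i] = false
	}
	return false
}

// mayGoNext reports whether op i can be ordered next: no other pending
// operation returned strictly before op i was called.
func mayGoNext(h []op, used []bool, i int) bool {
	for j, o := range h {
		if j != i && !used[j] && o.ret < h[i].call {
			return false
		}
	}
	return true
}

func main() {
	// A put and a get that overlap in time: the get may observe the new value.
	concurrent := []op{
		{isPut: true, value: "a", call: 0, ret: 10},
		{value: "a", call: 5, ret: 15},
	}
	// An acknowledged put followed by a get that no longer sees it.
	lostWrite := []op{
		{isPut: true, value: "a", call: 0, ret: 1},
		{value: "", call: 2, ret: 3},
	}
	fmt.Println(linearizable(concurrent), linearizable(lostWrite))
}
```

The search is exponential in the worst case, so this only works for tiny histories; it is meant to show the property being tested, not how a production checker is implemented.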

Next steps:

- Add scenario to reproduce "Durability API guarantee broken in single node cluster" #14370
- Integrate gofail into linearizability tests
- Extend gofail to allow triggering multiple failpoints at once
- Add a dynamic scenario based on triggering a random failpoint
- Add scenario to reproduce "Inconsistent revision and data occurs" #13766

spzala

@serathius great work! My only concern is that github.com/anishathalye/porcupine seems developed/maintained by an individual, and there is not much to find out on dev history/PRs. At the same time, we don't have other options, and we are only using it for tests, so giving porcupine a try sounds okay. (Forking and maintaining our own copy may be too much work, especially when we don't have enough contributors, so I guess we don't have that option.) I don't have any other comments, besides a question I noticed from @ahrtr. Thanks!

ahrtr · 2022-10-22T15:18:45Z

tests/linearizability/traffic.go

+)
+
+var (
+	PutGetTraffic Traffic = putGetTraffic{}


It doesn't make sense to define a global variable. We should either define a function, something like NewTraffic, or let users use putGetTraffic{} directly.

I want to maintain a single clean place that lists all available traffic and failpoints. I find it clearer than using structs and asking the user to scan the whole file where they are scattered.
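The two positions can be put side by side in a small sketch. The `Traffic` method set and the `NewPutGetTraffic` name are assumptions for illustration; only `PutGetTraffic`, `Traffic`, and `putGetTraffic{}` appear in the diff above.

```go
package main

import "fmt"

// Traffic is a hypothetical stand-in for the interface in traffic.go;
// the real method set is not shown in this thread.
type Traffic interface {
	Name() string
}

type putGetTraffic struct{}

func (putGetTraffic) Name() string { return "PutGet" }

// Option A (the PR as written): exported package-level values, so every
// available traffic type is listed in one var block.
var (
	PutGetTraffic Traffic = putGetTraffic{}
)

// Option B (the reviewer's alternative): a constructor function.
// NewPutGetTraffic is a made-up name for this sketch.
func NewPutGetTraffic() Traffic { return putGetTraffic{} }

func main() {
	fmt.Println(PutGetTraffic.Name(), NewPutGetTraffic().Name())
}
```

Because `putGetTraffic` carries no state, both options hand out interchangeable values; the disagreement is about discoverability versus avoiding package-level variables, not about behavior.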

ahrtr · 2022-10-22T15:22:43Z

tests/linearizability/linearizability_test.go

+		},
+		{
+			name:      "KillClusterOfSize3",
+			failpoint: KillFailpoint,


Suggested change

-			failpoint: KillFailpoint,
+			failpoint: killFailpoint{},

ahrtr · 2022-10-22T15:23:40Z

tests/linearizability/linearizability_test.go

+			traffic := trafficConfig{
+				minimalQPS:  minimalQPS,
+				maximalQPS:  maximalQPS,
+				clientCount: 8,


Minor comment: please consider making this configurable, for example by adding a clientCount field to the tcs struct.

The tcs struct is not set in stone; we can always add it when we decide to parameterize tests based on it.
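If the test table is later parameterized this way, it might look roughly like the sketch below. The `testCase` type, the `defaultClientCount` constant, the `trafficFor` helper, the field types, and the QPS numbers are all assumptions for illustration; only the `trafficConfig` field names and the value 8 come from the diff above.

```go
package main

import "fmt"

// trafficConfig mirrors the fields visible in the diff; the field types are assumed.
type trafficConfig struct {
	minimalQPS  float64
	maximalQPS  float64
	clientCount int
}

// testCase is a hypothetical entry of the tcs table with the extra field.
type testCase struct {
	name        string
	clientCount int // 0 means "use defaultClientCount"
}

// defaultClientCount keeps the value currently hard-coded in the test.
const defaultClientCount = 8

// trafficFor builds the traffic settings for one test case.
func trafficFor(tc testCase, minimalQPS, maximalQPS float64) trafficConfig {
	count := tc.clientCount
	if count <= 0 {
		count = defaultClientCount
	}
	return trafficConfig{minimalQPS: minimalQPS, maximalQPS: maximalQPS, clientCount: count}
}

func main() {
	tcs := []testCase{
		{name: "KillClusterOfSize3"},
		{name: "hypotheticalManyClients", clientCount: 32},
	}
	for _, tc := range tcs {
		// The QPS bounds here are placeholder numbers, not values from the PR.
		fmt.Printf("%s: %d clients\n", tc.name, trafficFor(tc, 100, 200).clientCount)
	}
}
```

Treating the zero value as "use the default" means existing table entries would not need to change when the field is added.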

ahrtr · 2022-10-22T15:45:21Z

Overall looks good to me with some minor comments. Great work!

My only concern is that github.com/anishathalye/porcupine seems developed/maintained by an individual, and there is not much to find out on dev history/PRs

Yes, I have the same concern, especially if we want to spend more resource/time on the linearizability test. We can discuss this separately.

ahrtr

Overall looks good to me.

serathius force-pushed the linearizability branch 5 times, most recently from c5a30cb to 7615605 Compare August 30, 2022 07:53

serathius force-pushed the linearizability branch from 7615605 to 7bd9685 Compare September 12, 2022 13:50

serathius force-pushed the linearizability branch 3 times, most recently from 16d36e0 to 320ef73 Compare September 22, 2022 13:03

serathius added the priority/important-soon label Sep 22, 2022

serathius force-pushed the linearizability branch from 320ef73 to e30f0e2 Compare September 26, 2022 07:53

serathius force-pushed the linearizability branch 4 times, most recently from ac43a74 to 697a691 Compare September 30, 2022 19:21

serathius force-pushed the linearizability branch 9 times, most recently from 732a65c to 1894704 Compare October 13, 2022 15:21

serathius changed the title ~~[Draft] tests: Validate etcd linearizability~~ Validate etcd linearizability Oct 14, 2022

ahrtr self-assigned this Oct 14, 2022

serathius force-pushed the linearizability branch from 1894704 to 5779393 Compare October 17, 2022 10:08

serathius force-pushed the linearizability branch 11 times, most recently from cdc5957 to 09909f5 Compare October 20, 2022 21:47

ahrtr reviewed Oct 20, 2022

ahrtr reviewed Oct 21, 2022

ahrtr reviewed Oct 21, 2022

serathius force-pushed the linearizability branch 5 times, most recently from 0ce584a to b26705d Compare October 21, 2022 10:39

spzala approved these changes Oct 21, 2022

ahrtr reviewed Oct 22, 2022

tests: Validate etcd linearizability

069e26e

Signed-off-by: Marek Siarkowicz <[email protected]>

serathius force-pushed the linearizability branch from b26705d to 069e26e Compare October 23, 2022 05:04

ahrtr approved these changes Oct 23, 2022

ahrtr added the stage/merge-when-tests-green label Oct 23, 2022

serathius merged commit e5790d2 into etcd-io:main Oct 23, 2022

serathius mentioned this pull request Nov 10, 2022

Introduce etcd linearizability tests #14045

Closed

33 tasks

serathius mentioned this pull request Nov 22, 2022

Unified test framework #14820

Closed

serathius deleted the linearizability branch June 15, 2023 20:38
