Fix watch validation assuming that client not requesting older watch revision #16695

serathius · 2023-10-05T12:10:21Z

…sion Signed-off-by: Marek Siarkowicz <[email protected]>

ahrtr · 2023-10-05T13:08:59Z

tests/robustness/validate/watch.go

 	for _, op := range report.Watch {
+		var lastEventRevision int64 = 1


Not sure why this was changed. Aren't watch responses globally ordered for each client? In your test case TestValidateWatch there is no clientId; do you intentionally verify that different clients may request an older revision?

True, we expect a client to get increasing revisions, either because the watch doesn't break or because users will usually want to reestablish watches on the following revision. However, from etcd's perspective, those are independent watch requests, each providing its own revision to start watching from. It's not invalid for a single client to start watching from rev 200 and afterwards decide to establish a new watch from rev 100.

As in case #16693, for some unknown reason etcd went back from revision 301 to 192 in the KV store. So from the client's perspective it behaved correctly: after the watch was broken, it established the new watch on revision 192, even though it had previously seen revision 301.

The goal of this change is to remove the assumption about sensible client behavior (not going back on watch) and just validate the watch responses. This should increase the readability of robustness test reports, as client misbehavior caused by an etcd linearizability issue will no longer also be reported as an invalid watch.
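To illustrate the idea, here is a minimal sketch of per-watch validation. The types `watchEvent` and `watchOperation` and the function `validateOrdered` are hypothetical simplifications, not the actual code in tests/robustness/validate/watch.go; the only detail taken from the diff is that `lastEventRevision` is reset for every watch operation instead of being carried across them.

```go
package main

import "fmt"

// watchEvent and watchOperation are hypothetical, simplified stand-ins for
// the robustness report types; they are not the real etcd structs.
type watchEvent struct {
	Revision int64
}

type watchOperation struct {
	Events []watchEvent
}

// validateOrdered checks that revisions never decrease within a single watch.
// lastEventRevision is reset for every watch operation, so a later watch that
// starts from an older revision is not reported as an error.
func validateOrdered(ops []watchOperation) error {
	for i, op := range ops {
		var lastEventRevision int64 = 1
		for _, ev := range op.Events {
			if ev.Revision < lastEventRevision {
				return fmt.Errorf("watch %d: revision went back from %d to %d",
					i, lastEventRevision, ev.Revision)
			}
			lastEventRevision = ev.Revision
		}
	}
	return nil
}

func main() {
	// The second watch starts at an older revision than the first one ended
	// on; each watch is still ordered internally, so validation passes.
	ops := []watchOperation{
		{Events: []watchEvent{{Revision: 300}, {Revision: 301}}},
		{Events: []watchEvent{{Revision: 192}, {Revision: 193}}},
	}
	fmt.Println(validateOrdered(ops))
}
```

With the variable declared outside the loop, the second watch in this example would be flagged even though its own responses are ordered, which is the noise the change is meant to avoid.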

etcd went back from revision 301 to 192 in the KV store. So from the client's perspective it behaved correctly: after the watch was broken, it established the new watch on revision 192, even though it had previously seen revision 301.

It's true. But in your test case, there is no watch establishment, so the revision shouldn't go back?

The test change is OK. But I'd suggest you do a deep dive to figure out why the revision went back. Let me know if you need my assistance or if I misunderstood anything.

etcd/tests/robustness/traffic/client.go

Line 230 in 6a96ab7

for r := range c.client.Watch(ctx, request.Key, ops...) {

It's true. But in your test case, there is no watch establishment, so the revision shouldn't go back?

There is, but only in Kubernetes traffic. It runs a ListWatch loop with a 100 ms timeout.

The loop consists of a Read and a Watch from the Read revision.

etcd/tests/robustness/traffic/kubernetes.go

Lines 67 to 82 in 6a96ab7

g.Go(func() error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-finish:
			return nil
		default:
		}
		rev, err := t.Read(ctx, kc, s, limiter, keyPrefix)
		if err != nil {
			continue
		}
		t.Watch(ctx, kc, s, limiter, keyPrefix, rev+1)
	}
})

And the watch breaks every 100 ms to simulate the client losing its connection.

etcd/tests/robustness/traffic/kubernetes.go

Lines 190 to 197 in 6a96ab7

func (t kubernetesTraffic) Watch(ctx context.Context, kc *kubernetesClient, s *storage, limiter *rate.Limiter, keyPrefix string, revision int64) {
	watchCtx, cancel := context.WithTimeout(ctx, WatchTimeout)
	defer cancel()
	for e := range kc.client.Watch(watchCtx, keyPrefix, revision, true, true) {
		s.Update(e)
	}
	limiter.Wait(ctx)
}
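
As a rough, self-contained illustration of how a context timeout ends a `range` over a watch channel, here is a sketch that does not use the etcd client at all. `fakeWatch`, `watchOnce`, and `demoLastRevision` are hypothetical names, and the 20 ms timeout is an arbitrary value chosen for the demo, not the `WatchTimeout` used by the robustness tests.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// fakeWatch is a hypothetical stand-in for a client watch: it emits
// increasing revisions starting at startRev and closes the channel once
// ctx is done.
func fakeWatch(ctx context.Context, startRev int64) <-chan int64 {
	ch := make(chan int64)
	go func() {
		defer close(ch)
		for rev := startRev; ; rev++ {
			select {
			case <-ctx.Done():
				return
			case ch <- rev:
				time.Sleep(time.Millisecond)
			}
		}
	}()
	return ch
}

// watchOnce consumes events until the timeout cancels the watch and returns
// the last revision seen (startRev-1 if no event arrived).
func watchOnce(ctx context.Context, startRev int64, timeout time.Duration) int64 {
	watchCtx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	last := startRev - 1
	for rev := range fakeWatch(watchCtx, startRev) {
		last = rev
	}
	return last
}

// demoLastRevision runs one short watch starting at revision 100.
func demoLastRevision() int64 {
	return watchOnce(context.Background(), 100, 20*time.Millisecond)
}

func main() {
	fmt.Println("last revision seen:", demoLastRevision())
}
```

The loop exits only because the producer closes the channel after the context expires; the caller is then free to start the next watch from whatever revision its next read returns, including an older one.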

ahrtr

Let the PR go for now.

Please revisit https://github.com/etcd-io/etcd/pull/16695/files#r1347740431 later.

chaochn47 · 2023-10-05T17:38:55Z

It was difficult for me to understand the title until I had fully read the PR comments. Thanks for the discussion.

The PR title can be updated to Fix watch validation assuming that client not requesting older watch revision.

"Fix" should be followed by a problem, and the problem is that the validation assumes the client won't request a watch with an older revision.

serathius · 2023-10-06T09:11:30Z

Sorry for that, I rewrote the title a couple of times and it turned out not very clear.

chaochn47 · 2023-10-06T18:53:39Z

Thanks for updating!

Fix watch validation assuming that client requesting older watch revi…

c2655b4

…sion Signed-off-by: Marek Siarkowicz <[email protected]>

serathius mentioned this pull request Oct 5, 2023

Etcd revision decreased by 100, while the data stays the same #16693

Closed

4 tasks

ahrtr reviewed Oct 5, 2023

ahrtr approved these changes Oct 5, 2023

serathius merged commit 3f859a6 into etcd-io:main Oct 5, 2023

serathius changed the title ~~Fix watch validation assuming that client requesting older watch revision~~ Fix watch validation assuming that client not requesting older watch revision Oct 5, 2023
