grpclog.Errorf("kuberesolver: watching ended with error='%v', will reconnect again", err)
}
}, time.Second, time.Second*30, ctx.Done())
But tkResolver.watch() only returns an error during the initial connection establishment in watchEndpoints(). All other cases in the select return a nil error:
grpclog.Infof("kuberesolver: Unexpected EOF during watch stream event decoding: %v", err)
default:
grpclog.Infof("kuberesolver: Unable to decode an event from the watch stream: %v", err)
}
return
}
sw.result<-obj
}
}
until() only wraps panics ... and will silently stop restarting watch() after 30secs
Am I missing something? How does the actual reconnect happen on intermediate connection errors after the 30 seconds? There is a timer that updates the endpoints every 30min, but the watcher is just not restarted.
I'm asking because we saw the Kubernetes API become unavailable for minutes during a cluster upgrade.
> until() only wraps panics ... and will silently stop restarting watch() after 30secs
No, this is wrong. until always calls the given function with backoff and keeps calling it until the given stop channel is closed. The backoff sequence is 1, 2, 4, 8, 16, 1, 2, 4... seconds. This backoff logic was added in #40.
You are free to submit a PR to change this if you have experienced a problem caused by it.
Ah, I double-checked the code path of the channel handed to until. Somewhere I mixed it up with the cancel channel of the streamWatcher. Thanks for clarifying, and sorry for the noise.
AFAICT kubeBuilder.Build should reconnect when the watcher encounters an error (kuberesolver/builder.go, lines 177 to 183 in b382846).
But tkResolver.watch() only returns an error during the initial connection establishment in watchEndpoints(). All other cases in the select return a nil error (kuberesolver/builder.go, lines 278 to 299 in b382846).
watchEndpoints sets up a newStreamWatcher, which loops over the channel and closes it in case of an error (kuberesolver/stream.go, lines 69 to 93 in b382846).