diff --git a/content/en/blog/_posts/2019-08-30-announcing-etcd-3.4.md b/content/en/blog/_posts/2019-08-30-announcing-etcd-3.4.md index 07004b78d1124..1358bdd6d0272 100644 --- a/content/en/blog/_posts/2019-08-30-announcing-etcd-3.4.md +++ b/content/en/blog/_posts/2019-08-30-announcing-etcd-3.4.md @@ -18,7 +18,7 @@ etcd v3.4 includes a number of performance improvements for large scale Kubernet In particular, etcd experienced performance issues with a large number of concurrent read transactions even when there is no write (e.g. `“read-only range request ... took too long to execute”`). Previously, the storage backend commit operation on pending writes blocks incoming read transactions, even when there was no pending write. Now, the commit [does not block reads](https://github.com/etcd-io/etcd/pull/9296) which improve long-running read transaction performance. -We further made [backend read transactions fully concurrent](https://github.com/etcd-io/etcd/pull/10523). Previously, ongoing long-running read transactions block writes and upcoming reads. With this change, write throughput is increased by 70% and P99 write latency is reduced by 90% in the presence of long-running reads. We also ran [Kubernetes 5000-node scalability test on GCE](https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/1130745634945503235) with this change and observed similar improvements. For example, in the very beginning of the test where there are a lot of long-running “LIST pods”, the P99 latency of “POST clusterrolebindings” is [reduced by 97.4%](https://github.com/etcd-io/etcd/pull/10523#issuecomment-499262001). This non-blocking read transaction is now [used for compaction](https://github.com/etcd-io/etcd/pull/11034), which, combined with the reduced compaction batch size, reduces the P99 server request latency during compaction. +We further made [backend read transactions fully concurrent](https://github.com/etcd-io/etcd/pull/10523). Previously, ongoing long-running read transactions block writes and upcoming reads. With this change, write throughput is increased by 70% and P99 write latency is reduced by 90% in the presence of long-running reads. We also ran [Kubernetes 5000-node scalability test on GCE](https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/1130745634945503235) with this change and observed similar improvements. For example, in the very beginning of the test where there are a lot of long-running “LIST pods”, the P99 latency of “POST clusterrolebindings” is [reduced by 97.4%](https://github.com/etcd-io/etcd/pull/10523#issuecomment-499262001). More improvements have been made to lease storage. We enhanced [lease expire/revoke performance](https://github.com/etcd-io/etcd/pull/9418) by storing lease objects more efficiently, and made [lease look-up operation non-blocking](https://github.com/etcd-io/etcd/pull/9229) with current lease grant/revoke operation. And etcd v3.4 introduces [lease checkpoint](https://github.com/etcd-io/etcd/pull/9924) as an experimental feature to persist remaining time-to-live values through consensus. This ensures short-lived lease objects are not auto-renewed after leadership election. This also prevents lease object pile-up when the time-to-live value is relatively large (e.g. [1-hour TTL never expired in Kubernetes use case](https://github.com/kubernetes/kubernetes/issues/65497)).