Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to ingress-nginx v0.40.2 #122

Merged
merged 17 commits into from
Oct 7, 2020
Merged

Upgrade to ingress-nginx v0.40.2 #122

merged 17 commits into from
Oct 7, 2020

Conversation

sslavic
Copy link

@sslavic sslavic commented Oct 5, 2020

# configmap.server-tokens
# Controls whether to send NGINX Server header in responses and display NGINX
# version in error pages.
server-tokens: "false"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now upstream defaults to server-tokens disabled, so there's no need to have this override.

@sslavic sslavic requested a review from a team October 6, 2020 09:54
@sslavic sslavic force-pushed the ingress-nginx-0-40-1 branch from 9fae8ae to a423834 Compare October 6, 2020 09:55
Copy link
Member

@glitchcrab glitchcrab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (assuming tested).

Not sure if the performance test failure is actually true or not though.

@sslavic
Copy link
Author

sslavic commented Oct 6, 2020

Perf tests are failing due to kubernetes/ingress-nginx#6284 and loadtest app using extensions/v1beta1 Ingress. It's a good test, since in reality users will still use that old API, it should be gracefully rejected. So unfortunately, this is on hold, have to wait for next patch or minor release.

CHANGELOG.md Outdated
Comment on lines 25 to 32
- Default configuration changes:

- [`gzip-level`](https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/configmap.md#gzip-level) default changed from `5` to `1`
- [`ssl-session-tickets`](https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/configmap.md#ssl-session-tickets) default changed from `true` to `false`
- [`use-gzip`](https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/configmap.md#use-gzip) default changed from `true` to `false`
- [`upstream-keepalive-connections`](https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/configmap.md#upstream-keepalive-connections) changed from `32` to `320`
- [`upstream-keepalive-requests`](https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/configmap.md#upstream-keepalive-requests) changed from `100` to `10000`
- Support and enable by default [mimalloc](https://github.com/microsoft/mimalloc) as a drop-in malloc replacement to reduce nginx memory utilization.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes to default configuration improved performance in upstream performance tests kubernetes/ingress-nginx#6226 (comment)

//cc @whites11

@sslavic
Copy link
Author

sslavic commented Oct 6, 2020

For performance tests to pass with validating webhook enabled by default:

That may take a while. So as temp solution I've run performance tests with validating webhook disabled. Tests used AWS 12.2.0 platform release. Here are results:

================================================================================
---- Global Information --------------------------------------------------------
> request count                                      50000 (OK=50000  KO=0     )
> min response time                                      0 (OK=0      KO=-     )
> max response time                                   1400 (OK=1400   KO=-     )
> mean response time                                   138 (OK=138    KO=-     )
> std deviation                                        139 (OK=139    KO=-     )
> response time 50th percentile                         98 (OK=98     KO=-     )
> response time 75th percentile                        198 (OK=198    KO=-     )
> response time 95th percentile                        407 (OK=407    KO=-     )
> response time 99th percentile                        607 (OK=607    KO=-     )
> mean requests/sec                                   2000 (OK=2000   KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                         49874 (100%)
> 800 ms < t < 1200 ms                                 119 (  0%)
> t > 1200 ms                                            7 (  0%)
> failed                                                 0 (  0%)
================================================================================

Latency tail is slightly better compared to previous release perf tests run #118 (comment)

In production effect may be different. Perf tests here anyway validate just that there are no regressions.

Once perf test completed, validating webhook has been reenabled.

@aledbf
Copy link

aledbf commented Oct 6, 2020

@sslavic the tail latency performance is because of the change in the defaults kubernetes/ingress-nginx#6226 (comment)
In particular use-gzip: "false".

Without gzip enabled there is no change required to get ~30K RPS https://gist.github.com/aledbf/a79434b28dc529930b88b824976f7044#gistcomment-3467102

@sslavic sslavic changed the title Upgrade to ingress-nginx v0.40.1 Upgrade to ingress-nginx v0.40.2 Oct 6, 2020
@sslavic
Copy link
Author

sslavic commented Oct 6, 2020

Yes @aledbf, thanks! I shared your perf test results earlier with a colleague, @whites11, #122 (comment)
He was looking into some performance issues. Unadjusted previous default configs (like upstream-keepalive-* ones) might have been the bottleneck there, i.e. capacity might improve with upgrade to the new release. Disabling gzip seems like reasonable default, improves latency tail where payloads are dominantly relatively small. On any environment where payloads are large, i.e. would benefit from compression, gzip may have to be re-enabled on demand.

PR has been updated to use just released upstream v0.40.2 binary. Thanks once more @aledbf for merging my PR kubernetes/ingress-nginx#6284 and shipping the release so promptly. That unblocks us - once we update Ingress API and dependencies in part of our perf testing tooling stack (see #122 (comment)) we'll be able to ship this upgrade.

@sslavic
Copy link
Author

sslavic commented Oct 7, 2020

extensions/v1beta1 Ingress on k8s 1.14+ are automatically visible by new NGINX IC validating webhook through networking.k8s.io API, due to Kubernetes automatic conversion.

In previous tests Ingress resource validation was failing due to NGINX IC webhook installation being unreliable. Once that was fixed, by tuning helm hook-weight annotations, performance test passed.

It's still great to have NGINX IC validating webhook resiliency improvement from 0.40.2 in.

It's recommended, but there is no need yet for Ingress resources to have API group changed from extensions/v1beta1 to networking.k8s.io/v1beta1.

Changelog has been updated accordingly.

@sslavic
Copy link
Author

sslavic commented Oct 7, 2020

perf tests results with 0.40.2:

================================================================================
---- Global Information --------------------------------------------------------
> request count                                      50000 (OK=50000  KO=0     )
> min response time                                      0 (OK=0      KO=-     )
> max response time                                   1217 (OK=1217   KO=-     )
> mean response time                                   130 (OK=130    KO=-     )
> std deviation                                        130 (OK=130    KO=-     )
> response time 50th percentile                         95 (OK=95     KO=-     )
> response time 75th percentile                        190 (OK=190    KO=-     )
> response time 95th percentile                        395 (OK=395    KO=-     )
> response time 99th percentile                        590 (OK=590    KO=-     )
> mean requests/sec                                2083.333 (OK=2083.333 KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                         49925 (100%)
> 800 ms < t < 1200 ms                                  74 (  0%)
> t > 1200 ms                                            1 (  0%)
> failed                                                 0 (  0%)
================================================================================

@sslavic sslavic merged commit ecfda79 into master Oct 7, 2020
@sslavic sslavic deleted the ingress-nginx-0-40-1 branch October 7, 2020 08:43
@sslavic
Copy link
Author

sslavic commented Oct 7, 2020

Tested upgrade from 1.9.2 to this PR build on gauss. Clean installation was automatically tested with performance tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants