Fix scale test errors caused by upstream server count #2439
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2439   +/-   ##
=======================================
  Coverage   88.79%   88.79%
=======================================
  Files         100      100
  Lines        7531     7531
  Branches       50       50
=======================================
  Hits         6687     6687
  Misses        788      788
  Partials       56       56
☔ View full report in Codecov by Sentry.
I've added the change to However, I encountered some issues where this fix had flaky results: sometimes the test would pass and sometimes it would error. I've recorded the results from multiple runs of the test. The files with the name Regardless of the node size, I found that with upstream server count of Does anyone have any thoughts on how to proceed with this?
What's the error in the logs? Is it the upstream server full error?
@kate-osborn Yep, from the nginx logs it's:
And from the NGF logs it's:
Ok, so we have two problems:
Let's focus on number 2. I think we need to drop the N+ upstream number down until we get 5 runs that pass consecutively. You can iterate more quickly by just running the scale upstreams test by adding a
Update to the above: there was a testing error on my part, which gave false-negative results for these changes. After fixing my testing procedure, setting the upstream server count to 556 (the number we suspected would work, taken from the original manual test) lets the scale test pass correctly.
cf96dc8 to 77ded16
77ded16 to 247bead
LGTM
6682fce to 957568f
Proposed changes
Problem: When the scale test runs against NGINX Plus with 648 upstream servers, it reports both NGF and NGINX Plus errors, because at some point the upstream zone becomes too small to hold all upstream servers. As a result, NGF fails to update NGINX Plus.
Solution: Reduce the upstream server count in the scale test from 648 to 556 when it runs with NGINX Plus.
Testing: Scale test passed 5 consecutive times.
Closes #2023
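For anyone repeating this tuning, the approach suggested in the conversation (lower the NGINX Plus upstream count until five runs pass in a row) could be sketched roughly as follows. This is only an illustration in Python, not the NGF test code: `find_stable_count` and `run_test` are hypothetical names, and `run_test` stands in for whatever launches a single scale upstreams run and reports pass or fail.

```python
from typing import Callable, Optional


def find_stable_count(
    run_test: Callable[[int], bool],
    start: int,
    step: int = 1,
    required_passes: int = 5,
    minimum: int = 1,
) -> Optional[int]:
    """Return the highest upstream server count, at or below `start`,
    for which `run_test` passes `required_passes` times in a row.

    `run_test(count)` is a hypothetical hook: it should run one scale test
    with `count` upstream servers and return True on success.
    Returns None if no count down to `minimum` is stable.
    """
    count = start
    while count >= minimum:
        # all() stops at the first failing run, so a flaky count is
        # abandoned without spending the remaining runs on it.
        if all(run_test(count) for _ in range(required_passes)):
            return count
        count -= step
    return None
```

With a stand-in such as `lambda n: n <= 556` for `run_test` and `start=648`, the search settles on 556. A real run would be far slower per attempt, so a larger `step` may be more practical.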
Checklist
Before creating a PR, run through this checklist and mark each as complete.
Release notes
If this PR introduces a change that affects users and needs to be mentioned in the release notes,
please add a brief note that summarizes the change.