Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move ConfigMap updating methods into e2e/framework #2346

Merged
merged 3 commits into from
Apr 18, 2018

Conversation

andrewloux
Copy link
Contributor

@andrewloux andrewloux commented Apr 13, 2018

What this PR does / why we need it:

  1. Problems with how the nginx-configuration configMap are updated in the e2e tests include: making the e2e test runs unreliable due to state leakage, and given that developers use the same controller deployment to test things during development - this makes the development workflow somewhat unpredictable as well.
  2. The nginx-configuration configMap used by the nginx controller within the ingress-nginx namespace is updated and toggled back during several e2e tests. This PR moves that code up into the e2e/framework to make it easier to reuse across these tests.
  3. Currently, in e2e tests, developers attempt to reset the configMap to an initial state in the AfterEach handler, like so:
    updateConfigmap(setting, "/.well-known/acme-challenge", f.KubeClientSet)
    , however after running the e2e tests, if you were starting with an empty configMap, and then ran the no_auth_locations test for example - you would be left with configMap data different from when you started.
  4. This PR attempts to address this by introducing helper methods to make it easier to reset the configMap to its initial state.

TL;DR: The way the nginx-configuration configMap is updated and reset in the e2e tests leaks state! Issa fix.

Special notes for your reviewer:

  • After talking with (my close personal friend) @ElvinEfendi, we realized that maybe a better alternative approach would ultimately be to decouple the e2e tests from the developer workflow, by having the e2e tests spawn an ephemeral ingress deployment for the lifecycle of the tests - similarly to how the backends are spawned just for the lifecycle of the tests. E.g. EchoDeployment
  • However, this is a step in the right direction, and we could always follow up with a PR with the ideas mentioned above.

Let me know what you guys think - happy to iterate or rework this.

cc @ElvinEfendi @zrdaley

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please email the CNCF helpdesk: [email protected]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 13, 2018
@andrewloux
Copy link
Contributor Author

@k8s-ci-robot yo i signed

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Apr 13, 2018
@aledbf
Copy link
Member

aledbf commented Apr 13, 2018

@andrewlouis93 thank you for doing this.

@aledbf
Copy link
Member

aledbf commented Apr 13, 2018

by having the e2e tests spawn an ephemeral ingress deployment for the lifecycle of the tests - similarly to how the backends are spawned just for the lifecycle of the tests. E.g. EchoDeployment

This is a great idea. Maybe this would enable to run the e2e tests in parallel? (the issue now is being fixed in this PR)

@aledbf aledbf self-assigned this Apr 13, 2018
@andrewloux
Copy link
Contributor Author

Maybe this would enable to run the e2e tests in parallel?

Definitely 🙌 Would love to drive down that ~15 minute run.

@codecov-io
Copy link

codecov-io commented Apr 13, 2018

Codecov Report

Merging #2346 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #2346      +/-   ##
=========================================
- Coverage   39.12%   39.1%   -0.02%     
=========================================
  Files          73      73              
  Lines        5204    5204              
=========================================
- Hits         2036    2035       -1     
- Misses       2878    2880       +2     
+ Partials      290     289       -1
Impacted Files Coverage Δ
internal/watch/file_watcher.go 80.76% <0%> (-3.85%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c37ed7e...882a99c. Read the comment docs.

return nil, err
}

time.Sleep(1 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need to sleep here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover from building off of the update method! Nice catch, will delete :)

@@ -259,6 +259,80 @@ func (f *Framework) matchNginxConditions(name string, matcher func(cfg string) b
}
}

// GetNginxConfigMap gets ingress-nginx's nginx-configuration map
func (f *Framework) GetNginxConfigMap() (*v1.ConfigMap, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function does not seem to be used in anywhere else outside of this package, IMHO we should not export it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I feel strongly about doing that since even if we don't namesapce this method to the framework struct, the methods are still usable in the tests. For example, see how UpdateDeployment is used here: https://github.com/kubernetes/ingress-nginx/blob/master/test/e2e/lua/dynamic_configuration.go#L89.

The benefit of namespacing it to framework is that we wouldn't need to pass in a KubeClientSet like we need to when calling UpdateDeployment. Let me know what you think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewlouis93 I just meant declare it as a private function i.e

func (f *Framework) getNginxConfigMap()

that way you could also delete the comment

--

but really not important - I don't mind if you leave this as it is

return err
}

if config == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances can this be nil while err is nil too?

@andrewloux andrewloux force-pushed the e2e-config-update-refactor branch from 0a9977a to ba402be Compare April 13, 2018 18:46
bi := buildBasicAuthIngressWithSecondPath(host, f.Namespace.Name, s.Name, noAuthPath)
ing, err := f.EnsureIngress(bi)
Expect(err).NotTo(HaveOccurred())
Expect(ing).NotTo(BeNil())

if defaultNginxConfigMapData == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just revert updateConfigmap(setting, noAuthPath, f.KubeClientSet) with updateConfigmap(setting, "", f.KubeClientSet) in AfterEach instead of introducing a new logic? If we want to be more restrictive we can also always reset configmap at https://github.com/kubernetes/ingress-nginx/blob/master/test/e2e/framework/framework.go#L107

@ElvinEfendi
Copy link
Member

I suggest

  1. We keep the existing updateConfigmap implementation but just move it to the framework
  2. Fix the leakage at
    updateConfigmap(setting, "/.well-known/acme-challenge", f.KubeClientSet)
    by using updateConfigmap(setting, "", f.KubeClientSet)
  3. (alternative to 2.) reset configmap at https://github.com/kubernetes/ingress-nginx/blob/master/test/e2e/framework/framework.go#L107 iff configmap has been changed

@aledbf
Copy link
Member

aledbf commented Apr 17, 2018

@andrewlouis93 please rebase

…fault nginx-configuration state between tests

Group sublogic
@andrewloux andrewloux force-pushed the e2e-config-update-refactor branch 2 times, most recently from e2501ec to 8d479bd Compare April 17, 2018 19:10
bi := buildBasicAuthIngressWithSecondPath(host, f.Namespace.Name, s.Name, noAuthPath)
ing, err := f.EnsureIngress(bi)
Expect(err).NotTo(HaveOccurred())
Expect(ing).NotTo(BeNil())

err = f.UpdateNginxConfigMapData(setting, noAuthPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any specific reason you move this down below EnsureIngress?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason here, could move it back up

@@ -41,16 +39,19 @@ var _ = framework.IngressNginxDescribe("Proxy Protocol", func() {
BeforeEach(func() {
err := f.NewEchoDeployment()
Expect(err).NotTo(HaveOccurred())

err = f.UpdateNginxConfigMapData(setting, "false")
Expect(err).NotTo(HaveOccurred())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? I could not find the corresponding updateConfigmap for this

Copy link
Contributor Author

@andrewloux andrewloux Apr 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's happening in the AfterEach handler in the current implementation: https://github.com/kubernetes/ingress-nginx/pull/2346/files#diff-271db20be8cbad3b37eafe10f4eacbcfL47

It seems like the test writer just wanted to default the setting to false on all tests, and explicitly set true at the beginning of tests where needed - so thought it was appropriate to move it into the Before handler.

@@ -210,6 +210,7 @@ var _ = framework.IngressNginxDescribe("Dynamic Configuration", func() {
return err
})
Expect(err).NotTo(HaveOccurred())
time.Sleep(5 * time.Second)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, the test is sometimes run with only 1 replica - since it doesn't wait for the 2nd replica to spin up.

The only time it would work is when this test is run before this current test - since already 2 replicas are available at that point. Shout out to @zrdaley for helping troubleshoot this flaky test :)

Should invest in putting together the ephemeral ingresses for each test - otherwise the e2e tests will being to get flaky very quickly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewlouis93 maybe you can use something like this to wait for the replica count you expect

Copy link
Contributor Author

@andrewloux andrewloux Apr 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm interesting.

So this sleep is actually happening right after an UpdateDeployment incantation: https://github.com/kubernetes/ingress-nginx/pull/2346/files#diff-8700c46a8379c603b5e41ccbaff6b71eR206

But it didn't seem like the number of replicas was updated. Will take a closer look.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewlouis93 that should return early only if there is an error updating the deployment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I figured as much, edited my comment. Will take a closer look, but feel free to go ahead and approve - I think I'm satisfied with the state its in right now.

Andrew Louis added 2 commits April 18, 2018 11:48
@andrewloux andrewloux force-pushed the e2e-config-update-refactor branch from 5b58de8 to 882a99c Compare April 18, 2018 15:48
@aledbf
Copy link
Member

aledbf commented Apr 18, 2018

@andrewlouis93 after we merge this PR I will review all the timeouts in the tests and fix the random behavior

@andrewloux
Copy link
Contributor Author

Sounds good, happy to give you a hand and keep looking into it as well :)

@aledbf
Copy link
Member

aledbf commented Apr 18, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 18, 2018
@aledbf
Copy link
Member

aledbf commented Apr 18, 2018

/approve

@aledbf
Copy link
Member

aledbf commented Apr 18, 2018

@andrewlouis93 thanks!

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aledbf, andrewlouis93

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 54874d4 into kubernetes:master Apr 18, 2018
@aledbf
Copy link
Member

aledbf commented Apr 18, 2018

@andrewlouis93 just in case we will start requesting squashed commits in PRs

@andrewloux
Copy link
Contributor Author

Sounds good, will do it moving forwards :) Making some headway on looking into that bug we're patching by sleeping. The rabbit hole is deep 🐇

@aledbf
Copy link
Member

aledbf commented Apr 18, 2018

@andrewlouis93 #2374

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants