Catch-all server_name _ block of /etc/nginx/nginx.conf is being set to the upstream of the last ingress processed #8823

Closed
ericdstein opened this issue Jul 14, 2022 · 13 comments
Assignees
Labels: kind/bug, lifecycle/rotten, needs-priority, needs-triage

Comments

@ericdstein

ericdstein commented Jul 14, 2022

What happened:

Ingress variables and $proxy_upstream_name in the catch-all server_name _ block of /etc/nginx/nginx.conf are set from the upstream of the last ingress processed that defines a default backend in its spec. As a result, requests for any URL path or host that the Ingress-NGINX controller doesn't understand are sent to this upstream instead of the global catch-all default backend.

	## start server _
	server {
		server_name _ ;

		<removed>

		location / {

			set $namespace      "kube-system";
			set $ingress_name   "randomservice";
			set $service_name   "randomservice";
			set $service_port   "80";
			set $location_path  "/";

			<removed>
			
			set $proxy_upstream_name "kube-system-randomservice-80";

			<removed>

		}
	}
	## end server _

I narrowed this down to the change in PR #1379, but it only started affecting us after PR #6576.

In the createServers func of internal/ingress/controller/controller.go, the default server (server_name _) and its root location are initialized and added to the server map. Each ingress is then processed and added to the server map. While handling the special "catch all" case (an Ingress with a backend but no rule), the pointer servers[defServerName].Locations[0] is assigned to the defLoc var, and defLoc is then updated with the backendUpstream and ing of the ingress currently being processed. Because defLoc is a pointer, the original servers[defServerName].Locations[0] is updated as well, so the global default catch-all backend (server_name _) ends up pointing at the upstream of the last ingress processed.

// special "catch all" case, Ingress with a backend but no rule
defLoc := servers[defServerName].Locations[0] // defLoc is assigned a pointer
defLoc.Backend = backendUpstream.Name
defLoc.Service = backendUpstream.Service
defLoc.Ingress = ing
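
The aliasing described above can be reproduced in isolation. The following is a minimal sketch, not the controller's code: Location, Server, newServers, processAliased, and processCopied are hypothetical, simplified stand-ins, and the value copy in processCopied is only one possible way to avoid the shared write, not a fix proposed in this issue.

```go
package main

import "fmt"

// Location and Server are hypothetical, simplified stand-ins for the
// controller's types; only the fields relevant to the aliasing are kept.
type Location struct {
	Path    string
	Backend string
}

type Server struct {
	Hostname  string
	Locations []*Location
}

const (
	defServerName   = "_"
	defUpstreamName = "upstream-default-backend"
)

// newServers builds a server map that holds only the catch-all server.
func newServers() map[string]*Server {
	return map[string]*Server{
		defServerName: &Server{
			Hostname:  defServerName,
			Locations: []*Location{&Location{Path: "/", Backend: defUpstreamName}},
		},
	}
}

// processAliased mirrors the pattern quoted above: defLoc is a pointer into
// the shared map, so every write changes the catch-all location itself.
func processAliased(servers map[string]*Server, upstreams []string) {
	for _, name := range upstreams {
		defLoc := servers[defServerName].Locations[0] // pointer, not a copy
		defLoc.Backend = name
	}
}

// processCopied dereferences the pointer first, so each ingress gets its own
// Location value and the catch-all location is left untouched.
func processCopied(servers map[string]*Server, upstreams []string) []Location {
	perIngress := make([]Location, 0, len(upstreams))
	for _, name := range upstreams {
		defLoc := *servers[defServerName].Locations[0] // value copy
		defLoc.Backend = name
		perIngress = append(perIngress, defLoc)
	}
	return perIngress
}

func main() {
	upstreams := []string{"default-app-a-80", "kube-system-randomservice-80"}

	aliased := newServers()
	processAliased(aliased, upstreams)
	fmt.Println("aliased:", aliased[defServerName].Locations[0].Backend)

	copied := newServers()
	processCopied(copied, upstreams)
	fmt.Println("copied: ", copied[defServerName].Locations[0].Backend)
}
```

With the aliased variant, the catch-all location ends up with the backend of the last upstream in the slice, which matches the behavior reported here. With the copied variant it keeps upstream-default-backend.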

What you expected to happen:

Catch-all server_name _ block of /etc/nginx/nginx.conf to be set so that any traffic the Ingress-NGINX controller doesn't understand is sent to the global catch-all default backend.

	## start server _
	server {
		server_name _ ;

		<removed>

		location / {

			set $namespace      "";
			set $ingress_name   "";
			set $service_name   "";
			set $service_port   "";
			set $location_path  "/";

			<removed>
			
			set $proxy_upstream_name "upstream-default-backend";

			<removed>

		}
	}
	## end server _
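
To compare the actual and expected output on a running controller, the catch-all upstream can be read out of the rendered config. This is a rough sketch under assumptions: catchAllUpstream is a hypothetical helper, it relies only on the "## start server _" and "## end server _" markers and the set $proxy_upstream_name line shown in the snippets above, and it returns the first match inside the block rather than parsing nginx syntax.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// catchAllUpstream returns the first value assigned to $proxy_upstream_name
// between the "## start server _" and "## end server _" markers.
// It returns "" when the block or the directive is not found.
func catchAllUpstream(conf string) string {
	const prefix = "set $proxy_upstream_name "
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(conf))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "## start server _":
			inBlock = true
		case line == "## end server _":
			inBlock = false
		case inBlock && strings.HasPrefix(line, prefix):
			value := strings.TrimSpace(strings.TrimPrefix(line, prefix))
			value = strings.TrimSuffix(value, ";")
			return strings.Trim(value, `"`)
		}
	}
	return ""
}

func main() {
	// Trimmed-down version of the block from this report.
	sample := `
	## start server _
	server {
		server_name _ ;
		location / {
			set $proxy_upstream_name "kube-system-randomservice-80";
		}
	}
	## end server _
`
	got := catchAllUpstream(sample)
	fmt.Println("catch-all upstream:", got)
	if got != "upstream-default-backend" {
		fmt.Println("catch-all server is not using the global default backend")
	}
}
```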

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.): v1.2.0

Kubernetes version (use kubectl version): v1.20.12

Environment:

  • Cloud provider or hardware configuration: Qemu VM on bare metal

  • OS (e.g. from /etc/os-release):

    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    
  • Kernel (e.g. uname -a): Linux t7819mws0001 5.15.13-1.el7.elrepo.x86_64 #1 SMP Tue Jan 4 17:33:28 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

  • How was the ingress-nginx-controller installed:

    • If helm was not used, then copy/paste the complete precise command used to install the controller, along with the flags and options used
        containers:
        - args:
          - /nginx-ingress-controller
          - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
          - --configmap=$(POD_NAMESPACE)/nginx-load-balancer-conf
          - --enable-ssl-passthrough
          - --update-status=false
          - --enable-ssl-chain-completion=false
          - --default-ssl-certificate=kube-system/ingresscontroller-ssl-secret
          - --watch-ingress-without-class=true
    

How to reproduce this issue:

I will look at reproducing in minikube. I believe the cause is fairly clear in that the values for servers[defServerName].Locations[0] are reassigned to values for each ingress while they are being processed.

Anything else we need to know:

All our ingresses are standardized and are defined similarly to:

kind: Ingress
spec:
  defaultBackend:
    service:
      name: {{ app_name }}
      port:
        number: {{ port }}
  rules:
    - host: {{ app_name }}.api-{{ location_name }}.{{ domain_name }}.com
      http:
        paths:
          - backend:
              service:
                name: {{ app_name }}
                port:
                  number: {{ port }}
            path: /
            pathType: ImplementationSpecific
    - host: {{ app_name }}.api.{{ domain_name }}.com
      http:
        paths:
          - backend:
              service:
                name: {{ app_name }}
                port:
                  number: {{ port }}
            path: /
            pathType: ImplementationSpecific
  tls:
    - hosts:
        - {{ app_name }}.api-{{ location_name }}.{{ domain_name }}.com
        - {{ app_name }}.api.{{ domain_name }}.com
      secretName: {{ secret_name }}
status:
  loadBalancer: {}
items: []

@ericdstein added the kind/bug label on Jul 14, 2022
@k8s-ci-robot
Contributor

@ericdstein: This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-triage and needs-priority labels on Jul 14, 2022
@longwuyuan
Contributor

@harry1064 or @Volatus, this may interest you

@harry1064
Contributor

@longwuyuan
I will take a look
/assign

@ericdstein
Author

ericdstein commented Jul 15, 2022

It looks like we are running into this because we define both defaultBackend and rules in our ingresses, with the defaultBackend pointing to the same service as the rules (see the ingress spec in the original comment).

While defining both defaultBackend and rules is not usually necessary, it is possible. Doing so causes ingress-nginx to select the last ingress processed with a default backend as the global catch-all default backend, overriding what is passed in with the --default-backend-service command line argument and causing unexpected behavior.

As I mentioned, it was actually #6576 that affected us, because the assignment to defLoc was moved outside of the condition that rules == 0.
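
To illustrate why the rules == 0 condition matters for ingresses like ours, here is a small sketch. It is not the code from either PR: Ingress, Location, and applyDefaultBackend are hypothetical names, and the requireNoRules flag only exists to show both behaviors side by side.

```go
package main

import "fmt"

// Ingress and Location are hypothetical, simplified types used only for
// this illustration; the real controller types carry many more fields.
type Ingress struct {
	Name           string
	DefaultBackend string   // upstream name derived from spec.defaultBackend, "" if unset
	RuleHosts      []string // one entry per host under spec.rules
}

type Location struct {
	Backend string
	Ingress string
}

// applyDefaultBackend overrides the catch-all location only when the ingress
// defines a default backend and, if requireNoRules is set, has no rules.
func applyDefaultBackend(defLoc *Location, ing Ingress, requireNoRules bool) {
	if ing.DefaultBackend == "" {
		return
	}
	if requireNoRules && len(ing.RuleHosts) > 0 {
		return // ingress has rules, so leave the global catch-all alone
	}
	defLoc.Backend = ing.DefaultBackend
	defLoc.Ingress = ing.Name
}

func main() {
	// An ingress shaped like ours: defaultBackend and rules are both set.
	ing := Ingress{
		Name:           "randomservice",
		DefaultBackend: "kube-system-randomservice-80",
		RuleHosts:      []string{"randomservice.api.example.com"},
	}

	guarded := Location{Backend: "upstream-default-backend"}
	applyDefaultBackend(&guarded, ing, true)
	fmt.Println("with rules == 0 guard:   ", guarded.Backend)

	unguarded := Location{Backend: "upstream-default-backend"}
	applyDefaultBackend(&unguarded, ing, false)
	fmt.Println("without rules == 0 guard:", unguarded.Backend)
}
```

With the guard, an ingress that sets both fields leaves the catch-all location on upstream-default-backend. Without it, the ingress's own upstream takes over, which is the behavior described in this issue.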

@ericdstein
Author

I believe #8473 is the same issue.

@bmv126

bmv126 commented Jul 15, 2022

@ericdstein

In your ingress spec, what is the behavior if you change the path below from "/" to something else?

    - host: {{ app_name }}.api-{{ location_name }}.{{ domain_name }}.com
      http:
        paths:
          - backend:
              service:
                name: {{ app_name }}
                port:
                  number: {{ port }}
            path: /
            pathType: ImplementationSpecific
    - host: {{ app_name }}.api.{{ domain_name }}.com
      http:
        paths:
          - backend:
              service:
                name: {{ app_name }}
                port:
                  number: {{ port }}
            path: /
            pathType: ImplementationSpecific

Also, I think you need to check under the respective server_name entries in nginx.conf:
{{ app_name }}.api-{{ location_name }}.{{ domain_name }}.com
and
{{ app_name }}.api.{{ domain_name }}.com

@longwuyuan
Contributor

longwuyuan commented Jul 15, 2022

@harry1064 are you on the Kubernetes Slack in the ingress-nginx-dev channel? If yes, can you ping me there, please? Thanks.

@harry1064
Contributor

@longwuyuan Yes, sure.

@luigizhou

Thank you for opening this issue, @ericdstein. I was going to do it myself today if no one was going to pick up #8473 again after my comment :)

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Oct 16, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Nov 15, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".


@k8s-ci-robot closed this as not planned on Dec 15, 2022