Large working ingress working with Modsecurity set to DetectionOnly, but ignored when set On #10115
@markhley The relevant info visible is the DaemonSet choice with some 8 replicas. What is not available is the log messages of the controller pods relating to the ModSecurity use-case and reload. Increased resource usage and contention is obviously expected given the very nature of a WAF and 8 pods with live traffic, but specifics can only be known after you post the logs of the controller pods. Please redact, as info you may want to hide is visible here, then post the logs to a gist and provide the link here. /remove-kind bug |
Additionally, splitting those hostname + path rule combinations into individual ingress objects will reduce the size of the dataset involved in reconciliations. If possible, please do change to small dedicated ingress objects. |
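As a sketch of that suggestion (hostnames, namespace, secret, and service names here are placeholders, not from the cluster in question), the one large multi-host ingress would become one small ingress per host:

```yaml
# Hypothetical example: instead of one Ingress listing customer1.com,
# customer2.com, ..., each host gets its own dedicated Ingress object,
# so a change to one host only reconciles a small object.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: customer1            # one Ingress per host
  namespace: prod            # placeholder namespace
spec:
  ingressClassName: nginx
  tls:
    - hosts: [customer1.com]
      secretName: customer1-tls    # per-host certificate secret
  rules:
    - host: customer1.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: customer1-svc    # placeholder backend service
                port:
                  number: 80
```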
I will try splitting the ingress objects into individual ingress definitions. I will have to schedule a prod change/test window in order to do so. I will let you know the results. |
OK. Hopefully it will reduce resource usage in reconcile and maybe mitigate the problem. |
/triage needs-information |
/assign @strongjz |
As far as logs go, our sites are hit at large volume and the logs are hard to parse. I may have to turn off access logs to see if we can get better diagnostics for this situation on the error. I will wait to see the results this weekend when we split the ingresses into individual files. If that fails, I will try to get the logs bundled. |
Over the weekend, we reconfigured our large ingress into individual ingress definitions. This change was successful with ModSecurity set to "SecRuleEngine DetectionOnly": all URLs responded successfully. We then tried ModSecurity set to "SecRuleEngine On" and immediately started having the same issues with all our ingresses. The primary relevant error in the log is: "[lua] certificate.lua:244: call(): certificate not found, falling back to fake certificate for hostname: customer1.com, context: ssl_certificate_by_lua*". I am attaching the error log file for inspection. |
We discussed this on the community call and believe we found the issue. ModSecurity is working properly, but it is blocking the controller's internal request to update nginx.conf. "POST /configuration/servers HTTP/1.1", host: "127.0.0.1:10246" is the port and URI the controller uses to update the configuration when there are changes, so enabling ModSecurity blocks that request for being too large. We are going to put a fix in 1.8.2 to disable ModSecurity for that server block in the nginx.tmpl template.
2023/06/26 05:24:50 [error] 1912#1912: *32231 [client 127.0.0.1] ModSecurity: Access denied with code 400 (phase 2). Matched "Operator `Eq' with parameter `0' against variable `REQBODY_ERROR' (Value: `1' ) [file "/etc/nginx/modsecurity/modsecurity.conf"] [line "76"] [id "200002"] [rev ""] [msg "Failed to parse request body."] [data "Request body excluding files is bigger than the maximum expected."] [severity "2"] [ver ""] [maturity "0"] [accuracy "0"] [hostname "127.0.0.1"] [uri "/configuration/servers"] [unique_id "168775709078.157738"] [ref "v0,1"], client: 127.0.0.1, server: , request: "POST /configuration/servers HTTP/1.1", host: "127.0.0.1:10246"
W0626 05:24:50.571596 7 controller.go:236] Dynamic reconfiguration failed (retrying; 6 retries left): unexpected error code: 400 |
Here is the server block in the template: https://github.com/kubernetes/ingress-nginx/blob/main/rootfs/etc/nginx/template/nginx.tmpl#L707; if you set "modsecurity off;" in that block, the internal request is no longer blocked. |
/triage accepted |
Sorry I didn't attend; I had typical timezone confusion. Thanks for bringing this to my attention. |
|
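For reference, rule 200002 in the log above fires because the internal POST body exceeds ModSecurity's configured request-body limit. An alternative mitigation, sketched here with assumed limit values (this is not the fix the maintainers shipped), would be raising those limits in /etc/nginx/modsecurity/modsecurity.conf:

```
# Assumed values for illustration; the stock defaults are much lower.
# SecRequestBodyLimit caps the total request body ModSecurity buffers;
# SecRequestBodyNoFilesLimit caps bodies containing no file uploads,
# which is what the controller's JSON POST to /configuration/servers is.
SecRequestBodyLimit 104857600
SecRequestBodyNoFilesLimit 104857600
```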
On Sunday, we are going to try your hotfix by replacing the nginx.tmpl "server" section with a custom configmap and setting "modsecurity off;" per your recommendation. I will update the ticket with the results. |
@markhley If this fixes it, and you are willing to contribute, can you please open a PR on nginx.tmpl to add the check as well? Thanks! |
@rikatz - Not sure which check you are referring to, but here is the adjustment we are testing Sunday in nginx.tmpl, in the "default server" section around line 707. I will do a PR if you want, if we have success. |
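The adjustment described is presumably along these lines (a sketch only; the surrounding server directives are paraphrased, and the exact contents of nginx.tmpl around line 707 may differ):

```
# Inside the internal default server block in nginx.tmpl that serves the
# controller's own endpoints on 127.0.0.1:10246 (/configuration/*),
# hard-code ModSecurity off so the controller's reload POSTs are never
# inspected or blocked by WAF rules:
server {
    listen 127.0.0.1:10246;    # controller's internal configuration endpoint
    modsecurity off;           # hard-coded override being tested
    # ...rest of the existing default-server configuration unchanged...
}
```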
Success - with ModSecurity turned off in the nginx.tmpl "server" section, we were able to set ModSecurity to "SecRuleEngine On" and all ingresses responded appropriately. I tested a mock WAF violation against a site and received the expected 403 from ModSecurity. Thanks for all the help on this. If you want me to submit this change as a PR, let me know and I will go through the documented process. |
My assumption on the check is that you want something like the following, which I found in another part of nginx.tmpl, instead of a hard-coded override. |
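The kind of guarded check being referred to is presumably something like this (a sketch; the template variable name $all.Cfg.EnableModsecurity is an assumption based on how other nginx.tmpl sections are gated):

```
{{ if $all.Cfg.EnableModsecurity }}
# Only emit the override when ModSecurity is globally enabled, instead
# of hard-coding "modsecurity off;" unconditionally in the template:
modsecurity off;
{{ end }}
```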
Correct. I'm adding the fix here. |
@markhley Actually, do you want to make a PR and do a contribution? :) Otherwise I can deal with it, but the idea is what you did; if you want, send the PR and I can tell you if we can get it merged. Thanks for the contribution and the finding here! |
I will work on PR to get experience. My first open source PR |
We are also impacted by this. @markhley will you have the time to send a PR? |
Tim,
I have discussed this with my boss and he wants us to submit the PR as an organization, but he has yet to go through the "CLA Manager" process documented here: https://easycla.lfx.linuxfoundation.org/#/
I need to wait until he does this. If you need it sooner, feel free to submit the PR on your own. I'd like to do it to get familiar with the process, but I need to make sure our company does their end.
Mark |
OK I understand. I don't want to discourage your participation and your analysis has been invaluable - but if you run into blockers I kindly ask for you to post here and we'll send a fix over. |
Tim,
I have the relevant change in the main branch of my forked repo: https://github.com/markhley/ingress-nginx
I'm just not sure how to start a PR against https://github.com/kubernetes/ingress-nginx/tree/main
Any help would be appreciated.
Thanks,
Mark Ley |
Clicking the Contribute button I think is the easiest way:
https://user-images.githubusercontent.com/4631304/259766440-07590d0b-cd34-4af9-9770-b0e9014afe3c.png |
I will be testing the updated logic around the ModSecurity turn-off Sunday evening. If all goes well, I will open the PR Monday. |
/close The PR above should have fixed it; feel free to reopen if this is still a problem. |
@rikatz: Closing this issue.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What happened:
We have several Azure AKS clusters installed using NGINX as the ingress controller and using the built-in ModSecurity for WAF functionality. All clusters have ModSecurity set to "SecRuleEngine On" and all are fully functional except our largest production environment. The major difference is the large ingress definition in the production cluster: it has many host definitions, each with a certificate secret assigned. The large ingress definition works fine with ModSecurity set to "SecRuleEngine DetectionOnly", but once we set ModSecurity to "SecRuleEngine On", the ingress definition is ignored and all sites are down. While ModSecurity is set to "SecRuleEngine On" in this production environment, if I cut the number of hosts down to just a few, it works again. When the original large ingress is in place, I can only get sites up with ModSecurity set to "SecRuleEngine DetectionOnly".
What you expected to happen:
I expect the large ingress to work, with all sites responding, when ModSecurity is set to "SecRuleEngine On".
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version): ingress-nginx helm chart 4.5.2, controller v1.6.4 (from helm ls -A | grep -i ingress)
Kubernetes version (kubectl version): v1.25.6
Environment: Azure AKS cluster
Others (kubectl describe ... of any custom configmap(s) created and in use): see attached configmap yaml files
How to reproduce this issue:
Anything else we need to know: