-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue with SNI and latest kong ingress #510
Comments
Forgot to mention, 0.6.0 controller didn't have this problem. Is it possible a migration screwed it up somehow? Is this even database related? |
Just tested this on our GKE cluster, which is pretty much pristine. To summarize, two kong ingress controllers installed into different namespaces, using latest helm chart.
External is set to not install CRDs via:
Two ingresses:
Both ingresses are pretty standard loopback admin configuration to expose admin api through kong. Both ingresses use the same kong-internal class:
Ingress controller throws the following error:
EDIT: this is a pretty significant issue, because apis essentially stop working due to untrusted certificate that kong falls back on. |
Some more information:
|
Thanks for the detailed report. I'll try to reproduce this and report back here soon. |
This happens because of this change:9bd6a8c#diff-b1c61ff32890f6c8189e0ff5be6e4cc8R790 What happens is the same certificate is being created again, with a different ID. And then the SNI is being associated with the new certificate, which violates a constraint as it is already associated with the old certificate. This case was foreseen before and is mentioned as a TODO comment in the underlying library: https://github.com/hbagdi/deck/blob/beaced6e32ee00658e1580dc027d1899dd5681f3/diff/diff.go#L142 To fix this problem, you have to manually delete the old certificate, so that the controller can create the new certificate. This has to be done only once between 0.6 to 0.7 upgrade.
I'd like to automatically handle this in the controller but I don't think that's possible in any way. Hope this helps. |
I am running into the same issue when upgrading the ingress-controller. But when I delete the certificates or the snis it seems to sync everyting. I also get
|
I'm not sure I follow, what do you mean? Do you see this behavior even when you do a clean installation? @dcherniv @RolandG A couple more questions:
|
@hbagdi Correct the issue appears on the clean installation of 0.7.0 ingress controller via latest helm chart.
Let me re-create the issue. I helm del purged the existing kong ingresses. Deleted the db and installed both from scratch:
Internal ingress controller is installed and running by itself:
Installed the second controller into kong-external namespace:
Logs from kong-internal controller which now fronts both external and internal admin ingress resources which i posted earlier in the thread:
In this particular case the secret that triggers the error is in kong-external namespace:
So to summarize to reproduce reliably.
This should about reproduce the issue. |
Do you annotate these two Ingress resource with Ingress.class? |
@hbagdi
Details of each ingress resource. kong-external:
kong-internal:
The only difference between the two ingress resources is the path on which they are exposed. |
Here are the flags:
kong-external:
Environment variables:
Internal:
|
You have ingress class as |
@hbagdi yes that is intended. I want both external and internal admin apis to be exposed on the internal controller. I don't want external api exposed publicly. |
Can you remove the |
@hbagdi as in to test? or permanently? These are valid ingress specs. This has the effect of disabling TLS on the ingress as expected. It doesn't really solve the problem, because TLS is a must.
Works on port 80 as expected, fails on port 443. But the error does indeed go away from kong logs. |
Let me explain. You have two Ingress resources, both being satisfied by the same Kong, (kong-internal).
The thing is, you only need one of those sections. TLS configuration is not tied to a specific Ingress rule, the configuration of TLS certificate and SNI is global, meaning it applies to all requests. So, if you specify the TLS section once, that should be enough. Unless, |
@hbagdi That makes sense but its not quite right. the paths for ingresses are different. They are different ingress resources installed by different applications in the different namespaces. So with all the above in mind. If developer A exposes their service on apikg-dev.example.com/v2/ServiceA with wild.example.com-2019 in the TLS section and developer B exposes their service on apikg-dev.example.com/v2/ServiceB with wild.example.com-2019 then this configuration appears to not be supported. |
I understand your problem here, but please do note that TLS has nothing to do with Ingress rules usually. You can't control TLS properties of an endpoint based on different paths.
Indeed. I didn't say the Ingress resources are the same, I said that the tls sections of the two Ingress resources are exactly the same.
This is a common separation of responsibilities that we see with a lot of users. An alternate approach I've seen is to have a namespace in the cluster that is used by ops folks and you create an Ingress rule with all the TLS configurations in it. And then, the developers don't even need to specify TLS section in their Ingress rules at all. I do understand that this is broken for you at the moment and we will try to fix it in a patch release. |
@hbagdi Thank you for your help.
I would like to avoid doing that, because it creates an organizational barrier. But it is definitely an approach that one can take. I also understand that it may be non-trivial to fix and i'm not asking for immediate fix. I'm quite happy staying on 0.6.2 for the time being. Again thank you for your help and support. |
The root cause here is this change: Kong/deck@890404c. @dcherniv In your case, there is a collision. |
Ideal solution would be to refer to secrets from separate namespace so that it is guaranteed to be unique. |
I'd highly recommend to change your configuration to keep all TLS certificates in one namespace and create Ingress resources with Meanwhile, I'm working on a solution for this. I'll ping you once I've something. Thanks for the promptness here, appreciate it! |
Hey @dcherniv, Can you tests out a change for this fix? I've created a It contains the code from current master branch plus this patch: diff --git a/internal/ingress/controller/parser/parser.go b/internal/ingress/controller/parser/parser.go
index 45432ae..1eb9c71 100644
--- a/internal/ingress/controller/parser/parser.go
+++ b/internal/ingress/controller/parser/parser.go
@@ -378,9 +378,52 @@ func (p *Parser) fillConsumersAndCredentials(state *KongState) error {
return nil
}
+func filterHosts(secretNameToSNIs map[string][]string, hosts []string) []string {
+ hostsToAdd := []string{}
+ seenHosts := map[string]bool{}
+ for _, hosts := range secretNameToSNIs {
+ for _, host := range hosts {
+ seenHosts[host] = true
+ }
+ }
+ for _, host := range hosts {
+ if !seenHosts[host] {
+ hostsToAdd = append(hostsToAdd, host)
+ }
+ }
+ return hostsToAdd
+}
+
+func processTLSSections(tlsSections []networking.IngressTLS,
+ namespace string, secretNameToSNIs map[string][]string) {
+ // TODO: optmize: collect all TLS sections and process at the same
+ // time to avoid regenerating the seen map; or use a seen map in the
+ // parser struct itself.
+ for _, tls := range tlsSections {
+ if len(tls.Hosts) == 0 {
+ continue
+ }
+ if tls.SecretName == "" {
+ continue
+ }
+ hosts := tls.Hosts
+ secretName := namespace + "/" + tls.SecretName
+ hosts = filterHosts(secretNameToSNIs, hosts)
+ if secretNameToSNIs[secretName] != nil {
+ hosts = append(hosts, secretNameToSNIs[secretName]...)
+ }
+ secretNameToSNIs[secretName] = hosts
+ }
+}
+
func (p *Parser) parseIngressRules(
ingressList []*networking.Ingress) (*parsedIngressRules, error) {
+ sort.SliceStable(ingressList, func(i, j int) bool {
+ return ingressList[i].CreationTimestamp.Before(
+ &ingressList[j].CreationTimestamp)
+ })
+
// generate the following:
// Services and Routes
var allDefaultBackends []networking.Ingress
@@ -396,20 +439,7 @@ func (p *Parser) parseIngressRules(
}
- for _, tls := range ingressSpec.TLS {
- if len(tls.Hosts) == 0 {
- continue
- }
- if tls.SecretName == "" {
- continue
- }
- hosts := tls.Hosts
- secretName := ingress.Namespace + "/" + tls.SecretName
- if secretNameToSNIs[secretName] != nil {
- hosts = append(hosts, secretNameToSNIs[secretName]...)
- }
- secretNameToSNIs[secretName] = hosts
- }
+ processTLSSections(ingressSpec.TLS, ingress.Namespace, secretNameToSNIs)
for i, rule := range ingressSpec.Rules {
host := rule.Host If this works, then I'll polish this change up to fix the problem. |
@hbagdi looks good so far.
I triggered the sync by editing ingresses in both internal and external namespaces. |
Awesome! |
hi @hbagdi |
@elkh510 Can you share your Ingress resources? |
When the same SNI is associated with different Kubernetes Secrets. When using Kong with a database, this results in the same SNI being associated with a different certificate and the controller fails to update the SNI in Kong. This fix reduces the likelihood of this happening for some cases but will not fix the problem for all cases. See Kong/deck@890404c for the root cause of this issue. Fix #510
When the same SNI is associated with different Kubernetes Secrets. When using Kong with a database, this results in the same SNI being associated with a different certificate and the controller fails to update the SNI in Kong. This fix reduces the likelihood of this happening for some cases but will not fix the problem for all cases. See Kong/deck@890404c for the root cause of this issue. Fix #510 From #523
@hbagdi i'd hate to resurrect a closed ticket. I can open a new one if you want.
Update:
|
Yeah, I expected this. The current solution I've put in works but is not perfect. It should never break TLS but will result in transient errors sometimes based on how you update/create your resources. |
Summary
Kong appears to have issues creating SNIs. Our ingresses are sharing the same host, we do path based routing.
Kong Ingress controller version
0.7.0 with postgres DB
Kong or Kong Enterprise version
1.4.3
Kubernetes version
v1.14.9-eks-c0eccc
Environment
uname -a
): n/aWhat happened
When below happens kong starts falling back to default localhost certificate, which makes TLS request fail. After 5-10 minutes, sometimes it recovers, only to then go into the same problem again.
Timing appears random, though i suspect it does it when it updates the configuration.
Expected behavior
Steps To Reproduce
The text was updated successfully, but these errors were encountered: