Wrongly attributed local side in outbound internet connections #1598

Closed · 2opremio opened this issue Jun 17, 2016 · 6 comments
Labels: bug (Broken end user or developer functionality; not working as the developers intended it)

2opremio commented Jun 17, 2016

I was running the service locally to test the 0.16 release (#1587) and was puzzled to see that authfe was talking to the internet:

[screenshot: screen shot 2016-06-17 at 13 26 09]

Then I realized that it's due to the new logging sidecar talking to BigQuery:

vagrant@vagrant-ubuntu-wily-64:~/scope$ docker ps | grep logging
991a8908520e        quay.io/weaveworks/logging                                         "/bin/sh -c 'exec flu"   20 minutes ago      Up 20 minutes                                   k8s_logging.6359526a_authfe-4cptq_default_b6d0e6fd-3478-11e6-a11f-0242ac110004_84cf3379
vagrant@vagrant-ubuntu-wily-64:~/scope$ docker exec -ti k8s_logging.6359526a_authfe-4cptq_default_b6d0e6fd-3478-11e6-a11f-0242ac110004_84cf3379 sh
/home/fluent # netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0      0 authfe-4cptq:47110      lhr25s02-in-f106.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:51680      lhr25s07-in-f10.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:54850      lhr26s02-in-f10.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:47208      lhr25s02-in-f106.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:57374      lhr25s09-in-f10.1e100.net:https ESTABLISHED 
tcp        0      0 authfe-4cptq:54098      lhr26s02-in-f10.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:47396      lhr25s02-in-f10.1e100.net:https ESTABLISHED 
tcp        0      0 authfe-4cptq:53096      lhr26s03-in-f10.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:55212      lhr26s02-in-f10.1e100.net:https ESTABLISHED 
tcp        0      0 authfe-4cptq:51948      10.27.204.4:http        ESTABLISHED 
tcp        0      0 localhost:54352         localhost:24224         ESTABLISHED 
tcp        0      0 authfe-4cptq:47370      lhr25s02-in-f10.1e100.net:https TIME_WAIT   
...

... but the local side is being attributed to a Scope probe instead of the authfe pod (note the connections to hosts like lhr25s01-in-f74.1e100.net from scope-probe-tgeqv)

Report: report.json.gz

2opremio added the bug label on Jun 17, 2016
2opremio commented

More clearly:

[screenshot: screen shot 2016-06-17 at 12 17 25]

2opremio self-assigned this on Jun 23, 2016
2opremio added this to the 0.17.0 milestone on Jun 28, 2016
2opremio assigned 2opremio and unassigned 2opremio on Jul 4, 2016

2opremio commented Jul 5, 2016

There are entries in the network topology indicating that Scope indeed identified short-lived connections between the scope probe (10.0.2.15) and hosts like lhr25s02-in-f10.1e100.net.

I am not sure how that could have happened, but it's certainly not the app's fault.

      ";10.0.2.15;51392": {
        "id": ";10.0.2.15;51392",
        "topology": "endpoint",
        "counters": {},
        "sets": {},
        "adjacency": [
          ";216.58.213.106;443"
        ],
        "edges": {
          ";216.58.213.106;443": {}
        },
        "controls": {},
        "latest": {
          "addr": {
            "timestamp": "2016-06-17T11:13:12.385838862Z",
            "value": "10.0.2.15"
          },
          "copy_of": {
            "timestamp": "2016-06-17T11:13:12.385838862Z",
            "value": ";10.32.0.16;51392"
          },
          "port": {
            "timestamp": "2016-06-17T11:13:12.385838862Z",
            "value": "51392"
          },
          "conntracked": {
            "timestamp": "2016-06-17T11:13:12.192988883Z",
            "value": "true"
          }
        },
        "parents": {},
        "children": null
      },
[...]
      ";216.58.213.106;443": {
        "id": ";216.58.213.106;443",
        "topology": "endpoint",
        "counters": {},
        "sets": {
          "reverse_dns_names": [
            "lhr25s02-in-f10.1e100.net",
            "lhr25s02-in-f106.1e100.net"
          ]
        },
        "adjacency": null,
        "edges": {},
        "controls": {},
        "latest": {
          "conntracked": {
            "timestamp": "2016-06-17T11:13:12.192990336Z",
            "value": "true"
          },
          "addr": {
            "timestamp": "2016-06-17T11:13:12.192984244Z",
            "value": "216.58.213.106"
          },
          "port": {
            "timestamp": "2016-06-17T11:13:12.192984244Z",
            "value": "443"
          }
        },
        "parents": {},
        "children": null
      },

I am going to try to reproduce this and check the conntrack flows. I suspect that, for some reason, flows may be lingering as in #1110 and getting attributed to the wrong container when the service is destroyed and recreated.
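
For illustration, here is a minimal sketch of how an endpoint node like the one above can be decoded and its copy_of entry followed back to the pre-NAT endpoint. The Go types below are made up for this example (only the JSON field names come from the excerpt); this is not Scope's actual code:

package main

import (
    "encoding/json"
    "fmt"
)

// endpointNode mirrors just the fields of the excerpt that matter here;
// it is an illustrative type, not the one Scope uses internally.
type endpointNode struct {
    ID     string                 `json:"id"`
    Latest map[string]latestEntry `json:"latest"`
}

type latestEntry struct {
    Timestamp string `json:"timestamp"`
    Value     string `json:"value"`
}

// originalEndpointID returns the pre-NAT endpoint ID when the node is a
// NAT-generated copy (i.e. it carries a "copy_of" entry), or the node's own ID.
func originalEndpointID(n endpointNode) string {
    if e, ok := n.Latest["copy_of"]; ok {
        return e.Value
    }
    return n.ID
}

func main() {
    // Abbreviated version of the ";10.0.2.15;51392" node from the report above.
    raw := `{
      "id": ";10.0.2.15;51392",
      "latest": {
        "addr":    {"timestamp": "2016-06-17T11:13:12.385838862Z", "value": "10.0.2.15"},
        "copy_of": {"timestamp": "2016-06-17T11:13:12.385838862Z", "value": ";10.32.0.16;51392"},
        "port":    {"timestamp": "2016-06-17T11:13:12.385838862Z", "value": "51392"}
      }
    }`

    var n endpointNode
    if err := json.Unmarshal([]byte(raw), &n); err != nil {
        panic(err)
    }
    fmt.Println(originalEndpointID(n)) // prints ";10.32.0.16;51392"
}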


2opremio commented Jul 5, 2016

I have managed to reproduce it, and the flows do show established connections from the probe IP to 10.16.0.1:

[screenshot: screen shot 2016-07-05 at 5 30 14 pm]

vagrant@vagrant-ubuntu-wily-64:~/service-conf$ sudo conntrack  -E | grep 216.58 | grep 10.0.2.15
[DESTROY] tcp      6 src=10.32.0.29 dst=216.58.214.10 sport=36376 dport=443 src=216.58.214.10 dst=10.0.2.15 sport=443 dport=36376 [ASSURED]
 [UPDATE] tcp      6 120 FIN_WAIT src=10.32.0.29 dst=216.58.208.138 sport=59634 dport=443 src=216.58.208.138 dst=10.0.2.15 sport=443 dport=59634 [ASSURED]
 [UPDATE] tcp      6 60 CLOSE_WAIT src=10.32.0.29 dst=216.58.208.138 sport=59634 dport=443 src=216.58.208.138 dst=10.0.2.15 sport=443 dport=59634 [ASSURED]
 [UPDATE] tcp      6 30 LAST_ACK src=10.32.0.29 dst=216.58.208.138 sport=59634 dport=443 src=216.58.208.138 dst=10.0.2.15 sport=443 dport=59634 [ASSURED]
 [UPDATE] tcp      6 120 TIME_WAIT src=10.32.0.29 dst=216.58.208.138 sport=59634 dport=443 src=216.58.208.138 dst=10.0.2.15 sport=443 dport=59634 [ASSURED]
    [NEW] tcp      6 120 SYN_SENT src=10.32.0.29 dst=216.58.213.138 sport=55520 dport=443 [UNREPLIED] src=216.58.213.138 dst=10.0.2.15 sport=443 dport=55520
 [UPDATE] tcp      6 60 SYN_RECV src=10.32.0.29 dst=216.58.213.138 sport=55520 dport=443 src=216.58.213.138 dst=10.0.2.15 sport=443 dport=55520
 [UPDATE] tcp      6 86400 ESTABLISHED src=10.32.0.29 dst=216.58.213.138 sport=55520 dport=443 src=216.58.213.138 dst=10.0.2.15 sport=443 dport=55520 [ASSURED]

I believe this means that 10.32.0.29 is being NAT-ed through 10.0.2.15 to access the internet.
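
For illustration, the NAT can be read straight off the conntrack tuples above: the original tuple's source (10.32.0.29) differs from the reply tuple's destination (10.0.2.15). A simplified, made-up parser (not what the probe actually uses) sketching that check:

package main

import (
    "fmt"
    "strings"
)

// natRewrite is a simplified, illustrative parser for conntrack event lines
// like the ones above. It reports whether the original source address differs
// from the reply destination, i.e. whether the source was rewritten by NAT.
func natRewrite(line string) (origSrc, replyDst string, natted bool) {
    var srcs, dsts []string
    for _, field := range strings.Fields(line) {
        switch {
        case strings.HasPrefix(field, "src="):
            srcs = append(srcs, strings.TrimPrefix(field, "src="))
        case strings.HasPrefix(field, "dst="):
            dsts = append(dsts, strings.TrimPrefix(field, "dst="))
        }
    }
    if len(srcs) < 2 || len(dsts) < 2 {
        return "", "", false
    }
    // The first src/dst pair is the original tuple, the second the reply tuple.
    origSrc, replyDst = srcs[0], dsts[1]
    return origSrc, replyDst, origSrc != replyDst
}

func main() {
    line := "[UPDATE] tcp 6 86400 ESTABLISHED src=10.32.0.29 dst=216.58.213.138 sport=55520 dport=443 src=216.58.213.138 dst=10.0.2.15 sport=443 dport=55520 [ASSURED]"
    src, via, natted := natRewrite(line)
    fmt.Printf("%s NAT-ed through %s: %v\n", src, via, natted) // 10.32.0.29 NAT-ed through 10.0.2.15: true
}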

This raises two questions:

  1. Why does the Scope probe get attributed an IP (10.0.2.15) if it's supposed to be running in the host networking namespace?
  2. Why are we identifying the connection as coming from 10.0.2.15 if it's the public address of a NAT-ed container IP?


2opremio commented Jul 5, 2016

10.0.2.15 is the IP of the Ethernet interface in my virtual machine:

eth0      Link encap:Ethernet  HWaddr 08:00:27:ee:93:96  
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:feee:9396/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:893495 errors:0 dropped:0 overruns:0 frame:0
          TX packets:456416 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:587131973 (587.1 MB)  TX bytes:105121179 (105.1 MB)

So:

  • It makes sense that the Scope probe gets attributed the host address, but that address shouldn't be used for short-lived connection tracking, because the connections could be coming from any process in the host networking namespace (see the sketch after this list).
  • We are not properly analyzing DNAT-ed flows coming from containers, since the gateway IP shouldn't have been used. It seems we are duplicating connections: one for the private IP of the container and one for the gateway address (which is being wrongly attributed to Scope).
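
A minimal sketch of the check hinted at in the first point, assuming we simply gather the node's interface addresses and treat endpoints using them as ambiguous (illustration only, not Scope's code):

package main

import (
    "fmt"
    "net"
)

// hostAddrs collects the node's own interface addresses (e.g. 10.0.2.15 on
// eth0 above). Endpoints whose local side is one of these addresses belong to
// "some process in the host networking namespace", not necessarily to any
// particular container. Illustrative sketch only.
func hostAddrs() (map[string]bool, error) {
    addrs, err := net.InterfaceAddrs()
    if err != nil {
        return nil, err
    }
    set := map[string]bool{}
    for _, a := range addrs {
        if ipnet, ok := a.(*net.IPNet); ok {
            set[ipnet.IP.String()] = true
        }
    }
    return set, nil
}

func main() {
    hosts, err := hostAddrs()
    if err != nil {
        panic(err)
    }
    // On the VM above, 10.0.2.15 would be in the set, so a short-lived
    // connection reported with that local address should not be pinned to a
    // single container.
    fmt.Println("ambiguous attribution:", hosts["10.0.2.15"])
}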


2opremio commented Jul 5, 2016

After a short discussion with @tomwilkie, it turns out that we don't treat the host networking namespace specially. So, if there's a single container mapped to the host networking namespace, all the short-lived connections from/to the host will be attributed to it (related: #1260).

So, we should:

  • Stop attributing short-lived connections to containers in the host networking namespace (because we cannot be sure whether they come from that container or from any other process on the host); a rough sketch of such a check follows this list.
  • Fix the duplicate edges coming from DNAT-ed connections. It seems we are intentionally duplicating them, but I don't really understand why. See
    // applyNAT duplicates Nodes in the endpoint topology of a report, based on
    // the NAT table.
    func (n natMapper) applyNAT(rpt report.Report, scope string) {
        n.flowWalker.walkFlows(func(f flow) {
            var (
                mapping          = toMapping(f)
                realEndpointID   = report.MakeEndpointNodeID(scope, mapping.originalIP, strconv.Itoa(mapping.originalPort))
                copyEndpointPort = strconv.Itoa(mapping.rewrittenPort)
                copyEndpointID   = report.MakeEndpointNodeID(scope, mapping.rewrittenIP, copyEndpointPort)
                node, ok         = rpt.Endpoint.Nodes[realEndpointID]
            )
            if !ok {
                return
            }
            rpt.Endpoint.AddNode(node.WithID(copyEndpointID).WithLatests(map[string]string{
                Addr:      mapping.rewrittenIP,
                Port:      copyEndpointPort,
                "copy_of": realEndpointID,
            }))
        })
    }
    and the copy_of entry in the report excerpt from a comment above.
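
Regarding the first point, a rough, hypothetical sketch of the kind of check we could apply when mapping endpoints to containers (the container type and its NetworkMode field below are illustrative stand-ins, not Scope's types):

package main

import "fmt"

// container is an illustrative stand-in for whatever container metadata the
// probe has available; NetworkMode is "host" for containers sharing the host
// networking namespace (as reported by e.g. docker inspect).
type container struct {
    ID          string
    NetworkMode string
}

// attributableContainer picks the container a short-lived endpoint should be
// mapped to, refusing to attribute it to a container in the host networking
// namespace, since the connection could belong to any process on the host.
func attributableContainer(candidates []container) (container, bool) {
    for _, c := range candidates {
        if c.NetworkMode == "host" {
            continue // ambiguous: skip rather than misattribute
        }
        return c, true
    }
    return container{}, false
}

func main() {
    probe := container{ID: "scope-probe", NetworkMode: "host"}
    if _, ok := attributableContainer([]container{probe}); !ok {
        fmt.Println("short-lived connection left unattributed")
    }
}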


2opremio commented Jul 7, 2016

Fix the duplicate edges coming from DNAT-ed connections. It seems we are intentionally duplicating them, but I don't really understand why.

After some thought, I've concluded that the duplication is technically correct, since it just adds the IP/ports with and without translation, which are unique. We do it, among other things, to identify connections between DNAT-ed containers in different hosts.

What's wrong is attributing the duplicated endpoint to a container in the host networking namespace: not only can it belong to another process in the same networking namespace, it might not really belong to any process at all (if it's a DNAT-ed address, as in this case).
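
To make the cross-host argument from the previous paragraph concrete, here is a toy sketch. The addresses, port and helper below are made up; only the ";addr;port" ID format mirrors the report excerpt from the earlier comment:

package main

import "fmt"

// makeEndpointID builds IDs in the ";addr;port" form used by the report
// excerpt above. Hypothetical helper; the values below are invented.
func makeEndpointID(addr, port string) string {
    return ";" + addr + ";" + port
}

func main() {
    // A DNAT-ed service: the serving host's public address is rewritten to the
    // container's private address.
    publicAddr, containerAddr, port := "192.168.1.10", "10.32.0.16", "80"

    // The probe on the remote host only sees the pre-rewrite (public) address,
    // so its adjacency references this endpoint ID...
    remoteSide := makeEndpointID(publicAddr, port)

    // ...while the serving host's probe sees the connection terminating at the
    // container address. Duplicating that endpoint under the public address
    // (with copy_of pointing back) gives the app a common key to join on.
    original := makeEndpointID(containerAddr, port)
    copyID := makeEndpointID(publicAddr, port)

    fmt.Println(remoteSide == copyID, "copy_of:", original) // true copy_of: ;10.32.0.16;80
}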
