Wrongly attributed local side in outbound internet connections #1598

Closed · 2opremio opened this issue Jun 17, 2016 · 6 comments
Labels: bug (Broken end user or developer functionality; not working as the developers intended it)

2opremio commented Jun 17, 2016

I was running the service locally to test the 0.16 release (#1587) and was puzzled to see that authfe was talking to the internet:

[screenshot: screen shot 2016-06-17 at 13 26 09]

Then I realized that it's due to the new logging sidecar talking to BigQuery:

vagrant@vagrant-ubuntu-wily-64:~/scope$ docker ps | grep logging
991a8908520e        quay.io/weaveworks/logging                                         "/bin/sh -c 'exec flu"   20 minutes ago      Up 20 minutes                                   k8s_logging.6359526a_authfe-4cptq_default_b6d0e6fd-3478-11e6-a11f-0242ac110004_84cf3379
vagrant@vagrant-ubuntu-wily-64:~/scope$ docker exec -ti k8s_logging.6359526a_authfe-4cptq_default_b6d0e6fd-3478-11e6-a11f-0242ac110004_84cf3379 sh
/home/fluent # netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0      0 authfe-4cptq:47110      lhr25s02-in-f106.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:51680      lhr25s07-in-f10.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:54850      lhr26s02-in-f10.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:47208      lhr25s02-in-f106.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:57374      lhr25s09-in-f10.1e100.net:https ESTABLISHED 
tcp        0      0 authfe-4cptq:54098      lhr26s02-in-f10.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:47396      lhr25s02-in-f10.1e100.net:https ESTABLISHED 
tcp        0      0 authfe-4cptq:53096      lhr26s03-in-f10.1e100.net:https TIME_WAIT   
tcp        0      0 authfe-4cptq:55212      lhr26s02-in-f10.1e100.net:https ESTABLISHED 
tcp        0      0 authfe-4cptq:51948      10.27.204.4:http        ESTABLISHED 
tcp        0      0 localhost:54352         localhost:24224         ESTABLISHED 
tcp        0      0 authfe-4cptq:47370      lhr25s02-in-f10.1e100.net:https TIME_WAIT   
...

... but the local side is being attributed to a Scope probe instead of the authfe pod (note the connections to hosts like lhr25s01-in-f74.1e100.net from scope-probe-tgeqv)

Report: report.json.gz

2opremio added the bug label on Jun 17, 2016
2opremio commented

More clearly:

[screenshot: screen shot 2016-06-17 at 12 17 25]

2opremio self-assigned this on Jun 23, 2016
2opremio added this to the 0.17.0 milestone on Jun 28, 2016
2opremio assigned 2opremio and unassigned 2opremio on Jul 4, 2016

2opremio commented Jul 5, 2016

There are entries in the network topology indicating that Scope indeed identified short-lived connections between the scope probe (10.0.2.15) and hosts like lhr25s02-in-f10.1e100.net.

I am not sure how that could have happened, but it's certainly not the app's fault.

      ";10.0.2.15;51392": {
        "id": ";10.0.2.15;51392",
        "topology": "endpoint",
        "counters": {},
        "sets": {},
        "adjacency": [
          ";216.58.213.106;443"
        ],
        "edges": {
          ";216.58.213.106;443": {}
        },
        "controls": {},
        "latest": {
          "addr": {
            "timestamp": "2016-06-17T11:13:12.385838862Z",
            "value": "10.0.2.15"
          },
          "copy_of": {
            "timestamp": "2016-06-17T11:13:12.385838862Z",
            "value": ";10.32.0.16;51392"
          },
          "port": {
            "timestamp": "2016-06-17T11:13:12.385838862Z",
            "value": "51392"
          },
          "conntracked": {
            "timestamp": "2016-06-17T11:13:12.192988883Z",
            "value": "true"
          }
        },
        "parents": {},
        "children": null
      },
[...]
      ";216.58.213.106;443": {
        "id": ";216.58.213.106;443",
        "topology": "endpoint",
        "counters": {},
        "sets": {
          "reverse_dns_names": [
            "lhr25s02-in-f10.1e100.net",
            "lhr25s02-in-f106.1e100.net"
          ]
        },
        "adjacency": null,
        "edges": {},
        "controls": {},
        "latest": {
          "conntracked": {
            "timestamp": "2016-06-17T11:13:12.192990336Z",
            "value": "true"
          },
          "addr": {
            "timestamp": "2016-06-17T11:13:12.192984244Z",
            "value": "216.58.213.106"
          },
          "port": {
            "timestamp": "2016-06-17T11:13:12.192984244Z",
            "value": "443"
          }
        },
        "parents": {},
        "children": null
      },

I am going to try to reproduce this and check the conntrack flows. I suspect that, for some reason, flows may be lingering as in #1110 and getting attributed to the wrong container when the service is destroyed and recreated.
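
For illustration, here is a minimal sketch of how an endpoint node like the one above can be decoded and its copy_of entry followed back to the pre-NAT endpoint. The Go types below are made up for this example (only the JSON field names come from the excerpt); this is not Scope's actual code:

package main

import (
    "encoding/json"
    "fmt"
)

// endpointNode mirrors just the fields of the excerpt that matter here;
// it is an illustrative type, not the one Scope uses internally.
type endpointNode struct {
    ID     string                 `json:"id"`
    Latest map[string]latestEntry `json:"latest"`
}

type latestEntry struct {
    Timestamp string `json:"timestamp"`
    Value     string `json:"value"`
}

// originalEndpointID returns the pre-NAT endpoint ID when the node is a
// NAT-generated copy (i.e. it carries a "copy_of" entry), or the node's own ID.
func originalEndpointID(n endpointNode) string {
    if e, ok := n.Latest["copy_of"]; ok {
        return e.Value
    }
    return n.ID
}

func main() {
    // Abbreviated version of the ";10.0.2.15;51392" node from the report above.
    raw := `{
      "id": ";10.0.2.15;51392",
      "latest": {
        "addr":    {"timestamp": "2016-06-17T11:13:12.385838862Z", "value": "10.0.2.15"},
        "copy_of": {"timestamp": "2016-06-17T11:13:12.385838862Z", "value": ";10.32.0.16;51392"},
        "port":    {"timestamp": "2016-06-17T11:13:12.385838862Z", "value": "51392"}
      }
    }`

    var n endpointNode
    if err := json.Unmarshal([]byte(raw), &n); err != nil {
        panic(err)
    }
    fmt.Println(originalEndpointID(n)) // prints ";10.32.0.16;51392"
}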


2opremio commented Jul 5, 2016

I have managed to reproduce it, and the flows do show established connections from the probe IP to 10.16.0.1:

[screenshot: screen shot 2016-07-05 at 5 30 14 pm]

vagrant@vagrant-ubuntu-wily-64:~/service-conf$ sudo conntrack  -E | grep 216.58 | grep 10.0.2.15
[DESTROY] tcp      6 src=10.32.0.29 dst=216.58.214.10 sport=36376 dport=443 src=216.58.214.10 dst=10.0.2.15 sport=443 dport=36376 [ASSURED]
 [UPDATE] tcp      6 120 FIN_WAIT src=10.32.0.29 dst=216.58.208.138 sport=59634 dport=443 src=216.58.208.138 dst=10.0.2.15 sport=443 dport=59634 [ASSURED]
 [UPDATE] tcp      6 60 CLOSE_WAIT src=10.32.0.29 dst=216.58.208.138 sport=59634 dport=443 src=216.58.208.138 dst=10.0.2.15 sport=443 dport=59634 [ASSURED]
 [UPDATE] tcp      6 30 LAST_ACK src=10.32.0.29 dst=216.58.208.138 sport=59634 dport=443 src=216.58.208.138 dst=10.0.2.15 sport=443 dport=59634 [ASSURED]
 [UPDATE] tcp      6 120 TIME_WAIT src=10.32.0.29 dst=216.58.208.138 sport=59634 dport=443 src=216.58.208.138 dst=10.0.2.15 sport=443 dport=59634 [ASSURED]
    [NEW] tcp      6 120 SYN_SENT src=10.32.0.29 dst=216.58.213.138 sport=55520 dport=443 [UNREPLIED] src=216.58.213.138 dst=10.0.2.15 sport=443 dport=55520
 [UPDATE] tcp      6 60 SYN_RECV src=10.32.0.29 dst=216.58.213.138 sport=55520 dport=443 src=216.58.213.138 dst=10.0.2.15 sport=443 dport=55520
 [UPDATE] tcp      6 86400 ESTABLISHED src=10.32.0.29 dst=216.58.213.138 sport=55520 dport=443 src=216.58.213.138 dst=10.0.2.15 sport=443 dport=55520 [ASSURED]

I believe this means that 10.32.0.29 is being NAT-ed through 10.0.2.15 to access the internet.
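
For illustration, the NAT can be read straight off the conntrack tuples above: the original tuple's source (10.32.0.29) differs from the reply tuple's destination (10.0.2.15). A simplified, made-up parser (not what the probe actually uses) sketching that check:

package main

import (
    "fmt"
    "strings"
)

// natRewrite is a simplified, illustrative parser for conntrack event lines
// like the ones above. It reports whether the original source address differs
// from the reply destination, i.e. whether the source was rewritten by NAT.
func natRewrite(line string) (origSrc, replyDst string, natted bool) {
    var srcs, dsts []string
    for _, field := range strings.Fields(line) {
        switch {
        case strings.HasPrefix(field, "src="):
            srcs = append(srcs, strings.TrimPrefix(field, "src="))
        case strings.HasPrefix(field, "dst="):
            dsts = append(dsts, strings.TrimPrefix(field, "dst="))
        }
    }
    if len(srcs) < 2 || len(dsts) < 2 {
        return "", "", false
    }
    // The first src/dst pair is the original tuple, the second the reply tuple.
    origSrc, replyDst = srcs[0], dsts[1]
    return origSrc, replyDst, origSrc != replyDst
}

func main() {
    line := "[UPDATE] tcp 6 86400 ESTABLISHED src=10.32.0.29 dst=216.58.213.138 sport=55520 dport=443 src=216.58.213.138 dst=10.0.2.15 sport=443 dport=55520 [ASSURED]"
    src, via, natted := natRewrite(line)
    fmt.Printf("%s NAT-ed through %s: %v\n", src, via, natted) // 10.32.0.29 NAT-ed through 10.0.2.15: true
}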

This raises two questions:

  1. Why does the Scope probe get attributed an IP (10.0.2.15) if it's supposed to be running in the host networking namespace?
  2. Why are we identifying the connection as coming from 10.0.2.15 if it's the public address of a NAT-ed container IP?


2opremio commented Jul 5, 2016

10.0.2.15 is the IP of the Ethernet interface in my virtual machine:

eth0      Link encap:Ethernet  HWaddr 08:00:27:ee:93:96  
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:feee:9396/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:893495 errors:0 dropped:0 overruns:0 frame:0
          TX packets:456416 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:587131973 (587.1 MB)  TX bytes:105121179 (105.1 MB)

So:

  • It makes sense that the Scope probe gets attributed the host address, but that address shouldn't be used for short-lived connection tracking, because the connections could be coming from any process in the host networking namespace (see the sketch after this list).
  • We are not properly analyzing DNAT-ed flows coming from containers, since the gateway IP shouldn't have been used. It seems we are duplicating connections: one for the private IP of the container and one for the gateway address (which is being wrongly attributed to Scope).
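
A minimal sketch of the check hinted at in the first point, assuming we simply gather the node's interface addresses and treat endpoints using them as ambiguous (illustration only, not Scope's code):

package main

import (
    "fmt"
    "net"
)

// hostAddrs collects the node's own interface addresses (e.g. 10.0.2.15 on
// eth0 above). Endpoints whose local side is one of these addresses belong to
// "some process in the host networking namespace", not necessarily to any
// particular container. Illustrative sketch only.
func hostAddrs() (map[string]bool, error) {
    addrs, err := net.InterfaceAddrs()
    if err != nil {
        return nil, err
    }
    set := map[string]bool{}
    for _, a := range addrs {
        if ipnet, ok := a.(*net.IPNet); ok {
            set[ipnet.IP.String()] = true
        }
    }
    return set, nil
}

func main() {
    hosts, err := hostAddrs()
    if err != nil {
        panic(err)
    }
    // On the VM above, 10.0.2.15 would be in the set, so a short-lived
    // connection reported with that local address should not be pinned to a
    // single container.
    fmt.Println("ambiguous attribution:", hosts["10.0.2.15"])
}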


2opremio commented Jul 5, 2016

After a short discussion with @tomwilkie, it turns out that we don't treat the host networking namespace specially. So, if there's a single container mapped to the host networking namespace, all the short-lived connections from/to the host will be attributed to it (related: #1260).

So, we should:

  • Stop attributing short-lived connections to containers in the host networking namespace (because we cannot be sure whether they come from that container or from any other process on the host); a rough sketch of such a check follows this list.
  • Fix the duplicate edges coming from DNAT-ed connections. It seems we are intentionally duplicating them, but I don't really understand why. See
    // applyNAT duplicates Nodes in the endpoint topology of a report, based on
    // the NAT table.
    func (n natMapper) applyNAT(rpt report.Report, scope string) {
        n.flowWalker.walkFlows(func(f flow) {
            var (
                mapping          = toMapping(f)
                realEndpointID   = report.MakeEndpointNodeID(scope, mapping.originalIP, strconv.Itoa(mapping.originalPort))
                copyEndpointPort = strconv.Itoa(mapping.rewrittenPort)
                copyEndpointID   = report.MakeEndpointNodeID(scope, mapping.rewrittenIP, copyEndpointPort)
                node, ok         = rpt.Endpoint.Nodes[realEndpointID]
            )
            if !ok {
                return
            }
            rpt.Endpoint.AddNode(node.WithID(copyEndpointID).WithLatests(map[string]string{
                Addr:      mapping.rewrittenIP,
                Port:      copyEndpointPort,
                "copy_of": realEndpointID,
            }))
        })
    }
    and the copy_of entry in the report excerpt from a comment above.
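
Regarding the first point, a rough, hypothetical sketch of the kind of check we could apply when mapping endpoints to containers (the container type and its NetworkMode field below are illustrative stand-ins, not Scope's types):

package main

import "fmt"

// container is an illustrative stand-in for whatever container metadata the
// probe has available; NetworkMode is "host" for containers sharing the host
// networking namespace (as reported by e.g. docker inspect).
type container struct {
    ID          string
    NetworkMode string
}

// attributableContainer picks the container a short-lived endpoint should be
// mapped to, refusing to attribute it to a container in the host networking
// namespace, since the connection could belong to any process on the host.
func attributableContainer(candidates []container) (container, bool) {
    for _, c := range candidates {
        if c.NetworkMode == "host" {
            continue // ambiguous: skip rather than misattribute
        }
        return c, true
    }
    return container{}, false
}

func main() {
    probe := container{ID: "scope-probe", NetworkMode: "host"}
    if _, ok := attributableContainer([]container{probe}); !ok {
        fmt.Println("short-lived connection left unattributed")
    }
}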


2opremio commented Jul 7, 2016

Fix the duplicate edges coming from DNAT-ed connections. It seems we are intentionally duplicating them, but I don't really understand why.

After some thought, I've concluded that the duplication is technically correct, since it just adds the IP/ports with and without translation, which are unique. We do it, among other things, to identify connections between DNAT-ed containers in different hosts.

What's wrong is attributing the duplicated endpoint to a container in the host networking namespace: not only can it belong to another process in the same networking namespace, it might not really belong to any process at all (if it's a DNAT-ed address, as in this case).
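
To make the cross-host argument from the previous paragraph concrete, here is a toy sketch. The addresses, port and helper below are made up; only the ";addr;port" ID format mirrors the report excerpt from the earlier comment:

package main

import "fmt"

// makeEndpointID builds IDs in the ";addr;port" form used by the report
// excerpt above. Hypothetical helper; the values below are invented.
func makeEndpointID(addr, port string) string {
    return ";" + addr + ";" + port
}

func main() {
    // A DNAT-ed service: the serving host's public address is rewritten to the
    // container's private address.
    publicAddr, containerAddr, port := "192.168.1.10", "10.32.0.16", "80"

    // The probe on the remote host only sees the pre-rewrite (public) address,
    // so its adjacency references this endpoint ID...
    remoteSide := makeEndpointID(publicAddr, port)

    // ...while the serving host's probe sees the connection terminating at the
    // container address. Duplicating that endpoint under the public address
    // (with copy_of pointing back) gives the app a common key to join on.
    original := makeEndpointID(containerAddr, port)
    copyID := makeEndpointID(publicAddr, port)

    fmt.Println(remoteSide == copyID, "copy_of:", original) // true copy_of: ;10.32.0.16;80
}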
