
connections inside a container shown as going between containers #1733

Closed · monowai opened this issue Jul 30, 2016 · 7 comments
Labels: bug (Broken end user or developer functionality; not working as the developers intended it)
Milestone: July2016

Comments

monowai commented Jul 30, 2016

The container view shows an incorrect link between Rabbit and Riak.
[screenshot: container view showing a link between Rabbit and Riak]

Both services do use Erlang, but they run in independent containers that are otherwise unaware of each other.
[screenshot: the independent RabbitMQ and Riak containers]


rade commented Jul 30, 2016

Would you mind attaching the report ("</>" button in bottom right corner) for this?


monowai commented Jul 30, 2016

rade added the bug label on Jul 30, 2016

rade commented Jul 30, 2016

Thanks. I've managed to reproduce this with

docker run -d rabbitmq
docker run -d lapax/riak

Both of these run a separate epmd process inside the container, and the main Erlang beam process is supposed to connect to it. Based on my quick investigation, I reckon the connection from the Riak beam is mis-attributed as going to the RabbitMQ epmd instead of the Riak epmd.

And I've just reproduced the same kind of mis-attribution with two alpine containers that each run

nc -l -p 1122 &
nc 127.0.0.1 1122

Thanks for reporting this.

rade changed the title from "Incorrect container relationship involving Erlang processes" to "connections inside a container shown as going between containers" on Jul 30, 2016
rade added this to the July2016 milestone on Jul 30, 2016

rade commented Aug 4, 2016

@paulbellamy's and my theory here is that connection endpoints in containers are generally just identified by IP and port. That is fine (and indeed required) for overlay networks, but clearly wrong for localhost connections inside a container.

So a (hopefully) quick fix would be to include the container id in the identity of localhost connection endpoints.

2opremio self-assigned this on Aug 8, 2016

2opremio commented Aug 8, 2016

> @paulbellamy's and my theory here is that connection endpoints in containers are generally just identified by IP and port. That is fine (and indeed required) for overlay networks, but clearly wrong for localhost connections inside a container.

We also use PIDs for persistent connections (like these). Proc-based tracking (which provides PIDs) and conntrack-based tracking run in parallel, and it could be that we don't correctly prioritize proc-tracked connections.

Also, even if the proc-tracked connections were used, we would probably be able to reproduce this with short-lived (conntrack-based) connections, so we need to handle the loopback interface specially.
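
For illustration only, a minimal sketch of that prioritization, using made-up types (`connection`, `merge`) rather than Scope's actual data structures: when both trackers report the same four-tuple, the proc-tracked entry (which carries a PID) should win.

```go
// Hypothetical sketch (not Scope's actual code): merging connections seen by
// the /proc-based tracker (which knows the owning PID) with connections seen
// by conntrack (which does not), preferring the proc-tracked entry.
package main

import "fmt"

// connection is keyed by its four-tuple; pid is zero when the connection was
// only seen by conntrack.
type connection struct {
	fourTuple string // "srcIP:srcPort->dstIP:dstPort"
	pid       int
}

// merge indexes connections by four-tuple and lets a proc-tracked entry
// (pid != 0) win over a conntrack-only entry for the same tuple.
func merge(procTracked, conntrackTracked []connection) map[string]connection {
	out := map[string]connection{}
	for _, c := range conntrackTracked {
		out[c.fourTuple] = c
	}
	for _, c := range procTracked {
		out[c.fourTuple] = c // proc-tracked entry wins: it carries the PID
	}
	return out
}

func main() {
	proc := []connection{{"127.0.0.1:40000->127.0.0.1:1122", 1234}}
	ct := []connection{{"127.0.0.1:40000->127.0.0.1:1122", 0}}
	fmt.Println(merge(proc, ct)) // keeps the entry with pid 1234
}
```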


2opremio commented Aug 8, 2016

Actually, host-scoping aside, the problem matches the theory from @rade and @paulbellamy.

After reproducing the problem in the way suggested by @rade:

> [...] two alpine containers that each run
>
> nc -l -p 1122 &
> nc 127.0.0.1 1122

I inspected the Endpoint topology and found:

[screenshot of the Endpoint topology, 2016-08-08]

vagrant-ubuntu-wily-64 is the hostname of my machine (the Docker host). So connections are keyed by hostname/IP/port, which is not good enough for loopback connections inside containers.

This causes a key clash between the processes listening on 127.0.0.1:1122, making the containers fight for the PID entry in the LatestMap and always producing a connection across the containers (both clients are identified as talking to a single server).
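
To illustrate the clash (with hypothetical key strings and a plain map standing in for Scope's actual LatestMap), here is roughly how the two listeners collapse onto one Endpoint node when the key is only host;address;port:

```go
// Hypothetical sketch (not Scope's actual code): two listeners on
// 127.0.0.1:1122 in different containers produce the same Endpoint node key,
// so the PID entry is simply overwritten by whichever report arrives last.
package main

import "fmt"

func main() {
	key := func(host, addr string, port int) string {
		return fmt.Sprintf("%s;%s;%d", host, addr, port)
	}

	latest := map[string]int{} // Endpoint node key -> latest PID seen

	// Container A's "nc -l -p 1122" (say PID 100) and container B's
	// "nc -l -p 1122" (say PID 200) produce the same key...
	latest[key("vagrant-ubuntu-wily-64", "127.0.0.1", 1122)] = 100
	latest[key("vagrant-ubuntu-wily-64", "127.0.0.1", 1122)] = 200

	// ...so only one entry remains and both clients appear to talk to it.
	fmt.Println(len(latest), latest) // 1 map[vagrant-ubuntu-wily-64;127.0.0.1;1122:200]
}
```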

Two possible solutions are:

  • Append the network namespace inode to the Endpoint node key
  • Append the PID to the Endpoint node keys (the namespace inode is only available when we know the PID anyway).

Also, in case it's not done already, we should make sure that loopback interface IPs are discarded when tracking short-lived connections, since they cannot be uniquely attributed to a container (related: #1260).


2opremio commented Aug 8, 2016

> Two possible solutions are:
>
>   • Append the network namespace inode to the Endpoint node key
>   • Append the PID to the Endpoint node keys (the namespace inode is only available when we know the PID anyway).

@rade pointed out offline that we can only do this for loopback connections (since the PID/namespace scope won't match for connections across hosts).
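
A rough sketch of what that constraint could look like, assuming a hypothetical endpointNodeID helper rather than Scope's actual key-building code: the namespace inode is appended only when the address is a loopback address, so cross-host connections keep the plain key.

```go
// Hypothetical sketch (not Scope's actual code): append the network namespace
// inode to the Endpoint node key, but only for loopback addresses, so that
// cross-host connections (where namespaces cannot match) keep the plain key.
package main

import (
	"fmt"
	"net"
)

func endpointNodeID(hostID string, netnsInode uint64, addr string, port uint16) string {
	if ip := net.ParseIP(addr); ip != nil && ip.IsLoopback() {
		return fmt.Sprintf("%s;netns:%d;%s;%d", hostID, netnsInode, addr, port)
	}
	return fmt.Sprintf("%s;%s;%d", hostID, addr, port)
}

func main() {
	// The two nc listeners now get distinct keys because each container has
	// its own network namespace inode (the inode values here are made up).
	fmt.Println(endpointNodeID("vagrant-ubuntu-wily-64", 4026532201, "127.0.0.1", 1122))
	fmt.Println(endpointNodeID("vagrant-ubuntu-wily-64", 4026532305, "127.0.0.1", 1122))
	// A non-loopback endpoint keeps the old, unscoped form.
	fmt.Println(endpointNodeID("vagrant-ubuntu-wily-64", 4026532201, "10.0.0.5", 5672))
}
```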
