handle fdinstall events from tcptracer-bpf (aka "accept before kretprobe" issue) #2518
probe/process/walker_linux.go
Outdated
@@ -166,6 +170,16 @@ func (w *walker) readCmdline(filename string) (cmdline, name string) { | |||
return | |||
} | |||
|
|||
func IsProcInAccept(procRoot, filename string) (ret bool) { |
I thought you were going to extend the proc connection walker to also obtain TCP sockets in LISTEN state. I am a bit concerned about the performance impact of checking Also:
You get this for free out of the proc connection walker. |
Won't /proc/pid/wchan also include the content of the tasks? |
That was what I wanted to do initially but then I realised that the set of processes that have sockets in the LISTEN state is bigger than the set of process blocked on the
This is done only one time during the initialization of the EbpfTracker when Scope starts.
This part does not run during the initialization but when we receive the fd_install event from tcptracer-bpf. Other events (connect, accept, close) include the network namespace because the kprobe for them is on a kernel function with the While it should be possible to get the
No, it seems to only report the function of the main thread. For example:
|
Would this just impact performance or also accuracy? (If the impact on performance is just an eBPF map lookup with a larger number of entries then I guess it's OK)
Good point. |
Only performance, and the performance with the |
But the false positives won't impact performance so much since it's a map, which has a fast access. On the other hand, accessing wchan seems to complicate the code. |
I would also like to see some integration tests for this. |
3cde51e
to
4a7c5f5
Compare
vendor/manifest
Outdated
"revision": "b715a3b635b8d9c4a096bbd6009826b57fe64c38", | ||
"branch": "master", | ||
"revision": "3b09ef1351d865c0b2519e1486f6c88db3905780", | ||
"branch": "alban/fdinstall", |
7b349c5
to
71f25fc
Compare
I added an integration test (314_container_accept_before_kretprobe_test.sh) and it passes now. |
@alban Are you finally discarding the option of obtaining the sockets in LISTEN state from the proc walker? |
Yes.
The process of getting an otherwise-missed accept event should happen only once, for processes that were blocked in the accept() syscall. Once the accept() syscall returns, we don't need the fd_install mechanism anymore, so we need a way to know when we can stop monitoring fd_install events.
If we monitor the fd_install events for all processes that own a socket in the LISTEN state, we don't know when to stop that monitoring, because servers normally keep a socket in the LISTEN state even after a connection has been established (for the purpose of accepting future connections). So we would be receiving fd_install events for all servers, during their whole lifetime.
We also receive an fd_install event for file descriptors that are not sockets, because the filtering on the file descriptor kind is done in userspace. Using wchan minimizes that.
Fair enough |
|
||
weave_on "$HOST1" launch | ||
|
||
# Launch the server before Scope |
list_containers "$HOST1" | ||
list_connections "$HOST1" | ||
|
||
has_connection containers "$HOST1" client server |
@@ -43,14 +43,15 @@ scope_end_suite() { | |||
list_containers() { | |||
local host=$1 | |||
echo "Listing containers on ${host}:" | |||
curl -s "http://${host}:4040/api/topology/containers?system=show" | jq -r '.nodes[] | select(has("metadata")) | .metadata[] | select(.id == "docker_image_name") | .value' | |||
curl -s "http://${host}:4040/api/topology/containers?system=show" | jq -r '.nodes[] | select(has("metadata")) | { "image": .metadata[] | select(.id == "docker_image_name") | .value, "label": .label, "id": .id} | .id + " (" + .image + ", " + .label + ")"' |
probe/endpoint/ebpf.go
Outdated
netNamespacePath := fmt.Sprintf("/proc/%d/ns/net", pid) | ||
var statNsFile syscall.Stat_t | ||
err := fs.Stat(netNamespacePath, &statNsFile) | ||
if err != nil { |
probe/endpoint/ebpf.go
Outdated
fdFilename := fmt.Sprintf("/proc/%d/fd/%d", pid, fd) | ||
var statFdFile syscall.Stat_t | ||
err = fs.Stat(fdFilename, &statFdFile) | ||
if err != nil { |
@@ -125,6 +133,99 @@ func lostCb(count uint64) { | |||
ebpfTracker.stop() | |||
} | |||
|
|||
func tupleFromPidFd(pid int, fd int) (tuple fourTuple, netns string, ok bool) { |
This includes weaveworks/tcptracer-bpf#39
I rebased, re-vendored tcptracer-bpf and factored the two /proc reads into one place. I also pre-fetch the busybox image in the integration tests on GCE, since we now have a test using busybox, following the same pattern as the other images. Let's see if the tests pass on CircleCI. |
Since weaveworks/tcptracer-bpf#39, tcptracer-bpf can generate "fd_install" events when a process installs a new file descriptor in its fd table. Those events must be requested explicitly on a per-pid basis with tracer.AddFdInstallWatcher(pid). This is useful to learn about "accept" events that would otherwise be missed, because kretprobes are not triggered for functions that were called before the kretprobe was installed. This patch finds all the processes that are currently blocked on an accept() syscall during the EbpfTracker initialization. feedInitialConnections() will use tracer.AddFdInstallWatcher() to subscribe to fd_install events. When an fd_install event is received, synthesise an accept event with the connection tuple and the network namespace (from /proc).
LGTM (on green) |
green |
During the EbpfTracker initialization, find all the processes that are currently blocked on an accept() syscall. feedInitialConnections() will use t.tracer.AddFdInstallWatcher() to subscribe to fd_install events. When a fd_install event is received, synthesise an accept event with the connection tuple and the network namespace (from /proc).
TODO:
fd instead of reusing netns
accept() syscalls (look in /proc/*/tasks/*/wchan instead of /proc/*/wchan)
This is the Scope part of weaveworks/tcptracer-bpf#39
Issue: weaveworks/tcptracer-bpf#10