cgroup tracking #3170

Merged 20 commits on Dec 2, 2024
2 changes: 1 addition & 1 deletion bpf/Makefile
@@ -75,7 +75,7 @@ PROCESS += bpf_generic_kprobe_v61.o bpf_generic_retkprobe_v61.o \
PROCESS += bpf_generic_lsm_core_v61.o bpf_generic_lsm_output_v61.o \
bpf_generic_lsm_ima_file_v61.o bpf_generic_lsm_ima_bprm_v61.o

-CGROUP = bpf_cgroup_mkdir.o bpf_cgroup_rmdir.o bpf_cgroup_release.o
+CGROUP = bpf_cgroup_mkdir.o bpf_cgroup_rmdir.o bpf_cgroup_release.o bpf_cgtracker.o
BPFTEST = bpf_lseek.o

OBJSDIR := objs/
80 changes: 80 additions & 0 deletions bpf/cgroup/bpf_cgtracker.c
@@ -0,0 +1,80 @@
// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
/* Copyright Authors of Tetragon */

#include "vmlinux.h"
#include "api.h"
#include "bpf_helpers.h"
#include "bpf_cgroup.h"
#include "bpf_tracing.h"
#include "cgtracker.h"

char _license[] __attribute__((section(("license")), used)) = "GPL";
#ifdef VMLINUX_KERNEL_VERSION
int _version __attribute__((section(("version")), used)) =
VMLINUX_KERNEL_VERSION;
#endif

/* new kernel cgroup definition */
struct cgroup___new {
int level;
struct cgroup *ancestors[];
} __attribute__((preserve_access_index));

FUNC_INLINE __u64 cgroup_get_parent_id(struct cgroup *cgrp)
{
struct cgroup___new *cgrp_new = (struct cgroup___new *)cgrp;

// for newer kernels, we can use ->ancestors to retrieve the parent
if (bpf_core_field_exists(cgrp_new->ancestors)) {
int level = get_cgroup_level(cgrp);

if (level <= 0)
return 0;
return BPF_CORE_READ(cgrp_new, ancestors[level - 1], kn, id);
}

// otherwise, go over the parent pointer
struct cgroup_subsys_state *parent_css = BPF_CORE_READ(cgrp, self.parent);

if (parent_css) {
struct cgroup *parent = container_of(parent_css, struct cgroup, self);
__u64 parent_cgid = get_cgroup_id(parent);
return parent_cgid;
}

return 0;
}
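
As a rough userspace illustration of the two lookup paths in cgroup_get_parent_id() above: the sketch assumes that, on newer kernels, ancestors[level] is the cgroup itself, so ancestors[level - 1] is its parent. struct demo_cgroup and the demo_* helpers are hypothetical stand-ins, not kernel or Tetragon definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_DEPTH 4 /* assumption: enough levels for this demo only */

/* Hypothetical stand-in for struct cgroup; it carries both layouts at once. */
struct demo_cgroup {
	uint64_t id;
	int level;                                     /* 0 for the root */
	struct demo_cgroup *parent;                    /* older layout: parent pointer */
	struct demo_cgroup *ancestors[DEMO_MAX_DEPTH]; /* newer layout: ancestors[level] == self */
};

/* Link cg under parent (NULL for the root) and fill in both layouts. */
static void demo_init(struct demo_cgroup *cg, uint64_t id, struct demo_cgroup *parent)
{
	cg->id = id;
	cg->parent = parent;
	cg->level = parent ? parent->level + 1 : 0;
	assert(cg->level < DEMO_MAX_DEPTH);
	for (int i = 0; i < cg->level; i++)
		cg->ancestors[i] = parent->ancestors[i];
	cg->ancestors[cg->level] = cg;
}

/* Newer-kernel path: index the ancestors array one level up. */
static uint64_t demo_parent_id_ancestors(const struct demo_cgroup *cg)
{
	if (cg->level <= 0)
		return 0;
	return cg->ancestors[cg->level - 1]->id;
}

/* Fallback path: follow the parent pointer; 0 means "no parent". */
static uint64_t demo_parent_id_pointer(const struct demo_cgroup *cg)
{
	return cg->parent ? cg->parent->id : 0;
}
```

Both helpers return 0 for the root, which matches how the mkdir program treats a zero parent ID as "nothing to inherit".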

__attribute__((section(("raw_tracepoint/cgroup_mkdir")), used)) int
tg_cgtracker_cgroup_mkdir(struct bpf_raw_tracepoint_args *ctx)
{
struct cgroup *cgrp;
__u64 cgid, cgid_parent, *cgid_tracker;

cgrp = (struct cgroup *)ctx->args[0];
cgid = get_cgroup_id(cgrp);
if (cgid == 0)
return 0;
cgid_parent = cgroup_get_parent_id(cgrp);
Member

Hmm, I didn't have time to review everything, but I want to point out that this is not cgroupv1 compatible: the mkdir update and the release delete will run for every cgroupv1 hierarchy, and if kernfs IDs clash this could lead to corruption?

The old branch has this check: https://github.com/cilium/tetragon/blob/pr/tixxdz/cgroup-bpf-full/bpf/cgroup/bpf_cgroup_mkdir.c#L33

So this could break on cgroupv1. I will check more closely later.

Contributor Author

I think it's fine to have this only for cgroup v2 at the moment. That being said, it's unclear to me what the problem is for v1. Which kernfs IDs could clash?

Member

You are following all hierarchies in cgroupv1, including the ones that we don't track. Why?

The kernfs IDs are the cgroup IDs. If they are reused, you remove them from the tracking and it won't work anymore; see your cgroup release BPF program.

Contributor Author

> you are following all hierarchies in cgroupv1! including the ones that we don't track, why?

We can add a filter to track only the configured hierarchy for v1. My first goal was to support cgroup v2, so that's what the first implementation does. cgroup v1 support might or might not work, since we do not have tests for it.

> The kernfs IDs are cgroup IDs if they are re-used you remove them from the tracking and it won't work anymore, see your cgrp release bpf program.

Not sure I understand. Are you saying that cgroup IDs are unique only within a specific hierarchy?

Member

> > you are following all hierarchies in cgroupv1! including the ones that we don't track, why?

> We can add a filter to track only the configured hierarchy for v1. My first goal was to support cgroup v2 so that's what the first implementation does. Any cgroup v1 support might or might not work since we do not have tests for it.

Exactly my point, and the linked BPF cgroup helper handles it transparently.

> > The kernfs IDs are cgroup IDs if they are re-used you remove them from the tracking and it won't work anymore, see your cgrp release bpf program.

> Not sure I understand. Are you saying that cgroup ids are unique only within a specific hierarchy?

Yes. cgroupv1 hierarchies are separate cgroup mounts backed by kernfs, and each kernfs node in a mount has a unique ID, which is its inode number. Last time I checked, the allocation used an IDR and was predictable.

Contributor Author

To make sure I understand correctly, are you saying that it's possible for two cgroup nodes of different hierarchies to have the same kernfs id?
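
The concern raised in this thread can be sketched in plain userspace C. The sketch assumes, as the Member states above, that IDs are allocated independently per cgroupv1 hierarchy, so two hierarchies can hand out the same numeric ID. struct demo_hierarchy, the demo_tracker table and all demo_* functions are hypothetical; they model a table keyed by cgroup ID only and are not the PR's BPF code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-hierarchy ID allocator: each hierarchy counts from 1. */
struct demo_hierarchy {
	uint64_t next_id;
};

static uint64_t demo_alloc_id(struct demo_hierarchy *h)
{
	return ++h->next_id;
}

/* Hypothetical tracker table keyed by cgroup ID only.
 * Index is the cgroup ID, value is the tracker ID (0 means untracked). */
#define DEMO_MAX_ID 8
static uint64_t demo_tracker[DEMO_MAX_ID];

static void demo_track(uint64_t cgid, uint64_t tracker)
{
	if (cgid < DEMO_MAX_ID)
		demo_tracker[cgid] = tracker;
}

/* Release handler that, like the sketch's table, knows nothing about hierarchies. */
static void demo_release(uint64_t cgid)
{
	if (cgid && cgid < DEMO_MAX_ID)
		demo_tracker[cgid] = 0;
}

/* Returns 1 if a release in an unrelated hierarchy dropped a tracked entry. */
static int demo_clash(void)
{
	struct demo_hierarchy tracked = { 0 }, other = { 0 };
	uint64_t a = demo_alloc_id(&tracked); /* first ID in the tracked hierarchy */
	uint64_t b = demo_alloc_id(&other);   /* first ID in another hierarchy */

	demo_track(a, a);
	assert(a == b);  /* same numeric ID, different cgroups */
	demo_release(b); /* release event from the untracked hierarchy */
	return demo_tracker[a] == 0;
}
```

Under that assumption, filtering events to the configured hierarchy before touching the table would avoid the stray delete.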

if (cgid_parent == 0)
return 0;
cgid_tracker = map_lookup_elem(&tg_cgtracker_map, &cgid_parent);
if (cgid_tracker)
map_update_elem(&tg_cgtracker_map, &cgid, cgid_tracker, BPF_ANY);

return 0;
}

__attribute__((section(("raw_tracepoint/cgroup_release")), used)) int
tg_cgtracker_cgroup_release(struct bpf_raw_tracepoint_args *ctx)
{
struct cgroup *cgrp;
__u64 cgid;

cgrp = (struct cgroup *)ctx->args[0];
cgid = get_cgroup_id(cgrp);
if (cgid)
map_delete_elem(&tg_cgtracker_map, &cgid);

return 0;
}
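
The propagation logic of the two programs above can be modelled in userspace C as a rough sketch. The fixed-size demo_map table and every demo_* function are hypothetical stand-ins for tg_cgtracker_map and the BPF map helpers. Seeding a tracker cgroup as an entry that maps to itself is an assumption about what userspace does, and the model treats a tracker ID of 0 as "not tracked".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ENTRIES 16 /* hypothetical capacity */

struct demo_entry {
	uint64_t cgid;    /* key: cgroup id */
	uint64_t tracker; /* value: tracker cgroup id */
	int used;
};

static struct demo_entry demo_map[DEMO_MAX_ENTRIES];

/* Returns the tracker id for cgid, or 0 if cgid is not tracked. */
static uint64_t demo_lookup(uint64_t cgid)
{
	for (size_t i = 0; i < DEMO_MAX_ENTRIES; i++)
		if (demo_map[i].used && demo_map[i].cgid == cgid)
			return demo_map[i].tracker;
	return 0;
}

/* Inserts or overwrites cgid -> tracker; returns -1 if the table is full. */
static int demo_update(uint64_t cgid, uint64_t tracker)
{
	struct demo_entry *slot = NULL;

	for (size_t i = 0; i < DEMO_MAX_ENTRIES; i++) {
		if (demo_map[i].used && demo_map[i].cgid == cgid) {
			slot = &demo_map[i];
			break;
		}
		if (!demo_map[i].used && !slot)
			slot = &demo_map[i];
	}
	if (!slot)
		return -1;
	slot->cgid = cgid;
	slot->tracker = tracker;
	slot->used = 1;
	return 0;
}

/* Mirrors the mkdir program: a new cgroup inherits its parent's tracker. */
static void demo_on_mkdir(uint64_t cgid, uint64_t parent_cgid)
{
	uint64_t tracker;

	if (cgid == 0 || parent_cgid == 0)
		return;
	tracker = demo_lookup(parent_cgid);
	if (tracker)
		(void)demo_update(cgid, tracker);
}

/* Mirrors the release program: drop the entry of a released cgroup. */
static void demo_on_release(uint64_t cgid)
{
	for (size_t i = 0; cgid && i < DEMO_MAX_ENTRIES; i++)
		if (demo_map[i].used && demo_map[i].cgid == cgid)
			demo_map[i].used = 0;
}

/* Walks one scenario: seed tracker 10, nest 11 and 12 below it, release 11. */
static void demo_scenario(void)
{
	assert(demo_update(10, 10) == 0); /* assumed userspace seeding */
	demo_on_mkdir(11, 10);            /* child of the tracker */
	demo_on_mkdir(12, 11);            /* grandchild inherits via 11 */
	demo_on_mkdir(20, 5);             /* parent 5 is untracked: ignored */
	demo_on_release(11);              /* only the entry for 11 goes away */
}
```

Because each entry stores the tracker ID directly rather than a parent link, a lookup is a single map access and releasing an intermediate cgroup does not affect its descendants' entries.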
22 changes: 22 additions & 0 deletions bpf/cgroup/cgtracker.h
@@ -0,0 +1,22 @@
// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
/* Copyright Authors of Cilium */

#ifndef CGTRACKER_H__
#define CGTRACKER_H__

struct {
__uint(type, BPF_MAP_TYPE_HASH);
__uint(max_entries, 1);
__type(key, __u64); /* cgroup id */
__type(value, __u64); /* tracker cgroup id */
} tg_cgtracker_map SEC(".maps");

FUNC_INLINE __u64 cgrp_get_tracker_id(__u64 cgid)
{
__u64 *ret;

ret = map_lookup_elem(&tg_cgtracker_map, &cgid);
return ret ? *ret : 0;
}

#endif /* CGTRACKER_H__ */
2 changes: 1 addition & 1 deletion bpf/include/api.h
@@ -236,7 +236,7 @@ static int BPF_FUNC(fib_lookup, void *ctx, struct bpf_fib_lookup *params, uint32
/* Current Process Info */
static uint64_t BPF_FUNC(get_current_task);
static uint64_t BPF_FUNC(get_current_cgroup_id);
-static uint64_t BPF_FUNC(get_current_ancestor_cgroup_id);
+static uint64_t BPF_FUNC(get_current_ancestor_cgroup_id, int ancestor_level);
static uint64_t BPF_FUNC(get_current_uid_gid);
static uint64_t BPF_FUNC(get_current_pid_tgid);

19 changes: 19 additions & 0 deletions bpf/lib/bpf_helpers.h
@@ -99,4 +99,23 @@ FUNC_INLINE void compiler_barrier(void)

#define SEC(name) __attribute__((section(name), used))

/*
* Helper macros to manipulate data structures
*/

/* offsetof() definition that uses __builtin_offsetof() might not preserve field
* offset CO-RE relocation properly, so force-redefine offsetof() using
* old-school approach which works with CO-RE correctly
*/
#undef offsetof
#define offsetof(type, member) ((unsigned long)&((type *)0)->member)

/* redefined container_of() to ensure we use the above offsetof() macro */
#undef container_of
#define container_of(ptr, type, member) \
({ \
void *__mptr = (void *)(ptr); \
((type *)(__mptr - offsetof(type, member))); \
})

#endif //__BPF_HELPERS_
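
A small userspace sketch of what container_of() does, for readers unfamiliar with the idiom: given a pointer to an embedded member, it subtracts the member's offset to recover the enclosing structure. The demo_* names are hypothetical. The macros use the same old-school offset formula as the hunk above, but with char * arithmetic and no statement expression so that they do not rely on GNU extensions; CO-RE relocation is not exercised here.

```c
#include <assert.h>

/* Hypothetical demo macros; same formula as the hunk's offsetof()/container_of(). */
#define demo_offsetof(type, member) ((unsigned long)&((type *)0)->member)
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - demo_offsetof(type, member)))

/* Hypothetical stand-in for struct cgroup_subsys_state. */
struct demo_css {
	int refcnt;
};

/* Hypothetical stand-in for struct cgroup, with the state embedded as `self`. */
struct demo_cgroup {
	unsigned long long id;
	struct demo_css self;
};

/* Recover the enclosing demo_cgroup from a pointer to its embedded member. */
static struct demo_cgroup *demo_css_to_cgroup(struct demo_css *css)
{
	return demo_container_of(css, struct demo_cgroup, self);
}
```

This is the same step cgroup_get_parent_id() takes when it turns self.parent into a struct cgroup pointer.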
1 change: 1 addition & 0 deletions bpf/lib/process.h
@@ -266,6 +266,7 @@ struct msg_ns {

struct msg_k8s {
__u64 cgrpid;
__u64 cgrp_tracker_id;
char docker_id[DOCKER_ID_LENGTH];
}; // All fields aligned so no 'packed' attribute.

15 changes: 4 additions & 11 deletions bpf/process/bpf_process_event.h
@@ -8,6 +8,7 @@

#include "bpf_cgroup.h"
#include "bpf_cred.h"
#include "cgroup/cgtracker.h"

#define ENAMETOOLONG 36 /* File name too long */

@@ -536,16 +537,6 @@ __event_get_current_cgroup_name(struct cgroup *cgrp, struct msg_k8s *kube)
{
const char *name;

-/* TODO: check if we have Tetragon cgroup configuration and that the
-* tracking cgroup ID is set. If so then query the bpf map for
-* the corresponding tracking cgroup name.
-*/
-
-/* TODO: we gather current cgroup context, switch to tracker see above,
-* and if that fails for any reason or if we don't have the cgroup name
-* of tracker, then we can continue with current context.
-*/
-
name = get_cgroup_name(cgrp);
if (name)
probe_read_str(kube->docker_id, KN_NAME_LENGTH, name);
@@ -587,7 +578,9 @@ __event_get_cgroup_info(struct task_struct *task, struct msg_k8s *kube)

/* Collect event cgroup ID */
kube->cgrpid = __tg_get_current_cgroup_id(cgrp, cgrpfs_magic);
-if (!kube->cgrpid)
+if (kube->cgrpid)
+kube->cgrp_tracker_id = cgrp_get_tracker_id(kube->cgrpid);
+else
flags |= EVENT_ERROR_CGROUP_ID;

/* Get the cgroup name of this event. */
90 changes: 90 additions & 0 deletions cmd/tetra/cgtracker/cgtracker.go
@@ -0,0 +1,90 @@
// SPDX-License-Identifier: Apache-2.0
// Copyright Authors of Tetragon

package cgtracker

import (
"fmt"
"log"
"path/filepath"

"github.com/cilium/tetragon/pkg/cgidarg"
"github.com/cilium/tetragon/pkg/cgtracker"
"github.com/cilium/tetragon/pkg/defaults"
"github.com/spf13/cobra"
)

func New() *cobra.Command {
ret := &cobra.Command{
Use: "cgtracker",
Short: "manage cgtracker map (only for debugging)",
Hidden: true,
SilenceUsage: true,
}

ret.AddCommand(
dumpCmd(),
addCommand(),
)

return ret
}

func dumpCmd() *cobra.Command {
mapFname := filepath.Join(defaults.DefaultMapRoot, defaults.DefaultMapPrefix, cgtracker.MapName)
ret := &cobra.Command{
Use: "dump",
Short: "dump cgtracker map state",
Args: cobra.ExactArgs(0),
RunE: func(_ *cobra.Command, _ []string) error {
m, err := cgtracker.OpenMap(mapFname)
if err != nil {
log.Fatal(err)
}
defer m.Close()

vals, err := m.Dump()
if err != nil {
return err
}
for tracker, tracked := range vals {
fmt.Printf("%d: %v\n", tracker, tracked)
}
return nil
},
}

flags := ret.Flags()
flags.StringVar(&mapFname, "map-fname", mapFname, "cgtracker map filename")
return ret
}

func addCommand() *cobra.Command {
mapFname := filepath.Join(defaults.DefaultMapRoot, defaults.DefaultMapPrefix, cgtracker.MapName)
ret := &cobra.Command{
Use: "add cg_tracked cg_tracker",
Short: "add cgtracker entry",
Args: cobra.ExactArgs(2),
RunE: func(_ *cobra.Command, args []string) error {
tracked, err := cgidarg.Parse(args[0])
if err != nil {
return err
}
tracker, err := cgidarg.Parse(args[1])
if err != nil {
return err
}
m, err := cgtracker.OpenMap(mapFname)
if err != nil {
return err
}
defer m.Close()
return m.Add(tracked, tracker)

},
}

flags := ret.Flags()
flags.StringVar(&mapFname, "map-fname", mapFname, "cgtracker map filename")
return ret
}
2 changes: 2 additions & 0 deletions cmd/tetra/commands_linux.go
@@ -5,6 +5,7 @@ package main

import (
"github.com/cilium/tetragon/cmd/tetra/bugtool"
"github.com/cilium/tetragon/cmd/tetra/cgtracker"
"github.com/cilium/tetragon/cmd/tetra/cri"
"github.com/cilium/tetragon/cmd/tetra/debug"
"github.com/cilium/tetragon/cmd/tetra/loglevel"
@@ -24,4 +25,5 @@ func addCommands(rootCmd *cobra.Command) {
rootCmd.AddCommand(probe.New())
rootCmd.AddCommand(loglevel.New())
rootCmd.AddCommand(cri.New())
rootCmd.AddCommand(cgtracker.New())
Member

Since this is only for debugging, could you put this behind the debug command? See https://github.com/cilium/tetragon/blob/main/cmd/tetra/debug/debug.go

Contributor Author

Let's discuss. I'm not sure what the benefit of having everything under debug is. I can do it as a follow-up if the consensus is that this is what we want.

}
1 change: 1 addition & 0 deletions contrib/tester-progs/.gitignore
@@ -24,3 +24,4 @@ change-capabilities
direct-write-tester
user-stacktrace
pause
/test-helper
4 changes: 4 additions & 0 deletions contrib/tester-progs/Makefile
@@ -25,6 +25,7 @@ PROGS = sigkill-tester \
direct-write-tester \
change-capabilities \
user-stacktrace \
test-helper \
pause


@@ -98,6 +99,9 @@ getcpu-i386: FORCE
user-stacktrace: FORCE
go build -o user-stacktrace ./go/user-stacktrace

test-helper: FORCE
go build -o test-helper ./go/test-helper

.PHONY: clean
clean:
rm -f $(PROGS)
12 changes: 12 additions & 0 deletions contrib/tester-progs/go/test-helper/main.go
@@ -0,0 +1,12 @@
// SPDX-License-Identifier: Apache-2.0
// Copyright Authors of Tetragon

package main

import (
"github.com/cilium/tetragon/pkg/testutils/progs"
)

func main() {
progs.TestHelperMain()
}
4 changes: 4 additions & 0 deletions docs/data/tetragon_flags.yaml

Some generated files are not rendered by default.

6 changes: 3 additions & 3 deletions operator/podinfo/podinfo_controller.go
@@ -9,7 +9,7 @@ import (
"reflect"

ciliumiov1alpha1 "github.com/cilium/tetragon/pkg/k8s/apis/cilium.io/v1alpha1"
-"github.com/cilium/tetragon/pkg/process"
+"github.com/cilium/tetragon/pkg/podhelpers"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -100,7 +100,7 @@ func equal(pod *corev1.Pod, podInfo *ciliumiov1alpha1.PodInfo) bool {
Controller: &controller,
BlockOwnerDeletion: &blockOwnerDeletion,
}
-workloadObject, workloadType := process.GetWorkloadMetaFromPod(pod)
+workloadObject, workloadType := podhelpers.GetWorkloadMetaFromPod(pod)
return pod.Name == podInfo.Name &&
pod.Namespace == podInfo.Namespace &&
pod.Status.PodIP == podInfo.Status.PodIP &&
@@ -129,7 +129,7 @@ func generatePodInfo(pod *corev1.Pod) *ciliumiov1alpha1.PodInfo {
for _, podIP := range pod.Status.PodIPs {
podIPs = append(podIPs, ciliumiov1alpha1.PodIP{IP: podIP.IP})
}
-workloadObject, workloadType := process.GetWorkloadMetaFromPod(pod)
+workloadObject, workloadType := podhelpers.GetWorkloadMetaFromPod(pod)
controller := true
blockOwnerDeletion := true
return &ciliumiov1alpha1.PodInfo{
4 changes: 2 additions & 2 deletions operator/podinfo/podinfo_controller_test.go
@@ -12,7 +12,7 @@ import (
"testing"

ciliumv1alpha1 "github.com/cilium/tetragon/pkg/k8s/apis/cilium.io/v1alpha1"
-"github.com/cilium/tetragon/pkg/process"
+"github.com/cilium/tetragon/pkg/podhelpers"
"github.com/stretchr/testify/assert"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/errors"
@@ -121,7 +121,7 @@ func TestGeneratePod(t *testing.T) {
for _, podIP := range pod.Status.PodIPs {
podIPs = append(podIPs, ciliumv1alpha1.PodIP{IP: podIP.IP})
}
-workloadObject, workloadType := process.GetWorkloadMetaFromPod(pod)
+workloadObject, workloadType := podhelpers.GetWorkloadMetaFromPod(pod)
expectedPodInfo := &ciliumv1alpha1.PodInfo{
ObjectMeta: metav1.ObjectMeta{
Name: pod.Name,