sysdump: don't specify --follow while collecting hubble flows
Currently, hubble flows are retrieved during sysdump collection by passing
the `--follow` parameter to hubble observe. According to the comment,
this appeared to be a necessary hack to prevent the "requested data has
been overwritten and is no longer available" error. The consequence,
however, is that the hubble observe command becomes blocking, and we rely
on the specified timeout alone for its termination. When capturing a sysdump,
though, we are interested in storing as many flows as possible from before
that moment (e.g., to investigate the causes of a connectivity test failure),
not the ones occurring during the collection of the sysdump itself.

Given that the original reason for using the `--follow` parameter was
fixed quite some time ago [1] and the fix is included in every Cilium
version supported today, let's just get rid of it. The side effects
include the early termination of the collection process as soon as
all the flows have been retrieved, as well as a reduction in the
size of the sysdumps when increasing the timeout period, given that
we no longer block until its expiration (this is especially relevant
in CI tests, where sysdumps are currently too large to be uploaded to GH).
Nonetheless, the timeout parameter is preserved to interrupt the
retrieval if it takes too long.

[1]: cilium/cilium#17046

Signed-off-by: Marco Iorio <[email protected]>
giorio94 authored and aditighag committed Jan 22, 2024
1 parent b6c88b1 commit 5cfb677
Showing 1 changed file with 1 addition and 3 deletions.
4 changes: 1 addition & 3 deletions sysdump/sysdump.go
@@ -1976,11 +1976,9 @@ func (c *Collector) submitHubbleFlowsTasks(_ context.Context, pods []*corev1.Pod
 	for _, p := range pods {
 		p := p
 		if err := c.Pool.Submit(fmt.Sprintf("hubble-flows-"+p.Name), func(ctx context.Context) error {
-			// HACK: Run hubble observe with --follow to avoid hitting the code path that triggers
-			// https://github.com/cilium/cilium/issues/17036.
 			b, e, err := c.Client.ExecInPodWithStderr(ctx, p.Namespace, p.Name, containerName, []string{
 				"timeout", "--signal", "SIGINT", "--preserve-status", hubbleFlowsTimeout, "bash", "-c",
-				fmt.Sprintf("hubble observe --follow --last %d --debug -o jsonpb", c.Options.HubbleFlowsCount),
+				fmt.Sprintf("hubble observe --last %d --debug -o jsonpb", c.Options.HubbleFlowsCount),
 			})
 			if err != nil {
 				return fmt.Errorf("failed to collect hubble flows for %q in namespace %q: %w: %s", p.Name, p.Namespace, err, e.String())