Live debugging #999

Merged
merged 18 commits into from
Jun 11, 2024
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -19,6 +19,11 @@ Main (unreleased)

- Add an `otelcol.exporter.kafka` component to send OTLP metrics, logs, and traces to Kafka.

- Added `live debugging` to the UI. Live debugging streams data as it flows through components to help you debug telemetry data.
Individual components must be updated to support live debugging. (@wildum)

- Added live debugging support for `prometheus.relabel`. (@wildum)

### Enhancements

- (_Public preview_) Add native histogram support to `otelcol.receiver.prometheus`. (@wildum)
40 changes: 40 additions & 0 deletions docs/sources/reference/config-blocks/livedebugging.md
@@ -0,0 +1,40 @@
---
canonical: https://grafana.com/docs/alloy/latest/reference/config-blocks/livedebugging/
description: Learn about the livedebugging configuration block
menuTitle: livedebugging
title: livedebugging block
---

<span class="badge docs-labels__stage docs-labels__item">Experimental</span>

# livedebugging block

{{< docs/shared lookup="stability/experimental.md" source="alloy" version="<ALLOY_VERSION>" >}}

`livedebugging` is an optional configuration block that enables the [live debugging feature][debug], which streams real-time data from your components directly to the {{< param "PRODUCT_NAME" >}} UI.

By default, [live debugging][debug] is disabled and must be explicitly enabled through this configuration block to make the debugging data visible in the {{< param "PRODUCT_NAME" >}} UI.

{{< admonition type="note" >}}
The live debugging feature uses the {{< param "PRODUCT_NAME" >}} UI to provide detailed insights into the data flowing through your pipelines.
To ensure that your data remains secure while live debugging is enabled, configure TLS in the [http block][].
{{< /admonition >}}

## Example

```alloy
livedebugging {
  enabled = true
}
```

## Arguments

The following arguments are supported:

| Name | Type | Description | Default | Required |
| --------- | ------ | ----------------------------------- | ------- | -------- |
| `enabled` | `bool` | Enables the live debugging feature. | `false` | no |

[debug]: ../../../tasks/debug/
[http block]: ../http/
28 changes: 27 additions & 1 deletion docs/sources/tasks/debug.md
@@ -48,7 +48,7 @@ Clicking a component in the graph navigates to the [Component detail page](#comp

### Component detail page

-{{< figure src="/media/docs/alloy/ui_component_detail_page.png" alt="Alloy UI component detail page" >}}
+{{< figure src="/media/docs/alloy/ui_component_detail_page_2.png" alt="Alloy UI component detail page" >}}

The component detail page shows the following information for each component:

@@ -57,6 +57,8 @@ The component detail page shows the following information for each component:
* The current exports for the component.
* The current debug info for the component (if the component has debug info).

From the component detail page, you can also open the component's documentation or its [Live Debugging page](#live-debugging-page).

> Values marked as a [secret][] are obfuscated and display as the text `(secret)`.

### Clustering page
@@ -70,12 +72,36 @@ The clustering page shows the following information for each cluster node:
* The node's current state (Viewer/Participant/Terminating).
* The local node that serves the UI.

### Live Debugging page

{{< figure src="/media/docs/alloy/ui_live_debugging_page.png" alt="Alloy UI live debugging page" >}}

Live debugging provides a real-time stream of debugging data from a component. You can access this page from the corresponding [Component detail page](#component-detail-page).

Live debugging allows you to do the following:

* Pause and clear the data stream.
* Sample data and disable auto-scrolling to handle heavy loads.
* Search through the data using keywords.
* Copy the entire data stream to the clipboard.

The format and content of the debugging data vary depending on the component type.

{{< admonition type="note" >}}
Live debugging is not yet available in all components.

Supported components:
* `prometheus.relabel`
{{< /admonition >}}


## Debugging using the UI

To debug using the UI:

* Ensure that no component is reported as unhealthy.
* Ensure that the arguments and exports for misbehaving components appear correct.
* Ensure that the live debugging data meets your expectations.

## Examining logs

13 changes: 9 additions & 4 deletions internal/alloycli/cmd_run.go
@@ -32,6 +32,7 @@ import (
"github.com/grafana/alloy/internal/service"
httpservice "github.com/grafana/alloy/internal/service/http"
"github.com/grafana/alloy/internal/service/labelstore"
"github.com/grafana/alloy/internal/service/livedebugging"
otel_service "github.com/grafana/alloy/internal/service/otel"
remotecfgservice "github.com/grafana/alloy/internal/service/remotecfg"
uiservice "github.com/grafana/alloy/internal/service/ui"
@@ -273,8 +274,11 @@ func (fr *alloyRun) Run(configPath string) error {
return fmt.Errorf("failed to create the remotecfg service: %w", err)
}

liveDebuggingService := livedebugging.New()

uiService := uiservice.New(uiservice.Options{
-UIPrefix: fr.uiPrefix,
+UIPrefix:        fr.uiPrefix,
+CallbackManager: liveDebuggingService.Data().(livedebugging.CallbackManager),
})

otelService := otel_service.New(l)
@@ -292,12 +296,13 @@ func (fr *alloyRun) Run(configPath string) error {
Reg: reg,
MinStability: fr.minStability,
Services: []service.Service{
-httpService,
-uiService,
clusterService,
-otelService,
+httpService,
labelService,
+liveDebuggingService,
+otelService,
remoteCfgService,
+uiService,
},
})

6 changes: 6 additions & 0 deletions internal/component/component.go
@@ -113,3 +113,9 @@ type DebugComponent interface {
// DebugInfo must be safe for calling concurrently.
DebugInfo() interface{}
}

// LiveDebugging is an interface marker used by the components that support the live debugging feature.
type LiveDebugging interface {
// LiveDebugging marks the component for live debugging support. It is never invoked.
LiveDebugging()
}
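
The `LiveDebugging` marker above carries no behavior, so support can be detected with a plain type assertion. The following is a minimal sketch of that idea, not the runtime's actual check, which this diff does not show. The types `relabelLike` and `plainComponent` and the helper `supportsLiveDebugging` are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// LiveDebugging mirrors the marker interface from this diff: the method is
// never called, it only tags a type as supporting live debugging.
type LiveDebugging interface {
	LiveDebugging()
}

// relabelLike is a hypothetical component that opts in to the marker.
type relabelLike struct{}

func (relabelLike) LiveDebugging() {}

// plainComponent is a hypothetical component that does not opt in.
type plainComponent struct{}

// supportsLiveDebugging (hypothetical helper) reports whether c carries the marker.
func supportsLiveDebugging(c any) bool {
	_, ok := c.(LiveDebugging)
	return ok
}

func main() {
	fmt.Println(supportsLiveDebugging(relabelLike{}))    // true
	fmt.Println(supportsLiveDebugging(plainComponent{})) // false
}
```

The compile-time assertion `_ component.LiveDebugging = (*Component)(nil)` used in `relabel.go` complements this: it makes the build fail if the marker method is ever removed from the component.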
31 changes: 26 additions & 5 deletions internal/component/prometheus/relabel/relabel.go
@@ -10,6 +10,7 @@ import (
"github.com/grafana/alloy/internal/component/prometheus"
"github.com/grafana/alloy/internal/featuregate"
"github.com/grafana/alloy/internal/service/labelstore"
"github.com/grafana/alloy/internal/service/livedebugging"
lru "github.com/hashicorp/golang-lru/v2"
prometheus_client "github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/prometheus/model/exemplar"
@@ -22,9 +23,11 @@ import (
"go.uber.org/atomic"
)

const name = "prometheus.relabel"

func init() {
component.Register(component.Registration{
-Name: "prometheus.relabel",
+Name: name,
Stability: featuregate.StabilityGenerallyAvailable,
Args: Arguments{},
Exports: Exports{},
@@ -85,12 +88,15 @@ type Component struct {
exited atomic.Bool
ls labelstore.LabelStore

debugDataPublisher livedebugging.DebugDataPublisher

cacheMut sync.RWMutex
cache *lru.Cache[uint64, *labelAndID]
}

var (
-_ component.Component = (*Component)(nil)
+_ component.Component     = (*Component)(nil)
+_ component.LiveDebugging = (*Component)(nil)
)

// New creates a new prometheus.relabel component.
@@ -99,14 +105,21 @@ func New(o component.Options, args Arguments) (*Component, error) {
if err != nil {
return nil, err
}

debugDataPublisher, err := o.GetServiceData(livedebugging.ServiceName)
if err != nil {
return nil, err
}

data, err := o.GetServiceData(labelstore.ServiceName)
if err != nil {
return nil, err
}
c := &Component{
-opts:  o,
-cache: cache,
-ls:    data.(labelstore.LabelStore),
+opts:               o,
+cache:              cache,
+ls:                 data.(labelstore.LabelStore),
+debugDataPublisher: debugDataPublisher.(livedebugging.DebugDataPublisher),
}
c.metricsProcessed = prometheus_client.NewCounter(prometheus_client.CounterOpts{
Name: "alloy_prometheus_relabel_metrics_processed",
@@ -259,6 +272,12 @@ func (c *Component) relabel(val float64, lbls labels.Labels) labels.Labels {
// Set the cache size to the cache.len
// TODO(@mattdurham): Instead of setting this each time could collect on demand for better performance.
c.cacheSize.Set(float64(c.cache.Len()))

componentID := livedebugging.ComponentID(c.opts.ID)
if c.debugDataPublisher.IsActive(componentID) {
c.debugDataPublisher.Publish(componentID, fmt.Sprintf("%s => %s", lbls.String(), relabelled.String()))
}

return relabelled
}

@@ -299,6 +318,8 @@ func (c *Component) addToCache(originalID uint64, lbls labels.Labels, keep bool)
})
}

func (c *Component) LiveDebugging() {}

// labelAndID stores both the globalrefid for the label and the id itself. We store the id so that it doesn't have
// to be recalculated again.
type labelAndID struct {
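
The `relabel` hunk above calls `IsActive` before `Publish`, so the `fmt.Sprintf` cost is only paid while someone is watching the component. The sketch below illustrates that gating pattern under stated assumptions. Only `ComponentID`, `IsActive`, and `Publish` appear in this diff. The `hub` type, `AddCallback`, and `relabelOnce` are hypothetical stand-ins, since the internals of the `livedebugging` package are not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// ComponentID matches the type name used in the diff; everything else in
// this file is a hypothetical stand-in for the livedebugging package.
type ComponentID string

// hub fans published data out to the callbacks registered per component.
type hub struct {
	mu        sync.RWMutex
	callbacks map[ComponentID][]func(string)
}

func newHub() *hub {
	return &hub{callbacks: make(map[ComponentID][]func(string))}
}

// AddCallback is the assumed UI-facing side: a viewer subscribes to a component.
func (h *hub) AddCallback(id ComponentID, cb func(string)) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.callbacks[id] = append(h.callbacks[id], cb)
}

// IsActive reports whether at least one viewer is subscribed to the component.
func (h *hub) IsActive(id ComponentID) bool {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return len(h.callbacks[id]) > 0
}

// Publish delivers data to every callback registered for the component.
func (h *hub) Publish(id ComponentID, data string) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	for _, cb := range h.callbacks[id] {
		cb(data)
	}
}

// relabelOnce imitates the hot path: the string is only built when a viewer exists.
func relabelOnce(h *hub, id ComponentID, before, after string) {
	if h.IsActive(id) {
		h.Publish(id, fmt.Sprintf("%s => %s", before, after))
	}
}

func main() {
	h := newHub()
	id := ComponentID("prometheus.relabel.example")

	var got []string
	relabelOnce(h, id, `{job="a"}`, `{job="b"}`) // no viewer yet: nothing is formatted or sent

	h.AddCallback(id, func(s string) { got = append(got, s) })
	relabelOnce(h, id, `{job="a"}`, `{job="b"}`)

	fmt.Println(got) // [{job="a"} => {job="b"}]
}
```

In this sketch the callbacks run while the read lock is held, which keeps the code short but means a callback must not call `AddCallback` itself.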
37 changes: 23 additions & 14 deletions internal/component/prometheus/relabel/relabel_test.go
@@ -1,6 +1,7 @@
package relabel

import (
"fmt"
"math"
"strconv"
"testing"
@@ -13,6 +14,7 @@ import (
"github.com/grafana/alloy/internal/component/prometheus"
"github.com/grafana/alloy/internal/runtime/componenttest"
"github.com/grafana/alloy/internal/service/labelstore"
"github.com/grafana/alloy/internal/service/livedebugging"
"github.com/grafana/alloy/internal/util"
"github.com/grafana/alloy/syntax"
prom "github.com/prometheus/client_golang/prometheus"
@@ -67,13 +69,11 @@ func TestNil(t *testing.T) {
return ref, nil
}))
relabeller, err := New(component.Options{
-ID:            "1",
-Logger:        util.TestAlloyLogger(t),
-OnStateChange: func(e component.Exports) {},
-Registerer:    prom.NewRegistry(),
-GetServiceData: func(name string) (interface{}, error) {
-return labelstore.New(nil, prom.DefaultRegisterer), nil
-},
+ID:             "1",
+Logger:         util.TestAlloyLogger(t),
+OnStateChange:  func(e component.Exports) {},
+Registerer:     prom.NewRegistry(),
+GetServiceData: getServiceData,
}, Arguments{
ForwardTo: []storage.Appendable{fanout},
MetricRelabelConfigs: []*alloy_relabel.Config{
@@ -154,13 +154,11 @@ func generateRelabel(t *testing.T) *Component {
return ref, nil
}))
relabeller, err := New(component.Options{
-ID:            "1",
-Logger:        util.TestAlloyLogger(t),
-OnStateChange: func(e component.Exports) {},
-Registerer:    prom.NewRegistry(),
-GetServiceData: func(name string) (interface{}, error) {
-return labelstore.New(nil, prom.DefaultRegisterer), nil
-},
+ID:             "1",
+Logger:         util.TestAlloyLogger(t),
+OnStateChange:  func(e component.Exports) {},
+Registerer:     prom.NewRegistry(),
+GetServiceData: getServiceData,
}, Arguments{
ForwardTo: []storage.Appendable{fanout},
MetricRelabelConfigs: []*alloy_relabel.Config{
@@ -225,3 +223,14 @@ func TestRuleGetter(t *testing.T) {
require.Equal(t, gotUpdated[0].SourceLabels, gotOriginal[0].SourceLabels)
require.Equal(t, gotUpdated[0].Regex, gotOriginal[0].Regex)
}

func getServiceData(name string) (interface{}, error) {
switch name {
case labelstore.ServiceName:
return labelstore.New(nil, prom.DefaultRegisterer), nil
case livedebugging.ServiceName:
return livedebugging.NewLiveDebugging(), nil
default:
return nil, fmt.Errorf("service not found %s", name)
}
}
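
`New` in `relabel.go` converts the looked-up service data with an unchecked type assertion, which panics if a service ever returns a different type. A possible alternative is the comma-ok form, sketched below. This is an illustration, not code from the PR: the `publisher` interface, `noopPublisher`, `lookupPublisher`, and the literal service names `"livedebugging"` and `"labelstore"` are all assumptions, because the real `ServiceName` constants and interfaces are defined outside this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// publisher is a hypothetical stand-in for livedebugging.DebugDataPublisher.
type publisher interface {
	IsActive(id string) bool
}

// noopPublisher is a hypothetical publisher that never has viewers.
type noopPublisher struct{}

func (noopPublisher) IsActive(string) bool { return false }

// getServiceData mimics the test helper: known names return data, others an error.
// The service name strings are assumed values, not the real constants.
func getServiceData(name string) (any, error) {
	switch name {
	case "livedebugging":
		return noopPublisher{}, nil
	case "labelstore":
		return "not a publisher", nil
	default:
		return nil, fmt.Errorf("service not found %s", name)
	}
}

// lookupPublisher uses the comma-ok assertion so a mismatched type
// produces an error instead of a panic.
func lookupPublisher(name string) (publisher, error) {
	data, err := getServiceData(name)
	if err != nil {
		return nil, err
	}
	p, ok := data.(publisher)
	if !ok {
		return nil, errors.New("service data does not implement publisher")
	}
	return p, nil
}

func main() {
	_, err := lookupPublisher("livedebugging")
	fmt.Println(err == nil) // true

	_, err = lookupPublisher("labelstore")
	fmt.Println(err != nil) // true

	_, err = lookupPublisher("missing")
	fmt.Println(err) // service not found missing
}
```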
3 changes: 3 additions & 0 deletions internal/runtime/componenttest/componenttest.go
@@ -10,6 +10,7 @@ import (
"time"

"github.com/grafana/alloy/internal/service/labelstore"
"github.com/grafana/alloy/internal/service/livedebugging"
"github.com/prometheus/client_golang/prometheus"
"go.uber.org/atomic"

@@ -163,6 +164,8 @@ func (c *Controller) buildComponent(dataPath string, args component.Arguments) (
switch name {
case labelstore.ServiceName:
return labelstore.New(nil, prometheus.DefaultRegisterer), nil
case livedebugging.ServiceName:
return livedebugging.NewLiveDebugging(), nil
default:
return nil, fmt.Errorf("no service named %s defined", name)
}