[processor/routing] Allows routing by values that are not string #13158

Closed
gfonseca-tc opened this issue Aug 9, 2022 · 7 comments · Fixed by #13636
Assignees
Labels
enhancement New feature or request priority:p2 Medium processor/routing Routing processor

Comments

@gfonseca-tc
Contributor

gfonseca-tc commented Aug 9, 2022

Is your feature request related to a problem? Please describe.

When routing data through collector pipelines, the routing processor only lets us match against strings. It would be very useful to also match against other types, or even ranges. That would let us use the transform processor to add resource attributes, as in the example below, and route on them later with the routing processor. Currently, the approach below drops everything.

receivers:
    otlp:
      protocols:
        http:
          endpoint: "0.0.0.0:4318"
processors: 
  transform: 
    traces: 
      queries: 
        - set(resource.attributes["backend1"], IsMatch(resource.attributes["backend"], ".*vendor1.*"))
        - set(resource.attributes["backend2"], IsMatch(resource.attributes["backend"], ".*vendor2.*"))
  routing/grafana-traces:
    attribute_source: resource 
    from_attribute: backend1
    default_exporters:
    - file/drop
    table:
    - value: true
      exporters: [otlp/backend1]
    - value: false
      exporters: [file/drop]
 

This approach lets me select which backend to send data to by adding a single resource field with multiple values. The IsMatch function lets me match a regex in TQL, but it only returns a boolean. That value always ends up in the default route because it's not a string.
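The fall-through described above can be sketched in a few lines of Go. This is only an illustration under the assumption that the table lookup accepts string values and nothing else; `lookup` and its parameters are hypothetical names, not the routing processor's actual code.

```go
package main

import "fmt"

// lookup is a hypothetical stand-in for a string-keyed routing table.
// It is NOT the routing processor's real implementation.
func lookup(table map[string][]string, defaults []string, value interface{}) []string {
	s, ok := value.(string) // only string values can be matched
	if !ok {
		return defaults // a bool, int, etc. never reaches the table
	}
	if exporters, found := table[s]; found {
		return exporters
	}
	return defaults
}

func main() {
	table := map[string][]string{"true": {"otlp/backend1"}}
	defaults := []string{"file/drop"}

	fmt.Println(lookup(table, defaults, true))   // boolean, as returned by IsMatch: [file/drop]
	fmt.Println(lookup(table, defaults, "true")) // the same value as a string: [otlp/backend1]
}
```

Under that assumption, a table entry written as `value: true` can never be hit by a boolean attribute, which matches the behavior reported here.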

Describe the solution you'd like
Allow the routing processor to match against values that are not strings. It might even make sense to have it compare values, so that data can be routed to a certain place if a given value is greater than X.

Describe alternatives you've considered
The way I'm solving this problem right now is by adding the vendor1 attribute using an attributes processor (to be able to use the filter it has) and then grouping by that attribute later, to make it a resource attribute. I can then use its value in the routing processor.

Additional context
I've tried multiple approaches to get this pipeline working, and most paths I took would look better than the solution I ended up with. This issue is a request for what seems to me the easiest way of simplifying that pipeline. There are other problems I found that I will place in new issues.

@jpkrohling
Member

Just to confirm my understanding: in the example you've given, would you like data to be sent to backend1 if this attribute has true as the value?

@jpkrohling jpkrohling self-assigned this Aug 10, 2022
@jpkrohling jpkrohling added the processor/routing Routing processor label Aug 10, 2022
@gfonseca-tc
Contributor Author

Yes @jpkrohling, exactly. A complete example would be too verbose, but this is the main idea. I would have an ingest pipeline that runs several TQL queries, adding attributes as necessary, and the data would then be fanned out to other pipelines, each containing a routing processor like the one I provided and a backend. If the key used for that backend is set to true, the data passes through; otherwise it's dropped. That way I can have any number of backends and send data to them selectively.

@mx-psi mx-psi added the enhancement New feature or request label Aug 11, 2022
@kovrus
Member

kovrus commented Aug 12, 2022

> When routing data using the collector pipelines the routing processor only allows us to match against strings. It would be very useful if we could also match by other types or even ranges.

Maybe we can use TQL expressions in the routing table of the routing processor, the same way the transform processor uses them. Then it should be possible to solve the use case described above with the following configuration:

...
processors:
  routing:
    default_exporters:
    - file/drop
    table:
    - exporters: [otlp/backend1] 
      statement: route() where IsMatch(resource.attributes["backend"], ".*vendor1.*") == true
    - exporters: [file/backend2]
      statement: route() where IsMatch(resource.attributes["backend"], ".*vendor2.*") == true
...

With TQL, the routing table would no longer be limited to matching against strings and could support other types. Currently, TQL supports only the == and != comparison operators, but work on extending it is under way in #12491. Once that is done, I think it would be possible to write more complex expressions to support ranges, e.g.:

...
processors:
  routing:
    ...
    table:
    - exporters: [file/backend2]
      statement: route() where resource.attributes["x"] > 100 and resource.attributes["x"] < 1000
    - exporters: [otlp/backend1] 
      statement: drop_key(attributes, "y") where attributes["y"] != true 
...
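A condition-driven routing table of the kind proposed above could be modeled roughly as follows. This is a minimal sketch, not the routing processor's design: the `attrs`, `entry`, `inRange`, and `selectExporters` names are hypothetical, and it assumes that every matching entry contributes its exporters and that the defaults apply only when nothing matches.

```go
package main

import "fmt"

// attrs is a hypothetical stand-in for resource attributes.
type attrs map[string]interface{}

// entry pairs a condition with the exporters that receive matching data.
type entry struct {
	cond      func(attrs) bool
	exporters []string
}

// inRange builds a condition that holds when attribute key is an int64
// strictly between lo and hi.
func inRange(key string, lo, hi int64) func(attrs) bool {
	return func(a attrs) bool {
		v, ok := a[key].(int64)
		return ok && v > lo && v < hi
	}
}

// selectExporters gathers exporters from every matching entry (an assumption)
// and falls back to defaults when no entry matches.
func selectExporters(table []entry, defaults []string, a attrs) []string {
	var out []string
	for _, e := range table {
		if e.cond(a) {
			out = append(out, e.exporters...)
		}
	}
	if len(out) == 0 {
		return defaults
	}
	return out
}

func main() {
	table := []entry{{cond: inRange("x", 100, 1000), exporters: []string{"file/backend2"}}}
	defaults := []string{"file/drop"}

	fmt.Println(selectExporters(table, defaults, attrs{"x": int64(500)}))  // in range
	fmt.Println(selectExporters(table, defaults, attrs{"x": int64(5000)})) // out of range
	fmt.Println(selectExporters(table, defaults, attrs{"x": "500"}))       // wrong type
}
```

Because the condition is an arbitrary predicate, booleans, numbers, and ranges are all handled the same way, which is the property the string-keyed table lacks.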

@TylerHelmuth
Member

TylerHelmuth commented Aug 23, 2022

Love this use of the TQL. The route function is probably not useful because the routing processor contains all the logic, so what we really want from the TQL is a condition. Right now there is no way to get a condition without also providing an Invocation and the where keyword, but it would be easy to expose a ParseCondition function that interprets condition statements.

An alternative would be to add a noop function, but I like the idea of exposing conditions more.

@jpkrohling jpkrohling assigned kovrus and unassigned jpkrohling Aug 23, 2022
@TylerHelmuth
Member

Quick mock-up of how TQL could expose only conditions:

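import (
	"fmt"

	"go.uber.org/multierr"
)

// noop wraps a bare condition in a query that the existing parser accepts.
const noop = "noop() where %s"

// ParseCondition parses each statement as a standalone condition and returns
// one evaluator per statement. Parsing continues past failures so that all
// errors are reported together.
func ParseCondition(statements []string, functions map[string]interface{}, pathParser PathExpressionParser, enumParser EnumParser) ([]BoolExpressionEvaluator, error) {
	conditions := make([]BoolExpressionEvaluator, 0, len(statements))
	var errs error // named errs to avoid shadowing the standard errors package

	for _, statement := range statements {
		// Reuse the query parser by prefixing the condition with a no-op invocation.
		parsed, err := parseQuery(fmt.Sprintf(noop, statement))
		if err != nil {
			errs = multierr.Append(errs, err)
			continue
		}
		// Only the where clause matters; the invocation is discarded.
		expression, err := newBooleanExpressionEvaluator(parsed.WhereClause, functions, pathParser, enumParser)
		if err != nil {
			errs = multierr.Append(errs, err)
			continue
		}
		conditions = append(conditions, expression)
	}

	if errs != nil {
		return nil, errs
	}
	return conditions, nil
}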
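For readers without the TQL package at hand, the same parse-everything-then-report pattern can be tried with the standard library alone. The sketch below is a toy, not TQL: it assumes a made-up grammar of `<key> == <value>` or `<key> != <value>` over string attributes, and `condition` and `parseConditions` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// condition is a hypothetical evaluator over string attributes.
type condition func(attrs map[string]string) bool

// parseConditions parses statements of the toy form `<key> == <value>` or
// `<key> != <value>`. Like the mock-up, it keeps going after a bad statement
// and reports all failures together.
func parseConditions(statements []string) ([]condition, error) {
	conds := make([]condition, 0, len(statements))
	var problems []string

	for _, s := range statements {
		f := strings.Fields(s)
		if len(f) != 3 || (f[1] != "==" && f[1] != "!=") {
			problems = append(problems, fmt.Sprintf("invalid condition %q", s))
			continue
		}
		key, op, want := f[0], f[1], f[2]
		conds = append(conds, func(attrs map[string]string) bool {
			// Equality result must agree with the operator's polarity.
			return (attrs[key] == want) == (op == "==")
		})
	}

	if len(problems) > 0 {
		return nil, fmt.Errorf("%s", strings.Join(problems, "; "))
	}
	return conds, nil
}

func main() {
	conds, err := parseConditions([]string{"backend == vendor1", "env != dev"})
	if err != nil {
		panic(err)
	}
	a := map[string]string{"backend": "vendor1", "env": "dev"}
	fmt.Println(conds[0](a), conds[1](a)) // true false

	// Both bad statements are reported in a single error.
	_, err = parseConditions([]string{"backend ~ vendor1", "oops"})
	fmt.Println(err)
}
```

Returning no conditions at all when any statement fails mirrors the mock-up's choice: a partially parsed routing table would silently change where data goes.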

@kovrus
Member

kovrus commented Aug 25, 2022

> The route function is probably not useful because the routing processor contains all the logic, so what we really want from the TQL is a condition. Right now there is no way to get a condition without also providing an Invocation and the where keyword, but it would be easy to expose a ParseCondition function that interprets condition statements.

I thought of the route function here as a noop because of the current TQL limitations. It also seems that we would need to support both a condition and a query (invocation plus condition) in routing table entries. In some cases we only have to evaluate a boolean expression; in other cases we would like to use a function, for example to drop a routing key while routing.

@kovrus
Member

kovrus commented Sep 15, 2022

@gfonseca-tc please take a look at #13636

6 participants