Various "error":"proto: wrong wireType = 2 for field TimeUnixNano" messages when using otel.receiver.kafka #644
Comments
Hello, this looks like a mismatch between the encoding and decoding used by the Otel collector and Alloy. Alloy v1.0.0 is based on version 0.96.0 of the Otel collector. |
Hey @wildum, this is still happening on 0.96.0 of the Otel Collector too (if anything, the number of errors in the Alloy logs has increased). |
Could you also share your Otel collector config please? |
Sure thing. |
I reproduced the error locally with a setup similar to yours. |
I will try to replace Alloy with another Otel collector to see whether Alloy is the problem or whether the same error pops up with the Otel collector. |
It works fine with a plain Otel collector in place of Alloy. |
Trying with Alloy again, I noticed that the first traces are sent correctly (no error, and they appear in Tempo). The error only pops up after a few seconds. |
Found out that at first the traces are consumed correctly by the tracesConsumerGroupHandler, but after a few seconds they are consumed by the metricsConsumerGroupHandler or the logsConsumerGroupHandler, and that's when the errors appear (because the schemas are different). |
The tracesConsumerGroup starts and correctly consumes the traces, but then it gets cancelled.
In this case it seems to work properly forever.
When the cancellation happens, it does not recover: the metrics or logs group will keep trying to unmarshal the traces. |
@elburnetto-intapp I found a workaround if renaming your topic is acceptable:
Each consumer group claims a default topic: otlp_spans for traces, otlp_metrics for metrics, and otlp_logs for logs. When you set the topic to a value, that value is applied to all three groups. In your case all three groups are interested in otlp-tracing: the traces consumer claims it first, but after a few seconds it gets cancelled, another group claims it, and you get the error. If you don't set the topic and instead name your Kafka topic otlp_spans, only the traces consumer will pick it up. I will continue investigating to understand why this was done this way in Otel and to find a proper solution. |
It works with the Otel collector because you specify the type of telemetry when you define the pipeline:
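For illustration, a minimal Collector config along those lines could look like the sketch below (broker address and Tempo endpoint are placeholders, not values from this thread):

```yaml
receivers:
  kafka:
    brokers: ["kafka:9092"]     # placeholder broker address
    topic: otlp-tracing         # the topic carrying the traces
    encoding: otlp_proto

exporters:
  otlp:
    endpoint: tempo:4317        # placeholder Tempo gRPC endpoint
    tls:
      insecure: true

service:
  pipelines:
    traces:                     # only a traces pipeline is defined,
      receivers: [kafka]        # so only the traces consumer group is started
      exporters: [otlp]
```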
In this case the Kafka receiver will only start the traces consumer group. |
Hey @wildum, Really appreciate you looking into this and giving some detailed analysis. We've created the new Kafka topic 'otlp_spans' and got this running in one of our Dev environments, and can confirm this is working with no errors from Alloy at all (and all traces are being received by Tempo). |
This reminds me of #251. I think we need to make the |
Sounds good, I can draft something for the receivers |
Hey @wildum, hope you had a great weekend! We've found that when we don't set the Kafka topic in Alloy, it auto-creates the topics on our brokers. E.g. otlp_spans we created ourselves (we have a Kafka Operator running and define all clusters/topics as code), but otlp_metrics and otlp_logs have auto-created themselves, which means we can't manage them via config. Not sure if this is expected behaviour or not. Cheers, |
Hey, thanks for letting me know, and sorry about that. This is a bug: the receiver component should not create topics that are not needed. I'm currently working on a fix for the next release (currently planned for the 7th of May). I will make sure that this behaviour is fixed and will update this ticket once the fix is confirmed for the next release.
Hello @elburnetto-intapp, the fix has been merged to main and will be part of the next release (Alloy v1.1.0, on the 7th of May). You will still need to update your config: in the output block, set only the signals that you need. If you only have one topic, "otlp_spans", used for traces, then you can just do:
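Roughly like this, for example; the broker address is a placeholder and the otelcol.exporter.otlp "tempo" component is assumed to be defined elsewhere in the config:

```alloy
otelcol.receiver.kafka "default" {
  brokers          = ["kafka:9092"]  // placeholder broker address
  protocol_version = "2.0.0"
  encoding         = "otlp_proto"

  // topic is left unset, so the traces consumer reads the default "otlp_spans"
  output {
    traces = [otelcol.exporter.otlp.tempo.input]  // assumed exporter to Tempo
  }
}
```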
or if your topic name is different from "otlp_spans":
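For example, with a hypothetical topic name (same placeholder broker and assumed exporter as above):

```alloy
otelcol.receiver.kafka "default" {
  brokers          = ["kafka:9092"]    // placeholder broker address
  protocol_version = "2.0.0"
  encoding         = "otlp_proto"
  topic            = "my-traces-topic" // hypothetical custom topic name

  output {
    traces = [otelcol.exporter.otlp.tempo.input]  // assumed exporter to Tempo
  }
}
```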
If you have several topics for different telemetry signals, then you must proceed as follows: either you don't specify the topic, in which case you can use a single receiver for all three telemetry signals, but the topics have to be "otlp_spans", "otlp_metrics", and "otlp_logs":
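A sketch of that variant, under the same placeholder assumptions:

```alloy
otelcol.receiver.kafka "default" {
  brokers          = ["kafka:9092"]  // placeholder broker address
  protocol_version = "2.0.0"
  encoding         = "otlp_proto"

  // no topic set: traces come from "otlp_spans", metrics from "otlp_metrics",
  // and logs from "otlp_logs"
  output {
    metrics = [otelcol.exporter.otlp.tempo.input]  // assumed exporter
    logs    = [otelcol.exporter.otlp.tempo.input]
    traces  = [otelcol.exporter.otlp.tempo.input]
  }
}
```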
Or you specify a topic, but then you will need to create separate receivers (one per signal, if you need all three):
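Something along these lines, with hypothetical topic names and the same assumed exporter:

```alloy
otelcol.receiver.kafka "traces" {
  brokers          = ["kafka:9092"]  // placeholder broker address
  protocol_version = "2.0.0"
  topic            = "my-traces"     // hypothetical topic name
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}

otelcol.receiver.kafka "metrics" {
  brokers          = ["kafka:9092"]
  protocol_version = "2.0.0"
  topic            = "my-metrics"
  output {
    metrics = [otelcol.exporter.otlp.tempo.input]
  }
}

otelcol.receiver.kafka "logs" {
  brokers          = ["kafka:9092"]
  protocol_version = "2.0.0"
  topic            = "my-logs"
  output {
    logs = [otelcol.exporter.otlp.tempo.input]
  }
}
```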
LMK if it's not clear or if you encounter problems with the new release once it's out. I don't like how the config is done, but we must stay consistent with the Otel collector. I opened a ticket to change it: open-telemetry/opentelemetry-collector-contrib#32735 |
Hey @wildum, Amazing, thanks for that, makes clear sense! Cheers, |
@elburnetto-intapp quick update: the release of Alloy 1.1 is slightly delayed. It was previously planned for today but we will need a few more days. Sorry
What's wrong?
Our architecture is currently:
Applications -> OTel Collector (otel/opentelemetry-collector-contrib:0.98.0) -> Kafka -> Grafana Alloy (1.0.0) -> Grafana Tempo (2.4.0).
OTel Collector receives the traces from our apps and exports them onto a Kafka topic, using the otlp_proto encoding. We've then set up the otelcol.receiver.kafka component in Alloy to consume from this Kafka topic and send the traces on to Tempo via gRPC (as we're looking to enable Tempo multi-tenancy, which isn't supported by Kafka).
When Alloy starts consuming these messages, our Alloy logs start throwing various errors around wrong wireTypes, illegal wireTypes, etc., which make no sense (previously, when Kafka was being consumed by Tempo directly, we saw none of these errors).
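For reference, a minimal sketch of the Alloy side of this kind of setup; the broker address and Tempo endpoint are placeholders, not the actual configuration used here:

```alloy
otelcol.receiver.kafka "default" {
  brokers          = ["kafka:9092"]  // placeholder broker address
  protocol_version = "2.0.0"
  topic            = "otlp-tracing"  // topic written to by the Otel Collector
  encoding         = "otlp_proto"

  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}

otelcol.exporter.otlp "tempo" {
  client {
    endpoint = "tempo:4317"          // placeholder Tempo gRPC endpoint
  }
}
```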
Steps to reproduce
System information
No response
Software version
No response
Configuration
Logs