Accept varying topic lengths in MQTT topic parsing configs #10716
Comments
Hey @samhld, topic parsing supports a topic setting
# [[inputs.mqtt_consumer.topic_parsing]]
#   topic = ""
#   measurement = ""
#   tags = ""
#   fields = ""
which can be used to filter the topics handled by this topic parser. In your case, you want to define such a filter for each topic length:
[[inputs.mqtt_consumer.topic_parsing]]
topic = "sensors/+/+/+"
measurement = "measurement/_/_/_"
tags = "_/site/device_name/_"
fields = "_/_/_/field"
[[inputs.mqtt_consumer.topic_parsing]]
topic = "sensors/+/+/+/+"
measurement = "measurement/_/_/_/_"
tags = "_/site/sub_site/device_name/_"
fields = "_/_/_/_/field"
Does that make sense? |
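To make the behaviour of the two fixed-length configurations above concrete, here is a minimal Python sketch of how such parsing keys could be applied to a topic. It is not Telegraf's actual implementation: the function name `parse_fixed` and the returned dictionary layout are assumptions made for illustration, and the example only assumes that `_` skips a segment and that a parser is ignored when its key length differs from the topic length.

```python
def parse_fixed(topic, measurement, tags, fields):
    """Apply fixed-length parsing keys to a topic (hypothetical helper).

    Returns None when any key has a different number of segments than the
    topic, which models "this parser does not apply to the topic".
    """
    segments = topic.split("/")
    keys = [key.split("/") for key in (measurement, tags, fields)]
    if any(len(key) != len(segments) for key in keys):
        return None
    m_key, t_key, f_key = keys
    return {
        # The segment under the literal name "measurement" names the metric.
        "measurement": next(
            (seg for name, seg in zip(m_key, segments) if name == "measurement"),
            None,
        ),
        # Every name other than "_" becomes a tag or field key.
        "tags": {name: seg for name, seg in zip(t_key, segments) if name != "_"},
        "fields": {name: seg for name, seg in zip(f_key, segments) if name != "_"},
    }


four = ("measurement/_/_/_", "_/site/device_name/_", "_/_/_/field")
five = ("measurement/_/_/_/_", "_/site/sub_site/device_name/_", "_/_/_/_/field")

print(parse_fixed("sensors/CLE/device5/temp", *four))
print(parse_fixed("sensors/CLE/west/device5/temp", *five))
print(parse_fixed("sensors/CLE/west/device5/temp", *four))  # length mismatch
```

The five-segment configuration adds a `sub_site` tag that the four-segment one does not produce, so the two parsers yield different tag sets for the same device.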
@srebhan It does and I'm aware of that. I'm just looking to add some sugar to this so that users don't need to repeat themselves. In your suggested configuration, you're actually changing the parsing being done and therefore the line protocol output. In my example, the output is the same. In cases like that, it would be cleaner to keep the configuration less repetitive in my opinion. This is of course assuming this can be done. I'm thinking it can be done by indexing into the beginning and the end of the topic slice generated by the parser. I believe this would only work if the parts to be ignored are contiguous, but I think that would be the most common form of this case. |
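The idea of indexing into the beginning and the end of the topic slice can be sketched as follows. This is only an illustration under stated assumptions: `#` is used as a hypothetical placeholder for "zero or more ignored segments" inside a parsing key, the helper names `align` and `extract` are made up, and only one contiguous ignored span is supported.

```python
WILDCARD = "#"  # hypothetical placeholder, not confirmed Telegraf syntax


def align(key, segments):
    """Pair key names with topic segments, skipping the span covered by '#'."""
    names = key.split("/")
    if names.count(WILDCARD) > 1:
        raise ValueError("only one contiguous ignored span is supported")
    if WILDCARD not in names:
        return list(zip(names, segments)) if len(names) == len(segments) else None
    i = names.index(WILDCARD)
    head, tail = names[:i], names[i + 1:]
    if len(head) + len(tail) > len(segments):
        return None  # topic is too short to cover the named positions
    # Index from the front for the head and from the back for the tail.
    tail_segments = segments[len(segments) - len(tail):]
    return list(zip(head, segments)) + list(zip(tail, tail_segments))


def extract(key, topic):
    """Return {name: segment} for every named (non-'_') position in the key."""
    pairs = align(key, topic.split("/"))
    if pairs is None:
        return None
    return {name: seg for name, seg in pairs if name != "_"}


for topic in ("sensors/CLE/device5/temp", "sensors/CLE/west/v2/device5/temp"):
    print(extract("_/site/#/device_name/_", topic), extract("#/field", topic))
```

Because the head is read from the front and the tail from the back, both topic lengths produce the same tags and fields, which is the "same output" case described above.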
I see. So IMO having
[[inputs.mqtt_consumer.topic_parsing]]
topic = "sensors/#/+/+"
measurement = "measurement/__"
tags = "_/site/__/device_name/_"
fields = "__/field"
with Of course we could also use other placeholders... |
@srebhan I don't have a strong opinion on syntax. A double underscore works for me. I do think it runs the risk of looking very similar to a single underscore, however. That's why I used "#". The "#" symbol is a concept MQTT users are already familiar with. It would be a slightly different use of it but I think the point would get across. But, just to reiterate, I'm not too passionate about that part. I think the most common use case would be to simply account for an unknown number of segments at the beginning, middle, or end. I don't know if we need to support two separate unknown lengths as in |
@samhld I agree with your view on the syntax, maybe Regarding the unknown/dynamic length, we can never support something like |
@srebhan yep, agreed! |
Any update on this? I have a use case where we have a base topic:
And would like to be able to match this with topic parsing. A message could have a topic structure:
or
But because the tail of the topic depends on other parts of the topic, I would need to define all possible structures instead of defining a single topic_parsing section with e.g.:
|
For anyone reading this, this can also be accomplished by creating a processor with e.g. starlark which does this:
|
@samhld and @juha-ylikoski please test the binary in PR #15528, available once CI has finished the tests, and let me know if this fixes the issue! |
@srebhan this seems to have enabled the functionality I needed. I tested with config:
And was able to match and extract the tags for both topics:
And
|
Telegraf is soon to support multi-segment wildcards (#) in MQTT topics in the topic parsing feature. This is distinct from the + wildcard, which maps to exactly one topic segment. The # wildcard can match 0 or more segments (not just one).
Given this support, it would be nice to use it to our advantage and allow for parsing topics of varying lengths. Below is an example use case.
Imagine I have this topic: sensors/CLE/device5/temp
following this schema: sensors/<site>/<device_name>/<field>
Then I update firmware of my devices to a version that changes the target topics for metrics (this is built in and unchangeable by me). The new topics follow this schema: sensors/<site>/<sub-site>/<version>/<device_name>/<field>. If the same device is updated, it now publishes temp data to topic: sensors/CLE/west/v2/device5/temp
If I have devices running multiple versions, I may have multiple topics of varying lengths that ultimately have the same data in them and could be dealt with the same way. Say I don't care about the sub-site and version data. In other words, the newly introduced topic segments aren't meaningful to me.
It would be convenient to apply the same topic parsing configuration to both cases where possible, like this:
The above configuration will accept a topic of any length that matches the pattern of the first two segments and last two segments. Any segments -- whether there are 0 or more -- in between those first two and last two would be ignored.
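The acceptance rule described here (fixed first two and last two segments, any number of segments in between) can be illustrated with a small matcher. This is a sketch only: `topic_matches` is a hypothetical helper, the pattern `sensors/+/#/+/+` is an assumed example rather than the configuration from the issue, and `#` is allowed in the middle of a pattern, which goes beyond standard MQTT filters where `#` may only appear as the last level.

```python
def topic_matches(pattern, topic):
    """Match a topic against a pattern where '+' is exactly one segment
    and '#' is zero or more segments (hypothetical extended semantics)."""
    return _match(pattern.split("/"), topic.split("/"))


def _match(pat, seg):
    if not pat:
        return not seg  # both exhausted means a full match
    head, rest = pat[0], pat[1:]
    if head == "#":
        # Let '#' consume 0, 1, 2, ... segments and try the rest each time.
        return any(_match(rest, seg[i:]) for i in range(len(seg) + 1))
    if not seg:
        return False
    if head == "+" or head == seg[0]:
        return _match(rest, seg[1:])
    return False


pattern = "sensors/+/#/+/+"  # assumed pattern: two fixed head, two fixed tail
for topic in (
    "sensors/CLE/device5/temp",
    "sensors/CLE/west/v2/device5/temp",
    "sensors/CLE/temp",
):
    print(topic, topic_matches(pattern, topic))
```

Both example topics from the use case match the assumed pattern, while a topic too short to fill the two head and two tail positions does not.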