-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new line protocol parser and serializer, influxdb output #3924
Conversation
@phemmer This is a big one, so I won't ask you to do a full review, but I'd grateful for any feedback you have time to provide. I'm excited about the prospect of converting other parsers to use ragel in the future (statsd parsing comes to mind). |
1161c74
to
f046a29
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would probably be a good time to add uint
support as well.
influxdata/influxdb#8835
Other than that, I've only taken a brief glance through the code. Haven't had a chance to actually run it.
plugins/outputs/influxdb/README.md
Outdated
## Compress each HTTP request payload using GZIP. | ||
# content_encoding = "gzip" | ||
## HTTP Content-Encoding for write request body, one of "gzip" or "identity". | ||
# content_encoding = "identity" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might do something like or "identity" (meaning no encoding)
. I actually had to look up "identity" as I wasn't familiar with the definition.
plugins/outputs/influxdb/udp.go
Outdated
serializer := config.Serializer | ||
if serializer == nil { | ||
s := influx.NewSerializer() | ||
s.SetMaxLineBytes(config.MaxPacketSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A "udp packet" includes both headers and body. MaxPacketSize
is being used to control just the body size. So it's a misnomer.
@ragzilla has been working on a follow up patch adding uint support: ragzilla@3064edc#commitcomment-28296437 We have to be a little careful as uint support is not generally available in InfluxDB and any field type changes results in an error. |
case float32: | ||
fields[k] = float64(val) | ||
continue | ||
case float64: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for opening an old issue but this check was blocking a lot of garbage from being attempted to be written to our output. Is there a different place that this same kind of filtering can be done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you describe what you are seeing now in a new issue?
I've just identified this as the cause of a major performance regression as detailed here: |
In this pull request I introduce a new line protocol parser and serializer, as well as an updated influxdb output that uses the serializer. Some minor changes are made to the
telegraf.Metric
interface as well.Line protocol is no longer the internal represention of the
metrics
type, the data is stored in a fully parsed and type converted manor. Despite this in my testing there is an performance improvement across both going from line protocol to line protocol, as well as with Telegraf generated metrics going to line protocol, the main reason for this is the ragel based parser. Additionally, the sort order of tags is fixed, which should help on the InfluxDB side.Ragel is used to compile the parser state machine, the generated file is committed so that there is no build requirement on having ragel. In the future I would love to have this be done at build time so that no generated files are committed.
closes #3452
closes #3631
closes #3648
closes #3291