StatsD set doesn't accept a string #2068

Closed
v9n opened this issue Nov 23, 2016 · 6 comments · Fixed by #2153

v9n commented Nov 23, 2016

Bug report

Relevant telegraf.conf:

System info:

telegraf --version
Telegraf - version 1.0.1

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"

StatsD config:

# # Statsd Server
[[inputs.statsd]]
#   ## Address and port to host UDP listener on
   service_address = ":8125"
#   ## Delete gauges every interval (default=false)
#   delete_gauges = false
#   ## Delete counters every interval (default=false)
#   delete_counters = false
#   ## Delete sets every interval (default=false)
#   delete_sets = false
#   ## Delete timings & histograms every interval (default=true)
   delete_timings = true
#   ## Percentiles to calculate for timing & histogram stats
   percentiles = [90, 95, 99]
#
#   ## separator to use between elements of a statsd metric
   metric_separator = "_"
#
#   ## Parses tags in the datadog statsd format
#   ## http://docs.datadoghq.com/guides/dogstatsd/
   parse_data_dog_tags = true
#
#   ## Statsd data translation templates, more info can be read here:
#   ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md#graphite
#   # templates = [
#   #     "cpu.* measurement*"
#   # ]
#
#   ## Number of UDP messages allowed to queue up, once filled,
#   ## the statsd server will start dropping packets
   allowed_pending_messages = 10000
#
#   ## Number of timing/histogram values to track per-measurement in the
#   ## calculation of percentiles. Raising this limit increases the accuracy
#   ## of percentiles but also increases the memory usage and cpu time.
#   percentile_limit = 1000

Steps to reproduce:

  1. Tail the log.
  2. Send a SET like test:a123b|s.
  3. The log shows: Error: parsing value to int64: test:a123b|s. No data is inserted into InfluxDB.
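
The steps above can be sketched as a small Go program that sends the same set metric over UDP. This is only a minimal sketch: the helper names setLine and sendUDP are made up for illustration, and the address 127.0.0.1:8125 assumes Telegraf is running locally with the service_address from the config above.

```go
package main

import (
	"fmt"
	"net"
)

// setLine formats a StatsD set metric as <bucket>:<value>|s.
// Hypothetical helper, not part of Telegraf or any StatsD client.
func setLine(bucket, value string) string {
	return fmt.Sprintf("%s:%s|s", bucket, value)
}

// sendUDP writes a single StatsD line to addr over UDP.
func sendUDP(addr, line string) error {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write([]byte(line))
	return err
}

func main() {
	line := setLine("test", "a123b")
	// Assumption: Telegraf's statsd input listens on localhost port 8125.
	if err := sendUDP("127.0.0.1:8125", line); err != nil {
		fmt.Println("send failed:", err)
		return
	}
	fmt.Println("sent:", line)
}
```

Since UDP is connectionless, a successful send does not mean the metric was accepted; the Telegraf log still has to be checked for the parse error.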

Expected behavior:

Data is flushed into InfluxDB. test should be created and its value should increase by 1.

Actual behavior:

No data is created. The log shows an error.
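
The error text suggests the set value is being parsed as an int64. The sketch below reproduces that kind of failure in isolation. It is not Telegraf's actual parsing code: parseSetValue is a hypothetical helper, and the use of strconv.ParseInt is an assumption based only on the wording of the log message.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSetValue is a hypothetical helper, not Telegraf's code. It splits a
// line like "test:a123b|s" and insists that the value is an integer.
func parseSetValue(line string) (string, int64, error) {
	parts := strings.SplitN(line, "|", 2)
	if len(parts) != 2 || parts[1] != "s" {
		return "", 0, fmt.Errorf("not a set metric: %s", line)
	}
	kv := strings.SplitN(parts[0], ":", 2)
	if len(kv) != 2 {
		return "", 0, fmt.Errorf("missing value: %s", line)
	}
	// Assumption: the value is parsed as a base-10 int64, so "a123b" fails.
	v, err := strconv.ParseInt(kv[1], 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("parsing value to int64: %s", line)
	}
	return kv[0], v, nil
}

func main() {
	for _, line := range []string{"test:123|s", "test:a123b|s"} {
		name, v, err := parseSetValue(line)
		if err != nil {
			fmt.Println("Error:", err)
			continue
		}
		fmt.Println(name, v)
	}
}
```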

Use case:

I want to count unique user IDs from a MongoDB instance. The ObjectID is a string, so right now I have to convert each MongoDB ObjectID into a number.

@sparrc sparrc added this to the Future Milestone milestone Nov 23, 2016
@sparrc sparrc added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Nov 23, 2016

sparrc commented Nov 23, 2016

Can you provide an example of a common statsd server that accepts strings for sets?

sparrc commented Nov 23, 2016

the "statsd spec" specifies values as only being integer or float: https://github.com/b/statsd_spec#metric-types--formats

nmische commented Nov 29, 2016

We just ran up against this as well. Etsy's implementation accepts strings:

v9n commented Nov 30, 2016

@sparrc I haven't used any StatsD other than the one from Etsy. When I used it last year, I had to use bernd/statsd-influxdb-backend@33a682c, a StatsD plugin that flushes metrics to InfluxDB. It supports strings.

I think the need to count strings in a set is pretty reasonable. People who use Elasticsearch or MongoDB as their main storage usually won't have an integer ID, but a UUID (which is a string) instead. We could use some trick to hash or encode the UUID into an integer, but that's a hack and adds confusion to the code.
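
For illustration, the hash workaround could look roughly like the sketch below. It assumes FNV-1a from Go's standard library as the hash; idToInt64 is a made-up helper name and the ID value is invented. Any such hashing can in principle map two different IDs to the same integer, which is part of why it is a hack.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// idToInt64 is a hypothetical helper that squeezes a string ID into a
// non-negative int64 so it can be sent as an integer set value.
func idToInt64(id string) int64 {
	h := fnv.New64a()
	h.Write([]byte(id)) // Write on a hash.Hash never returns an error
	// Drop the lowest bit so the result always fits in a non-negative int64.
	return int64(h.Sum64() >> 1)
}

func main() {
	// A made-up, ObjectID-shaped value (24 hex characters), for illustration.
	id := "5849c0ffee0123456789abcd"
	fmt.Printf("test:%d|s\n", idToInt64(id))
}
```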

I imagine it isn't that hard to support strings. I can take a look and submit a PR if this is something that could be considered for merging.

sparrc commented Nov 30, 2016

I think the best way to do this would be to simply treat all "set" keys as strings, since the actual value doesn't matter; all that matters is its uniqueness (i.e., "1" == 1).
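
A minimal sketch of that idea, assuming a plain Go map keyed by string: every incoming set value is kept as the raw string from the wire, so an integer 1 and the string "1" land on the same key. The stringSet type and its methods are hypothetical names for illustration, not the code that was eventually merged.

```go
package main

import "fmt"

// stringSet tracks unique set members by their raw string form.
// Hypothetical type for illustration only.
type stringSet map[string]struct{}

// add records a value; adding the same string twice has no further effect.
func (s stringSet) add(v string) { s[v] = struct{}{} }

// count returns the number of unique values seen so far.
func (s stringSet) count() int { return len(s) }

func main() {
	users := stringSet{}
	// Values as they would arrive on the wire, e.g. test:1|s or test:a123b|s.
	for _, v := range []string{"1", "1", "a123b", "a123b", "5849c0ffee"} {
		users.add(v)
	}
	fmt.Println(users.count()) // prints 3
}
```

With this approach no integer parsing is needed for sets at all, so the "parsing value to int64" error cannot occur for them.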

nmische commented Nov 30, 2016

I think treating all set keys as strings is reasonable. The Etsy implementation is JavaScript where "1" == 1 evaluates to true.

@jwilder jwilder modified the milestones: 1.2.0, Future Milestone Nov 30, 2016
sparrc added a commit that referenced this issue Dec 13, 2016
sparrc added a commit that referenced this issue Dec 13, 2016
njwhite pushed a commit to njwhite/telegraf that referenced this issue Jan 31, 2017
maxunt pushed a commit that referenced this issue Jun 26, 2018