Use String#bytesize instead of String#length in #format_event #112

Merged
merged 1 commit into from
Jun 14, 2019


yosangwon
Contributor

I found an issue that causes event text to be truncated in the Datadog Event Stream when it contains characters outside the ASCII range.

For example, a Korean letter (Hangul) takes 3 bytes in UTF-8, but the current implementation counts it as a single byte, so two bytes are truncated for every Korean letter in the text.

text = "한글 한 자당 세 바이트abcdefghijklmnopqr"
text.length # => 31
text.bytesize # => 49

# remaining 18 bytes will be ignored in dd-agent (or datadog)
Datadog.dogstatsd.event("info", text, tags: ["test"], alert_type: "info")

Replacing String#length with String#bytesize would fix this issue.
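
As a rough sketch of the change, assuming the event datagram begins with a `_e{<title length>,<text length>}:` header as in the DogStatsD datagram format. `event_header` is a hypothetical helper written for illustration, not the gem's actual `format_event`:

```ruby
# Hypothetical helper, not the library's real code: builds only the
# length header and the title/text part of a DogStatsD event datagram.
def event_header(title, text, byte_lengths: true)
  if byte_lengths
    # Count bytes, which is what the receiving side measures.
    title_len = title.bytesize
    text_len = text.bytesize
  else
    # Count characters, as the implementation did before this change.
    title_len = title.length
    text_len = text.length
  end
  "_e{#{title_len},#{text_len}}:#{title}|#{text}"
end

text = "한글 한 자당 세 바이트abcdefghijklmnopqr"
event_header("info", text, byte_lengths: false) # header declares 31 for the text
event_header("info", text)                      # header declares 49 for the text
```

With character counts the header under-reports the text by 18 bytes, which matches the truncation described above.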

@albertvaka
Contributor

Hi @devleoper 😄 Thanks for your contribution!

I took a look at the code that receives these packets and, indeed, it counts bytes rather than characters: https://github.com/DataDog/datadog-agent/blob/master/pkg/dogstatsd/parser.go#L173
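
To illustrate what a byte-counting receiver does with a character count, here is a minimal Ruby sketch. `read_text` is a hypothetical stand-in for the receiving side, not the agent's Go parser:

```ruby
# Hypothetical stand-in for a receiver that treats the declared length
# as a byte count and discards everything after it.
def read_text(payload, declared_length)
  payload.byteslice(0, declared_length)
end

text = "한글 한 자당 세 바이트abcdefghijklmnopqr"
read_text(text, text.length)   # => "한글 한 자당 세 바이트" (trailing 18 bytes dropped)
read_text(text, text.bytesize) # => the full text
```

In this example the 31 declared bytes happen to end on a character boundary; with other inputs the cut could land in the middle of a multi-byte character.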

Merging!

@albertvaka albertvaka merged commit c95ae54 into DataDog:master Jun 14, 2019