fix(lua transform): Explicitly call GC in `lua` transform #1990
Signed-off-by: Alexander Rodin <[email protected]>
Nice find, I’m glad we’ve fixed this.
Nice catch! LGTM!!
I’m curious why you chose 16. Would you expect a performance hit for high-volume streams, e.g. more than 5k events per second?
Were you able to reproduce the original issue and see that this addresses it? I'm a bit surprised Lua would have this issue.
I'd also be curious to measure the overhead here in high-volume pipelines. GC can definitely be time-consuming and we want to find a reasonable tradeoff between memory use and throughput. Running every 16 invocations might be a bit often.
I was able to reproduce the issue with the following config:
data_dir = "/var/lib/vector/"
dns_servers = []
[log_schema]
message_key = "message"
timestamp_key = "timestamp"
host_key = "host"
[sources.source0]
max_length = 102400
type = "stdin"
[transforms.transform0]
inputs = ["source0"]
type = "lua"
source = """
event["count"] = 1
"""
[sinks.sink0]
healthcheck = true
inputs = ["transform0"]
type = "console"
encoding = "json"
[sinks.sink0.buffer]
type = "memory"
max_events = 500
when_full = "block"
and the following command:
cat /dev/urandom | base64 | vector --config vector.toml > /dev/null
Within a minute, Vector's RAM consumption grew to a few gigabytes.
I measured the performance on the same config using
cat /dev/urandom | base64 | vector --config vector.toml | head -n3000000 | pv > /dev/null
For
So it seems to be reasonably fast even with 16, but the interval can be made larger too (or exposed to the user). One of the reasons I chose 16 is that it allowed me to test the change using
Nice job! Now I'm curious why it didn't show up in my tests...
Once this is merged we should cut a point release ( |
* fix(lua transform): Explicitly call GC in `lua` transform
  Signed-off-by: Alexander Rodin <[email protected]>
* Call GC not after each invocation
  Signed-off-by: Alexander Rodin <[email protected]>
Closes #1721.
This PR adds an explicit call to the garbage collector to the `lua` transform. It fixes the growing RAM consumption reported in #1721. As I understand it, the Lua runtime sometimes did not run GC automatically, which produced a leak-like pattern of memory usage.
Now the RAM usage of Vector with the `lua` transform is flat, even at high event rates.
This explicit call doesn't significantly change the benchmarks because GC is called not after every invocation of the transform, but only after every 16th invocation.
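The "every 16th invocation" idea can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: `FakeRuntime`, `Transform`, `GC_INTERVAL`, and `gc_cycles_after` are hypothetical names, and the fake runtime only counts collection requests instead of driving a real Lua state.

```rust
/// Hypothetical stand-in for a Lua runtime (an assumption for illustration).
/// It only counts how many full GC cycles were requested.
#[derive(Default)]
struct FakeRuntime {
    gc_cycles: usize,
}

impl FakeRuntime {
    /// Placeholder for asking the real runtime to run a full collection.
    fn collect_garbage(&mut self) {
        self.gc_cycles += 1;
    }
}

/// Assumed interval, mirroring the value 16 discussed in this PR.
const GC_INTERVAL: usize = 16;

struct Transform {
    runtime: FakeRuntime,
    invocations_since_gc: usize,
}

impl Transform {
    fn new() -> Self {
        Transform {
            runtime: FakeRuntime::default(),
            invocations_since_gc: 0,
        }
    }

    /// Processes one event, then requests a collection on every
    /// GC_INTERVAL-th call so the GC cost is spread over many events.
    fn process(&mut self, event: &str) -> String {
        // Placeholder for running the user's script on the event.
        let output = format!("{} count=1", event);

        self.invocations_since_gc += 1;
        if self.invocations_since_gc >= GC_INTERVAL {
            self.runtime.collect_garbage();
            self.invocations_since_gc = 0;
        }
        output
    }
}

/// Runs `n` events through a fresh transform and returns how many
/// GC cycles were requested along the way.
fn gc_cycles_after(n: usize) -> usize {
    let mut transform = Transform::new();
    for i in 0..n {
        transform.process(&format!("event {}", i));
    }
    transform.runtime.gc_cycles
}
```

With this scheme the number of collections grows as the event count divided by the interval, so a larger interval trades more transient memory for less GC overhead.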