Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(lua transform): Explicitly call GC in lua transform #1990

Merged
2 commits merged into from
Mar 6, 2020
Merged

fix(lua transform): Explicitly call GC in lua transform #1990

2 commits merged into from
Mar 6, 2020

Conversation

ghost
Copy link

@ghost ghost commented Mar 5, 2020

Closes #1721.

This PR adds explicit call to the garbage collector after each invocation of the lua transform. It fixes the issues with growing RAM consumption mentioned in #1721. As I understand it, Lua runtime sometimes didn't call GC automatically, that's why there was leak-like pattern of memory usage.

Now the RAM usage of Vector with lua transform is flat, even with high event rates.

This explicit call doesn't significantly change the benchmarks because GC is called not after each invocation of the transform, but only each 16th invocation of the transform.

@ghost ghost force-pushed the lua-gc branch from 0fa3dcc to e5f9455 Compare March 5, 2020 13:50
Signed-off-by: Alexander Rodin <[email protected]>
@binarylogic
Copy link
Contributor

Nice find, I’m glad we’ve fixed this.

Copy link
Contributor

@MOZGIII MOZGIII left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! LGTM!!

@binarylogic
Copy link
Contributor

I’m curious why you chose 16? Would you expect a performance hit for high volume streams? Ex: >5k events per second.

Copy link
Member

@lukesteensen lukesteensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were you able to reproduce the original issue and see that this addresses it? I'm a bit surprised Lua would have this issue.

I'd also be curious to measure the overhead here in high-volume pipelines. GC can definitely be time-consuming and we want to find a reasonable tradeoff between memory use and throughput. Running every 16 invocations might be a bit often.

@ghost
Copy link
Author

ghost commented Mar 5, 2020

@lukesteensen

Were you able to reproduce the original issue and see that this addresses it? I'm a bit surprised Lua would have this issue.

I was able to reproduce the issue with the following config:

data_dir = "/var/lib/vector/"
dns_servers = []

[log_schema]
message_key = "message"
timestamp_key = "timestamp"
host_key = "host"

[sources.source0]
max_length = 102400
type = "stdin"

[transforms.transform0]
inputs = ["source0"]
type = "lua"
source = """
  event["count"] = 1
"""

[sinks.sink0]
healthcheck = true
inputs = ["transform0"]
type = "console"
encoding = "json"

[sinks.sink0.buffer]
type = "memory"
max_events = 500
when_full = "block"

and the following command:

cat /dev/urandom | base64 | vector --config vector.toml > /dev/null

In a minute Vector's RAM consumption grew up to a few gigabytes.

I'd also be curious to measure the overhead here in high-volume pipelines. GC can definitely be time-consuming and we want to find a reasonable tradeoff between memory use and throughput. Running every 16 invocations might be a bit often.

@binarylogic

I’m curious why you chose 16? Would you expect a performance hit for high volume streams? Ex: >5k events per second.

I measured the performance on the same config using pv utility (each line produced by base64 has length 76 characters here):

cat /dev/urandom | base64 | vector --config vector.toml | head -n3000000 | pv > /dev/null

For master and this branch the results were identical, 3M events were processed in 28 seconds in master and 27 seconds in this branch, which is around 105k events per second.

So it seems to be reasonably fast even for 16, but it can be made larger too (or exposed to the user). One of the reasons I chose 16 is that it allowed to test the change using cargo bench, because with larger numbers the GC could have not been invoked in the benchmarks.

Copy link
Member

@lukesteensen lukesteensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! Now I'm curious why it didn't show up in my tests...

@binarylogic
Copy link
Contributor

binarylogic commented Mar 5, 2020

Once this is merged we should cut a point release (0.8.2) and I can notify affected users.

@ghost ghost merged commit 65aa39f into master Mar 6, 2020
ghost pushed a commit that referenced this pull request Mar 6, 2020
* fix(lua transform): Explicitly call GC in `lua` transform

Signed-off-by: Alexander Rodin <[email protected]>

* Call GC not after each invocation

Signed-off-by: Alexander Rodin <[email protected]>
@ghost ghost mentioned this pull request Mar 6, 2020
@ghost ghost deleted the lua-gc branch March 6, 2020 15:29
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Check lua transform for memory leaks
4 participants