Disk buffer bugs tracking issue #7425

Closed
7 tasks done
ktff opened this issue May 12, 2021 · 4 comments
Labels: domain: buffers (Anything related to Vector's memory/disk buffers), type: task (Generic non-code related tasks)

Comments

@ktff
Contributor

ktff commented May 12, 2021

Tracking issue for various bug issues related to disk buffer, their causes, and fixes.

Issues (work in progress):

Todo

@ktff ktff added the type: task and domain: buffers labels May 12, 2021
@ktff ktff self-assigned this May 12, 2021
@ktff
Contributor Author

ktff commented May 27, 2021

Residual ldb files

The bug where a large number of ldb files are never removed and just keep accumulating has been further aggravated by PR #7264. With that PR, the following config reproduces the issue:

data_dir = "./../tmp"

[sources.source0]
  type = "generator"
  format = "shuffle" 
  lines = ["line"] 
  sequence = true
  interval = 0.1

[sinks.sink1]
  type = "blackhole"
  inputs = ["source0"] 
  buffer.max_size = 10000000000
  buffer.type = "disk" 
  buffer.when_full = "block" 

The main factor is the interval: if the total emitted bytes are less than 1MB when compaction happens, there is a chance that it will leave behind an ldb file. The fewer bytes emitted, the greater the chance. The above config leaves behind every ldb file while emitting 42KB in 1 minute.

On our side, we can fix it by allowing compaction only once we have 2MB or more of uncompacted bytes. We could go for a lower bound, but 2MB is the max size of ldb files, and there is a chance that the bug also happens for sizes greater than 1MB and just takes much longer to materialize, so 2MB is the safe choice.
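A minimal sketch of the gating idea described above, written in Python for brevity rather than in Vector's own code. The class name, method names, and the `compact` callback are hypothetical and only illustrate the "compact only above a byte threshold" rule. The threshold is a parameter, so it can be changed if the max ldb file size turns out to be different.

```python
# Hypothetical sketch: these names are illustrative, not Vector's actual API.
MIN_UNCOMPACTED_BYTES = 2 * 1024 * 1024  # 2MB lower bound proposed above


class CompactionGate:
    """Counts uncompacted bytes and allows compaction only above a threshold."""

    def __init__(self, threshold=MIN_UNCOMPACTED_BYTES):
        self.threshold = threshold
        self.uncompacted_bytes = 0

    def record(self, num_bytes):
        # Account for bytes that have not been compacted yet.
        self.uncompacted_bytes += num_bytes

    def try_compact(self, compact):
        # Skip compaction while too few bytes have accumulated, since small
        # compactions are the case that was observed to leave ldb files behind.
        if self.uncompacted_bytes < self.threshold:
            return False
        compact()
        self.uncompacted_bytes = 0
        return True
```

With this shape, a low-traffic buffer like the one in the config above would simply skip compaction until enough bytes have built up, instead of compacting a few kilobytes at a time.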

@jszwedko
Member

On our side, we can fix it by limiting compaction to happen only if we have 2MB or more of uncompacted bytes. We could go for a lower bound but, just to be safe, 2MB is the max size of ldb files and there is a chance that the bug happens for sizes greater than 1MB but it requires ever so more time to materialize.

Is this right? I think the max size of the ldb files is actually 4 MB. Or at least that's what I've seen.

@ktff
Contributor Author

ktff commented May 27, 2021

I think the max size of the ldb files is actually 4 MB

Hmm, then it's more dynamic than I thought. So let's go with 4MB.
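For a rough sense of what a 4MB bound means for the low-traffic repro config, here is a back-of-the-envelope calculation in Python. It assumes 1KB = 1024 bytes and that uncompacted bytes build up at the same rate the source emits them (42KB per minute, as measured above). Both are assumptions made for illustration, and the function name is hypothetical.

```python
def minutes_until_threshold(threshold_bytes, bytes_per_minute):
    """Minutes needed to accumulate threshold_bytes at a constant rate."""
    if bytes_per_minute <= 0:
        raise ValueError("bytes_per_minute must be positive")
    return threshold_bytes / bytes_per_minute


# 4MB threshold, 42KB emitted per minute (the rate seen with the repro config).
minutes = minutes_until_threshold(4 * 1024 * 1024, 42 * 1024)
print(f"{minutes:.1f} minutes between compactions")
```

Under those assumptions the repro config would trigger a compaction roughly every hour and a half, rather than on every small batch.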

@ktff
Contributor Author

ktff commented Jun 1, 2021

Fixes for every issue have been merged and the relevant users notified, so I'll be closing this. Subsequent follow-ups on the bugs/fixes can be done in the individual issues.

@ktff ktff closed this as completed Jun 1, 2021
2 participants