-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[receiver/filelog] Threadpooling for maxconcurrency #19448
[receiver/filelog] Threadpooling for maxconcurrency #19448
Conversation
944cd17
to
10124c9
Compare
@vihas-splunk, please sign the CLA |
Foresight Summary
View More Details⭕ build-and-test-windows workflow has finished in 10 seconds (31 minutes 33 seconds less than
|
Job | Failed Steps | Tests | |
---|---|---|---|
windows-unittest-matrix | - 🔗 | N/A | See Details |
windows-unittest | - 🔗 | N/A | See Details |
✅ telemetrygen workflow has finished in 1 minute 12 seconds and finished at 7th Apr, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
build-dev | - 🔗 | N/A | See Details |
publish-latest | - 🔗 | N/A | See Details |
publish-stable | - 🔗 | N/A | See Details |
✅ check-links workflow has finished in 1 minute 40 seconds (⚠️ 42 seconds more than main
branch avg.) and finished at 7th Apr, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
changed files | - 🔗 | N/A | See Details |
check-links | - 🔗 | N/A | See Details |
✅ changelog workflow has finished in 2 minutes 52 seconds and finished at 7th Apr, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
changelog | - 🔗 | N/A | See Details |
✅ prometheus-compliance-tests workflow has finished in 3 minutes 27 seconds (2 minutes 55 seconds less than main
branch avg.) and finished at 7th Apr, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
prometheus-compliance-tests | - 🔗 | N/A | See Details |
✅ load-tests workflow has finished in 6 minutes 16 seconds (4 minutes 6 seconds less than main
branch avg.) and finished at 7th Apr, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
setup-environment | - 🔗 | N/A | See Details |
loadtest (TestIdleMode) | - 🔗 | N/A | See Details |
loadtest (TestBallastMemory|TestLog10kDPS) | - 🔗 | N/A | See Details |
loadtest (TestMetric10kDPS|TestMetricsFromFile) | - 🔗 | N/A | See Details |
loadtest (TestMetricResourceProcessor|TestTrace10kSPS) | - 🔗 | N/A | See Details |
loadtest (TestTraceNoBackend10kSPS|TestTrace1kSPSWithAttrs) | - 🔗 | N/A | See Details |
loadtest (TestTraceBallast1kSPSWithAttrs|TestTraceBallast1kSPSAddAttrs) | - 🔗 | N/A | See Details |
loadtest (TestTraceAttributesProcessor) | - 🔗 | N/A | See Details |
✅ e2e-tests workflow has finished in 12 minutes 44 seconds and finished at 7th Apr, 2023.
Job | Failed Steps | Tests | |
---|---|---|---|
kubernetes-test (v1.26.0) | - 🔗 | N/A | See Details |
kubernetes-test (v1.25.3) | - 🔗 | N/A | See Details |
kubernetes-test (v1.24.7) | - 🔗 | N/A | See Details |
kubernetes-test (v1.23.13) | - 🔗 | N/A | See Details |
❌ build-and-test workflow has finished in 19 minutes 38 seconds (26 minutes 46 seconds less than main
branch avg.) and finished at 7th Apr, 2023. 6 jobs failed.
Job | Failed Steps | Tests | |
---|---|---|---|
setup-environment | - 🔗 | N/A | See Details |
check-codeowners | - 🔗 | N/A | See Details |
lint-matrix (receiver-0) | - 🔗 | N/A | See Details |
lint-matrix (receiver-1) | - 🔗 | N/A | See Details |
lint-matrix (processor) | - 🔗 | N/A | See Details |
lint-matrix (exporter) | - 🔗 | N/A | See Details |
lint-matrix (extension) | - 🔗 | N/A | See Details |
lint-matrix (connector) | - 🔗 | N/A | See Details |
lint-matrix (internal) | - 🔗 | N/A | See Details |
lint-matrix (other) | Lint 🔗 | N/A | See Details |
build-examples | - 🔗 | N/A | See Details |
check-collector-module-version | - 🔗 | N/A | See Details |
checks | Porto 🔗 | N/A | See Details |
correctness-metrics | - 🔗 | N/A | See Details |
correctness-traces | - 🔗 | N/A | See Details |
integration-tests | - 🔗 | N/A | See Details |
unittest-matrix (1.20, receiver-0) | Run Unit Tests 🔗 | N/A | See Details |
unittest-matrix (1.20, receiver-1) | - 🔗 | N/A | See Details |
unittest-matrix (1.20, processor) | - 🔗 | N/A | See Details |
unittest-matrix (1.20, exporter) | - 🔗 | N/A | See Details |
unittest-matrix (1.20, extension) | - 🔗 | N/A | See Details |
unittest-matrix (1.20, connector) | - 🔗 | N/A | See Details |
unittest-matrix (1.20, internal) | - 🔗 | N/A | See Details |
unittest-matrix (1.20, other) | - 🔗 | N/A | See Details |
unittest-matrix (1.19, receiver-0) | - 🔗 | N/A | See Details |
unittest-matrix (1.19, receiver-1) | - 🔗 | N/A | See Details |
unittest-matrix (1.19, processor) | - 🔗 | N/A | See Details |
unittest-matrix (1.19, exporter) | - 🔗 | N/A | See Details |
unittest-matrix (1.19, extension) | - 🔗 | N/A | See Details |
unittest-matrix (1.19, connector) | - 🔗 | N/A | See Details |
unittest-matrix (1.19, internal) | - 🔗 | N/A | See Details |
unittest-matrix (1.19, other) | - 🔗 | N/A | See Details |
unittest (1.20) | Interpret result 🔗 | N/A | See Details |
unittest (1.19) | Interpret result 🔗 | N/A | See Details |
lint | Interpret result 🔗 | N/A | See Details |
cross-compile | - 🔗 | N/A | See Details |
windows-msi | - 🔗 | N/A | See Details |
build-package | - 🔗 | N/A | See Details |
publish-dev | - 🔗 | N/A | See Details |
publish-check | - 🔗 | N/A | See Details |
publish-stable | - 🔗 | N/A | See Details |
rotate-milestone | - 🔗 | N/A | See Details |
*You can configure Foresight comments in your organization settings page.
7700de9
to
046888d
Compare
046888d
to
f301bb4
Compare
d980857
to
b1fa948
Compare
b1fa948
to
ac96f12
Compare
ac96f12
to
1c27447
Compare
pkg/stanza/fileconsumer/design.md
Outdated
1. Each file handle is wrapped into a `Reader` along with some metadata. (See Reader section above) | ||
- During the creation of a `Reader`, the file's fingerprint is cross referenced with previously known fingerprints. | ||
- If a file's fingerprint matches one that has recently been seen, then metadata is copied over from the previous iteration of the Reader. Most importantly, the offset is accurately maintained in this way. | ||
- If a file's fingerprint does not match any recently seen files, then its offset is initialized according to the `start_at` setting. | ||
9. Detection of Lost Files | ||
8. Detection of Lost Files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think behavior described here has been changed in this PR. Moreover, the term "lost files" no longer seems to be an apt description of how we're managing file handles.
What do you think about dropping this terminology and just describing how we hold on to file handles as long as possible, and then read to the end one last time before closing them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me.
// We have reached end of the current path and all the previous characters have matched | ||
// Return if current node is leaf and it is not root | ||
if node.isLeaf() && node != trie { | ||
return r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't r
just the last byte?
return r | |
return key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bump
Co-authored-by: Daniel Jaglowski <[email protected]>
Co-authored-by: Daniel Jaglowski <[email protected]>
Co-authored-by: Daniel Jaglowski <[email protected]>
e7d3380
to
a5d6824
Compare
668b4d1
to
4e381b5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @VihasMakwana, thanks for your patience on this. I'd still very much like to see this PR through but I've had little time to focus on it.
I will try my best to give a more thorough review later this week, but a couple thoughts at this time:
- This is a significant change to the way this component works and we have many people depending on it in production. As a result, I think we should enable this functionality initially behind a feature gate. This means we should be able to run the component just as it was before these changes, so we'll need to identify the point of departure within the codebase.
- Any small parts of this PR that can be submitted separately could potentially be accepted without a feature gate, since they can be evaluated more carefully in isolation. We can do this before or after the feature gate is added. A couple ideas that immediately come to mind are the test refactoring and potentially a minimal switch to
makeReader
.
// We have reached end of the current path and all the previous characters have matched | ||
// Return if current node is leaf and it is not root | ||
if node.isLeaf() && node != trie { | ||
return r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bump
Sounds good to me. |
@VihasMakwana, did you run this build to confirm that it resolves the original issue? Also, it'd be great to highlight what other benefits it provides for the end user. Does it improve logs collection throughput? Can you please add benchmarks? |
Yeah, I did test this out and it resolved the original issue. |
@djaglowski what I'm thinking is to create some other PRs before finalizing this one, |
@VihasMakwana, that sounds great. Thank you again. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Description:
Changes in core logic.
Major changes:
makeReaders
->makeReader
: Instead of creating all the readers at once, we'll go through the paths and create readers one by one. We also move known fingerprints and known files to thestruct Manager
and we reset this every poll cycle.maxBatches
limit: As we've gone asynchronous, there's no need for such a limit. We can add a new limit likemaxFilesPerCycle
which limits the number of files matched per cycle. I have also removed the test cases formaxBatches
Manager
.pathHashLock
: For atomically updating the path map.readerLock
: For atomically tracking the lostReaders list, which will be used to detect lost files.knownFilesLock
to atomically update them.knownFiles
.Link to tracking Issue: #18908
Testing: The core functionality remains the same, but the approach is different. I have thus updated the test cases taking the asynchronous behavior into consideration.
Documentation: Updated design.md